Investigating Errors with Logs and Traces
Distributed tracing becomes even more powerful when combined with logs. When an error occurs, you get not just the error message, but the complete trace context showing what led up to the error.
Why Traces + Logs Together?
Traditional logging shows you:
- What went wrong
- When it happened
- In which service
Distributed tracing + logs show you:
- What went wrong
- When it happened
- In which service
- What the user was doing
- What happened before the error
- Which database queries ran
- How long each operation took
- The complete request flow
This context is invaluable for debugging production issues.
Using Sentry Logs with OTLP
Scenario 1: Debugging Payment Failures
Users are reporting failed checkouts. Let’s use logs and traces to understand why payments are failing.
Navigate to Logs Explorer

In your Sentry project, go to Explore > Logs.
Search for payment errors

Use the search bar to filter logs:

```
severity is error AND exception.message is payment failed
```

(Toggle "ignore case" in the search bar.) Or search by text:

```
"Payment failed"
```

You'll see log entries showing payment failures with structured context.
Examine log properties

Click any log entry to expand it. You'll see properties like:

```json
{
  "severity": "error",
  "message": "Payment failed for order",
  "order.id": 42,
  "order.user_id": 1,
  "payment.error": "Payment failed: card_declined",
  "payment.reason": "card_declined",
  "trace_id": "a1b2c3d4e5f6..."
}
```
Navigate to the trace

Click the trace_id link or the "View Trace" button. This opens the Trace Waterfall View showing:

```
└─ POST /api/orders (650ms)
   ├─ order.validate_user (45ms)
   ├─ order.validate_products (220ms)
   ├─ inventory.check (180ms)
   ├─ order.create_record (95ms)
   ├─ inventory.reserve (145ms)
   └─ payment.process (250ms) [payment.status: error]
```
Understand the failure flow

In the trace, you can see:

- The order was created successfully (order.id: 42)
- Inventory was reserved
- Payment failed with reason: card_declined
- Inventory was automatically released (compensating transaction)
- Order status was updated to cancelled

This shows your error handling is working correctly!
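Instrumentation along these lines would produce that payment.process span and its payment.status attribute. This is a minimal sketch assuming the OpenTelemetry Node SDK is initialized elsewhere; chargeOrder and the declared service calls are illustrative names, not the demo app's actual code:

```typescript
import { trace, SpanStatusCode } from '@opentelemetry/api';

// Stand-ins for the app's real service calls (illustrative, not from the demo app).
declare function processPayment(orderId: number): Promise<void>;
declare function releaseInventory(orderId: number): Promise<void>;
declare function markOrderCancelled(orderId: number): Promise<void>;

const tracer = trace.getTracer('order-service');

async function chargeOrder(orderId: number): Promise<void> {
  await tracer.startActiveSpan('payment.process', async (span) => {
    try {
      await processPayment(orderId);
      span.setAttribute('payment.status', 'success');
    } catch (err) {
      // Surfaces as [payment.status: error] in the trace waterfall.
      span.setAttribute('payment.status', 'error');
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });

      // Compensating transaction: release reserved stock and cancel the order.
      await releaseInventory(orderId);
      await markOrderCancelled(orderId);
      throw err;
    } finally {
      span.end();
    }
  });
}
```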
Group failures by reason

Back in Logs Explorer, filter:

```
severity is error
```

Click the Aggregates tab, then:

- Group by: exception.message

You'll see the distribution of error types:

```
Product not found: 4
Insufficient inventory for one or more items: 45
Connection terminated due to connection timeout: 31
password authentication failed for user 'neondb_owner': 26
Payment failed: expired_card: 6
Payment failed: card_declined: 11
```

This helps prioritize which errors need investigation - connection and authentication errors may indicate infrastructure issues, while payment failures might need better user messaging.
Scenario 2: Investigating “Product Not Found” Errors
Your logs show 404 errors for product lookups. Are these legitimate missing products or a caching issue?
Search for not found errors

In Logs Explorer:

```
severity is error AND "Product not found"
```

Or filter by custom properties:

```
exception.code is NOT_FOUND AND http.path is /api/products
```
Examine the error details

Expand a log entry:

```json
{
  "severity": "error",
  "message": "Product not found",
  "exception.code": "NOT_FOUND",
  "http.method": "GET",
  "http.path": "/api/products/:id",
  "trace_id": "x1y2z3..."
}
```
View the trace waterfall

Click the trace link. You'll see:

```
└─ GET /api/products/999 (85ms)
   ├─ cache.get (3ms) [cache.hit: false]
   └─ db SELECT products (78ms)
```

Analysis:
- Cache miss (expected for non-existent product)
- Database returned 0 rows
- This is a legitimate “product doesn’t exist” case
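A cache.hit attribute like the one above can be recorded by wrapping the cache lookup in its own span. A minimal sketch, with an illustrative cache client and function name:

```typescript
import { trace } from '@opentelemetry/api';

// Illustrative cache client; substitute your Redis/in-memory cache of choice.
declare const cache: { get(key: string): Promise<string | null> };

const tracer = trace.getTracer('product-service');

async function getCachedProduct(productId: string): Promise<string | null> {
  return tracer.startActiveSpan('cache.get', async (span) => {
    try {
      const value = await cache.get(`product:${productId}`);
      // Recorded as [cache.hit: true/false] on the span in the waterfall.
      span.setAttribute('cache.hit', value !== null);
      return value;
    } finally {
      span.end();
    }
  });
}
```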
Analyze frequency over time

In the Logs Explorer, click on the chart view to see when these errors occur:
- Is this a sudden spike?
- Is it consistent traffic?
- Does it correlate with deployments?
Group by error patterns

Click the Aggregates tab and group by:

- exception.message - See the exact error messages
- http.path - Identify which endpoints have the most not found errors
This helps you understand:
- Are errors concentrated on specific paths?
- Are these likely bot/scraper traffic patterns?
- Do you need to add better error handling or caching?
Using Logs to Monitor Application Health
Beyond errors, logs help you monitor application behavior.
Monitoring Successful Operations
Track successful orders

Search logs:

```
severity is info AND message is Order created successfully
```

View properties:

```
order.id: 42
order.total_amount: 459.97
order.items_count: 3
payment.transaction_id: txn_123
```
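A log like this might be emitted at the end of the order flow. Recording the amounts as numeric attributes is what makes aggregations such as avg(order.total_amount) possible in the widgets below (a sketch only; names are illustrative and logger setup is as in the earlier sketch):

```typescript
import { logs, SeverityNumber } from '@opentelemetry/api-logs';

const logger = logs.getLogger('order-service'); // illustrative logger name

function logOrderCreated(
  order: { id: number; totalAmount: number; itemsCount: number },
  transactionId: string,
) {
  logger.emit({
    severityNumber: SeverityNumber.INFO,
    severityText: 'info',
    body: 'Order created successfully',
    attributes: {
      'order.id': order.id,
      // Numeric values can be aggregated (avg, sum) in Logs Explorer and widgets.
      'order.total_amount': order.totalAmount,
      'order.items_count': order.itemsCount,
      'payment.transaction_id': transactionId,
    },
  });
}
```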
Create dashboard widgets

From the Logs Explorer query, save as dashboard widgets:

Widget 1: Order Volume

- Metric: count()
- Visualization: Time Series

Widget 2: Average Order Value

- Metric: avg(order.total_amount)
- Visualization: Big Number or Time Series

Widget 3: Items per Order

- Group by: order.items_count
- Metric: count()
- Visualization: Table or Bar Chart
Track cache effectiveness

Create a Time Series widget with two series to compare:

Series 1: Cache Hits

- Filter: message contains "Products fetched from cache"
- Metric: count()

Series 2: Database Fetches

- Filter: message contains "Products fetched from database"
- Metric: count()

View both series on the same chart to visualize your cache hit rate over time.
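Those two messages would come from the two branches of the product listing path, roughly like this (a sketch; the cache client and loadProductsFromDb are illustrative stand-ins):

```typescript
import { logs, SeverityNumber } from '@opentelemetry/api-logs';

const logger = logs.getLogger('product-service'); // illustrative logger name

// Illustrative cache client and database loader.
declare const cache: {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
};
declare function loadProductsFromDb(): Promise<object[]>;

async function getProducts(): Promise<object[]> {
  const cached = await cache.get('products:all');
  if (cached) {
    // Counted by Series 1 in the widget above.
    logger.emit({
      severityNumber: SeverityNumber.INFO,
      severityText: 'info',
      body: 'Products fetched from cache',
    });
    return JSON.parse(cached);
  }

  const products = await loadProductsFromDb();
  await cache.set('products:all', JSON.stringify(products));
  // Counted by Series 2 in the widget above.
  logger.emit({
    severityNumber: SeverityNumber.INFO,
    severityText: 'info',
    body: 'Products fetched from database',
  });
  return products;
}
```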
Connecting Logs, Traces, and Errors
The power of OTLP observability is in the connections:
From Logs to Traces
- Find a relevant log entry
- Click the trace_id link
- View the complete request flow in the waterfall
From Traces to Logs
- View a trace in the waterfall
- Logs associated with that trace appear inline
- See exactly what was logged at each step
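Under the hood, this linking works because a log record emitted inside an active span carries that span's trace context. If you ever need the identifiers yourself, for example to return a trace ID in an error response, the OpenTelemetry API exposes them (a sketch; how you use the value is up to you):

```typescript
import { trace } from '@opentelemetry/api';

// Returns the current trace ID, or undefined when called outside an active span.
function currentTraceId(): string | undefined {
  return trace.getActiveSpan()?.spanContext().traceId;
}
```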
Creating a Feedback Loop
- Logs tell you something went wrong
- Traces show you the complete flow
- Span attributes give you the context
- Aggregations reveal patterns
- Alerts notify you proactively
Setting Up Error and Log Alerts
Don’t wait for users to report issues. Create proactive alerts based on log patterns and error frequencies.
How to Create Log Alerts
From Logs Explorer, click Save As > Alert after running your query. Then configure:
- Define your metric: Choose count with your desired interval
- Filter events: Add your search query to filter specific logs
- Set thresholds: Define Critical/Warning/Resolved levels
- Set actions: Configure notification channels (email, Slack, PagerDuty)
Critical Error Alerts
High Error Rate

- Metric: count(logs)
- Filter: severity is error
- Interval: 1 hour
- Threshold: Critical when above 50

Why: A sudden spike in errors indicates a deployment issue, infrastructure problem, or attack. This is your canary alert.
Payment Failure Spike

- Metric: count(logs)
- Filter: message contains "Payment failed"
- Interval: 1 hour
- Threshold: Critical when above 10

Why: Payment failures directly impact revenue. While some failures are expected (expired cards), a spike indicates payment gateway issues or integration bugs.
Database Connection Errors

- Metric: count(logs)
- Filter: severity is error AND message contains "connection"
- Interval: 1 hour
- Threshold: Critical when above 5

Why: Connection errors mean your app can't reach the database. This causes cascading failures across all endpoints.
Inventory Issues

- Metric: count(logs)
- Filter: message contains "Insufficient inventory"
- Interval: 1 hour
- Threshold: Warning when above 20

Why: Frequent inventory failures might indicate a race condition where multiple orders can reserve the same stock.
Building Error Monitoring Dashboards
Dashboards help you spot error trends before they become critical.
How to Create Log Dashboard Widgets
From Logs Explorer, click Save As > Dashboard Widget after running your query. Widget types:
- Time Series: Error rates over time
- Big Number: Current error count or percentage
- Table: Top errors by type, user, or endpoint
Essential Error Dashboard Widgets
1. Error Rate Overview
Widget Type: Bar or Time Series

Configuration:

- Dataset: Logs
- Visualize: count
- Filter: severity is error
Why: Single pane of glass for overall application health. Spikes indicate incidents.
2. Error Rate by Severity
Widget Type: Time Series

Configuration: Add multiple series (click + Add Series)

Series 1:

- Visualize: count
- Filter: severity is error
- Legend Alias: “Errors”

Series 2:

- Visualize: count
- Filter: severity is warn
- Legend Alias: “Warnings”
Why: Understand error severity distribution. Increasing warnings might predict future errors.
3. Top Error Messages
Widget Type: Table

Configuration:

- Dataset: Logs
- Visualize: count
- Filter: severity is error
- Group by: exception.message
Why: Focus engineering effort on the most common errors. Sort by count descending to see the top issues.
4. Payment Failure Breakdown
Widget Type: Table or Bar Chart

Configuration:

- Dataset: Logs
- Visualize: count
- Filter: message contains "Payment failed"
- Group by: exception.message
Why: Understand why payments fail. Improve error messages or fraud detection based on patterns.
Summary
By combining logs and traces from OpenTelemetry:
✓ Logs provide searchable, structured events
✓ Traces show the complete request flow
✓ Span attributes add rich context
✓ Automatic linking connects logs to traces seamlessly
This combination gives you powerful debugging capabilities even without the Sentry SDK's automatic error detection. The key is thoughtful instrumentation with structured logging and meaningful span attributes.