Investigating Errors with Logs and Traces
Distributed tracing becomes even more powerful when combined with logs. When an error occurs, you get not just the error message, but the complete trace context showing what led up to the error.
Why Traces + Logs Together?
Traditional logging shows you:
- What went wrong
- When it happened
- In which service
Distributed tracing + logs show you:
- What went wrong
- When it happened
- In which service
- What the user was doing
- What happened before the error
- Which database queries ran
- How long each operation took
- The complete request flow
This context is invaluable for debugging production issues.
Using Sentry Logs with OTLP
Scenario 1: Debugging Payment Failures
Users are reporting failed checkouts. Let’s use logs and traces to understand why payments are failing.
Navigate to Logs Explorer

In your Sentry project, go to Explore > Logs.
Search for payment errors

Use the search bar to filter logs:

```
severity is error AND exception.message is payment failed
```

(Toggle "ignore case" in the search bar.) Or search by text:

```
"Payment failed"
```

You'll see log entries showing payment failures with structured context.
Examine log properties

Click any log entry to expand it. You'll see properties like:

```json
{
  "severity": "error",
  "message": "Payment failed for order",
  "order.id": 42,
  "order.user_id": 1,
  "payment.error": "Payment failed: card_declined",
  "payment.reason": "card_declined",
  "trace_id": "a1b2c3d4e5f6..."
}
```
Navigate to the trace

Click the trace_id link or the "View Trace" button. This opens the Trace Waterfall View showing:

```
└─ POST /api/orders (650ms)
   ├─ order.validate_user (45ms)
   ├─ order.validate_products (220ms)
   ├─ inventory.check (180ms)
   ├─ order.create_record (95ms)
   ├─ inventory.reserve (145ms)
   └─ payment.process (250ms) [payment.status: error]
```
Understand the failure flow

In the trace, you can see:

- The order was created successfully (order.id: 42)
- Inventory was reserved
- Payment failed with reason: card_declined
- Inventory was automatically released (compensating transaction)
- Order status was updated to cancelled

This shows your error handling is working correctly!
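Instrumentation along these lines would produce that payment.process span and its payment.status attribute. This is a minimal sketch assuming the OpenTelemetry Node SDK is initialized elsewhere; chargeOrder and the declared service calls are illustrative names, not the demo app's actual code:

```typescript
import { trace, SpanStatusCode } from '@opentelemetry/api';

// Stand-ins for the app's real service calls (illustrative, not from the demo app).
declare function processPayment(orderId: number): Promise<void>;
declare function releaseInventory(orderId: number): Promise<void>;
declare function markOrderCancelled(orderId: number): Promise<void>;

const tracer = trace.getTracer('order-service');

async function chargeOrder(orderId: number): Promise<void> {
  await tracer.startActiveSpan('payment.process', async (span) => {
    try {
      await processPayment(orderId);
      span.setAttribute('payment.status', 'success');
    } catch (err) {
      // Surfaces as [payment.status: error] in the trace waterfall.
      span.setAttribute('payment.status', 'error');
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });

      // Compensating transaction: release reserved stock and cancel the order.
      await releaseInventory(orderId);
      await markOrderCancelled(orderId);
      throw err;
    } finally {
      span.end();
    }
  });
}
```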
Group failures by reason

Back in Logs Explorer, filter:

```
severity is error
```

Click the Aggregates tab, then:

- Group by: exception.message

You'll see the distribution of error types:

```
Product not found: 4
Insufficient inventory for one or more items: 45
Connection terminated due to connection timeout: 31
password authentication failed for user 'neondb_owner': 26
Payment failed: expired_card: 6
Payment failed: card_declined: 11
```

This helps prioritize which errors need investigation - connection and authentication errors may indicate infrastructure issues, while payment failures might need better user messaging.
Scenario 2: Investigating “Product Not Found” Errors
Your logs show 404 errors for product lookups. Are these legitimate missing products or a caching issue?
Search for not found errors

In Logs Explorer:

```
severity is error AND "Product not found"
```

Or filter by custom properties:

```
exception.code is NOT_FOUND AND http.path is /api/products
```
Examine the error details

Expand a log entry:

```json
{
  "severity": "error",
  "message": "Product not found",
  "exception.code": "NOT_FOUND",
  "http.method": "GET",
  "http.path": "/api/products/:id",
  "trace_id": "x1y2z3..."
}
```
View the trace waterfall

Click the trace link. You'll see:

```
└─ GET /api/products/999 (85ms)
   ├─ cache.get (3ms) [cache.hit: false]
   └─ db SELECT products (78ms)
```

Analysis:
- Cache miss (expected for non-existent product)
- Database returned 0 rows
- This is a legitimate “product doesn’t exist” case
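A cache.hit attribute like the one above can be recorded by wrapping the cache lookup in its own span. A minimal sketch, with an illustrative cache client and function name:

```typescript
import { trace } from '@opentelemetry/api';

// Illustrative cache client; substitute your Redis/in-memory cache of choice.
declare const cache: { get(key: string): Promise<string | null> };

const tracer = trace.getTracer('product-service');

async function getCachedProduct(productId: string): Promise<string | null> {
  return tracer.startActiveSpan('cache.get', async (span) => {
    try {
      const value = await cache.get(`product:${productId}`);
      // Recorded as [cache.hit: true/false] on the span in the waterfall.
      span.setAttribute('cache.hit', value !== null);
      return value;
    } finally {
      span.end();
    }
  });
}
```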
Analyze frequency over time

In the Logs Explorer, click on the chart view to see when these errors occur:
- Is this a sudden spike?
- Is it consistent traffic?
- Does it correlate with deployments?
Group by error patterns

Click the Aggregates tab and group by:

- exception.message - See the exact error messages
- http.path - Identify which endpoints have the most not found errors
This helps you understand:
- Are errors concentrated on specific paths?
- Are these likely bot/scraper traffic patterns?
- Do you need to add better error handling or caching?
Using Logs to Monitor Application Health
Beyond errors, logs help you monitor application behavior.
Monitoring Successful Operations
Track successful orders

Search logs:

```
severity is info AND message is Order created successfully
```

View properties:

```
order.id: 42
order.total_amount: 459.97
order.items_count: 3
payment.transaction_id: txn_123
```
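A log like this might be emitted at the end of the order flow. Recording the amounts as numeric attributes is what makes aggregations such as avg(order.total_amount) possible in the widgets below (a sketch only; names are illustrative and logger setup is as in the earlier sketch):

```typescript
import { logs, SeverityNumber } from '@opentelemetry/api-logs';

const logger = logs.getLogger('order-service'); // illustrative logger name

function logOrderCreated(
  order: { id: number; totalAmount: number; itemsCount: number },
  transactionId: string,
) {
  logger.emit({
    severityNumber: SeverityNumber.INFO,
    severityText: 'info',
    body: 'Order created successfully',
    attributes: {
      'order.id': order.id,
      // Numeric values can be aggregated (avg, sum) in Logs Explorer and widgets.
      'order.total_amount': order.totalAmount,
      'order.items_count': order.itemsCount,
      'payment.transaction_id': transactionId,
    },
  });
}
```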
Create dashboard widgets

From the Logs Explorer query, save as dashboard widgets:

Widget 1: Order Volume

- Metric: count()
- Visualization: Time Series

Widget 2: Average Order Value

- Metric: avg(order.total_amount)
- Visualization: Big Number or Time Series

Widget 3: Items per Order

- Group by: order.items_count
- Metric: count()
- Visualization: Table or Bar Chart
Track cache effectiveness

Create a Time Series widget with two series to compare:

Series 1: Cache Hits

- Filter: message contains "Products fetched from cache"
- Metric: count()

Series 2: Database Fetches

- Filter: message contains "Products fetched from database"
- Metric: count()

View both series on the same chart to visualize your cache hit rate over time.
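Those two messages would come from the two branches of the product listing path, roughly like this (a sketch; the cache client and loadProductsFromDb are illustrative stand-ins):

```typescript
import { logs, SeverityNumber } from '@opentelemetry/api-logs';

const logger = logs.getLogger('product-service'); // illustrative logger name

// Illustrative cache client and database loader.
declare const cache: {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
};
declare function loadProductsFromDb(): Promise<object[]>;

async function getProducts(): Promise<object[]> {
  const cached = await cache.get('products:all');
  if (cached) {
    // Counted by Series 1 in the widget above.
    logger.emit({
      severityNumber: SeverityNumber.INFO,
      severityText: 'info',
      body: 'Products fetched from cache',
    });
    return JSON.parse(cached);
  }

  const products = await loadProductsFromDb();
  await cache.set('products:all', JSON.stringify(products));
  // Counted by Series 2 in the widget above.
  logger.emit({
    severityNumber: SeverityNumber.INFO,
    severityText: 'info',
    body: 'Products fetched from database',
  });
  return products;
}
```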
Connecting Logs, Traces, and Errors
The power of OTLP observability is in the connections:
From Logs to Traces
- Find a relevant log entry
- Click the trace_id link
- View the complete request flow in the waterfall
From Traces to Logs
- View a trace in the waterfall
- Logs associated with that trace appear inline
- See exactly what was logged at each step
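Under the hood, this linking works because a log record emitted inside an active span carries that span's trace context. If you ever need the identifiers yourself, for example to return a trace ID in an error response, the OpenTelemetry API exposes them (a sketch; how you use the value is up to you):

```typescript
import { trace } from '@opentelemetry/api';

// Returns the current trace ID, or undefined when called outside an active span.
function currentTraceId(): string | undefined {
  return trace.getActiveSpan()?.spanContext().traceId;
}
```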
Creating a Feedback Loop
- Logs tell you something went wrong
- Traces show you the complete flow
- Span attributes give you the context
- Aggregations reveal patterns
- Alerts notify you proactively
Setting Up Error and Log Alerts
Don’t wait for users to report issues. Create proactive alerts based on log patterns and error frequencies.
How to Create Log Alerts
From Logs Explorer, click Save As > Alert after running your query. Then configure:
- Define your metric: Choose count with your desired interval
- Filter events: Add your search query to filter specific logs
- Set thresholds: Define Critical/Warning/Resolved levels
- Set actions: Configure notification channels (email, Slack, PagerDuty)
Critical Error Alerts
High Error Rate

- Metric: count(logs)
- Filter: severity is error
- Interval: 1 hour
- Threshold: Critical when above 50

Why: A sudden spike in errors indicates a deployment issue, infrastructure problem, or attack. This is your canary alert.
Payment Failure Spike

- Metric: count(logs)
- Filter: message contains "Payment failed"
- Interval: 1 hour
- Threshold: Critical when above 10

Why: Payment failures directly impact revenue. While some failures are expected (expired cards), a spike indicates payment gateway issues or integration bugs.
Database Connection Errors

- Metric: count(logs)
- Filter: severity is error AND message contains "connection"
- Interval: 1 hour
- Threshold: Critical when above 5

Why: Connection errors mean your app can't reach the database. This causes cascading failures across all endpoints.
Inventory Issues

- Metric: count(logs)
- Filter: message contains "Insufficient inventory"
- Interval: 1 hour
- Threshold: Warning when above 20

Why: Frequent inventory failures might indicate a race condition where multiple orders can reserve the same stock.
Building Error Monitoring Dashboards
Dashboards help you spot error trends before they become critical.
How to Create Log Dashboard Widgets
From Logs Explorer, click Save As > Dashboard Widget after running your query. Widget types:
- Time Series: Error rates over time
- Big Number: Current error count or percentage
- Table: Top errors by type, user, or endpoint
Essential Error Dashboard Widgets
1. Error Rate Overview
Widget Type: Bar or Time Series

Configuration:

- Dataset: Logs
- Visualize: count
- Filter: severity is error
Why: Single pane of glass for overall application health. Spikes indicate incidents.
2. Error Rate by Severity
Widget Type: Time Series

Configuration: Add multiple series (click + Add Series)

Series 1:

- Visualize: count
- Filter: severity is error
- Legend Alias: “Errors”

Series 2:

- Visualize: count
- Filter: severity is warn
- Legend Alias: “Warnings”
Why: Understand error severity distribution. Increasing warnings might predict future errors.
3. Top Error Messages
Widget Type: Table

Configuration:

- Dataset: Logs
- Visualize: count
- Filter: severity is error
- Group by: exception.message
Why: Focus engineering effort on the most common errors. Sort by count descending to see the top issues.
4. Payment Failure Breakdown
Widget Type: Table or Bar Chart

Configuration:

- Dataset: Logs
- Visualize: count
- Filter: message contains "Payment failed"
- Group by: exception.message
Why: Understand why payments fail. Improve error messages or fraud detection based on patterns.
Summary
By combining logs and traces from OpenTelemetry:
✓ Logs provide searchable, structured events
✓ Traces show the complete request flow
✓ Span attributes add rich context
✓ Automatic linking connects logs to traces seamlessly
This combination gives you powerful debugging capabilities even without the Sentry SDK's automatic error detection. The key is thoughtful instrumentation with structured logging and meaningful span attributes.