How do I read structured logs locally without an aggregator?

Most structured loggers ship a pretty-printer for local dev. Pino has pino-pretty, structlog has ConsoleRenderer, zap has a dev config that outputs human-readable lines. You do not need to parse raw JSON in your terminal during development.

My logs are already in Loki/Datadog. Can I add structure without re-deploying?

You can add a grok parser in Datadog, or use a pattern or regexp parser expression in your Loki queries, to extract fields from existing free-text logs. It works, but it is fragile and produces inconsistent field names across services. The right fix is structured logging at the source, rolled out starting with your highest-traffic services.
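To make the fragility concrete, here is a minimal sketch in plain JavaScript. The regex and the `parseLegacyLine` helper are hypothetical, written against the free-text format used in this article rather than taken from any aggregator's configuration.

```javascript
// Hypothetical parser for lines like:
//   Checkout failed for usr_4829: connect ETIMEDOUT 10.0.1.5:5432
const LEGACY_PATTERN = /^Checkout failed for (?<user_id>\S+): (?<error_message>.+)$/

function parseLegacyLine(line) {
  const match = LEGACY_PATTERN.exec(line)
  // A line that does not match the pattern yields no fields at all
  return match ? { message: 'checkout_failed', ...match.groups } : null
}

const original = parseLegacyLine('Checkout failed for usr_4829: connect ETIMEDOUT 10.0.1.5:5432')
// A developer rewords the log call and the parser silently stops matching
const reworded = parseLegacyLine('Checkout for usr_4829 failed: connect ETIMEDOUT 10.0.1.5:5432')

console.log(original)
console.log(reworded)
```

The reworded line carries the same information, yet the parser extracts nothing from it. That failure mode is what makes parser-based extraction a stopgap.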

Should I include stack traces in structured logs?

Yes, but as a separate field, not concatenated into the message. Most loggers serialize Error objects to a stack_trace or err field automatically. A stack trace embedded in the message field breaks grouping and is nearly unreadable in most aggregator UIs.
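As a rough illustration of keeping the trace out of the message, the sketch below uses only built-in JavaScript. The `errorFields` and `logError` helpers and the field names `error_type`, `error_message`, and `stack_trace` are assumptions for this example, not a particular logger's API.

```javascript
// Hypothetical helper: turn an Error into flat fields
function errorFields(err) {
  return {
    error_type: err.name,
    error_message: err.message,
    stack_trace: err.stack, // the full trace lives in its own field
  }
}

function logError(event, err, fields = {}) {
  const line = JSON.stringify({
    timestamp: new Date().toISOString(),
    level: 'error',
    message: event, // static event name, never the stack
    ...fields,
    ...errorFields(err),
  })
  console.log(line)
  return line
}

const line = logError('checkout_failed', new TypeError('amount must be a number'), { order_id: 'ord_9182' })
```

Because the message stays `checkout_failed` regardless of what was thrown, grouping by message still works, and the trace remains available when you open the individual record.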

What fields should every log line include?

At minimum: timestamp, level, message (static string), service, and correlation_id. Everything else is context-dependent. Do not over-engineer the schema upfront — add fields as you discover you need them during incident investigation.
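One way to keep that minimum honest is a small check you could run in a unit test. This is a sketch only: `REQUIRED_FIELDS` mirrors the list above, and `missingFields` is a hypothetical helper name.

```javascript
const REQUIRED_FIELDS = ['timestamp', 'level', 'message', 'service', 'correlation_id']

// Returns the names of required fields that are absent from one NDJSON line
function missingFields(jsonLine) {
  const record = JSON.parse(jsonLine)
  return REQUIRED_FIELDS.filter((name) => !(name in record))
}

const complete = '{"timestamp":"2024-01-15T03:14:09.447Z","level":"error","message":"db_query_timeout","correlation_id":"req_8f2a9b1c","service":"payment-service"}'
const incomplete = '{"level":"info","message":"checkout_started","user_id":"usr_4829"}'

console.log(missingFields(complete))
console.log(missingFields(incomplete))
```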


Structured Logging: Stop console.log-ing Strings and Start Logging JSON Your Ops Team Can Actually Query

Updated on Jun 25, 2026

Free-text logs are unsearchable at 3am when it matters. Here's what structured JSON logging actually looks like, how correlation IDs work, and how to plug into Loki, Datadog, or CloudWatch without writing a custom parser.


At 3am, your phone buzzes. Checkout is failing. You SSH in and start grepping logs.

What you find:

[2024-01-15 03:14:09] ERROR Something went wrong in payment flow
[2024-01-15 03:14:09] ERROR User could not complete checkout
[2024-01-15 03:14:09] ERROR Database connection failed or timed out
[2024-01-15 03:14:12] INFO  Retrying request...
[2024-01-15 03:14:15] ERROR Still failing, giving up

Which user? Which payment? Which database host? What exact error code? You have five million log lines from the night, and grep ERROR returns 40,000 results. The logs are technically there. They are just useless.

Here is what the same events look like structured:

{"timestamp":"2024-01-15T03:14:09.442Z","level":"error","message":"payment_authorization_failed","correlation_id":"req_8f2a9b1c","user_id":"usr_4829","order_id":"ord_9182","payment_provider":"stripe","error_code":"card_declined","latency_ms":1203}
{"timestamp":"2024-01-15T03:14:09.447Z","level":"error","message":"db_query_timeout","correlation_id":"req_8f2a9b1c","service":"payment-service","db_host":"pg-primary-1","query":"INSERT INTO orders","query_duration_ms":5002}

Now your Datadog query is @correlation_id:req_8f2a9b1c and you have the full story in one screen. The correlation ID threads together every log line — across every service — that touched that single failing checkout. That is the actual difference between logs that help and logs that do not.
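The same idea works without an aggregator. The sketch below filters raw NDJSON by correlation ID in plain JavaScript. `traceRequest` is a hypothetical helper, the `req_other` record is invented for contrast, and sorting by string comparison assumes every timestamp is ISO 8601 in UTC.

```javascript
// Hypothetical helper: keep only the records for one request, in time order
function traceRequest(ndjson, correlationId) {
  return ndjson
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line))
    .filter((record) => record.correlation_id === correlationId)
    // ISO 8601 UTC timestamps sort correctly as plain strings
    .sort((a, b) => a.timestamp.localeCompare(b.timestamp))
}

const sample = [
  '{"timestamp":"2024-01-15T03:14:09.447Z","level":"error","message":"db_query_timeout","correlation_id":"req_8f2a9b1c"}',
  '{"timestamp":"2024-01-15T03:14:09.100Z","level":"info","message":"checkout_started","correlation_id":"req_other"}',
  '{"timestamp":"2024-01-15T03:14:09.442Z","level":"error","message":"payment_authorization_failed","correlation_id":"req_8f2a9b1c"}',
].join('\n')

const events = traceRequest(sample, 'req_8f2a9b1c')
console.log(events.map((e) => e.message))
```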

The side-by-side

Unstructured (what most codebases have)

// Source code
console.log(`User ${userId} started checkout`)
console.error(`Checkout failed for ${userId}: ${err.message}`)
console.log("Payment retrying")
console.error("Max retries exceeded")

// What lands in your log file:
// User usr_4829 started checkout
// Checkout failed for usr_4829: connect ETIMEDOUT 10.0.1.5:5432
// Payment retrying
// Max retries exceeded

Structured (what you actually need)

{"level":"info","message":"checkout_started","correlation_id":"req_8f2a9b1c","user_id":"usr_4829","order_id":"ord_9182","ts":"2024-01-15T03:14:08.001Z"}
{"level":"error","message":"checkout_failed","correlation_id":"req_8f2a9b1c","user_id":"usr_4829","error_code":"ETIMEDOUT","db_host":"10.0.1.5","db_port":5432,"attempt":1,"ts":"2024-01-15T03:14:09.442Z"}
{"level":"warn","message":"checkout_retrying","correlation_id":"req_8f2a9b1c","attempt":2,"ts":"2024-01-15T03:14:11.001Z"}
{"level":"error","message":"checkout_max_retries","correlation_id":"req_8f2a9b1c","max_attempts":3,"ts":"2024-01-15T03:14:14.203Z"}

The unstructured version is readable at write time. The structured version is queryable at 3am. Pick one.

What “structured” actually means

Every log line is a machine-parseable record with named fields — not a formatted string. JSON is the overwhelmingly common format. Field names are not standardized across the industry, but these are the de facto conventions:

timestamp: ISO 8601 / RFC 3339 format (2024-01-15T03:14:09.442Z, not 1705288449). Both are parseable by log aggregators, but ISO is also human-readable. If you are converting legacy epoch values, the Unix Timestamp Converter handles the translation in both directions.
level: a lowercase string (debug, info, warn, error, fatal). Not integers, not LOG_LEVEL_3.
message: a static string identifying the event type. More on this below.
service / app: which service emitted this line. Critical in any multi-service setup.
correlation_id / trace_id: a request-scoped identifier propagated across services. This is the thing that makes 3am survivable.

The static message rule is where people trip up. If your message is "User 4829 failed checkout", that is a different string from "User 1337 failed checkout" — you cannot group or count by message. Put 4829 in a user_id field and make the message "checkout_failed". Now count(message="checkout_failed") group by user_id shows you exactly which users are affected.
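The grouping query above can be sketched in a few lines of plain JavaScript. `countByUser` is a hypothetical helper and the sample records are made up for illustration.

```javascript
// Count events with a given static message, grouped by user_id
function countByUser(records, message) {
  const counts = {}
  for (const record of records) {
    if (record.message !== message) continue
    counts[record.user_id] = (counts[record.user_id] || 0) + 1
  }
  return counts
}

const records = [
  { message: 'checkout_failed', user_id: 'usr_4829' },
  { message: 'checkout_failed', user_id: 'usr_1337' },
  { message: 'checkout_started', user_id: 'usr_1337' },
  { message: 'checkout_failed', user_id: 'usr_4829' },
]

console.log(countByUser(records, 'checkout_failed'))
```

This only works because the message is identical across users. With the user ID interpolated into the message, the equality check would never match more than one user's lines.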

Log levels: the part everyone gets wrong

Most explanations give you the definitions and move on. Here is what actually matters in production:

debug: off in production by default. Shipping debug logs to a paid aggregator burns money. Enable it dynamically during incident investigation, then turn it back off.
info: operational events such as request received, job started, external service called. Not “entered function handleCheckout.” If it does not help you understand system behavior, it is noise.
warn: the ambiguous one. Useful rule: if a warn does not need investigation, it is probably info. If it needs investigation but not at 3am, it is actually warn. Apps that emit hundreds of warns per second train operators to ignore them.
error: something broke and a human may need to act. Unhandled exceptions, dependency failures, data corruption. A user submitting invalid form data is not an error; that is a handled validation case. Logging it at error trains operators to ignore your real errors.
fatal / critical: the service is going down or is in an unrecoverable state. Should trigger a page. If you are using console.error() for everything from “invalid email” to “disk full,” you have lost the ability to distinguish between them.
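To illustrate the "enable debug dynamically, then turn it back off" advice, here is a minimal threshold logger in plain JavaScript. It is a sketch, not how any particular library implements levels. The numeric weights are arbitrary internal values used only for comparison, and the emitted lines still carry the level as a lowercase string.

```javascript
// Arbitrary internal weights, used only to compare levels
const LEVELS = { debug: 10, info: 20, warn: 30, error: 40, fatal: 50 }

// Hypothetical minimal logger: drops lines below the current threshold
function createLogger(initialLevel = 'info') {
  let threshold = LEVELS[initialLevel]
  const emitted = []
  return {
    setLevel(level) { threshold = LEVELS[level] },
    log(level, message, fields = {}) {
      if (LEVELS[level] < threshold) return false
      emitted.push(JSON.stringify({ level, message, ...fields }))
      return true
    },
    emitted,
  }
}

const log = createLogger('info')
log.log('debug', 'cache_lookup') // dropped: below the info threshold
log.setLevel('debug')            // enable during an incident
log.log('debug', 'cache_lookup') // now emitted
log.setLevel('info')             // turn it back off
```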

Correlation IDs: how to actually implement them

Generate a unique ID at the request boundary and propagate it everywhere. The ID travels as an HTTP header (X-Correlation-ID or X-Request-ID) and gets injected into every log line for that request’s lifecycle.

Here is the full pattern in Node.js with Pino, which is faster than Winston, outputs NDJSON by default, and pairs with the separate pino-pretty package for readable local dev output:

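import express from 'express'
import pino from 'pino'
import { randomUUID } from 'node:crypto'

// paymentService and req.user are assumed to come from your own payment client and auth middleware
const app = express()
app.use(express.json()) // parse JSON bodies so req.body.orderId is available
const logger = pino({ level: process.env.LOG_LEVEL || 'info' })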

// Middleware: inject correlation ID into every request
app.use((req, res, next) => {
  const correlationId = req.headers['x-correlation-id'] || randomUUID()
  req.log = logger.child({
    correlation_id: correlationId,
    service: 'checkout-api',
  })
  res.setHeader('x-correlation-id', correlationId)
  next()
})

// Route: use req.log everywhere — correlation_id is automatic
app.post('/checkout', async (req, res) => {
  const startTime = Date.now()
  req.log.info({ user_id: req.user.id, order_id: req.body.orderId }, 'checkout_started')

  try {
    const result = await paymentService.charge(req.body)
    req.log.info({
      order_id: req.body.orderId,
      amount_cents: result.amountCents,
      duration_ms: Date.now() - startTime,
    }, 'checkout_completed')
    res.json({ success: true })
  } catch (err) {
    req.log.error({
      user_id: req.user.id,
      error_code: err.code,
      error_message: err.message,
      duration_ms: Date.now() - startTime,
    }, 'checkout_failed')
    res.status(500).json({ error: 'checkout_failed' })
  }
})

logger.child() creates a sub-logger that automatically includes correlation_id and service in every line, so you do not pass them manually each time. The message argument is a static string, with all dynamic data in the fields object before it. Pino's signature takes the fields object first and the message second, which nudges you toward this pattern.
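Conceptually, a child logger is just the parent's bound fields merged with a few more. The sketch below shows that idea in plain JavaScript. It is not Pino's implementation, and `makeLogger` is a hypothetical name.

```javascript
// Conceptual sketch of child-logger bindings, not Pino's implementation
function makeLogger(bindings = {}, write = console.log) {
  return {
    child(extra) {
      // The child inherits the parent's bindings and adds its own
      return makeLogger({ ...bindings, ...extra }, write)
    },
    info(fields, message) {
      write(JSON.stringify({ level: 'info', ...bindings, ...fields, message }))
    },
  }
}

const lines = []
const root = makeLogger({ service: 'checkout-api' }, (line) => lines.push(line))
const requestLog = root.child({ correlation_id: 'req_8f2a9b1c' })
requestLog.info({ user_id: 'usr_4829' }, 'checkout_started')
console.log(lines[0])
```

The call site only supplies `user_id` and the static message. The `service` and `correlation_id` fields arrive through the bindings.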

When paymentService makes downstream HTTP calls, forward the correlation ID as X-Correlation-ID. Downstream services pick it up the same way. Now one query against your log aggregator returns every event, across every service, for that one failing checkout.
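Forwarding the header could look roughly like this. `callDownstream` is a hypothetical helper and the URL in the usage comment is a placeholder. The sketch assumes the global `fetch` available in Node 18+ and takes an optional `fetchImpl` argument so it can be exercised without a network.

```javascript
// Hypothetical helper: POST JSON to a downstream service, forwarding the correlation ID.
// fetchImpl defaults to the global fetch (Node 18+); a stub can be passed in tests.
async function callDownstream(url, correlationId, body, fetchImpl = fetch) {
  return fetchImpl(url, {
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      'x-correlation-id': correlationId,
    },
    body: JSON.stringify(body),
  })
}

// Usage inside a route handler, assuming the middleware also stored the ID on the request:
// await callDownstream('http://payments.internal/charge', req.correlationId, req.body)
```

Wrapping outbound calls in one helper means nobody has to remember the header at each call site.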

Plugging into Loki, Datadog, and CloudWatch

Every major aggregator parses JSON natively with zero reformatting required.

Loki (Grafana): JSON logs let you filter on extracted fields directly: {service="checkout-api"} | json | level = "error" and error_code = "card_declined". Free-text logs force regex matches, which are slower and break the moment your log format changes. Consistent service and level fields also keep Loki’s label cardinality manageable.

Datadog — @field_name:value queries structured fields directly without a custom pipeline. @level:error @user_id:usr_4829 returns every error for that user across all services instantly. Free-text logs need a grok parser — 45 minutes of config you do not want to write at 3am, and it breaks every time your log format changes even slightly.

CloudWatch Logs Insights: fields @timestamp, message, user_id | filter level = "error" and user_id = "usr_4829" | sort @timestamp desc only works if user_id is a top-level JSON field. CloudWatch auto-extracts JSON fields on ingestion. Nested JSON has to be addressed with dot-notation paths and becomes painful fast, so keep your log structure flat.
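If your application builds nested context objects, one option is to flatten them just before logging. The `flatten` helper below is a hypothetical sketch. Joining keys with underscores is an arbitrary choice, and it assumes JSON-style values only (no Date or class instances).

```javascript
// Flatten nested plain objects into top-level keys joined with underscores.
// Arrays are kept as-is; values are assumed to be JSON-style data.
function flatten(record, prefix = '', out = {}) {
  for (const [key, value] of Object.entries(record)) {
    const name = prefix ? `${prefix}_${key}` : key
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      flatten(value, name, out)
    } else {
      out[name] = value
    }
  }
  return out
}

const flat = flatten({
  level: 'error',
  message: 'checkout_failed',
  user: { id: 'usr_4829' },
  db: { host: '10.0.1.5', port: 5432 },
})
console.log(JSON.stringify(flat))
```

The resulting keys (`user_id`, `db_host`, `db_port`) match the flat field names used in the examples earlier in this article.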

The pattern is the same everywhere: JSON keys become queryable fields without parser config. Free-text logs require per-aggregator grok/regex pipelines that are brittle and become a maintenance liability the moment your format changes.

Quick setup by language

Node.js: Pino — NDJSON by default, fastest serializer in the ecosystem. Winston also works but JSON format config is more verbose and performance drops noticeably at high throughput.
Python: structlog is the right answer. The stdlib logging module can output JSON with a custom JSON formatter, but structlog’s context binding is cleaner and the documentation is genuinely good.
Go: zerolog or zap. Both are zero-allocation and both output JSON. zerolog’s API is cleaner for most use cases; zap wins if you are logging millions of lines per second and need typed fields.
Java: Logback with logstash-logback-encoder is the de facto standard. SLF4J MDC entries map to JSON fields automatically, which makes correlation ID propagation straightforward.

If you receive a compressed log export and need to inspect individual lines, the JSON Formatter handles minified JSON: paste a dense log line and get indented, readable output. For bulk log processing where file size matters, the JSON Minifier strips whitespace from structured exports before archiving or shipping them.

The logs are already there. They just are not in a shape you can search. Switching to structured JSON logging is a one-time change per service that pays back every time you need to answer “what happened to this specific request?” — whether that is at 3am or on a Monday morning post-mortem.