Jan 16, 2025 - 13:24
Quick Start: Elasticsearch + OpenTelemetry Collector

I recently began overhauling our logging architecture to address several challenges, including centralized log management, debugging difficulties, performance overhead, rate limiting, authentication concerns, data enrichment needs, and robust error handling. To validate the new approach, I developed a Proof of Concept (PoC) and an Architectural Decision Record (ADR), which I have detailed in this blog post to share our findings and the benefits of the updated architecture.

Why This Architecture?

In a distributed microservices world, observability becomes a must-have rather than a nice-to-have. Traditional “log to disk” methods don’t cut it when you need to troubleshoot cross-service issues or correlate logs with traces. By combining Elasticsearch with an OpenTelemetry Collector, you get:

  • Centralized Logs for simplified searching and analytics.
  • Structured, ECS-compliant log data, making it easy to parse and visualize.
  • Scalability and resilience via the Collector’s buffering, batching, and resource limits.
  • Future-proofing: OpenTelemetry is quickly becoming the standard for logs, metrics, and traces.

Problems Solved:

  • Flexible Indexing: This architecture allows you to control which index the logs are sent to.
  • Difficult debugging: With distributed tracing and ECS fields, you can quickly pinpoint errors.
  • Performance overhead: The Collector offloads log processing tasks from your app, so your services can focus on business logic.
  • Rate Limiting: Manage the flow of log data to prevent system overload and ensure consistent performance.
  • Authentication: Secure the log pipeline by ensuring only authorized sources can send logs.
  • Data Enrichment: Enhance logs with additional context for more insightful analysis.
  • Error Handling: Implement robust error handling to gracefully manage failures in log processing.
  • Monitoring and Alerts: Set up monitoring and alerting to proactively address issues in the logging pipeline.
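To make the authentication point above concrete: the Collector itself can reject unauthorized senders. This is a minimal sketch using the contrib distribution's bearertokenauth extension; the token value and the exact wiring are illustrative, not part of the PoC config used later in this post, so adapt them to your setup:

```yaml
extensions:
  bearertokenauth:
    token: "otlp-collector-token"   # illustrative; load from a secret/env var in practice

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
        auth:
          authenticator: bearertokenauth

service:
  extensions: [bearertokenauth]
```

With this in place, clients must send an `Authorization: Bearer …` header, which is why the Step 4 script below keeps a commented-out Authorization header ready.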

Additional Reading:

  • OpenTelemetry Official Docs
  • Elastic Common Schema (ECS)
  • Elasticsearch Observability Overview

Why Set Up Elasticsearch + OpenTelemetry Collector?

  1. Centralize Logs: Capture logs from various sources in a single store (Elasticsearch).
  2. Structured Logging: By sending logs via OTLP (OpenTelemetry) in JSON format, your data is easier to query and analyze.
  3. Scalability: The Collector buffers and processes logs, reducing overhead on your apps.
  4. Observability: Combine logs with distributed tracing or metrics to gain deeper insights.

Step 1: Generate docker-compose.yml


$composeContent = @"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.10.1
    environment:
      - discovery.type=single-node
    ports:
      - "39200:9200"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9200"]
      interval: 3s
      timeout: 3s
      retries: 15

  opentelemetry-collector:
    image: otel/opentelemetry-collector-contrib:latest
    environment:
      - ELASTIC_API_KEY=elasticsearch-apikey
    command: --config /etc/otel-collector-config.yml
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
    ports:
      - "4317:4317"
      - "4318:4318"
    depends_on:
      elasticsearch:
        condition: service_healthy
"@

$composeFile = "docker-compose.yml"
$composeContent | Out-File -FilePath $composeFile -Encoding UTF8

Explanation

  • Elasticsearch runs in single-node mode, exposed at port 39200.
  • OpenTelemetry Collector listens on ports 4317 (gRPC) and 4318 (HTTP).
  • We write the Compose file to docker-compose.yml on the fly, ensuring everyone gets the same environment.
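If you also want a UI for browsing the ingested logs, a Kibana container can be added to the same Compose file. A minimal sketch (the Kibana version should match the Elasticsearch image; the host port 35601 is an arbitrary choice):

```yaml
  kibana:
    image: docker.elastic.co/kibana/kibana:7.10.1
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "35601:5601"
    depends_on:
      elasticsearch:
        condition: service_healthy
```

Kibana would then be reachable at http://localhost:35601 once Elasticsearch reports healthy.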

Step 2: Create otel-collector-config.yml


$otelConfig = @"
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 4000
    spike_limit_mib: 800

  # Natively, this setup does not support substituting values into index names like '[service.name]-logs-yyyy.MM',
  # so we set elasticsearch.index.prefix and elasticsearch.index.suffix instead (or use Logstash).
  resource:
    attributes:
      - action: upsert
        key: "elasticsearch.index.prefix"
        from_attribute: "service.name"

  batch:
    timeout: 10s
    send_batch_size: 10000
    send_batch_max_size: 11000

exporters:
  elasticsearch:
    endpoint: "http://elasticsearch:9200"
    headers:
      Authorization: "ApiKey ${ELASTIC_API_KEY}"
    logs_index: "logs"
    logs_dynamic_index:
      enabled: true
    mapping:
      mode: "ecs"
    timeout: 5s

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [elasticsearch]
"@

$otelConfigFile = "otel-collector-config.yml"
$otelConfig | Out-File -FilePath $otelConfigFile -Encoding UTF8

Explanation

  • Receivers: Accept OTLP logs over HTTP/gRPC.
  • Processors: Add a memory limiter, resource enrichment (e.g., elasticsearch.index.prefix), and batching.
  • Exporters: Send logs to Elasticsearch in ECS-compliant format.
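Note that this config only derives elasticsearch.index.prefix from service.name; the test payload in Step 4 supplies elasticsearch.index.suffix itself. If you would rather have the Collector stamp the suffix too, the resource processor can upsert a static value as well. A sketch, where the "-data" suffix is just an example:

```yaml
  resource:
    attributes:
      - action: upsert
        key: "elasticsearch.index.prefix"
        from_attribute: "service.name"
      - action: upsert
        key: "elasticsearch.index.suffix"
        value: "-data"
```

With logs_dynamic_index enabled, the exporter composes the final index name from prefix, logs_index, and suffix, which is how the "myservice-logs-data" index queried in Step 5 comes about.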

Step 3: Spin Up Docker Services


Write-Host "`n--- Bringing up Docker Compose services ---"

# Down any existing services
docker compose -p otel-poc -f $composeFile down

docker compose -p otel-poc -f $composeFile up -d

# Optional: Wait for Elasticsearch / Collector to be healthy
Write-Host "`nWaiting up to 60 seconds for Elasticsearch to become healthy..."
$healthCheckPassed = $false
for ($i = 1; $i -le 20; $i++) {
    # Check if ES responds on port 39200
    try {
        $result = Invoke-RestMethod -Uri "http://localhost:39200" -UseBasicParsing -ErrorAction Stop
        if ($result.name) {
            Write-Host "Elasticsearch is healthy."
            $healthCheckPassed = $true
            break
        }
    }
    catch {
        Write-Host "Still waiting..."
    }
    Start-Sleep -Seconds 3
}

if (-not $healthCheckPassed) {
    Write-Host "`nElasticsearch did not become healthy in time. Exiting..."
    exit 1
}

Explanation

  • docker compose down: Stops any existing containers with the same project name (otel-poc).
  • docker compose up -d: Spins up fresh containers in the background.
  • We then poll http://localhost:39200 to confirm Elasticsearch is ready before proceeding.
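The polling loop above only checks Elasticsearch. If you want to poll the Collector the same way, the contrib distribution ships a health_check extension that exposes a liveness endpoint; a sketch of the additions to otel-collector-config.yml (13133 is the extension's default port, and you would also need to publish it in the Compose file):

```yaml
extensions:
  health_check:
    endpoint: 0.0.0.0:13133

service:
  extensions: [health_check]
```

The `extensions:` list merges into the existing `service:` section alongside the `pipelines:` block from Step 2.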

Step 4: Send a Test Log to the Collector


Write-Host "`n--- Sending test log to the Collector ---"

# Adjust $otlpEndpoint or add Authorization headers as needed
$otlpEndpoint = "http://localhost:4318/v1/logs"
$headers = @{
    "Content-Type" = "application/json"
    # "Authorization" = "Bearer otlp-collector-token"  # If needed
}

$logPayload = @"
{
  "resourceLogs": [
    {
      "resource": {
        "attributes": [
          {
            "key": "service.name",
            "value": { "stringValue": "myservice-" }
          },
          {
            "key": "elasticsearch.index.prefix",
            "value": { "stringValue": "myservice-" }
          },
          {
            "key": "elasticsearch.index.suffix",
            "value": { "stringValue": "-data" }
          }
        ]
      },
      "scopeLogs": [
        {
          "scope": {
            "name": "test-logger",
            "version": "1.0"
          },
          "logRecords": [
            {
              "timeUnixNano": "1673452395000000000",
              "severityNumber": 9,
              "severityText": "ERROR",
              "body": { "stringValue": "This is a test log from PowerShell" },
              "attributes": [
                {
                  "key": "attribute_key",
                  "value": { "stringValue": "attribute_value" }
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
"@

try {
    $response = Invoke-RestMethod -Uri $otlpEndpoint -Method Post -Headers $headers -Body $logPayload
    Write-Host "Collector Response: $response"
} catch {
    Write-Host "Error sending log: $($_.Exception.Message)"
    exit 1
}

Explanation

  • We build a JSON payload with a single log record:
    • service.name is set to “myservice-”.
    • severityText is “ERROR” (just an example).
    • The body is a short string indicating a test log.
  • We then POST it to http://localhost:4318/v1/logs, the OTLP (HTTP) endpoint for the Collector.

Step 5: Query Elasticsearch

# --- STEP 5: Query Elasticsearch to verify log ingestion ---
# Optionally wait a bit for the log to be ingested:
# Start-Sleep -Seconds 60

Write-Host "`n--- Querying Elasticsearch to see if our log is present ---"

$esQueryBody = @{
  query = @{
    match_all = @{}
  }
}

try {
    $searchResponse = Invoke-RestMethod `
        -Method Post `
        -Uri "http://localhost:39200/myservice-logs-data/_search?pretty" `
        -ContentType "application/json" `
        -Body ($esQueryBody | ConvertTo-Json)

    $hits = $searchResponse.hits.hits

    if ($hits.Count -gt 0) {
        Write-Host "`nFound $($hits.Count) log(s) in Elasticsearch!"
        Write-Host "Here are the hits:"
        $hits | ForEach-Object { $_._source | ConvertTo-Json -Depth 10 }
    }
    else {
        Write-Host "`nNo logs found in Elasticsearch index 'myservice-logs-data'."
    }
}
catch {
    Write-Host "Error querying Elasticsearch: $($_.Exception.Message)"
}

Write-Host "`nDone!"

Explanation

  • We send a match_all query against the myservice-logs-data index.
  • If logs exist, it prints the result to the console in JSON format.
  • This confirms the log made the full journey: Collector → Elasticsearch → query results.

Summary

This step-by-step script helps you quickly validate:

  1. Elasticsearch is up and responding.
  2. OpenTelemetry Collector is receiving logs and exporting them in ECS-compatible format.
  3. Your custom log payload arrives in Elasticsearch as expected.

By breaking the script into separate blocks, you can understand each phase and modify it as needed—maybe you want to add a Kibana container, use TLS, or fine-tune the Collector’s memory settings. This quick start approach should get you up and running fast, so you can focus on observability rather than wrestling with config files.

Enjoy building out your structured logging pipeline! Drop a comment below if you have any questions or improvements. Happy logging!