Kubernetes API Server Audit Log: Policy and SIEM Pipeline

When teams talk about Kubernetes security, most of them focus on RBAC, network policy, and image signing. Those are the right places to start. But if you can’t give a clean answer to “who did what, and when?”, you’re flying blind during incident response. One of the main signal sources that closes that blind spot is the API Server audit log.

In this post my goal isn’t just “we turned audit logging on.” It’s to design an operable audit log pipeline:

Keeping noise under control (cost + signal quality)
Masking sensitive data (response body / secret leakage)
Building correlation in the SIEM (IDP + cluster + node)
Sharpening the operational runbook (validation, rollback, retention)

1) What do we actually want from the audit log?

The biggest value of the audit log is being able to tie an action that the security team flags as suspicious back to evidence:

A spike in create tokenreviews / create subjectaccessreviews
Privilege escalation moves like create/patch clusterrolebinding
“Interactive” access patterns like exec/portforward
Reads of sensitive objects like get secrets

2) Audit policy design: the “log everything” trap

There are four basic levels in Kubernetes audit:

None: no logging
Metadata: who/what/where (no body)
Request: includes the request body
RequestResponse: includes both request and response body

A practical production approach generally looks like this:

Default: Metadata
Very chatty endpoints: None
A handful of very critical actions: Request (rarely)
RequestResponse: highly exceptional (in most organizations it’s an unnecessary risk)

Sample audit policy (starting point)

This file is a “core” example of a policy; expand it to fit your environment:

apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  # 1) Cut the noise: healthz/readyz/livez
  - level: None
    nonResourceURLs:
      - "/healthz*"
      - "/readyz*"
      - "/livez*"
      - "/version"

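  # 2) Authenticated users (system components included): default Metadata
  # Note: rules are evaluated in order and the first match wins, so this broad
  # rule also catches the requests targeted by rules 3-5. The outcome is the
  # same while they all use Metadata; move it below them if you raise a level.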
  - level: Metadata
    userGroups:
      - "system:authenticated"

  # 3) Keep secret access visible: Metadata (no body)
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]

  # 4) Permission changes: Metadata + tighter monitoring
  - level: Metadata
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]

  # 5) Interactive actions: Metadata
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods/exec", "pods/portforward", "pods/attach"]

  # 6) Fallback: Metadata
  - level: Metadata
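
Rule order matters because the API server compares each event against the rules in order and the first matching rule sets the level. The sketch below is a heavily simplified evaluator, not the real matcher: the rule dictionaries only carry the selectors used in this post, subresources and verbs are ignored, and the Request level on the clusterrolebindings rule is a hypothetical change made only to show the shadowing effect.

```python
from fnmatch import fnmatchcase

# Simplified copy of the policy above. The "Request" level on the RBAC rule
# is hypothetical; the sample policy uses Metadata everywhere.
RULES = [
    {"level": "None", "nonResourceURLs": ["/healthz*", "/readyz*", "/livez*", "/version"]},
    {"level": "Metadata", "userGroups": ["system:authenticated"]},
    {"level": "Request", "resources": [
        {"group": "rbac.authorization.k8s.io", "resources": ["clusterrolebindings"]}]},
    {"level": "Metadata"},
]


def rule_matches(rule, event):
    """Return True if every selector present on the rule matches the event."""
    ref = event.get("objectRef")
    if "nonResourceURLs" in rule:
        uri = event.get("requestURI", "")
        # Non-resource rules only apply to requests without an object reference.
        # fnmatchcase is an approximation of the trailing-"*" matching.
        if ref or not any(fnmatchcase(uri, p) for p in rule["nonResourceURLs"]):
            return False
    if "userGroups" in rule:
        groups = (event.get("user") or {}).get("groups") or []
        if not set(groups) & set(rule["userGroups"]):
            return False
    if "resources" in rule:
        if not ref:
            return False
        if not any(ref.get("apiGroup", "") == r["group"]
                   and ref.get("resource") in r["resources"]
                   for r in rule["resources"]):
            return False
    return True


def effective_level(rules, event):
    """First matching rule wins; with no match the event is not logged."""
    for rule in rules:
        if rule_matches(rule, event):
            return rule["level"]
    return "None"


binding_event = {
    "verb": "create",
    "user": {"username": "alice", "groups": ["system:authenticated"]},
    "objectRef": {"apiGroup": "rbac.authorization.k8s.io",
                  "resource": "clusterrolebindings", "name": "ops-admin"},
    "requestURI": "/apis/rbac.authorization.k8s.io/v1/clusterrolebindings",
}

# The broad system:authenticated rule shadows the more specific RBAC rule.
shadowed = effective_level(RULES, binding_event)
# Moving the specific rule above the broad one lets it take effect.
reordered = effective_level([RULES[0], RULES[2], RULES[1], RULES[3]], binding_event)
```

In this sketch shadowed is "Metadata" and reordered is "Request", which is why specific rules belong above broad ones as soon as their levels differ.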

3) The pipeline: API Server → collector → SIEM

The most stable approach is to write the audit log to the node disk, pick it up with an agent, normalize it, and ship it.

A simple architecture:

API Server: audit file output (rotation enabled)
Node: log collector (Vector/Fluent Bit/Alloy) tailing the file
Pipeline: parse + normalize + redaction
SIEM: index + correlation + alerting
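
As a rough illustration of the parse + redaction step, here is a minimal sketch that assumes the collector hands you one JSON audit event per line. requestObject and responseObject are the audit event fields that carry bodies; the SENSITIVE_RESOURCES set and the redact_line name are my own assumptions, and in practice this logic would live in your collector's transform language rather than in a standalone script.

```python
import json

# Hypothetical list of resources whose bodies should never reach the SIEM.
SENSITIVE_RESOURCES = {"secrets", "tokenreviews"}
BODY_FIELDS = ("requestObject", "responseObject")


def redact_line(line):
    """Parse one audit log line and replace sensitive bodies with a marker."""
    event = json.loads(line)
    resource = (event.get("objectRef") or {}).get("resource")
    if resource in SENSITIVE_RESOURCES:
        for field in BODY_FIELDS:
            if field in event:
                # Keep the key so it stays visible that a body was logged.
                event[field] = {"redacted": True}
    return event


sample = json.dumps({
    "verb": "create",
    "objectRef": {"resource": "secrets", "namespace": "prod", "name": "db"},
    "requestObject": {"kind": "Secret", "data": {"password": "c2VjcmV0"}},
})
clean = redact_line(sample)
```

With a Metadata-only policy this step should be a no-op. It is there as a second line of defense in case someone raises a level later.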

Normalization: the fields a SIEM likes

Standardizing the following fields will pay off a lot when it comes to search and correlation in the SIEM:

cluster: a stable identifier such as prod-eu-1
user.username, user.groups
sourceIPs (the real client IP if the LB/ingress carries it)
verb, objectRef.resource, objectRef.namespace, objectRef.name
responseStatus.code
requestURI
userAgent
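
The field list above can be sketched as a small flattening function. The input keys are the ones the audit event already carries; the dotted output key names and the cluster argument are assumptions you would adapt to your SIEM's schema.

```python
def normalize(event, cluster):
    """Flatten one parsed audit event into the fields listed above."""
    user = event.get("user") or {}
    ref = event.get("objectRef") or {}
    status = event.get("responseStatus") or {}
    return {
        "cluster": cluster,  # injected by the pipeline, not present in the event
        "user.username": user.get("username"),
        "user.groups": sorted(user.get("groups") or []),
        "sourceIPs": event.get("sourceIPs") or [],
        "verb": event.get("verb"),
        "objectRef.resource": ref.get("resource"),
        "objectRef.namespace": ref.get("namespace"),
        "objectRef.name": ref.get("name"),
        "responseStatus.code": status.get("code"),
        "requestURI": event.get("requestURI"),
        "userAgent": event.get("userAgent"),
    }


record = normalize(
    {
        "verb": "get",
        "user": {"username": "alice", "groups": ["system:authenticated", "dev"]},
        "sourceIPs": ["10.0.0.7"],
        "objectRef": {"resource": "secrets", "namespace": "prod", "name": "db"},
        "responseStatus": {"code": 200},
        "requestURI": "/api/v1/namespaces/prod/secrets/db",
        "userAgent": "kubectl/v1.29",
    },
    cluster="prod-eu-1",
)
```

Missing fields come out as None or an empty list rather than being dropped, so every record has the same shape and SIEM queries don't need existence checks.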

4) Retention and cost: don’t burn the log

Audit log cost grows quickly. So:

Drop noisy non-resource URLs to None
Omit the RequestReceived stage (most of the time it’s not required for correlation)
Alert only on the critical rules; keep everything else searchable without alerting on it
Tier retention: hot (7-14 days), cold (30-90 days), archive (per compliance need)
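
The retention tiers can be expressed as a simple age check. This is only a sketch: I picked the upper ends of the ranges above (14 and 90 days) as cutoffs, and in most stacks the index lifecycle policy of the SIEM does this work rather than custom code.

```python
from datetime import datetime, timedelta, timezone

# Assumed cutoffs, taken from the upper ends of the ranges above.
HOT_DAYS = 14
COLD_DAYS = 90


def retention_tier(event_time, now):
    """Return the storage tier for an event based on its age."""
    age = now - event_time
    if age <= timedelta(days=HOT_DAYS):
        return "hot"
    if age <= timedelta(days=COLD_DAYS):
        return "cold"
    return "archive"


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
tiers = [retention_tier(now - timedelta(days=d), now) for d in (3, 30, 200)]
```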

5) Alarm ideas (things that work in the field)

Practical signals to start with:

clusterrolebinding create/patch (especially cluster-admin bindings)
A jump in secrets list/get (per-user anomaly)
Repeated pods/exec (especially in production namespaces)
A flood of tokenreviews (suspicious auth probing)
Actions performed by a “break-glass” user (expected, but should be visible)
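
A few of these signals can be prototyped on parsed events before they are turned into SIEM rules. The sketch below assumes Metadata-level events, so it cannot tell whether a binding points at cluster-admin (that needs the request body). The function names and the threshold are hypothetical.

```python
from collections import Counter


def is_clusterrolebinding_change(event):
    """create/patch on clusterrolebindings."""
    ref = event.get("objectRef") or {}
    return (ref.get("resource") == "clusterrolebindings"
            and event.get("verb") in {"create", "patch"})


def is_pod_exec(event):
    """exec shows up as the 'exec' subresource of pods."""
    ref = event.get("objectRef") or {}
    return ref.get("resource") == "pods" and ref.get("subresource") == "exec"


def noisy_secret_readers(events, threshold):
    """Users with more than `threshold` secrets get/list calls in the batch."""
    counts = Counter(
        (e.get("user") or {}).get("username")
        for e in events
        if (e.get("objectRef") or {}).get("resource") == "secrets"
        and e.get("verb") in {"get", "list"}
    )
    return sorted(user for user, n in counts.items() if n > threshold)


events = (
    [{"verb": "get", "user": {"username": "bob"},
      "objectRef": {"resource": "secrets"}}] * 5
    + [{"verb": "get", "user": {"username": "alice"},
        "objectRef": {"resource": "secrets"}}]
    + [{"verb": "create", "user": {"username": "alice"},
        "objectRef": {"resource": "pods", "subresource": "exec"}}]
    + [{"verb": "patch", "user": {"username": "bob"},
        "objectRef": {"resource": "clusterrolebindings"}}]
)

suspects = noisy_secret_readers(events, threshold=3)
exec_count = sum(is_pod_exec(e) for e in events)
binding_changes = sum(is_clusterrolebinding_change(e) for e in events)
```

A fixed threshold is the crudest possible form of the per-user anomaly. A baseline per user and per time window would be the natural next step once the data is in the SIEM.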

6) Validation and runbook

A mini runbook for rolling changes out safely:

Activate the policy on a canary control-plane node (where possible)
Watch log volume and SIEM ingestion for 15-30 minutes
Confirm there are no secret body leaks (sample search: "kind":"Secret" together with "data":)
Identify the noisy points and set them to None/Metadata in the policy
Have a rollback path ready: revert the policy file + apiserver restart procedure
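
The secret-leak check in the runbook can also be run against a sample of the audit file before the logs ever reach the SIEM. A minimal sketch, assuming one JSON event per line; the function names are my own, and it only looks at the two body fields of the audit event.

```python
import json


def _has_secret_data(body):
    """True if a logged body is a Secret (or SecretList) carrying data."""
    if not isinstance(body, dict):
        return False
    if body.get("kind") == "Secret":
        return "data" in body or "stringData" in body
    if body.get("kind") == "SecretList":
        return any(
            isinstance(item, dict) and ("data" in item or "stringData" in item)
            for item in body.get("items") or []
        )
    return False


def leaks_secret_body(line):
    """Check one audit log line for a secret body in request or response."""
    event = json.loads(line)
    return any(_has_secret_data(event.get(field))
               for field in ("requestObject", "responseObject"))


def count_leaks(lines):
    """Count leaking lines, skipping blank ones."""
    return sum(1 for line in lines if line.strip() and leaks_secret_body(line))


sample_lines = [
    json.dumps({"verb": "get", "objectRef": {"resource": "secrets"}}),
    json.dumps({"verb": "get", "objectRef": {"resource": "secrets"},
                "responseObject": {"kind": "Secret", "data": {"k": "dg=="}}}),
    "",
]
leak_count = count_leaks(sample_lines)
```

Anything above zero here means a rule is logging at Request or RequestResponse for secrets, and the policy change should be rolled back before it goes further.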

Designed well, the audit log is a “signal generation pipeline” for the security team and an “operational memory” for the platform team. Designed badly, it just produces cost and risk. So build policy + pipeline + runbook together.
