When the firewall is enabled, clawker’s control plane runs netlogger, a userspace pipeline that drains an eBPF ringbuf populated by the cgroup/connect/sendmsg/sock_create programs and emits one OTLP log record per egress decision. Every connect, sendmsg, and socket-create call from a managed agent container produces a record carrying the kernel’s verdict (allowed / denied / bypassed), the container’s attribution (agent, project, container_id), the destination 4-tuple, and the resolved domain when DNS context is available.
The headline use is bypass-mode forensic coverage. The firewall’s bypass switch (clawker firewall bypass <duration> --agent <name>) intentionally short-circuits enforcement so the operator can perform supervised exploration without rule churn. Before netlogger, bypassed traffic flowed without leaving an enforcement record — the so-called “forensic black hole” of bypass mode. netlogger emits a verdict=bypassed record at the same decision points that would have emitted allowed/denied, so an audit trail exists for every bypass window without changing enforcement semantics.
Record Shape
Each record is an OTel log emitted on the trusted infra OTLP lane with:
- service.name = ebpf-egress (distinct from clawker-cp so retention + volume profile are independent)
- event.name = ebpf.egress.connect / ebpf.egress.sendmsg / ebpf.egress.sock_create (per-emit-site so dashboards can filter by record kind without inspecting flag bits)
- body = "ebpf egress"
- severity = INFO
Attributes (dst_ip / dst_port / dst_host are omitted when their source value is absent, so operators can partition via _exists_:attributes.<key> in OS Discover):
| Attribute | Type | Description |
|---|---|---|
| verdict | keyword | allowed, denied, or bypassed |
| container_id | keyword | Docker container ID (empty if cgroup_id not yet in label cache) |
| agent | keyword | dev.clawker.agent label (empty if cache miss) |
| project | keyword | dev.clawker.project label (empty for global-scope agents or cache miss) |
| cgroup_id | keyword | Kernel cgroup ID; the trust anchor used for attribution lookup |
| bpf_ts_ns | long | Kernel monotonic timestamp at the moment of decision (bpf_ktime_get_ns) |
| dst_ip | ip | Destination address in IPv4 dotted-quad or IPv6 colon form. Mapped as type: ip, accepts both. Omitted on sock_create records (no_dst=true); operators filter via NOT _exists_:attributes.dst_ip |
| dst_port | keyword | Destination port (host byte order). Omitted on sock_create records (no_dst=true) |
| l4_proto | keyword | stream / dgram / raw (human-readable form of SOCK_*) |
| l4_proto_code | integer | Raw SOCK_* constant in case operators need to filter on a code that doesn’t have a string form yet |
| ipv6 | boolean | Native IPv6 destination, with the full 16-byte v6 address carried in dst_ip. Denied by default (only allowed during a bypass) |
| ipv4_mapped | boolean | ::ffff:x.x.x.x IPv4-mapped IPv6 address (the dual-stack default for most clients) |
| no_dst | boolean | Socket-creation event with no destination (sock_create program). dst_ip and dst_port are omitted on these records |
| dst_host | keyword | Resolved domain string. Populated for every record whose destination IP was resolved via the managed CoreDNS under a firewall-allowed zone. Omitted for direct-IP connects (operators filter via NOT _exists_:attributes.dst_host; see Domain Resolution) |
| domain_hash | keyword | BPF-side identity (FNV-1a of normalized domain). Correlates userspace records with BPF dns_cache / route_map entries when dst_host is empty (direct-IP connect, rule removed mid-flight, stale dnsbpf entry) |
sock_create carve-out above where BPF carries no destination.
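The omission rules in the table can be sketched in Go. This is a minimal illustration, not clawker's implementation: the egressEvent type, the attributes function, and the choice to derive no_dst from a missing destination are assumptions made for the example, and only a subset of the attributes is shown. The SOCK_* values are the Linux constants.

```go
package main

import (
	"fmt"
	"net/netip"
	"strconv"
)

// Linux SOCK_* values for the three socket types the table names.
const (
	sockStream = 1 // SOCK_STREAM
	sockDgram  = 2 // SOCK_DGRAM
	sockRaw    = 3 // SOCK_RAW
)

// egressEvent is a hypothetical decoded ringbuf event, not clawker's real type.
type egressEvent struct {
	Verdict  string // "allowed", "denied", or "bypassed"
	CgroupID uint64
	SockType int        // raw SOCK_* constant
	Dst      netip.Addr // zero value when the event carries no destination
	DstPort  uint16
	DstHost  string // empty when no domain was resolved
}

// l4ProtoName returns the human-readable form, or "" for an unknown code.
func l4ProtoName(code int) string {
	switch code {
	case sockStream:
		return "stream"
	case sockDgram:
		return "dgram"
	case sockRaw:
		return "raw"
	}
	return ""
}

// attributes builds the attribute set, omitting dst_ip, dst_port, and
// dst_host when their source value is absent.
func attributes(ev egressEvent) map[string]any {
	hasDst := ev.Dst.IsValid()
	mapped := hasDst && ev.Dst.Is4In6()
	attrs := map[string]any{
		"verdict":       ev.Verdict,
		"cgroup_id":     strconv.FormatUint(ev.CgroupID, 10),
		"l4_proto":      l4ProtoName(ev.SockType),
		"l4_proto_code": ev.SockType,
		"ipv6":          hasDst && ev.Dst.Is6() && !mapped,
		"ipv4_mapped":   mapped,
		"no_dst":        !hasDst, // assumption: derived from the missing destination
	}
	if hasDst {
		attrs["dst_ip"] = ev.Dst.String()
		attrs["dst_port"] = strconv.Itoa(int(ev.DstPort))
	}
	if ev.DstHost != "" {
		attrs["dst_host"] = ev.DstHost
	}
	return attrs
}

// sampleEvents returns one sock_create-style event and one connect-style event.
func sampleEvents() []egressEvent {
	return []egressEvent{
		{Verdict: "bypassed", CgroupID: 4242, SockType: sockStream},
		{Verdict: "allowed", CgroupID: 4242, SockType: sockStream,
			Dst: netip.MustParseAddr("::ffff:192.0.2.10"), DstPort: 443, DstHost: "example.com"},
	}
}

func main() {
	for _, ev := range sampleEvents() {
		fmt.Println(attributes(ev))
	}
}
```

The first sample has no dst_ip, dst_port, or dst_host keys at all, which is what makes the _exists_ filters in the table usable.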
Where Records Land
Records flow:clawker-opensearch-bootstrap one-shot service every time clawker monitor up runs. The retention policy (default 7 days, throwaway-stack semantics) auto-attaches via the same ISM policy that covers the other clawker indices. Cross-index queries against clawker-cp,clawker-envoy,clawker-coredns,clawker-ebpf-egress work out of the box — ingest_source is stamped on every record for filtering.
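As an illustration of a cross-index query, the following Go sketch builds the request path and body for an OpenSearch `_search` call using the query_string query type. The index names and the attributes.<key> field prefix come from this page; the helper names, the result size, and the specific filter are assumptions chosen for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// clawkerIndices lists the indices named on this page.
var clawkerIndices = []string{
	"clawker-cp", "clawker-envoy", "clawker-coredns", "clawker-ebpf-egress",
}

// searchPath returns the multi-index _search path for the given indices.
func searchPath(indices []string) string {
	return "/" + strings.Join(indices, ",") + "/_search"
}

// queryBody wraps a Lucene-style query string in a query_string search body.
func queryBody(luceneQuery string, size int) ([]byte, error) {
	body := map[string]any{
		"size": size,
		"query": map[string]any{
			"query_string": map[string]any{"query": luceneQuery},
		},
	}
	return json.Marshal(body)
}

func main() {
	// Hypothetical filter: bypassed records that carry a destination address.
	b, err := queryBody("attributes.verdict:bypassed AND _exists_:attributes.dst_ip", 50)
	if err != nil {
		panic(err)
	}
	fmt.Println("POST", searchPath(clawkerIndices))
	fmt.Println(string(b))
}
```

The same query string can be pasted into Discover directly; the JSON body is only needed when calling the search API.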
Per-Connection Bytes and Duration
Not in this stream. netlogger records the decision: the moment the kernel approved, denied, or bypassed an outbound connection. Byte counts and durations belong to the L7 proxy lifecycle, not the decision point. For verdict=allowed records, the matching Envoy access log carries bytes_sent, bytes_received, duration_ms. Operators pivot from a netlogger record to the corresponding Envoy record by 5-tuple at query time. For verdict=denied records there are no bytes to record because no traffic flowed. For verdict=bypassed records, only the netlogger record exists, since Envoy and CoreDNS enforcement are skipped under bypass by design.
Sock_ops-based per-connection byte tracking inside BPF is not on this stream’s roadmap. It would double the BPF surface area, leave UDP/connectionless flows without an analogous signal, and overlap with Envoy’s access-log emission for the cases where it matters.
Domain Resolution
dst_host is populated for every record whose destination IP came from a dnsbpf-resolved A record under a firewall-allowed zone. The translation is control-plane-driven: the BPF dns_cache map stores {domain_hash, expire_ts} keyed by IPv4, and netlogger’s reverse-DNS map maintains the inverse hash → domain table by hashing the live set of firewall rule destinations + internal hosts (docker.internal + monitoring service hostnames) on a 5-second refresh tick. The hash function (internal/controlplane/firewall/ebpf.DomainHash — FNV-1a) is the same one dnsbpf computes when it writes dns_cache, so the two sides agree on the identity by construction.
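The reverse lookup described above can be sketched with Go's standard hash/fnv package. This is an approximation under stated assumptions: domainHash stands in for ebpf.DomainHash and normalizes only by lowercasing, the 32-bit width follows the Current Limitations section, and buildReverseMap and resolveHost are hypothetical names. The real normalization may do more.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// domainHash computes FNV-1a (32-bit) over the lowercased domain.
// Stand-in for ebpf.DomainHash; the real normalization may differ.
func domainHash(domain string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(strings.ToLower(domain)))
	return h.Sum32()
}

// buildReverseMap hashes the live set of domains into a hash -> domain table.
// In the pipeline described above this would be rebuilt on each refresh tick.
func buildReverseMap(domains []string) map[uint32]string {
	rev := make(map[uint32]string, len(domains))
	for _, d := range domains {
		rev[domainHash(d)] = strings.ToLower(d)
	}
	return rev
}

// resolveHost returns the domain for a BPF-side hash, if it is still known.
func resolveHost(rev map[uint32]string, hash uint32) (string, bool) {
	host, ok := rev[hash]
	return host, ok
}

func main() {
	rev := buildReverseMap([]string{"api.github.com", "Registry.NPMJS.org"})

	// A hash written by the DNS side for a known domain resolves to a name.
	if host, ok := resolveHost(rev, domainHash("registry.npmjs.org")); ok {
		fmt.Println("dst_host:", host)
	}
	// A hash for a domain outside the rule set yields no dst_host.
	if _, ok := resolveHost(rev, domainHash("unknown.example")); !ok {
		fmt.Println("dst_host omitted")
	}
}
```

Because both sides apply the same function to the same normalized string, no domain text has to cross the kernel boundary; only the fixed-width hash does.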
dst_host will be empty when:
- The destination IP was reached without DNS resolution (direct-IP connect).
- The IP was resolved through a path other than the managed CoreDNS (e.g., an /etc/hosts entry inside the agent container).
- A rule was removed and netlogger hasn’t yet refreshed (worst case: 5 seconds of stale records on the previously-allowed domain).
pkg/fqdn/namemanager pattern. At deployment-typical rule-set sizes (single-digit-to-hundreds of firewall-rule domains), the floor is operationally irrelevant.
Reliability
netlogger is engineered to fail open with respect to the firewall: enforcement runs whether or not netlogger is healthy.
- BPF token-bucket rate limiter keyed by cgroup_id (burst 64, refill 64 tokens/100ms ⇒ ~640 records/sec/cgroup ceiling). A misbehaving container cannot monopolize the ringbuf; throttled events are counted in ratelimit_drops, keyed by the noisy cgroup.
- Kernel-fault drop counter (events_drops, PERCPU_ARRAY) bumps when bpf_ringbuf_reserve returns NULL on a full buffer. It is kept distinct from rate-limit drops so the operator response is different (ringbuf size vs. noisy-agent triage).
- Userspace queue between the ringbuf reader and the processor is bounded with drop-newest semantics; the reader never blocks on the consumer. Drops are counted in clawker_netlogger_queue_dropped_total (Prom counter declared; scrape exposure is not wired).
- Circuit breaker wraps the OTLP exporter: three consecutive Export() failures permanently trip the breaker for the rest of the CP lifetime. Records drop on the floor afterward; the BatchProcessor queue drains via the SDK’s own drop-oldest path. There is no background reconnect; telemetry availability is binary per-CP-lifetime by design. Operator response: restart CP after fixing the collector.
- Preflight TLS dial runs at CP boot with a 20-second deadline against the configured OTLP endpoint. Failure degrades netlogger to a no-op for the rest of the CP lifetime and emits event=netlogger_unavailable (warn for “no endpoint configured”, error for actual failures like cert problems or unreachable collector). The firewall, AdminService, agent dispatch, and registry are unaffected; netlogger’s failure is contained.
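Two of the userspace mechanisms above, the bounded drop-newest queue and the permanently tripping circuit breaker, can be sketched in a few lines of Go. All type and function names here are hypothetical, the queue capacity is arbitrary, and the sketch assumes a single producer goroutine, so it says nothing about how clawker synchronizes its counters.

```go
package main

import (
	"errors"
	"fmt"
)

// dropNewestQueue is a bounded queue whose producer never blocks:
// when the buffer is full, the incoming item is dropped and counted.
type dropNewestQueue struct {
	ch      chan string
	dropped int // single-producer sketch; a real counter would be atomic
}

func newDropNewestQueue(capacity int) *dropNewestQueue {
	return &dropNewestQueue{ch: make(chan string, capacity)}
}

// offer enqueues without blocking and reports whether the item was accepted.
func (q *dropNewestQueue) offer(item string) bool {
	select {
	case q.ch <- item:
		return true
	default:
		q.dropped++
		return false
	}
}

var errBreakerOpen = errors.New("breaker open: record dropped")

// breaker trips permanently after threshold consecutive export failures.
type breaker struct {
	threshold   int
	consecutive int
	tripped     bool
}

// do runs export unless the breaker has tripped; a success resets the streak.
func (b *breaker) do(export func() error) error {
	if b.tripped {
		return errBreakerOpen
	}
	if err := export(); err != nil {
		b.consecutive++
		if b.consecutive >= b.threshold {
			b.tripped = true // no reset path: stays open for the process lifetime
		}
		return err
	}
	b.consecutive = 0
	return nil
}

func main() {
	q := newDropNewestQueue(2)
	for _, r := range []string{"r1", "r2", "r3"} {
		q.offer(r)
	}
	fmt.Println("queued:", len(q.ch), "dropped:", q.dropped)

	b := &breaker{threshold: 3}
	failing := func() error { return errors.New("collector unreachable") }
	for i := 0; i < 3; i++ {
		_ = b.do(failing)
	}
	fmt.Println("tripped:", b.tripped)
	fmt.Println("after trip:", b.do(func() error { return nil }))
}
```

The non-blocking select is what keeps the ringbuf reader from stalling on a slow consumer, and the absence of any reset path in the breaker mirrors the binary per-CP-lifetime behavior described in the list.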
Trust Lane
netlogger emits on the trusted infra lane: the same OTLP/gRPC + mTLS path the CP zerolog bridge, the Envoy access logger, and the CoreDNS otel plugin use. Identity reuse:
- Cert: per-handshake ephemeral leaf minted by otelcerts.Service (LoadTLSConfig("netlogger")), chained through the infra intermediate CA, not the CLI root.
- Endpoint: the collector’s otlp/infra receiver on OtelInfraPort (not the unauth’d otel-collector:4317 agent lane).
- The OTLP endpoint must be https:// (or bare host:port). A plaintext http:// endpoint is rejected at boot: pushing infra telemetry over plaintext would smuggle records onto the agent-lane receiver, defeating the trust-lane separation.
service.name=ebpf-egress records onto the trusted index — they don’t hold a leaf chained through the infra intermediate, so the receiver’s TLS handshake fails the chain check. The strict-directive promise (every field on every record, no discretion) only delivers if the records actually originate from the CP — the mTLS boundary is what makes that promise enforceable.
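The endpoint rule (https:// or bare host:port accepted, http:// rejected) can be illustrated with a small validation function. This is a sketch under assumptions: validateInfraEndpoint and the error values are hypothetical names, the hostnames and ports in main are placeholders, and the real boot-time check may apply additional rules.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

var (
	errEmptyEndpoint     = errors.New("no endpoint configured")
	errPlaintextEndpoint = errors.New("plaintext http:// endpoint rejected")
	errUnsupportedScheme = errors.New("unsupported endpoint scheme")
)

// validateInfraEndpoint accepts https:// URLs and bare host:port values,
// and rejects plaintext http:// endpoints.
func validateInfraEndpoint(endpoint string) error {
	ep := strings.TrimSpace(endpoint)
	lower := strings.ToLower(ep)
	switch {
	case ep == "":
		return errEmptyEndpoint
	case strings.HasPrefix(lower, "http://"):
		return errPlaintextEndpoint
	case strings.HasPrefix(lower, "https://"):
		return nil
	case strings.Contains(ep, "://"):
		return errUnsupportedScheme
	}
	// No scheme: require a well-formed host:port pair.
	_, _, err := net.SplitHostPort(ep)
	return err
}

func main() {
	// Placeholder endpoints for illustration only.
	for _, ep := range []string{
		"https://collector.example:4317",
		"collector.example:4317",
		"http://otel-collector:4317",
	} {
		fmt.Printf("%-32s -> %v\n", ep, validateInfraEndpoint(ep))
	}
}
```

Rejecting the plaintext scheme before any dial is attempted keeps a misconfigured endpoint from ever sending infra records to the unauthenticated receiver.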
Configuration
netlogger inherits its endpoint from the standard OTEL_EXPORTER_OTLP_ENDPOINT resolution path used by the CP zerolog bridge: when clawker monitor up is running, the CP boot sequence wires the collector’s OtelInfraPort automatically. No netlogger-specific knobs ship; the BatchProcessor sizing, retry cap (10s vs. SDK default 1 min), and circuit-breaker threshold are CP-level constants.
To point netlogger at a custom collector (for a centralized SIEM, or to bypass the local stack entirely), override OTEL_EXPORTER_OTLP_ENDPOINT in the CP container’s environment and ensure the receiver presents a server cert chained through the infra intermediate. Plaintext endpoints are rejected by design — see Trust Lane above.
Current Limitations
- FNV-1a 32-bit identity: the BPF data path uses an FNV-1a hash of the lowercased domain as a fixed-width identity in dns_cache and route_map. Collision-vulnerable in theory; harmless in practice at deployment-typical rule-set sizes. The route-identity-allocator follow-up replaces this with userspace-allocated sequential u32 identities (Cilium pattern).
- Prom counters not scraped: the clawker_netlogger_* counters are declared but not wired into a /metrics endpoint. The structured CP log surface is the operational signal for throughput, queue drops, parse errors, and OTLP export success/error. Kernel-side drop counters (events_drops, ratelimit_drops) live on the firewall/eBPF subsystem surface; they are subsystem health, not security telemetry, and intentionally don’t ride on the netlogger OTel stream.
See Also
- Firewall: the source of every record on this stream (decision points, bypass mechanics, rule lifecycle).
- Monitoring: the OpenSearch + OpenSearch Dashboards + Prometheus stack that hosts the clawker-ebpf-egress index.