Resilience Engineering for Local Docker Stacks

Break your system before production does.

Chaos-Dock is a Go + Bubble Tea CLI that finds running containers, injects controlled faults, and validates recovery behavior with OS-level traffic controls.

chaos-dock session
$ chaos-dock -list
containers: api postgres redis

$ chaos-dock -init-config -config chaos.yaml
status: starter config written

$ chaos-dock -validate-config -config chaos.yaml
status: config validation successful

$ chaos-dock -run-scheduled -config chaos.yaml
status: network-latency + kill experiments running

$ chaos-dock -panic -targets "postgres,api"
status: reverted latency + restarted 1 container

Go + Docker SDK

Built on the Docker SDK for Go, with typed interfaces and typed fault errors.

Linux Namespace Injection

Uses the container's PID and nsenter to run tc netem inside the target's network namespace.

Scheduler + Panic Safety

Supports recurring experiments with jitter and panic-button rollback with tracked targets.

Core Capabilities

Fault injection with guardrails

Config Bootstrap + Validation

Create starter experiment files and validate them before execution.

Container Discovery

Use -list to enumerate running workloads with container IDs, image names, and status.

Latency Injection

Inject a fixed delay with Linux traffic control (tc netem) and remove the qdisc on command.

Kill Fault Injector

Terminate a target with validated signals such as SIGTERM and SIGKILL.

Experiment as Code

Define repeatable scenarios in chaos.yaml and schedule recurring fault experiments.

Run Once / Scheduled Modes

Execute one-shot experiments for CI validation or continuous loops for reliability drills.

Emergency Recovery

A single panic action reverts injected latency and restarts the target containers, whether they are passed explicitly or tracked by the tool.

How It Works

OS-level mutation, app-level validation

  1. The Docker SDK enumerates and inspects running containers to discover their IDs and State.Pid.
  2. The network injector enters the target's namespaces with nsenter --target <pid> --net --mount.
  3. Inside the namespace, it runs tc qdisc replace dev eth0 root netem delay 500ms through exec.CommandContext.
  4. The kill injector sends validated Unix signals (SIGTERM, SIGKILL, etc.) through the Docker API.
  5. The panic button removes the qdisc and restarts impacted containers using registry-aware rollback.

Clean Architecture

Domain-first design for a production OSS tool

Domain

FaultInjector contracts, config model, and typed fault errors.

Application

Experiment runner, recurring scheduler, panic-button orchestration, Bubble Tea UI model.

Infrastructure

Docker runtime adapter, YAML loader, namespace tc injector, and signal-based kill injector.

Quickstart

Run your first experiment

Sample chaos.yaml

experiments:
  - name: db-latency
    targetContainer: postgres
    enabled: true
    fault:
      type: network-latency
      delay: 500ms
    schedule:
      every: 60s
      jitter: 5s

  - name: kill-db
    targetContainer: postgres
    enabled: true
    fault:
      type: kill
      signal: SIGTERM
    schedule:
      every: 120s

Run Commands

git clone https://github.com/lekhanpro/chaos-dock.git
cd chaos-dock
go run ./cmd/chaos-dock -init-config -config chaos.yaml
go run ./cmd/chaos-dock -validate-config -config chaos.yaml
go run ./cmd/chaos-dock -list
go run ./cmd/chaos-dock -run-once -config chaos.yaml
go run ./cmd/chaos-dock -run-scheduled -config chaos.yaml
go run ./cmd/chaos-dock -panic -targets "postgres,api"
# quick script wrapper
./scripts/run-local.sh init chaos.yaml
./scripts/run-local.sh validate chaos.yaml
./scripts/run-local.sh run-once chaos.yaml

Requires a Linux host and sufficient privileges (typically root) to enter container namespaces with nsenter and run tc.