SadServers Linux & DevOps Troubleshooting Scenarios

Linux & Bash

  • Linux commands, Bash scripting
  • Systemd
  • Networking, DNS
  • Storage
  • SSH, Firewall
  • Libraries
  • Cron and more...

Web Servers

  • Nginx
  • Apache
  • HAProxy
  • Caddy
  • Gunicorn
  • uWSGI
  • HTTPS/TLS

Databases

  • PostgreSQL
  • MySQL
  • SQLite
  • Redis
  • ClickHouse
  • MongoDB
  • etcd

Data Processing

  • CSV
  • JSON
  • SQL queries

Docker

  • Building images
  • Multi-stage builds
  • Volumes
  • Networks
  • Docker Compose
  • Podman

Kubernetes

  • kubectl
  • Helm
  • K8s Roles & Permissions
  • Services
  • Namespaces
  • Deployments, StatefulSets
  • ConfigMaps, Secrets

Infrastructure As Code

  • Ansible
  • Terraform

Observability

  • ELK
  • Prometheus

Tooling / Applications

  • Git
  • RabbitMQ
  • Envoy
  • Vault
  • Harbor
  • Jenkins

Hacking

  • Capture the Flag (CTF) Challenges
  • Code Vulnerabilities
  • Privilege Escalation

Languages

  • Python
  • Golang
  • PHP
  • Java
  • Node.js
  • C

Easy

# Name Time Type
1 "Bologna": counting ELB 5xx errors 15 m Do New

Scenario: "Bologna": counting ELB 5xx errors

Level: Easy

Type: Do

Access: Email

Description: Operations handed you a classic AWS Elastic Load Balancer access log at /home/admin/elb.log. Each line is one request. Fields are space-separated; the quoted HTTP request starts at field 12, so the fields before it always occupy the same positions.

Field 8 is the ELB status code and field 9 is the backend status code returned by the target instance. Count how many log lines have a backend status code in the 5xx range (500 through 599). Write that integer — digits only — to /home/admin/solution.txt. For example: echo 42 > ~/solution.txt

The log mixes successful responses, redirects, client errors, and server errors; only backend 5xx responses count toward your answer.

Test: The MD5 checksum of your answer file (md5sum /home/admin/solution.txt) is b73ce398c39f506af761d2277d853a92. The correct count followed by a trailing newline is also accepted.
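
The counting step described above can be sketched with awk. This is a minimal sketch, not the official solution: the function name count_backend_5xx is hypothetical, and it assumes, as the description states, that field 9 is the backend status code and that fields are separated by whitespace.

```shell
# count_backend_5xx FILE: count lines whose 9th whitespace-separated field
# (the backend status code, per the scenario description) is 500-599.
count_backend_5xx() {
    # The anchored regex rejects values such as "-", "0" or "5000".
    # "n + 0" prints 0 instead of an empty string when nothing matches.
    awk '$9 ~ /^5[0-9][0-9]$/ { n++ } END { print n + 0 }' "$1"
}

# Possible use on the scenario host (paths taken from the description):
# count_backend_5xx /home/admin/elb.log > /home/admin/solution.txt
```

Redirecting the output this way leaves a trailing newline in the file, which the test says is accepted.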

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 15 minutes.

2 "Genova": cgroups problem 15 m Fix New

Scenario: "Genova": cgroups problem

Level: Easy

Type: Fix

Access: Email

Description: This small VM runs sad-api (a lightweight health endpoint on port 9090) and sad-batch (a nightly ETL-style job that allocates a lot of RAM).

After a recent deploy, starting sad-batch caused memory use to spike and sad-api was killed by the OOM killer. On-call stopped the batch service before handing you the host.

A legacy cgroup v2 launcher under /opt/sad/ is supposed to enforce a 128M hard limit on cgroup sad-batch, but the cap never applies.

sad-batch is intentionally stopped and disabled when you log in. Read /home/admin/incident-notes.txt for context. Fix the cgroup configuration so /sys/fs/cgroup/sad-batch/memory.max is 134217728 before you start the batch job again.

Do not change sad-api; it should keep running on 127.0.0.1:9090.

Test: sad-api is active and curl http://127.0.0.1:9090/ returns SadServers - API OK.

The cgroup v2 hard limit is in place: cat /sys/fs/cgroup/sad-batch/memory.max prints 134217728 (128 MiB).
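
The relation between the 128M limit and the 134217728 value in the test is plain arithmetic (128 × 1024 × 1024). The sketch below shows that conversion and a comparison against a memory.max file; the helper names to_bytes and limit_matches are hypothetical, and nothing here reflects how the launcher under /opt/sad/ is actually written.

```shell
# to_bytes SIZE: convert a size such as 128M, 1G or 4096 to bytes,
# treating K, M and G as binary multiples (1024-based).
to_bytes() {
    case $1 in
        *K) echo $(( ${1%K} * 1024 )) ;;
        *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
        *G) echo $(( ${1%G} * 1024 * 1024 * 1024 )) ;;
        *)  echo "$1" ;;
    esac
}

# limit_matches FILE SIZE: succeed when FILE holds exactly SIZE in bytes.
# An unlimited cgroup v2 group reads back as "max", which never matches.
limit_matches() {
    [ "$(cat "$1")" = "$(to_bytes "$2")" ]
}

# Possible use on the scenario host (path taken from the description):
# limit_matches /sys/fs/cgroup/sad-batch/memory.max 128M && echo "limit in place"
```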

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 15 minutes.

Medium

# Name Time Type
1 "Verona": Apache Portal Won't Open 15 m Fix New

Scenario: "Verona": Apache Portal Won't Open

Level: Medium

Type: Fix

Access: Email

Description: An internal Apache portal was migrated to this host. The document root is /var/www/portal.

The site root at http://localhost/ does not serve the expected homepage, and a legacy bookmark at /reports no longer reaches the status page (direct access to /status/ works).

Find and fix what keeps the portal root and the legacy redirect from working. Adding missing content is allowed.

Test: Ready.

curl -L http://localhost/reports returns a first line of SadServers - Status OK.
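
Several tests on this page compare the first line of a response with an expected string. A small helper for that comparison might look like the following; first_line_is is a hypothetical name, and the check script at /home/admin/agent/check.sh may well do this differently.

```shell
# first_line_is EXPECTED: read a response body on stdin and succeed only if
# its first line equals EXPECTED. A trailing carriage return is ignored.
first_line_is() {
    # "read" fails on a last line without newline but still fills $line.
    IFS= read -r line || [ -n "$line" ]
    line=${line%"$(printf '\r')"}
    [ "$line" = "$1" ]
}

# Possible use on the scenario host (URL and text taken from the test):
# curl -sL http://localhost/reports | first_line_is 'SadServers - Status OK' && echo pass
# Headers of every hop in the redirect chain can be inspected with:
# curl -sIL http://localhost/reports
```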

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 15 minutes.

2 "Modena": Ansible Deploy Won't Publish 30 m Fix Pro New

Scenario: "Modena": Ansible Deploy Won't Publish

Level: Medium

Type: Fix

Access: Paid

Description: This host publishes an internal status page by running Ansible locally against the Docker container status-app (port 8888 on localhost maps to the container's HTTP port).

The playbook tree lives in /home/admin/deploy/. After a refactor, ansible-playbook site.yml no longer leaves a working status endpoint — curl http://localhost:8888/ does not return the expected line.

Fix the Ansible project and run the playbook successfully so the status page is served from the container.

Test: curl http://localhost:8888/ returns a first line of SadServers - Modena OK.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

3 "Parma": Debugging Terraform Issues 20 m Fix New

Scenario: "Parma": Debugging Terraform Issues

Level: Medium

Type: Fix

Access: Email

Description: This host publishes a machine-readable status marker using Terraform with a local backend. The project lives in /home/admin/infra/ and should write /var/local/platform-status.txt.

After a refactor, terraform plan and terraform apply no longer succeed, and the status file is missing or stale.

Fix the Terraform project and apply it so the marker is published again.

(Note: Internet access is not needed).

Test: The first line of /var/local/platform-status.txt is SadServers - Parma OK.

Running terraform plan in /home/admin/infra/ reports no changes pending (clean state).
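
One way to check the "no changes pending" condition from a script is the -detailed-exitcode flag of terraform plan, which exits with 0 when there are no changes, 1 on error, and 2 when changes are pending. The wrapper below is a sketch under that assumption; plan_state is a hypothetical name, and the official check script may use another method.

```shell
# plan_state DIR: print "clean", "changes" or "error" for the Terraform
# project in DIR, based on the exit status of
# `terraform plan -detailed-exitcode` (0, 2 and anything else respectively).
plan_state() {
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$1" && terraform plan -detailed-exitcode -input=false ) >/dev/null 2>&1
    case $? in
        0) echo clean ;;
        2) echo changes ;;
        *) echo error ;;
    esac
}

# Possible use on the scenario host (path taken from the description):
# plan_state /home/admin/infra
```

The plan output is discarded here to keep the result to one word; while debugging, run terraform plan directly to read the full diagnostics.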

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 20 minutes.

4 "Montevideo": restore test snapshot would clobber production 20 m Fix New

Scenario: "Montevideo": restore test snapshot would clobber production

Level: Medium

Type: Fix

Access: Email

Description: This host runs a small app whose live data lives under /production/. Nightly snapshot backups land under /snapshots/ in an rsnapshot-style layout.

A recent bug in the snapshot job may have pointed some files in the latest rotation (/snapshots/daily-2026-06-01/) at the live tree instead of making independent copies.

Ops scheduled a dry-run restore from that snapshot into /production/. If any snapshot path still aliases live data, the restore would overwrite production in place.

Read /home/admin/backup-notes.txt. Find every incorrectly shared file between /production/ and /snapshots/daily-2026-06-01/, then repair the snapshot so it is safe to restore from. Do not damage live production data, and do not break intentional space-saving deduplication inside older snapshot rotations.

Note: the tools rsync and dd are available on this server.

Test: Every file in /snapshots/daily-2026-06-01/ mirroring a name in /production/ must be an independent copy: restoring the snapshot must not alias or overwrite live files.
Older rotations (snapshot-1–snapshot-3) must keep shared config.ini hard links.
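
A snapshot file aliases live data when both paths resolve to the same inode, whether through a hard link or a symbolic link. The sketch below detects that case for same-named files and replaces an aliased path with an independent copy. The names list_aliases and unshare_file are hypothetical, the approach only compares files at the same relative path (a link to a differently named production file would need an inode-based search instead), and it is not necessarily what the check script expects.

```shell
# list_aliases PROD SNAP: print the relative paths under SNAP that resolve to
# the same inode as the same-named file under PROD.
list_aliases() {
    prod=${1%/}
    snap=${2%/}
    find "$snap" \( -type f -o -type l \) | while IFS= read -r path; do
        rel=${path#"$snap"/}
        # -ef is true when both names refer to the same device and inode;
        # it follows symbolic links, so links into PROD are caught as well.
        if [ -e "$prod/$rel" ] && [ "$path" -ef "$prod/$rel" ]; then
            printf '%s\n' "$rel"
        fi
    done | sort
}

# unshare_file PATH: replace PATH with an independent copy of its content.
# The copy goes to a fresh temporary file that is then renamed over PATH, so
# the inode PATH used to share (the live file) is never written to.
unshare_file() {
    tmp=$(mktemp "$1.XXXXXX") &&
    cp -p -- "$1" "$tmp" &&
    mv -f -- "$tmp" "$1"
}

# Possible use on the scenario host (paths taken from the description):
# list_aliases /production /snapshots/daily-2026-06-01
```

Because only paths under the chosen snapshot directory are rewritten, hard links shared among the older rotations are left alone.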

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 20 minutes.

5 "Ravenna": Logs Missing in ELK Pipeline 30 m Fix Pro New

Scenario: "Ravenna": Logs Missing in ELK Pipeline

Level: Medium

Type: Fix

Access: Paid

Description: You are on call for the orders-api service. Central logging uses a small ELK stack on Docker Compose: an application container, Filebeat, Logstash, and Elasticsearch.

Operations reports that no order events show up in Elasticsearch, even though the application container is healthy and keeps writing logs. SRE left notes that the service contract specifies plain-text log lines.

The stack lives under /home/admin/ravenna and is managed with Docker Compose. Elasticsearch is reachable on the VM at http://127.0.0.1:9200.

Notes:
1. Wait until all four containers are Up before debugging (docker compose -f /home/admin/ravenna/docker-compose.yml ps). Elasticsearch can take up to two minutes to become healthy.
2. Internet access is not needed; container images are preloaded in the local Docker engine.
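
Waiting for Elasticsearch, as the first note advises, can be done with a small polling loop. This is a sketch only: wait_until is a hypothetical helper, and it assumes that a successful response from the standard _cluster/health endpoint is a good enough sign that the API is answering; it does not check the reported cluster status.

```shell
# wait_until TIMEOUT INTERVAL CMD...: rerun CMD every INTERVAL seconds until
# it succeeds or more than TIMEOUT seconds have been spent waiting.
# Returns 0 on success and 1 on timeout.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    until "$@"; do
        elapsed=$((elapsed + interval))
        if [ "$elapsed" -gt "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}

# Possible use on the scenario VM (URL from the description; 120 seconds
# matches the two-minute note):
# wait_until 120 5 curl -sf -o /dev/null http://127.0.0.1:9200/_cluster/health
```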

Test: At least one document containing order_shipped is indexed in Elasticsearch under the orders-* index pattern.

Quick check:

  curl -s 'http://127.0.0.1:9200/orders-*/_search?q=order_shipped&size=1' | jq .

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can read and execute.

Time to Solve: 30 minutes.

Kubernetes Playgrounds

# Name Time Type
1 K8s Playground - Free 20 m Playground

Playground: K8s Playground - Free

Level: Easy

Type: Playground

Access: Email

Description: This is a Kubernetes sandbox for you to play with and experiment.

It comes with an nginx.yaml manifest. For example, you can try k apply -f nginx.yaml ("k" is set up as an alias for "kubectl").

The Helm binary is also installed.

Free account:
The free-account sandbox runs on a VM with 1 GB of RAM. As usual, there is no Internet access.

Paid accounts (Pro/Pro+/Business):
The paid playground runs on a VM with 2 GB of RAM. It has Internet access (for example, to pull your own images) and twice the time.

Time to Play: 20 minutes.

2 K8s Playground - Pro 60 m Playground Pro

Playground: K8s Playground - Pro

Level: Easy

Type: Playground

Access: Paid

Description: This is a Kubernetes sandbox for you to play with and experiment.

It comes with an nginx.yaml manifest. For example, you can try k apply -f nginx.yaml ("k" is set up as an alias for "kubectl").

The Helm binary is also installed.

Free account:
The free-account sandbox runs on a VM with 1 GB of RAM. As usual, there is no Internet access.

Paid accounts (Pro/Pro+/Business):
The paid playground runs on a VM with 2 GB of RAM. It has Internet access (for example, to pull your own images) and twice the time.

Time to Play: 60 minutes.

Real-world Linux and DevOps scenarios for hands-on learning and technical assessment.

Connect With Us
info@sadservers.com

Made in Canada 🇨🇦
Updated: 2026-06-26 23:27 UTC – f0e2403