Pro Troubleshooting Scenarios

Pro / Paid Scenarios
# Name Level Time Type
1 "Apia": Needle in a Haystack Easy 40 m Do Pro

Scenario: "Apia": Needle in a Haystack

Level: Easy

Type: Do

Access: Paid

Description: In the directory /home/admin/data there are multiple files, all with the same content. One of these files has been modified: a word was added. Identify that word and put it in the solution file (the file is accepted with or without a terminating newline).

Test: md5sum /home/admin/solution should return 55aba155290288b58e9b778c8f616560 or 2eeefea9fc4b16ea624bed5c67a49d80
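One possible approach, sketched in Python under the assumption that every file in the directory is small enough to read into memory: group the files by content hash, pick the file whose hash occurs only once, and compare its words against any unmodified file. The names find_odd_file and added_words are illustrative and not part of the scenario.

```python
import hashlib
from collections import Counter, defaultdict
from pathlib import Path


def find_odd_file(directory):
    """Return (odd_file, reference_file): the one file whose content differs
    from the rest, and any file holding the common content."""
    by_hash = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    odd = [paths[0] for paths in by_hash.values() if len(paths) == 1]
    common = [paths[0] for paths in by_hash.values() if len(paths) > 1]
    if len(odd) != 1 or len(common) != 1:
        raise ValueError("expected exactly one modified file")
    return odd[0], common[0]


def added_words(odd_file, reference_file):
    """Return the words that occur more often in odd_file than in reference_file."""
    odd_counts = Counter(Path(odd_file).read_text().split())
    ref_counts = Counter(Path(reference_file).read_text().split())
    return list((odd_counts - ref_counts).elements())
```

Calling find_odd_file("/home/admin/data") and passing both results to added_words should yield a single word if the scenario holds as described; that word is what goes into the solution file.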

Check My Solution: The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

2 "Gitega": Find the Bad Git Commit Easy 30 m Do Pro

Scenario: "Gitega": Find the Bad Git Commit

Level: Easy

Type: Do

Access: Paid

Description: The directory at /home/admin/git has a Git repository with a Golang program and a test for it.

To execute the test, from this "git" directory run: go test. The last (current HEAD) commit fails the test. Suppose the first commit passed the test.

Find the (long hash) commit that first broke the test and enter it in the /home/admin/solution file. For example: echo 9e80a7eb1b09385e93ab4a76cb2c93beec48fd9f > /home/admin/solution
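Finding the first failing commit is a binary search over the history, which is what git bisect automates. A minimal sketch of the idea in Python, assuming the commits are listed oldest to newest, the first one passes, the last one fails, and the test keeps failing once it has been broken. The names first_bad_commit and is_bad are hypothetical; in the scenario, is_bad would check out the given commit and run go test.

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an oldest-to-newest list for the first commit where is_bad is True.

    Assumes commits[0] is good, commits[-1] is bad, and that every commit
    after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1  # indexes of a known-good and a known-bad commit
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]
```

With N commits this needs about log2(N) test runs instead of N, which is why bisecting beats testing each commit in order.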

Test: Doing md5sum /home/admin/solution returns f7db1bb6b7bfcd66a4eb66782804b39d.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

3 "Yokohama": Linux Users Working Together Easy 30 m Fix Pro

Scenario: "Yokohama": Linux Users Working Together

Level: Easy

Type: Fix

Access: Paid

Description: There are four Linux users working together on a project on this server: abe, betty, carlos, debora.

First, they have asked you, as the sysadmin, to make it so that each of these four users can read the project files of the other users in the /home/admin/shared directory, but none of them can modify a file that belongs to another user. Users should be able to modify their own files.

Secondly, they have asked you to modify the file shared/ALL so that any of these four users can write more content to it, but previous (existing) content cannot be altered.

Test: All users (abe, betty, carlos, debora) can write to their own files. None of them can write to another user's file.
All users can add more content (append) to the shared/project_ALL file but none can change its current content.
The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

4 "Fukuoka": Forbidden Association Easy 30 m Fix Pro

Scenario: "Fukuoka": Forbidden Association

Level: Easy

Type: Fix

Access: Paid

Description: There's a web server running on this host but curl localhost returns the default 404 Not Found page.

Fix the issue so that a file is served correctly and the message Welcome to the Real Site! is returned.

Test: Running curl localhost should return HTTP 200 with the message Welcome to the Real Site!.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

5 "Rio de Janeiro": Do we have another option? Easy 30 m Fix Pro

Scenario: "Rio de Janeiro": Do we have another option?

Level: Easy

Type: Fix

Access: Paid

Description: This scenario server is dedicated to Jenkins, a Java application managed by systemd. Jenkins is failing to start. Troubleshoot and find the problem, then apply the solution so Jenkins runs properly.

Test: The service must return the string "Sign in - Jenkins" amongst some other html code. You can check with the command curl -s localhost:8888/login | grep Jenkins | head -n1

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

6 "Nuuk": More SSH Troubles Easy 20 m Fix Pro

Scenario: "Nuuk": More SSH Troubles

Level: Easy

Type: Fix

Access: Paid

Description: (NOTE: if you are a Pro user, you cannot SSH directly into this VM; click the "Open the Server Terminal" button to use the web browser instead).

SSH seems broken on this server. The user admin has an id_ed25519 SSH key pair in their ~/.ssh directory with the public key in ~/.ssh/authorized_keys but ssh 127.0.0.1 won't work.

Test: You can ssh locally, i.e. ssh admin@127.0.0.1 works.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 20 minutes.

7 "Kortenberg": Can't touch this! Easy 30 m Fix Pro

Scenario: "Kortenberg": Can't touch this!

Level: Easy

Type: Fix

Access: Paid

Description: Is "All I want for Christmas is you" already everywhere? On an unrelated note, someone messed up the permissions on this server: the admin user can't list new directories and can't write into new files. Fix the issue.
NOTE: Besides solving the problem in your current admin shell session, you need to fix it permanently, as in a new login shell for user "admin" (like the one initiated by the scenario checker) should have the problem fixed as well.

Test: The admin user in a separate Bash login session should be able to create a new directory in the /home/admin directory, create a file in this new directory, and add text to the new file.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

8 "Hamburg": Find the AWS EC2 volume Easy 60 m Do Pro

Scenario: "Hamburg": Find the AWS EC2 volume

Level: Easy

Type: Do

Access: Paid

Description: We have a lot of AWS EBS volumes, the description of which we have saved to a file with: aws ec2 describe-volumes > aws-volumes.json.
One of the volumes contains important data and we need to identify which volume (its ID), but we only remember these characteristics: gp3, created before 30/09/2025, Size < 64, Iops < 1500, Throughput > 300.

Find the correct volume and put its InstanceId into the ~/mysolution file, e.g.: echo "i-00000000000000000" > ~/mysolution
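The filtering can be sketched in Python. This assumes aws-volumes.json has the usual describe-volumes layout (a top-level Volumes list whose entries carry VolumeType, CreateTime, Size, Iops, Throughput and an Attachments list with InstanceId) and that CreateTime is an ISO 8601 string, so its date prefix compares correctly as text. The names matching_volumes and load_and_match are illustrative.

```python
import json


def matching_volumes(data, cutoff="2025-09-30"):
    """Return (VolumeId, [InstanceId, ...]) for volumes matching the remembered traits."""
    results = []
    for vol in data.get("Volumes", []):
        if (
            vol.get("VolumeType") == "gp3"
            # ISO dates sort lexicographically; a missing date never matches.
            and str(vol.get("CreateTime", "9999"))[:10] < cutoff
            and vol.get("Size", float("inf")) < 64
            and vol.get("Iops", float("inf")) < 1500
            and vol.get("Throughput", 0) > 300
        ):
            instances = [a.get("InstanceId") for a in vol.get("Attachments", [])]
            results.append((vol.get("VolumeId"), instances))
    return results


def load_and_match(path="aws-volumes.json"):
    """Load the saved describe-volumes output and apply the filter."""
    with open(path) as fh:
        return matching_volumes(json.load(fh))
```

If exactly one volume matches, the InstanceId of its attachment is the value to write into ~/mysolution.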

Test: Running md5sum /home/admin/mysolution returns e7e34463823bf7e39358bf6bb24336d8 (we also accept the file without a new line at the end).

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 60 minutes.

9 "Melbourne": WSGI with Gunicorn Medium 40 m Fix Pro

Scenario: "Melbourne": WSGI with Gunicorn

Level: Medium

Type: Fix

Access: Paid

Description: There is a Python WSGI web application file at /home/admin/wsgi.py, the purpose of which is to serve the string "Hello, world!". This file is served by a Gunicorn server which is fronted by an nginx server (both servers managed by systemd). So the flow of an HTTP request is: Web Client (curl) -> Nginx -> Gunicorn -> wsgi.py. The objective is to be able to curl localhost (on the default port :80) and get back "Hello, world!", using the current setup.

Test: curl -s http://localhost returns Hello, world! (serving the wsgi.py file via Gunicorn and Nginx)

Time to Solve: 40 minutes.

10 "Unimak Island": Fun with Mr Jason Medium 30 m Do Pro

Scenario: "Unimak Island": Fun with Mr Jason

Level: Medium

Type: Do

Access: Paid

Description: Using the file station_information.json, find the station_id where "has_kiosk" is false and "capacity" is greater than 30.

Save the station_id of the solution in the /home/admin/mysolution file, for example: echo "ec040a94-4de7-4fb3-aea0-ec5892034a69" > ~/mysolution

You can use the installed utilities jq, gron, jid as well as Python3 and Golang.
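A sketch of the query in Python3. It assumes the file follows the GBFS station_information layout, with the station objects under data.stations; the scenario text does not confirm this, so inspect the file first. The names find_stations and load_stations are illustrative.

```python
import json


def find_stations(stations, min_capacity=30):
    """Return the station_id of every station without a kiosk and capacity > min_capacity."""
    return [
        s["station_id"]
        for s in stations
        if s.get("has_kiosk") is False and s.get("capacity", 0) > min_capacity
    ]


def load_stations(path="station_information.json"):
    """Assumed layout: {"data": {"stations": [...]}}."""
    with open(path) as fh:
        return json.load(fh)["data"]["stations"]
```

The check uses `is False` so that stations missing the has_kiosk field are not counted as kiosk-less.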

Test: md5sum /home/admin/mysolution returns 8d8414808b15d55dad857fd5aeb2aebc

Time to Solve: 30 minutes.

11 "Ivujivik": Parlez-vous Français? Medium 40 m Do Pro

Scenario: "Ivujivik": Parlez-vous Français?

Level: Medium

Type: Do

Access: Paid

Description: Given the CSV file /home/admin/table_tableau11.csv, find the Electoral District Name/Nom de circonscription that has the largest number of Rejected Ballots/Bulletins rejetés and also has a population of less than 100,000.

The initial CSV file may be corrupted or invalid in a way that can be fixed without changing its data.

Installed in the VM are Python3, Go, sqlite3 and miller (directly), as well as PostgreSQL and MySQL (as Docker images).

Save the solution in the /home/admin/mysolution file, with the name exactly as it appears in the CSV file, for example: echo "Trois-Rivières" > ~/mysolution (the solution must be terminated by a newline).
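Once the CSV file has been repaired, the selection itself could look like the following Python3 sketch. The district and rejected-ballots headers are taken from the scenario text and may differ from the exact headers in the file. The population header, the encoding, and the helper names (to_int, top_rejected_district, read_rows) are assumptions. The sketch does not attempt to repair the file.

```python
import csv

DISTRICT = "Electoral District Name/Nom de circonscription"
REJECTED = "Rejected Ballots/Bulletins rejetés"
POPULATION = "Population"  # assumed column name; check the real header


def to_int(value):
    """Parse an integer that may contain thousands separators or spaces."""
    return int(str(value).replace(",", "").replace(" ", "").strip())


def top_rejected_district(rows, max_population=100_000):
    """Return the district with the most rejected ballots among those under max_population."""
    eligible = [r for r in rows if to_int(r[POPULATION]) < max_population]
    best = max(eligible, key=lambda r: to_int(r[REJECTED]))
    return best[DISTRICT]


def read_rows(path, encoding="utf-8"):
    """Read the CSV into a list of dicts keyed by header name."""
    with open(path, newline="", encoding=encoding) as fh:
        return list(csv.DictReader(fh))
```

Returning the district name straight from the row keeps accents and hyphens exactly as they are in the file, which matters because the answer is checked by md5sum.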

Test: md5sum /home/admin/mysolution returns e399d171f21839a65f8f8ab55ed1e1a1

Time to Solve: 40 minutes.

12 "Tarifa": Between Two Seas Medium 40 m Fix Pro

Scenario: "Tarifa": Between Two Seas

Level: Medium

Type: Fix

Access: Paid

Description: There are three Docker containers defined in the docker-compose.yml file: an HAProxy accepting connections on port :5000 of the host, and two nginx containers, not exposed to the host.

The person who tried to set this up wanted to have HAProxy in front of the (backend or upstream) nginx containers load-balancing them but something is not working.

Test: Running curl localhost:5000 several times returns both hello there from nginx_0 and hello there from nginx_1

Check /home/admin/agent/check.sh for the test that "Check My Solution" runs.

Time to Solve: 40 minutes.

13 "Abaokoro": Restore MySQL Databases Spooked by a Ghost Medium 40 m Fix Pro

Scenario: "Abaokoro": Restore MySQL Databases Spooked by a Ghost

Level: Medium

Type: Fix

Access: Paid

Description: There are three databases that need to be restored. You need to create three databases called "first", "second" and "third" and restore the databases using the file "/home/admin/dbs_to_restore.zip".
If you encounter an issue while restoring the database, fix it.

Credit: Sebastian Segovia

Test: All databases, once restored, have a table named "foo".

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

14 "Poznań": Helm Chart Issue in Kubernetes Medium 30 m Fix Pro

Scenario: "Poznań": Helm Chart Issue in Kubernetes

Level: Medium

Type: Fix

Access: Paid

Description: NOTE: Prompt may take a few extra seconds to be responsive while the k3s environment gets ready. Root access is not needed for this challenge ("admin" user cannot sudo).

A DevOps engineer created a Helm Chart web_chart with a custom nginx site, however he still gets the default nginx index.html.

You can check, for example, with POD_IP=$(kubectl get pods -n default -o jsonpath='{.items[0].status.podIP}') and curl -s "${POD_IP}".

In addition he should set replicas to 3.

The chart is not working as expected. Fix the configurations so you get the custom HTML page from any nginx pod.

Credit Kamil Błaż

Test: Doing curl on the default port (:80) of any nginx pod returns a Welcome SadServers page. The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

15 "Manado": How much do you press? Medium 60 m Do Pro

Scenario: "Manado": How much do you press?

Level: Medium

Type: Do

Access: Paid

Description: You have been tasked with compressing the file /home/admin/names, which is 35147 bytes, to a size smaller than 9400 bytes. You can use any compression tool at your disposal (there are many available on the server), and you can also modify the file as long as you do not delete anything in it. Put the solution (compressed file) in the /home/admin/solution directory with the default extension used by the compression tool (example: ~/solution/names.gzip).

Test: The size of the compressed file is smaller than 9400 bytes.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 60 minutes.

16 "Warsaw": Prometheus can't scrape the webserver Medium 60 m Fix Pro

Scenario: "Warsaw": Prometheus can't scrape the webserver

Level: Medium

Type: Fix

Access: Paid

Description: A developer created a golang application that is exposing the /metrics endpoint. They have a problem with scraping the metrics from the application. They asked you to help find the problem.

Full source code of the application is available at the /home/admin/app directory.

Credit Kamil Błaż

Test: The endpoint http://localhost:9000/metrics should return HTTP code 200.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 60 minutes.

17 "Bekasi": Supervisor is still around Medium 40 m Fix Pro

Scenario: "Bekasi": Supervisor is still around

Level: Medium

Type: Fix

Access: Paid

Description: There is an nginx service running on port 443; it is the main web server for the company. It looks like a new employee has deployed some changes to the supervisor configuration, and now the service is not working as expected.

Running curl -k https://bekasi should return Hello SadServers!, but for some reason it does not.

To pass check.sh, you must not modify files in the /home/admin/bekasi folder.

You must find out what the issue is and fix it.

Test: curl -k https://bekasi returns Hello SadServers!

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

18 "Depok": Nginx with Brotli Medium 30 m Fix Pro

Scenario: "Depok": Nginx with Brotli

Level: Medium

Type: Fix

Access: Paid

Description: You are tasked to add compression to the company website. The website is running on an Nginx server, and you decide to add Brotli compression to it.

Brotli has become very popular these days because of its high compression ratio. It's a general-purpose lossless compression algorithm that compresses data using a combination of a modern variant of the LZ77 algorithm, Huffman coding, and 2nd order context modeling.

For this purpose, you decided to compile the brotli modules yourself and add them to the Nginx server.

The location of the Brotli source code is at /home/admin/ngx_brotli. The nginx source code (needed to compile the modules) is located at /home/admin/nginx-1.18.0. From the ngx_brotli repository, you first need to compile the brotli dependencies and then configure and make the modules for Nginx. After that you need to add the modules to the Nginx configuration.

After installing the modules, you need to make sure the responses from the server are being served with compression.

Create a port-forward to port 80 from the server to your computer and check the header Content-Encoding, responses must return br for Brotli compression. You can also use curl -H "Accept-Encoding: br, gzip" -I http://localhost to check the header.

Something nice about Brotli is that it falls back to gzip if the client doesn't support Brotli, so curl -H "Accept-Encoding: gzip" -I http://localhost should return gzip instead.

Test: curl -H "Accept-Encoding: br" -sI http://localhost returns the header Content-Encoding: br.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

19 "Tukaani": XZ LZMA Library Compromised Medium 30 m Fix Pro

Scenario: "Tukaani": XZ LZMA Library Compromised

Level: Medium

Type: Fix

Access: Paid

Description: (You can learn about Linux Libraries before starting this scenario).

The Linux shared library liblzma.so has been compromised (the real compromised XZ Utils liblzma has not been used). The liblzma.so at the path /usr/lib/x86_64-linux-gnu/liblzma.so.5.2.5 is the good one. Consider the same library liblzma.so.5.2.5 at other paths as compromised or malicious (ideally we would have used other real versions with different checksums).

Find all instances of this "malicious" liblzma library (remember, it's the same library but in different directory locations) and make it so that none of the running processes use it, while the applications "webapp" and "jobapp" (both of which are managed by systemd) still run properly (e.g., stopping those applications is not a solution).

Test: lsof | grep liblzma.so.5 returns only the liblzma in the path: /usr/lib/x86_64-linux-gnu/liblzma.so.5.2.5

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

20 "Atrani": Modify a SQlite3 Database Medium 30 m Fix Pro

Scenario: "Atrani": Modify a SQlite3 Database

Level: Medium

Type: Fix

Access: Paid

Description: A developer created a script /home/admin/readdb.py that tests access to a database. Without modifying the readdb.py file, change the database so that running the script returns the string "John Karmack".

Test: Running /home/admin/readdb.py returns "John Karmack".

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

21 "Hanoi": Find the Multitasking Users Medium 30 m Do Pro

Scenario: "Hanoi": Find the Multitasking Users

Level: Medium

Type: Do

Access: Paid

Description: The Hanoi office has a Linux server with a large number of user accounts and groups. The system administrators need to identify which users belong to multiple groups for better access management.

Given two files, `users.txt` and `groups.txt`, create a new file `/home/admin/multi-group-users.txt` containing the usernames of users who belong to more than one group, one username per line, sorted alphabetically.

The `users.txt` file contains a list of usernames, one per line. The `groups.txt` file contains group names and their members, in the format `group_name:user1,user2,user3`.
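Given those two formats, the task amounts to counting group memberships per user. A minimal Python sketch, assuming that only usernames present in `users.txt` should be reported and that plain code-point ordering is what "sorted alphabetically" means here; `multi_group_users` is an illustrative name.

```python
from collections import Counter


def multi_group_users(user_lines, group_lines):
    """Return the sorted usernames that appear in more than one group."""
    users = {line.strip() for line in user_lines if line.strip()}
    counts = Counter()
    for line in group_lines:
        line = line.strip()
        if ":" not in line:
            continue  # skip blank or malformed lines
        _, members = line.split(":", 1)
        # A set guards against a user listed twice in the same group.
        for member in {m.strip() for m in members.split(",") if m.strip()}:
            counts[member] += 1
    return sorted(u for u in users if counts[u] > 1)
```

Both arguments can be open file objects, since the function only iterates over lines. Write the result one name per line with a trailing newline, because the check compares an md5sum.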

Test: Running md5sum /home/admin/multi-group-users.txt returns dc0ae86caae7125d21df03a0ab29d8ae

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

22 "Batumi": Troubleshoot "A" cannot connect to "B" Medium 40 m Fix Pro

Scenario: "Batumi": Troubleshoot "A" cannot connect to "B"

Level: Medium

Type: Fix

Access: Paid

Description: (To learn the skills to solve this challenge, see Can't Connect to a Service: Linux Troubleshooting Guide)

There is a web server (Caddy) on HTTP port :80 but curl http://127.0.0.1 doesn't work. Find out what's wrong and make the necessary fixes so the web server returns a URL.

Note: as a limitation, the file /home/admin/db_connector.py must not be modified so that the challenge is considered solved properly.
The web server has to respond on the IP address 127.0.0.1; not only on "localhost".

Test: The command curl http://127.0.0.1 returns a URL address.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

23 "Bharuch": Lost in Translation Medium 40 m Fix Pro

Scenario: "Bharuch": Lost in Translation

Level: Medium

Type: Fix

Access: Paid

Description: There's a Docker container that runs a web server on port 3000, but it's not working.

Using the tooling and resources provided in the server, make the container run correctly.

Test: curl http://localhost:3000 should return "Hello from sadservers!"

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

24 "Ruaka": Kubernetes pod in distress Medium 30 m Fix Pro

Scenario: "Ruaka": Kubernetes pod in distress

Level: Medium

Type: Fix

Access: Paid

Description: A developer wants to deploy an open-source tool on Kubernetes. The tool unfortunately has limited documentation.

They built a helm chart and a container image. When the application is deployed, for some reason the server in Kubernetes doesn't seem to work but when the binary is started on their laptop/machine it works perfectly.

The application server is deployed by Helm. The command they used is: helm upgrade --install ruaka charts/ruaka.

Debug and help the developer find the issue. NOTE: Do not change or delete any current Helm field value in the chart, only add if needed.

Remember to give Kubernetes enough time after you apply a change before checking the solution.

Test: kubectl get pod shows the ruaka application pod up and running, while no Helm fields have been taken out from the application chart.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

25 "Campina Grande": Give me my cert, Vault Medium 30 m Fix Pro

Scenario: "Campina Grande": Give me my cert, Vault

Level: Medium

Type: Fix

Access: Paid

Description: A web application running at https://nginx.example.com has an expired certificate. Issue a new certificate using the Hashicorp Vault running on the server.
The Vault instance is already unsealed and initialized, and you have full admin access with the admin user.

Test: Running curl https://nginx.example.com returns Hello!.

The certificate presented by Nginx is issued by the Vault PKI (check using openssl verify -CAfile /usr/local/share/ca-certificates/vault-pki-ca.crt /etc/nginx/ssl/cert.pem).

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

26 "Atlantis": Not found Medium 30 m Fix Pro

Scenario: "Atlantis": Not found

Level: Medium

Type: Fix

Access: Paid

Description: There is a small "C" application in the /home/admin/app directory. Using a Docker multi-stage build, create a minimal, small-footprint Docker container "app" that provides a hello binary returning a greeting in Atlantean. The binary is called automatically when running docker run app.

Test: docker run app returns SOO-puhk

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

27 "Solanea": ClickHouse mad house Medium 40 m Do Pro

Scenario: "Solanea": ClickHouse mad house

Level: Medium

Type: Do

Access: Paid

Description: You have a ClickHouse installation CHI running on a Kubernetes cluster and a set of requests (located at ~/data/requests.csv) that you must populate into the http_requests table in the monitoring database (table may not exist in all pod instances).
Do this insert in all pod instances of the database.
The user and password to connect to the database are default.
The keeper pods provide clickhouse replication services.

Test: You are able to query the database and see the data:

clickhouse-client -h --password default -q 'SELECT COUNT(*) FROM monitoring.http_requests'

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

28 "Tunis": Redis Replication Problem Medium 40 m Fix Pro

Scenario: "Tunis": Redis Replication Problem

Level: Medium

Type: Fix

Access: Paid

Description: A Redis master-replica setup is running on this server, with the master on port 6379 and the replica on port 6380. Both instances show as "connected" when you check their status, but data synchronization has silently broken.

Recent writes to the master don't appear on the replica, even though there are no obvious errors in the logs and both Redis instances appear healthy.

Fix the replication issues so that data written to the master (port 6379) immediately appears on the replica (port 6380) without data loss.

Master: localhost:6379
Replica: localhost:6380
Password: masterpass123

A helper test script is available at /home/admin/test_replication.sh

Test: The solution will be validated by writing a test key to the master and verifying it appears on the replica within 2 seconds.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

29 "Auderghem": Containers miscommunication Medium 30 m Fix Pro

Scenario: "Auderghem": Containers miscommunication

Level: Medium

Type: Fix

Access: Paid

Description: There is an nginx Docker container that listens on port 80, the purpose of which is to redirect the traffic to two other containers statichtml1 and statichtml2 but this redirection is not working.
Fix the problem.

IMPORTANT: You can restart all containers, but don't stop or remove them.

Test: The nginx container must redirect the traffic to the statichtml1 and statichtml2 containers:

curl http://localhost returns the Welcome to nginx default page
curl http://localhost/1 returns HelloWorld;1
curl http://localhost/2 returns HelloWorld;2

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

30 "Marseille": Rocky security Medium 30 m Fix Pro

Scenario: "Marseille": Rocky security

Level: Medium

Type: Fix

Access: Paid

Description: As the Christmas shopping season approaches, the security team has asked Mary and John to implement more security measures. Unfortunately, this time they have broken the LAMP stack; the frontend is unable to get an answer from upstream, so they need your help again to fix it.

The application should be able to serve the content from the webserver.

Note for Pro users: direct SSH access is not available (yet) for this scenario.

Test: curl localhost | head -n1 returns SadServers - LAMP Stack

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

31 "Woluwe": Too many images Medium 30 m Fix Pro

Scenario: "Woluwe": Too many images

Level: Medium

Type: Fix

Access: Paid

Description: A pipeline created a lot of Docker images locally for a web app. All these images except for one contain a typo introduced by a developer: there's an incorrect image instruction to pipe "HelloWorld" to "index.htmlz" instead of using the correct "index.html"
Find which image doesn't have the typo (and uses the correct "index.html"), tag this correct image as "prod" (rather than fixing the current prod image) and then deploy it with docker run -d --name prod -p 3000:3000 prod so it responds correctly to HTTP requests on port :3000 instead of "404 Not Found".

Test: curl http://localhost:3000 should respond with HelloWorld;529

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

32 "Podgorica": Docker to Podman migration Medium 40 m Do Pro

Scenario: "Podgorica": Docker to Podman migration

Level: Medium

Type: Do

Access: Paid

Description: You have been tasked with migrating this future web server from using Docker (which uses a daemon) to rootless Podman.
There is already an Nginx Podman image on the server, and your objective is to manage the container created from it using systemd, so that it starts automatically on reboot and continues running unless explicitly stopped (the same behaviour expected from a Docker-managed container).
Create a systemd service named container-nginx.service that manages the Podman Nginx container. Enable and start this service.

NOTES: Although a quadlet file solution should be valid, the check script is still not accounting for it.

There is no need to reboot the VM, although if you want you could reboot it from the command line with /sbin/shutdown -r now and refresh or reopen the web console.

Test: The checker script will test if the container-nginx.service is active and enabled, and if it can stop and start the service. It will also verify that curl localhost:8888 returns the default "Welcome to nginx" web page.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

33 "Torino": Optimize grande Docker image Medium 30 m Do Pro

Scenario: "Torino": Optimize grande Docker image

Level: Medium

Type: Do

Access: Paid

Description: A Torino Node.js application is located in the ~/torino-app directory.
You can run it directly with: nohup node app.js > app.log 2>&1 &. You can also verify that it works by running: curl localhost:3000

There is already a torino Docker image built with the Dockerfile in ~/torino-app, but the resulting image size is 916 MB.

Your task is to optimize the Docker image size:
1. Build a new Docker image for the Torino application, also called torino:latest but with a total size under 122 MB
2. Create and run a container using this optimized image.

NOTE: You can only use the existing Docker images in the server.
To build a Node application, your Dockerfile needs to COPY the package*.json files in addition to app.js and, because there is no Internet access and you cannot RUN npm install, the node_modules directory as well.

Test: The torino Docker image is less than 122 MB and curl http://localhost:3000 returns Hello from Torino!

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

34 "Socorro, NM": Optimize Podman image Medium 30 m Do Pro

Scenario: "Socorro, NM": Optimize Podman image

Level: Medium

Type: Do

Access: Paid

Description: The podman image localhost/prod:latest contains a static website.
Initially the image size is 261 MB and contains 100 layers.

Your task:
1. Optimize the image localhost/prod:latest so that its size is less than 200 MB, using the same tag.
2. Run a container named "check" from the optimized image: podman run -d --name check -p 8888:80 localhost/prod:latest so that curl localhost:8888 returns 100 lines.

Test: The podman image localhost/prod:latest size is less than 200 MB and running curl localhost:8888 from a container named "check" created from the image returns 100 lines.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

35 "Lyon": Migrate Ingress-NGINX to Traefik Medium 40 m Do Pro

Scenario: "Lyon": Migrate Ingress-NGINX to Traefik

Level: Medium

Type: Do

Access: Paid

Description: Ingress-NGINX is being retired. As the DevOps Engineer, you will replace it with Traefik on the production Kubernetes cluster in a private VPC. This scenario is a local proof-of-concept for that migration.

The current K8s cluster has a "Hello World" pod running, i.e.: curl hello.lyon.local returns "Hello world" (see note 1). You should be able to see the same content delivered via Traefik once the ingress-nginx is down.

Notes: 1: Wait until k8s is fully up before running curl, otherwise you get a 503. You can check, for example, with k get pod -n ingress-nginx
2: The k8s manifests are under the ~/app dir.
3: ingress-nginx was deployed with a Helm chart.
4: The Helm chart for traefik is available under /home/admin/traefik (The Traefik image is already loaded in k3s).
5: The Traefik dashboard and probes/metrics port is :8080 by default, but that port is used by the system; use a different port or disable it.
6: The domain hello.lyon.local is actually pointing to the localhost.
7: The ingress must listen on port 80 on every IP address (*:80) so that it can respond on localhost:80.

TIP: You can use k as an alias for kubectl, and it has autocomplete enabled.

Test: When the command curl -i hello.lyon.local is executed, it returns the message Hello World, while only the traefik pod must be present (instead of ingress-nginx).

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

36 "Stockholm": DNS health check issue Medium 20 m Fix Pro

Scenario: "Stockholm": DNS health check issue

Level: Medium

Type: Fix

Access: Paid

Description: The internal status portal on this host should answer on http://127.0.0.1:9167/ with a body containing OK.

It worked until operations ran a package cleanup.

The portal service (stockholm-portal) only runs after a DNS health check at /usr/local/bin/stockholm-dns-check.sh succeeds.

Make the necessary changes so the portal works again.

Do not modify /usr/local/bin/stockholm-dns-check.sh.

Test: The health script /usr/local/bin/stockholm-dns-check.sh runs successfully, stockholm-portal is active, and curl http://127.0.0.1:9167/ returns a response whose body contains OK.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 20 minutes.

37 "Tallinn": BuildKit & Docker build mismatch Medium 30 m Fix Pro

Scenario: "Tallinn": BuildKit & Docker build mismatch

Level: Medium

Type: Fix

Access: Paid

Description: This VM runs a tiny container app, tallinn-service, whose only job is to print an API version string (for example tallinn-api-version=1.4.0). The image is built from /home/admin/tallinn-app with docker build.

The dev team raised the API contract to 2.0.0 in src/api_version.txt and ran a new build, but QA still rejects the image tagged tallinn-app:current: it reports 1.4.0 at runtime. A recent CI log is in /home/admin/build.log.

Fix the docker build outcome so the deploy image matches what the sources ask for.

Fix the image tagged tallinn-app:current so the on-disk contract file and the shipped binary both report API 2.0.0.

Test: Image tallinn-app:current exists, /etc/tallinn/api_version is 2.0.0, and /usr/local/bin/tallinn-service prints tallinn-api-version=2.0.0.
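
The two conditions in the test can be pictured as a small consistency check. This is only a sketch in Python: the helper names are hypothetical, the inputs stand for the contents of /etc/tallinn/api_version and the output of /usr/local/bin/tallinn-service however you capture them, and the real check.sh may work differently.

```python
from typing import Optional

PREFIX = "tallinn-api-version="  # prefix shown in the scenario description


def reported_version(service_output: str) -> Optional[str]:
    """Return the version printed by the service, or None if no such line exists."""
    for line in service_output.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            return line[len(PREFIX):]
    return None


def image_is_consistent(contract_file_text: str, service_output: str,
                        expected: str = "2.0.0") -> bool:
    """True only when the contract file and the binary both report `expected`."""
    return (contract_file_text.strip() == expected
            and reported_version(service_output) == expected)


# A stale binary next to an updated contract file is the situation QA rejects.
print(image_is_consistent("2.0.0\n", "tallinn-api-version=1.4.0\n"))  # False
print(image_is_consistent("2.0.0\n", "tallinn-api-version=2.0.0\n"))  # True
```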

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

38 "Modena": Ansible Deploy Won't Publish Medium 30 m Fix Pro New

Scenario: "Modena": Ansible Deploy Won't Publish

Level: Medium

Type: Fix

Access: Paid

Description: This host publishes an internal status page by running Ansible locally against the Docker container status-app (port 8888 on localhost maps to the container's HTTP port).

The playbook tree lives in /home/admin/deploy/. After a refactor, ansible-playbook site.yml no longer leaves a working status endpoint — curl http://localhost:8888/ does not return the expected line.

Fix the Ansible project and run the playbook successfully so the status page is served from the container.

Test: curl http://localhost:8888/ returns a first line of SadServers - Modena OK.

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 30 minutes.

39 "Ravenna": Logs Missing in ELK Pipeline Medium 30 m Fix Pro New

Scenario: "Ravenna": Logs Missing in ELK Pipeline

Level: Medium

Type: Fix

Access: Paid

Description: You are on call for the orders-api service. Central logging uses a small ELK stack on Docker Compose: an application container, Filebeat, Logstash, and Elasticsearch.

Operations reports that no order events show up in Elasticsearch, even though the application container is healthy and keeps writing logs. SRE left notes that the service contract specifies plain-text log lines.

The stack lives under /home/admin/ravenna and is managed with Docker Compose. Elasticsearch is reachable on the VM at http://127.0.0.1:9200.

Notes: 1. Wait until all four containers are Up before debugging (docker compose -f /home/admin/ravenna/docker-compose.yml ps). Elasticsearch can take up to two minutes to become healthy.
2. Internet access is not needed; container images are preloaded in the local Docker engine.

Test: At least one document containing order_shipped is indexed in Elasticsearch under the orders-* index pattern.

Quick check:

curl -s 'http://127.0.0.1:9200/orders-*/_search?q=order_shipped&size=1' | jq .

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can read and execute.
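
To read the quick check's JSON output programmatically, a minimal Python sketch follows. It assumes the standard Elasticsearch search response shape, where the match count sits under hits.total; the sample response is made up for illustration, and the scenario's own check.sh may test something different.

```python
def count_hits(search_response: dict) -> int:
    """Return the number of matching documents in a search response."""
    total = search_response.get("hits", {}).get("total", 0)
    # Elasticsearch 7 and later report {"value": N, "relation": "eq"};
    # older versions report a bare integer.
    if isinstance(total, dict):
        return int(total.get("value", 0))
    return int(total)


# Hypothetical response, trimmed to the fields the function reads.
SAMPLE_RESPONSE = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [{"_index": "orders-example", "_source": {"message": "order_shipped"}}],
    }
}

print(count_hits(SAMPLE_RESPONSE) >= 1)  # True
```

A count of zero (or a missing hits section) corresponds to the failing state described above, where nothing reaches the orders-* indices.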

Time to Solve: 30 minutes.

40 "Hong-Kong": can't write data into database. Hard 40 m Fix Pro

Scenario: "Hong-Kong": can't write data into database.

Level: Hard

Type: Fix

Access: Paid

Description: (Similar to "Manhattan" scenario but harder). Your objective is to be able to insert a row in an existing Postgres database. The issue is not specific to Postgres and you don't need to know details about it (although it may help).

Postgres information: it's a service that listens on a port (:5432) and writes to disk in a data directory, the location of which is defined in the data_directory parameter of the configuration file /etc/postgresql/14/main/postgresql.conf. In our case Postgres is managed by systemd as a unit named postgresql.
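
As an illustration of how the data_directory parameter can be located, here is a small Python sketch that extracts it from configuration text. The function name and the sample path are hypothetical, and the parser only handles the common single-quoted form of the setting.

```python
import re
from typing import Optional

# Matches an uncommented line such as: data_directory = '/some/path'
_SETTING = re.compile(r"^\s*data_directory\s*=?\s*'([^']*)'", re.MULTILINE)


def find_data_directory(conf_text: str) -> Optional[str]:
    """Return the configured data directory, or None if it is not set."""
    matches = _SETTING.findall(conf_text)
    # If a parameter is set more than once, the last occurrence takes effect.
    return matches[-1] if matches else None


# Hypothetical excerpt; the real file is /etc/postgresql/14/main/postgresql.conf.
SAMPLE_CONF = """\
# FILE LOCATIONS
#data_directory = 'ConfigDir'
data_directory = '/opt/pgdata/main'    # hypothetical path
port = 5432
"""

print(find_data_directory(SAMPLE_CONF))  # /opt/pgdata/main
```

With the path in hand, ordinary filesystem checks on that location apply, which fits the hint that the issue is not specific to Postgres.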

Test: sudo -u postgres psql -c "insert into persons(name) values ('jane smith');" -d dt

Should return: INSERT 0 1

Time to Solve: 40 minutes.

41 "Pokhara": SSH and other sshenanigans Hard 60 m Fix Pro

Scenario: "Pokhara": SSH and other sshenanigans

Level: Hard

Type: Fix

Access: Paid

Description: A user client was added to the server, as well as their SSH public key.
The objective is to be able to SSH locally (there's only one server) as this user client using their SSH keys. That is, if as root you switch to this user (sudo su; su client), you should be able to log in with ssh localhost.

Test: As user admin: sudo -u client ssh client@localhost 'pwd' returns /home/client

Time to Solve: 60 minutes.

42 "Belo-Horizonte": A Java Enigma Hard 40 m Fix Pro

Scenario: "Belo-Horizonte": A Java Enigma

Level: Hard

Type: Fix

Access: Paid

Description: (Credit for the idea: fuero)

There is a one-class Java application in your /home/admin directory. Running the program will print out a secret code, or you may be able to extract the secret from the class file without executing it but I'm not providing any special tools for that.

Put the secret code in the file /home/admin/solution, e.g. echo "code" > /home/admin/solution.

Test: md5sum /home/admin/solution |awk '{print $1}' returns 9d2bd7aabb26681eacd9444da6b6643c
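
Because the test hashes the whole file, a trailing newline changes the result, and echo appends one. The Python sketch below shows the effect; the helper name is hypothetical, and whether the expected hash was computed with or without the newline is an assumption to verify against your own candidate.

```python
import hashlib

EXPECTED_MD5 = "9d2bd7aabb26681eacd9444da6b6643c"  # from the Test line above


def md5_of_solution(code: str, trailing_newline: bool = True) -> str:
    """MD5 of the file that `echo "code" > solution` (or printf) would create."""
    data = code + ("\n" if trailing_newline else "")
    return hashlib.md5(data.encode()).hexdigest()


# The same text hashes differently with and without the newline echo adds.
print(md5_of_solution("abc", trailing_newline=False))  # 900150983cd24fb0d6963f7d28e17f72
print(md5_of_solution("abc") == md5_of_solution("abc", trailing_newline=False))  # False
```

Comparing md5_of_solution(candidate) with EXPECTED_MD5 gives the same answer as the md5sum test for a file written with echo.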

Time to Solve: 40 minutes.

43 "Chennai": Pull a Rabbit from a Hat Hard 60 m Fix Pro

Scenario: "Chennai": Pull a Rabbit from a Hat

Level: Hard

Type: Fix

Access: Paid

Description: There is a RabbitMQ (RMQ) cluster defined in a docker-compose.yml file.

Bring this system up and then run the producer.py script in such a way that it is able to send messages to RMQ. In particular, you have to send the message "hello-lwc".

- RMQ is a queuing system: messages are put in the queue with a "producer" and they are taken out from the other side by a "consumer". The queue name has to be the same for both.

- To send the message "hello-lwc": python3 ~/producer.py hello-lwc. Should return Message sent to RabbitMQ. "IncompatibleProtocolError" means RMQ is not working properly.

- To test consuming it: python3 ~/consumer.py, this will retrieve the next message from the queue and print it. Once everything is working send more than one message so there's at least one in the queue when the validation runs.

- Do not change the consumer.py and producer.py files; if you do the Check My Solution will fail.

Test: python3 ~/consumer.py returns hello-lwc

See /home/admin/agent/check.sh for the exact test.

Time to Solve: 60 minutes.

44 "Florence": Database Migration Hell Hard 60 m Fix Pro

Scenario: "Florence": Database Migration Hell

Level: Hard

Type: Fix

Access: Paid

Description: You are working as a DevOps Engineer at a company. Another team member has left the company, leaving the docker-compose.yml of a database-backed web application unfinished.

Generally, the problem revolves around the database migration and Docker Compose.

Additionally, there is an Nginx server in front of the application, and you need to fix access to it as well.

The source code is in /home/admin/app

Credit: Kamil Błaż

Test: curl --cacert /etc/nginx/certs/sadserver.crt https://sadserver.local returns a message containing "ready to serve requests"

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute

Time to Solve: 60 minutes.

45 "Zaragoza": Test changing critical files Hard 40 m Do Pro

Scenario: "Zaragoza": Test changing critical files

Level: Hard

Type: Do

Access: Paid

Description: The goal is to make the script /home/admin/agent/check.sh return OK, without editing the original /etc/hosts file.

Think of testing changes in the critical directory /etc in a safe way. In this case, adding "127.0.0.1 my.local.test" to /etc/hosts .

There would be many ways of doing this with "sudo" access, like the usual procedure of making a copy of the config file, editing the copy and then copying or renaming it back over the original file. In our case, to rule out all those simple solutions, there are no general "sudo" privileges in this scenario (but there may be for some commands).

Test: The string my.local.test is in /etc/hosts
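
For reference, this is what a well-formed hosts entry for the required mapping looks like when parsed. The Python helper below is a hypothetical illustration that is stricter than the stated test, which only looks for the string; it says nothing about how to make the change without editing the original file.

```python
def hosts_maps(hosts_text: str, ip: str, hostname: str) -> bool:
    """True if some uncommented line maps `hostname` to `ip`."""
    for raw_line in hosts_text.splitlines():
        # Drop comments, then split into: address, name, optional aliases.
        fields = raw_line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == ip and hostname in fields[1:]:
            return True
    return False


# Hypothetical file contents for illustration.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
# 127.0.0.1 commented.out.test
127.0.0.1 my.local.test
"""

print(hosts_maps(SAMPLE_HOSTS, "127.0.0.1", "my.local.test"))       # True
print(hosts_maps(SAMPLE_HOSTS, "127.0.0.1", "commented.out.test"))  # False
```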

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

46 "Amygdala": Do you have enough insight to see the secrets? Hard 40 m Fix Pro

Scenario: "Amygdala": Do you have enough insight to see the secrets?

Level: Hard

Type: Fix

Access: Paid

Description: Troubleshoot and fix a Kubernetes web application running in the app namespace. Make the deployment run successfully so that it returns Hello handsome! when you curl it.

Fix first your admin user access to the local Kubernetes cluster; the KUBECONFIG environment variable must be set to $HOME/.kube/config.

You have full admin access to a Vault server (containing the secrets you need) from the admin user. All the manifests used for the application are in the /home/admin/manifests directory.

Test: Running: POD_IP=$(kubectl get po -n app -l app=app -o jsonpath='{.items[0].status.podIP}'); curl http://$POD_IP returns Hello handsome!.
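
The jsonpath expression in the test picks the IP of the first pod matching the label selector. A rough Python equivalent, applied to the JSON that kubectl prints with -o json, is sketched below; the function name and the sample IP are made up for illustration.

```python
import json


def first_pod_ip(pod_list_json: str) -> str:
    """Equivalent of the jsonpath {.items[0].status.podIP} on a pod list."""
    pods = json.loads(pod_list_json)
    return pods["items"][0]["status"]["podIP"]


# Hypothetical, heavily trimmed output of: kubectl get po -n app -l app=app -o json
SAMPLE = json.dumps({"items": [{"status": {"podIP": "10.42.0.17"}}]})

print(first_pod_ip(SAMPLE))  # 10.42.0.17
```

If no pod matches the selector, items is empty and the lookup fails, so the test cannot produce a usable IP until the deployment has at least one pod with an address assigned.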

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

47 "Cabedelo": Harbor full of issues Hard 40 m Fix Pro

Scenario: "Cabedelo": Harbor full of issues

Level: Hard

Type: Fix

Access: Paid

Description: You need to build a Docker image and push it, without changing the Dockerfile, to your company's Harbor registry, which is running at harbor.sadservers.local with its home directory at /opt/harbor. You have full admin access with the credentials admin:Harbor12345. The source code and the Dockerfile are in the ~/app directory. The image name must be harbor.sadservers.local/images/app:1.0.0. The application is also expected to be up and running at localhost:5000 in a container named app.

IMPORTANT. Do not:
1. Generate new internal certificates
2. Change the Dockerfile
3. Change the /opt/harbor.yml file

Test: You are able to pull the application image from Harbor:
docker rmi harbor.sadservers.local/images/app:1.0.0
docker pull harbor.sadservers.local/images/app:1.0.0


You can access the application; curl localhost:5000 returns Hello world!

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

48 "Karakorum": WTFIT – What The Fun Is This? Hard 40 m Fix Pro

Scenario: "Karakorum": WTFIT – What The Fun Is This?

Level: Hard

Type: Fix

Access: Paid

Description: (NOTE: this is not a new scenario but an existing Pro one temporarily available to all users as the last Advent of SysAdmin 2025 scenario).

There's a binary at /home/admin/wtfit, and nobody knows how it works or what it does ("what the fun is this"). Someone remembers something about wtfit needing to communicate with a service in order to start.

Run this wtfit program so it doesn't exit with an error, fixing or working around things that you need but are broken in this server.

Test: Running /home/admin/wtfit returns OK.

Time to Solve: 40 minutes.

49 "London": Ollama LLM troubles Hard 40 m Fix Pro

Scenario: "London": Ollama LLM troubles

Level: Hard

Type: Fix

Access: Paid

Description: An AI agent has been deployed to production as a container called ai-agent managed by the Docker Compose configuration /home/admin/app/docker-compose.yaml. This ai-agent container relies on an Ollama LLM backend to generate a report but hasn't generated any yet. Your mission is to restore the broken agent-to-LLM (Ollama) connectivity, and tune the agent configuration so it can produce a report in /home/admin/app/agent/report.json. Example of the expected output:

{
  "summary": "Nginx is failing to reach its upstream service",
  "root_causes": [
    {
      "service": "nginx",
      "error": "connection refused to upstream 127.0.0.1:9999",
      "severity": "high"
    }
  ],
  "recommended_actions": "Fix upstream port configuration"
}
Note: The system consists of a group of dummy nginx containers that generate logs and send them to a central rsyslog container. The logs are then shared on a volume with the ai-agent container; from there the agent picks up the logs and passes them, together with a prompt, to the LLM server so it can produce the desired answer in the expected JSON format. You don't need to troubleshoot any container other than the ai-agent container (the agent service in Docker Compose).

Test: The command docker compose up -d agent under the directory /home/admin/app must create the report file /home/admin/app/agent/report.json. The format of the answer must be as specified in the description.
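
One way to sanity-check a generated report against the example format is sketched below in Python. The required keys are taken from the example output in the description; how strictly check.sh validates the file is not stated, so treat this as an assumption rather than the actual test.

```python
import json
from typing import List

REQUIRED_TOP_LEVEL = ("summary", "root_causes", "recommended_actions")
REQUIRED_CAUSE_KEYS = ("service", "error", "severity")


def report_problems(report_text: str) -> List[str]:
    """Return a list of format problems; an empty list means the shape matches."""
    try:
        report = json.loads(report_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(report, dict):
        return ["top-level value is not an object"]
    problems = [f"missing key: {key}" for key in REQUIRED_TOP_LEVEL if key not in report]
    causes = report.get("root_causes", [])
    if not isinstance(causes, list):
        problems.append("root_causes is not a list")
        return problems
    for index, cause in enumerate(causes):
        if not isinstance(cause, dict):
            problems.append(f"root_causes[{index}] is not an object")
            continue
        for key in REQUIRED_CAUSE_KEYS:
            if key not in cause:
                problems.append(f"root_causes[{index}] missing key: {key}")
    return problems


# The example report from the scenario description.
EXAMPLE = """
{
  "summary": "Nginx is failing to reach its upstream service",
  "root_causes": [
    {
      "service": "nginx",
      "error": "connection refused to upstream 127.0.0.1:9999",
      "severity": "high"
    }
  ],
  "recommended_actions": "Fix upstream port configuration"
}
"""

print(report_problems(EXAMPLE))  # []
```

Running the same function on the contents of /home/admin/app/agent/report.json shows quickly whether the LLM's answer drifted from the expected structure, for example by wrapping the JSON in extra prose.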

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

50 "Anatolia": compromised server Hard 40 m Fix Pro

Scenario: "Anatolia": compromised server

Level: Hard

Type: Fix

Access: Paid

Description: This web server has been compromised and is no longer serving the home page. Your DevOps troubleshooting skills are urgently needed to solve the mystery of the missing home page and restore the integrity of the server.

Note: The default configuration files under /etc/apache2 are not the problem.

This scenario is based on a real server that was "hacked". Ideally you'd recover from infrastructure-as-code playbooks and clean data backups on a new server with the vulnerabilities fixed. Instead, in this exercise you are asked to manually clean the compromised server, restore it to a working condition and, ideally, find out how the server was broken into. The solution test only checks that the web service is working.

Test: curl localhost must return SadServer - Anatolia

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

51 "Sapporo": ephemeral tokens Hard 40 m Fix Pro

Scenario: "Sapporo": ephemeral tokens

Level: Hard

Type: Fix

Access: Paid

Description: The Sapporo gate API on this host should answer on http://127.0.0.1:9180/ with a body containing OK.

A background service writes short-lived tokens to /var/lib/sapporo/pulse (each value is visible for only a fraction of a second, then the file is cleared again). The gate compares /home/admin/sapporo/active-token against the latest emitted token.

The installed collector at /home/admin/sapporo-collector.sh (triggered by sapporo-collector.timer once per minute) never keeps up; active-token stays empty or stale and the gate keeps failing.

Fix collection so the current token is captured reliably and the gate returns OK.

Test: curl http://127.0.0.1:9180/ returns a response whose body contains OK, and /home/admin/sapporo/active-token holds a token matching the current pulse (format SAPPORO- followed by eight hex digits).
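
The token format and the timing problem can be sketched in Python as follows. This is one possible approach (tight polling), not the intended solution: the function names and the polling interval are assumptions, the pattern accepts both upper and lower case hex because the description does not say which is used, and whether active-token should end with a newline is also a guess.

```python
import re
import time
from pathlib import Path
from typing import Optional

# "SAPPORO-" followed by eight hex digits, as stated in the test.
TOKEN_RE = re.compile(r"SAPPORO-[0-9A-Fa-f]{8}")


def extract_token(text: str) -> Optional[str]:
    """Return the token if `text` is exactly one well-formed token, else None."""
    match = TOKEN_RE.fullmatch(text.strip())
    return match.group(0) if match else None


def collect_once(pulse: Path, active: Path) -> bool:
    """Copy the current token to `active`; return False if none was visible."""
    try:
        token = extract_token(pulse.read_text())
    except OSError:
        return False
    if token is None:
        return False  # the file was empty or cleared; keep the previous token
    # Write to a temporary file and rename, so readers never see a partial token.
    tmp = active.with_name(active.name + ".tmp")
    tmp.write_text(token + "\n")
    tmp.replace(active)
    return True


def collect_forever(pulse: Path, active: Path, interval: float = 0.05) -> None:
    """Poll far more often than once per minute (interval is a guess)."""
    while True:
        collect_once(pulse, active)
        time.sleep(interval)


print(extract_token("SAPPORO-1a2b3c4d\n"))  # SAPPORO-1a2b3c4d
print(extract_token(""))                    # None
```

The design point is that a once-per-minute timer will almost always read the file while it is empty, so any working collector has to sample (or be notified) at a rate matched to the fraction of a second during which the token is visible, and must never overwrite a good token with an empty read.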

The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.

Time to Solve: 40 minutes.

Updated: 2026-06-26 23:27 UTC – f0e2403