Redis troubleshooting

Connection refused

Server not running or not listening on the expected interface. Check systemctl status redis-server and redis-cli PING. Verify bind in redis.conf — default may be 127.0.0.1 only. Confirm port with ss -tlnp | grep 6379. Read logs: journalctl -u redis-server -e.
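To see which interfaces the config actually binds, a small helper can print the active bind addresses. This is a minimal sketch: bind_addresses is a made-up name, and the config path in the usage comment is an assumption that varies by distribution.

```shell
# bind_addresses FILE
# Print each address from the active (uncommented) bind directives in FILE.
# Hypothetical helper, not part of Redis.
bind_addresses() {
  awk '$1 == "bind" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# Usage (path is an assumption; adjust to your install):
#   bind_addresses /etc/redis/redis.conf
```

If only loopback addresses are printed, clients on other hosts will be refused even though the server is up.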

NOAUTH / WRONGPASS

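Server requires authentication. Connect with redis-cli -a password, or for ACL users redis-cli --user username --pass password (inside a session: AUTH username password). Check requirepass with CONFIG GET requirepass, or list ACL users with ACL LIST. Separately, protected-mode can reject remote connections when no password is set, even when bind is 0.0.0.0; that shows up as a DENIED error rather than NOAUTH.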
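The error prefix says which of these cases applies. A rough sketch of a triage helper follows; auth_hint is a hypothetical name and the hint strings are suggestions written for this example, not Redis output.

```shell
# auth_hint "ERROR TEXT"
# Map an authentication-related Redis error reply to a likely next step.
# Hypothetical helper; matches on the error prefix anywhere in the text.
auth_hint() {
  case "$1" in
    *NOAUTH*)    echo "no credentials sent: use AUTH, or redis-cli --user/--pass" ;;
    *WRONGPASS*) echo "credentials rejected: check requirepass or the ACL user" ;;
    *DENIED*)    echo "protected mode: set a password or connect from loopback" ;;
    *)           echo "not an authentication error" ;;
  esac
}

# Usage:
#   auth_hint "$(redis-cli PING 2>&1)"
# redis-cli also reads the password from the REDISCLI_AUTH environment
# variable, which keeps it out of the command-line arguments:
#   REDISCLI_AUTH='your-password' redis-cli PING
```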

OOM / out of memory

Redis hit maxmemory or the host ran out of RAM. Check used_memory and maxmemory in INFO memory, and evicted_keys in INFO stats. With maxmemory-policy noeviction, writes fail with an OOM error once the limit is reached. Fix: raise maxmemory, set an eviction policy such as allkeys-lru for caches, delete stale keys, or scale to a larger instance / cluster.
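How close the instance is to its limit can be read straight from INFO memory. A minimal sketch, assuming the helper name memory_pressure (made up here); it strips the carriage returns that INFO output carries when piped.

```shell
# memory_pressure
# Read INFO memory output on stdin and print used_memory as a percentage
# of maxmemory. Hypothetical helper name.
memory_pressure() {
  tr -d '\r' | awk -F: '
    $1 == "used_memory" { used = $2 }
    $1 == "maxmemory"   { limit = $2 }
    END {
      if (limit + 0 == 0) print "maxmemory unset (no limit)"
      else printf "%.1f%% of maxmemory used\n", used * 100 / limit
    }'
}

# Usage:
#   redis-cli INFO memory | memory_pressure
```

A maxmemory of 0 means Redis itself imposes no limit, so an OOM in that case points at the host rather than the Redis setting.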

MISCONF: can't save in background

Background save (RDB) failed, often because the disk is full or the Redis user cannot write to dir. Redis stops accepting writes while stop-writes-on-bgsave-error is yes. Check df -h on the data directory, fix permissions, then run BGSAVE and confirm it succeeds. Only as a temporary measure: CONFIG SET stop-writes-on-bgsave-error no (not recommended long-term).
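Whether the last snapshot succeeded is recorded in INFO persistence. A small sketch for checking it after a fix; bgsave_status is a hypothetical helper name.

```shell
# bgsave_status
# Read INFO persistence output on stdin and print the status of the last
# RDB background save ("ok" or "err"). Hypothetical helper name.
bgsave_status() {
  tr -d '\r' | awk -F: '$1 == "rdb_last_bgsave_status" { print $2 }'
}

# Usage:
#   redis-cli BGSAVE
#   redis-cli INFO persistence | bgsave_status
#   df -h "$(redis-cli CONFIG GET dir | tail -n 1)"   # free space on the data directory
```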

Replication broken or lagging

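Run INFO replication on primary and replica. On the replica, check master_link_status (should be up) and master_last_io_seconds_ago. Common causes: wrong masterauth, a firewall between the nodes, or a primary that restarted with empty data. A READONLY error means a client tried to write to a read-only replica, which is a routing problem rather than a broken link. Resync may require resetting replication (REPLICAOF NO ONE, then REPLICAOF host port) or a full RDB re-seed, which is slow on large datasets.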
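The two replica fields can be turned into a pass/fail check for monitoring scripts. A minimal sketch: replica_healthy is a made-up name, and the 10-second default threshold is an arbitrary assumption to tune.

```shell
# replica_healthy [MAX_IDLE_SECONDS]
# Read INFO replication output (from a replica) on stdin. Exit 0 when the
# link is up and the last I/O from the primary is recent enough, else 1.
# Hypothetical helper; the default threshold of 10s is an assumption.
replica_healthy() {
  tr -d '\r' | awk -F: -v max_idle="${1:-10}" '
    $1 == "master_link_status"         { status = $2 }
    $1 == "master_last_io_seconds_ago" { idle = $2 }
    END { exit !(status == "up" && idle + 0 <= max_idle + 0) }'
}

# Usage:
#   redis-cli INFO replication | replica_healthy 10 || echo "replica link unhealthy"
```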

Sudden slowness / timeouts

Redis executes commands on a single thread, so one blocking command stalls every client. Check SLOWLOG GET 20 for offenders. Suspects: KEYS (prefer SCAN), large SORT, huge LRANGE, saving a huge RDB on a slow disk. Use CLIENT LIST to see idle vs active clients. Brief latency spikes during BGSAVE or AOF rewrite are common on busy instances.
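CLIENT LIST prints one line per connection with an idle= field in seconds, so the idle vs active split can be counted mechanically. A rough sketch; idle_summary is a hypothetical name and the 60-second cutoff is an assumption.

```shell
# idle_summary [THRESHOLD_SECONDS]
# Read CLIENT LIST output on stdin and count clients idle for at least
# THRESHOLD seconds versus the rest. Hypothetical helper; default 60s.
idle_summary() {
  tr -d '\r' | awk -v t="${1:-60}" '
    {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^idle=/) {
          if (substr($i, 6) + 0 >= t) idle++; else active++
        }
    }
    END { printf "active=%d idle=%d\n", active, idle }'
}

# Usage:
#   redis-cli CLIENT LIST | idle_summary 60
```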

Too many connections

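Hit maxclients (default 10000, lowered automatically if the file descriptor limit is too small). Count with INFO clients (connected_clients) or redis-cli CLIENT LIST | wc -l. Find connection leaks in apps (missing pool limits). Emergency only: CLIENT KILL TYPE normal SKIPME yes drops every normal client connection except your own, not just stale ones. Long-term: fix pooling, raise the limit, or use a proxy.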
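Grouping connections by source address usually points at the leaking application host. A minimal sketch, with clients_by_addr as a made-up helper name; it reads the addr= field of CLIENT LIST and drops the port.

```shell
# clients_by_addr
# Read CLIENT LIST output on stdin and print "COUNT ADDRESS" per client
# address (port stripped), busiest first. Hypothetical helper name.
clients_by_addr() {
  tr -d '\r' | awk '
    {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^addr=/) {
          a = substr($i, 6)
          sub(/:[0-9]+$/, "", a)   # drop the trailing :port
          n[a]++
        }
    }
    END { for (a in n) print n[a], a }' | sort -rn
}

# Usage:
#   redis-cli CLIENT LIST | clients_by_addr | head
```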

Data disappeared after restart

Persistence may be disabled or pointed at the wrong directory. Check CONFIG GET dir, CONFIG GET save, and CONFIG GET appendonly. If the instance is only a cache with persistence off, data loss on restart is expected. Verify that dump.rdb and the AOF (appendonly.aof, or the appendonlydir directory on Redis 7+) exist and that their timestamps match expectations. Ensure systemd starts Redis with the correct config file.
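The save and appendonly settings together decide whether anything is written to disk. A small sketch of that decision; persistence_state is a hypothetical name, and it only inspects the two values passed in, not the files on disk.

```shell
# persistence_state SAVE_RULES APPENDONLY
# Print "enabled" if RDB save rules are set or AOF is on, else "disabled".
# Hypothetical helper; it checks config values only.
persistence_state() {
  if [ -n "$1" ] || [ "$2" = "yes" ]; then
    echo "enabled"
  else
    echo "disabled"
  fi
}

# Usage (CONFIG GET prints the name, then the value on the next line):
#   persistence_state "$(redis-cli CONFIG GET save | tail -n 1)" \
#                     "$(redis-cli CONFIG GET appendonly | tail -n 1)"
#   systemctl cat redis-server | grep ExecStart   # which config file is loaded
```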

Debugging workflow

1. Is Redis up?

systemctl status redis-server
redis-cli PING
ss -tlnp | grep 6379

2. Quick health snapshot

redis-cli INFO server
redis-cli INFO memory
redis-cli INFO persistence
redis-cli INFO replication

3. Find slow or blocking commands

redis-cli SLOWLOG GET 20
redis-cli CLIENT LIST
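The health snapshot from step 2 can be condensed to a handful of fields for quick comparison between nodes. A minimal sketch: health_snapshot is a made-up name and the choice of fields is an assumption about what matters most.

```shell
# health_snapshot
# Read INFO output on stdin and print a few key fields as name=value.
# Hypothetical helper; the field selection is an assumption.
health_snapshot() {
  tr -d '\r' | awk -F: '
    $1 == "redis_version" || $1 == "uptime_in_seconds" ||
    $1 == "used_memory_human" || $1 == "rdb_last_bgsave_status" ||
    $1 == "role" { printf "%s=%s\n", $1, $2 }'
}

# Usage:
#   redis-cli INFO | health_snapshot
```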

