MongoDB troubleshooting

Connection refused

mongod is not running, or is not listening on the address and port the client uses. Check systemctl status mongod and ss -tlnp | grep 27017. Verify net.bindIp in /etc/mongod.conf: binding only to 127.0.0.1 blocks remote clients. Read the logs with tail -f /var/log/mongodb/mongod.log or journalctl -u mongod -e.
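
To illustrate the bindIp check, here is a minimal sketch in plain JavaScript. The helper name acceptsRemoteClients is hypothetical, and it assumes you pass it the comma-separated net.bindIp value copied from /etc/mongod.conf. It does not read the file or resolve hostnames.

```javascript
// Hypothetical helper: does this net.bindIp value let remote clients connect?
// Assumes bindIp is the comma-separated string from /etc/mongod.conf.
function acceptsRemoteClients(bindIp) {
  const localOnly = new Set(["127.0.0.1", "localhost", "::1"]);
  return bindIp
    .split(",")
    .map((addr) => addr.trim())
    // Ignore empty entries and Unix domain socket paths (they start with "/").
    .filter((addr) => addr.length > 0 && !addr.startsWith("/"))
    // Any entry that is not a loopback name or address is reachable remotely.
    .some((addr) => !localOnly.has(addr));
}
```

If this returns false for your configured value, remote clients will see "connection refused" even though mongod is healthy.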

Authentication failed

Wrong user, password, or authentication database. Users are scoped to the database where they were created, so authenticate against that database (often admin for administrative users) by passing it as authSource. To list users, switch to that database in an admin session and run db.getUsers(). Check whether security.authorization is set to enabled in /etc/mongod.conf; when it is, the connection string must include credentials.
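
A common source of authentication failures is a password with reserved characters that was pasted into the URI unescaped. The sketch below builds a connection string with percent-encoded credentials and an explicit authSource. The function name buildMongoUri, the parameter names, and the example hostnames are assumptions made for illustration.

```javascript
// Hypothetical helper: assemble a mongodb:// URI with encoded credentials.
// authSource defaults to "admin" here only as an illustration.
function buildMongoUri({ user, password, hosts, database = "", authSource = "admin", replicaSet }) {
  // Characters such as @ : / in credentials must be percent-encoded.
  const creds = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const options = [`authSource=${encodeURIComponent(authSource)}`];
  if (replicaSet) options.push(`replicaSet=${encodeURIComponent(replicaSet)}`);
  return `mongodb://${creds}@${hosts.join(",")}/${database}?${options.join("&")}`;
}

// Example call with placeholder values.
const exampleUri = buildMongoUri({
  user: "app",
  password: "p@ss:w/rd",
  hosts: ["db1.example.net:27017"],
  database: "shop",
});
```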

not primary / NotWritablePrimary

The client tried to write to a SECONDARY, or wrote during a failover. Check db.hello().isWritablePrimary, or rs.status() for the current PRIMARY. List all replica set members in the connection string and set the replicaSet option so the driver can find the new primary on its own. After a primary failure, wait for the election to complete, which usually takes seconds. Do not hardcode the primary hostname in apps.
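
As a rough illustration of reading rs.status(), the sketch below picks the current primary out of a status document. The function name findPrimary is hypothetical. It relies only on the members array and the name and stateStr fields that rs.status() reports, and it returns null while an election is in progress.

```javascript
// Hypothetical helper: return the host:port of the current PRIMARY, or null.
function findPrimary(status) {
  const primary = (status.members || []).find((m) => m.stateStr === "PRIMARY");
  return primary ? primary.name : null;
}

// In mongosh this could be called as: findPrimary(rs.status())
```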

Replica set not initialized / REMOVED state

A mongod that has never been through rs.initiate() provides no HA: without replSetName it runs standalone, and with it the set stays uninitialized. Run rs.initiate() on one member, then rs.add() for the others (all must share the same replSetName). For members in REMOVED or DOWN state, verify that the hostnames in rs.conf() match what the other members can resolve (use FQDNs, not localhost, across hosts).
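
rs.initiate() also accepts a configuration document, which can be built ahead of time. The sketch below builds one and rejects loopback names in a multi-host set. The function name buildReplSetConfig, the set name, and the hostnames are placeholders chosen for illustration.

```javascript
// Hypothetical helper: build a config document suitable for rs.initiate().
function buildReplSetConfig(setName, hosts) {
  const isLoopback = (h) => /^(localhost|127\.0\.0\.1|\[::1\])(:\d+)?$/.test(h);
  // Loopback names cannot be resolved meaningfully by members on other hosts.
  if (hosts.length > 1 && hosts.some(isLoopback)) {
    throw new Error("use resolvable hostnames, not localhost, for multi-host sets");
  }
  return { _id: setName, members: hosts.map((host, i) => ({ _id: i, host })) };
}

// In mongosh, with placeholder hostnames:
//   rs.initiate(buildReplSetConfig("rs0", ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"]))
```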

Replication lag growing

On the PRIMARY, run rs.printSecondaryReplicationInfo() and compare each secondary's syncedTo time with the primary's optime. Causes: secondary disk slower than primary, network issues, heavy reads on the secondary competing with replication, or large index builds on the secondary. A secondary in RECOVERING is usually catching up; monitor it until it returns to SECONDARY. An oplog that is too small can force a full resync if a secondary falls further behind than the oplog covers.
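
The same lag figure can be derived from rs.status() by subtracting each secondary's optimeDate from the primary's. A minimal sketch follows. The function name replicationLagSeconds is hypothetical, and it assumes optimeDate values are JavaScript Date objects, as mongosh returns them.

```javascript
// Hypothetical helper: per-secondary lag in seconds, from an rs.status() document.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // without a primary there is nothing to compare against
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      // Subtracting two Dates yields milliseconds.
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

// In mongosh: replicationLagSeconds(rs.status())
```

A single reading says little. What signals trouble is a value that keeps growing across repeated samples.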

Election loops / no PRIMARY

No member can reach a majority of votes. Common causes: an even number of voting members split evenly with no arbiter to break the tie, a network partition splitting the set, or members down. A majority of voting members must be reachable. Check rs.status() for members with health: 0. Fix connectivity, restore down nodes, or reconfigure with rs.reconfig(), carefully, since an improper reconfig can make things worse.
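
The majority rule is simple arithmetic: a set with n voting members needs floor(n / 2) + 1 votes. The sketch below, with hypothetical function names, shows why a fourth voting member adds no fault tolerance over three.

```javascript
// Votes required to elect a primary in a set with this many voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be lost while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - majorityNeeded(votingMembers);
}

// Whether the reachable voters are enough to elect a primary.
function canElectPrimary(votingMembers, reachableVoters) {
  return reachableVoters >= majorityNeeded(votingMembers);
}
```

With 3 voting members the set tolerates 1 failure. With 4 it still tolerates only 1, and a 2/2 partition leaves neither side able to elect a primary.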

Disk full / WiredTiger errors

mongod fails writes, and may shut down, when the disk is full. Check df -h on the filesystem holding storage.dbPath. Find large collections with db.stats() and db.collection.stats(). Compact or archive old data; add TTL indexes for expiring documents. Free space before restarting. After an unclean shutdown a repair may be required, but treat it as a last resort.
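
A TTL index is an ordinary single-field index with an expireAfterSeconds option on a field that holds dates. The sketch below builds the arguments for createIndex. The function name ttlIndexSpec, the field createdAt, and the collection events are placeholders, not names from any real schema.

```javascript
// Hypothetical helper: build createIndex arguments for a TTL index.
function ttlIndexSpec(dateField, days) {
  if (!Number.isFinite(days) || days <= 0) {
    throw new Error("days must be a positive number");
  }
  return {
    keys: { [dateField]: 1 },
    // Documents become eligible for deletion this long after the date stored in dateField.
    options: { expireAfterSeconds: Math.round(days * 86400) },
  };
}

// In mongosh, with placeholder names:
//   const spec = ttlIndexSpec("createdAt", 30);
//   db.events.createIndex(spec.keys, spec.options);
```

Expiry is handled by a background task, so space is reclaimed gradually, not at the moment the index is created.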

Slow queries

Run db.collection.explain("executionStats").find(...). A COLLSCAN stage on a large collection usually means the query needs an index. Enable the profiler (db.setProfilingLevel(1, { slowms: 100 })) or check db.currentOp() for long-running ops. A missing index, or a compound index with its fields in the wrong order, is a frequent cause.
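
Explain output is a nested plan tree, so it helps to flatten it before reading. The sketch below walks the winning plan and reports whether any stage is a COLLSCAN, plus the ratio of documents examined to documents returned. The function name summarizeExplain is hypothetical, and the handling of a nested queryPlan field is an assumption about how some server versions shape the output.

```javascript
// Hypothetical helper: summarize an explain("executionStats") result.
function summarizeExplain(explain) {
  const winning = explain.queryPlanner.winningPlan;
  // Assumption: some versions nest the stage tree under winningPlan.queryPlan.
  const root = winning.queryPlan || winning;
  const stages = [];
  const walk = (node) => {
    if (!node) return;
    stages.push(node.stage);
    walk(node.inputStage);                   // single child
    (node.inputStages || []).forEach(walk);  // multiple children, e.g. under OR
  };
  walk(root);
  const stats = explain.executionStats || {};
  return {
    stages,
    collectionScan: stages.includes("COLLSCAN"),
    // A ratio far above 1 suggests the index (if any) is not selective for this query.
    docsExaminedPerReturned: stats.nReturned > 0 ? stats.totalDocsExamined / stats.nReturned : null,
  };
}
```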

Too many connections

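The server hit its connection limit (the default is high, but applications that leak connections can still exhaust it). Check db.serverStatus().connections for the current and available counts. Fix connection pooling in applications. To stop long-running operations, identify them in db.currentOp() and kill them with db.killOp(opid).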
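
The connections document reports current and available counts, and their sum is the effective limit. A minimal sketch of turning that into a usage fraction follows. The function name connectionUsage is hypothetical.

```javascript
// Hypothetical helper: how close the server is to its connection limit.
function connectionUsage(connections) {
  const limit = connections.current + connections.available;
  return {
    limit,
    usedFraction: limit > 0 ? connections.current / limit : null,
  };
}

// In mongosh: connectionUsage(db.serverStatus().connections)
```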

Debugging workflow

1. Is mongod up?

systemctl status mongod
ss -tlnp | grep 27017
mongosh --eval "db.adminCommand({ ping: 1 })"

2. Replica set health

mongosh --eval "rs.status()"
mongosh --eval "rs.printSecondaryReplicationInfo()"

3. Active operations and slow queries (run inside mongosh; system.profile is empty unless the profiler is enabled)

db.currentOp({ active: true, secs_running: { $gt: 5 } })
db.system.profile.find().sort({ ts: -1 }).limit(5)
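
When the currentOp output is long, it can help to reduce it to the fields needed for a killOp decision. The sketch below filters and sorts the inprog array of a db.currentOp() result. The function name longRunningOps is hypothetical.

```javascript
// Hypothetical helper: active operations running longer than minSeconds, slowest first.
function longRunningOps(currentOp, minSeconds) {
  return (currentOp.inprog || [])
    .filter((op) => op.active && op.secs_running > minSeconds)
    .sort((a, b) => b.secs_running - a.secs_running)
    // Keep only the fields needed to decide whether to call db.killOp(opid).
    .map(({ opid, op, ns, secs_running }) => ({ opid, op, ns, secs_running }));
}

// In mongosh: longRunningOps(db.currentOp(), 5)
```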
