MongoDB troubleshooting
Connection refused
mongod not running or not listening. Check
systemctl status mongod and
ss -tlnp | grep 27017. Verify net.bindIp in
/etc/mongod.conf — binding only to 127.0.0.1
blocks remote clients. Read logs:
tail -f /var/log/mongodb/mongod.log or
journalctl -u mongod -e.
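The bindIp check can be sketched as a small helper. This is only an illustration: the function name bindsOnlyLocal is hypothetical, and it assumes net.bindIp is given in the comma-separated string form and that only 127.0.0.1, localhost, ::1, and Unix socket paths count as local.

```javascript
// Hypothetical helper (not part of MongoDB): returns true when every entry
// in a net.bindIp value is local, meaning remote clients cannot connect.
function bindsOnlyLocal(bindIp) {
  const loopback = new Set(["127.0.0.1", "localhost", "::1"]);
  const entries = bindIp
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
  // Assumption: entries starting with "/" are Unix domain socket paths.
  const isLocal = (entry) => loopback.has(entry) || entry.startsWith("/");
  return entries.length > 0 && entries.every(isLocal);
}

console.log(bindsOnlyLocal("127.0.0.1"));          // true
console.log(bindsOnlyLocal("127.0.0.1,10.0.0.5")); // false
console.log(bindsOnlyLocal("0.0.0.0"));            // false
```

A true result for a server that remote applications must reach points at bindIp rather than at the network.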
Authentication failed
Wrong user, password, or auth database. Users are scoped to the database where they were
created, so authenticate against that db (often admin for admin users).
Check db.getUsers() from an admin session. Confirm that
security.authorization in mongod.conf is set the way you expect (enabled or disabled).
The connection string must include credentials when auth is on.
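A minimal sketch of a connection string that carries credentials and names the auth database follows. The helper buildMongoUri, the user, the password, and the hostnames are all hypothetical; authSource is the standard connection string option for the database the user was created in.

```javascript
// Hypothetical helper: builds a connection string with credentials and
// the database to authenticate against (authSource).
function buildMongoUri({ user, password, hosts, database, authSource }) {
  // Percent-encode credentials so characters like "@" or ":" do not break the URI.
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const params = new URLSearchParams({ authSource });
  return `mongodb://${credentials}@${hosts.join(",")}/${database}?${params}`;
}

// Example values are placeholders, not real credentials.
const uri = buildMongoUri({
  user: "appUser",
  password: "p@ss:word",
  hosts: ["db1.example.com:27017"],
  database: "appdb",
  authSource: "admin",
});
console.log(uri);
// mongodb://appUser:p%40ss%3Aword@db1.example.com:27017/appdb?authSource=admin
```

An unencoded special character in the password is a common reason a correct password is still rejected.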
not primary / NotWritablePrimary
Client tried to write to a SECONDARY or during failover. Check
db.hello().isWritablePrimary or rs.status() for the
current PRIMARY. List all replica set members in the connection string and set the
replicaSet option so the driver can find the new primary. Wait for the election to
complete after a primary failure (usually seconds). Do not hardcode the primary hostname in apps.
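To illustrate reading the current PRIMARY out of rs.status(), here is a small sketch. The helper findPrimary and the sample document are hypothetical; the sample keeps only the members[].name and members[].stateStr fields from the real output.

```javascript
// Hypothetical helper: given the document returned by rs.status(),
// return the host:port of the current PRIMARY, or null (e.g. mid-election).
function findPrimary(status) {
  const members = status.members || [];
  const primary = members.find((m) => m.stateStr === "PRIMARY");
  return primary ? primary.name : null;
}

// Trimmed sample shaped like rs.status() output; hostnames are made up.
const sampleStatus = {
  set: "rs0",
  members: [
    { name: "db1.example.com:27017", stateStr: "SECONDARY" },
    { name: "db2.example.com:27017", stateStr: "PRIMARY" },
    { name: "db3.example.com:27017", stateStr: "SECONDARY" },
  ],
};
console.log(findPrimary(sampleStatus));    // db2.example.com:27017
console.log(findPrimary({ members: [] })); // null
```

In mongosh the same function could be called as findPrimary(rs.status()).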
Replica set not initialized / REMOVED state
Single mongod without rs.initiate() is standalone —
no HA. Run rs.initiate() on one member, then rs.add()
for others (all must share replSetName). Members in
REMOVED or DOWN state: verify hostname in
rs.conf() matches what other members can resolve (use FQDNs,
not localhost, across hosts).
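As an alternative to rs.initiate() followed by rs.add(), rs.initiate() also accepts a full config document. The sketch below builds one and rejects loopback hostnames. The helper buildReplSetConfig, the set name rs0, and the hostnames are hypothetical.

```javascript
// Hypothetical helper: builds a config document for rs.initiate(),
// refusing hostnames that other hosts could not resolve to this member.
function buildReplSetConfig(replSetName, hosts) {
  const unresolvable = hosts.filter((h) => /^(localhost|127\.0\.0\.1)(:|$)/.test(h));
  if (unresolvable.length > 0) {
    throw new Error(`not resolvable from other hosts: ${unresolvable.join(", ")}`);
  }
  return {
    _id: replSetName, // must equal replSetName in every member's mongod.conf
    members: hosts.map((host, i) => ({ _id: i, host })),
  };
}

const replConfig = buildReplSetConfig("rs0", [
  "db1.example.com:27017",
  "db2.example.com:27017",
  "db3.example.com:27017",
]);
console.log(JSON.stringify(replConfig, null, 2));
```

In mongosh this would be passed as rs.initiate(replConfig) on one member only.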
Replication lag growing
On PRIMARY run rs.printSecondaryReplicationInfo() — check
syncedTo vs primary optime. Causes: secondary disk slower than
primary, network issues, heavy reads on secondary competing with replication,
or large index builds on secondary. A secondary in RECOVERING
is catching up — monitor until SECONDARY. Oplog too small can
force full resync if a secondary falls too far behind.
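Lag can also be estimated from rs.status() by comparing each member's optimeDate with the primary's. The following is a rough sketch: the helper replicationLagSeconds and the sample data are hypothetical, and only the name, stateStr, and optimeDate fields of the real output are used.

```javascript
// Hypothetical helper: seconds each SECONDARY or RECOVERING member is
// behind the PRIMARY, based on members[].optimeDate from rs.status().
function replicationLagSeconds(status) {
  const members = status.members || [];
  const primary = members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // no PRIMARY, so lag cannot be computed
  const lag = {};
  for (const m of members) {
    if (m.stateStr === "SECONDARY" || m.stateStr === "RECOVERING") {
      // Subtracting two Date objects yields milliseconds.
      lag[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
    }
  }
  return lag;
}

// Made-up sample: db2 is 20 seconds behind db1.
const lagSample = {
  members: [
    { name: "db1.example.com:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-01-01T00:00:30Z") },
    { name: "db2.example.com:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:10Z") },
  ],
};
console.log(replicationLagSeconds(lagSample)["db2.example.com:27017"]); // 20
```

Sampling this value over time shows whether lag is growing or only spiking briefly.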
Election loops / no PRIMARY
No member can achieve majority quorum. Common causes: an even number of voting
members split evenly by a network partition (no arbiter or odd member to break the tie),
or too many members down. A majority of voting members must be reachable. Check
rs.status() for health: 0 members. Fix connectivity,
restore down nodes, or reconfigure (carefully) with
rs.reconfig() — improper reconfig can make things worse.
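The majority rule is simple arithmetic: more than half of the voting members must be reachable. A minimal sketch, with canElectPrimary as a hypothetical helper name:

```javascript
// Hypothetical helper: a PRIMARY can be elected only if a strict majority
// of voting members can reach each other.
function canElectPrimary(votingMembers, reachableVotingMembers) {
  const majority = Math.floor(votingMembers / 2) + 1;
  return { majority, electable: reachableVotingMembers >= majority };
}

console.log(canElectPrimary(3, 2)); // { majority: 2, electable: true }
console.log(canElectPrimary(4, 2)); // { majority: 3, electable: false }
console.log(canElectPrimary(5, 3)); // { majority: 3, electable: true }
```

The four-member case shows why an even split leaves both halves without a PRIMARY.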
Disk full / WiredTiger errors
MongoDB stops accepting writes when disk is full. Check
df -h on storage.dbPath. Find large collections:
db.stats(), db.collection.stats(). Compact or
archive old data; add TTL indexes for expiring documents. Free space before
restarting; after an unclean shutdown, a repair (mongod --repair) may be needed as a last resort.
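A TTL index is an ordinary single-field index created with the expireAfterSeconds option on a field that holds dates. The sketch below only builds the arguments. The helper ttlIndexSpec, the collection name events, and the field createdAt are hypothetical.

```javascript
// Hypothetical helper: builds the key pattern and options for a TTL index
// that expires documents a given number of days after the date in dateField.
function ttlIndexSpec(dateField, days) {
  return {
    keys: { [dateField]: 1 },
    options: { expireAfterSeconds: days * 24 * 60 * 60 },
  };
}

const ttlSpec = ttlIndexSpec("createdAt", 30);
console.log(ttlSpec.options.expireAfterSeconds); // 2592000

// In mongosh, against a hypothetical "events" collection:
//   db.events.createIndex(ttlSpec.keys, ttlSpec.options)
```

Documents whose createdAt is not a date (or an array of dates) are never expired by the index.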
Slow queries
Run db.collection.explain("executionStats").find(...). A
COLLSCAN on a large collection needs an index. Enable the profiler
(db.setProfilingLevel(1, { slowms: 100 })) or check
db.currentOp() for long-running ops. Missing or wrong compound
index field order is a frequent cause.
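Spotting a COLLSCAN in explain output can be done by walking the winning plan. This sketch assumes the classic plan shape, where each node has a stage and optionally inputStage or inputStages; newer server versions may nest the plan differently. The helper collectStages and the sample plans are hypothetical.

```javascript
// Hypothetical helper: collects stage names from an explain() winning plan
// by following inputStage / inputStages recursively.
function collectStages(plan, stages = []) {
  if (!plan) return stages;
  if (plan.stage) stages.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, stages);
  for (const child of plan.inputStages || []) collectStages(child, stages);
  return stages;
}

// Made-up plans shaped like queryPlanner.winningPlan.
const indexedPlan = { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "status_1" } };
const scanPlan = { stage: "SORT", inputStage: { stage: "COLLSCAN" } };

console.log(collectStages(indexedPlan).includes("COLLSCAN")); // false
console.log(collectStages(scanPlan).includes("COLLSCAN"));    // true
```

In mongosh the plan would come from db.collection.explain("executionStats").find(...).queryPlanner.winningPlan.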
Too many connections
The connection limit was reached (the default is usually high, but apps can leak connections). Check
db.serverStatus().connections. Fix connection pooling in
applications. Kill long-running ops with db.killOp(opid) after
identifying them in db.currentOp().
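The connections document reports current and available counts, so how close the server is to its limit can be estimated from the two. A minimal sketch follows; connectionUsage is a hypothetical helper and the numbers are made up.

```javascript
// Hypothetical helper: derives the effective connection limit and the
// fraction in use from db.serverStatus().connections.
function connectionUsage(connections) {
  const limit = connections.current + connections.available;
  return { limit, usedFraction: connections.current / limit };
}

// Sample shaped like the real document, with only the fields used here.
const usage = connectionUsage({ current: 800, available: 200 });
console.log(usage.limit);        // 1000
console.log(usage.usedFraction); // 0.8
```

A fraction that keeps climbing while traffic is flat suggests a connection leak rather than real load.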
Debugging workflow
1. Is mongod up?
systemctl status mongod
ss -tlnp | grep 27017
mongosh --eval "db.adminCommand({ ping: 1 })"
2. Replica set health
mongosh --eval "rs.status()"
mongosh --eval "rs.printSecondaryReplicationInfo()"
3. Active operations and slow queries
db.currentOp({ active: true, secs_running: { $gt: 5 } })
db.system.profile.find().sort({ ts: -1 }).limit(5)