MongoDB Interview Questions, Answers & Scenarios (2025)
Mid
Answer
WiredTiger uses a document-level concurrency model. When a write occurs:
The update first goes to the in-memory cache (dirty pages).
A write timestamp is assigned from the session.
WiredTiger appends a change record to the write-ahead log (the journal).
Dirty pages remain in memory until a checkpoint or eviction.
A checkpoint creates a consistent on-disk snapshot using MVCC rules.
Readers concurrent with the write continue using older snapshot images thanks to MVCC.
Thus, MongoDB ensures consistency without blocking reads.
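A minimal mongosh sketch (collection name is hypothetical) tying client acknowledgement to the journal step of this path:

// j: true delays the ack until the change record is durable in
// WiredTiger's write-ahead log (the journal).
db.orders.insertOne(
  { item: "book", qty: 1 },
  { writeConcern: { w: 1, j: true } }
)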
Mid
Answer
By default, WiredTiger's cache is the larger of 50% of (RAM − 1 GB) or 256 MB. When data access spikes:
The cache fills with working-set documents.
Eviction threads start scanning pages using a modified LRU algorithm.
Pages with a low "score" (less recently used and not dirty) are evicted first.
If eviction cannot keep up, application threads are drafted to help evict pages, which throttles incoming writes.
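This pressure is visible in serverStatus; the stat names below are standard WiredTiger cache metrics:

// Inspect WiredTiger cache usage from mongosh:
const cache = db.serverStatus().wiredTiger.cache
print(cache["maximum bytes configured"])          // configured cache ceiling
print(cache["bytes currently in the cache"])      // current usage
print(cache["tracked dirty bytes in the cache"])  // dirty pages awaiting eviction/checkpoint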
Mid
Answer
MongoDB uses timestamp-based MVCC:
Each operation receives a read timestamp.
Readers see the newest committed version at or before their timestamp.
Writers create new versions with higher timestamps.
WiredTiger tracks versions as in-memory update chains attached to the B+Tree pages.
Thus, readers never block writers and vice versa.
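A small mongosh sketch (database and collection names are hypothetical): snapshot read concern pins the transaction's read timestamp, so the read sees one consistent version even while writers commit newer ones:

const session = db.getMongo().startSession()
session.startTransaction({ readConcern: { level: "snapshot" } })
session.getDatabase("shop").orders.find({ status: "open" }).toArray()
session.commitTransaction()
session.endSession()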
Mid
Answer
Write conflicts happen when two writes try to update the same document concurrently:
MongoDB retries the losing operation internally.
This is not a global lock; it's a document-level serialization issue.
Inside a multi-document transaction, the conflict is instead surfaced to the client for retry, as sketched below.
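A minimal retry sketch in mongosh (database and collection names are hypothetical); the server labels such errors TransientTransactionError:

const session = db.getMongo().startSession()
try {
  session.startTransaction()
  session.getDatabase("shop").accounts.updateOne({ _id: 1 }, { $inc: { balance: -10 } })
  session.commitTransaction()
} catch (e) {
  // On a write conflict, abort and rerun the whole transaction from the top.
  session.abortTransaction()
  throw e
} finally {
  session.endSession()
}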
Mid
Answer
$lookup performs a nested-loop join inside the aggregation pipeline:
For each left-side document, MongoDB performs an indexed or non-indexed lookup on the foreign collection.
If no index exists on the joined field:
Each probe becomes a full scan of the foreign collection.
This causes excessive materialization and memory usage.
Use the pipeline form of $lookup plus an index on the foreign field to optimize, as in the sketch below.
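A sketch with hypothetical orders/customers collections; the index turns each per-document probe into an index seek:

db.customers.createIndex({ customerId: 1 })
db.orders.aggregate([
  { $lookup: {
      from: "customers",
      localField: "custId",
      foreignField: "customerId",
      as: "customer"
  } }
])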
Mid
Answer
MongoDB uses a hierarchical lock manager:
Intent locks are taken at the global, database, and collection levels.
Document-level concurrency is delegated to WiredTiger.
For each operation:
CRUD takes shared/exclusive intent locks.
Operations on disjoint collections don't block each other.
Internal operations like DDL take metadata locks.
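The lock modes held by in-flight operations can be observed directly (r/w are intent locks; R/W are the stronger modes):

db.currentOp().inprog.forEach(op => printjson(op.locks))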
Mid
Answer
During failover:
When the old PRIMARY rejoins, its oplog is compared with the new PRIMARY's oplog.
Divergent oplog entries on the former primary are rolled back.
MongoDB archives the rolled-back writes to rollback files on disk.
The application must re-apply any needed business logic manually.
Rollbacks happen when:
The old primary had writes that didn't replicate to a majority before its crash.
Mid
Answer
A secondary node:
Fetches oplog entries from its sync source (the primary or another secondary).
Queues operations into a buffer.
Applies them in batches using parallel writer threads.
Maintains commit order using timestamp-based ordering.
If apply lags behind, the secondary becomes stale (replication lag).
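Replication lag is easy to check from mongosh:

// Prints, per secondary, how far its applied oplog time trails the primary:
rs.printSecondaryReplicationInfo()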
Mid
Answer
Index inconsistencies occur when index entries refer to missing documents (or vice versa), caused by:
A crash during a write
An inconsistent index build
Logical corruption
Detected via:
db.collection.validate({full: true})
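If validate reports inconsistencies, the usual remediation is to rebuild the affected index (index name and key below are hypothetical):

db.collection.dropIndex("field_1")
db.collection.createIndex({ field: 1 })  // rebuild entries from the documents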
Mid
Answer
WiredTiger stores variable-length documents and never updates them in place:
An update writes a new version of the document; the old version is marked with a "tombstone" until it is reclaimed.
If the updated document no longer fits in its page, it is written to a new location.
Frequent updates to growing documents therefore fragment pages.
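Fragmented space can be reclaimed with the compact command (collection name is hypothetical):

db.runCommand({ compact: "orders" })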
Mid
Answer
Because $facet:
Runs multiple sub-pipelines over the same input
Each sub-pipeline receives the full input stream
So memory consumption can balloon
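A small sketch (collection and fields hypothetical); both sub-pipelines consume the full input, and the combined output must fit in a single 16 MB document:

db.orders.aggregate([
  { $facet: {
      byStatus:   [ { $group: { _id: "$status", n: { $sum: 1 } } } ],
      priceStats: [ { $group: { _id: null, avg: { $avg: "$price" } } } ]
  } }
])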
Mid
Answer
MongoDB stores chunk metadata inside config.chunks.
Chunks contain:
Namespace
Range (min / max shard key)
Owning shard
History (migration timestamps)
Mongos uses this metadata to route queries.
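A quick way to inspect this metadata from a mongos; note that in 5.0+ config.chunks is keyed by the collection's UUID rather than its namespace (the namespace below is hypothetical):

const coll = db.getSiblingDB("config").collections.findOne({ _id: "shop.orders" })
db.getSiblingDB("config").chunks.find({ uuid: coll.uuid }).limit(5)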
Mid
Answer
When chunk size exceeds threshold:
Shard scans the range.
Determines split points.
Updates config metadata inside a multi-document transaction.
Split produces two new chunks.
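A split can also be requested manually; namespace and split value below are hypothetical:

sh.splitAt("shop.orders", { customerId: 5000 })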
Mid
Answer
Migration has three phases:
Clone
Catch-up
Commit
If migration fails before commit:
Target shard discards cloned data.
No metadata changes are made.
If it fails after commit:
Orphaned documents may remain on the source shard; the range deleter cleans them up asynchronously.
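Balancer and migration state can be checked, or a chunk moved manually, from a mongos (namespace, key value, and shard name are hypothetical):

sh.getBalancerState()    // is the balancer enabled?
sh.isBalancerRunning()   // is a migration round in progress?
sh.moveChunk("shop.orders", { customerId: 1 }, "shard0001")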
Mid
Answer
For a background (in 4.2+, hybrid) index build:
The collection remains writable.
MongoDB creates the empty index structure.
It then scans the entire collection.
It inserts index entries for documents as they exist at scan time.
In-flight writes are captured in a temporary side table and applied before the build commits.
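Since MongoDB 4.2 all index builds use this hybrid approach, so no special flag is needed (collection and key are hypothetical; the old { background: true } option is ignored):

db.orders.createIndex({ customerId: 1 })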
Mid
Answer
Aggregation stages like $sort, $group, and $lookup may exceed the 100 MB per-stage memory limit.
MongoDB can then spill intermediate data to temporary files under the dbPath's _tmp directory.
This slow path affects performance.
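Opting into the spill path instead of failing at the limit (collection and sort field are hypothetical):

db.orders.aggregate(
  [ { $sort: { price: -1 } } ],
  { allowDiskUse: true }
)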
Mid
Answer
MongoDB caches query plans.
If:
Data distribution changes
Index cardinality changes
Or plan becomes suboptimal
The plan cache may retain the stale plan, causing slower queries.
Use:
db.collection.getPlanCache().clear()
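On 4.4+, the cached entries can be listed first to confirm what is being cleared:

db.collection.getPlanCache().list()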
Mid
Answer
MongoDB can combine two single-field indexes:
Compute intersection of document sets
Apply merge-sort or hash intersection
But it's only beneficial when both predicates are selective.
Otherwise, a compound index is usually better.
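A sketch of when intersection can apply (collection and fields hypothetical); explain() reports an AND_SORTED or AND_HASH stage when the planner chooses it:

db.orders.createIndex({ status: 1 })
db.orders.createIndex({ customerId: 1 })
db.orders.find({ status: "A", customerId: 42 }).explain("queryPlanner")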
Mid
Answer
During long operations:
MongoDB temporarily releases locks
Allows other operations to proceed
Happens in:
Collection scans
Background index builds
Aggregations
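Yields are visible in operation metadata; numYields counts how often an operation has released and reacquired its locks:

db.currentOp({ active: true }).inprog.forEach(op => print(op.opid, op.numYields))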
Mid
Answer
Each retryable write carries a logical session ID plus a transaction number (txnNumber).
If the client retries:
MongoDB checks whether that (sessionId, txnNumber) pair was already executed, using the transaction table.
If yes, it returns the previously recorded result without re-executing.
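Retryable writes are enabled by default in modern drivers; the URI option makes it explicit (hosts below are placeholders):

// mongodb://host1,host2/?replicaSet=rs0&retryWrites=true
db.orders.updateOne({ _id: 1 }, { $set: { status: "paid" } })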
Mid
Answer
MongoDB stores transaction state in:
config.transactions
Includes lastWriteTimestamp and oplog pointers
For multi-document transactions, writes are stored in memory until commit.
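The per-session records can be inspected from mongosh (given sufficient privileges):

db.getSiblingDB("config").transactions.findOne()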
Mid
Answer
Older versions couldn't ensure:
Cross-shard ordering
Global commit consistency
Transaction routing correctness
MongoDB 4.2+ supports distributed transactions using 2-phase commit.
Mid
Answer
Write is acknowledged only when:
Primary writes to local oplog.
Secondary nodes (majority) confirm replication.
If a majority is unavailable, the write cannot be acknowledged and errors or times out.
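In mongosh (collection name hypothetical), with wtimeout bounding how long the client waits for majority acknowledgement:

db.orders.insertOne(
  { item: "book" },
  { writeConcern: { w: "majority", wtimeout: 5000 } }
)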
Mid
Answer
Replica set nodes send heartbeats every 2 seconds.
If a node gets no response within 10 seconds:
The unreachable node is marked down.
An election is triggered if that node was the primary.
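Both intervals are visible and tunable in the replica set configuration; a sketch adjusting the election timeout:

const cfg = rs.conf()
cfg.settings.electionTimeoutMillis = 10000  // default is 10 s
rs.reconfig(cfg)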
Mid
Answer
Documents are stored as WiredTiger B+Tree leaf entries:
Key = Record ID (RID)
Value = WiredTiger-encoded BSON
BSON uses a type-length-value encoding.
Strings are length-prefixed and null-terminated.
MongoDB does not store JSON; it stores binary-encoded BSON.
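Because each stored value carries a BSON type tag, queries can filter on the on-disk type (collection and field are hypothetical):

db.orders.find({ price: { $type: "double" } })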