Junior-Level MongoDB Interview Questions

Q1:

What is a capped collection and when should it be used?

Junior

Answer

A capped collection is a fixed-size collection where MongoDB overwrites old documents when full. It maintains insertion order and supports high-speed writes, useful for logs and metrics.

Quick Summary: Capped collections have a fixed maximum size (in bytes) and optionally a maximum document count. When full, the oldest documents are automatically overwritten by new ones (circular buffer), so no manual cleanup is needed. Use for: logs, event streams, caches where only recent data matters. Insertion order is maintained. Downsides: capped collections cannot be sharded, updates are restricted, and releases before MongoDB 5.0 do not allow deleting individual documents.
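
The circular-buffer behavior can be sketched in plain JavaScript. This is a toy model, not MongoDB's implementation: the class name CappedBuffer is hypothetical, and it caps by document count only, whereas a real capped collection is bounded by size in bytes. The collection name and limits in the mongosh comment are illustrative.

```javascript
// Real collection (mongosh), with illustrative name and limits:
//   db.createCollection("logs", { capped: true, size: 1048576, max: 1000 })

// Toy model: a fixed-capacity buffer that keeps insertion order
// and drops the oldest entry once it is full.
class CappedBuffer {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift(); // evict the oldest document
    }
  }

  find() {
    return [...this.docs]; // documents in insertion order
  }
}

const log = new CappedBuffer(3);
for (const n of [1, 2, 3, 4, 5]) {
  log.insert({ seq: n });
}
const remainingSeqs = log.find().map((d) => d.seq);
console.log(remainingSeqs); // [ 3, 4, 5 ]
```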

Q2:

What is the difference between $push and $addToSet?

Junior

Answer

$push adds an element to an array even if it already exists. $addToSet adds it only if it is not present, preventing duplicates.

Quick Summary: $push appends a value to an array even if it already exists - can create duplicates. $addToSet adds a value only if it doesn't already exist in the array - like a set in math. Use $addToSet when maintaining unique values (tags, categories, user IDs). Use $push when order matters or duplicates are allowed (event log entries).
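
A small sketch of the difference, using hypothetical helper functions that mimic the two operators on an in-memory array. It assumes primitive values only; MongoDB itself also compares embedded documents by value. The mongosh calls in the comments use an illustrative posts collection.

```javascript
// mongosh equivalents (collection and field names are illustrative):
//   db.posts.updateOne({ _id: 1 }, { $push: { tags: "db" } })
//   db.posts.updateOne({ _id: 1 }, { $addToSet: { tags: "db" } })

// Model of $push: always appends.
function push(array, value) {
  return [...array, value];
}

// Model of $addToSet: appends only if the value is not already present.
function addToSet(array, value) {
  return array.includes(value) ? array : [...array, value];
}

const tags = ["db", "nosql"];
const pushedTags = push(tags, "db");        // ["db", "nosql", "db"]
const setTags = addToSet(tags, "db");       // ["db", "nosql"]
const setTagsNew = addToSet(tags, "index"); // ["db", "nosql", "index"]
console.log(pushedTags, setTags, setTagsNew);
```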

Q3:

What is an embedded document and when is embedding recommended?

Junior

Answer

An embedded document stores related data inside a parent document. Embedding improves read performance and is recommended for one-to-few relationships.

Quick Summary: Embedded documents store related data together in one document (address inside a user doc). Recommended when data is accessed together, relationship is one-to-one or one-to-few, and child data doesn't grow unboundedly. Referencing stores the related document's _id and uses $lookup for joins. Use referencing for many-to-many, frequently changing data, or data shared across documents.

Q4:

What is data referencing in MongoDB?

Junior

Answer

Referencing links documents across collections using IDs. It is used when datasets are large, loosely connected, or when avoiding duplication.

Quick Summary: Data referencing stores the _id of a related document instead of embedding the data. Like a foreign key in SQL. Used when: data is large, shared across many documents, or independently accessed. Requires a separate query or $lookup to fetch the referenced data. Trade-off: two queries or slower $lookup vs embedded doc simplicity.
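
To make the trade-off concrete, here is a sketch with in-memory arrays standing in for two collections. The collection and field names (users, posts, authorId) are assumptions for illustration. The address is embedded in the user document, while the post only stores a reference that needs a second lookup.

```javascript
// Embedded: the address lives inside the user document.
const users = [
  { _id: "u1", name: "Ann", address: { city: "Oslo" } },
];

// Referenced: the post stores only the author's _id, like a foreign key.
const posts = [
  { _id: "p1", authorId: "u1", title: "Hello" },
];

// Step 1: fetch the post. Step 2: follow the reference with a second query.
const post = posts.find((p) => p._id === "p1");
const author = users.find((u) => u._id === post.authorId);

console.log(post.title, "by", author.name, "from", author.address.city);
```

The embedded address comes back with the user in one read; the author needs the extra step (or a $lookup in an aggregation).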

Q5:

What is the purpose of the aggregation pipeline?

Junior

Answer

The aggregation pipeline processes documents through stages such as $match, $group, $project, and $lookup for analytics and transformations.

Quick Summary: The aggregation pipeline processes documents through a series of stages to transform and analyze data. Common stages: $match (filter), $group (aggregate by field), $sort, $project (reshape), $lookup (join), $unwind (flatten arrays), $limit, $skip. Each stage passes its output to the next. More powerful than find() for analytics and data transformation.
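
The stage-by-stage flow can be mimicked with ordinary array operations. This is only a model of what $match, $group, and $sort do, with made-up order data; the real pipeline shown in the comment runs inside the server.

```javascript
// Equivalent pipeline in mongosh (collection and fields are illustrative):
//   db.orders.aggregate([
//     { $match: { status: "paid" } },
//     { $group: { _id: "$customer", revenue: { $sum: "$total" } } },
//     { $sort: { revenue: -1 } },
//   ])

const orders = [
  { customer: "ann", status: "paid", total: 30 },
  { customer: "bob", status: "paid", total: 20 },
  { customer: "ann", status: "paid", total: 15 },
  { customer: "bob", status: "cancelled", total: 99 },
];

// Stage 1 ($match): keep only paid orders.
const paid = orders.filter((o) => o.status === "paid");

// Stage 2 ($group): sum totals per customer.
const totals = new Map();
for (const o of paid) {
  totals.set(o.customer, (totals.get(o.customer) ?? 0) + o.total);
}
const grouped = [...totals].map(([customer, revenue]) => ({ _id: customer, revenue }));

// Stage 3 ($sort): highest revenue first.
const revenueByCustomer = grouped.sort((a, b) => b.revenue - a.revenue);
console.log(revenueByCustomer);
```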

Q6:

What is $lookup used for?

Junior

Answer

$lookup performs a left outer join between collections, enriching documents with related data.

Quick Summary: $lookup performs a left outer join between collections in an aggregation pipeline. It matches documents from the "from" collection based on a localField/foreignField pair and adds the matched docs as an array in the output (an empty array when nothing matches). Similar to a SQL LEFT OUTER JOIN. Performance tip: $lookup is expensive - index the foreignField, and consider embedding if you always access the data together.
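
The join semantics can be modeled in a few lines. The lookup function below is a hypothetical helper, not a MongoDB API, and it only handles simple equality on scalar fields; the real stage also has rules for arrays and missing fields.

```javascript
// Real stage in mongosh (names are illustrative):
//   db.orders.aggregate([
//     { $lookup: { from: "customers", localField: "customerId",
//                  foreignField: "_id", as: "customer" } },
//   ])

// Model of a left outer join: every local document is kept, and the
// matches (possibly none) are attached as an array under `as`.
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const ordersToJoin = [
  { _id: 1, customerId: "c1" },
  { _id: 2, customerId: "c9" }, // no matching customer
];
const customers = [{ _id: "c1", name: "Ann" }];

const joined = lookup(ordersToJoin, customers, "customerId", "_id", "customer");
console.log(JSON.stringify(joined));
```

The second order survives the join with an empty customer array, which is what makes it a left outer join rather than an inner join.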

Q7:

What is the difference between insertOne and insertMany?

Junior

Answer

insertOne inserts a single document. insertMany inserts multiple documents in one operation and improves performance.

Quick Summary: insertOne() inserts a single document and returns an acknowledgment containing the new document's _id (insertedId). insertMany() inserts an array of documents in one operation - faster than calling insertOne() in a loop because the documents are sent in batches instead of one network round-trip each. By default insertMany() is ordered and stops at the first error. Set {ordered: false} to continue inserting the remaining documents even if some fail.
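
The effect of the ordered option can be sketched with a small model. The function insertManyModel is hypothetical and treats a duplicate _id as the only possible error; it is meant to show which documents end up inserted, not how the server works.

```javascript
// Real call in mongosh (collection name is illustrative):
//   db.items.insertMany(docs, { ordered: false })

// Model: insert documents one by one, failing on duplicate _id.
function insertManyModel(docs, { ordered = true } = {}) {
  const seenIds = new Set();
  const inserted = [];
  const errors = [];
  for (const doc of docs) {
    if (seenIds.has(doc._id)) {
      errors.push(doc._id);
      if (ordered) break; // ordered mode stops at the first error
      continue;           // unordered mode skips the failure and goes on
    }
    seenIds.add(doc._id);
    inserted.push(doc._id);
  }
  return { inserted, errors };
}

const batch = [{ _id: 1 }, { _id: 2 }, { _id: 2 }, { _id: 3 }];
const orderedResult = insertManyModel(batch);
const unorderedResult = insertManyModel(batch, { ordered: false });
console.log(orderedResult.inserted);   // [ 1, 2 ]
console.log(unorderedResult.inserted); // [ 1, 2, 3 ]
```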

Q8:

What is the purpose of TTL indexes?

Junior

Answer

TTL indexes automatically delete documents after a specified time, useful for sessions, logs, and temporary data.

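Quick Summary: TTL (Time To Live) indexes automatically delete documents a specified number of seconds after the date stored in the indexed field. Created with expireAfterSeconds: db.sessions.createIndex({createdAt: 1}, {expireAfterSeconds: 3600}) deletes documents about 1 hour after their createdAt value. The field must hold a date (or an array of dates); documents without one never expire. MongoDB runs a background cleanup task every 60 seconds, so deletion can lag slightly behind the expiry time. Use for: sessions, cache entries, temporary data, audit logs with retention policies.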
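
The expiry rule itself is simple arithmetic, sketched below. The isExpired helper is hypothetical and only models a single date field; it ignores arrays of dates and the timing of the background task.

```javascript
// Model of the TTL check: a document is eligible for deletion once
// `expireAfterSeconds` have passed since the date in the indexed field.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) {
    return false; // no date in the field: the document never expires
  }
  return now.getTime() >= value.getTime() + expireAfterSeconds * 1000;
}

const now = new Date("2024-01-01T02:00:00Z");
const oldSession = { createdAt: new Date("2024-01-01T00:30:00Z") };
const freshSession = { createdAt: new Date("2024-01-01T01:30:00Z") };
const noDate = { createdAt: "yesterday" };

const oldExpired = isExpired(oldSession, "createdAt", 3600, now);     // true
const freshExpired = isExpired(freshSession, "createdAt", 3600, now); // false
const noDateExpired = isExpired(noDate, "createdAt", 3600, now);      // false
console.log(oldExpired, freshExpired, noDateExpired);
```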

Q9:

What is the explain function and why is it useful?

Junior

Answer

explain() shows how a query is executed, including index usage and performance details. It helps diagnose slow queries.

Quick Summary: explain() shows how MongoDB executes a query - which index was used (IXSCAN vs COLLSCAN), how many documents were examined, execution time, and query plan. Use explain("executionStats") for detailed stats. Essential for performance debugging - if you see COLLSCAN on a frequently run query, you need an index. Always run explain() on new queries in development.

Q10:

What is a write concern?

Junior

Answer

Write concern defines how strictly MongoDB should confirm a write, ranging from w:1 (primary only) to w:majority for higher durability.

Quick Summary: Write concern controls how many replica set members must acknowledge a write before MongoDB considers it successful. w:1: only the primary acknowledges (the default before MongoDB 5.0). w:majority: a majority of members must acknowledge - safer, slower, and the default since MongoDB 5.0 for most deployments. w:0: fire and forget, no acknowledgment. Higher write concern = stronger durability guarantee but higher latency. Choose based on your data loss tolerance.
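
As a rough illustration of what "majority" means numerically, the hypothetical helper below computes how many acknowledgments a write waits for. It assumes every voting member is data-bearing (no arbiters), which simplifies the server's actual calculation.

```javascript
// Real call in mongosh (collection and values are illustrative):
//   db.orders.insertOne({ item: "pen" },
//     { writeConcern: { w: "majority", wtimeout: 5000 } })

// Model: number of acknowledgments required for a given w value.
function requiredAcks(w, votingMembers) {
  if (w === "majority") {
    return Math.floor(votingMembers / 2) + 1;
  }
  return w; // numeric w: that many members (0 means none awaited)
}

const acksFor3 = requiredAcks("majority", 3); // 2
const acksFor4 = requiredAcks("majority", 4); // 3
const acksFor5 = requiredAcks("majority", 5); // 3
const acksForW1 = requiredAcks(1, 5);         // 1
console.log(acksFor3, acksFor4, acksFor5, acksForW1);
```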

Q11:

What is a read preference?

Junior

Answer

Read preference decides which nodes serve read requests, such as primary, secondary, or nearest, enabling load balancing.

Quick Summary: Read preference controls which replica set member handles read operations. primary: all reads from the primary (consistent, default). primaryPreferred: primary if available, else a secondary. secondary: always read from secondaries (may be slightly stale). secondaryPreferred: secondaries when available, else the primary. nearest: the member with the lowest network latency, whether primary or secondary. Use secondaries to distribute read load, but accept eventual consistency.

Q12:

What is journaling in MongoDB?

Junior

Answer

Journaling writes operations to a journal file before applying them to data files, preventing data loss in crashes.

Quick Summary: Journaling writes every write operation to an on-disk journal (write-ahead log) before applying it to data files. If MongoDB crashes mid-write, it replays the journal on restart to recover to a consistent state. Enabled by default on 64-bit builds since MongoDB 2.0. With the WiredTiger storage engine, data files are only persisted at checkpoints, so without the journal any writes made after the last checkpoint would be lost in a crash.

Q13:

What is $regex used for?

Junior

Answer

$regex performs pattern matching on string fields and is useful for partial text searches.

Quick Summary: $regex filters documents where a string field matches a regular expression. db.users.find({name: {$regex: "^alice", $options: "i"}}) finds users whose name starts with "alice" (case-insensitive). Performance warning: only case-sensitive patterns anchored to the start (^) can use an index efficiently. Unanchored patterns and case-insensitive ones, like the "i" example here, have to scan every index entry or the whole collection. Anchor patterns to the start and keep them case-sensitive when possible.
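
The matching behavior of anchored and unanchored patterns can be tried out with JavaScript's own RegExp, which behaves the same as MongoDB's regex engine for simple patterns like these. The names are sample data, and the snippet shows matching only, not index usage.

```javascript
const names = ["Alice", "alice cooper", "Malice", "Bob"];

// Like {$regex: "^alice", $options: "i"}: prefix match, any case.
const prefixInsensitive = names.filter((n) => /^alice/i.test(n));

// Like {$regex: "alice", $options: "i"}: matches anywhere in the string.
const unanchored = names.filter((n) => /alice/i.test(n));

// Like {$regex: "^alice"}: case-sensitive prefix, the index-friendly form.
const prefixSensitive = names.filter((n) => /^alice/.test(n));

console.log(prefixInsensitive); // [ 'Alice', 'alice cooper' ]
console.log(unanchored);        // [ 'Alice', 'alice cooper', 'Malice' ]
console.log(prefixSensitive);   // [ 'alice cooper' ]
```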

Q14:

What is the difference between save and update?

Junior

Answer

save replaces an entire document if it exists or inserts it if not. update modifies only specified fields using update operators.

Quick Summary: save() is a deprecated legacy shell method; mongosh does not provide it. Its behavior: if the document had an _id that matched an existing document, it replaced the whole document; otherwise it inserted. update() (now updateOne/updateMany) modifies specific fields. Always use insertOne/updateOne/replaceOne explicitly - they're clearer about intent and safer than the old save(), which could silently replace entire documents.
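
The risk of a whole-document replacement is easy to see in a small model. Both helpers are hypothetical stand-ins: replaceDoc mimics what save() or replaceOne() does to a matching document, and setFields mimics updateOne() with $set on top-level fields only.

```javascript
// Model of a replacement: everything except _id comes from the new document.
function replaceDoc(existing, replacement) {
  return { _id: existing._id, ...replacement };
}

// Model of $set: only the named fields change, the rest is kept.
function setFields(existing, fields) {
  return { ...existing, ...fields };
}

const existingUser = { _id: 1, name: "Ann", email: "ann@example.com" };

const replaced = replaceDoc(existingUser, { name: "Ann B." });
const updated = setFields(existingUser, { name: "Ann B." });

console.log(replaced); // { _id: 1, name: 'Ann B.' }  (email is gone)
console.log(updated);  // { _id: 1, name: 'Ann B.', email: 'ann@example.com' }
```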

Q15:

What is sharding key selection and why is it important?

Junior

Answer

A good shard key has high cardinality and distributes data and writes evenly across shards, avoiding write hotspots. The choice directly affects how well the cluster scales.

Quick Summary: The shard key determines how data is distributed across shards. A good shard key has high cardinality (many distinct values), even write distribution (avoid hotspots), and is included in most queries. Bad choices: monotonically increasing keys (like timestamps or ObjectId) under ranged sharding send all new writes to the same shard. Hashed sharding on such a key spreads writes evenly across shards, at the cost of turning range queries on that key into scatter-gather operations.
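
The hotspot effect can be demonstrated with a toy router. Everything here is an assumption made for illustration: four shards, fixed range boundaries, and an FNV-1a hash standing in for MongoDB's own hash function.

```javascript
// Real command in mongosh (namespace is illustrative):
//   sh.shardCollection("app.events", { _id: "hashed" })

const SHARD_COUNT = 4;

// Ranged sharding model: keys 0-999 -> shard 0, 1000-1999 -> shard 1, ...
// and everything from 3000 upward lands on the last shard.
function rangeShard(key) {
  return Math.min(Math.floor(key / 1000), SHARD_COUNT - 1);
}

// Stand-in hash (32-bit FNV-1a); MongoDB uses a different hash internally.
function fnv1a(text) {
  let hash = 2166136261;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 16777619) >>> 0;
  }
  return hash;
}

// Hashed sharding model: the shard is chosen from the hash of the key.
function hashedShard(key) {
  return fnv1a(String(key)) % SHARD_COUNT;
}

// Eight new, monotonically increasing keys.
const newKeys = Array.from({ length: 8 }, (_, i) => 5000 + i);
const rangeTargets = new Set(newKeys.map(rangeShard));
const hashedTargets = new Set(newKeys.map(hashedShard));

console.log("ranged shards hit:", rangeTargets.size);
console.log("hashed shards hit:", hashedTargets.size);
```

With ranged routing every new key goes to the same shard, while hashing sends neighboring keys to different shards.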

