Apple MS SQL Interview Questions

Curated MS SQL interview questions for developers targeting Apple interview positions. 160 questions available.

MS SQL Interview Questions & Answers

Welcome to our comprehensive collection of MS SQL interview questions and answers. This page contains expertly curated interview questions covering all aspects of MS SQL, from fundamental concepts to advanced topics. Whether you're preparing for an entry-level position or a senior role, you'll find questions tailored to your experience level.

Our MS SQL interview questions are designed to help you:

  • Understand core concepts and best practices in MS SQL
  • Prepare for technical interviews at all experience levels
  • Master both theoretical knowledge and practical application
  • Build confidence for your next MS SQL interview

Each question includes detailed answers and explanations to help you understand not just what the answer is, but why it's correct. We cover topics ranging from basic MS SQL concepts to advanced scenarios that you might encounter in senior-level interviews.

Use the filters below to find questions by difficulty level (Entry, Junior, Mid, Senior, Expert) or focus specifically on code challenges. Each question is carefully crafted to reflect real-world interview scenarios you'll encounter at top tech companies, startups, and MNCs.

Questions

160 questions
Q1:

What is MS SQL Server and where is it commonly used?

Entry

Answer

MS SQL Server is Microsoft’s enterprise relational database system used in finance, e-commerce, SaaS, ERPs, CRMs, and large web applications. It provides secure, consistent, and high-performance data storage.

Quick Summary: MS SQL Server is Microsoft's relational database system built for enterprise use. It stores, retrieves, and manages structured data reliably. You'll find it in finance, healthcare, SaaS apps, ERPs, and large websites where data consistency and performance really matter.
Q2:

Explain Database, Table, Row, and Column in relational terms.

Entry

Answer

A database stores all data objects. A table contains structured data. A row represents one record. A column represents a specific data field with a defined type.

Quick Summary: A database holds all your data objects. A table organizes data into rows and columns, like a spreadsheet. A row is one record (one person, one order). A column is a specific attribute of that record (name, price, date). Together, they form the foundation of relational data storage.
Q3:

What is a Primary Key and why is it important?

Entry

Answer

A primary key uniquely identifies each row. It ensures uniqueness, improves indexing, and maintains reliable data integrity.

Quick Summary: A primary key uniquely identifies each row in a table — no two rows can have the same value, and it can't be NULL. It's the anchor for relationships between tables and speeds up lookups. Without it, finding or linking a specific record becomes messy and unreliable.
Q4:

What is a Foreign Key and what problem does it solve?

Entry

Answer

A foreign key creates a relationship between tables and prevents orphaned or invalid data by enforcing referential integrity.

Quick Summary: A foreign key in one table points to the primary key in another table. It enforces referential integrity — you can't add an order for a customer that doesn't exist. It's how relational databases maintain relationships between entities without duplicating data.
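The relationship described above can be sketched in T-SQL. This is a minimal example; the table and column names (dbo.Customers, dbo.Orders, CustomerId) are hypothetical:

```sql
-- The foreign key stops orders from referencing a non-existent customer.
CREATE TABLE dbo.Customers (
    CustomerId INT IDENTITY(1,1) PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL
);

CREATE TABLE dbo.Orders (
    OrderId    INT IDENTITY(1,1) PRIMARY KEY,
    CustomerId INT NOT NULL
        REFERENCES dbo.Customers (CustomerId),   -- foreign key
    OrderDate  DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);

-- Fails with a referential integrity error if no customer 999 exists.
INSERT INTO dbo.Orders (CustomerId) VALUES (999);
```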
Q5:

What are Constraints in MS SQL?

Entry

Answer

Constraints like primary key, foreign key, unique, check, not null, and default rules ensure accurate, valid, and reliable data.

Quick Summary: Constraints are rules you enforce at the database level. Common ones: NOT NULL (column can't be empty), UNIQUE (no duplicates), CHECK (value must meet a condition), DEFAULT (fallback value), PRIMARY KEY, and FOREIGN KEY. They catch bad data before it ever lands in the table.
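Here is a minimal sketch showing each constraint type on one hypothetical table (dbo.Products and its columns are made up for illustration):

```sql
CREATE TABLE dbo.Products (
    ProductId INT IDENTITY(1,1) PRIMARY KEY,              -- PRIMARY KEY
    Sku       VARCHAR(20)  NOT NULL UNIQUE,               -- NOT NULL + UNIQUE
    Price     DECIMAL(10,2) NOT NULL
        CONSTRAINT CK_Products_Price CHECK (Price >= 0),  -- CHECK
    CreatedAt DATETIME2 NOT NULL
        CONSTRAINT DF_Products_CreatedAt
        DEFAULT SYSUTCDATETIME()                          -- DEFAULT
);
```

An insert with a negative Price or a duplicate Sku is rejected before it ever reaches the table.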
Q6:

What is Normalization and why do we use it?

Entry

Answer

Normalization organizes data to eliminate redundancy, improve consistency, and reduce anomalies during insert, update, or delete operations.

Quick Summary: Normalization organizes data to reduce redundancy and keep things consistent. Instead of storing a customer's name in every order row, you store it once and reference it by ID. This prevents update anomalies — you change data in one place, not a hundred.
Q7:

Explain 1NF, 2NF, and 3NF briefly.

Entry

Answer

1NF: No repeating groups; atomic values.
2NF: Remove partial dependencies.
3NF: Remove transitive dependencies.

Quick Summary: 1NF: each cell holds one atomic value, no repeating groups. 2NF: no partial dependency — every non-key column depends on the whole primary key. 3NF: no transitive dependency — non-key columns depend only on the primary key, not on each other. Each level builds on the previous.
Q8:

What is Denormalization and when is it used?

Entry

Answer

Denormalization introduces controlled redundancy to improve read performance, especially in reporting and analytics systems.

Quick Summary: Denormalization intentionally adds redundancy to speed up reads. Instead of joining three tables every time, you store the data together. Used in reporting databases, dashboards, and read-heavy systems where join performance is a bottleneck and write consistency is less critical.
Q9:

What is an Index and why is it important?

Entry

Answer

An index speeds up data searches by avoiding full table scans. Too many indexes slow down inserts/updates due to maintenance.

Quick Summary: An index is a separate data structure that helps SQL Server find rows fast without scanning the whole table. Think of it like a book's index — instead of reading every page, you jump straight to what you need. Without indexes, every query does a full table scan.
Q10:

What is the difference between Clustered and Non-Clustered Index?

Entry

Answer

Clustered index: Defines physical row order.
Non-clustered index: A separate structure pointing to actual rows.

Quick Summary: Clustered index defines the physical order of rows in the table — only one per table. Non-clustered index is a separate structure with pointers back to the actual rows — you can have many. Clustered is faster for range queries; non-clustered is better for selective lookups on specific columns.
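As a sketch, both index types can be created explicitly on a hypothetical table that starts as a heap (in practice the primary key usually creates the clustered index for you):

```sql
-- Clustered: the table's rows are physically ordered by OrderId (one per table).
CREATE CLUSTERED INDEX CIX_OrderArchive_OrderId
    ON dbo.OrderArchive (OrderId);

-- Non-clustered: a separate B-tree on CustomerId with pointers back to the rows
-- (you can create many of these).
CREATE NONCLUSTERED INDEX IX_OrderArchive_CustomerId
    ON dbo.OrderArchive (CustomerId);
```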
Q11:

What is a View and why is it used?

Entry

Answer

A view is a virtual table created from a query. Used for simplicity, security, reusability, and hiding complex queries.

Quick Summary: A view is a saved SELECT query that looks like a table. It simplifies complex joins, hides sensitive columns, and gives different users a tailored perspective of the data. The view itself stores no data — it's just a reusable query definition that runs when you access it.
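A minimal sketch of the security use case, assuming a hypothetical dbo.Employees table with a sensitive Salary column:

```sql
-- The view exposes directory columns and hides Salary.
CREATE VIEW dbo.vEmployeeDirectory
AS
SELECT EmployeeId, FirstName, LastName, Department
FROM   dbo.Employees;    -- Salary intentionally excluded
GO

-- Querying the view runs the underlying SELECT each time; no data is stored.
SELECT * FROM dbo.vEmployeeDirectory;
```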
Q12:

What is a Stored Procedure and what is it used for?

Entry

Answer

A stored procedure is a precompiled SQL code block used for business logic, improving performance and reducing network overhead.

Quick Summary: A stored procedure is pre-compiled SQL code saved in the database. You call it by name with parameters. It runs faster than ad-hoc SQL because the execution plan is cached, reduces network traffic, centralizes business logic, and is easier to secure with permissions.
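A minimal parameterized procedure over a hypothetical dbo.Orders table illustrates the call-by-name-with-parameters pattern:

```sql
CREATE PROCEDURE dbo.usp_GetOrdersByCustomer
    @CustomerId INT
AS
BEGIN
    SET NOCOUNT ON;   -- suppress row-count messages to cut network chatter
    SELECT OrderId, OrderDate
    FROM   dbo.Orders
    WHERE  CustomerId = @CustomerId;
END
GO

EXEC dbo.usp_GetOrdersByCustomer @CustomerId = 42;
```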
Q13:

What are User-Defined Functions (UDFs)?

Entry

Answer

UDFs return calculated values based on parameters. Used for reusable logic, validation, and transformations.

Quick Summary: UDFs are reusable functions you define in T-SQL. Scalar functions return one value. Inline table-valued functions return a table (like a parameterized view). Multi-statement TVFs build a table row by row. Scalar UDFs are handy but can tank performance when called per row in large queries.
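Sketches of the two most common UDF shapes, using hypothetical names and a hypothetical dbo.Orders table:

```sql
-- Scalar UDF: returns a single value per call.
CREATE FUNCTION dbo.fn_FullName (@First NVARCHAR(50), @Last NVARCHAR(50))
RETURNS NVARCHAR(101)
AS
BEGIN
    RETURN @First + N' ' + @Last;
END
GO

-- Inline table-valued function: behaves like a parameterized view
-- and generally optimizes far better than a scalar UDF called per row.
CREATE FUNCTION dbo.fn_OrdersSince (@Cutoff DATETIME2)
RETURNS TABLE
AS
RETURN (SELECT OrderId, CustomerId, OrderDate
        FROM   dbo.Orders
        WHERE  OrderDate >= @Cutoff);
GO
```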
Q14:

What is a Transaction and why is it important?

Entry

Answer

A transaction ensures a group of SQL operations is treated as a single unit—either all succeed or all fail (ACID compliance).

Quick Summary: A transaction is a group of SQL operations treated as one unit. Either all succeed and commit, or all fail and roll back. This prevents half-done operations from corrupting your data — like a bank transfer where money leaves one account but never arrives in another.
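The bank-transfer example above can be sketched with TRY/CATCH, assuming a hypothetical dbo.Accounts table:

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;

    COMMIT TRANSACTION;        -- both updates persist together
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;  -- neither update persists
    THROW;                     -- re-raise the original error
END CATCH;
```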
Q15:

Explain ACID properties in SQL.

Entry

Answer

Atomicity: All-or-nothing.
Consistency: Valid states only.
Isolation: No interference between transactions.
Durability: Data persists after commit.

Quick Summary: ACID stands for: Atomicity (all or nothing), Consistency (data stays valid before and after), Isolation (transactions don't interfere with each other), Durability (once committed, data survives crashes). These four guarantees make transactions reliable in production systems.
Q16:

What is a Deadlock in SQL Server?

Entry

Answer

A deadlock occurs when transactions block each other while waiting for resources. SQL Server resolves it by killing one transaction.

Quick Summary: A deadlock happens when two transactions each hold a lock the other needs — both wait forever. SQL Server's deadlock monitor detects the cycle and kills the cheaper transaction (the deadlock victim), rolling it back and returning an error. The other transaction then completes normally.
Q17:

What is Locking in MS SQL Server?

Entry

Answer

Locking controls concurrent access using shared (read), exclusive (write), and update locks to maintain data consistency.

Quick Summary: Locking prevents concurrent transactions from corrupting shared data. SQL Server automatically applies locks (shared for reads, exclusive for writes) when a transaction touches data. The granularity ranges from row to page to table. Poor lock management causes blocking, contention, and deadlocks.
Q18:

What is a Schema in SQL?

Entry

Answer

A schema is a logical container grouping tables, views, and procedures for better organization and security.

Quick Summary: A schema is a logical namespace inside a database that groups related objects — tables, views, procedures. Like folders for your database objects. dbo is the default schema. Schemas help with organization, security (grant access per schema), and avoiding naming conflicts between teams.
Q19:

Difference between DELETE and TRUNCATE?

Entry

Answer

DELETE: Row-by-row removal, fully logged.
TRUNCATE: Deallocates pages, extremely fast, minimal logging.

Quick Summary: DELETE removes rows one at a time, logs each deletion, fires DELETE triggers, and can target specific rows with a WHERE clause. TRUNCATE removes all rows at once, logs only page deallocations, is much faster, doesn't fire DELETE triggers, and can't be used on a table referenced by a foreign key. TRUNCATE resets identity columns; DELETE does not.
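The contrast in miniature, on a hypothetical staging table:

```sql
-- DELETE: row-by-row, fully logged, supports a WHERE clause,
-- does not reset the identity seed.
DELETE FROM dbo.StagingRows
WHERE  LoadDate < '2024-01-01';

-- TRUNCATE: deallocates data pages, removes every row,
-- and reseeds the identity column.
TRUNCATE TABLE dbo.StagingRows;
```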
Q20:

What is an Execution Plan and why is it important?

Entry

Answer

An execution plan shows how SQL Server executes a query—joins, indexes, scans, seeks. It is vital for tuning slow queries.

Quick Summary: An execution plan is SQL Server's roadmap for running a query — which indexes to use, how to join tables, in what order to process operations. Reading the plan reveals bottlenecks like table scans, key lookups, or hash matches that slow things down. It's the first tool to reach for when tuning.
Q21:

What is the main purpose of indexing in SQL Server?

Entry

Answer

Indexing improves data retrieval speed by avoiding full table scans.

Indexes act like optimized search maps that help SQL Server locate rows faster.

However, too many indexes slow down write performance due to maintenance overhead.

Quick Summary: Indexing lets SQL Server find rows without reading the entire table. Instead of scanning a million rows, it jumps to the right spot using the index structure (B-tree). Without good indexes, even simple lookups turn into full table scans that waste CPU, memory, and time.
Q22:

How does SQL Server decide between a Table Scan, Index Scan, and Index Seek?

Entry

Answer

SQL Server chooses access methods based on filters, available indexes, and cost estimates.

Table Scan: Reads all rows when no useful index exists.

Index Scan: Reads all index entries for broad filters.

Index Seek: Most efficient; jumps directly to matching rows.

Quick Summary: SQL Server uses statistics (row count estimates) to pick the cheapest plan. Table Scan reads every row — used when there's no useful index or the whole table is needed. Index Scan reads the index but all rows. Index Seek jumps directly to matching rows — the most efficient, what you want for selective queries.
Q23:

What is a Composite Index and when is it used?

Entry

Answer

A composite index includes multiple columns, used when queries filter or sort using those column combinations.

The order of columns is important because SQL Server can only efficiently use leading key columns.

Quick Summary: A composite index covers multiple columns. Useful when your WHERE clause filters on a combination of columns together. The column order matters — the index is most effective when the leading column appears in the filter. A (LastName, FirstName) index won't help a query filtering only on FirstName.
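The (LastName, FirstName) example can be sketched directly, with a hypothetical dbo.People table:

```sql
CREATE NONCLUSTERED INDEX IX_People_LastName_FirstName
    ON dbo.People (LastName, FirstName);

-- Can seek: the leading column is filtered.
SELECT * FROM dbo.People WHERE LastName = N'Smith';
SELECT * FROM dbo.People WHERE LastName = N'Smith' AND FirstName = N'Ada';

-- Cannot seek on this index: the leading column is missing,
-- so SQL Server falls back to a scan.
SELECT * FROM dbo.People WHERE FirstName = N'Ada';
```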
Q24:

What is Bookmark Lookup and why can it cause performance problems?

Entry

Answer

Bookmark lookup occurs when SQL uses a non-clustered index but must fetch extra columns from the base table.

This becomes slow when many rows match, increasing I/O and reducing performance.

Quick Summary: Bookmark lookup happens when a non-clustered index finds the matching row keys, but the query needs columns not in the index — so SQL Server does an extra lookup into the clustered index to get them. Each lookup is a random I/O. With many rows, this becomes very expensive. Fix: use a covering index.
Q25:

What is a Covering Index and why is it powerful?

Entry

Answer

A covering index contains all columns needed for a query.

SQL Server can return results from the index alone without touching the table.

This greatly improves performance for read-heavy queries.

Quick Summary: A covering index includes all the columns a query needs — the index columns plus extra INCLUDE columns. SQL Server can answer the query entirely from the index without touching the main table. Eliminates key lookups, reduces I/O, and often cuts query time dramatically.
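A minimal covering-index sketch, assuming a hypothetical dbo.Orders table with a TotalAmount column:

```sql
-- Key column supports the WHERE clause; INCLUDE carries the SELECT columns,
-- so the query never touches the base table (no key lookups).
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_Covering
    ON dbo.Orders (CustomerId)
    INCLUDE (OrderDate, TotalAmount);

SELECT OrderDate, TotalAmount
FROM   dbo.Orders
WHERE  CustomerId = 42;   -- answered entirely from the index
```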
Q26:

What are SQL Server Statistics and why are they essential?

Entry

Answer

Statistics describe data distribution in columns.

They help the optimizer estimate row counts and choose efficient execution plans.

Quick Summary: Statistics are histograms that tell SQL Server how data is distributed in a column — how many distinct values exist, how they're spread, what the min/max are. The query optimizer uses them to estimate how many rows a filter will return and pick the best execution plan.
Q27:

How do outdated or missing statistics affect query performance?

Entry

Answer

Outdated statistics produce inaccurate row estimates.

SQL Server may choose inefficient join types or scans, causing slow performance.

Regular statistic updates maintain optimal plans.

Quick Summary: Outdated statistics mean SQL Server is working with stale data distribution information. It might think a filter returns 100 rows when it actually returns 100,000 — leading to a plan built for small data that falls apart at scale. This causes unexpected table scans, wrong join types, and slow queries.
Q28:

What is Parameter Sniffing and why does it occur?

Entry

Answer

SQL Server caches execution plans for reuse.

Parameter sniffing occurs when a cached plan works well for one parameter but poorly for another due to data skew.

Quick Summary: Parameter sniffing is when SQL Server compiles a stored procedure plan based on the first parameter values it sees. If those values are atypical (e.g., a rare ID vs. a common status), the cached plan is optimal for that one call but terrible for everyone else. Leads to inconsistent query performance.
Q29:

What is an Execution Plan and how does SQL Server generate it?

Entry

Answer

An execution plan is SQL Server’s strategy for running a query.

The optimizer evaluates multiple plan options and picks the lowest-cost one.

Quick Summary: When SQL Server receives a query, it parses it, checks the plan cache for a reusable plan, then the query optimizer evaluates multiple possible plans using cost estimates and picks the cheapest one. The plan is cached and reused. For parameterized queries, the same plan handles all values — good and bad.
Q30:

What is the difference between Estimated and Actual Execution Plans?

Entry

Answer

Estimated plans show predicted row counts before execution.

Actual plans include real runtime row counts and performance metrics.

Quick Summary: Estimated plan shows what SQL Server planned to do before running the query — based on statistics. Actual plan shows what really happened including actual row counts, execution counts, and time. Comparing estimated vs actual row counts reveals where statistics are wrong and the plan went off-course.
Q31:

What is Cardinality Estimation and why is it important?

Entry

Answer

Cardinality estimation predicts how many rows a query will process.

Accurate estimates are critical for choosing efficient join types and memory usage.

Quick Summary: Cardinality estimation predicts how many rows each operation in a plan will produce. If the estimate is 10 rows but the real result is 100,000, SQL Server builds the wrong plan — wrong join type, wrong memory grant, wrong index choice. Accurate statistics are what keep cardinality estimates correct.
Q32:

What are Index Fragmentation and Fill Factor?

Entry

Answer

Fragmentation occurs when index pages become disordered, slowing scans and seeks.

Fill factor controls how much free space to leave during index creation to reduce fragmentation.

Quick Summary: Fragmentation happens when index pages get out of order after many inserts, updates, and deletes. Logical fragmentation means pages are out of sequence — sequential reads become random I/O. Fill Factor controls how full each page is when the index is built, leaving room to grow and reducing future splits.
Q33:

What is the Query Optimizer and how does it work?

Entry

Answer

The optimizer evaluates many query strategies using statistics and metadata.

It selects the lowest estimated cost plan.

Quick Summary: The query optimizer is SQL Server's brain for plan selection. It generates multiple candidate plans, estimates the cost of each using statistics and cardinality, and picks the one with the lowest estimated cost. It's cost-based, not rule-based — which is why bad statistics lead directly to bad plans.
Q34:

What is a Filtered Index and when is it beneficial?

Entry

Answer

A filtered index stores only rows matching a condition.

It reduces index size and improves seek performance for selective queries.

Quick Summary: A filtered index only indexes rows that match a WHERE condition. If 90% of your rows are "archived" and queries only touch "active" ones, a filtered index on IsActive = 1 is tiny, fast, and highly selective. Much cheaper than a full index and often a perfect fit for status or soft-delete columns.
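The IsActive example above as a sketch (table and columns hypothetical):

```sql
-- Indexes only the active rows; tiny compared to a full index
-- when most rows are archived or soft-deleted.
CREATE NONCLUSTERED INDEX IX_Orders_Active
    ON dbo.Orders (CustomerId)
    WHERE IsActive = 1;

-- Qualifies for the filtered index because the predicate matches.
SELECT OrderId FROM dbo.Orders
WHERE  IsActive = 1 AND CustomerId = 42;
```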
Q35:

What is a Hint and when should it be used?

Entry

Answer

Hints override optimizer decisions by forcing specific behavior.

They should be used only when the optimizer consistently chooses suboptimal plans.

Quick Summary: A hint forces SQL Server to use a specific index, join type, or behavior instead of letting the optimizer decide. Use sparingly — hints override the optimizer's judgment, which can help in edge cases but often makes things worse long-term as data changes. Prefer fixing statistics or schema over using hints.
Q36:

What is a Hotspot in indexing terms?

Entry

Answer

A hotspot occurs when many concurrent operations target the same physical index location.

This causes locking and contention, reducing performance.

Quick Summary: A hotspot is a single index page that many concurrent inserts target at the same time — typically the last page of a sequential (identity-based) clustered index. All writers compete for the same page latch, causing contention and blocking. Mitigations: a non-sequential key such as a GUID, or partitioning.
Q37:

What is the Role of the Query Store in SQL Server?

Entry

Answer

Query Store tracks query history, execution plans, and performance trends.

It helps diagnose regressions and enforce stable plans.

Quick Summary: Query Store is a built-in SQL Server feature that captures query text, execution plans, and runtime stats over time. When a query suddenly gets slower (plan regression), Query Store can show you the old fast plan and let you force it back. Essential for diagnosing intermittent performance problems.
Q38:

What is an Index Seek Predicate vs Residual Predicate?

Entry

Answer

Seek predicates allow precise navigation through an index.

Residual predicates apply filters after the seek when the index does not fully cover the query.

Quick Summary: Seek predicate is the condition used to enter the index — it actually narrows which rows are retrieved. Residual predicate is a secondary filter applied after the seek, on rows already found. A residual predicate means SQL Server fetched more rows than needed and filtered them post-retrieval — less efficient.
Q39:

Why does SQL Server choose a Hash Match instead of Nested Loop or Merge Join?

Entry

Answer

SQL Server uses hash match for large, unsorted datasets where hashing is cheaper than repeated seeks.

Poor row estimates may also cause SQL to choose hashing, sometimes leading to spills.

Quick Summary: SQL Server chooses Hash Match when it can't sort both inputs efficiently (no useful index, large unsorted data). It builds a hash table from the smaller input and probes it with the larger. It's memory-intensive and can spill to TempDB if the memory grant was underestimated. More common with missing or outdated indexes.
Q40:

Explain the internal structure of a SQL Server index (Clustered vs Non-Clustered). How does it affect performance?

Junior

Answer

Clustered Index: Defines the physical order of rows in the table. Implemented as a B-Tree with root, intermediate, and leaf nodes. The leaf level contains actual data rows.

Non-Clustered Index: Also a B-Tree, but leaf nodes store index keys and row locators (RID or clustered key).

Performance Impact: Clustered indexes help range queries, while non-clustered indexes help quick lookups. Poor clustered key choice can make indexes large and slow.

Quick Summary: Clustered index stores the actual data rows in leaf pages, sorted by the key — the table IS the index. Non-clustered index stores key values plus a pointer (row locator) to the actual row. Clustered is optimal for range queries. Non-clustered adds lookup overhead unless it covers all needed columns.
Q41:

What is an Index Seek vs Index Scan vs Table Scan? When is each used?

Junior

Answer

Index Seek: SQL jumps directly to matched rows. Happens with selective filters and proper indexes.

Index Scan: SQL reads entire index. Happens with broad filters or missing indexes.

Table Scan: Reads the whole table. Used when no index exists or the table is small.

Quick Summary: Index Seek: directly navigates the B-tree to find matching rows — most efficient, low I/O. Index Scan: traverses all index pages — used when filtering is weak or all rows are needed. Table Scan: reads the entire heap — worst case, no usable index. Seek is ideal; scans are a sign something needs attention.
Q42:

How does SQL Server choose an Execution Plan?

Junior

Answer

SQL Server uses a cost-based optimizer that parses the query, evaluates different plan options, estimates CPU/I/O cost, and chooses the lowest-cost plan.

Quick Summary: SQL Server parses the query, generates a query tree, checks the plan cache (reuse if found), then the optimizer produces candidate plans with cost estimates. It picks the lowest-cost plan and caches it. For parameterized queries, that same plan runs for all future calls with different parameter values.
Q43:

What are Statistics? Why do bad statistics slow queries?

Junior

Answer

Statistics contain value distribution info. SQL uses them to estimate row counts. Bad or outdated statistics cause wrong estimations, leading to slow plans.

Quick Summary: Statistics contain histograms showing how values are distributed in a column. The optimizer uses them to estimate row counts for each operation. If stats are stale — say after a large data load — estimates are wrong, plans are wrong, and queries get slow. Update statistics regularly in high-change tables.
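The maintenance step above can be sketched on a hypothetical table and index:

```sql
-- Refresh optimizer statistics after a large data change.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Inspect a statistics object's header and histogram
-- (last-updated time, row counts, value distribution).
DBCC SHOW_STATISTICS ('dbo.Orders', IX_Orders_CustomerId);
```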
Q44:

What is a Covering Index and when should you create one?

Junior

Answer

A covering index contains all columns needed for a query. It removes key lookups and improves performance in read-heavy workloads.

Quick Summary: A covering index satisfies a query entirely from the index without touching the main table. Include the WHERE columns as index keys and extra SELECT columns via INCLUDE. Eliminates costly key lookups. One well-placed covering index can eliminate 90% of I/O for frequently-run queries.
Q45:

What is a Key Lookup? Why is it expensive?

Junior

Answer

Key lookup happens when a non-clustered index finds a row but SQL needs extra columns. It causes many random I/O operations and slows queries.

Quick Summary: Key lookup (aka Bookmark Lookup) happens when a non-clustered index finds the row, but the query needs extra columns not in the index — SQL Server fetches those from the clustered index. Each lookup is a random I/O. With thousands of rows, this stacks up. Adding INCLUDE columns to the index eliminates it.
Q46:

Explain Lock Escalation in SQL Server.

Junior

Answer

When too many row/page locks exist, SQL upgrades to a table lock. This reduces overhead but lowers concurrency.

Quick Summary: Lock escalation converts many fine-grained locks (row, page) into one coarse table-level lock when the lock count crosses a threshold (default ~5,000). It reduces lock manager overhead but blocks all concurrent readers and writers on that table. Can cause unexpected blocking in high-concurrency workloads.
Q47:

What is Page Splitting? How does Fill Factor help?

Junior

Answer

Page splitting happens when SQL inserts rows into a full page. It increases fragmentation. Fill Factor leaves free space to reduce page splits.

Quick Summary: Page splitting happens when a new row is inserted into a full index page — SQL Server splits the page into two half-full pages, causing fragmentation. Fill Factor sets how full pages are when an index is built (e.g., 80% means 20% room to grow). Lower fill factor = less splitting but larger index size.
Q48:

Explain SQL Server Memory Architecture (Buffer Pool, Plan Cache).

Junior

Answer

Buffer Pool: Stores data and index pages.

Plan Cache: Stores compiled query plans for reuse.

Quick Summary: The Buffer Pool is SQL Server's main memory cache — it holds data pages and index pages read from disk. Reads hit the buffer pool first (logical read); if not found, it fetches from disk (physical read). Plan Cache stores compiled execution plans for reuse. Together they dramatically reduce disk I/O and compilation overhead.
Q49:

What is Parameter Sniffing and how do you fix it?

Junior

Answer

SQL optimizes a plan based on first-used parameter value. Fix options include using RECOMPILE, OPTIMIZE FOR UNKNOWN, or rewriting queries.

Quick Summary: SQL Server compiles a plan for a stored procedure using the first parameter values it sees. If those values are unusual (high selectivity vs. low), the cached plan is ideal for them but wrong for typical calls. Fix: OPTION(RECOMPILE), OPTIMIZE FOR hint, or local variable workaround to break plan reuse.
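The two main fixes mentioned above, sketched in a hypothetical procedure:

```sql
CREATE PROCEDURE dbo.usp_OrdersByStatus
    @Status VARCHAR(20)
AS
BEGIN
    -- Fix 1: compile a fresh plan on every execution
    -- (consistent plans at the cost of per-call compile CPU).
    SELECT OrderId
    FROM   dbo.Orders
    WHERE  Status = @Status
    OPTION (RECOMPILE);

    -- Fix 2 (alternative): optimize for the "average" value instead of
    -- the sniffed one, so one cached plan suits typical calls:
    --   ... OPTION (OPTIMIZE FOR (@Status UNKNOWN));
END
```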
Q50:

When should you use a Filtered Index?

Junior

Answer

Use when queries target specific subsets (e.g., Active records). Improves performance and reduces index size.

Quick Summary: Filtered indexes work best for columns where queries consistently target a specific subset — IsDeleted = 0, Status = 'Active', Region = 'US'. The index is smaller, faster, and more efficient than a full index. Particularly valuable for soft-delete patterns where active rows are a small fraction of total.
Q51:

What is a Heap? Should you use it?

Junior

Answer

A heap is a table without clustered index. Good for bulk loads but slow for lookups. Most tables should have a clustered index.

Quick Summary: A heap is a table with no clustered index — rows are stored in no particular order. Lookups require a full table scan or a non-clustered index with a Row Identifier (RID) lookup. For most workloads, a clustered index is better. Heaps can be useful for bulk-load staging tables where you don't need ordered access.
Q52:

What is Partitioning in SQL Server?

Junior

Answer

Partitioning splits a table into smaller pieces, improving maintenance and performance for large datasets.

Quick Summary: Partitioning divides a large table into smaller physical pieces (partitions) based on a column value — usually a date or range. Queries that filter on the partition column only scan relevant partitions (partition elimination). Makes archiving, purging, and managing large datasets much more practical.
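A date-range partitioning sketch, with hypothetical names (pfSalesYear, psSalesYear, dbo.Sales) and all partitions mapped to PRIMARY for simplicity:

```sql
-- Boundaries define yearly partitions.
CREATE PARTITION FUNCTION pfSalesYear (DATE)
    AS RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01', '2025-01-01');

CREATE PARTITION SCHEME psSalesYear
    AS PARTITION pfSalesYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.Sales (
    SaleId   BIGINT IDENTITY(1,1),
    SaleDate DATE NOT NULL,
    Amount   DECIMAL(12,2) NOT NULL
) ON psSalesYear (SaleDate);

-- Filtering on the partition column lets SQL Server scan only the
-- 2024 partition (partition elimination).
SELECT SUM(Amount)
FROM   dbo.Sales
WHERE  SaleDate >= '2024-01-01' AND SaleDate < '2025-01-01';
```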
Q53:

What is Query Store and why is it useful?

Junior

Answer

Query Store keeps execution history, performance metrics, and helps detect regressions. It allows forcing stable plans.

Quick Summary: Query Store records query text, execution plans, and runtime stats (duration, CPU, I/O) historically. When a plan suddenly regresses (gets slow), you can compare it to the previous fast plan and force the good one back. Invaluable for diagnosing plan regressions after upgrades, stats updates, or data changes.
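A sketch of the workflow: enable Query Store, inspect captured plans, and force a known-good one (the database name and the query_id/plan_id values are placeholders):

```sql
-- Enable Query Store on a database.
ALTER DATABASE SalesDb SET QUERY_STORE = ON;

-- Browse captured queries, their plans, and runtime stats.
SELECT q.query_id, p.plan_id, rs.avg_duration
FROM   sys.query_store_query q
JOIN   sys.query_store_plan  p  ON p.query_id = q.query_id
JOIN   sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id
ORDER BY rs.avg_duration DESC;

-- Force the previously fast plan for a regressed query.
EXEC sp_query_store_force_plan @query_id = 41, @plan_id = 7;
```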
Q54:

What is Cardinality Estimation in SQL Server?

Junior

Answer

It predicts row counts for query operations. Accurate estimates help SQL choose the best join and operator strategy.

Quick Summary: Cardinality estimation predicts how many rows each step in a query plan will produce. Wrong estimates lead to wrong join strategies, wrong memory grants, and wrong index choices. The estimator relies on statistics — stale or missing stats cause it to guess badly, which cascades into a poorly-performing plan.
Q55:

Explain Memory Grants in SQL Server.

Junior

Answer

SQL requests memory for sorting and hashing. Over-grants cause waits; under-grants cause spills to TempDB.

Quick Summary: SQL Server pre-allocates memory (a memory grant) for sort and hash operations based on cardinality estimates. Too small → spills to TempDB, slowing the query. Too large → wastes memory, blocks other queries from getting grants. Fix bad grants by fixing statistics so estimates are accurate.
Q56:

What are Hot Spots in indexing?

Junior

Answer

Hotspots occur when many inserts hit the same index page (e.g., identity keys). They cause blocking and contention.

Quick Summary: Hotspots occur on the last page of a sequential clustered index (identity or timestamp key) when many concurrent inserts all target the same page. They compete for the same page latch, serializing inserts. Mitigations: a non-sequential (e.g., GUID) clustered key, partitioning, or the OPTIMIZE_FOR_SEQUENTIAL_KEY index option (SQL Server 2019+).
Q57:

Pessimistic vs Optimistic Concurrency in SQL Server.

Junior

Answer

Pessimistic uses locks to avoid conflicts. Optimistic uses row versioning to detect conflicts at commit without blocking readers.

Quick Summary: Pessimistic concurrency uses locks upfront — a transaction locks data before reading or writing to prevent conflicts. Optimistic concurrency assumes conflicts are rare — it reads without locking and checks at commit time if someone else changed the data. SQL Server supports both via isolation levels and snapshot isolation.
Q58:

What are Wait Types in SQL Server?

Junior

Answer

Wait types show where SQL is spending time (CPU, I/O, locks). Examples: CXPACKET, PAGEIOLATCH, LCK_M_X.

Quick Summary: Wait types tell you what SQL Server is waiting for — PAGEIOLATCH (disk I/O), LCK_M_X (lock wait), CXPACKET (parallelism), SOS_SCHEDULER_YIELD (CPU pressure). When a query is slow, checking wait stats points directly at the bottleneck. It's the fastest way to know where to look.
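A quick diagnostic query against the wait-stats DMV (the excluded wait types are a small illustrative subset of the benign idle waits normally filtered out):

```sql
-- Top cumulative waits since the stats were last cleared.
SELECT TOP (10)
       wait_type,
       wait_time_ms,
       waiting_tasks_count
FROM   sys.dm_os_wait_stats
WHERE  wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
                         N'BROKER_TASK_STOP', N'XE_TIMER_EVENT')
ORDER BY wait_time_ms DESC;
```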
Q59:

Explain TempDB internals and why it affects performance.

Junior

Answer

TempDB stores temp tables, spills, versioning, and intermediate results. Best practices include multiple equal-sized data files and SSD storage.

Quick Summary: TempDB is a shared system database used for temporary tables, table variables, sort spills, row versioning (for snapshot isolation), and hash join/sort intermediate results. High TempDB contention (allocation page contention) can bottleneck the entire server. Multiple data files and fast storage mitigate this.
Q60:

Explain in detail how SQL Server processes different types of JOINs internally.

Mid

Answer

SQL Server evaluates JOINs using physical operators such as Nested Loops, Merge Join, and Hash Join. INNER, LEFT, RIGHT, and FULL JOINs all use these strategies depending on data size, indexes, and sorting.

INNER JOIN: Returns matching rows only.

LEFT/RIGHT JOIN: Returns matching rows plus NULL for non-matching.

FULL JOIN: Returns all matches + unmatched from both sides.

CROSS JOIN: Cartesian product; typically expensive.

Proper indexing affects which join operator SQL Server chooses.

Quick Summary: SQL Server supports Nested Loops (small outer table, indexed inner), Merge Join (both inputs sorted — often the fastest), and Hash Match (large unsorted inputs — memory intensive). The optimizer picks based on data size and index availability. Nested Loops is cheapest for small datasets; Hash Match is a fallback for large ones.
Q61:

Explain Nested Loops, Merge Join, and Hash Join with when each is chosen.

Mid

Answer

Nested Loops: Best for small outer input and indexed inner table. Great for OLTP random lookups.

Merge Join: Requires sorted inputs. Very fast for large, sorted datasets.

Hash Join: Best for large, unsorted sets. Spills to TempDB if memory is insufficient.

Quick Summary: Nested Loops: iterate outer rows, seek matching inner rows via index — best when outer set is small. Merge Join: both inputs must be pre-sorted; scan both simultaneously — fast when indexes provide sort order. Hash Match: build a hash table from smaller input, probe with larger — chosen when sorts aren't available and data is large.
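You can observe each physical operator by forcing it with a query hint and comparing the actual execution plans. dbo.Orders and dbo.Customers are hypothetical tables; hints override the optimizer, so use this only for experimentation:

```sql
-- Force a specific physical join to compare plans
SELECT o.OrderId, c.CustomerName
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c
    ON c.CustomerId = o.CustomerId
OPTION (HASH JOIN);  -- swap in LOOP JOIN or MERGE JOIN and compare
```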
Q62:

What is a Transaction? Explain ACID with real SQL Server implications.

Mid

Answer

A transaction is a unit of work ensured by ACID properties:

  • Atomicity: All-or-nothing behavior.
  • Consistency: Constraints must remain valid.
  • Isolation: Controls concurrency.
  • Durability: Committed data survives crashes via logging.
Quick Summary: A transaction groups multiple operations into one atomic unit. In SQL Server: Atomicity means the transaction commits fully or rolls back entirely. Consistency ensures constraints hold. Isolation controls what concurrent transactions can see. Durability means committed data survives crashes via the transaction log.
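The standard T-SQL pattern for an atomic transfer looks like this. A sketch assuming a hypothetical dbo.Accounts table:

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- Both statements succeed or neither does (atomicity)
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;

    COMMIT TRANSACTION;  -- durability: log records are hardened before commit returns
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;
    THROW;  -- re-raise the original error to the caller
END CATCH;
```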
Q63:

Explain SQL Server Transaction Isolation Levels with practical use-cases.

Mid

Answer

Isolation levels include:

READ UNCOMMITTED: Allows dirty reads; used in analytics.

READ COMMITTED: Default; prevents dirty reads.

RCSI: Uses version store; avoids blocking.

REPEATABLE READ: Prevents non-repeatable reads.

SERIALIZABLE: Highest isolation; heavy locking.

SNAPSHOT: Uses row versioning; avoids shared locks.

Quick Summary: Read Uncommitted: sees dirty (uncommitted) data — fastest but risky. Read Committed: default, only reads committed data but allows non-repeatable reads. Repeatable Read: locks read rows, prevents changes. Serializable: full isolation, range locks prevent phantom reads. Snapshot: uses row versions instead of locks — concurrent and clean.
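Isolation levels are set per session. A sketch using a hypothetical dbo.Accounts table:

```sql
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;  -- shared lock held until commit
-- a second read here is guaranteed to return the same value
COMMIT;

-- SNAPSHOT must first be enabled at the database level:
-- ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;  -- readers use row versions, not locks
```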
Q64:

Explain Pessimistic vs Optimistic Concurrency in SQL Server.

Mid

Answer

Pessimistic: Uses locks to avoid conflicts; good for heavy-write systems.

Optimistic: Uses row versioning; detects conflicts at commit.

Quick Summary: Pessimistic: lock the data before you use it — nobody else can change it until you're done. Low concurrency but zero conflict risk. Optimistic: don't lock on read; check at commit time if data changed. Higher concurrency but requires retry logic on conflict. SQL Server's Snapshot Isolation is an optimistic approach.
Q65:

What is Deadlock? How does SQL Server detect and resolve it?

Mid

Answer

A deadlock occurs when sessions wait on each other in a cycle that can never resolve on its own. SQL Server's lock monitor checks for deadlocks every 5 seconds (more frequently once one is found), selects the cheapest transaction as the victim, and rolls it back with error 1205.

Prevention includes using consistent resource order, short transactions, and proper indexes.

Quick Summary: A deadlock forms when transaction A holds lock 1 and wants lock 2, while transaction B holds lock 2 and wants lock 1. Neither can proceed. SQL Server's deadlock monitor runs every 5 seconds, detects the cycle, picks the cheapest victim, rolls it back, and lets the other complete. Victims get error 1205.
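Because any transaction can be chosen as the victim, callers should retry on error 1205. A sketch of a server-side retry loop against a hypothetical dbo.Accounts table:

```sql
DECLARE @retries int = 0;
WHILE @retries < 3
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
        UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;
        COMMIT;
        BREAK;  -- success, leave the loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK;
        IF ERROR_NUMBER() = 1205 AND @retries < 2
            SET @retries += 1;   -- deadlock victim: try again
        ELSE
            THROW;               -- any other error, or retries exhausted
    END CATCH;
END;
```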
Q66:

Explain Lock Types (Shared, Update, Exclusive, Intent).

Mid

Answer

Shared (S): For reading.

Exclusive (X): For writing.

Update (U): Prevents deadlocks during read-to-write transitions.

Intent Locks: Used to manage lock hierarchy efficiently.

Quick Summary: Shared (S): read lock, multiple transactions can hold simultaneously. Update (U): intent to update — prevents two transactions from upgrading to exclusive at the same time. Exclusive (X): write lock, blocks all other access. Intent locks (IS, IX, SIX): signal at a higher level (table) that lower-level locks exist.
Q67:

What are Lock Waits and Blocking Chains? How do you debug them?

Mid

Answer

Blocking occurs when a session holds a lock required by another. Debugging tools include:

  • sp_whoisactive
  • sys.dm_exec_requests
  • Extended events
  • Activity Monitor
Quick Summary: Lock waits happen when one transaction needs a lock already held by another. A blocking chain is multiple transactions queued behind the head blocker. Debug using sys.dm_exec_requests, sys.dm_os_waiting_tasks, or the blocking report in Query Store. The head blocker is always the root cause — fix that transaction first.
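A minimal blocking-chain query using the DMVs mentioned above; walk blocking_session_id upward until it is 0 to find the head blocker:

```sql
-- Who is blocked, by whom, and what they are running
SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time,
    t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```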
Q68:

Explain TempDB usage in detail.

Mid

Answer

TempDB stores temp tables, table variables, hash join work tables, version store, spills, and cursor data. Best practices include multiple equal-sized files and SSD storage.

Quick Summary: TempDB stores: temp tables (#temp), table variables (@table), internal work tables for sorts and hash joins, row version store for snapshot isolation, DBCC operations, and Service Broker. It's recreated fresh every SQL Server restart. All user sessions share it — heavy TempDB use is a common performance bottleneck.
Q69:

Explain the Write-Ahead Logging (WAL) mechanism.

Mid

Answer

Before a modified data page can be flushed to disk, SQL Server hardens the corresponding log record to the transaction log first. This ordering guarantees durability and makes crash recovery possible.

Quick Summary: WAL guarantees every change is written to the transaction log (WAL) before it's written to data pages. On crash, SQL Server replays committed log records (redo) and undoes uncommitted ones (undo). This makes the database consistent after a crash without data loss. Durability in ACID depends entirely on WAL.
Q70:

Describe Transaction Log Architecture. Why does the log grow?

Mid

Answer

The log consists of Virtual Log Files (VLFs). Log grows due to long-running transactions, no log backups, replication delays, or index rebuilds inside transactions.

Quick Summary: The transaction log records every change in sequence — each log record has an LSN (Log Sequence Number). The log grows when: long-running transactions hold the log open, log backups aren't taken (Full recovery model), or replication hasn't consumed the log. Regular log backups keep it manageable.
Q71:

Explain CHECKPOINT and why it is important.

Mid

Answer

CHECKPOINT flushes dirty pages to disk to reduce crash recovery time and manage buffer pressure.

Quick Summary: CHECKPOINT flushes dirty pages (modified but not yet written) from the buffer pool to disk. It reduces crash recovery time — SQL Server only needs to replay changes since the last checkpoint instead of replaying the whole log. Automatic checkpoints run periodically; you can also force one manually.
Q72:

What is the difference between Full, Differential, and Log Backups?

Mid

Answer

Full: Entire database.

Differential: Changes since last full.

Log: Captures all log records since last log backup.

Quick Summary: Full backup: entire database, self-contained restore point. Differential backup: only pages changed since the last full backup — faster than full, smaller file. Log backup: transaction log since last log backup — enables point-in-time recovery. Together they form a restore chain: full + diff + logs.
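A typical schedule combining all three, sketched with placeholder file paths and a hypothetical database MyDb:

```sql
BACKUP DATABASE MyDb TO DISK = N'D:\bak\MyDb_full.bak';                   -- e.g. nightly
BACKUP DATABASE MyDb TO DISK = N'D:\bak\MyDb_diff.bak' WITH DIFFERENTIAL; -- e.g. every 6 hours
BACKUP LOG MyDb TO DISK = N'D:\bak\MyDb_log.trn';                         -- e.g. every 15 minutes
```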
Q73:

Explain Recovery Models (Simple, Full, Bulk-Logged).

Mid

Answer

Simple: Auto-truncates log; no PIT recovery.

Full: Requires log backups; supports PIT recovery.

Bulk-Logged: Minimally logs bulk operations (BULK INSERT, SELECT INTO, index rebuilds); everything else is fully logged.

Quick Summary: Simple: log is truncated at each checkpoint — no point-in-time recovery, just restore to last full or diff backup. Full: log is kept until backed up — allows point-in-time restore, required for AlwaysOn. Bulk-Logged: like Full but bulk operations are minimally logged — faster loads but can't do point-in-time around them.
Q74:

What is Tail Log Backup?

Mid

Answer

A backup taken before restoring a damaged DB to prevent data loss. Captures last active log records.

Quick Summary: A tail-log backup captures the end of the transaction log after a failure but before restore begins. Without it, you lose all transactions between the last log backup and the crash. It's the last piece of the restore chain. Required when recovering a Full-model database to the point of failure.
Q75:

Explain the Restore Sequence in SQL Server.

Mid

Answer

Restore order: Full → Differential → Logs. Use NORECOVERY until the final restore, then RECOVERY.

Quick Summary: Restore sequence: 1) Restore the full backup (NORECOVERY), 2) Apply each differential/log backup in order (NORECOVERY), 3) Apply final log backup with RECOVERY to bring the database online. NORECOVERY keeps the database in restoring state so you can apply more backups; RECOVERY completes the restore.
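The full sequence, including the tail-log backup from the previous question, looks like this. A sketch with placeholder paths and a hypothetical database MyDb:

```sql
-- Capture the tail of the log before touching the damaged database
BACKUP LOG MyDb TO DISK = N'D:\bak\MyDb_tail.trn' WITH NO_TRUNCATE, NORECOVERY;

RESTORE DATABASE MyDb FROM DISK = N'D:\bak\MyDb_full.bak' WITH NORECOVERY;
RESTORE DATABASE MyDb FROM DISK = N'D:\bak\MyDb_diff.bak' WITH NORECOVERY;
RESTORE LOG MyDb FROM DISK = N'D:\bak\MyDb_log.trn'  WITH NORECOVERY;
RESTORE LOG MyDb FROM DISK = N'D:\bak\MyDb_tail.trn' WITH RECOVERY;  -- brings the DB online
```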
Q76:

What is AlwaysOn Availability Groups? Explain components.

Mid

Answer

AG consists of primary replica, secondary replicas, listener, and modes (sync/async). Provides HA, DR, and readable secondaries.

Quick Summary: AlwaysOn AG replicates databases from a primary replica to one or more secondary replicas using log shipping internally. Supports synchronous (zero data loss, needed for automatic failover) and asynchronous (lower latency for distant replicas). Secondaries can serve reads, backups, and reporting to offload the primary.
Q77:

Explain Log Shipping.

Mid

Answer

Log Shipping copies log backups to a secondary server for restore. Simple, reliable, but no automatic failover.

Quick Summary: Log shipping automatically copies and restores transaction log backups from a primary to one or more secondary servers. Simple and cheap — just a SQL Agent job. Secondaries are in restoring state (read-only via STANDBY). No automatic failover; manual intervention required. Predecessor to AlwaysOn AG.
Q78:

Explain Replication Types (Snapshot, Transactional, Merge).

Mid

Answer

Snapshot: Full copy; simple.

Transactional: Near real-time; best for reporting.

Merge: Bidirectional sync for disconnected systems.

Quick Summary: Snapshot replication copies a full snapshot of data to subscribers periodically — good for infrequently changed data. Transactional replication streams individual changes in near real-time — good for read scale-out. Merge replication allows both sides to make changes and merges them — complex, used for offline scenarios.
Q79:

What is Database Mirroring? Why is it deprecated?

Mid

Answer

Mirroring uses principal + mirror + witness for automatic failover. Deprecated in favor of Availability Groups.

Quick Summary: Database Mirroring sends transaction log blocks from principal to mirror in real-time. High-Safety mode (synchronous) = automatic failover with a witness. High-Performance mode (asynchronous) = potential data loss. Deprecated since SQL Server 2012 in favor of AlwaysOn AGs, which are more flexible and feature-rich.
Q80:

What are Stored Procedures and why are they preferred over sending raw SQL from applications?

Senior

Answer

Stored procedures are precompiled program units that live inside SQL Server. Instead of sending raw SQL text for every request, the application sends only the procedure name and parameters. SQL Server then executes pre-validated logic using a cached execution plan.

Key advantages over raw ad-hoc SQL:

  • Reduced network traffic: Only parameter values are sent, not large query strings.
  • Plan reuse: SQL Server can cache and reuse execution plans for procedures, reducing compilation overhead and stabilizing performance.
  • Centralized business logic: Data-related rules (validation, transformations, audit operations) are centralized and versioned at the database level, simplifying deployments and multi-application integration.
  • Security and least-privilege: Applications can be granted EXECUTE rights on procedures instead of direct table access, improving security boundaries and auditability.
  • Reduced SQL injection risk: The SQL logic is fixed inside the procedure; parameters are bound, greatly decreasing injection surfaces compared to concatenated SQL strings.
Quick Summary: Stored procedures send a single call over the network instead of many SQL strings — less round trips. Plans are compiled and cached on first run — subsequent calls skip parsing and optimization. Security is cleaner: grant EXECUTE on the procedure, not SELECT/INSERT on the tables. Logic stays in the database layer.
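A minimal example of the pattern, assuming hypothetical dbo.Orders table and AppRole database role:

```sql
CREATE PROCEDURE dbo.GetCustomerOrders
    @CustomerId int
AS
BEGIN
    SET NOCOUNT ON;
    SELECT OrderId, OrderDate, TotalAmount
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId;   -- parameter is bound, never concatenated
END;
GO

-- Least privilege: the app can execute the proc but has no direct table access
GRANT EXECUTE ON dbo.GetCustomerOrders TO AppRole;
```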
Q81:

Explain the difference between Stored Procedures and Functions in SQL Server.

Senior

Answer

Stored procedures and functions both encapsulate reusable logic, but they serve different purposes and behave differently inside SQL Server.

Stored procedures:

  • Primarily designed to perform actions: DML operations, transaction control, administrative tasks.
  • Can return multiple result sets and output parameters.
  • Cannot be directly used in SELECT, WHERE, or JOIN clauses.
  • Commonly used as API endpoints from the application layer.

Functions:

  • Intended for computation and value-returning logic.
  • Scalar functions return a single value; table-valued functions return a rowset.
  • Can be used inside SELECT, WHERE, and JOIN clauses like expressions or tables.
  • Have tighter restrictions: no explicit transaction control; most cannot perform data-modifying side effects.

In short: procedures orchestrate operations, while functions compute values that integrate with queries.

Quick Summary: Stored procedures can modify data, use transactions, output parameters, and don't have to return a result set. Functions must return a value, can't modify data (except table variables), and can be called inline in a SELECT. Inline TVFs act like parameterized views — they compose well; multi-statement TVFs often don't.
Q82:

Explain Scalar Functions vs Inline Table-Valued Functions vs Multi-Statement Table-Valued Functions.

Senior

Answer

SQL Server supports several function types, each with distinct performance and usage characteristics.

Scalar functions:

  • Return a single scalar value (e.g., INT, VARCHAR).
  • Evaluated per row when used in queries, often resulting in RBAR (Row-By-Agonizing-Row) execution.
  • Act as black boxes to the optimizer, frequently preventing parallelism and leading to severe performance issues.

Inline table-valued functions (iTVFs):

  • Return a table defined by a single RETURN (SELECT ...) statement.
  • Logically similar to parameterized views.
  • Fully inlined into the calling query, enabling the optimizer to generate efficient set-based plans with good cardinality estimates.
  • Usually the best-performing function type for set-based logic.

Multi-statement table-valued functions (mTVFs):

  • Declare an internal table variable and populate it across multiple statements.
  • By default, SQL Server assumes a fixed row estimate (historically 1 row), often leading to poor plans.
  • Limited statistics and cardinality information, frequently causing spills and suboptimal joins.

For performance-sensitive code, prefer inline TVFs over scalar or multi-statement TVFs whenever possible.

Quick Summary: Scalar: returns one value, called per row — can severely degrade performance in queries. Inline TVF: returns a table, SQL Server can inline it into the calling query and optimize it like a view — very efficient. Multi-statement TVF: builds a table row by row — opaque to the optimizer, often slow on large data.
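The syntactic difference is small but the performance difference is large. A sketch with hypothetical names:

```sql
-- Scalar UDF: invoked once per row, opaque to the optimizer
CREATE FUNCTION dbo.NetPrice (@price money, @discount decimal(4,2))
RETURNS money
AS
BEGIN
    RETURN @price * (1 - @discount);
END;
GO

-- Inline TVF: a single RETURN (SELECT ...), expanded into the calling query
CREATE FUNCTION dbo.OrdersForCustomer (@CustomerId int)
RETURNS TABLE
AS
RETURN (SELECT OrderId, OrderDate FROM dbo.Orders WHERE CustomerId = @CustomerId);
GO

SELECT * FROM dbo.OrdersForCustomer(42);  -- composes cleanly in joins and CROSS APPLY
```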
Q83:

Why can Scalar Functions severely degrade performance?

Senior

Answer

Scalar functions often look clean and reusable but can be hidden performance killers in production workloads.

Reasons they degrade performance:

  • They are executed once per row in a result set, turning set-based operations into row-by-row computation.
  • The optimizer cannot see inside the function body, treating it as a black box. This prevents many optimizations and accurate cardinality estimates.
  • In many SQL Server versions, the presence of scalar functions in the SELECT or WHERE list can disable parallelism, forcing single-threaded plans for otherwise parallelizable queries.
  • They frequently increase CPU usage and query duration dramatically on large datasets.

Mitigations include:

  • Refactoring to inline TVFs or pure set-based SQL.
  • Replacing scalar functions with computed columns (possibly indexed) when appropriate.
  • Inlining logic into the main query where feasible.
Quick Summary: A scalar UDF called in a WHERE or SELECT clause runs once per row — on a million-row table, that's a million individual function calls. The optimizer often can't see inside and treats it as a black box, blocking parallelism and estimates. Replace with inline TVFs or inlined expressions whenever possible for big tables.
Q84:

What are Views? Explain their benefits and limitations.

Senior

Answer

A view is a named, virtual table defined by a SELECT query. It does not store data itself (unless indexed) but presents a reusable query abstraction.

Benefits:

  • Abstraction and simplification: Hides complex joins and expressions behind a simple interface.
  • Security: Exposes only selected columns/rows while hiding underlying schemas and tables.
  • Reuse: Centralizes business logic or filter logic so that many queries can benefit from a single definition.
  • Schema evolution: Application code can query the view even if underlying tables change, as long as the view's contract is preserved.

Limitations:

  • Views can become stacked (views on views), leading to overly complex, hard-to-tune execution plans.
  • Not all views are updatable; complex joins, aggregates, and DISTINCT can prevent direct DML.
  • They do not inherently improve performance unless combined with indexed views or used to encapsulate optimal query patterns.
Quick Summary: Views simplify complex queries by saving them as named objects. They enforce consistent access patterns and can restrict column-level access. Limitations: non-indexed views don't store data (they run on every access), can't use ORDER BY without TOP, and add a layer of indirection that occasionally confuses the optimizer.
Q85:

What are Indexed Views and when should you use them?

Senior

Answer

Indexed views materialize the result of a view and maintain it on disk like a physical table. They are backed by a clustered index and optionally additional nonclustered indexes.

When to use:

  • Heavy, repetitive aggregations or joins over large datasets (e.g., reporting queries over transactional tables).
  • Scenarios where read performance is critical and data changes are relatively moderate.
  • To precompute expensive expressions and reduce CPU usage under analytic workloads.

Trade-offs:

  • Every insert/update/delete on the underlying tables must update the indexed view, increasing DML overhead.
  • Strict requirements apply (e.g., schema binding, deterministic expressions).
  • Can complicate troubleshooting if developers are unaware of their presence.

They are powerful for read-intensive workloads but should be used selectively and measured carefully in write-heavy systems.

Quick Summary: Indexed views (materialized views) physically store the query result on disk with an index. Reads are instant — no recomputation. But writes to the base tables must maintain the view, adding overhead. Best for aggregation queries that run frequently on slowly-changing data. Requires SCHEMABINDING and strict query form.
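A sketch of the required form, assuming a hypothetical dbo.Orders table with a non-nullable TotalAmount column:

```sql
CREATE VIEW dbo.vCustomerOrderTotals
WITH SCHEMABINDING                     -- mandatory for indexed views
AS
SELECT
    CustomerId,
    COUNT_BIG(*)     AS OrderCount,    -- COUNT_BIG(*) is required with GROUP BY
    SUM(TotalAmount) AS TotalSpent
FROM dbo.Orders
GROUP BY CustomerId;
GO

-- The unique clustered index is what actually materializes the view
CREATE UNIQUE CLUSTERED INDEX IX_vCustomerOrderTotals
    ON dbo.vCustomerOrderTotals (CustomerId);
```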
Q86:

What are Triggers? Explain AFTER vs INSTEAD OF Triggers.

Senior

Answer

Triggers are special stored procedures that automatically execute in response to DML events (INSERT, UPDATE, DELETE) or certain DDL operations.

AFTER triggers:

  • Fire after the base operation logically succeeds but before the transaction commits.
  • Often used for auditing, logging, or enforcing complex constraints that span multiple tables.
  • Run within the transaction, so failures can cause rollbacks and increase latency.

INSTEAD OF triggers:

  • Fire instead of the original DML operation.
  • Commonly used on views to simulate complex update logic or route changes to multiple underlying tables.
  • Give full control over how changes are applied.

Both types must be designed carefully to avoid recursion, hidden performance issues, and unexpected side effects.

Quick Summary: AFTER trigger fires after the DML completes — used for auditing, cascading changes. INSTEAD OF trigger replaces the operation entirely — used on views that can't be directly modified, or to intercept and transform inserts/updates. Both have access to inserted and deleted virtual tables showing what changed.
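A sketch of an AFTER trigger using the inserted and deleted tables, assuming hypothetical dbo.Orders and dbo.OrdersAudit tables:

```sql
CREATE TRIGGER trg_Orders_Audit
ON dbo.Orders
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.OrdersAudit (OrderId, OldAmount, NewAmount, ChangedAt)
    SELECT d.OrderId, d.TotalAmount, i.TotalAmount, SYSUTCDATETIME()
    FROM inserted AS i
    JOIN deleted  AS d ON d.OrderId = i.OrderId;  -- set-based: handles multi-row updates
END;
```

Note the join between inserted and deleted: trigger logic must always assume the DML touched multiple rows, not just one.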
Q87:

Why are Triggers generally not preferred in large systems?

Senior

Answer

Triggers are powerful but can introduce hidden complexity and performance problems in large systems.

Key concerns:

  • Hidden behavior: Business logic executes implicitly on DML, making the system harder to reason about. Developers may not realize why certain actions occur.
  • Chained effects: Triggers can call other triggers, causing cascading side effects that are difficult to debug and test.
  • Performance impact: Trigger logic runs inside the transaction. Long-running triggers increase lock durations and contention.
  • Maintainability: Logic spread across triggers and procedures leads to fragmented business rules and operational risk.

For these reasons, many teams prefer stored procedures, constraints, and explicit application-level logic over heavy trigger usage, reserving triggers for focused, unavoidable use cases (e.g., auditing).

Quick Summary: Triggers are invisible — they fire silently and can cause unexpected side effects. They slow down every DML operation on the table, even when you don't need the trigger behavior. They complicate debugging and make data changes non-obvious. Modern systems prefer explicit application logic or outbox patterns instead.
Q88:

Explain Window Functions and why they are essential in modern SQL.

Senior

Answer

Window functions (e.g., ROW_NUMBER(), RANK(), SUM() OVER (...)) allow calculations across sets of rows without collapsing them into a single result like GROUP BY does.

Why they are essential:

  • Enable ranking, running totals, moving averages, percentiles, and gap/density analysis in a single pass.
  • Reduce the need for self-joins and correlated subqueries, often resulting in cleaner and faster plans.
  • Can be combined with PARTITION BY and ORDER BY to support rich analytical queries directly in OLTP or reporting databases.
  • Help keep logic set-based and push computation into the database layer where it is highly optimized.
Quick Summary: Window functions operate across a set of rows related to the current row without collapsing them into groups. ROW_NUMBER, RANK, DENSE_RANK for ranking. LAG/LEAD for accessing neighboring rows. SUM/AVG OVER() for running totals. They replace complex self-joins and correlated subqueries with clean, readable SQL.
Q89:

Explain PARTITION BY and ORDER BY within Window Functions.

Senior

Answer

Window functions operate over a logical window of rows, defined by PARTITION BY and ORDER BY clauses in the OVER() expression.

PARTITION BY:

  • Divides the result set into groups (partitions) for independent calculations, similar to grouping in analytics.
  • Example: SUM(SalesAmount) OVER (PARTITION BY CustomerId) gives total sales per customer.

ORDER BY:

  • Defines the sequence of rows within each partition.
  • Required for ranking and running calculations like ROW_NUMBER(), cumulative sums, and moving averages.

In summary, PARTITION BY defines the scope and ORDER BY defines the sequence of the window.

Quick Summary: PARTITION BY divides rows into groups for the window function calculation — like GROUP BY but without collapsing rows. ORDER BY inside the window defines the order of rows within each partition for ranking or running calculations. Together: ROW_NUMBER() OVER (PARTITION BY Dept ORDER BY Salary) ranks employees within each department.
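Both clauses together, in one query against a hypothetical dbo.Employees table:

```sql
-- Rank within each department and compute a running total in a single pass
SELECT
    Dept,
    EmployeeName,
    Salary,
    ROW_NUMBER() OVER (PARTITION BY Dept ORDER BY Salary DESC) AS RankInDept,
    SUM(Salary)  OVER (PARTITION BY Dept ORDER BY Salary DESC
                       ROWS UNBOUNDED PRECEDING)               AS RunningTotal
FROM dbo.Employees;
```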
Q90:

What is Table Partitioning? How does SQL Server implement it?

Senior

Answer

Table partitioning horizontally splits a large table into smaller, manageable partitions based on a partition key (often date or range-based IDs). Logically it remains a single table, but physically data is separated.

SQL Server implementation:

  • Partition function: Defines boundary points for key ranges.
  • Partition scheme: Maps partitions to one or more filegroups.
  • The table or index is created ON the partition scheme, distributing rows across partitions based on the key.

Benefits:

  • Maintenance operations (index rebuilds, statistics, archiving) can be targeted at specific partitions.
  • Supports sliding window load/archive patterns via partition SWITCH operations.
  • Can improve query performance via partition elimination, especially for time-sliced workloads.

Partitioning is not automatically faster; it must align with query predicates and maintenance strategies to be effective.

Quick Summary: Table partitioning divides a large table into multiple physical filegroups based on a partition function (range of values) and partition scheme (maps ranges to filegroups). SQL Server can eliminate partitions that don't match the query filter. Makes archiving simple: switch old partitions out without touching other data.
Q91:

Explain Partition Elimination with an example scenario.

Senior

Answer

Partition elimination occurs when SQL Server restricts I/O to only those partitions that might contain relevant rows, instead of scanning all partitions.

Example: A table partitioned by OrderDate by month. A query filtered on OrderDate BETWEEN '2024-01-01' AND '2024-01-31' can read only the January partition if:

  • The filter is directly on the partition key.
  • There are no non-SARGable expressions on the partition column.
  • The data types and collation match exactly.

If functions like CONVERT() or different data types are used on the partition column, partition elimination may fail and all partitions may be scanned, losing the performance benefit.

Quick Summary: Partition elimination is when SQL Server skips partitions that can't contain matching rows. A query filtering WHERE OrderDate BETWEEN '2024-01-01' AND '2024-03-31' on a table partitioned by month only scans Q1 partitions and ignores the rest. This dramatically reduces I/O on large, date-partitioned tables.
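The scenario above can be sketched end to end. Object names are hypothetical, and every partition maps to PRIMARY for brevity (real deployments usually spread them across filegroups):

```sql
-- Monthly range partitioning
CREATE PARTITION FUNCTION pfOrderDate (date)
    AS RANGE RIGHT FOR VALUES ('2024-01-01', '2024-02-01', '2024-03-01');

CREATE PARTITION SCHEME psOrderDate
    AS PARTITION pfOrderDate ALL TO ([PRIMARY]);

CREATE TABLE dbo.Orders
(
    OrderId   bigint NOT NULL,
    OrderDate date   NOT NULL,
    Amount    money  NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY (OrderDate, OrderId)  -- key includes partition column
) ON psOrderDate (OrderDate);

-- Direct, SARGable filter on the partition key: only the January partition is read
SELECT SUM(Amount)
FROM dbo.Orders
WHERE OrderDate >= '2024-01-01' AND OrderDate < '2024-02-01';
```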
Q92:

What are the typical causes of poor execution plan performance?

Senior

Answer

Poor execution plans are usually symptoms of deeper issues in schema design, statistics, or query patterns.

Common causes:

  • Missing or inappropriate indexes: Forcing table scans or expensive lookups.
  • Stale or missing statistics: Leading to incorrect row estimates and wrong join strategies.
  • Parameter sniffing: Plan optimized for one parameter value, reused for others with different data distributions.
  • Scalar functions and multi-statement TVFs: Preventing optimization and parallelism.
  • Complex views over views: Obscure actual data access and create bloated plans.
  • Implicit conversions: Causing non-SARGable predicates or index misses.
  • RBAR patterns (cursors, loops): Neglecting set-based approaches.

Effective tuning often involves query simplification, better indexing, and statistics maintenance rather than just tweaking server settings.

Quick Summary: Common causes of poor plans: stale statistics (wrong row estimates), parameter sniffing (plan built for atypical values), missing indexes (forcing table scans), implicit conversions (can't use indexes), SARGability issues (function on column), and lock escalation or blocking distorting plan behavior.
Q93:

What is Parameter Sniffing and how do you handle it?

Senior

Answer

Parameter sniffing occurs when SQL Server compiles a plan using the initial parameter values it sees, then reuses that plan for subsequent executions. If data distribution is skewed, one plan may not fit all parameter scenarios.

Symptoms: Some calls are lightning fast, others very slow, using the same procedure and query shape.

Handling strategies:

  • Use OPTION (RECOMPILE) for highly skewed queries where compilation cost is acceptable.
  • Use OPTIMIZE FOR UNKNOWN or OPTIMIZE FOR (@param = ...) hints to choose more robust plans.
  • Capture parameters in local variables inside the procedure to discourage sniffing and produce more average plans.
  • Split logic into separate procedures for different parameter ranges if patterns are distinct.
  • Use Query Store to force stable plans when appropriate.
Quick Summary: Parameter sniffing caches a plan based on first-call parameters. If subsequent calls use different values, the cached plan may be inefficient. Fixes: OPTION(RECOMPILE) forces fresh compilation each time; OPTIMIZE FOR UNKNOWN makes the optimizer ignore sniffed values; local variables break plan reuse entirely.
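The two query-hint mitigations side by side, sketched in a hypothetical procedure (a real procedure would use one or the other, not both):

```sql
CREATE PROCEDURE dbo.GetOrdersByStatus @Status varchar(20)
AS
BEGIN
    -- 1) Fresh plan on every call; pays the compile cost each time
    SELECT OrderId, OrderDate FROM dbo.Orders WHERE Status = @Status
    OPTION (RECOMPILE);

    -- 2) Optimize for average density instead of the sniffed value
    SELECT OrderId, OrderDate FROM dbo.Orders WHERE Status = @Status
    OPTION (OPTIMIZE FOR UNKNOWN);
END;
```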
Q94:

Explain Execution Plan Caching and Reuse.

Senior

Answer

SQL Server uses a plan cache to store compiled execution plans so that subsequent executions of the same (or parameterized) queries can skip the compilation phase.

Benefits:

  • Reduces CPU overhead from repeated compilations.
  • Improves response time for frequently executed queries and stored procedures.

Challenges:

  • Poorly parameterized or ad-hoc queries can cause plan cache bloat, with many single-use plans.
  • Parameter sniffing issues stem from plan reuse across different parameter values.
  • Schema changes or statistics updates can invalidate plans, causing recompilation spikes.

Best practice is to use parameterized queries, monitor plan cache size and reuse patterns, and leverage Query Store to manage problematic plans.

Quick Summary: SQL Server caches compiled execution plans in the plan cache keyed by the query text hash. Matching queries reuse the plan without reparsing and reoptimizing. Plan reuse saves CPU but can backfire with parameter sniffing. sp_executesql with parameters enables safe plan reuse better than string concatenation.
Q95:

What is Schema Binding and why is it important?

Senior

Answer

Schema binding associates a view or function tightly with the underlying schema so that the referenced objects cannot be modified in incompatible ways without first changing the bound object.

Importance:

  • Required for indexed views and some computed columns, ensuring structural stability.
  • Prevents accidental changes (e.g., dropping or altering columns) that would silently break dependent logic.
  • Helps enforce contracts between application code, views, and functions.

In high-governance environments, schema binding is a tool to ensure that refactoring is deliberate and fully impact-assessed.

Quick Summary: SCHEMABINDING ties a function or view to the objects it references — you can't drop or alter those objects without first modifying the dependent object. Required for indexed views. It also enables the optimizer to inline simple functions more aggressively and prevents accidental schema changes from breaking dependencies.
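A minimal sketch of a schema-bound (and then indexed) view, assuming a hypothetical dbo.Orders table whose Total column is NOT NULL (indexed views require COUNT_BIG(*) and disallow SUM over nullable columns):

```sql
-- Schema-bound views must use two-part names and may not use SELECT *.
CREATE VIEW dbo.vw_OrderTotals
WITH SCHEMABINDING
AS
SELECT CustomerId,
       COUNT_BIG(*) AS OrderCount,
       SUM(Total)   AS TotalAmount
FROM dbo.Orders
GROUP BY CustomerId;
GO

-- SCHEMABINDING is a prerequisite for materializing the view:
CREATE UNIQUE CLUSTERED INDEX IX_vw_OrderTotals
    ON dbo.vw_OrderTotals (CustomerId);
GO

-- This now fails until the view is dropped or altered first:
-- ALTER TABLE dbo.Orders DROP COLUMN Total;
```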
Q96:

Explain the concept of set-based operations vs row-by-row operations.

Senior

Answer

SQL is fundamentally a set-based language: it is designed to operate on collections of rows at once, not one row at a time.

Set-based operations:

  • Leverage the optimizer to choose efficient algorithms (hash joins, merges, batch operations).
  • Use fewer logical and physical reads for bulk operations.
  • Scale more gracefully as data volume grows.

Row-by-row (RBAR) operations:

  • Implement logic via cursors or loops, processing one row at a time.
  • Usually lead to excessive context switching, locking overhead, and long runtimes.
  • Only justified for very complex, inherently procedural business rules.

Senior-level SQL design focuses on transforming requirements into set-based patterns whenever possible, often with window functions, joins, and properly designed queries.

Quick Summary: Set-based operations (UPDATE, INSERT, DELETE on sets of rows) leverage SQL Server's ability to process many rows in one pass using optimized algorithms and parallelism. Row-by-row processing (cursors, WHILE loops) forces sequential iteration — slow, can't parallelize, and often orders of magnitude slower on large data.
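The contrast can be made concrete with a hypothetical status update (schema names are illustrative):

```sql
-- RBAR: one UPDATE per row driven by a cursor — slow on large tables.
DECLARE @Id INT;
DECLARE c CURSOR LOCAL FAST_FORWARD FOR
    SELECT OrderId FROM dbo.Orders WHERE Status = 'Pending';
OPEN c;
FETCH NEXT FROM c INTO @Id;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderId = @Id;
    FETCH NEXT FROM c INTO @Id;
END;
CLOSE c;
DEALLOCATE c;

-- Set-based: the same work in one statement the optimizer can plan,
-- batch, and parallelize.
UPDATE dbo.Orders
SET Status = 'Shipped'
WHERE Status = 'Pending';
```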
Q97:

What are Cursors and why are they usually discouraged?

Senior

Answer

Cursors are database objects that allow row-by-row traversal of a result set, similar to iterators in procedural languages.

Reasons they are discouraged:

  • They process data one row at a time, leading to poor performance on large sets.
  • They often hold locks for a long time, reducing concurrency.
  • They introduce complex, procedural logic that is harder to maintain and test.
  • They typically have higher memory and tempdb overhead compared to set-based alternatives.

Cursors should be a last resort, used only when set-based solutions are impractical or impossible. In many cases, window functions, MERGE statements, or carefully written set-based updates can replace cursor logic.

Quick Summary: Cursors iterate over result sets one row at a time. For large datasets this is extremely slow — it bypasses SQL Server's set-based engine. Most cursor logic can be rewritten with set-based updates, window functions, or CTEs. Use cursors only when row-by-row processing is genuinely unavoidable, like calling a stored procedure per row.
Q98:

Explain the difference between Physical Reads, Logical Reads, and Page Life Expectancy.

Senior

Answer

These metrics help diagnose I/O and memory performance in SQL Server.

Logical reads: The number of 8KB pages read from the buffer cache (memory). High logical reads can indicate inefficient queries or missing indexes.

Physical reads: Pages read from disk because they were not found in the buffer cache. Physical I/O is orders of magnitude slower than memory access.

Page Life Expectancy (PLE): An indicator of how long pages stay in the buffer cache before being evicted. A consistently low PLE suggests memory pressure or inefficient queries repeatedly flushing the cache.

Senior engineers use these metrics together to determine whether to focus on query tuning, indexing, or adding memory/storage capacity.

Quick Summary: Physical Read: SQL Server fetched the page from disk (slow). Logical Read: page was found in the buffer pool (fast, in-memory). Logical reads measure how much work a query did in memory — lower is better. Page Life Expectancy (PLE) shows how long pages stay in the buffer pool — low PLE means memory pressure and frequent disk reads.
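Logical and physical reads for a single query can be observed with SET STATISTICS IO (the table name is illustrative):

```sql
SET STATISTICS IO ON;

SELECT COUNT(*)
FROM dbo.Orders
WHERE OrderDate >= '2024-01-01';

SET STATISTICS IO OFF;
-- The Messages tab then reports, per table, a line of the form:
-- Table 'Orders'. Scan count 1, logical reads N, physical reads M, ...
```

Comparing logical reads before and after a change is a stable way to confirm a tuning improvement, independent of server load.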
Q99:

What is SARGability and why is it critical for performance?

Senior

Answer

SARGability (Search ARGument-ability) describes whether a predicate can efficiently use an index to seek rows instead of scanning.

SARGable predicates:

  • Simple comparisons like Column = @Value, Column >= @Start AND Column <= @End.
  • Allow the optimizer to perform index seeks and range scans.

Non-SARGable patterns:

  • Applying functions to columns: WHERE LOWER(Col) = 'abc'.
  • Expressions on the left side: WHERE Col + 1 = 10.
  • Implicit conversions that change the column's data type.
  • Leading wildcards: LIKE '%abc'.

Ensuring predicates are SARGable is one of the most impactful techniques in query tuning: it allows indexes to be used effectively, minimizing reads and dramatically improving performance.

Quick Summary: SARGable (Search ARGument ABLE) expressions let SQL Server use an index to seek matching rows. Non-SARGable expressions wrap the column in a function or implicit conversion — forcing a scan. SARGable: WHERE LastName = 'Smith'. Non-SARGable: WHERE UPPER(LastName) = 'SMITH' or WHERE CAST(Id AS varchar) = '123'.
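A typical non-SARGable date filter and its SARGable rewrite, assuming an index on OrderDate (table name illustrative):

```sql
-- Non-SARGable: YEAR() wraps the column, so the index on OrderDate
-- cannot be seeked — SQL Server must scan and evaluate every row.
SELECT OrderId
FROM dbo.Orders
WHERE YEAR(OrderDate) = 2024;

-- SARGable rewrite: a half-open range on the raw column lets the
-- optimizer perform an index seek over exactly the matching rows.
SELECT OrderId
FROM dbo.Orders
WHERE OrderDate >= '2024-01-01'
  AND OrderDate <  '2025-01-01';
```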
Q100:

What Is TempDB and Why Is It Critical for SQL Server Performance?

Senior

Answer

TempDB is SQL Server’s global workspace used by all users and internal operations. It is essential because it handles:

  • Sorting operations (ORDER BY, GROUP BY)
  • Hash joins and aggregations
  • Row versioning for snapshot isolation
  • Temporary tables and table variables
  • Intermediate spill data during execution
  • Cursors and internal worktables

If TempDB becomes slow or suffers contention, the entire SQL Server instance slows down. Proper sizing, fast storage, and multiple data files are critical for performance.

Quick Summary: TempDB is a shared system database recreated on SQL Server restart. It holds temp tables, table variables, sort and hash spill buffers, row version store for snapshot isolation, and DBCC intermediate data. All user sessions share it — heavy concurrent use creates allocation contention on specific system pages.
Q101:

What Is TempDB Contention and What Causes It?

Senior

Answer

TempDB contention occurs when multiple threads compete for the same allocation pages (PFS, GAM, SGAM) or metadata structures.

Typical causes include:

  • Too few TempDB data files
  • Heavy use of temp tables/table variables
  • Large sorting or hashing operations
  • High row versioning pressure

Fixes: increase the data file count, keep files equally sized, optimize the workload, and reduce spills.

Quick Summary: TempDB contention typically hits PFS, GAM, and SGAM pages — special allocation pages that every new temp object allocation touches. In high-concurrency workloads, many sessions compete for the same pages. Fix: add multiple TempDB data files (one per CPU core, up to 8), enable trace flag 1118 on older versions, or use SQL Server 2016+, where these allocation improvements are automatic.
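Inspecting and expanding the TempDB file layout might look like this (the file path and sizes are illustrative and should match your storage):

```sql
-- Inspect the current TempDB file layout.
SELECT name, type_desc, size * 8 / 1024 AS size_mb
FROM tempdb.sys.database_files;

-- Add an equally sized data file; repeat until the file count
-- matches the recommendation (one per core, up to 8).
ALTER DATABASE tempdb
ADD FILE (
    NAME = tempdev2,
    FILENAME = 'T:\TempDB\tempdev2.ndf',
    SIZE = 4096MB,
    FILEGROWTH = 512MB
);
```

Keeping all files the same size matters because SQL Server's proportional-fill algorithm only spreads allocations evenly across equally sized files.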
Q102:

What Are Data Pages and Index Pages in SQL Server?

Senior

Answer

SQL Server stores all data in 8 KB pages.

  • Data pages store full table rows.
  • Index pages store B-tree navigation structures and pointers.

Understanding pages explains index fragmentation, logical reads, and physical I/O behavior.

Quick Summary: SQL Server stores data in 8KB pages. Data pages hold actual table rows. Index pages hold B-tree index entries at each level — leaf pages point to actual data rows (or contain them for clustered indexes). Each page belongs to one object and is managed by the buffer pool for caching.
Q103:

What Is the Buffer Pool and How Does SQL Server Use It?

Senior

Answer

The buffer pool is SQL Server’s memory area used for caching data and index pages.

  • Logical reads come from memory.
  • Physical reads occur only when data is missing in cache.
  • High buffer reuse improves performance.

If buffer pool is too small: more disk reads, lower PLE, and slower query execution.

Quick Summary: The buffer pool is SQL Server's main memory cache. When data is needed, SQL Server reads the page from disk into the buffer pool. Future reads hit memory (logical read) instead of disk (physical read). SQL Server uses all available memory for the buffer pool — giving it more RAM directly improves read performance.
Q104:

What Is Page Life Expectancy (PLE) and What Does It Indicate?

Senior

Answer

Page Life Expectancy (PLE) measures how long pages remain in the buffer pool before eviction.

High PLE = good memory health.
Low PLE = memory pressure, excessive physical reads.

Common causes of low PLE: bad queries, missing indexes, large scans, spills.

Quick Summary: PLE measures how long (in seconds) a data page stays in the buffer pool before being pushed out by new pages. A healthy PLE is typically 300+ seconds (5 minutes). Dropping PLE means SQL Server is constantly evicting and re-reading pages — a sign of memory pressure or queries doing large scans.
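The current PLE value can be read directly from the performance-counter DMV:

```sql
-- Page Life Expectancy, in seconds, per buffer node.
SELECT object_name, counter_name, cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';
```

Trend this value over time rather than reacting to a single reading; a sustained drop after a deployment usually points at a new query doing large scans.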
Q105:

How Does the SQL Server Transaction Log Work Internally?

Senior

Answer

The transaction log is an append-only structure that records all modifications. SQL Server uses Write-Ahead Logging (WAL):

  • Log records are written to disk first
  • Data pages update later

This ensures atomicity and durability. Log truncation depends on checkpoints and active transactions.

Quick Summary: Every change is written to the transaction log before hitting data pages (Write-Ahead Logging). The log is sequential — fast to write. At CHECKPOINT, dirty pages flush to disk. On crash, SQL Server reads the log: redo commits that weren't in data files, undo uncommitted transactions. This sequence ensures full recoverability.
Q106:

What Is the Purpose of Checkpoints in SQL Server?

Senior

Answer

Checkpoints flush dirty pages (modified pages) from memory to disk.

Benefits:

  • Shorter crash recovery time
  • Enables transaction log truncation
  • Reduces buffer pool pressure

Quick Summary: CHECKPOINT writes all dirty pages (modified in buffer pool but not yet on disk) to data files. This shortens crash recovery — SQL Server only needs to replay log records after the last checkpoint. Without checkpoints, recovery would require replaying the entire log from the beginning, which could take hours.
Q107:

What Are Dirty Pages and Clean Pages in SQL Server?

Senior

Answer

Dirty pages: modified in buffer pool but not yet persisted to disk.
Clean pages: identical to the disk version.

Checkpoints convert dirty pages to clean pages. Too many dirty pages increase recovery time and degrade performance.

Quick Summary: Dirty pages are buffer pool pages that have been modified but not yet written to disk. Clean pages are in sync with disk. CHECKPOINT flushes dirty pages. Tracking dirty vs. clean pages is how SQL Server knows exactly what needs to be written during a checkpoint and what needs to be redone after a crash.
Q108:

What Are Latches and How Do They Differ From Locks?

Senior

Answer

Latches protect internal memory structures and are non-transactional.

Locks protect logical data consistency and last for transaction duration.

Latch waits = engine pressure.
Lock waits = concurrency or blocking issues.

Quick Summary: Latches are lightweight, short-lived synchronization objects for physical consistency of in-memory pages — they don't appear in deadlock graphs and aren't tracked like locks. Locks are for logical, transaction-level concurrency — they persist for the transaction duration and show up in blocking queries and deadlocks.
Q109:

What Is Lock Escalation and How Does It Impact Performance?

Senior

Answer

Lock escalation converts many row/page locks into a single table lock to reduce overhead.

Impact:

  • Fewer locks = lower memory use
  • But higher blocking risk

Quick Summary: When individual row or page locks hit a threshold (~5,000), SQL Server escalates to a single table lock to reduce lock manager overhead. This improves efficiency but completely blocks other transactions from accessing the table. Problematic in high-concurrency workloads — can cause cascading blocking chains.
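Escalation behavior can be inspected and controlled per table (table name illustrative):

```sql
-- See the current escalation setting (TABLE is the default).
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE name = 'Orders';

-- Pick one of the following on a hot table:

-- Disable escalation entirely (lock memory use can grow instead).
ALTER TABLE dbo.Orders SET (LOCK_ESCALATION = DISABLE);

-- Or let escalation target the partition level on a partitioned table,
-- so one busy partition does not lock the whole table.
ALTER TABLE dbo.Orders SET (LOCK_ESCALATION = AUTO);
```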
Q110:

What Is the Cardinality Estimator and Why Is It Important?

Senior

Answer

The Cardinality Estimator (CE) predicts row counts used to choose join types, memory grants, and access paths.

Poor estimates → poor execution plans.

CE depends on statistics, data distribution, and query structure.

Quick Summary: The cardinality estimator predicts how many rows each plan step will produce based on statistics histograms. Wrong predictions lead to wrong join types, bad memory grants, and poor index choices. SQL Server 2014 introduced a new CE (model 120); older compatibility levels use the legacy model. Statistics quality is the root input.
Q111:

What Are Statistics in SQL Server and How Do They Affect Performance?

Senior

Answer

Statistics describe data distribution that SQL Server uses to estimate row counts.

Bad or outdated stats → misestimation → poor plans → slow queries.

Quick Summary: Statistics are metadata objects containing histograms of column value distribution. SQL Server uses them to estimate selectivity — how many rows a filter returns. Good statistics → accurate estimates → efficient plans. Stale statistics → wrong estimates → scans instead of seeks, wrong joins, wasted memory grants.
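Typical statistics maintenance and inspection, with illustrative table and index names:

```sql
-- Inspect the histogram behind a statistics object.
DBCC SHOW_STATISTICS ('dbo.Orders', IX_Orders_CustomerId);

-- Refresh statistics after a large data load.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Find the statistics with the most modifications since their last update.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name                   AS stats_name,
       sp.last_updated,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
ORDER BY sp.modification_counter DESC;
```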
Q112:

What Are Memory Grants and Why Do They Matter?

Senior

Answer

SQL Server allocates memory for sorting, hashing, and aggregations.

Incorrect grants:

  • Too small → spills to TempDB
  • Too large → resource starvation

Quick Summary: Memory grants are pre-allocated memory for sort and hash operations in a query. SQL Server estimates how much memory is needed based on cardinality estimates. Grant too small → spill to TempDB (slow). Grant too large → other queries starved of memory. Fix by updating statistics for accurate row estimates.
Q113:

What Is a Spill to TempDB and Why Does It Occur?

Senior

Answer

A spill happens when the memory grant is too small and SQL Server must offload intermediate data to TempDB.

Common causes: inaccurate cardinality estimation, heavy sorts, hash joins, window functions.

Quick Summary: A spill to TempDB happens when SQL Server allocated less memory than a sort or hash operation actually needed. Overflow data is written to TempDB disk, which is orders of magnitude slower than memory. Caused by underestimated row counts (bad statistics). Look for sort warnings and hash warnings in execution plans.
Q114:

What Is the SQL Server Query Processor?

Senior

Answer

The Query Processor:

  • Parses T-SQL
  • Binds objects
  • Optimizes using cost-based rules
  • Generates physical plans

Quick Summary: The query processor is the component that takes your SQL statement and executes it. It includes the parser (syntax check), algebrizer (binding to objects), query optimizer (plan selection), and execution engine (running the plan). The query processor is why well-written SQL runs fast — and why poorly-written SQL does not.
Q115:

What Are Logical and Physical Operators in Execution Plans?

Senior

Answer

Logical operators represent conceptual operations (join, project, filter).

Physical operators are real engine implementations (hash join, nested loop, merge join).

Quick Summary: Logical operators are the high-level operations in the plan: Filter, Join, Aggregate, Sort. Physical operators are the specific algorithms chosen: Nested Loops (for join), Hash Match (for join or aggregate), Sort (for ordering). One logical operation maps to one physical operator — understanding both helps you read execution plans.
Q116:

How Does SQL Server Choose Join Algorithms?

Senior

Answer

Join selection depends on:

  • Row count estimates
  • Sort order
  • Indexes
  • Estimated CPU and I/O cost

Quick Summary: SQL Server picks join algorithms based on input size and available indexes. Small outer table with indexed inner → Nested Loops. Both inputs sorted by join key → Merge Join (fast, no extra memory). Large unsorted inputs → Hash Match (memory-heavy, may spill). Missing indexes push everything toward Hash Match.
Q117:

What Is a Hash Match and Why Is It Heavy on TempDB?

Senior

Answer

A hash match creates an in-memory hash table for joins and aggregates.

If memory is insufficient → spills to TempDB → huge slowdown.

Quick Summary: Hash Match builds a hash table from the smaller input (build phase), then scans the larger input and probes the hash table (probe phase). Build phase requires memory — if underestimated, it spills to TempDB. Heavy on both memory and TempDB I/O. Common when indexes are missing and inputs are large.
Q118:

What Are Log Buffers and How Do Log Flushes Work?

Senior

Answer

Changes go to an in-memory log buffer. A log flush occurs:

  • On transaction commit
  • When buffer fills
  • On checkpoint

Slow log storage = slow commits.

Quick Summary: Log records are written to in-memory log buffers first, then flushed to disk on transaction commit, every 60KB of log buffer, or every second. COMMIT can't return until the log record is hardened to disk — that's what makes transactions durable. Log flushes are sequential writes — far faster than random data page I/O.
Q119:

What Is the Difference Between Row Store and Column Store in SQL Server?

Senior

Answer

Row store: row-by-row storage; best for OLTP.

Column store: column-by-column storage; best for analytics, high compression, parallel execution.

Quick Summary: Row store organizes data by rows — all columns for a row are stored together. Good for OLTP (point lookups, row modifications). Column store organizes data by column — ideal for analytics where queries read a few columns across millions of rows. Column store also compresses heavily and enables batch-mode processing.
Q120:

What Is Query Tuning and Why Is It Necessary in SQL Server?

Expert

Answer

Query tuning is the process of optimizing SQL statements so they consume fewer CPU cycles, memory, I/O, and locking resources. Poorly written queries slow down the entire SQL Server instance by overusing limited system resources.

The goal is predictable, scalable performance by reducing logical reads, avoiding unnecessary work, using optimal joins, minimizing spills, and improving concurrency.

Quick Summary: Query tuning is finding why a query is slow and making it faster without changing what it returns. SQL Server's optimizer is good but not perfect — bad statistics, missing indexes, parameter sniffing, and poorly-written SQL all require human intervention. Tuning is reading execution plans, fixing root causes, and measuring the improvement.
Q121:

What is Query Tuning, and why is it necessary in SQL Server?

Expert

Answer

Query tuning is the process of analyzing how SQL Server executes a query and optimizing it to minimize resource usage. Because SQL Server has finite CPU, memory, I/O, and concurrency capacity, an inefficient query can slow down the entire system. Tuning ensures optimal execution paths, fewer logical reads, reduced data movement, faster joins, and short lock durations. The goal is predictable, scalable performance.

Quick Summary: Query tuning improves query performance by identifying inefficiencies in execution plans. Even good databases accumulate slow queries as data grows, schemas change, or statistics drift. Without tuning, a query that ran in 10ms at 10K rows can take 10 seconds at 10M rows with the same code.
Q122:

What is the difference between Estimated and Actual Execution Plans?

Expert

Answer

The estimated plan shows SQL Server's predicted execution strategy based on statistics before running the query. The actual plan shows what really happened: row counts, spills, memory usage, and operator execution. Estimated plans are safe for production; actual plans reveal real bottlenecks like bad estimates, scans, or sorts. Both are essential for diagnosing performance issues.

Quick Summary: Estimated plan uses statistics to guess row counts and show the planned approach. Actual plan captures what really happened — actual rows, actual executions, actual time. The gap between estimated and actual row counts is the first thing to look at. A big gap means bad statistics are misleading the optimizer.
Q123:

Why do row estimation errors cause performance issues?

Expert

Answer

SQL Server uses estimated row counts to choose join types, memory grants, and index strategies. If estimates are inaccurate, SQL Server may choose poor plans. Overestimation causes excessive memory allocation, while underestimation causes spills, nested loops, and excessive lookups. Accurate cardinality estimation is fundamental to stable performance.

Quick Summary: Row estimation errors cause the optimizer to pick wrong strategies. If it thinks a join produces 100 rows but it produces 1 million, it might choose Nested Loops (correct for 100 rows) which is catastrophic at 1 million. Estimation errors cascade — each wrong estimate compounds into a progressively worse plan.
Q124:

What happens when a query spills to TempDB?

Expert

Answer

A query spills when SQL Server lacks sufficient memory for operations like sort or hash. SQL offloads intermediate results to TempDB, causing heavy disk I/O and slow execution. Frequent spills indicate bad estimates, missing indexes, or insufficient memory. Repeated spills degrade overall SQL Server performance.

Quick Summary: When a sort or hash operation needs more memory than its grant, SQL Server writes intermediate data to TempDB. This turns an in-memory operation into disk I/O — often 10-100x slower. You'll see Sort Warnings or Hash Warnings in the execution plan. Fix: update statistics, or use the OPTION(MIN_GRANT_PERCENT = n) query hint.
Q125:

What is Parameter Sniffing, and how does it affect performance?

Expert

Answer

Parameter sniffing allows SQL Server to reuse cached execution plans based on the first parameter values supplied. If those values are atypical, the plan may be inefficient for later executions. This leads to performance instability. Solutions include OPTION(RECOMPILE), OPTIMIZE FOR UNKNOWN, local variables, or plan forcing when appropriate.

Quick Summary: SQL Server caches a procedure's plan based on the parameter values from the first compilation. If those values were unusual (e.g., a rare customer ID), the plan is optimized for that case. When typical values run later, they use a plan that's wrong for them. Symptoms: same procedure runs fast sometimes, slow other times.
Q126:

Why is choosing the correct JOIN type critical for performance?

Expert

Answer

Join type determines how SQL Server matches rows. Nested loops are ideal for small sets, merge joins require sorted input, and hash joins work on large, unsorted sets. Incorrect join choices cause excessive I/O, CPU load, and slowdowns. SQL chooses join types based on row estimates and indexes, making accurate statistics essential.

Quick Summary: The wrong JOIN type can transform a millisecond query into a minutes-long one. Nested Loops on a large unsorted table causes repeated random I/O. Hash Match without enough memory spills to TempDB. Choose the join type the optimizer picks naturally with good indexes and statistics — forcing the wrong one manually is rarely right.
Q127:

What causes Hash Match operations, and why can they become bottlenecks?

Expert

Answer

Hash matches occur when SQL Server must build a memory-based hash table for joins or aggregates. They appear when inputs are unsorted or not indexed. They become bottlenecks when memory is insufficient, causing spills to TempDB. Hash operations can be CPU-intensive and degrade concurrency.

Quick Summary: Hash Match builds a hash table in memory and probes it — it's chosen when inputs are large and unsorted with no usable index. Bottlenecks: it needs a significant memory grant, and if that grant is insufficient, it spills to TempDB. Adding the right index often eliminates Hash Match entirely in favor of Nested Loops or Merge Join.
Q128:

How do SARGable expressions influence performance?

Expert

Answer

A SARGable expression allows SQL Server to use indexes efficiently. Non-SARGable predicates (functions on columns, mismatched types) force scans, increasing logical reads and slowing queries. SARGability is foundational for scaling queries on large datasets.

Quick Summary: SARGable expressions allow the optimizer to use index seeks by passing the filter condition directly into the index. Non-SARGable expressions (wrapping the column in a function, CAST, or CONVERT) force a table or index scan. Always filter on the raw column, and ensure data types match without implicit conversion.
Q129:

Why is reducing logical reads more important than reducing elapsed time?

Expert

Answer

Elapsed time varies with workload and server load, but logical reads measure actual data touched. Reducing logical reads consistently reduces CPU, I/O, and cache pressure. Logical reads are the primary, stable metric for performance tuning.

Quick Summary: Elapsed time includes waits — network, I/O, locks. Logical reads measure actual work done by the query engine. A query that takes 5 seconds but does 1 million logical reads has a different problem than one that takes 5 seconds waiting for a lock. Reducing logical reads means the query is fundamentally more efficient.
Q130:

What is the importance of covering indexes in performance tuning?

Expert

Answer

A covering index includes all columns a query needs. This eliminates key lookups and reduces I/O. Covering indexes provide dramatic performance improvements in OLTP systems by minimizing data access and improving plan efficiency.

Quick Summary: A covering index includes all columns a query needs — so SQL Server can answer it entirely from the index without touching the base table. This eliminates key lookups, reduces logical reads, and often dramatically cuts query time. Add SELECT columns via INCLUDE to cover without making the index key unnecessarily wide.
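A covering index for a specific query shape might be sketched like this (object names are illustrative):

```sql
-- Target query: filter on CustomerId, return OrderDate and Total.
-- Without INCLUDE, every matching row triggers a key lookup into
-- the clustered index to fetch OrderDate and Total.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON dbo.Orders (CustomerId)
INCLUDE (OrderDate, Total);

-- Now fully covered: answered from the index leaf level alone.
SELECT OrderDate, Total
FROM dbo.Orders
WHERE CustomerId = 42;
```

INCLUDE columns live only at the leaf level, so they cover the SELECT list without widening the index key or its upper B-tree levels.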
Q131:

When does SQL Server choose Index Seek over Index Scan?

Expert

Answer

SQL Server chooses an index seek when predicates match the index key order and are selective. Otherwise, SQL chooses a scan to avoid expensive random lookups. Seek-friendly index design is critical for optimal performance.

Quick Summary: SQL Server uses Index Seek when the filter is selective (returns a small fraction of rows) and the column is part of an index with a matching leading key. It uses Index Scan when the filter is broad or the index doesn't match well. SARGable predicates are what make seeks possible.
Q132:

What causes Key Lookups, and why are they expensive?

Expert

Answer

Key lookups occur when a non-clustered index lacks required columns. SQL must fetch missing columns from the clustered index for each qualifying row. With many rows, this becomes slow and I/O-heavy. Solutions include covering indexes or query redesign.

Quick Summary: Key lookups happen when a non-clustered index satisfies the WHERE clause, but the SELECT needs columns not in the index. SQL Server fetches those from the clustered index — one random I/O per row. For 100 rows it's fine; for 10,000 rows it's painful. Fix: add INCLUDE columns to the non-clustered index to cover the query.
Q133:

What is the purpose of Statistics in SQL Server performance?

Expert

Answer

Statistics describe column data distribution. SQL Server uses them to estimate row counts and choose efficient plans. Outdated or missing statistics lead to poor estimates and unstable performance. Keeping statistics fresh is essential for reliable query optimization.

Quick Summary: Statistics contain column value histograms used by the optimizer to estimate result set sizes. Good statistics → accurate row estimates → good plan choices. The optimizer doesn't run your query to see how many rows come back — it trusts its statistics. Out-of-date stats from large data loads are a top cause of performance regression.
Q134:

Why do implicit conversions degrade performance?

Expert

Answer

Implicit conversions prevent index usage by forcing SQL Server to convert values at runtime. This leads to scans instead of seeks, higher CPU usage, and slower joins. Matching data types between columns and parameters is vital.

Quick Summary: Implicit conversion happens when two columns being compared have different data types — SQL Server silently converts one. This often makes the comparison non-SARGable, forcing a full scan instead of a seek. Common: comparing nvarchar column to a varchar parameter. Always match data types in joins and WHERE clauses.
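A sketch of the classic case, assuming a hypothetical Orders.OrderNumber column of type VARCHAR(20):

```sql
-- NVARCHAR has higher type precedence than VARCHAR, so the N'...'
-- literal forces CONVERT() on the column itself -> non-SARGable -> scan.
SELECT OrderId
FROM dbo.Orders
WHERE OrderNumber = N'SO-1001';

-- Matching the column's type keeps the predicate SARGable -> index seek.
SELECT OrderId
FROM dbo.Orders
WHERE OrderNumber = 'SO-1001';
```

Execution plans flag the first form with a CONVERT_IMPLICIT warning on the seek predicate.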
Q135:

How does high fragmentation affect performance?

Expert

Answer

Fragmentation scatters index pages, causing more I/O during seeks and reducing read-ahead efficiency. High fragmentation slows down queries and increases disk activity. Regular index maintenance helps restore performance.

Quick Summary: Index fragmentation means leaf pages are out of logical order. When SQL Server does a range scan, it should read pages sequentially — fragmentation turns that into random I/O. Over 30% fragmentation is worth addressing. Reorganize (online, light) for moderate fragmentation; Rebuild (offline or online with Enterprise) for heavy fragmentation.
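Measuring and addressing fragmentation might look like this (index and table names are illustrative):

```sql
-- Measure fragmentation per index in the current database.
SELECT OBJECT_NAME(ips.object_id)      AS table_name,
       i.name                          AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 10;

-- Light, always-online fix for moderate fragmentation.
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REORGANIZE;

-- Full rebuild for heavy fragmentation (ONLINE = ON needs Enterprise edition).
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD WITH (ONLINE = ON);
```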
Q136:

Why is TempDB optimization critical for performance tuning?

Expert

Answer

TempDB handles sorts, hashes, temp tables, versioning, and spill operations. Heavy workloads can cause latch contention and I/O bottlenecks. Optimizing TempDB improves concurrency and stabilizes system-wide performance.

Quick Summary: TempDB is shared by all sessions — temp tables, sorts, hash joins, row versioning all land there. Contention on TempDB allocation pages throttles the whole server. Spill-to-TempDB from undersized memory grants is often the hidden bottleneck in slow queries. Multiple data files and fast NVMe storage are baseline requirements.
Q137:

What are Memory Grants, and why do they impact performance?

Expert

Answer

Memory grants are allocations for joins, sorts, and aggregates. Overestimated grants waste memory; underestimated grants cause spills to TempDB. Accurate row estimation and indexing ensure proper memory usage.

Quick Summary: Memory grants are pre-allocated for sort and hash operations. Underallocated → spill to TempDB. Overallocated → grants lock up RAM, starving other queries. Both situations degrade overall throughput. The root fix is always accurate statistics — the optimizer sizes grants based on estimated row counts.
Q138:

How does excessive recompilation affect query performance?

Expert

Answer

Excessive recompilation forces SQL Server to repeatedly rebuild execution plans, increasing CPU consumption and causing unpredictable performance. Causes include volatile schema, unstable statistics, and widely varying parameters.

Quick Summary: Recompilation forces SQL Server to rebuild an execution plan from scratch. Some recompilation is healthy (catching plan regressions). Excessive recompilation wastes CPU — each compilation takes time. Caused by: schema changes in temp tables, statistics updates, or using OPTION(RECOMPILE) on high-frequency queries.
Q139:

Why do large IN clauses degrade performance?

Expert

Answer

Large IN lists complicate cardinality estimation and increase parsing overhead. SQL Server may misestimate row counts, leading to inefficient join choices. Using temp tables or joins improves accuracy and performance.

Quick Summary: Large IN lists force SQL Server to evaluate many OR conditions or build a large hash structure. For very long lists (1000+ values), performance degrades because the optimizer can't estimate the combined selectivity well. Alternative: load the values into a temp table and JOIN — SQL Server handles joins far more efficiently than giant IN lists.
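The temp-table alternative can be sketched as follows (names are illustrative; in practice the list would be bulk-loaded or passed as a table-valued parameter):

```sql
-- Instead of: WHERE CustomerId IN (1, 2, 3, ... /* thousands more */)
CREATE TABLE #TargetCustomers (CustomerId INT PRIMARY KEY);

INSERT INTO #TargetCustomers (CustomerId)
VALUES (1), (2), (3);  -- load the real list here

-- A join gives the optimizer statistics on the temp table and a
-- proper join algorithm instead of a giant OR expansion.
SELECT o.OrderId, o.Total
FROM dbo.Orders AS o
JOIN #TargetCustomers AS t
  ON t.CustomerId = o.CustomerId;
```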
Q140:

What is the significance of identifying the most expensive operators in execution plans?

Expert

Answer

Most query time is spent in one or two heavy operators like hash joins, sorts, or scans. Identifying these bottlenecks allows focused tuning efforts, resulting in faster optimization and greater performance gains.

Quick Summary: The most expensive operator in an execution plan (highest % cost) is where the most work is happening. Focus tuning there first — it's the bottleneck. A 95% cost table scan means adding an index there will have far more impact than optimizing anything else in the plan. Always fix the biggest problem first.
Q141:

What are the most common root causes of slow queries in production?

Expert

Answer

Slow production queries usually result from:

  • Missing or poorly designed indexes
  • Incorrect cardinality estimates
  • High fragmentation
  • Outdated statistics
  • Parameter sniffing issues
  • Excessive key lookups
  • TempDB contention or spills
  • Implicit conversions
  • Heavy sorts/hashes causing CPU pressure
  • Blocking or deadlocks

Most performance issues are caused by a combination rather than a single factor.

Quick Summary: Common root causes of slow queries: missing or outdated indexes, stale statistics, parameter sniffing, implicit data type conversions, non-SARGable predicates, lock blocking, memory pressure (spills), excessive recompilation, and poorly written loops or cursors instead of set-based operations.
Q142:

How do you identify which query is slowing down the system?

Expert

Answer

To identify slow queries, examine:

  • sys.dm_exec_query_stats – CPU, I/O, execution time
  • sys.dm_exec_requests – currently running queries
  • sys.dm_tran_locks – blocking analysis
  • Extended Events – deep tracing
  • Query Store – historical regressions

This helps pinpoint which query and which operator are causing the slowdown.

Quick Summary: Start with sys.dm_exec_query_stats sorted by total_worker_time or total_elapsed_time to find the costliest queries. Query Store gives historical data with plan regression tracking. Wait stats (sys.dm_os_wait_stats) show whether slowdowns are CPU, I/O, or lock-related. Always measure before tuning — don't guess.
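
A sketch of the starting point described above — the costliest cached statements by cumulative CPU:

```sql
-- Top 10 statements by total CPU since their plans were cached
SELECT TOP (10)
    qs.total_worker_time / 1000 AS total_cpu_ms,
    qs.execution_count,
    qs.total_elapsed_time / qs.execution_count / 1000 AS avg_elapsed_ms,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```

Swap the ORDER BY to total_elapsed_time or total_logical_reads to rank by wall-clock time or I/O instead of CPU.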
Q143:

How do you detect blocking issues and understand who is blocking whom?

Expert

Answer

Blocking is identified using:

  • sys.dm_exec_requests – blocking_session_id
  • sys.dm_os_waiting_tasks – wait chains
  • Activity Monitor – high-level visualization

Blocking chains often originate from a long-running transaction holding critical locks.

Quick Summary: Use sys.dm_exec_requests to see currently waiting sessions and what they're waiting for. sys.dm_os_waiting_tasks shows the blocking chain — who is waiting on whom. The session at the head of the chain with no wait is the blocker. That's the transaction to investigate — long-running or uncommitted transactions are the usual culprit.
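
The blocking check above can be sketched as a single DMV query:

```sql
-- Sessions currently blocked, who is blocking them, and what they are running
SELECT
    r.session_id          AS blocked_session,
    r.blocking_session_id AS blocking_session,
    r.wait_type,
    r.wait_time           AS wait_ms,
    st.text               AS blocked_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.blocking_session_id <> 0;
```

Follow the blocking_session_id values up the chain; the session that appears only as a blocker (never as blocked) is the head of the chain.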
Q144:

What is the difference between blocking and deadlocking?

Expert

Answer

Blocking occurs when one transaction holds a lock and others wait behind it.

Deadlock occurs when two transactions wait on each other, creating a cycle.

SQL Server detects deadlocks and kills one transaction (the victim) to resolve the cycle.

Quick Summary: Blocking: transaction A waits for transaction B to release a lock — eventually resolves when B commits or rolls back. Deadlock: two transactions each hold what the other needs — neither can proceed without intervention. SQL Server resolves deadlocks by killing one victim. Blocking is slow; deadlocks are forced terminations.
Q145:

Which indexing mistakes cause the most issues in high-traffic systems?

Expert

Answer

Common indexing mistakes include:

  • Too many non-clustered indexes
  • Lack of covering indexes for hot queries
  • Poor clustered key choice (wide, GUID, non-sequential)
  • Duplicate or overlapping indexes
  • Not using filtered indexes

These increase I/O, fragmentation, and reduce plan efficiency.

Quick Summary: Top indexing mistakes: too many indexes (slows writes, wastes space), wrong leading column (index unused), missing covering columns (causes key lookups), ignoring fill factor (causes page splits), never maintaining fragmentation, and duplicate indexes that the optimizer ignores while still costing maintenance overhead.
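
One of the mistakes above — missing covering columns — has a direct fix. A sketch with illustrative names:

```sql
-- A hot query filtering on Status and returning OrderDate and Total:
-- SELECT OrderDate, Total FROM Orders WHERE Status = 'Pending';

-- Covering index: Status is the seek key; the other columns ride along
-- in the leaf pages via INCLUDE, so no key lookup is needed.
CREATE NONCLUSTERED INDEX IX_Orders_Status_Covering
ON Orders (Status)
INCLUDE (OrderDate, Total);
```

INCLUDE columns are stored only at the leaf level, so the index stays narrower than putting every column in the key.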
Q146:

Why do large table scans cause severe slowdowns?

Expert

Answer

Large scans:

  • Generate heavy I/O
  • Evict useful pages from buffer pool
  • Increase CPU usage
  • Slow concurrent users

One bad scan can impact dozens of users in a production environment.

Quick Summary: Large table scans read every page in the table — for a 100GB table, that's potentially GBs of I/O. This pushes useful pages out of the buffer pool, increases disk read time, and spikes CPU. In a shared server, one scan-heavy query can degrade performance for all other sessions simultaneously.
Q147:

Which issues cause sudden regression in query performance?

Expert

Answer

Sudden regressions often occur due to:

  • Updated statistics
  • Parameter sniffing plan changes
  • Index changes
  • Fragmentation spikes
  • Failover causing cold cache
  • Data growth affecting plan shapes

Quick Summary: Plan regressions after stats updates (optimizer uses new distribution data), index rebuilds (which also refresh statistics), database compatibility level changes (new cardinality estimator model), and SQL Server upgrades. Queries that were fast can suddenly get a different plan that's wrong for current data. Query Store is the primary tool to detect and revert regressions.
Q148:

What is the importance of Query Store in high-traffic systems?

Expert

Answer

Query Store provides:

  • Plan history recording
  • Runtime performance metrics
  • Ability to revert to stable plans
  • Insight into parameter sniffing behavior

It acts as a "black box recorder" for SQL Server.

Quick Summary: Query Store tracks every query's execution history — plan, duration, CPU, I/O, logical reads. In high-traffic systems it enables: detecting plan regressions instantly, forcing known-good plans, identifying top resource consumers, and understanding query behavior over time without relying on memory-only DMVs that reset on restart.
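
A sketch of enabling Query Store and forcing a known-good plan (database name and the query_id/plan_id values are placeholders — look up the real IDs in the Query Store reports first):

```sql
-- Enable Query Store on a database (SQL Server 2016+)
ALTER DATABASE SalesDB SET QUERY_STORE = ON;
ALTER DATABASE SalesDB SET QUERY_STORE (OPERATION_MODE = READ_WRITE);

-- After identifying a regressed query in the Query Store reports,
-- pin the known-good plan by its query_id and plan_id:
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;
```

Forced plans persist across restarts and failovers, which is what makes Query Store safer than plan guides for handling regressions.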
Q149:

Why should TempDB be placed on fast storage?

Expert

Answer

TempDB handles:

  • Sorts
  • Hash operations
  • Version store
  • Spills
  • Temporary objects

Slow TempDB I/O becomes a bottleneck for the entire SQL Server instance.

Quick Summary: TempDB is heavily used for sort spills, hash join overflows, and temp table operations. Slow TempDB I/O directly slows every query that spills. NVMe SSDs for TempDB are a standard recommendation. Separating TempDB from data/log files on dedicated drives avoids I/O contention between user queries and TempDB operations.
Q150:

Why are wide clustered keys a long-term performance problem?

Expert

Answer

The clustered key is included in all non-clustered indexes. Wide keys increase:

  • Storage footprint
  • Read/write cost
  • Fragmentation
  • Memory usage

Narrow, sequential keys are ideal.

Quick Summary: Wide clustered keys (GUIDs, composite multi-column keys) are included in every non-clustered index as the row locator. A 16-byte GUID key in a table with 10 non-clustered indexes multiplies key storage significantly. Wide keys also increase page splits and fragmentation over time. An integer identity key keeps everything compact.
Q151:

How do you troubleshoot excessive CPU usage in SQL Server?

Expert

Answer

Troubleshooting steps:

  • Identify high-CPU queries in DMVs
  • Check plans for sorts, hashes, conversions
  • Validate statistics
  • Review index usage
  • Investigate parallelism waits
  • Check server MAXDOP settings

Quick Summary: For CPU issues: check sys.dm_exec_query_stats for queries with high total_worker_time. Look for missing indexes, non-SARGable predicates, or excessive recompilation. Execution plans showing large sorts, hash joins, or scans on big tables are CPU-heavy. Well-targeted indexes cut the rows processed, reducing both CPU and I/O.
Q152:

What causes excessive memory usage in SQL Server?

Expert

Answer

Causes include:

  • Large memory grants
  • Hash and sort operations
  • Poor cardinality estimates
  • Too many concurrent queries
  • Buffer pool caching large volumes of data pages

Quick Summary: Excessive memory usage causes: buffer pool pressure (too many large scans pushing data pages in), oversized memory grants (queries holding memory others need), and plan cache bloat (thousands of single-use plans from non-parameterized queries). Use sys.dm_os_memory_clerks to find which component is consuming the most.
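
The memory-clerk check mentioned above can be sketched as:

```sql
-- Which memory clerks hold the most memory right now
SELECT TOP (10)
    type,
    SUM(pages_kb) / 1024 AS memory_mb
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY memory_mb DESC;
```

MEMORYCLERK_SQLBUFFERPOOL at the top is normal (data page cache); a large CACHESTORE_SQLCP suggests plan cache bloat from ad-hoc, non-parameterized queries.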
Q153:

Why are CROSS JOINs dangerous in real systems?

Expert

Answer

CROSS JOINs produce Cartesian products. They may:

  • Explode row counts
  • Consume massive CPU
  • Stress memory
  • Cause system freezes

They must be used intentionally and carefully.

Quick Summary: CROSS JOIN produces every combination of rows from two tables — N × M rows. A CROSS JOIN between two 10,000-row tables produces 100 million rows. Usually unintentional (a forgotten JOIN condition). In real systems, accidental CROSS JOINs can bring down a server by overwhelming memory, CPU, and TempDB simultaneously.
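
A minimal illustration of the row explosion, with hypothetical table names:

```sql
-- Explicit Cartesian product: every row of A paired with every row of B.
-- With 10,000 rows in each table this returns 100,000,000 rows.
SELECT a.ID, b.ID
FROM TableA AS a
CROSS JOIN TableB AS b;

-- The same explosion happens accidentally with old-style comma joins
-- when the join condition is forgotten:
SELECT a.ID, b.ID
FROM TableA AS a, TableB AS b;  -- missing: WHERE a.ID = b.A_ID
```

The explicit CROSS JOIN keyword at least signals intent; the comma form hides the mistake, which is one reason ANSI JOIN syntax is preferred.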
Q154:

Which situations require table partitioning?

Expert

Answer

Use partitioning when:

  • Tables contain hundreds of millions of rows
  • Fast archiving is required
  • Hot/cold data separation is needed
  • Maintenance takes too long

Quick Summary: Use partitioning when: tables exceed 50-100GB and queries naturally filter on a date or range column, you need to archive/purge old data efficiently (partition switching is instant), or you want to separate hot and cold data across different storage tiers. Smaller tables rarely benefit enough to justify the complexity.
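
A sketch of monthly range partitioning on a date column (all object names are illustrative):

```sql
-- Boundary values define the partitions; RANGE RIGHT puts each boundary
-- date in the partition to its right (the start of each month's partition).
CREATE PARTITION FUNCTION pf_OrderMonth (DATE)
AS RANGE RIGHT FOR VALUES ('2024-01-01', '2024-02-01', '2024-03-01');

-- Map every partition to a filegroup (here, all to PRIMARY for simplicity)
CREATE PARTITION SCHEME ps_OrderMonth
AS PARTITION pf_OrderMonth ALL TO ([PRIMARY]);

-- The partitioning column must be part of the clustered key
CREATE TABLE OrdersPartitioned (
    OrderID   BIGINT         NOT NULL,
    OrderDate DATE           NOT NULL,
    Total     DECIMAL(10, 2) NOT NULL,
    CONSTRAINT PK_OrdersPartitioned
        PRIMARY KEY CLUSTERED (OrderDate, OrderID)
) ON ps_OrderMonth (OrderDate);
```

In production, each partition would typically map to its own filegroup so cold months can sit on cheaper storage.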
Q155:

How does SQL Server handle large deletes efficiently?

Expert

Answer

Large deletes cause heavy logging, lock escalation, and blocking. Solutions:

  • Batch deletes
  • Partition switching
  • Mark-and-archive patterns

Quick Summary: Delete large data in batches — DELETE TOP (10000) WHERE ... in a loop — instead of one massive DELETE. A single large delete holds locks for a long time, blocks other sessions, and generates heavy transaction log growth. Batching keeps transactions short, log growth manageable, and blocking minimal.
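
The batched pattern above, sketched with illustrative names:

```sql
-- Delete old rows in small batches to keep locks and log growth bounded
DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    DELETE TOP (10000)
    FROM OrderHistory
    WHERE OrderDate < '2020-01-01';

    SET @rows = @@ROWCOUNT;  -- 0 when nothing is left to delete
END;
```

Each iteration is its own short transaction, so lock escalation is avoided and the log can be reused between batches (in SIMPLE recovery) or backed up (in FULL).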
Q156:

Why do developers use OUTPUT clause for auditing?

Expert

Answer

The OUTPUT clause returns affected rows without re-querying the table. It is efficient for:

  • Auditing changes
  • Logging
  • Data replication

Quick Summary: The OUTPUT clause returns rows affected by INSERT, UPDATE, DELETE, or MERGE — directly from the statement without an extra SELECT. For auditing, a DELETE with OUTPUT deleted.* INTO an audit table captures exactly what was removed, atomically in the same transaction, with no separate read needed.
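
A minimal audit sketch using OUTPUT ... INTO (table and column names are illustrative):

```sql
-- Audit table; DeletedAtUtc fills in automatically via the default
CREATE TABLE OrderAudit (
    OrderID      BIGINT,
    CustomerID   INT,
    Total        DECIMAL(10, 2),
    DeletedAtUtc DATETIME2 DEFAULT SYSUTCDATETIME()
);

-- The deleted pseudo-table exposes each removed row; OUTPUT ... INTO
-- writes it to the audit table in the same transaction as the delete.
DELETE FROM Orders
OUTPUT deleted.OrderID, deleted.CustomerID, deleted.Total
INTO OrderAudit (OrderID, CustomerID, Total)
WHERE Status = 'Cancelled';
```

If the delete rolls back, the audit rows roll back with it — which is exactly the atomicity a trigger-free audit needs.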
Q157:

What is the impact of missing foreign keys?

Expert

Answer

Missing foreign keys:

  • Prevent automatic referential integrity
  • Hurt query optimization because relationships are unknown
  • Cause poor join strategy selection

Quick Summary: Missing foreign keys mean the database won't prevent orphaned rows — orders with no customer, invoice lines with no invoice. The application must enforce referential integrity, and if it misses a case, the data silently corrupts. Missing FKs also deprive the optimizer of information: SQL Server uses trusted FK metadata to eliminate redundant joins and simplify plans.
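
Declaring the relationship is a one-line fix (names illustrative):

```sql
-- Orphaned Orders rows become impossible once this constraint exists.
-- WITH CHECK (the default) validates existing rows, making the FK
-- "trusted" so the optimizer can rely on it.
ALTER TABLE Orders WITH CHECK
ADD CONSTRAINT FK_Orders_Customers
    FOREIGN KEY (CustomerID)
    REFERENCES Customers (CustomerID);
```

An FK added WITH NOCHECK skips validation but is marked untrusted, and the optimizer ignores it for join elimination.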
Q158:

Why minimize the number of columns in non-clustered indexes?

Expert

Answer

Extra columns increase:

  • Index size
  • Maintenance overhead
  • Memory usage

Indexes should be narrow and purpose-specific.

Quick Summary: Every non-clustered index duplicates the clustered key columns as the row locator. Wide index keys = larger index pages = more I/O to scan the index. Narrow indexes fit more entries per page, require less memory in the buffer pool, and are faster to scan. Only index the columns you actually need for your queries.
Q159:

When should filtered indexes be used?

Expert

Answer

Filtered indexes are ideal when:

  • Queries target selective subsets
  • Columns contain sparse values
  • Large tables need optimized predicate performance

Quick Summary: Use filtered indexes when queries consistently target a predictable subset: active records (IsDeleted = 0), a specific status, or a regional subset. The index is smaller, statistics are more accurate for that subset, and seek performance is higher. Particularly effective when the indexed subset is a small percentage of the total rows.
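
A minimal filtered-index sketch for the active-records case (names illustrative):

```sql
-- Index only the live rows; queries whose WHERE clause includes
-- IsDeleted = 0 can seek this much smaller index.
CREATE NONCLUSTERED INDEX IX_Orders_Active
ON Orders (CustomerID, OrderDate)
WHERE IsDeleted = 0;
```

If 95% of rows are soft-deleted, this index is roughly 5% of the size of an unfiltered equivalent, with correspondingly cheaper maintenance.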
Q160:

What steps do you take before rewriting a slow query?

Expert

Answer

Before rewriting a query:

  • Check execution plan
  • Validate statistics
  • Check indexes
  • Identify costly operators
  • Try small predicate adjustments
  • Confirm rewrite is necessary

Most problems are fixed by plan adjustments, not rewrites.

Quick Summary: Before rewriting: capture the actual execution plan and compare estimated vs actual rows. Check wait stats and blocking. Look at logical reads (SET STATISTICS IO ON). Identify the highest-cost operators. Update statistics and check for missing indexes first — often fixes the query without any rewrite. Rewrite only after understanding the root cause.
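
The measurement step above can be sketched as (the query itself is a placeholder):

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the slow query and read the Messages tab:
-- "logical reads" per table shows where the I/O goes;
-- "CPU time" vs "elapsed time" shows compute cost vs waits.
SELECT c.CustomerID, COUNT(*) AS OrderCount
FROM Orders AS o
JOIN Customers AS c ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Capture these numbers before and after any change; a rewrite that does not reduce logical reads or CPU time has not actually fixed anything.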

