MS SQL Interview Cheat Sheet

1. What is MS SQL Server and where is it commonly used?

MS SQL Server is Microsoft's relational database system built for enterprise use. It stores, retrieves, and manages structured data reliably. You'll find it in finance, healthcare, SaaS apps, ERPs, and large websites where data consistency and performance really matter.

Entry Full Answer →

2. Explain Database, Table, Row, and Column in relational terms.

A database holds all your data objects. A table organizes data into rows and columns, like a spreadsheet. A row is one record (one person, one order). A column is a specific attribute of that record (name, price, date). Together, they form the foundation of relational data storage.

Entry Full Answer →

3. What is a Primary Key and why is it important?

A primary key uniquely identifies each row in a table â€” no two rows can have the same value, and it can't be NULL. It's the anchor for relationships between tables and speeds up lookups. Without it, finding or linking a specific record becomes messy and unreliable.

Entry Full Answer →

4. What is a Foreign Key and what problem does it solve?

A foreign key in one table points to the primary key in another table. It enforces referential integrity â€” you can't add an order for a customer that doesn't exist. It's how relational databases maintain relationships between entities without duplicating data.

Entry Full Answer →

5. What are Constraints in MS SQL?

Constraints are rules you enforce at the database level. Common ones: NOT NULL (column can't be empty), UNIQUE (no duplicates), CHECK (value must meet a condition), DEFAULT (fallback value), PRIMARY KEY, and FOREIGN KEY. They catch bad data before it ever lands in the table.

Entry Full Answer →

6. What is Normalization and why do we use it?

Normalization organizes data to reduce redundancy and keep things consistent. Instead of storing a customer's name in every order row, you store it once and reference it by ID. This prevents update anomalies â€” you change data in one place, not a hundred.

Entry Full Answer →

7. Explain 1NF, 2NF, and 3NF briefly.

1NF: each cell holds one atomic value, no repeating groups. 2NF: no partial dependency â€” every non-key column depends on the whole primary key. 3NF: no transitive dependency â€” non-key columns depend only on the primary key, not on each other. Each level builds on the previous.

Entry Full Answer →

8. What is Denormalization and when is it used?

Denormalization intentionally adds redundancy to speed up reads. Instead of joining three tables every time, you store the data together. Used in reporting databases, dashboards, and read-heavy systems where join performance is a bottleneck and write consistency is less critical.

Entry Full Answer →

9. What is an Index and why is it important?

An index is a separate data structure that helps SQL Server find rows fast without scanning the whole table. Think of it like a book's index â€” instead of reading every page, you jump straight to what you need. Without indexes, every query does a full table scan.

Entry Full Answer →

10. What is the difference between Clustered and Non-Clustered Index?

Clustered index defines the physical order of rows in the table â€” only one per table. Non-clustered index is a separate structure with pointers back to the actual rows â€” you can have many. Clustered is faster for range queries; non-clustered is better for selective lookups on specific columns.

Entry Full Answer →

11. What is a View and why is it used?

A view is a saved SELECT query that looks like a table. It simplifies complex joins, hides sensitive columns, and gives different users a tailored perspective of the data. The view itself stores no data â€” it's just a reusable query definition that runs when you access it.

Entry Full Answer →

12. What is a Stored Procedure and what is it used for?

A stored procedure is pre-compiled SQL code saved in the database. You call it by name with parameters. It runs faster than ad-hoc SQL because the execution plan is cached, reduces network traffic, centralizes business logic, and is easier to secure with permissions.

Entry Full Answer →

13. What are User-Defined Functions (UDFs)?

UDFs are reusable functions you define in T-SQL. Scalar functions return one value. Inline table-valued functions return a table (like a parameterized view). Multi-statement TVFs build a table row by row. Scalar UDFs are handy but can tank performance when called per row in large queries.

Entry Full Answer →

14. What is a Transaction and why is it important?

A transaction is a group of SQL operations treated as one unit. Either all succeed and commit, or all fail and roll back. This prevents half-done operations from corrupting your data â€” like a bank transfer where money leaves one account but never arrives in another.

Entry Full Answer →

15. Explain ACID properties in SQL.

ACID stands for: Atomicity (all or nothing), Consistency (data stays valid before and after), Isolation (transactions don't interfere with each other), Durability (once committed, data survives crashes). These four guarantees make transactions reliable in production systems.

Entry Full Answer →

16. What is a Deadlock in SQL Server?

A deadlock happens when two transactions each hold a lock the other needs â€” both wait forever. SQL Server's deadlock monitor detects the cycle and kills the cheaper transaction (the deadlock victim), rolling it back and returning an error. The other transaction then completes normally.

Entry Full Answer →

17. What is Locking in MS SQL Server?

Locking prevents concurrent transactions from corrupting shared data. SQL Server automatically applies locks (shared for reads, exclusive for writes) when a transaction touches data. The granularity ranges from row to page to table. Poor lock management causes blocking, contention, and deadlocks.

Entry Full Answer →

18. What is a Schema in SQL?

A schema is a logical namespace inside a database that groups related objects â€” tables, views, procedures. Like folders for your database objects. dbo is the default schema. Schemas help with organization, security (grant access per schema), and avoiding naming conflicts between teams.

Entry Full Answer →

19. Difference between DELETE and TRUNCATE?

DELETE removes rows one at a time, logs each deletion, fires triggers, and can be rolled back. TRUNCATE removes all rows at once, logs only page deallocations, is much faster, but can't be used with triggers or foreign key constraints. TRUNCATE resets identity columns; DELETE does not.

Entry Full Answer →

20. What is an Execution Plan and why is it important?

An execution plan is SQL Server's roadmap for running a query â€” which indexes to use, how to join tables, in what order to process operations. Reading the plan reveals bottlenecks like table scans, key lookups, or hash matches that slow things down. It's the first tool to reach for when tuning.

Entry Full Answer →

21. What is the main purpose of indexing in SQL Server?

Indexing lets SQL Server find rows without reading the entire table. Instead of scanning a million rows, it jumps to the right spot using the index structure (B-tree). Without good indexes, even simple lookups turn into full table scans that waste CPU, memory, and time.

Entry Full Answer →

22. How does SQL Server decide between a Table Scan, Index Scan, and Index Seek?

SQL Server uses statistics (row count estimates) to pick the cheapest plan. Table Scan reads every row â€” used when there's no useful index or the whole table is needed. Index Scan reads the index but all rows. Index Seek jumps directly to matching rows â€” the most efficient, what you want for selective queries.

Entry Full Answer →

23. What is a Composite Index and when is it used?

A composite index covers multiple columns. Useful when your WHERE clause filters on a combination of columns together. The column order matters â€” the index is most effective when the leading column appears in the filter. A (LastName, FirstName) index won't help a query filtering only on FirstName.

Entry Full Answer →

24. What is Bookmark Lookup and why can it cause performance problems?

Bookmark lookup happens when a non-clustered index finds the matching row keys, but the query needs columns not in the index â€” so SQL Server does an extra lookup into the clustered index to get them. Each lookup is a random I/O. With many rows, this becomes very expensive. Fix: use a covering index.

Entry Full Answer →

25. What is a Covering Index and why is it powerful?

A covering index includes all the columns a query needs â€” the index columns plus extra INCLUDE columns. SQL Server can answer the query entirely from the index without touching the main table. Eliminates key lookups, reduces I/O, and often cuts query time dramatically.

Entry Full Answer →

26. What are SQL Server Statistics and why are they essential?

Statistics are histograms that tell SQL Server how data is distributed in a column â€” how many distinct values exist, how they're spread, what the min/max are. The query optimizer uses them to estimate how many rows a filter will return and pick the best execution plan.

Entry Full Answer →

27. How do outdated or missing statistics affect query performance?

Outdated statistics mean SQL Server is working with stale data distribution information. It might think a filter returns 100 rows when it actually returns 100,000 â€” leading to a plan built for small data that falls apart at scale. This causes unexpected table scans, wrong join types, and slow queries.

Entry Full Answer →

28. What is Parameter Sniffing and why does it occur?

Parameter sniffing is when SQL Server compiles a stored procedure plan based on the first parameter values it sees. If those values are atypical (e.g., a rare ID vs. a common status), the cached plan is optimal for that one call but terrible for everyone else. Leads to inconsistent query performance.

Entry Full Answer →

29. What is an Execution Plan and how does SQL Server generate it?

When SQL Server receives a query, it parses it, checks the plan cache for a reusable plan, then the query optimizer evaluates multiple possible plans using cost estimates and picks the cheapest one. The plan is cached and reused. For parameterized queries, the same plan handles all values â€” good and bad.

Entry Full Answer →

30. What is the difference between Estimated and Actual Execution Plans?

Estimated plan shows what SQL Server planned to do before running the query â€” based on statistics. Actual plan shows what really happened including actual row counts, execution counts, and time. Comparing estimated vs actual row counts reveals where statistics are wrong and the plan went off-course.

Entry Full Answer →

31. What is Cardinality Estimation and why is it important?

Cardinality estimation predicts how many rows each operation in a plan will produce. If the estimate is 10 rows but the real result is 100,000, SQL Server builds the wrong plan â€” wrong join type, wrong memory grant, wrong index choice. Accurate statistics are what keep cardinality estimates correct.

Entry Full Answer →

32. What are Index Fragmentation and Fill Factor?

Fragmentation happens when index pages get out of order after many inserts, updates, and deletes. Logical fragmentation means pages are out of sequence â€” sequential reads become random I/O. Fill Factor controls how full each page is when the index is built, leaving room to grow and reducing future splits.

Entry Full Answer →

33. What is the Query Optimizer and how does it work?

The query optimizer is SQL Server's brain for plan selection. It generates multiple candidate plans, estimates the cost of each using statistics and cardinality, and picks the one with the lowest estimated cost. It's cost-based, not rule-based â€” which is why bad statistics lead directly to bad plans.

Entry Full Answer →

34. What is a Filtered Index and when is it beneficial?

A filtered index only indexes rows that match a WHERE condition. If 90% of your rows are "archived" and queries only touch "active" ones, a filtered index on IsActive = 1 is tiny, fast, and highly selective. Much cheaper than a full index and often a perfect fit for status or soft-delete columns.

Entry Full Answer →

35. What is a Hint and when should it be used?

A hint forces SQL Server to use a specific index, join type, or behavior instead of letting the optimizer decide. Use sparingly â€” hints override the optimizer's judgment, which can help in edge cases but often makes things worse long-term as data changes. Prefer fixing statistics or schema over using hints.

Entry Full Answer →

36. What is a Hotspot in indexing terms?

A hotspot is a single index page that many concurrent inserts target at the same time â€” typically the last page of a sequential (identity-based) clustered index. All writers compete for the same page latch, causing contention and blocking. Solutions: GUID keys, partitioning, or using a non-sequential key.

Entry Full Answer →

37. What is the Role of the Query Store in SQL Server?

Query Store is a built-in SQL Server feature that captures query text, execution plans, and runtime stats over time. When a query suddenly gets slower (plan regression), Query Store can show you the old fast plan and let you force it back. Essential for diagnosing intermittent performance problems.

Entry Full Answer →

38. What is an Index Seek Predicate vs Residual Predicate?

Seek predicate is the condition used to enter the index â€” it actually narrows which rows are retrieved. Residual predicate is a secondary filter applied after the seek, on rows already found. A residual predicate means SQL Server fetched more rows than needed and filtered them post-retrieval â€” less efficient.

Entry Full Answer →

39. Why does SQL Server choose a Hash Match instead of Nested Loop or Merge Join?

SQL Server chooses Hash Match when it can't sort both inputs efficiently (no useful index, large unsorted data). It builds a hash table from the smaller input and probes it with the larger. It's memory-intensive and can spill to TempDB if the memory grant was underestimated. More common with missing or outdated indexes.

Entry Full Answer →

40. Explain the internal structure of a SQL Server index (Clustered vs Non-Clustered). How does it affect performance?

Clustered index stores the actual data rows in leaf pages, sorted by the key â€” the table IS the index. Non-clustered index stores key values plus a pointer (row locator) to the actual row. Clustered is optimal for range queries. Non-clustered adds lookup overhead unless it covers all needed columns.

Junior Full Answer →

41. What is an Index Seek vs Index Scan vs Table Scan? When is each used?

Index Seek: directly navigates the B-tree to find matching rows â€” most efficient, low I/O. Index Scan: traverses all index pages â€” used when filtering is weak or all rows are needed. Table Scan: reads the entire heap â€” worst case, no usable index. Seek is ideal; scans are a sign something needs attention.

Junior Full Answer →

42. How does SQL Server choose an Execution Plan?

SQL Server parses the query, generates a query tree, checks the plan cache (reuse if found), then the optimizer produces candidate plans with cost estimates. It picks the lowest-cost plan and caches it. For parameterized queries, that same plan runs for all future calls with different parameter values.

Junior Full Answer →

43. What are Statistics? Why do bad statistics slow queries?

Statistics contain histograms showing how values are distributed in a column. The optimizer uses them to estimate row counts for each operation. If stats are stale â€” say after a large data load â€” estimates are wrong, plans are wrong, and queries get slow. Update statistics regularly in high-change tables.

Junior Full Answer →

44. What is a Covering Index and when should you create one?

A covering index satisfies a query entirely from the index without touching the main table. Include the WHERE columns as index keys and extra SELECT columns via INCLUDE. Eliminates costly key lookups. One well-placed covering index can eliminate 90% of I/O for frequently-run queries.

Junior Full Answer →

45. What is a Key Lookup? Why is it expensive?

Key lookup (aka Bookmark Lookup) happens when a non-clustered index finds the row, but the query needs extra columns not in the index â€” SQL Server fetches those from the clustered index. Each lookup is a random I/O. With thousands of rows, this stacks up. Adding INCLUDE columns to the index eliminates it.

Junior Full Answer →

46. Explain Lock Escalation in SQL Server.

Lock escalation converts many fine-grained locks (row, page) into one coarse table-level lock when the lock count crosses a threshold (default ~5,000). It reduces lock manager overhead but blocks all concurrent readers and writers on that table. Can cause unexpected blocking in high-concurrency workloads.

Junior Full Answer →

47. What is Page Splitting? How does Fill Factor help?

Page splitting happens when a new row is inserted into a full index page â€” SQL Server splits the page into two half-full pages, causing fragmentation. Fill Factor sets how full pages are when an index is built (e.g., 80% means 20% room to grow). Lower fill factor = less splitting but larger index size.

Junior Full Answer →

48. Explain SQL Server Memory Architecture (Buffer Pool, Plan Cache).

The Buffer Pool is SQL Server's main memory cache â€” it holds data pages and index pages read from disk. Reads hit the buffer pool first (logical read); if not found, it fetches from disk (physical read). Plan Cache stores compiled execution plans for reuse. Together they dramatically reduce disk I/O and compilation overhead.

Junior Full Answer →

49. What is Parameter Sniffing and how do you fix it?

SQL Server compiles a plan for a stored procedure using the first parameter values it sees. If those values are unusual (high selectivity vs. low), the cached plan is ideal for them but wrong for typical calls. Fix: OPTION(RECOMPILE), OPTIMIZE FOR hint, or local variable workaround to break plan reuse.

Junior Full Answer →

50. When should you use a Filtered Index?

Filtered indexes work best for columns where queries consistently target a specific subset â€” IsDeleted = 0, Status = 'Active', Region = 'US'. The index is smaller, faster, and more efficient than a full index. Particularly valuable for soft-delete patterns where active rows are a small fraction of total.

Junior Full Answer →