Apple MS SQL Interview Questions

Curated MS SQL interview questions for developers targeting Apple interview positions. 160 questions available.

MS SQL Interview Questions & Answers

Welcome to our comprehensive collection of MS SQL interview questions and answers. This page contains expertly curated interview questions covering all aspects of MS SQL, from fundamental concepts to advanced topics. Whether you're preparing for an entry-level position or a senior role, you'll find questions tailored to your experience level.

Our MS SQL interview questions are designed to help you:

  • Understand core concepts and best practices in MS SQL
  • Prepare for technical interviews at all experience levels
  • Master both theoretical knowledge and practical application
  • Build confidence for your next MS SQL interview

Each question includes detailed answers and explanations to help you understand not just what the answer is, but why it's correct. We cover topics ranging from basic MS SQL concepts to advanced scenarios that you might encounter in senior-level interviews.

Use the filters below to find questions by difficulty level (Entry, Junior, Mid, Senior, Expert) or focus specifically on code challenges. Each question is carefully crafted to reflect real-world interview scenarios you'll encounter at top tech companies, startups, and MNCs.

Questions

160 questions
Q1:

What is MS SQL Server and where is it commonly used?

Entry

Answer

MS SQL Server is Microsoft’s enterprise relational database system used in finance, e-commerce, SaaS, ERPs, CRMs, and large web applications. It provides secure, consistent, and high-performance data storage.

Quick Summary: MS SQL Server is Microsoft's relational database system built for enterprise use. It stores, retrieves, and manages structured data reliably. You'll find it in finance, healthcare, SaaS apps, ERPs, and large websites where data consistency and performance really matter.
Q2:

Explain Database, Table, Row, and Column in relational terms.

Entry

Answer

A database stores all data objects. A table contains structured data. A row represents one record. A column represents a specific data field with a defined type.

Quick Summary: A database holds all your data objects. A table organizes data into rows and columns, like a spreadsheet. A row is one record (one person, one order). A column is a specific attribute of that record (name, price, date). Together, they form the foundation of relational data storage.
Q3:

What is a Primary Key and why is it important?

Entry

Answer

A primary key uniquely identifies each row. It ensures uniqueness, improves indexing, and maintains reliable data integrity.

Quick Summary: A primary key uniquely identifies each row in a table — no two rows can have the same value, and it can't be NULL. It's the anchor for relationships between tables and speeds up lookups. Without it, finding or linking a specific record becomes messy and unreliable.
Q4:

What is a Foreign Key and what problem does it solve?

Entry

Answer

A foreign key creates a relationship between tables and prevents orphaned or invalid data by enforcing referential integrity.

Quick Summary: A foreign key in one table points to the primary key in another table. It enforces referential integrity — you can't add an order for a customer that doesn't exist. It's how relational databases maintain relationships between entities without duplicating data.
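The relationship described above can be sketched in T-SQL. This is a minimal example; the table and column names (dbo.Customers, dbo.Orders, CustomerId) are hypothetical:

```sql
-- The foreign key stops orders from referencing a non-existent customer.
CREATE TABLE dbo.Customers (
    CustomerId INT IDENTITY(1,1) PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL
);

CREATE TABLE dbo.Orders (
    OrderId    INT IDENTITY(1,1) PRIMARY KEY,
    CustomerId INT NOT NULL
        REFERENCES dbo.Customers (CustomerId),   -- foreign key
    OrderDate  DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);

-- Fails with a referential integrity error if no customer 999 exists.
INSERT INTO dbo.Orders (CustomerId) VALUES (999);
```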
Q5:

What are Constraints in MS SQL?

Entry

Answer

Constraints like primary key, foreign key, unique, check, not null, and default rules ensure accurate, valid, and reliable data.

Quick Summary: Constraints are rules you enforce at the database level. Common ones: NOT NULL (column can't be empty), UNIQUE (no duplicates), CHECK (value must meet a condition), DEFAULT (fallback value), PRIMARY KEY, and FOREIGN KEY. They catch bad data before it ever lands in the table.
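Here is a minimal sketch showing each constraint type on one hypothetical table (dbo.Products and its columns are made up for illustration):

```sql
CREATE TABLE dbo.Products (
    ProductId INT IDENTITY(1,1) PRIMARY KEY,              -- PRIMARY KEY
    Sku       VARCHAR(20)  NOT NULL UNIQUE,               -- NOT NULL + UNIQUE
    Price     DECIMAL(10,2) NOT NULL
        CONSTRAINT CK_Products_Price CHECK (Price >= 0),  -- CHECK
    CreatedAt DATETIME2 NOT NULL
        CONSTRAINT DF_Products_CreatedAt
        DEFAULT SYSUTCDATETIME()                          -- DEFAULT
);
```

An insert with a negative Price or a duplicate Sku is rejected before it ever reaches the table.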
Q6:

What is Normalization and why do we use it?

Entry

Answer

Normalization organizes data to eliminate redundancy, improve consistency, and reduce anomalies during insert, update, or delete operations.

Quick Summary: Normalization organizes data to reduce redundancy and keep things consistent. Instead of storing a customer's name in every order row, you store it once and reference it by ID. This prevents update anomalies — you change data in one place, not a hundred.
Q7:

Explain 1NF, 2NF, and 3NF briefly.

Entry

Answer

1NF: No repeating groups; atomic values.
2NF: Remove partial dependencies.
3NF: Remove transitive dependencies.

Quick Summary: 1NF: each cell holds one atomic value, no repeating groups. 2NF: no partial dependency — every non-key column depends on the whole primary key. 3NF: no transitive dependency — non-key columns depend only on the primary key, not on each other. Each level builds on the previous.
Q8:

What is Denormalization and when is it used?

Entry

Answer

Denormalization introduces controlled redundancy to improve read performance, especially in reporting and analytics systems.

Quick Summary: Denormalization intentionally adds redundancy to speed up reads. Instead of joining three tables every time, you store the data together. Used in reporting databases, dashboards, and read-heavy systems where join performance is a bottleneck and write consistency is less critical.
Q9:

What is an Index and why is it important?

Entry

Answer

An index speeds up data searches by avoiding full table scans. Too many indexes slow down inserts/updates due to maintenance.

Quick Summary: An index is a separate data structure that helps SQL Server find rows fast without scanning the whole table. Think of it like a book's index — instead of reading every page, you jump straight to what you need. Without indexes, every query does a full table scan.
Q10:

What is the difference between Clustered and Non-Clustered Index?

Entry

Answer

Clustered index: Defines physical row order.
Non-clustered index: A separate structure pointing to actual rows.

Quick Summary: Clustered index defines the physical order of rows in the table — only one per table. Non-clustered index is a separate structure with pointers back to the actual rows — you can have many. Clustered is faster for range queries; non-clustered is better for selective lookups on specific columns.
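As a sketch, both index types can be created explicitly on a hypothetical table that starts as a heap (in practice the primary key usually creates the clustered index for you):

```sql
-- Clustered: the table's rows are physically ordered by OrderId (one per table).
CREATE CLUSTERED INDEX CIX_OrderArchive_OrderId
    ON dbo.OrderArchive (OrderId);

-- Non-clustered: a separate B-tree on CustomerId with pointers back to the rows
-- (you can create many of these).
CREATE NONCLUSTERED INDEX IX_OrderArchive_CustomerId
    ON dbo.OrderArchive (CustomerId);
```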
Q11:

What is a View and why is it used?

Entry

Answer

A view is a virtual table created from a query. Used for simplicity, security, reusability, and hiding complex queries.

Quick Summary: A view is a saved SELECT query that looks like a table. It simplifies complex joins, hides sensitive columns, and gives different users a tailored perspective of the data. The view itself stores no data — it's just a reusable query definition that runs when you access it.
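A minimal sketch of the security use case, assuming a hypothetical dbo.Employees table with a sensitive Salary column:

```sql
-- The view exposes directory columns and hides Salary.
CREATE VIEW dbo.vEmployeeDirectory
AS
SELECT EmployeeId, FirstName, LastName, Department
FROM   dbo.Employees;    -- Salary intentionally excluded
GO

-- Querying the view runs the underlying SELECT each time; no data is stored.
SELECT * FROM dbo.vEmployeeDirectory;
```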
Q12:

What is a Stored Procedure and what is it used for?

Entry

Answer

A stored procedure is a precompiled SQL code block used for business logic, improving performance and reducing network overhead.

Quick Summary: A stored procedure is pre-compiled SQL code saved in the database. You call it by name with parameters. It runs faster than ad-hoc SQL because the execution plan is cached, reduces network traffic, centralizes business logic, and is easier to secure with permissions.
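A minimal parameterized procedure over a hypothetical dbo.Orders table illustrates the call-by-name-with-parameters pattern:

```sql
CREATE PROCEDURE dbo.usp_GetOrdersByCustomer
    @CustomerId INT
AS
BEGIN
    SET NOCOUNT ON;   -- suppress row-count messages to cut network chatter
    SELECT OrderId, OrderDate
    FROM   dbo.Orders
    WHERE  CustomerId = @CustomerId;
END
GO

EXEC dbo.usp_GetOrdersByCustomer @CustomerId = 42;
```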
Q13:

What are User-Defined Functions (UDFs)?

Entry

Answer

UDFs return calculated values based on parameters. Used for reusable logic, validation, and transformations.

Quick Summary: UDFs are reusable functions you define in T-SQL. Scalar functions return one value. Inline table-valued functions return a table (like a parameterized view). Multi-statement TVFs build a table row by row. Scalar UDFs are handy but can tank performance when called per row in large queries.
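Sketches of the two most common UDF shapes, using hypothetical names and a hypothetical dbo.Orders table:

```sql
-- Scalar UDF: returns a single value per call.
CREATE FUNCTION dbo.fn_FullName (@First NVARCHAR(50), @Last NVARCHAR(50))
RETURNS NVARCHAR(101)
AS
BEGIN
    RETURN @First + N' ' + @Last;
END
GO

-- Inline table-valued function: behaves like a parameterized view
-- and generally optimizes far better than a scalar UDF called per row.
CREATE FUNCTION dbo.fn_OrdersSince (@Cutoff DATETIME2)
RETURNS TABLE
AS
RETURN (SELECT OrderId, CustomerId, OrderDate
        FROM   dbo.Orders
        WHERE  OrderDate >= @Cutoff);
GO
```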
Q14:

What is a Transaction and why is it important?

Entry

Answer

A transaction ensures a group of SQL operations is treated as a single unit—either all succeed or all fail (ACID compliance).

Quick Summary: A transaction is a group of SQL operations treated as one unit. Either all succeed and commit, or all fail and roll back. This prevents half-done operations from corrupting your data — like a bank transfer where money leaves one account but never arrives in another.
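The bank-transfer example above can be sketched with TRY/CATCH, assuming a hypothetical dbo.Accounts table:

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;

    COMMIT TRANSACTION;        -- both updates persist together
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;  -- neither update persists
    THROW;                     -- re-raise the original error
END CATCH;
```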
Q15:

Explain ACID properties in SQL.

Entry

Answer

Atomicity: All-or-nothing.
Consistency: Valid states only.
Isolation: No interference between transactions.
Durability: Data persists after commit.

Quick Summary: ACID stands for: Atomicity (all or nothing), Consistency (data stays valid before and after), Isolation (transactions don't interfere with each other), Durability (once committed, data survives crashes). These four guarantees make transactions reliable in production systems.
Q16:

What is a Deadlock in SQL Server?

Entry

Answer

A deadlock occurs when transactions block each other while waiting for resources. SQL Server resolves it by killing one transaction.

Quick Summary: A deadlock happens when two transactions each hold a lock the other needs — both wait forever. SQL Server's deadlock monitor detects the cycle and kills the cheaper transaction (the deadlock victim), rolling it back and returning an error. The other transaction then completes normally.
Q17:

What is Locking in MS SQL Server?

Entry

Answer

Locking controls concurrent access using shared (read), exclusive (write), and update locks to maintain data consistency.

Quick Summary: Locking prevents concurrent transactions from corrupting shared data. SQL Server automatically applies locks (shared for reads, exclusive for writes) when a transaction touches data. The granularity ranges from row to page to table. Poor lock management causes blocking, contention, and deadlocks.
Q18:

What is a Schema in SQL?

Entry

Answer

A schema is a logical container grouping tables, views, and procedures for better organization and security.

Quick Summary: A schema is a logical namespace inside a database that groups related objects — tables, views, procedures. Like folders for your database objects. dbo is the default schema. Schemas help with organization, security (grant access per schema), and avoiding naming conflicts between teams.
Q19:

Difference between DELETE and TRUNCATE?

Entry

Answer

DELETE: Row-by-row removal, fully logged.
TRUNCATE: Deallocates pages, extremely fast, minimal logging.

Quick Summary: DELETE removes rows one at a time, logs each deletion, fires DELETE triggers, and can target specific rows with a WHERE clause. TRUNCATE removes all rows at once, logs only page deallocations, is much faster, doesn't fire DELETE triggers, and can't be used on a table referenced by a foreign key. TRUNCATE resets identity columns; DELETE does not.
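The contrast in miniature, on a hypothetical staging table:

```sql
-- DELETE: row-by-row, fully logged, supports a WHERE clause,
-- does not reset the identity seed.
DELETE FROM dbo.StagingRows
WHERE  LoadDate < '2024-01-01';

-- TRUNCATE: deallocates data pages, removes every row,
-- and reseeds the identity column.
TRUNCATE TABLE dbo.StagingRows;
```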
Q20:

What is an Execution Plan and why is it important?

Entry

Answer

An execution plan shows how SQL Server executes a query—joins, indexes, scans, seeks. It is vital for tuning slow queries.

Quick Summary: An execution plan is SQL Server's roadmap for running a query — which indexes to use, how to join tables, in what order to process operations. Reading the plan reveals bottlenecks like table scans, key lookups, or hash matches that slow things down. It's the first tool to reach for when tuning.
Q21:

What is the main purpose of indexing in SQL Server?

Entry

Answer

Indexing improves data retrieval speed by avoiding full table scans.

Indexes act like optimized search maps that help SQL Server locate rows faster.

However, too many indexes slow down write performance due to maintenance overhead.

Quick Summary: Indexing lets SQL Server find rows without reading the entire table. Instead of scanning a million rows, it jumps to the right spot using the index structure (B-tree). Without good indexes, even simple lookups turn into full table scans that waste CPU, memory, and time.
Q22:

How does SQL Server decide between a Table Scan, Index Scan, and Index Seek?

Entry

Answer

SQL Server chooses access methods based on filters, available indexes, and cost estimates.

Table Scan: Reads all rows when no useful index exists.

Index Scan: Reads all index entries for broad filters.

Index Seek: Most efficient; jumps directly to matching rows.

Quick Summary: SQL Server uses statistics (row count estimates) to pick the cheapest plan. Table Scan reads every row — used when there's no useful index or the whole table is needed. Index Scan reads the index but all rows. Index Seek jumps directly to matching rows — the most efficient, what you want for selective queries.
Q23:

What is a Composite Index and when is it used?

Entry

Answer

A composite index includes multiple columns, used when queries filter or sort using those column combinations.

The order of columns is important because SQL Server can only efficiently use leading key columns.

Quick Summary: A composite index covers multiple columns. Useful when your WHERE clause filters on a combination of columns together. The column order matters — the index is most effective when the leading column appears in the filter. A (LastName, FirstName) index won't help a query filtering only on FirstName.
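The (LastName, FirstName) example can be sketched directly, with a hypothetical dbo.People table:

```sql
CREATE NONCLUSTERED INDEX IX_People_LastName_FirstName
    ON dbo.People (LastName, FirstName);

-- Can seek: the leading column is filtered.
SELECT * FROM dbo.People WHERE LastName = N'Smith';
SELECT * FROM dbo.People WHERE LastName = N'Smith' AND FirstName = N'Ada';

-- Cannot seek on this index: the leading column is missing,
-- so SQL Server falls back to a scan.
SELECT * FROM dbo.People WHERE FirstName = N'Ada';
```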
Q24:

What is Bookmark Lookup and why can it cause performance problems?

Entry

Answer

Bookmark lookup occurs when SQL uses a non-clustered index but must fetch extra columns from the base table.

This becomes slow when many rows match, increasing I/O and reducing performance.

Quick Summary: Bookmark lookup happens when a non-clustered index finds the matching row keys, but the query needs columns not in the index — so SQL Server does an extra lookup into the clustered index to get them. Each lookup is a random I/O. With many rows, this becomes very expensive. Fix: use a covering index.
Q25:

What is a Covering Index and why is it powerful?

Entry

Answer

A covering index contains all columns needed for a query.

SQL Server can return results from the index alone without touching the table.

This greatly improves performance for read-heavy queries.

Quick Summary: A covering index includes all the columns a query needs — the index columns plus extra INCLUDE columns. SQL Server can answer the query entirely from the index without touching the main table. Eliminates key lookups, reduces I/O, and often cuts query time dramatically.
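A minimal covering-index sketch, assuming a hypothetical dbo.Orders table with a TotalAmount column:

```sql
-- Key column supports the WHERE clause; INCLUDE carries the SELECT columns,
-- so the query never touches the base table (no key lookups).
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_Covering
    ON dbo.Orders (CustomerId)
    INCLUDE (OrderDate, TotalAmount);

SELECT OrderDate, TotalAmount
FROM   dbo.Orders
WHERE  CustomerId = 42;   -- answered entirely from the index
```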
Q26:

What are SQL Server Statistics and why are they essential?

Entry

Answer

Statistics describe data distribution in columns.

They help the optimizer estimate row counts and choose efficient execution plans.

Quick Summary: Statistics are histograms that tell SQL Server how data is distributed in a column — how many distinct values exist, how they're spread, what the min/max are. The query optimizer uses them to estimate how many rows a filter will return and pick the best execution plan.
Q27:

How do outdated or missing statistics affect query performance?

Entry

Answer

Outdated statistics produce inaccurate row estimates.

SQL Server may choose inefficient join types or scans, causing slow performance.

Regular statistic updates maintain optimal plans.

Quick Summary: Outdated statistics mean SQL Server is working with stale data distribution information. It might think a filter returns 100 rows when it actually returns 100,000 — leading to a plan built for small data that falls apart at scale. This causes unexpected table scans, wrong join types, and slow queries.
Q28:

What is Parameter Sniffing and why does it occur?

Entry

Answer

SQL Server caches execution plans for reuse.

Parameter sniffing occurs when a cached plan works well for one parameter but poorly for another due to data skew.

Quick Summary: Parameter sniffing is when SQL Server compiles a stored procedure plan based on the first parameter values it sees. If those values are atypical (e.g., a rare ID vs. a common status), the cached plan is optimal for that one call but terrible for everyone else. Leads to inconsistent query performance.
Q29:

What is an Execution Plan and how does SQL Server generate it?

Entry

Answer

An execution plan is SQL Server’s strategy for running a query.

The optimizer evaluates multiple plan options and picks the lowest-cost one.

Quick Summary: When SQL Server receives a query, it parses it, checks the plan cache for a reusable plan, then the query optimizer evaluates multiple possible plans using cost estimates and picks the cheapest one. The plan is cached and reused. For parameterized queries, the same plan handles all values — good and bad.
Q30:

What is the difference between Estimated and Actual Execution Plans?

Entry

Answer

Estimated plans show predicted row counts before execution.

Actual plans include real runtime row counts and performance metrics.

Quick Summary: Estimated plan shows what SQL Server planned to do before running the query — based on statistics. Actual plan shows what really happened including actual row counts, execution counts, and time. Comparing estimated vs actual row counts reveals where statistics are wrong and the plan went off-course.
Q31:

What is Cardinality Estimation and why is it important?

Entry

Answer

Cardinality estimation predicts how many rows a query will process.

Accurate estimates are critical for choosing efficient join types and memory usage.

Quick Summary: Cardinality estimation predicts how many rows each operation in a plan will produce. If the estimate is 10 rows but the real result is 100,000, SQL Server builds the wrong plan — wrong join type, wrong memory grant, wrong index choice. Accurate statistics are what keep cardinality estimates correct.
Q32:

What are Index Fragmentation and Fill Factor?

Entry

Answer

Fragmentation occurs when index pages become disordered, slowing scans and seeks.

Fill factor controls how much free space to leave during index creation to reduce fragmentation.

Quick Summary: Fragmentation happens when index pages get out of order after many inserts, updates, and deletes. Logical fragmentation means pages are out of sequence — sequential reads become random I/O. Fill Factor controls how full each page is when the index is built, leaving room to grow and reducing future splits.
Q33:

What is the Query Optimizer and how does it work?

Entry

Answer

The optimizer evaluates many query strategies using statistics and metadata.

It selects the lowest estimated cost plan.

Quick Summary: The query optimizer is SQL Server's brain for plan selection. It generates multiple candidate plans, estimates the cost of each using statistics and cardinality, and picks the one with the lowest estimated cost. It's cost-based, not rule-based — which is why bad statistics lead directly to bad plans.
Q34:

What is a Filtered Index and when is it beneficial?

Entry

Answer

A filtered index stores only rows matching a condition.

It reduces index size and improves seek performance for selective queries.

Quick Summary: A filtered index only indexes rows that match a WHERE condition. If 90% of your rows are "archived" and queries only touch "active" ones, a filtered index on IsActive = 1 is tiny, fast, and highly selective. Much cheaper than a full index and often a perfect fit for status or soft-delete columns.
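The IsActive example above as a sketch (table and columns hypothetical):

```sql
-- Indexes only the active rows; tiny compared to a full index
-- when most rows are archived or soft-deleted.
CREATE NONCLUSTERED INDEX IX_Orders_Active
    ON dbo.Orders (CustomerId)
    WHERE IsActive = 1;

-- Qualifies for the filtered index because the predicate matches.
SELECT OrderId FROM dbo.Orders
WHERE  IsActive = 1 AND CustomerId = 42;
```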
Q35:

What is a Hint and when should it be used?

Entry

Answer

Hints override optimizer decisions by forcing specific behavior.

They should be used only when the optimizer consistently chooses suboptimal plans.

Quick Summary: A hint forces SQL Server to use a specific index, join type, or behavior instead of letting the optimizer decide. Use sparingly — hints override the optimizer's judgment, which can help in edge cases but often makes things worse long-term as data changes. Prefer fixing statistics or schema over using hints.
Q36:

What is a Hotspot in indexing terms?

Entry

Answer

A hotspot occurs when many concurrent operations target the same physical index location.

This causes locking and contention, reducing performance.

Quick Summary: A hotspot is a single index page that many concurrent inserts target at the same time — typically the last page of a sequential (identity-based) clustered index. All writers compete for the same page latch, causing contention and blocking. Mitigations: a non-sequential key such as a GUID, or partitioning.
Q37:

What is the Role of the Query Store in SQL Server?

Entry

Answer

Query Store tracks query history, execution plans, and performance trends.

It helps diagnose regressions and enforce stable plans.

Quick Summary: Query Store is a built-in SQL Server feature that captures query text, execution plans, and runtime stats over time. When a query suddenly gets slower (plan regression), Query Store can show you the old fast plan and let you force it back. Essential for diagnosing intermittent performance problems.
Q38:

What is an Index Seek Predicate vs Residual Predicate?

Entry

Answer

Seek predicates allow precise navigation through an index.

Residual predicates apply filters after the seek when the index does not fully cover the query.

Quick Summary: Seek predicate is the condition used to enter the index — it actually narrows which rows are retrieved. Residual predicate is a secondary filter applied after the seek, on rows already found. A residual predicate means SQL Server fetched more rows than needed and filtered them post-retrieval — less efficient.
Q39:

Why does SQL Server choose a Hash Match instead of Nested Loop or Merge Join?

Entry

Answer

SQL Server uses hash match for large, unsorted datasets where hashing is cheaper than repeated seeks.

Poor row estimates may also cause SQL to choose hashing, sometimes leading to spills.

Quick Summary: SQL Server chooses Hash Match when it can't sort both inputs efficiently (no useful index, large unsorted data). It builds a hash table from the smaller input and probes it with the larger. It's memory-intensive and can spill to TempDB if the memory grant was underestimated. More common with missing or outdated indexes.
Q40:

Explain the internal structure of a SQL Server index (Clustered vs Non-Clustered). How does it affect performance?

Junior

Answer

Clustered Index: Defines the physical order of rows in the table. Implemented as a B-Tree with root, intermediate, and leaf nodes. The leaf level contains actual data rows.

Non-Clustered Index: Also a B-Tree, but leaf nodes store index keys and row locators (RID or clustered key).

Performance Impact: Clustered indexes help range queries, while non-clustered indexes help quick lookups. Poor clustered key choice can make indexes large and slow.

Quick Summary: Clustered index stores the actual data rows in leaf pages, sorted by the key — the table IS the index. Non-clustered index stores key values plus a pointer (row locator) to the actual row. Clustered is optimal for range queries. Non-clustered adds lookup overhead unless it covers all needed columns.
Q41:

What is an Index Seek vs Index Scan vs Table Scan? When is each used?

Junior

Answer

Index Seek: SQL jumps directly to matched rows. Happens with selective filters and proper indexes.

Index Scan: SQL reads entire index. Happens with broad filters or missing indexes.

Table Scan: Reads the whole table. Used when no index exists or the table is small.

Quick Summary: Index Seek: directly navigates the B-tree to find matching rows — most efficient, low I/O. Index Scan: traverses all index pages — used when filtering is weak or all rows are needed. Table Scan: reads the entire heap — worst case, no usable index. Seek is ideal; scans are a sign something needs attention.
Q42:

How does SQL Server choose an Execution Plan?

Junior

Answer

SQL Server uses a cost-based optimizer that parses the query, evaluates different plan options, estimates CPU/I/O cost, and chooses the lowest-cost plan.

Quick Summary: SQL Server parses the query, generates a query tree, checks the plan cache (reuse if found), then the optimizer produces candidate plans with cost estimates. It picks the lowest-cost plan and caches it. For parameterized queries, that same plan runs for all future calls with different parameter values.
Q43:

What are Statistics? Why do bad statistics slow queries?

Junior

Answer

Statistics contain value distribution info. SQL uses them to estimate row counts. Bad or outdated statistics cause wrong estimations, leading to slow plans.

Quick Summary: Statistics contain histograms showing how values are distributed in a column. The optimizer uses them to estimate row counts for each operation. If stats are stale — say after a large data load — estimates are wrong, plans are wrong, and queries get slow. Update statistics regularly in high-change tables.
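The maintenance step above can be sketched on a hypothetical table and index:

```sql
-- Refresh optimizer statistics after a large data change.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Inspect a statistics object's header and histogram
-- (last-updated time, row counts, value distribution).
DBCC SHOW_STATISTICS ('dbo.Orders', IX_Orders_CustomerId);
```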
Q44:

What is a Covering Index and when should you create one?

Junior

Answer

A covering index contains all columns needed for a query. It removes key lookups and improves performance in read-heavy workloads.

Quick Summary: A covering index satisfies a query entirely from the index without touching the main table. Include the WHERE columns as index keys and extra SELECT columns via INCLUDE. Eliminates costly key lookups. One well-placed covering index can eliminate 90% of I/O for frequently-run queries.
Q45:

What is a Key Lookup? Why is it expensive?

Junior

Answer

Key lookup happens when a non-clustered index finds a row but SQL needs extra columns. It causes many random I/O operations and slows queries.

Quick Summary: Key lookup (aka Bookmark Lookup) happens when a non-clustered index finds the row, but the query needs extra columns not in the index — SQL Server fetches those from the clustered index. Each lookup is a random I/O. With thousands of rows, this stacks up. Adding INCLUDE columns to the index eliminates it.
Q46:

Explain Lock Escalation in SQL Server.

Junior

Answer

When too many row/page locks exist, SQL upgrades to a table lock. This reduces overhead but lowers concurrency.

Quick Summary: Lock escalation converts many fine-grained locks (row, page) into one coarse table-level lock when the lock count crosses a threshold (default ~5,000). It reduces lock manager overhead but blocks all concurrent readers and writers on that table. Can cause unexpected blocking in high-concurrency workloads.
Q47:

What is Page Splitting? How does Fill Factor help?

Junior

Answer

Page splitting happens when SQL inserts rows into a full page. It increases fragmentation. Fill Factor leaves free space to reduce page splits.

Quick Summary: Page splitting happens when a new row is inserted into a full index page — SQL Server splits the page into two half-full pages, causing fragmentation. Fill Factor sets how full pages are when an index is built (e.g., 80% means 20% room to grow). Lower fill factor = less splitting but larger index size.
Q48:

Explain SQL Server Memory Architecture (Buffer Pool, Plan Cache).

Junior

Answer

Buffer Pool: Stores data and index pages.

Plan Cache: Stores compiled query plans for reuse.

Quick Summary: The Buffer Pool is SQL Server's main memory cache — it holds data pages and index pages read from disk. Reads hit the buffer pool first (logical read); if not found, it fetches from disk (physical read). Plan Cache stores compiled execution plans for reuse. Together they dramatically reduce disk I/O and compilation overhead.
Q49:

What is Parameter Sniffing and how do you fix it?

Junior

Answer

SQL optimizes a plan based on first-used parameter value. Fix options include using RECOMPILE, OPTIMIZE FOR UNKNOWN, or rewriting queries.

Quick Summary: SQL Server compiles a plan for a stored procedure using the first parameter values it sees. If those values are unusual (high selectivity vs. low), the cached plan is ideal for them but wrong for typical calls. Fix: OPTION(RECOMPILE), OPTIMIZE FOR hint, or local variable workaround to break plan reuse.
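The two main fixes mentioned above, sketched in a hypothetical procedure:

```sql
CREATE PROCEDURE dbo.usp_OrdersByStatus
    @Status VARCHAR(20)
AS
BEGIN
    -- Fix 1: compile a fresh plan on every execution
    -- (consistent plans at the cost of per-call compile CPU).
    SELECT OrderId
    FROM   dbo.Orders
    WHERE  Status = @Status
    OPTION (RECOMPILE);

    -- Fix 2 (alternative): optimize for the "average" value instead of
    -- the sniffed one, so one cached plan suits typical calls:
    --   ... OPTION (OPTIMIZE FOR (@Status UNKNOWN));
END
```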
Q50:

When should you use a Filtered Index?

Junior

Answer

Use when queries target specific subsets (e.g., Active records). Improves performance and reduces index size.

Quick Summary: Filtered indexes work best for columns where queries consistently target a specific subset — IsDeleted = 0, Status = 'Active', Region = 'US'. The index is smaller, faster, and more efficient than a full index. Particularly valuable for soft-delete patterns where active rows are a small fraction of total.
Q51:

What is a Heap? Should you use it?

Junior

Answer

A heap is a table without clustered index. Good for bulk loads but slow for lookups. Most tables should have a clustered index.

Quick Summary: A heap is a table with no clustered index — rows are stored in no particular order. Lookups require a full table scan or a non-clustered index with a Row Identifier (RID) lookup. For most workloads, a clustered index is better. Heaps can be useful for bulk-load staging tables where you don't need ordered access.
Q52:

What is Partitioning in SQL Server?

Junior

Answer

Partitioning splits a table into smaller pieces, improving maintenance and performance for large datasets.

Quick Summary: Partitioning divides a large table into smaller physical pieces (partitions) based on a column value — usually a date or range. Queries that filter on the partition column only scan relevant partitions (partition elimination). Makes archiving, purging, and managing large datasets much more practical.
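A date-range partitioning sketch, with hypothetical names (pfSalesYear, psSalesYear, dbo.Sales) and all partitions mapped to PRIMARY for simplicity:

```sql
-- Boundaries define yearly partitions.
CREATE PARTITION FUNCTION pfSalesYear (DATE)
    AS RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01', '2025-01-01');

CREATE PARTITION SCHEME psSalesYear
    AS PARTITION pfSalesYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.Sales (
    SaleId   BIGINT IDENTITY(1,1),
    SaleDate DATE NOT NULL,
    Amount   DECIMAL(12,2) NOT NULL
) ON psSalesYear (SaleDate);

-- Filtering on the partition column lets SQL Server scan only the
-- 2024 partition (partition elimination).
SELECT SUM(Amount)
FROM   dbo.Sales
WHERE  SaleDate >= '2024-01-01' AND SaleDate < '2025-01-01';
```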
Q53:

What is Query Store and why is it useful?

Junior

Answer

Query Store keeps execution history, performance metrics, and helps detect regressions. It allows forcing stable plans.

Quick Summary: Query Store records query text, execution plans, and runtime stats (duration, CPU, I/O) historically. When a plan suddenly regresses (gets slow), you can compare it to the previous fast plan and force the good one back. Invaluable for diagnosing plan regressions after upgrades, stats updates, or data changes.
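A sketch of the workflow: enable Query Store, inspect captured plans, and force a known-good one (the database name and the query_id/plan_id values are placeholders):

```sql
-- Enable Query Store on a database.
ALTER DATABASE SalesDb SET QUERY_STORE = ON;

-- Browse captured queries, their plans, and runtime stats.
SELECT q.query_id, p.plan_id, rs.avg_duration
FROM   sys.query_store_query q
JOIN   sys.query_store_plan  p  ON p.query_id = q.query_id
JOIN   sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id
ORDER BY rs.avg_duration DESC;

-- Force the previously fast plan for a regressed query.
EXEC sp_query_store_force_plan @query_id = 41, @plan_id = 7;
```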
Q54:

What is Cardinality Estimation in SQL Server?

Junior

Answer

It predicts row counts for query operations. Accurate estimates help SQL choose the best join and operator strategy.

Quick Summary: Cardinality estimation predicts how many rows each step in a query plan will produce. Wrong estimates lead to wrong join strategies, wrong memory grants, and wrong index choices. The estimator relies on statistics — stale or missing stats cause it to guess badly, which cascades into a poorly-performing plan.
Q55:

Explain Memory Grants in SQL Server.

Junior

Answer

SQL requests memory for sorting and hashing. Over-grants cause waits; under-grants cause spills to TempDB.

Quick Summary: SQL Server pre-allocates memory (a memory grant) for sort and hash operations based on cardinality estimates. Too small → spills to TempDB, slowing the query. Too large → wastes memory, blocks other queries from getting grants. Fix bad grants by fixing statistics so estimates are accurate.
Q56:

What are Hot Spots in indexing?

Junior

Answer

Hotspots occur when many inserts hit the same index page (e.g., identity keys). They cause blocking and contention.

Quick Summary: Hotspots occur on the last page of a sequential clustered index (identity or timestamp key) when many concurrent inserts all target the same page. They compete for the same page latch, serializing inserts. Mitigations: a non-sequential (e.g., GUID) clustered key, partitioning, or the OPTIMIZE_FOR_SEQUENTIAL_KEY index option (SQL Server 2019+).
Q57:

Pessimistic vs Optimistic Concurrency in SQL Server.

Junior

Answer

Pessimistic uses locks to avoid conflicts. Optimistic uses row versioning to detect conflicts at commit without blocking readers.

Quick Summary: Pessimistic concurrency uses locks upfront — a transaction locks data before reading or writing to prevent conflicts. Optimistic concurrency assumes conflicts are rare — it reads without locking and checks at commit time if someone else changed the data. SQL Server supports both via isolation levels and snapshot isolation.
Q58:

What are Wait Types in SQL Server?

Junior

Answer

Wait types show where SQL is spending time (CPU, I/O, locks). Examples: CXPACKET, PAGEIOLATCH, LCK_M_X.

Quick Summary: Wait types tell you what SQL Server is waiting for — PAGEIOLATCH (disk I/O), LCK_M_X (lock wait), CXPACKET (parallelism), SOS_SCHEDULER_YIELD (CPU pressure). When a query is slow, checking wait stats points directly at the bottleneck. It's the fastest way to know where to look.
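A quick diagnostic query against the wait-stats DMV (the excluded wait types are a small illustrative subset of the benign idle waits normally filtered out):

```sql
-- Top cumulative waits since the stats were last cleared.
SELECT TOP (10)
       wait_type,
       wait_time_ms,
       waiting_tasks_count
FROM   sys.dm_os_wait_stats
WHERE  wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
                         N'BROKER_TASK_STOP', N'XE_TIMER_EVENT')
ORDER BY wait_time_ms DESC;
```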
Q59:

Explain TempDB internals and why it affects performance.

Junior

Answer

TempDB stores temp tables, spills, versioning, and intermediate results. Best practices include multiple equal-sized data files and SSD storage.

Quick Summary: TempDB is a shared system database used for temporary tables, table variables, sort spills, row versioning (for snapshot isolation), and hash join/sort intermediate results. High TempDB contention (allocation page contention) can bottleneck the entire server. Multiple data files and fast storage mitigate this.
Q60:

Explain in detail how SQL Server processes different types of JOINs internally.

Mid

Answer

SQL Server evaluates JOINs using physical operators such as Nested Loops, Merge Join, and Hash Join. INNER, LEFT, RIGHT, and FULL JOINs all use these strategies depending on data size, indexes, and sorting.

INNER JOIN: Returns matching rows only.

LEFT/RIGHT JOIN: Returns matching rows plus NULL for non-matching.

FULL JOIN: Returns all matches + unmatched from both sides.

CROSS JOIN: Cartesian product; typically expensive.

Proper indexing affects which join operator SQL Server chooses.

Quick Summary: SQL Server supports Nested Loops (small outer table, indexed inner), Merge Join (both inputs sorted — often the fastest), and Hash Match (large unsorted inputs — memory intensive). The optimizer picks based on data size and index availability. Nested Loops is cheapest for small datasets; Hash Match is a fallback for large ones.
Q61:

Explain Nested Loops, Merge Join, and Hash Join with when each is chosen.

Mid

Answer

Nested Loops: Best for small outer input and indexed inner table. Great for OLTP random lookups.

Merge Join: Requires sorted inputs. Very fast for large, sorted datasets.

Hash Join: Best for large, unsorted sets. Spills to TempDB if memory is insufficient.

Quick Summary: Nested Loops: iterate outer rows, seek matching inner rows via index — best when outer set is small. Merge Join: both inputs must be pre-sorted; scan both simultaneously — fast when indexes provide sort order. Hash Match: build a hash table from smaller input, probe with larger — chosen when sorts aren't available and data is large.
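You can observe each physical operator by forcing it with a query hint and comparing the actual execution plans. dbo.Orders and dbo.Customers are hypothetical tables; hints override the optimizer, so use this only for experimentation:

```sql
-- Force a specific physical join to compare plans
SELECT o.OrderId, c.CustomerName
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c
    ON c.CustomerId = o.CustomerId
OPTION (HASH JOIN);  -- swap in LOOP JOIN or MERGE JOIN and compare
```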
Q62:

What is a Transaction? Explain ACID with real SQL Server implications.

Mid

Answer

A transaction is a unit of work ensured by ACID properties:

  • Atomicity: All-or-nothing behavior.
  • Consistency: Constraints must remain valid.
  • Isolation: Controls concurrency.
  • Durability: Committed data survives crashes via logging.
Quick Summary: A transaction groups multiple operations into one atomic unit. In SQL Server: Atomicity means the transaction commits fully or rolls back entirely. Consistency ensures constraints hold. Isolation controls what concurrent transactions can see. Durability means committed data survives crashes via the transaction log.
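The standard T-SQL pattern for an atomic transfer looks like this. A sketch assuming a hypothetical dbo.Accounts table:

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- Both statements succeed or neither does (atomicity)
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;

    COMMIT TRANSACTION;  -- durability: log records are hardened before commit returns
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;
    THROW;  -- re-raise the original error to the caller
END CATCH;
```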
Q63:

Explain SQL Server Transaction Isolation Levels with practical use-cases.

Mid

Answer

Isolation levels include:

READ UNCOMMITTED: Allows dirty reads; used in analytics.

READ COMMITTED: Default; prevents dirty reads.

RCSI: Uses version store; avoids blocking.

REPEATABLE READ: Prevents non-repeatable reads.

SERIALIZABLE: Highest isolation; heavy locking.

SNAPSHOT: Uses row versioning; avoids shared locks.

Quick Summary: Read Uncommitted: sees dirty (uncommitted) data — fastest but risky. Read Committed: default, only reads committed data but allows non-repeatable reads. Repeatable Read: locks read rows, prevents changes. Serializable: full isolation, range locks prevent phantom reads. Snapshot: uses row versions instead of locks — concurrent and clean.
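Isolation levels are set per session. A sketch using a hypothetical dbo.Accounts table:

```sql
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;  -- shared lock held until commit
-- a second read here is guaranteed to return the same value
COMMIT;

-- SNAPSHOT must first be enabled at the database level:
-- ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;  -- readers use row versions, not locks
```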
Q64:

Explain Pessimistic vs Optimistic Concurrency in SQL Server.

Mid

Answer

Pessimistic: Uses locks to avoid conflicts; good for heavy-write systems.

Optimistic: Uses row versioning; detects conflicts at commit.

Quick Summary: Pessimistic: lock the data before you use it — nobody else can change it until you're done. Low concurrency but zero conflict risk. Optimistic: don't lock on read; check at commit time if data changed. Higher concurrency but requires retry logic on conflict. SQL Server's Snapshot Isolation is an optimistic approach.
Q65:

What is Deadlock? How does SQL Server detect and resolve it?

Mid

Answer

A deadlock occurs when sessions wait on each other in a cycle that can never resolve on its own. SQL Server's lock monitor checks for deadlocks every 5 seconds (more frequently once one is found), selects the cheapest transaction as the victim, and rolls it back with error 1205.

Prevention includes using consistent resource order, short transactions, and proper indexes.

Quick Summary: A deadlock forms when transaction A holds lock 1 and wants lock 2, while transaction B holds lock 2 and wants lock 1. Neither can proceed. SQL Server's deadlock monitor runs every 5 seconds, detects the cycle, picks the cheapest victim, rolls it back, and lets the other complete. Victims get error 1205.
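Because any transaction can be chosen as the victim, callers should retry on error 1205. A sketch of a server-side retry loop against a hypothetical dbo.Accounts table:

```sql
DECLARE @retries int = 0;
WHILE @retries < 3
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
        UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;
        COMMIT;
        BREAK;  -- success, leave the loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK;
        IF ERROR_NUMBER() = 1205 AND @retries < 2
            SET @retries += 1;   -- deadlock victim: try again
        ELSE
            THROW;               -- any other error, or retries exhausted
    END CATCH;
END;
```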
Q66:

Explain Lock Types (Shared, Update, Exclusive, Intent).

Mid

Answer

Shared (S): For reading.

Exclusive (X): For writing.

Update (U): Prevents deadlocks during read-to-write transitions.

Intent Locks: Used to manage lock hierarchy efficiently.

Quick Summary: Shared (S): read lock, multiple transactions can hold simultaneously. Update (U): intent to update — prevents two transactions from upgrading to exclusive at the same time. Exclusive (X): write lock, blocks all other access. Intent locks (IS, IX, SIX): signal at a higher level (table) that lower-level locks exist.
Q67:

What are Lock Waits and Blocking Chains? How do you debug them?

Mid

Answer

Blocking occurs when a session holds a lock required by another. Debugging tools include:

  • sp_whoisactive
  • sys.dm_exec_requests
  • Extended events
  • Activity Monitor
Quick Summary: Lock waits happen when one transaction needs a lock already held by another. A blocking chain is multiple transactions queued behind the head blocker. Debug using sys.dm_exec_requests, sys.dm_os_waiting_tasks, or the blocking report in Query Store. The head blocker is always the root cause — fix that transaction first.
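A minimal blocking-chain query using the DMVs mentioned above; walk blocking_session_id upward until it is 0 to find the head blocker:

```sql
-- Who is blocked, by whom, and what they are running
SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time,
    t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```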
Q68:

Explain TempDB usage in detail.

Mid

Answer

TempDB stores temp tables, table variables, hash join work tables, version store, spills, and cursor data. Best practices include multiple equal-sized files and SSD storage.

Quick Summary: TempDB stores: temp tables (#temp), table variables (@table), internal work tables for sorts and hash joins, row version store for snapshot isolation, DBCC operations, and Service Broker. It's recreated fresh every SQL Server restart. All user sessions share it — heavy TempDB use is a common performance bottleneck.
Q69:

Explain the Write-Ahead Logging (WAL) mechanism.

Mid

Answer

Before a modified data page can be flushed to disk, SQL Server hardens the corresponding log record to the transaction log first. This ordering guarantees durability and makes crash recovery possible.

Quick Summary: WAL guarantees every change is written to the transaction log (WAL) before it's written to data pages. On crash, SQL Server replays committed log records (redo) and undoes uncommitted ones (undo). This makes the database consistent after a crash without data loss. Durability in ACID depends entirely on WAL.
Q70:

Describe Transaction Log Architecture. Why does the log grow?

Mid

Answer

The log consists of Virtual Log Files (VLFs). Log grows due to long-running transactions, no log backups, replication delays, or index rebuilds inside transactions.

Quick Summary: The transaction log records every change in sequence — each log record has an LSN (Log Sequence Number). The log grows when: long-running transactions hold the log open, log backups aren't taken (Full recovery model), or replication hasn't consumed the log. Regular log backups keep it manageable.
Q71:

Explain CHECKPOINT and why it is important.

Mid

Answer

CHECKPOINT flushes dirty pages to disk to reduce crash recovery time and manage buffer pressure.

Quick Summary: CHECKPOINT flushes dirty pages (modified but not yet written) from the buffer pool to disk. It reduces crash recovery time — SQL Server only needs to replay changes since the last checkpoint instead of replaying the whole log. Automatic checkpoints run periodically; you can also force one manually.
Q72:

What is the difference between Full, Differential, and Log Backups?

Mid

Answer

Full: Entire database.

Differential: Changes since last full.

Log: Captures all log records since last log backup.

Quick Summary: Full backup: entire database, self-contained restore point. Differential backup: only pages changed since the last full backup — faster than full, smaller file. Log backup: transaction log since last log backup — enables point-in-time recovery. Together they form a restore chain: full + diff + logs.
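A typical schedule combining all three, sketched with placeholder file paths and a hypothetical database MyDb:

```sql
BACKUP DATABASE MyDb TO DISK = N'D:\bak\MyDb_full.bak';                   -- e.g. nightly
BACKUP DATABASE MyDb TO DISK = N'D:\bak\MyDb_diff.bak' WITH DIFFERENTIAL; -- e.g. every 6 hours
BACKUP LOG MyDb TO DISK = N'D:\bak\MyDb_log.trn';                         -- e.g. every 15 minutes
```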
Q73:

Explain Recovery Models (Simple, Full, Bulk-Logged).

Mid

Answer

Simple: Auto-truncates log; no PIT recovery.

Full: Requires log backups; supports PIT recovery.

Bulk-Logged: Minimally logs bulk operations (BULK INSERT, SELECT INTO, index rebuilds); everything else is fully logged.

Quick Summary: Simple: log is truncated at each checkpoint — no point-in-time recovery, just restore to last full or diff backup. Full: log is kept until backed up — allows point-in-time restore, required for AlwaysOn. Bulk-Logged: like Full but bulk operations are minimally logged — faster loads but can't do point-in-time around them.
Q74:

What is Tail Log Backup?

Mid

Answer

A backup taken before restoring a damaged DB to prevent data loss. Captures last active log records.

Quick Summary: A tail-log backup captures the end of the transaction log after a failure but before restore begins. Without it, you lose all transactions between the last log backup and the crash. It's the last piece of the restore chain. Required when recovering a Full-model database to the point of failure.
Q75:

Explain the Restore Sequence in SQL Server.

Mid

Answer

Restore order: Full → Differential → Logs. Use NORECOVERY until the final restore, then RECOVERY.

Quick Summary: Restore sequence: 1) Restore the full backup (NORECOVERY), 2) Apply each differential/log backup in order (NORECOVERY), 3) Apply final log backup with RECOVERY to bring the database online. NORECOVERY keeps the database in restoring state so you can apply more backups; RECOVERY completes the restore.
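The full sequence, including the tail-log backup from the previous question, looks like this. A sketch with placeholder paths and a hypothetical database MyDb:

```sql
-- Capture the tail of the log before touching the damaged database
BACKUP LOG MyDb TO DISK = N'D:\bak\MyDb_tail.trn' WITH NO_TRUNCATE, NORECOVERY;

RESTORE DATABASE MyDb FROM DISK = N'D:\bak\MyDb_full.bak' WITH NORECOVERY;
RESTORE DATABASE MyDb FROM DISK = N'D:\bak\MyDb_diff.bak' WITH NORECOVERY;
RESTORE LOG MyDb FROM DISK = N'D:\bak\MyDb_log.trn'  WITH NORECOVERY;
RESTORE LOG MyDb FROM DISK = N'D:\bak\MyDb_tail.trn' WITH RECOVERY;  -- brings the DB online
```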
Q76:

What is AlwaysOn Availability Groups? Explain components.

Mid

Answer

AG consists of primary replica, secondary replicas, listener, and modes (sync/async). Provides HA, DR, and readable secondaries.

Quick Summary: AlwaysOn AG replicates databases from a primary replica to one or more secondary replicas using log shipping internally. Supports synchronous (zero data loss, needed for automatic failover) and asynchronous (lower latency for distant replicas). Secondaries can serve reads, backups, and reporting to offload the primary.
Q77:

Explain Log Shipping.

Mid

Answer

Log Shipping copies log backups to a secondary server for restore. Simple, reliable, but no automatic failover.

Quick Summary: Log shipping automatically copies and restores transaction log backups from a primary to one or more secondary servers. Simple and cheap — just a SQL Agent job. Secondaries are in restoring state (read-only via STANDBY). No automatic failover; manual intervention required. Predecessor to AlwaysOn AG.
Q78:

Explain Replication Types (Snapshot, Transactional, Merge).

Mid

Answer

Snapshot: Full copy; simple.

Transactional: Near real-time; best for reporting.

Merge: Bidirectional sync for disconnected systems.

Quick Summary: Snapshot replication copies a full snapshot of data to subscribers periodically — good for infrequently changed data. Transactional replication streams individual changes in near real-time — good for read scale-out. Merge replication allows both sides to make changes and merges them — complex, used for offline scenarios.
Q79:

What is Database Mirroring? Why is it deprecated?

Mid

Answer

Mirroring uses principal + mirror + witness for automatic failover. Deprecated in favor of Availability Groups.

Quick Summary: Database Mirroring sends transaction log blocks from principal to mirror in real-time. High-Safety mode (synchronous) = automatic failover with a witness. High-Performance mode (asynchronous) = potential data loss. Deprecated since SQL Server 2012 in favor of AlwaysOn AGs, which are more flexible and feature-rich.
Q80:

What are Stored Procedures and why are they preferred over sending raw SQL from applications?

Senior

Answer

Stored procedures are precompiled program units that live inside SQL Server. Instead of sending raw SQL text for every request, the application sends only the procedure name and parameters. SQL Server then executes pre-validated logic using a cached execution plan.

Key advantages over raw ad-hoc SQL:

  • Reduced network traffic: Only parameter values are sent, not large query strings.
  • Plan reuse: SQL Server can cache and reuse execution plans for procedures, reducing compilation overhead and stabilizing performance.
  • Centralized business logic: Data-related rules (validation, transformations, audit operations) are centralized and versioned at the database level, simplifying deployments and multi-application integration.
  • Security and least-privilege: Applications can be granted EXECUTE rights on procedures instead of direct table access, improving security boundaries and auditability.
  • Reduced SQL injection risk: The SQL logic is fixed inside the procedure; parameters are bound, greatly decreasing injection surfaces compared to concatenated SQL strings.
Quick Summary: Stored procedures send a single call over the network instead of many SQL strings — less round trips. Plans are compiled and cached on first run — subsequent calls skip parsing and optimization. Security is cleaner: grant EXECUTE on the procedure, not SELECT/INSERT on the tables. Logic stays in the database layer.
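A minimal example of the pattern, assuming hypothetical dbo.Orders table and AppRole database role:

```sql
CREATE PROCEDURE dbo.GetCustomerOrders
    @CustomerId int
AS
BEGIN
    SET NOCOUNT ON;
    SELECT OrderId, OrderDate, TotalAmount
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId;   -- parameter is bound, never concatenated
END;
GO

-- Least privilege: the app can execute the proc but has no direct table access
GRANT EXECUTE ON dbo.GetCustomerOrders TO AppRole;
```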
Q81:

Explain the difference between Stored Procedures and Functions in SQL Server.

Senior

Answer

Stored procedures and functions both encapsulate reusable logic, but they serve different purposes and behave differently inside SQL Server.

Stored procedures:

  • Primarily designed to perform actions: DML operations, transaction control, administrative tasks.
  • Can return multiple result sets and output parameters.
  • Cannot be directly used in SELECT, WHERE, or JOIN clauses.
  • Commonly used as API endpoints from the application layer.

Functions:

  • Intended for computation and value-returning logic.
  • Scalar functions return a single value; table-valued functions return a rowset.
  • Can be used inside SELECT, WHERE, and JOIN clauses like expressions or tables.
  • Have tighter restrictions: no explicit transaction control; most cannot perform data-modifying side effects.

In short: procedures orchestrate operations, while functions compute values that integrate with queries.

Quick Summary: Stored procedures can modify data, use transactions, output parameters, and don't have to return a result set. Functions must return a value, can't modify data (except table variables), and can be called inline in a SELECT. Inline TVFs act like parameterized views — they compose well; multi-statement TVFs often don't.
Q82:

Explain Scalar Functions vs Inline Table-Valued Functions vs Multi-Statement Table-Valued Functions.

Senior

Answer

SQL Server supports several function types, each with distinct performance and usage characteristics.

Scalar functions:

  • Return a single scalar value (e.g., INT, VARCHAR).
  • Evaluated per row when used in queries, often resulting in RBAR (Row-By-Agonizing-Row) execution.
  • Act as black boxes to the optimizer, frequently preventing parallelism and leading to severe performance issues.

Inline table-valued functions (iTVFs):

  • Return a table defined by a single RETURN (SELECT ...) statement.
  • Logically similar to parameterized views.
  • Fully inlined into the calling query, enabling the optimizer to generate efficient set-based plans with good cardinality estimates.
  • Usually the best-performing function type for set-based logic.

Multi-statement table-valued functions (mTVFs):

  • Declare an internal table variable and populate it across multiple statements.
  • By default, SQL Server assumes a fixed row estimate (historically 1 row), often leading to poor plans.
  • Limited statistics and cardinality information, frequently causing spills and suboptimal joins.

For performance-sensitive code, prefer inline TVFs over scalar or multi-statement TVFs whenever possible.

Quick Summary: Scalar: returns one value, called per row — can severely degrade performance in queries. Inline TVF: returns a table, SQL Server can inline it into the calling query and optimize it like a view — very efficient. Multi-statement TVF: builds a table row by row — opaque to the optimizer, often slow on large data.
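The syntactic difference is small but the performance difference is large. A sketch with hypothetical names:

```sql
-- Scalar UDF: invoked once per row, opaque to the optimizer
CREATE FUNCTION dbo.NetPrice (@price money, @discount decimal(4,2))
RETURNS money
AS
BEGIN
    RETURN @price * (1 - @discount);
END;
GO

-- Inline TVF: a single RETURN (SELECT ...), expanded into the calling query
CREATE FUNCTION dbo.OrdersForCustomer (@CustomerId int)
RETURNS TABLE
AS
RETURN (SELECT OrderId, OrderDate FROM dbo.Orders WHERE CustomerId = @CustomerId);
GO

SELECT * FROM dbo.OrdersForCustomer(42);  -- composes cleanly in joins and CROSS APPLY
```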
Q83:

Why can Scalar Functions severely degrade performance?

Senior

Answer

Scalar functions often look clean and reusable but can be hidden performance killers in production workloads.

Reasons they degrade performance:

  • They are executed once per row in a result set, turning set-based operations into row-by-row computation.
  • The optimizer cannot see inside the function body, treating it as a black box. This prevents many optimizations and accurate cardinality estimates.
  • In many SQL Server versions, the presence of scalar functions in the SELECT or WHERE list can disable parallelism, forcing single-threaded plans for otherwise parallelizable queries.
  • They frequently increase CPU usage and query duration dramatically on large datasets.

Mitigations include:

  • Refactoring to inline TVFs or pure set-based SQL.
  • Replacing scalar functions with computed columns (possibly indexed) when appropriate.
  • Inlining logic into the main query where feasible.
Quick Summary: A scalar UDF called in a WHERE or SELECT clause runs once per row — on a million-row table, that's a million individual function calls. The optimizer often can't see inside and treats it as a black box, blocking parallelism and estimates. Replace with inline TVFs or inlined expressions whenever possible for big tables.
Q84:

What are Views? Explain their benefits and limitations.

Senior

Answer

A view is a named, virtual table defined by a SELECT query. It does not store data itself (unless indexed) but presents a reusable query abstraction.

Benefits:

  • Abstraction and simplification: Hides complex joins and expressions behind a simple interface.
  • Security: Exposes only selected columns/rows while hiding underlying schemas and tables.
  • Reuse: Centralizes business logic or filter logic so that many queries can benefit from a single definition.
  • Schema evolution: Application code can query the view even if underlying tables change, as long as the view's contract is preserved.

Limitations:

  • Views can become stacked (views on views), leading to overly complex, hard-to-tune execution plans.
  • Not all views are updatable; complex joins, aggregates, and DISTINCT can prevent direct DML.
  • They do not inherently improve performance unless combined with indexed views or used to encapsulate optimal query patterns.
Quick Summary: Views simplify complex queries by saving them as named objects. They enforce consistent access patterns and can restrict column-level access. Limitations: non-indexed views don't store data (they run on every access), can't use ORDER BY without TOP, and add a layer of indirection that occasionally confuses the optimizer.
Q85:

What are Indexed Views and when should you use them?

Senior

Answer

Indexed views materialize the result of a view and maintain it on disk like a physical table. They are backed by a clustered index and optionally additional nonclustered indexes.

When to use:

  • Heavy, repetitive aggregations or joins over large datasets (e.g., reporting queries over transactional tables).
  • Scenarios where read performance is critical and data changes are relatively moderate.
  • To precompute expensive expressions and reduce CPU usage under analytic workloads.

Trade-offs:

  • Every insert/update/delete on the underlying tables must update the indexed view, increasing DML overhead.
  • Strict requirements apply (e.g., schema binding, deterministic expressions).
  • Can complicate troubleshooting if developers are unaware of their presence.

They are powerful for read-intensive workloads but should be used selectively and measured carefully in write-heavy systems.

Quick Summary: Indexed views (materialized views) physically store the query result on disk with an index. Reads are instant — no recomputation. But writes to the base tables must maintain the view, adding overhead. Best for aggregation queries that run frequently on slowly-changing data. Requires SCHEMABINDING and strict query form.
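A sketch of the required form, assuming a hypothetical dbo.Orders table with a non-nullable TotalAmount column:

```sql
CREATE VIEW dbo.vCustomerOrderTotals
WITH SCHEMABINDING                     -- mandatory for indexed views
AS
SELECT
    CustomerId,
    COUNT_BIG(*)     AS OrderCount,    -- COUNT_BIG(*) is required with GROUP BY
    SUM(TotalAmount) AS TotalSpent
FROM dbo.Orders
GROUP BY CustomerId;
GO

-- The unique clustered index is what actually materializes the view
CREATE UNIQUE CLUSTERED INDEX IX_vCustomerOrderTotals
    ON dbo.vCustomerOrderTotals (CustomerId);
```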
Q86:

What are Triggers? Explain AFTER vs INSTEAD OF Triggers.

Senior

Answer

Triggers are special stored procedures that automatically execute in response to DML events (INSERT, UPDATE, DELETE) or certain DDL operations.

AFTER triggers:

  • Fire after the base operation logically succeeds but before the transaction commits.
  • Often used for auditing, logging, or enforcing complex constraints that span multiple tables.
  • Run within the transaction, so failures can cause rollbacks and increase latency.

INSTEAD OF triggers:

  • Fire instead of the original DML operation.
  • Commonly used on views to simulate complex update logic or route changes to multiple underlying tables.
  • Give full control over how changes are applied.

Both types must be designed carefully to avoid recursion, hidden performance issues, and unexpected side effects.

Quick Summary: AFTER trigger fires after the DML completes — used for auditing, cascading changes. INSTEAD OF trigger replaces the operation entirely — used on views that can't be directly modified, or to intercept and transform inserts/updates. Both have access to inserted and deleted virtual tables showing what changed.
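A sketch of an AFTER trigger using the inserted and deleted tables, assuming hypothetical dbo.Orders and dbo.OrdersAudit tables:

```sql
CREATE TRIGGER trg_Orders_Audit
ON dbo.Orders
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.OrdersAudit (OrderId, OldAmount, NewAmount, ChangedAt)
    SELECT d.OrderId, d.TotalAmount, i.TotalAmount, SYSUTCDATETIME()
    FROM inserted AS i
    JOIN deleted  AS d ON d.OrderId = i.OrderId;  -- set-based: handles multi-row updates
END;
```

Note the join between inserted and deleted: trigger logic must always assume the DML touched multiple rows, not just one.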
Q87:

Why are Triggers generally not preferred in large systems?

Senior

Answer

Triggers are powerful but can introduce hidden complexity and performance problems in large systems.

Key concerns:

  • Hidden behavior: Business logic executes implicitly on DML, making the system harder to reason about. Developers may not realize why certain actions occur.
  • Chained effects: Triggers can call other triggers, causing cascading side effects that are difficult to debug and test.
  • Performance impact: Trigger logic runs inside the transaction. Long-running triggers increase lock durations and contention.
  • Maintainability: Logic spread across triggers and procedures leads to fragmented business rules and operational risk.

For these reasons, many teams prefer stored procedures, constraints, and explicit application-level logic over heavy trigger usage, reserving triggers for focused, unavoidable use cases (e.g., auditing).

Quick Summary: Triggers are invisible — they fire silently and can cause unexpected side effects. They slow down every DML operation on the table, even when you don't need the trigger behavior. They complicate debugging and make data changes non-obvious. Modern systems prefer explicit application logic or outbox patterns instead.
Q88:

Explain Window Functions and why they are essential in modern SQL.

Senior

Answer

Window functions (e.g., ROW_NUMBER(), RANK(), SUM() OVER (...)) allow calculations across sets of rows without collapsing them into a single result like GROUP BY does.

Why they are essential:

  • Enable ranking, running totals, moving averages, percentiles, and gap/density analysis in a single pass.
  • Reduce the need for self-joins and correlated subqueries, often resulting in cleaner and faster plans.
  • Can be combined with PARTITION BY and ORDER BY to support rich analytical queries directly in OLTP or reporting databases.
  • Help keep logic set-based and push computation into the database layer where it is highly optimized.
Quick Summary: Window functions operate across a set of rows related to the current row without collapsing them into groups. ROW_NUMBER, RANK, DENSE_RANK for ranking. LAG/LEAD for accessing neighboring rows. SUM/AVG OVER() for running totals. They replace complex self-joins and correlated subqueries with clean, readable SQL.
Q89:

Explain PARTITION BY and ORDER BY within Window Functions.

Senior

Answer

Window functions operate over a logical window of rows, defined by PARTITION BY and ORDER BY clauses in the OVER() expression.

PARTITION BY:

  • Divides the result set into groups (partitions) for independent calculations, similar to grouping in analytics.
  • Example: SUM(SalesAmount) OVER (PARTITION BY CustomerId) gives total sales per customer.

ORDER BY:

  • Defines the sequence of rows within each partition.
  • Required for ranking and running calculations like ROW_NUMBER(), cumulative sums, and moving averages.

In summary, PARTITION BY defines the scope and ORDER BY defines the sequence of the window.

Quick Summary: PARTITION BY divides rows into groups for the window function calculation — like GROUP BY but without collapsing rows. ORDER BY inside the window defines the order of rows within each partition for ranking or running calculations. Together: ROW_NUMBER() OVER (PARTITION BY Dept ORDER BY Salary) ranks employees within each department.
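Both clauses together, in one query against a hypothetical dbo.Employees table:

```sql
-- Rank within each department and compute a running total in a single pass
SELECT
    Dept,
    EmployeeName,
    Salary,
    ROW_NUMBER() OVER (PARTITION BY Dept ORDER BY Salary DESC) AS RankInDept,
    SUM(Salary)  OVER (PARTITION BY Dept ORDER BY Salary DESC
                       ROWS UNBOUNDED PRECEDING)               AS RunningTotal
FROM dbo.Employees;
```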
Q90:

What is Table Partitioning? How does SQL Server implement it?

Senior

Answer

Table partitioning horizontally splits a large table into smaller, manageable partitions based on a partition key (often date or range-based IDs). Logically it remains a single table, but physically data is separated.

SQL Server implementation:

  • Partition function: Defines boundary points for key ranges.
  • Partition scheme: Maps partitions to one or more filegroups.
  • The table or index is created ON the partition scheme, distributing rows across partitions based on the key.

Benefits:

  • Maintenance operations (index rebuilds, statistics, archiving) can be targeted at specific partitions.
  • Supports sliding window load/archive patterns via partition SWITCH operations.
  • Can improve query performance via partition elimination, especially for time-sliced workloads.

Partitioning is not automatically faster; it must align with query predicates and maintenance strategies to be effective.

Quick Summary: Table partitioning divides a large table into multiple physical filegroups based on a partition function (range of values) and partition scheme (maps ranges to filegroups). SQL Server can eliminate partitions that don't match the query filter. Makes archiving simple: switch old partitions out without touching other data.
Q91:

Explain Partition Elimination with an example scenario.

Senior

Answer

Partition elimination occurs when SQL Server restricts I/O to only those partitions that might contain relevant rows, instead of scanning all partitions.

Example: A table partitioned by OrderDate by month. A query filtered on OrderDate BETWEEN '2024-01-01' AND '2024-01-31' can read only the January partition if:

  • The filter is directly on the partition key.
  • There are no non-SARGable expressions on the partition column.
  • The data types and collation match exactly.

If functions like CONVERT() or different data types are used on the partition column, partition elimination may fail and all partitions may be scanned, losing the performance benefit.

Quick Summary: Partition elimination is when SQL Server skips partitions that can't contain matching rows. A query filtering WHERE OrderDate BETWEEN '2024-01-01' AND '2024-03-31' on a table partitioned by month only scans Q1 partitions and ignores the rest. This dramatically reduces I/O on large, date-partitioned tables.
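The scenario above can be sketched end to end. Object names are hypothetical, and every partition maps to PRIMARY for brevity (real deployments usually spread them across filegroups):

```sql
-- Monthly range partitioning
CREATE PARTITION FUNCTION pfOrderDate (date)
    AS RANGE RIGHT FOR VALUES ('2024-01-01', '2024-02-01', '2024-03-01');

CREATE PARTITION SCHEME psOrderDate
    AS PARTITION pfOrderDate ALL TO ([PRIMARY]);

CREATE TABLE dbo.Orders
(
    OrderId   bigint NOT NULL,
    OrderDate date   NOT NULL,
    Amount    money  NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY (OrderDate, OrderId)  -- key includes partition column
) ON psOrderDate (OrderDate);

-- Direct, SARGable filter on the partition key: only the January partition is read
SELECT SUM(Amount)
FROM dbo.Orders
WHERE OrderDate >= '2024-01-01' AND OrderDate < '2024-02-01';
```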
Q92:

What are the typical causes of poor execution plan performance?

Senior

Answer

Poor execution plans are usually symptoms of deeper issues in schema design, statistics, or query patterns.

Common causes:

  • Missing or inappropriate indexes: Forcing table scans or expensive lookups.
  • Stale or missing statistics: Leading to incorrect row estimates and wrong join strategies.
  • Parameter sniffing: Plan optimized for one parameter value, reused for others with different data distributions.
  • Scalar functions and multi-statement TVFs: Preventing optimization and parallelism.
  • Complex views over views: Obscure actual data access and create bloated plans.
  • Implicit conversions: Causing non-SARGable predicates or index misses.
  • RBAR patterns (cursors, loops): Neglecting set-based approaches.

Effective tuning often involves query simplification, better indexing, and statistics maintenance rather than just tweaking server settings.

Quick Summary: Common causes of poor plans: stale statistics (wrong row estimates), parameter sniffing (plan built for atypical values), missing indexes (forcing table scans), implicit conversions (can't use indexes), SARGability issues (function on column), and lock escalation or blocking distorting plan behavior.
Q93:

What is Parameter Sniffing and how do you handle it?

Senior

Answer

Parameter sniffing occurs when SQL Server compiles a plan using the initial parameter values it sees, then reuses that plan for subsequent executions. If data distribution is skewed, one plan may not fit all parameter scenarios.

Symptoms: Some calls are lightning fast, others very slow, using the same procedure and query shape.

Handling strategies:

  • Use OPTION (RECOMPILE) for highly skewed queries where compilation cost is acceptable.
  • Use OPTIMIZE FOR UNKNOWN or OPTIMIZE FOR (@param = ...) hints to choose more robust plans.
  • Capture parameters in local variables inside the procedure to discourage sniffing and produce more average plans.
  • Split logic into separate procedures for different parameter ranges if patterns are distinct.
  • Use Query Store to force stable plans when appropriate.
Quick Summary: Parameter sniffing caches a plan based on first-call parameters. If subsequent calls use different values, the cached plan may be inefficient. Fixes: OPTION(RECOMPILE) forces fresh compilation each time; OPTIMIZE FOR UNKNOWN makes the optimizer ignore sniffed values; local variables break plan reuse entirely.
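The two query-hint mitigations side by side, sketched in a hypothetical procedure (a real procedure would use one or the other, not both):

```sql
CREATE PROCEDURE dbo.GetOrdersByStatus @Status varchar(20)
AS
BEGIN
    -- 1) Fresh plan on every call; pays the compile cost each time
    SELECT OrderId, OrderDate FROM dbo.Orders WHERE Status = @Status
    OPTION (RECOMPILE);

    -- 2) Optimize for average density instead of the sniffed value
    SELECT OrderId, OrderDate FROM dbo.Orders WHERE Status = @Status
    OPTION (OPTIMIZE FOR UNKNOWN);
END;
```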
Q94:

Explain Execution Plan Caching and Reuse.

Senior

Answer

SQL Server uses a plan cache to store compiled execution plans so that subsequent executions of the same (or parameterized) queries can skip the compilation phase.

Benefits:

  • Reduces CPU overhead from repeated compilations.
  • Improves response time for frequently executed queries and stored procedures.

Challenges:

  • Poorly parameterized or ad-hoc queries can cause plan cache bloat, with many single-use plans.
  • Parameter sniffing issues stem from plan reuse across different parameter values.
  • Schema changes or statistics updates can invalidate plans, causing recompilation spikes.

Best practice is to use parameterized queries, monitor plan cache size and reuse patterns, and leverage Query Store to manage problematic plans.

Quick Summary: SQL Server caches compiled execution plans in the plan cache keyed by the query text hash. Matching queries reuse the plan without reparsing and reoptimizing. Plan reuse saves CPU but can backfire with parameter sniffing. sp_executesql with parameters enables safe plan reuse better than string concatenation.
Q95:

What is Schema Binding and why is it important?

Senior

Answer

Schema binding associates a view or function tightly with the underlying schema so that the referenced objects cannot be modified in incompatible ways without first changing the bound object.

Importance:

  • Required for indexed views and some computed columns, ensuring structural stability.
  • Prevents accidental changes (e.g., dropping or altering columns) that would silently break dependent logic.
  • Helps enforce contracts between application code, views, and functions.

In high-governance environments, schema binding is a tool to ensure that refactoring is deliberate and fully impact-assessed.

Quick Summary: SCHEMABINDING ties a function or view to the objects it references — you can't drop or alter those objects without first modifying the dependent object. Required for indexed views. It also enables the optimizer to inline simple functions more aggressively and prevents accidental schema changes from breaking dependencies.
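A minimal sketch of a schema-bound (and then indexed) view, assuming a hypothetical dbo.Orders table whose Total column is NOT NULL (indexed views require COUNT_BIG(*) and disallow SUM over nullable columns):

```sql
-- Schema-bound views must use two-part names and may not use SELECT *.
CREATE VIEW dbo.vw_OrderTotals
WITH SCHEMABINDING
AS
SELECT CustomerId,
       COUNT_BIG(*) AS OrderCount,
       SUM(Total)   AS TotalAmount
FROM dbo.Orders
GROUP BY CustomerId;
GO

-- SCHEMABINDING is a prerequisite for materializing the view:
CREATE UNIQUE CLUSTERED INDEX IX_vw_OrderTotals
    ON dbo.vw_OrderTotals (CustomerId);
GO

-- This now fails until the view is dropped or altered first:
-- ALTER TABLE dbo.Orders DROP COLUMN Total;
```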
Q96:

Explain the concept of set-based operations vs row-by-row operations.

Senior

Answer

SQL is fundamentally a set-based language: it is designed to operate on collections of rows at once, not one row at a time.

Set-based operations:

  • Leverage the optimizer to choose efficient algorithms (hash joins, merges, batch operations).
  • Use fewer logical and physical reads for bulk operations.
  • Scale more gracefully as data volume grows.

Row-by-row (RBAR) operations:

  • Implement logic via cursors or loops, processing one row at a time.
  • Usually lead to excessive context switching, locking overhead, and long runtimes.
  • Only justified for very complex, inherently procedural business rules.

Senior-level SQL design focuses on transforming requirements into set-based patterns whenever possible, often with window functions, joins, and properly designed queries.

Quick Summary: Set-based operations (UPDATE, INSERT, DELETE on sets of rows) leverage SQL Server's ability to process many rows in one pass using optimized algorithms and parallelism. Row-by-row processing (cursors, WHILE loops) forces sequential iteration — slow, can't parallelize, and often orders of magnitude slower on large data.
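The contrast can be made concrete with a hypothetical status update (schema names are illustrative):

```sql
-- RBAR: one UPDATE per row driven by a cursor — slow on large tables.
DECLARE @Id INT;
DECLARE c CURSOR LOCAL FAST_FORWARD FOR
    SELECT OrderId FROM dbo.Orders WHERE Status = 'Pending';
OPEN c;
FETCH NEXT FROM c INTO @Id;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderId = @Id;
    FETCH NEXT FROM c INTO @Id;
END;
CLOSE c;
DEALLOCATE c;

-- Set-based: the same work in one statement the optimizer can plan,
-- batch, and parallelize.
UPDATE dbo.Orders
SET Status = 'Shipped'
WHERE Status = 'Pending';
```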
Q97:

What are Cursors and why are they usually discouraged?

Senior

Answer

Cursors are database objects that allow row-by-row traversal of a result set, similar to iterators in procedural languages.

Reasons they are discouraged:

  • They process data one row at a time, leading to poor performance on large sets.
  • They often hold locks for a long time, reducing concurrency.
  • They introduce complex, procedural logic that is harder to maintain and test.
  • They typically have higher memory and tempdb overhead compared to set-based alternatives.

Cursors should be a last resort, used only when set-based solutions are impractical or impossible. In many cases, window functions, MERGE statements, or carefully written set-based updates can replace cursor logic.

Quick Summary: Cursors iterate over result sets one row at a time. For large datasets this is extremely slow — it bypasses SQL Server's set-based engine. Most cursor logic can be rewritten with set-based updates, window functions, or CTEs. Use cursors only when row-by-row processing is genuinely unavoidable, like calling a stored procedure per row.
Q98:

Explain the difference between Physical Reads, Logical Reads, and Page Life Expectancy.

Senior

Answer

These metrics help diagnose I/O and memory performance in SQL Server.

Logical reads: The number of 8KB pages read from the buffer cache (memory). High logical reads can indicate inefficient queries or missing indexes.

Physical reads: Pages read from disk because they were not found in the buffer cache. Physical I/O is orders of magnitude slower than memory access.

Page Life Expectancy (PLE): An indicator of how long pages stay in the buffer cache before being evicted. A consistently low PLE suggests memory pressure or inefficient queries repeatedly flushing the cache.

Senior engineers use these metrics together to determine whether to focus on query tuning, indexing, or adding memory/storage capacity.

Quick Summary: Physical Read: SQL Server fetched the page from disk (slow). Logical Read: page was found in the buffer pool (fast, in-memory). Logical reads measure how much work a query did in memory — lower is better. Page Life Expectancy (PLE) shows how long pages stay in the buffer pool — low PLE means memory pressure and frequent disk reads.
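Logical and physical reads for a single query can be observed with SET STATISTICS IO (the table name is illustrative):

```sql
SET STATISTICS IO ON;

SELECT COUNT(*)
FROM dbo.Orders
WHERE OrderDate >= '2024-01-01';

SET STATISTICS IO OFF;
-- The Messages tab then reports, per table, a line of the form:
-- Table 'Orders'. Scan count 1, logical reads N, physical reads M, ...
```

Comparing logical reads before and after a change is a stable way to confirm a tuning improvement, independent of server load.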
Q99:

What is SARGability and why is it critical for performance?

Senior

Answer

SARGability (Search ARGument-ability) describes whether a predicate can efficiently use an index to seek rows instead of scanning.

SARGable predicates:

  • Simple comparisons like Column = @Value, Column >= @Start AND Column <= @End.
  • Allow the optimizer to perform index seeks and range scans.

Non-SARGable patterns:

  • Applying functions to columns: WHERE LOWER(Col) = 'abc'.
  • Expressions on the left side: WHERE Col + 1 = 10.
  • Implicit conversions that change the column's data type.
  • Leading wildcards: LIKE '%abc'.

Ensuring predicates are SARGable is one of the most impactful techniques in query tuning: it allows indexes to be used effectively, minimizing reads and dramatically improving performance.

Quick Summary: SARGable (Search ARGument ABLE) expressions let SQL Server use an index to seek matching rows. Non-SARGable expressions wrap the column in a function or implicit conversion — forcing a scan. SARGable: WHERE LastName = 'Smith'. Non-SARGable: WHERE UPPER(LastName) = 'SMITH' or WHERE CAST(Id AS varchar) = '123'.
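A typical non-SARGable date filter and its SARGable rewrite, assuming an index on OrderDate (table name illustrative):

```sql
-- Non-SARGable: YEAR() wraps the column, so the index on OrderDate
-- cannot be seeked — SQL Server must scan and evaluate every row.
SELECT OrderId
FROM dbo.Orders
WHERE YEAR(OrderDate) = 2024;

-- SARGable rewrite: a half-open range on the raw column lets the
-- optimizer perform an index seek over exactly the matching rows.
SELECT OrderId
FROM dbo.Orders
WHERE OrderDate >= '2024-01-01'
  AND OrderDate <  '2025-01-01';
```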
Q100:

What Is TempDB and Why Is It Critical for SQL Server Performance?

Senior

Answer

TempDB is SQL Server’s global workspace used by all users and internal operations. It is essential because it handles:

  • Sorting operations (ORDER BY, GROUP BY)
  • Hash joins and aggregations
  • Row versioning for snapshot isolation
  • Temporary tables and table variables
  • Intermediate spill data during execution
  • Cursors and internal worktables

If TempDB becomes slow or suffers contention, the entire SQL Server instance slows down. Proper sizing, fast storage, and multiple data files are critical for performance.

Quick Summary: TempDB is a shared system database recreated on SQL Server restart. It holds temp tables, table variables, sort and hash spill buffers, row version store for snapshot isolation, and DBCC intermediate data. All user sessions share it — heavy concurrent use creates allocation contention on specific system pages.
Q101:

What Is TempDB Contention and What Causes It?

Senior

Answer

TempDB contention occurs when multiple threads compete for the same allocation pages (PFS, GAM, SGAM) or metadata structures.

Typical causes include:

  • Too few TempDB data files
  • Heavy use of temp tables/table variables
  • Large sorting or hashing operations
  • High row versioning pressure

Fixes: increase the data file count, keep files equally sized, optimize the workload, and reduce spills.

Quick Summary: TempDB contention typically hits PFS, GAM, and SGAM pages — special allocation pages that every new temp object allocation touches. In high-concurrency workloads, many sessions compete for the same pages. Fix: add multiple TempDB data files (one per CPU core, up to 8), enable trace flag 1118 on older versions, or use SQL Server 2016+, where these allocation improvements are automatic.
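Inspecting and expanding the TempDB file layout might look like this (the file path and sizes are illustrative and should match your storage):

```sql
-- Inspect the current TempDB file layout.
SELECT name, type_desc, size * 8 / 1024 AS size_mb
FROM tempdb.sys.database_files;

-- Add an equally sized data file; repeat until the file count
-- matches the recommendation (one per core, up to 8).
ALTER DATABASE tempdb
ADD FILE (
    NAME = tempdev2,
    FILENAME = 'T:\TempDB\tempdev2.ndf',
    SIZE = 4096MB,
    FILEGROWTH = 512MB
);
```

Keeping all files the same size matters because SQL Server's proportional-fill algorithm only spreads allocations evenly across equally sized files.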
Q102:

What Are Data Pages and Index Pages in SQL Server?

Senior

Answer

SQL Server stores all data in 8 KB pages.

  • Data pages store full table rows.
  • Index pages store B-tree navigation structures and pointers.

Understanding pages explains index fragmentation, logical reads, and physical I/O behavior.

Quick Summary: SQL Server stores data in 8KB pages. Data pages hold actual table rows. Index pages hold B-tree index entries at each level — leaf pages point to actual data rows (or contain them for clustered indexes). Each page belongs to one object and is managed by the buffer pool for caching.
Q103:

What Is the Buffer Pool and How Does SQL Server Use It?

Senior

Answer

The buffer pool is SQL Server’s memory area used for caching data and index pages.

  • Logical reads come from memory.
  • Physical reads occur only when data is missing in cache.
  • High buffer reuse improves performance.

If buffer pool is too small: more disk reads, lower PLE, and slower query execution.

Quick Summary: The buffer pool is SQL Server's main memory cache. When data is needed, SQL Server reads the page from disk into the buffer pool. Future reads hit memory (logical read) instead of disk (physical read). SQL Server uses all available memory for the buffer pool — giving it more RAM directly improves read performance.
Q104:

What Is Page Life Expectancy (PLE) and What Does It Indicate?

Senior

Answer

Page Life Expectancy (PLE) measures how long pages remain in the buffer pool before eviction.

High PLE = good memory health.
Low PLE = memory pressure, excessive physical reads.

Common causes of low PLE: bad queries, missing indexes, large scans, spills.

Quick Summary: PLE measures how long (in seconds) a data page stays in the buffer pool before being pushed out by new pages. A healthy PLE is typically 300+ seconds (5 minutes). Dropping PLE means SQL Server is constantly evicting and re-reading pages — a sign of memory pressure or queries doing large scans.
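The current PLE value can be read directly from the performance-counter DMV:

```sql
-- Page Life Expectancy, in seconds, per buffer node.
SELECT object_name, counter_name, cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';
```

Trend this value over time rather than reacting to a single reading; a sustained drop after a deployment usually points at a new query doing large scans.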
Q105:

How Does the SQL Server Transaction Log Work Internally?

Senior

Answer

The transaction log is an append-only structure that records all modifications. SQL Server uses Write-Ahead Logging (WAL):

  • Log records are written to disk first
  • Data pages update later

This ensures atomicity and durability. Log truncation depends on checkpoints and active transactions.

Quick Summary: Every change is written to the transaction log before hitting data pages (Write-Ahead Logging). The log is sequential — fast to write. At CHECKPOINT, dirty pages flush to disk. On crash, SQL Server reads the log: redo commits that weren't in data files, undo uncommitted transactions. This sequence ensures full recoverability.
Q106:

What Is the Purpose of Checkpoints in SQL Server?

Senior

Answer

Checkpoints flush dirty pages (modified pages) from memory to disk.

Benefits:

  • Shorter crash recovery time
  • Enables transaction log truncation
  • Reduces buffer pool pressure

Quick Summary: CHECKPOINT writes all dirty pages (modified in buffer pool but not yet on disk) to data files. This shortens crash recovery — SQL Server only needs to replay log records after the last checkpoint. Without checkpoints, recovery would require replaying the entire log from the beginning, which could take hours.
Q107:

What Are Dirty Pages and Clean Pages in SQL Server?

Senior

Answer

Dirty pages: modified in buffer pool but not yet persisted to disk.
Clean pages: identical to the disk version.

Checkpoints convert dirty pages to clean pages. Too many dirty pages increase recovery time and degrade performance.

Quick Summary: Dirty pages are buffer pool pages that have been modified but not yet written to disk. Clean pages are in sync with disk. CHECKPOINT flushes dirty pages. Tracking dirty vs. clean pages is how SQL Server knows exactly what needs to be written during a checkpoint and what needs to be redone after a crash.
Q108:

What Are Latches and How Do They Differ From Locks?

Senior

Answer

Latches protect internal memory structures and are non-transactional.

Locks protect logical data consistency and last for transaction duration.

Latch waits = engine pressure.
Lock waits = concurrency or blocking issues.

Quick Summary: Latches are lightweight, short-lived synchronization objects for physical consistency of in-memory pages — they don't appear in deadlock graphs and aren't tracked like locks. Locks are for logical, transaction-level concurrency — they persist for the transaction duration and show up in blocking queries and deadlocks.
Q109:

What Is Lock Escalation and How Does It Impact Performance?

Senior

Answer

Lock escalation converts many row/page locks into a single table lock to reduce overhead.

Impact:

  • Fewer locks = lower memory use
  • But higher blocking risk

Quick Summary: When individual row or page locks hit a threshold (~5,000), SQL Server escalates to a single table lock to reduce lock manager overhead. This improves efficiency but completely blocks other transactions from accessing the table. Problematic in high-concurrency workloads — can cause cascading blocking chains.
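Escalation behavior can be inspected and controlled per table (table name illustrative):

```sql
-- See the current escalation setting (TABLE is the default).
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE name = 'Orders';

-- Pick one of the following on a hot table:

-- Disable escalation entirely (lock memory use can grow instead).
ALTER TABLE dbo.Orders SET (LOCK_ESCALATION = DISABLE);

-- Or let escalation target the partition level on a partitioned table,
-- so one busy partition does not lock the whole table.
ALTER TABLE dbo.Orders SET (LOCK_ESCALATION = AUTO);
```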
Q110:

What Is the Cardinality Estimator and Why Is It Important?

Senior

Answer

The Cardinality Estimator (CE) predicts row counts used to choose join types, memory grants, and access paths.

Poor estimates → poor execution plans.

CE depends on statistics, data distribution, and query structure.

Quick Summary: The cardinality estimator predicts how many rows each plan step will produce based on statistics histograms. Wrong predictions lead to wrong join types, bad memory grants, and poor index choices. SQL Server 2014 introduced a new CE (model 120); older compatibility levels use the legacy model. Statistics quality is the root input.
Q111:

What Are Statistics in SQL Server and How Do They Affect Performance?

Senior

Answer

Statistics describe data distribution that SQL Server uses to estimate row counts.

Bad or outdated stats → misestimation → poor plans → slow queries.

Quick Summary: Statistics are metadata objects containing histograms of column value distribution. SQL Server uses them to estimate selectivity — how many rows a filter returns. Good statistics → accurate estimates → efficient plans. Stale statistics → wrong estimates → scans instead of seeks, wrong joins, wasted memory grants.
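Typical statistics maintenance and inspection, with illustrative table and index names:

```sql
-- Inspect the histogram behind a statistics object.
DBCC SHOW_STATISTICS ('dbo.Orders', IX_Orders_CustomerId);

-- Refresh statistics after a large data load.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Find the statistics with the most modifications since their last update.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name                   AS stats_name,
       sp.last_updated,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
ORDER BY sp.modification_counter DESC;
```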
Q112:

What Are Memory Grants and Why Do They Matter?

Senior

Answer

SQL Server allocates memory for sorting, hashing, and aggregations.

Incorrect grants:

  • Too small → spills to TempDB
  • Too large → resource starvation

Quick Summary: Memory grants are pre-allocated memory for sort and hash operations in a query. SQL Server estimates how much memory is needed based on cardinality estimates. Grant too small → spill to TempDB (slow). Grant too large → other queries starved of memory. Fix by updating statistics for accurate row estimates.
Q113:

What Is a Spill to TempDB and Why Does It Occur?

Senior

Answer

A spill happens when the memory grant is too small and SQL Server must offload intermediate data to TempDB.

Common causes: inaccurate cardinality estimation, heavy sorts, hash joins, window functions.

Quick Summary: A spill to TempDB happens when SQL Server allocated less memory than a sort or hash operation actually needed. Overflow data is written to TempDB disk, which is orders of magnitude slower than memory. Caused by underestimated row counts (bad statistics). Look for sort warnings and hash warnings in execution plans.
Q114:

What Is the SQL Server Query Processor?

Senior

Answer

The Query Processor:

  • Parses T-SQL
  • Binds objects
  • Optimizes using cost-based rules
  • Generates physical plans

Quick Summary: The query processor is the component that takes your SQL statement and executes it. It includes the parser (syntax check), algebrizer (binding to objects), query optimizer (plan selection), and execution engine (running the plan). The query processor is why well-written SQL runs fast — and why poorly-written SQL does not.
Q115:

What Are Logical and Physical Operators in Execution Plans?

Senior

Answer

Logical operators represent conceptual operations (join, project, filter).

Physical operators are real engine implementations (hash join, nested loop, merge join).

Quick Summary: Logical operators are the high-level operations in the plan: Filter, Join, Aggregate, Sort. Physical operators are the specific algorithms chosen: Nested Loops (for join), Hash Match (for join or aggregate), Sort (for ordering). One logical operation maps to one physical operator — understanding both helps you read execution plans.
Q116:

How Does SQL Server Choose Join Algorithms?

Senior

Answer

Join selection depends on:

  • Row count estimates
  • Sort order
  • Indexes
  • Estimated CPU and I/O cost

Quick Summary: SQL Server picks join algorithms based on input size and available indexes. Small outer table with indexed inner → Nested Loops. Both inputs sorted by join key → Merge Join (fast, no extra memory). Large unsorted inputs → Hash Match (memory-heavy, may spill). Missing indexes push everything toward Hash Match.
Q117:

What Is a Hash Match and Why Is It Heavy on TempDB?

Senior

Answer

A hash match creates an in-memory hash table for joins and aggregates.

If memory is insufficient → spills to TempDB → huge slowdown.

Quick Summary: Hash Match builds a hash table from the smaller input (build phase), then scans the larger input and probes the hash table (probe phase). Build phase requires memory — if underestimated, it spills to TempDB. Heavy on both memory and TempDB I/O. Common when indexes are missing and inputs are large.
Q118:

What Are Log Buffers and How Do Log Flushes Work?

Senior

Answer

Changes go to an in-memory log buffer. A log flush occurs:

  • On transaction commit
  • When buffer fills
  • On checkpoint

Slow log storage = slow commits.

Quick Summary: Log records are written to in-memory log buffers first, then flushed to disk on transaction commit, every 60KB of log buffer, or every second. COMMIT can't return until the log record is hardened to disk — that's what makes transactions durable. Log flushes are sequential writes — far faster than random data page I/O.
Q119:

What Is the Difference Between Row Store and Column Store in SQL Server?

Senior

Answer

Row store: row-by-row storage; best for OLTP.

Column store: column-by-column storage; best for analytics, high compression, parallel execution.

Quick Summary: Row store organizes data by rows — all columns for a row are stored together. Good for OLTP (point lookups, row modifications). Column store organizes data by column — ideal for analytics where queries read a few columns across millions of rows. Column store also compresses heavily and enables batch-mode processing.
Q120:

What Is Query Tuning and Why Is It Necessary in SQL Server?

Expert

Answer

Query tuning is the process of optimizing SQL statements so they consume fewer CPU cycles, memory, I/O, and locking resources. Poorly written queries slow down the entire SQL Server instance by overusing limited system resources.

The goal is predictable, scalable performance by reducing logical reads, avoiding unnecessary work, using optimal joins, minimizing spills, and improving concurrency.

Quick Summary: Query tuning is finding why a query is slow and making it faster without changing what it returns. SQL Server's optimizer is good but not perfect — bad statistics, missing indexes, parameter sniffing, and poorly-written SQL all require human intervention. Tuning is reading execution plans, fixing root causes, and measuring the improvement.
Q121:

What is Query Tuning, and why is it necessary in SQL Server?

Expert

Answer

Query tuning is the process of analyzing how SQL Server executes a query and optimizing it to minimize resource usage. Because SQL Server has finite CPU, memory, I/O, and concurrency capacity, an inefficient query can slow down the entire system. Tuning ensures optimal execution paths, fewer logical reads, reduced data movement, faster joins, and short lock durations. The goal is predictable, scalable performance.

Quick Summary: Query tuning improves query performance by identifying inefficiencies in execution plans. Even good databases accumulate slow queries as data grows, schemas change, or statistics drift. Without tuning, a query that ran in 10ms at 10K rows can take 10 seconds at 10M rows with the same code.
Q122:

What is the difference between Estimated and Actual Execution Plans?

Expert

Answer

The estimated plan shows SQL Server's predicted execution strategy based on statistics before running the query. The actual plan shows what really happened: row counts, spills, memory usage, and operator execution. Estimated plans are safe for production; actual plans reveal real bottlenecks like bad estimates, scans, or sorts. Both are essential for diagnosing performance issues.

Quick Summary: Estimated plan uses statistics to guess row counts and show the planned approach. Actual plan captures what really happened — actual rows, actual executions, actual time. The gap between estimated and actual row counts is the first thing to look at. A big gap means bad statistics are misleading the optimizer.
Q123:

Why do row estimation errors cause performance issues?

Expert

Answer

SQL Server uses estimated row counts to choose join types, memory grants, and index strategies. If estimates are inaccurate, SQL Server may choose poor plans. Overestimation causes excessive memory allocation, while underestimation causes spills, nested loops, and excessive lookups. Accurate cardinality estimation is fundamental to stable performance.

Quick Summary: Row estimation errors cause the optimizer to pick wrong strategies. If it thinks a join produces 100 rows but it produces 1 million, it might choose Nested Loops (correct for 100 rows) which is catastrophic at 1 million. Estimation errors cascade — each wrong estimate compounds into a progressively worse plan.
Q124:

What happens when a query spills to TempDB?

Expert

Answer

A query spills when SQL Server lacks sufficient memory for operations like sort or hash. SQL offloads intermediate results to TempDB, causing heavy disk I/O and slow execution. Frequent spills indicate bad estimates, missing indexes, or insufficient memory. Repeated spills degrade overall SQL Server performance.

Quick Summary: When a sort or hash operation needs more memory than its grant, SQL Server writes intermediate data to TempDB. This turns an in-memory operation into disk I/O — often 10-100x slower. You'll see Sort Warnings or Hash Warnings in the execution plan. Fix: update statistics, or use the OPTION(MIN_GRANT_PERCENT = n) query hint.
Q125:

What is Parameter Sniffing, and how does it affect performance?

Expert

Answer

Parameter sniffing allows SQL Server to reuse cached execution plans based on the first parameter values supplied. If those values are atypical, the plan may be inefficient for later executions. This leads to performance instability. Solutions include OPTION(RECOMPILE), OPTIMIZE FOR UNKNOWN, local variables, or plan forcing when appropriate.

Quick Summary: SQL Server caches a procedure's plan based on the parameter values from the first compilation. If those values were unusual (e.g., a rare customer ID), the plan is optimized for that case. When typical values run later, they use a plan that's wrong for them. Symptoms: same procedure runs fast sometimes, slow other times.
Q126:

Why is choosing the correct JOIN type critical for performance?

Expert

Answer

Join type determines how SQL Server matches rows. Nested loops are ideal for small sets, merge joins require sorted input, and hash joins work on large, unsorted sets. Incorrect join choices cause excessive I/O, CPU load, and slowdowns. SQL chooses join types based on row estimates and indexes, making accurate statistics essential.

Quick Summary: The wrong JOIN type can transform a millisecond query into a minutes-long one. Nested Loops on a large unsorted table causes repeated random I/O. Hash Match without enough memory spills to TempDB. Choose the join type the optimizer picks naturally with good indexes and statistics — forcing the wrong one manually is rarely right.
Q127:

What causes Hash Match operations, and why can they become bottlenecks?

Expert

Answer

Hash matches occur when SQL Server must build a memory-based hash table for joins or aggregates. They appear when inputs are unsorted or not indexed. They become bottlenecks when memory is insufficient, causing spills to TempDB. Hash operations can be CPU-intensive and degrade concurrency.

Quick Summary: Hash Match builds a hash table in memory and probes it — it's chosen when inputs are large and unsorted with no usable index. Bottlenecks: it needs a significant memory grant, and if that grant is insufficient, it spills to TempDB. Adding the right index often eliminates Hash Match entirely in favor of Nested Loops or Merge Join.
Q128:

How do SARGable expressions influence performance?

Expert

Answer

A SARGable expression allows SQL Server to use indexes efficiently. Non-SARGable predicates (functions on columns, mismatched types) force scans, increasing logical reads and slowing queries. SARGability is foundational for scaling queries on large datasets.

Quick Summary: SARGable expressions allow the optimizer to use index seeks by passing the filter condition directly into the index. Non-SARGable expressions (wrapping the column in a function, CAST, or CONVERT) force a table or index scan. Always filter on the raw column, and ensure data types match without implicit conversion.
Q129:

Why is reducing logical reads more important than reducing elapsed time?

Expert

Answer

Elapsed time varies with workload and server load, but logical reads measure actual data touched. Reducing logical reads consistently reduces CPU, I/O, and cache pressure. Logical reads are the primary, stable metric for performance tuning.

Quick Summary: Elapsed time includes waits — network, I/O, locks. Logical reads measure actual work done by the query engine. A query that takes 5 seconds but does 1 million logical reads has a different problem than one that takes 5 seconds waiting for a lock. Reducing logical reads means the query is fundamentally more efficient.
Q130:

What is the importance of covering indexes in performance tuning?

Expert

Answer

A covering index includes all columns a query needs. This eliminates key lookups and reduces I/O. Covering indexes provide dramatic performance improvements in OLTP systems by minimizing data access and improving plan efficiency.

Quick Summary: A covering index includes all columns a query needs — so SQL Server can answer it entirely from the index without touching the base table. This eliminates key lookups, reduces logical reads, and often dramatically cuts query time. Add SELECT columns via INCLUDE to cover without making the index key unnecessarily wide.
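A covering index for a specific query shape might be sketched like this (object names are illustrative):

```sql
-- Target query: filter on CustomerId, return OrderDate and Total.
-- Without INCLUDE, every matching row triggers a key lookup into
-- the clustered index to fetch OrderDate and Total.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON dbo.Orders (CustomerId)
INCLUDE (OrderDate, Total);

-- Now fully covered: answered from the index leaf level alone.
SELECT OrderDate, Total
FROM dbo.Orders
WHERE CustomerId = 42;
```

INCLUDE columns live only at the leaf level, so they cover the SELECT list without widening the index key or its upper B-tree levels.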
Q131:

When does SQL Server choose Index Seek over Index Scan?

Expert

Answer

SQL Server chooses an index seek when predicates match the index key order and are selective. Otherwise, SQL chooses a scan to avoid expensive random lookups. Seek-friendly index design is critical for optimal performance.

Quick Summary: SQL Server uses Index Seek when the filter is selective (returns a small fraction of rows) and the column is part of an index with a matching leading key. It uses Index Scan when the filter is broad or the index doesn't match well. SARGable predicates are what make seeks possible.
Q132:

What causes Key Lookups, and why are they expensive?

Expert

Answer

Key lookups occur when a non-clustered index lacks required columns. SQL must fetch missing columns from the clustered index for each qualifying row. With many rows, this becomes slow and I/O-heavy. Solutions include covering indexes or query redesign.

Quick Summary: Key lookups happen when a non-clustered index satisfies the WHERE clause, but the SELECT needs columns not in the index. SQL Server fetches those from the clustered index — one random I/O per row. For 100 rows it's fine; for 10,000 rows it's painful. Fix: add INCLUDE columns to the non-clustered index to cover the query.
Q133:

What is the purpose of Statistics in SQL Server performance?

Expert

Answer

Statistics describe column data distribution. SQL Server uses them to estimate row counts and choose efficient plans. Outdated or missing statistics lead to poor estimates and unstable performance. Keeping statistics fresh is essential for reliable query optimization.

Quick Summary: Statistics contain column value histograms used by the optimizer to estimate result set sizes. Good statistics → accurate row estimates → good plan choices. The optimizer doesn't run your query to see how many rows come back — it trusts its statistics. Out-of-date stats from large data loads are a top cause of performance regression.
Q134:

Why do implicit conversions degrade performance?

Expert

Answer

Implicit conversions prevent index usage by forcing SQL Server to convert values at runtime. This leads to scans instead of seeks, higher CPU usage, and slower joins. Matching data types between columns and parameters is vital.

Quick Summary: Implicit conversion happens when two columns being compared have different data types — SQL Server silently converts one. This often makes the comparison non-SARGable, forcing a full scan instead of a seek. Common: comparing nvarchar column to a varchar parameter. Always match data types in joins and WHERE clauses.
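A sketch of the classic case, assuming a hypothetical Orders.OrderNumber column of type VARCHAR(20):

```sql
-- NVARCHAR has higher type precedence than VARCHAR, so the N'...'
-- literal forces CONVERT() on the column itself -> non-SARGable -> scan.
SELECT OrderId
FROM dbo.Orders
WHERE OrderNumber = N'SO-1001';

-- Matching the column's type keeps the predicate SARGable -> index seek.
SELECT OrderId
FROM dbo.Orders
WHERE OrderNumber = 'SO-1001';
```

Execution plans flag the first form with a CONVERT_IMPLICIT warning on the seek predicate.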
Q135:

How does high fragmentation affect performance?

Expert

Answer

Fragmentation scatters index pages, causing more I/O during seeks and reducing read-ahead efficiency. High fragmentation slows down queries and increases disk activity. Regular index maintenance helps restore performance.

Quick Summary: Index fragmentation means leaf pages are out of logical order. When SQL Server does a range scan, it should read pages sequentially — fragmentation turns that into random I/O. Over 30% fragmentation is worth addressing. Reorganize (online, light) for moderate fragmentation; Rebuild (offline or online with Enterprise) for heavy fragmentation.
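Measuring and addressing fragmentation might look like this (index and table names are illustrative):

```sql
-- Measure fragmentation per index in the current database.
SELECT OBJECT_NAME(ips.object_id)      AS table_name,
       i.name                          AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 10;

-- Light, always-online fix for moderate fragmentation.
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REORGANIZE;

-- Full rebuild for heavy fragmentation (ONLINE = ON needs Enterprise edition).
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD WITH (ONLINE = ON);
```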
Q136:

Why is TempDB optimization critical for performance tuning?

Expert

Answer

TempDB handles sorts, hashes, temp tables, versioning, and spill operations. Heavy workloads can cause latch contention and I/O bottlenecks. Optimizing TempDB improves concurrency and stabilizes system-wide performance.

Quick Summary: TempDB is shared by all sessions — temp tables, sorts, hash joins, row versioning all land there. Contention on TempDB allocation pages throttles the whole server. Spill-to-TempDB from undersized memory grants is often the hidden bottleneck in slow queries. Multiple data files and fast NVMe storage are baseline requirements.
Q137:

What are Memory Grants, and why do they impact performance?

Expert

Answer

Memory grants are allocations for joins, sorts, and aggregates. Overestimated grants waste memory; underestimated grants cause spills to TempDB. Accurate row estimation and indexing ensure proper memory usage.

Quick Summary: Memory grants are pre-allocated for sort and hash operations. Underallocated → spill to TempDB. Overallocated → grants lock up RAM, starving other queries. Both situations degrade overall throughput. The root fix is always accurate statistics — the optimizer sizes grants based on estimated row counts.
Q138:

How does excessive recompilation affect query performance?

Expert

Answer

Excessive recompilation forces SQL Server to repeatedly rebuild execution plans, increasing CPU consumption and causing unpredictable performance. Causes include volatile schema, unstable statistics, and widely varying parameters.

Quick Summary: Recompilation forces SQL Server to rebuild an execution plan from scratch. Some recompilation is healthy (catching plan regressions). Excessive recompilation wastes CPU — each compilation takes time. Caused by: schema changes in temp tables, statistics updates, or using OPTION(RECOMPILE) on high-frequency queries.
Q139:

Why do large IN clauses degrade performance?

Expert

Answer

Large IN lists complicate cardinality estimation and increase parsing overhead. SQL Server may misestimate row counts, leading to inefficient join choices. Using temp tables or joins improves accuracy and performance.

Quick Summary: Large IN lists force SQL Server to evaluate many OR conditions or build a large hash structure. For very long lists (1000+ values), performance degrades because the optimizer can't estimate the combined selectivity well. Alternative: load the values into a temp table and JOIN — SQL Server handles joins far more efficiently than giant IN lists.
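The temp-table alternative can be sketched as follows (names are illustrative; in practice the list would be bulk-loaded or passed as a table-valued parameter):

```sql
-- Instead of: WHERE CustomerId IN (1, 2, 3, ... /* thousands more */)
CREATE TABLE #TargetCustomers (CustomerId INT PRIMARY KEY);

INSERT INTO #TargetCustomers (CustomerId)
VALUES (1), (2), (3);  -- load the real list here

-- A join gives the optimizer statistics on the temp table and a
-- proper join algorithm instead of a giant OR expansion.
SELECT o.OrderId, o.Total
FROM dbo.Orders AS o
JOIN #TargetCustomers AS t
  ON t.CustomerId = o.CustomerId;
```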
Q140:

What is the significance of identifying the most expensive operators in execution plans?

Expert

Answer

Most query time is spent in one or two heavy operators like hash joins, sorts, or scans. Identifying these bottlenecks allows focused tuning efforts, resulting in faster optimization and greater performance gains.

Quick Summary: The most expensive operator in an execution plan (highest % cost) is where the most work is happening. Focus tuning there first — it's the bottleneck. A 95% cost table scan means adding an index there will have far more impact than optimizing anything else in the plan. Always fix the biggest problem first.
Q141:

What are the most common root causes of slow queries in production?

Expert

Answer

Slow production queries usually result from:

  • Missing or poorly designed indexes
  • Incorrect cardinality estimates
  • High fragmentation
  • Outdated statistics
  • Parameter sniffing issues
  • Excessive key lookups
  • TempDB contention or spills
  • Implicit conversions
  • Heavy sorts/hashes causing CPU pressure
  • Blocking or deadlocks

Most performance issues are caused by a combination rather than a single factor.

Quick Summary: Common root causes of slow queries: missing or outdated indexes, stale statistics, parameter sniffing, implicit data type conversions, non-SARGable predicates, lock blocking, memory pressure (spills), excessive recompilation, and poorly written loops or cursors instead of set-based operations.
Q142:

How do you identify which query is slowing down the system?

Expert

Answer

To identify slow queries, examine:

  • sys.dm_exec_query_stats – CPU, I/O, execution time
  • sys.dm_exec_requests – currently running queries
  • sys.dm_tran_locks – blocking analysis
  • Extended Events – deep tracing
  • Query Store – historical regressions

This helps pinpoint which query and which operator are causing the slowdown.

Quick Summary: Start with sys.dm_exec_query_stats sorted by total_worker_time or total_elapsed_time to find the costliest queries. Query Store gives historical data with plan regression tracking. Wait stats (sys.dm_os_wait_stats) show whether slowdowns are CPU, I/O, or lock-related. Always measure before tuning — don't guess.
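
A sketch of the starting point described above — the costliest cached statements by cumulative CPU:

```sql
-- Top 10 statements by total CPU since their plans were cached
SELECT TOP (10)
    qs.total_worker_time / 1000 AS total_cpu_ms,
    qs.execution_count,
    qs.total_elapsed_time / qs.execution_count / 1000 AS avg_elapsed_ms,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```

Swap the ORDER BY to total_elapsed_time or total_logical_reads to rank by wall-clock time or I/O instead of CPU.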
Q143:

How do you detect blocking issues and understand who is blocking whom?

Expert

Answer

Blocking is identified using:

  • sys.dm_exec_requests – blocking_session_id
  • sys.dm_os_waiting_tasks – wait chains
  • Activity Monitor – high-level visualization

Blocking chains often originate from a long-running transaction holding critical locks.

Quick Summary: Use sys.dm_exec_requests to see currently waiting sessions and what they're waiting for. sys.dm_os_waiting_tasks shows the blocking chain — who is waiting on whom. The session at the head of the chain with no wait is the blocker. That's the transaction to investigate — long-running or uncommitted transactions are the usual culprit.
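
The blocking check above can be sketched as a single DMV query:

```sql
-- Sessions currently blocked, who is blocking them, and what they are running
SELECT
    r.session_id          AS blocked_session,
    r.blocking_session_id AS blocking_session,
    r.wait_type,
    r.wait_time           AS wait_ms,
    st.text               AS blocked_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.blocking_session_id <> 0;
```

Follow the blocking_session_id values up the chain; the session that appears only as a blocker (never as blocked) is the head of the chain.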
Q144:

What is the difference between blocking and deadlocking?

Expert

Answer

Blocking occurs when one transaction holds a lock and others wait behind it.

Deadlock occurs when two transactions wait on each other, creating a cycle.

SQL Server detects deadlocks and kills one transaction (the victim) to resolve the cycle.

Quick Summary: Blocking: transaction A waits for transaction B to release a lock — eventually resolves when B commits or rolls back. Deadlock: two transactions each hold what the other needs — neither can proceed without intervention. SQL Server resolves deadlocks by killing one victim. Blocking is slow; deadlocks are forced terminations.
Q145:

Which indexing mistakes cause the most issues in high-traffic systems?

Expert

Answer

Common indexing mistakes include:

  • Too many non-clustered indexes
  • Lack of covering indexes for hot queries
  • Poor clustered key choice (wide, GUID, non-sequential)
  • Duplicate or overlapping indexes
  • Not using filtered indexes

These increase I/O, fragmentation, and reduce plan efficiency.

Quick Summary: Top indexing mistakes: too many indexes (slows writes, wastes space), wrong leading column (index unused), missing covering columns (causes key lookups), ignoring fill factor (causes page splits), never maintaining fragmentation, and duplicate indexes that the optimizer ignores while still costing maintenance overhead.
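
One of the mistakes above — missing covering columns — has a direct fix. A sketch with illustrative names:

```sql
-- A hot query filtering on Status and returning OrderDate and Total:
-- SELECT OrderDate, Total FROM Orders WHERE Status = 'Pending';

-- Covering index: Status is the seek key; the other columns ride along
-- in the leaf pages via INCLUDE, so no key lookup is needed.
CREATE NONCLUSTERED INDEX IX_Orders_Status_Covering
ON Orders (Status)
INCLUDE (OrderDate, Total);
```

INCLUDE columns are stored only at the leaf level, so the index stays narrower than putting every column in the key.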
Q146:

Why do large table scans cause severe slowdowns?

Expert

Answer

Large scans:

  • Generate heavy I/O
  • Evict useful pages from buffer pool
  • Increase CPU usage
  • Slow concurrent users

One bad scan can impact dozens of users in a production environment.

Quick Summary: Large table scans read every page in the table — for a 100GB table, that's potentially GBs of I/O. This pushes useful pages out of the buffer pool, increases disk read time, and spikes CPU. In a shared server, one scan-heavy query can degrade performance for all other sessions simultaneously.
Q147:

Which issues cause sudden regression in query performance?

Expert

Answer

Sudden regressions often occur due to:

  • Updated statistics
  • Parameter sniffing plan changes
  • Index changes
  • Fragmentation spikes
  • Failover causing cold cache
  • Data growth affecting plan shapes

Quick Summary: Plan regressions after stats updates (optimizer uses new distribution data), index rebuilds (which also refresh statistics), database compatibility level changes (new cardinality estimator model), and SQL Server upgrades. Queries that were fast can suddenly get a different plan that's wrong for current data. Query Store is the primary tool to detect and revert regressions.
Q148:

What is the importance of Query Store in high-traffic systems?

Expert

Answer

Query Store provides:

  • Plan history recording
  • Runtime performance metrics
  • Ability to revert to stable plans
  • Insight into parameter sniffing behavior

It acts as a "black box recorder" for SQL Server.

Quick Summary: Query Store tracks every query's execution history — plan, duration, CPU, I/O, logical reads. In high-traffic systems it enables: detecting plan regressions instantly, forcing known-good plans, identifying top resource consumers, and understanding query behavior over time without relying on memory-only DMVs that reset on restart.
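
A sketch of enabling Query Store and forcing a known-good plan (database name and the query_id/plan_id values are placeholders — look up the real IDs in the Query Store reports first):

```sql
-- Enable Query Store on a database (SQL Server 2016+)
ALTER DATABASE SalesDB SET QUERY_STORE = ON;
ALTER DATABASE SalesDB SET QUERY_STORE (OPERATION_MODE = READ_WRITE);

-- After identifying a regressed query in the Query Store reports,
-- pin the known-good plan by its query_id and plan_id:
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;
```

Forced plans persist across restarts and failovers, which is what makes Query Store safer than plan guides for handling regressions.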
Q149:

Why should TempDB be placed on fast storage?

Expert

Answer

TempDB handles:

  • Sorts
  • Hash operations
  • Version store
  • Spills
  • Temporary objects

Slow TempDB I/O becomes a bottleneck for the entire SQL Server instance.

Quick Summary: TempDB is heavily used for sort spills, hash join overflows, and temp table operations. Slow TempDB I/O directly slows every query that spills. NVMe SSDs for TempDB are a standard recommendation. Separating TempDB from data/log files on dedicated drives avoids I/O contention between user queries and TempDB operations.
Q150:

Why are wide clustered keys a long-term performance problem?

Expert

Answer

The clustered key is included in all non-clustered indexes. Wide keys increase:

  • Storage footprint
  • Read/write cost
  • Fragmentation
  • Memory usage

Narrow, sequential keys are ideal.

Quick Summary: Wide clustered keys (GUIDs, composite multi-column keys) are included in every non-clustered index as the row locator. A 16-byte GUID key in a table with 10 non-clustered indexes multiplies key storage significantly. Wide keys also increase page splits and fragmentation over time. An integer identity key keeps everything compact.
Q151:

How do you troubleshoot excessive CPU usage in SQL Server?

Expert

Answer

Troubleshooting steps:

  • Identify high-CPU queries in DMVs
  • Check plans for sorts, hashes, conversions
  • Validate statistics
  • Review index usage
  • Investigate parallelism waits
  • Check server MAXDOP settings

Quick Summary: For CPU issues: check sys.dm_exec_query_stats for queries with high total_worker_time. Look for missing indexes, non-SARGable predicates, or excessive recompilation. Execution plans showing large sorts, hash joins, or scans on big tables are CPU-heavy. Well-targeted indexes cut the rows processed, reducing both CPU and I/O.
Q152:

What causes excessive memory usage in SQL Server?

Expert

Answer

Causes include:

  • Large memory grants
  • Hash and sort operations
  • Poor cardinality estimates
  • Too many concurrent queries
  • Buffer pool caching large volumes of data pages

Quick Summary: Excessive memory usage causes: buffer pool pressure (too many large scans pushing data pages in), oversized memory grants (queries holding memory others need), and plan cache bloat (thousands of single-use plans from non-parameterized queries). Use sys.dm_os_memory_clerks to find which component is consuming the most.
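
The memory-clerk check mentioned above can be sketched as:

```sql
-- Which memory clerks hold the most memory right now
SELECT TOP (10)
    type,
    SUM(pages_kb) / 1024 AS memory_mb
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY memory_mb DESC;
```

MEMORYCLERK_SQLBUFFERPOOL at the top is normal (data page cache); a large CACHESTORE_SQLCP suggests plan cache bloat from ad-hoc, non-parameterized queries.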
Q153:

Why are CROSS JOINs dangerous in real systems?

Expert

Answer

CROSS JOINs produce Cartesian products. They may:

  • Explode row counts
  • Consume massive CPU
  • Stress memory
  • Cause system freezes

They must be used intentionally and carefully.

Quick Summary: CROSS JOIN produces every combination of rows from two tables — N × M rows. A CROSS JOIN between two 10,000-row tables produces 100 million rows. Usually unintentional (a forgotten JOIN condition). In real systems, accidental CROSS JOINs can bring down a server by overwhelming memory, CPU, and TempDB simultaneously.
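
A minimal illustration of the row explosion, with hypothetical table names:

```sql
-- Explicit Cartesian product: every row of A paired with every row of B.
-- With 10,000 rows in each table this returns 100,000,000 rows.
SELECT a.ID, b.ID
FROM TableA AS a
CROSS JOIN TableB AS b;

-- The same explosion happens accidentally with old-style comma joins
-- when the join condition is forgotten:
SELECT a.ID, b.ID
FROM TableA AS a, TableB AS b;  -- missing: WHERE a.ID = b.A_ID
```

The explicit CROSS JOIN keyword at least signals intent; the comma form hides the mistake, which is one reason ANSI JOIN syntax is preferred.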
Q154:

Which situations require table partitioning?

Expert

Answer

Use partitioning when:

  • Tables contain hundreds of millions of rows
  • Fast archiving is required
  • Hot/cold data separation is needed
  • Maintenance takes too long

Quick Summary: Use partitioning when: tables exceed 50-100GB and queries naturally filter on a date or range column, you need to archive/purge old data efficiently (partition switching is instant), or you want to separate hot and cold data across different storage tiers. Smaller tables rarely benefit enough to justify the complexity.
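
A sketch of monthly range partitioning on a date column (all object names are illustrative):

```sql
-- Boundary values define the partitions; RANGE RIGHT puts each boundary
-- date in the partition to its right (the start of each month's partition).
CREATE PARTITION FUNCTION pf_OrderMonth (DATE)
AS RANGE RIGHT FOR VALUES ('2024-01-01', '2024-02-01', '2024-03-01');

-- Map every partition to a filegroup (here, all to PRIMARY for simplicity)
CREATE PARTITION SCHEME ps_OrderMonth
AS PARTITION pf_OrderMonth ALL TO ([PRIMARY]);

-- The partitioning column must be part of the clustered key
CREATE TABLE OrdersPartitioned (
    OrderID   BIGINT         NOT NULL,
    OrderDate DATE           NOT NULL,
    Total     DECIMAL(10, 2) NOT NULL,
    CONSTRAINT PK_OrdersPartitioned
        PRIMARY KEY CLUSTERED (OrderDate, OrderID)
) ON ps_OrderMonth (OrderDate);
```

In production, each partition would typically map to its own filegroup so cold months can sit on cheaper storage.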
Q155:

How does SQL Server handle large deletes efficiently?

Expert

Answer

Large deletes cause heavy logging, lock escalation, and blocking. Solutions:

  • Batch deletes
  • Partition switching
  • Mark-and-archive patterns

Quick Summary: Delete large data in batches — DELETE TOP (10000) WHERE ... in a loop — instead of one massive DELETE. A single large delete holds locks for a long time, blocks other sessions, and generates heavy transaction log growth. Batching keeps transactions short, log growth manageable, and blocking minimal.
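
The batched pattern above, sketched with illustrative names:

```sql
-- Delete old rows in small batches to keep locks and log growth bounded
DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    DELETE TOP (10000)
    FROM OrderHistory
    WHERE OrderDate < '2020-01-01';

    SET @rows = @@ROWCOUNT;  -- 0 when nothing is left to delete
END;
```

Each iteration is its own short transaction, so lock escalation is avoided and the log can be reused between batches (in SIMPLE recovery) or backed up (in FULL).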
Q156:

Why do developers use OUTPUT clause for auditing?

Expert

Answer

The OUTPUT clause returns affected rows without re-querying the table. It is efficient for:

  • Auditing changes
  • Logging
  • Data replication

Quick Summary: The OUTPUT clause returns rows affected by INSERT, UPDATE, DELETE, or MERGE — directly from the statement without an extra SELECT. For auditing, a DELETE with OUTPUT deleted.* INTO an audit table captures exactly what was removed, atomically in the same transaction, with no separate read needed.
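
A minimal audit sketch using OUTPUT ... INTO (table and column names are illustrative):

```sql
-- Audit table; DeletedAtUtc fills in automatically via the default
CREATE TABLE OrderAudit (
    OrderID      BIGINT,
    CustomerID   INT,
    Total        DECIMAL(10, 2),
    DeletedAtUtc DATETIME2 DEFAULT SYSUTCDATETIME()
);

-- The deleted pseudo-table exposes each removed row; OUTPUT ... INTO
-- writes it to the audit table in the same transaction as the delete.
DELETE FROM Orders
OUTPUT deleted.OrderID, deleted.CustomerID, deleted.Total
INTO OrderAudit (OrderID, CustomerID, Total)
WHERE Status = 'Cancelled';
```

If the delete rolls back, the audit rows roll back with it — which is exactly the atomicity a trigger-free audit needs.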
Q157:

What is the impact of missing foreign keys?

Expert

Answer

Missing foreign keys:

  • Prevent automatic referential integrity
  • Hurt query optimization because relationships are unknown
  • Cause poor join strategy selection

Quick Summary: Missing foreign keys mean the database won't prevent orphaned rows — orders with no customer, invoice lines with no invoice. The application must enforce referential integrity, and if it misses a case, the data silently corrupts. Missing FKs also deprive the optimizer of information: SQL Server uses trusted FK metadata to eliminate redundant joins and simplify plans.
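
Declaring the relationship is a one-line fix (names illustrative):

```sql
-- Orphaned Orders rows become impossible once this constraint exists.
-- WITH CHECK (the default) validates existing rows, making the FK
-- "trusted" so the optimizer can rely on it.
ALTER TABLE Orders WITH CHECK
ADD CONSTRAINT FK_Orders_Customers
    FOREIGN KEY (CustomerID)
    REFERENCES Customers (CustomerID);
```

An FK added WITH NOCHECK skips validation but is marked untrusted, and the optimizer ignores it for join elimination.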
Q158:

Why minimize the number of columns in non-clustered indexes?

Expert

Answer

Extra columns increase:

  • Index size
  • Maintenance overhead
  • Memory usage

Indexes should be narrow and purpose-specific.

Quick Summary: Every non-clustered index duplicates the clustered key columns as the row locator. Wide index keys = larger index pages = more I/O to scan the index. Narrow indexes fit more entries per page, require less memory in the buffer pool, and are faster to scan. Only index the columns you actually need for your queries.
Q159:

When should filtered indexes be used?

Expert

Answer

Filtered indexes are ideal when:

  • Queries target selective subsets
  • Columns contain sparse values
  • Large tables need optimized predicate performance

Quick Summary: Use filtered indexes when queries consistently target a predictable subset: active records (IsDeleted = 0), a specific status, or a regional subset. The index is smaller, statistics are more accurate for that subset, and seek performance is higher. Particularly effective when the indexed subset is a small percentage of the total rows.
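
A minimal filtered-index sketch for the active-records case (names illustrative):

```sql
-- Index only the live rows; queries whose WHERE clause includes
-- IsDeleted = 0 can seek this much smaller index.
CREATE NONCLUSTERED INDEX IX_Orders_Active
ON Orders (CustomerID, OrderDate)
WHERE IsDeleted = 0;
```

If 95% of rows are soft-deleted, this index is roughly 5% of the size of an unfiltered equivalent, with correspondingly cheaper maintenance.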
Q160:

What steps do you take before rewriting a slow query?

Expert

Answer

Before rewriting a query:

  • Check execution plan
  • Validate statistics
  • Check indexes
  • Identify costly operators
  • Try small predicate adjustments
  • Confirm rewrite is necessary

Most problems are fixed by plan adjustments, not rewrites.

Quick Summary: Before rewriting: capture the actual execution plan and compare estimated vs actual rows. Check wait stats and blocking. Look at logical reads (SET STATISTICS IO ON). Identify the highest-cost operators. Update statistics and check for missing indexes first — often fixes the query without any rewrite. Rewrite only after understanding the root cause.
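
The measurement step above can be sketched as (the query itself is a placeholder):

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the slow query and read the Messages tab:
-- "logical reads" per table shows where the I/O goes;
-- "CPU time" vs "elapsed time" shows compute cost vs waits.
SELECT c.CustomerID, COUNT(*) AS OrderCount
FROM Orders AS o
JOIN Customers AS c ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Capture these numbers before and after any change; a rewrite that does not reduce logical reads or CPU time has not actually fixed anything.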

