
Microservices Interview Cheat Sheet

Top 50 interview questions with concise answers. Print this page or save as PDF for offline study.


1. What is a microservices architecture?

Microservices is an architecture where an app is split into small, independent services - each doing one specific job. Every service runs separately, has its own database, and talks to others via APIs or messaging. This lets teams build, deploy, and scale each piece independently instead of touching one giant codebase.

2. How do microservices differ from a monolithic architecture?

Monolith: all features in one codebase, deployed together as one unit. Microservices: split into separate services, each deployable independently. Monolith is simpler to start but hard to scale and change over time. Microservices give flexibility and scalability but add complexity in networking, data consistency, and ops.

3. What are the advantages of microservices?

Key advantages: independent deployments (deploy one service without touching others), independent scaling (scale only the bottleneck service), tech flexibility (each team picks its own stack), fault isolation (one service crash doesn't bring down everything), and smaller codebases that are easier to understand and maintain.

4. What are the challenges of microservices?

Main challenges: distributed systems complexity (network failures, latency), data consistency across services (no single transaction across multiple DBs), service discovery and load balancing, debugging across multiple services, higher operational overhead, and more infrastructure to manage compared to a simple monolith.

5. Explain service discovery in microservices.

Service discovery is how services find each other at runtime. Client-side: the service queries a registry (like Consul or Eureka) to get the target's address, then calls directly. Server-side: a load balancer queries the registry and routes the request. Without discovery, hardcoding IPs breaks as services scale and restart.
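The client-side lookup described above can be sketched in Python. This is a toy in-memory registry for illustration only - the class and service names are hypothetical, and a real registry (Consul, Eureka) adds health checks, TTLs, and replication:

```python
import itertools

class ServiceRegistry:
    """Toy in-memory registry illustrating client-side discovery."""

    def __init__(self):
        self._services = {}   # service name -> list of "host:port" addresses
        self._cursors = {}    # service name -> round-robin iterator

    def register(self, name, address):
        self._services.setdefault(name, []).append(address)
        self._cursors[name] = itertools.cycle(self._services[name])

    def resolve(self, name):
        """Round-robin over the registered instances of a service."""
        if not self._services.get(name):
            raise LookupError(f"no instances registered for {name}")
        return next(self._cursors[name])
```

A caller resolves an address right before each request, so newly registered or restarted instances are picked up without hardcoded IPs.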

6. What is an API Gateway in microservices?

An API Gateway is the single entry point for all client requests. It handles routing to the right service, authentication, rate limiting, SSL termination, and response aggregation. Clients talk to one gateway instead of dozens of services. Examples: Kong, AWS API Gateway, NGINX. It reduces client complexity and centralizes cross-cutting concerns.

7. Explain inter-service communication methods.

Two main ways: synchronous (HTTP/REST or gRPC - caller waits for a response, simpler but creates tight coupling and cascading failures if a service is down) and asynchronous (message queues like Kafka or RabbitMQ - fire and forget, more resilient, but eventual consistency and harder to debug). Most systems use both.

8. How is data managed in microservices?

Each microservice owns its own database - no shared DB. This prevents tight coupling at the data layer. Cross-service data needs are handled via API calls or event-driven patterns (a service publishes events when data changes, others subscribe and maintain their own read models). This is the database-per-service pattern.

9. What is the difference between synchronous and asynchronous microservices?

Synchronous: caller sends a request and waits for the response (HTTP, gRPC). Simple to reason about but the caller is blocked and failure in the called service directly affects the caller. Asynchronous: caller sends a message and continues (Kafka, RabbitMQ). Decoupled and more resilient, but you get eventual consistency instead of immediate.

10. What is eventual consistency?

Eventual consistency means after an update, not all services see the new data immediately - but they will all be consistent eventually. It's accepted in distributed systems where strong consistency is too expensive. Example: you place an order, inventory updates asynchronously. For a brief moment inventory count is stale, then it catches up.

11. Explain circuit breaker pattern.

Circuit breaker monitors calls to a service. If failures cross a threshold, it "opens" and stops sending requests (returns a fallback immediately instead). After a timeout, it goes "half-open" and tries one request. If it succeeds, it closes again. This prevents cascading failures when a downstream service is slow or down.
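The closed/open/half-open state machine above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation (libraries like resilience4j or Polly add rolling failure windows and metrics); the thresholds are arbitrary example values:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, func, fallback):
        if self.state == "open":
            if time.monotonic() - self.opened_at >= self.reset_timeout:
                self.state = "half-open"   # allow one trial request through
            else:
                return fallback()          # fail fast, don't hit the service
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = time.monotonic()
            return fallback()
        self.failures = 0                  # success closes the circuit
        self.state = "closed"
        return result
```

While open, callers get the fallback immediately instead of burning threads on timeouts against a dead service.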

12. What is the role of load balancing in microservices?

Load balancing distributes incoming requests across multiple instances of a service. Without it, one instance gets overwhelmed while others sit idle. In microservices it's critical since services scale to multiple instances. Solutions: round-robin, least-connections, or weighted. Tools: NGINX, HAProxy, AWS ALB, Kubernetes Services.

13. How do microservices handle security?

Security in microservices: use JWT or OAuth2 for authentication at the API gateway. Enforce authorization in each service (don't trust a request just because it passed the gateway). Use mTLS for service-to-service communication. Secrets management via Vault or cloud secret stores. Network policies to restrict which services can talk to which.

14. What is logging and monitoring in microservices?

Each service logs independently - structured JSON logs are best. Centralize them with ELK Stack or similar. Add correlation IDs to trace a request across services. Use distributed tracing (Jaeger, Zipkin) to see the full call chain. Metrics via Prometheus + Grafana for dashboards and alerts. Without this, debugging distributed systems is nearly impossible.

15. Explain containerization in microservices.

Containerization packages each service with all its dependencies into a Docker container. Containers are lightweight, consistent across environments (no "works on my machine"), and start fast. Each microservice runs in its own container. Container orchestration (Kubernetes) manages scheduling, scaling, health checks, and networking across containers.

16. What is the role of Kubernetes in microservices?

Kubernetes automates deployment, scaling, and management of containerized microservices. It handles: running the right number of instances (Deployments), load balancing between them (Services), self-healing (restarts crashed pods), config and secret management (ConfigMaps/Secrets), and rolling deployments with zero downtime.

17. How do microservices achieve high availability?

High availability in microservices: run multiple instances of each service so one failing doesn't cause downtime. Use health checks so orchestrators replace unhealthy instances. Deploy across availability zones. Use circuit breakers to prevent cascading failures. Implement retries with backoff. Design for graceful degradation when a non-critical service is down.

18. Explain the Saga pattern.

Saga pattern handles distributed transactions across multiple services without a single ACID transaction. Each service does its local transaction and publishes an event. If a later step fails, compensating transactions roll back previous steps. Two styles: choreography (services react to events) and orchestration (a saga coordinator drives the steps).

19. What is event-driven architecture in microservices?

Event-driven architecture means services communicate by publishing and consuming events instead of direct API calls. A service publishes "OrderPlaced", other services (inventory, notification, billing) react independently. This decouples services - they don't need to know about each other, just the events. Makes the system more resilient and scalable.

20. How do microservices scale?

Microservices scale horizontally - you just run more instances of the service that's the bottleneck. Because services are independent, you don't need to scale the whole app. Combined with auto-scaling (Kubernetes HPA triggers on CPU/memory/custom metrics), the system adjusts automatically to traffic spikes and drops.

21. What is CQRS (Command Query Responsibility Segregation)?

CQRS separates reads and writes into different models. Commands (write operations) go through one path that changes state. Queries (reads) go through a separate, optimized read model. This lets you scale reads and writes independently, optimize each separately, and use different storage for reads vs writes (e.g., SQL for writes, Elasticsearch for reads).
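The split can be sketched as a write model that emits events and a read model that maintains its own projection. All class and event names here are illustrative assumptions, and the in-process bus stands in for a real broker:

```python
class Bus:
    """Stand-in for a message broker: fans events out to subscribers."""
    def __init__(self):
        self.handlers = []
    def subscribe(self, handler):
        self.handlers.append(handler)
    def publish(self, event):
        for handler in self.handlers:
            handler(event)

class OrderWriteModel:
    """Command side: validates, changes state, emits events."""
    def __init__(self, bus):
        self.orders = {}
        self.bus = bus
    def place_order(self, order_id, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.orders[order_id] = amount
        self.bus.publish({"type": "OrderPlaced", "id": order_id, "amount": amount})

class OrderReadModel:
    """Query side: denormalized projection kept current from events."""
    def __init__(self):
        self.total_revenue = 0
        self.order_count = 0
    def apply(self, event):
        if event["type"] == "OrderPlaced":
            self.total_revenue += event["amount"]
            self.order_count += 1
```

Because the read model only consumes events, it can live in a different store (e.g., Elasticsearch) and scale independently of the write path.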

22. Explain Event Sourcing in microservices.

Event Sourcing stores all changes to state as a sequence of events instead of just the current value. To get current state, replay all events. Benefits: full audit trail, ability to replay events to rebuild state, natural fit with event-driven architecture. Downside: querying current state is more complex - usually solved with a projected read model.

23. How does the Saga pattern work for distributed transactions?

The Saga pattern breaks a distributed transaction into steps. Each step does a local DB transaction and publishes an event. The next service picks it up and does its step. If any step fails, compensating transactions undo the previous steps. Example: book hotel -> book flight -> charge card. If card fails, cancel hotel and flight bookings.
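The orchestrated form of this rollback logic fits in one small function. This is a sketch under simplifying assumptions (in-process steps instead of services publishing events, and compensations that never fail):

```python
def run_saga(steps):
    """Orchestrated saga: each step is an (action, compensation) pair.
    On any failure, run compensations for completed steps in reverse."""
    completed = []
    for action, compensate in steps:
        try:
            action()
            completed.append(compensate)
        except Exception:
            for undo in reversed(completed):
                undo()                 # compensating transactions
            return False               # saga rolled back
    return True                        # saga committed
```

With the hotel/flight/card example, a failed charge triggers "cancel flight" then "cancel hotel" - compensation order is the reverse of execution order.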

24. What is observability in microservices?

Observability means you can understand what's happening inside the system from external signals. Three pillars: Logs (what happened), Metrics (how much/how fast), Traces (which path did the request take). With proper observability you can debug production issues, understand performance bottlenecks, and know when things are about to break.

25. Explain distributed tracing.

Distributed tracing tracks a single request as it flows through multiple services. Each service adds a span with timing and metadata. Spans are linked by a trace ID. Tools like Jaeger or Zipkin collect and visualize these traces. You can see exactly which service is slow, where errors happen, and how calls fan out across the system.

26. What are circuit breakers and fallback mechanisms?

Circuit breaker monitors the failure rate to a service. When failures exceed the threshold it opens - calls return a fallback immediately without hitting the failing service. The fallback could be a cached response, default value, or error message. This prevents your service from wasting threads waiting on a dead service and stops failures from cascading upstream.

27. Explain bulkhead pattern.

Bulkhead pattern isolates failures by giving each service or feature its own resource pool - separate thread pools, connection pools, or instances. If one service gets overwhelmed (or leaks resources), it only consumes its own pool and doesn't starve other services. Named after ship bulkheads that keep one flooded compartment from sinking the whole ship.
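A per-dependency concurrency cap captures the idea. This sketch uses a semaphore per downstream dependency (the dependency names and limits are hypothetical); real implementations often use separate thread pools instead:

```python
import threading

class Bulkhead:
    """Caps concurrent calls to one dependency so a slow or leaking
    dependency can't exhaust shared resources. Calls that find the
    bulkhead full fail fast instead of queueing."""

    def __init__(self, max_concurrent):
        self._slots = threading.Semaphore(max_concurrent)

    def call(self, func):
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("bulkhead full: rejecting call")
        try:
            return func()
        finally:
            self._slots.release()

# Each downstream dependency gets its own isolated pool of slots.
payments_bulkhead = Bulkhead(max_concurrent=5)
inventory_bulkhead = Bulkhead(max_concurrent=20)
```

If the payments service stalls and all five of its slots fill up, inventory calls keep flowing because they draw from a separate pool.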

28. How does rate limiting work?

Rate limiting caps how many requests a client can make in a time window. Common algorithms: token bucket (refills tokens at a fixed rate, burst allowed), sliding window (smooth counting over a rolling period), leaky bucket (queues requests and releases at a fixed rate). Implemented at the API gateway or per service. Returns 429 Too Many Requests when exceeded.
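The token bucket variant is short enough to sketch directly. Rate and capacity values are illustrative; a gateway implementation would also need to key buckets per client and persist state (e.g., in Redis):

```python
import time

class TokenBucket:
    """Token bucket: refills at `rate` tokens/sec up to `capacity`,
    allowing short bursts while capping the average request rate."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # caller should respond 429 Too Many Requests
```

A burst up to `capacity` passes immediately; sustained traffic beyond `rate` requests per second starts getting rejected.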

29. Explain retries and backoff strategies.

Retries handle transient failures by automatically retrying failed requests. But naive retries can overwhelm a struggling service. Exponential backoff increases wait time between retries (1s, 2s, 4s, 8s...). Add jitter (random offset) to prevent thundering herd when many clients retry at the same time. Always set a max retry count.
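The delays listed above, plus jitter and a max count, can be sketched as a small wrapper (this follows the "full jitter" variant, where each sleep is drawn uniformly from zero up to the exponential cap):

```python
import random
import time

def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Retry `func` on any exception, sleeping an exponentially growing,
    jittered delay between attempts. Re-raises after the last attempt."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise   # out of retries: surface the failure
            cap = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(random.uniform(0, cap))   # jitter avoids thundering herd
```

Only retry operations that are idempotent (or protected by idempotency keys), since the request may have succeeded even though the response was lost.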

30. What is a sidecar pattern?

Sidecar pattern deploys a helper container alongside the main service container in the same pod. The sidecar handles cross-cutting concerns: log collection, metrics scraping, mTLS certificate management, service mesh proxy (Envoy in Istio). The main service stays focused on business logic while the sidecar handles infrastructure concerns transparently.

31. How do you implement API versioning in microservices?

API versioning strategies: URI versioning (/api/v1/users vs /api/v2/users) - simple and visible. Header versioning (Accept: application/vnd.api.v2+json) - cleaner URLs but harder to test. Query param versioning (?version=2) - easy but pollutes URLs. Use semantic versioning. Don't break existing clients - keep old versions running during migration.

32. Explain service mesh.

Service mesh is an infrastructure layer that handles service-to-service communication. Deployed as sidecar proxies (Envoy) next to each service. Handles: mTLS encryption between services, traffic management (retries, timeouts, circuit breaking), observability (traces, metrics) - all without changing your app code. Istio and Linkerd are popular choices.

33. How do microservices handle configuration management?

Configuration management in microservices: don't hardcode configs. Use environment variables for simple values. Use a centralized config server (Spring Cloud Config, Consul, AWS Parameter Store) for shared or dynamic config. Config changes should not require redeployment. Sensitive values (passwords, API keys) go in a secrets manager, not config files.

34. What is blue-green deployment?

Blue-green deployment runs two identical environments - blue (current live) and green (new version). Traffic switches from blue to green all at once. If something breaks, rollback is just switching traffic back to blue. No downtime during deployment. Downside: requires double the infrastructure. Best for when you can't do gradual rollouts.

35. What is canary deployment?

Canary deployment releases the new version to a small percentage of users first (1-5%). Monitor errors, latency, and business metrics. If it looks good, gradually increase traffic to the new version until it's 100%. If problems appear, roll back only the canary. Lower risk than blue-green since issues affect only a small user slice.
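The traffic split at the heart of a canary rollout is just weighted routing. A one-line sketch (in practice the split lives in the load balancer, service mesh, or gateway, not application code, and is often sticky per user rather than per request):

```python
import random

def route(canary_weight):
    """Send roughly `canary_weight` fraction of requests to the canary.
    Ramping the rollout means raising this weight: 0.01 -> 0.05 -> ... -> 1.0."""
    return "canary" if random.random() < canary_weight else "stable"
```

Monitoring compares error rate and latency between the two destinations; if the canary degrades, setting the weight back to 0 is the rollback.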

36. How do you implement logging best practices in microservices?

Logging best practices: use structured logs (JSON) - machine-readable and easy to query. Include correlation/trace IDs so you can follow a request across services. Log at appropriate levels (DEBUG/INFO/WARN/ERROR). Centralize logs in ELK, Loki, or CloudWatch. Don't log sensitive data (PII, passwords). Avoid log noise - noisy logs hide real problems.

37. How do microservices ensure resilience?

Resilience in microservices comes from designing for failure. Use: circuit breakers (stop hitting failing services), retries with backoff (handle transient failures), bulkheads (isolate resource pools), timeouts (don't wait forever), health checks (remove unhealthy instances), graceful degradation (return partial results when non-critical services fail).

38. Explain health checks in microservices.

Health checks tell the orchestrator if a service is ready to serve traffic. Liveness probe: is the app alive? (if not, restart it). Readiness probe: is the app ready for traffic? (if not, stop sending requests to it). You implement an endpoint (/health or /ready) that checks DB connections, dependencies, and internal state.
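The logic behind such an endpoint is an aggregation over dependency checks. A sketch of what a `/ready` handler might compute (the check names are hypothetical; the web framework and HTTP wiring are omitted):

```python
def health_check(checks):
    """Run named dependency checks and report overall readiness.
    `checks` maps a name to a zero-arg callable returning True when healthy;
    a check that raises counts as unhealthy."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    status = "ready" if all(results.values()) else "unready"
    return {"status": status, "checks": results}
```

A readiness endpoint would return HTTP 200 for "ready" and 503 for "unready"; liveness checks should stay cheaper and avoid dependency calls, so a down database doesn't trigger pointless restarts.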

39. Explain the importance of idempotency in microservices.

Idempotency means calling an operation multiple times gives the same result as calling it once. Critical in microservices because retries are common. Use idempotency keys (client sends a unique ID with each request, server stores results and returns the same response for duplicate IDs). Makes retries safe - no duplicate orders, no duplicate charges.
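The key-based deduplication can be sketched with an in-memory response store (a real service would persist keys in a database or Redis with an expiry, and handle concurrent duplicates):

```python
class IdempotentHandler:
    """Stores the response for each idempotency key; a duplicate request
    replays the stored response instead of re-executing the operation."""

    def __init__(self):
        self._responses = {}

    def handle(self, idempotency_key, operation):
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]   # replay, don't re-run
        result = operation()
        self._responses[idempotency_key] = result
        return result
```

A client that times out and retries with the same key gets the original response back, and the charge or order is only executed once.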

40. How do you monitor microservices performance?

Monitor microservices with: Prometheus to scrape metrics (request rate, error rate, latency - the RED method). Grafana dashboards to visualize. Distributed tracing (Jaeger) to see request paths. Alerting via Alertmanager or PagerDuty. Set SLOs (service level objectives) and alert when you burn through your error budget.

41. How does event-driven communication work through a message broker?

In event-driven microservices, services communicate by publishing events to a broker (Kafka, RabbitMQ). No direct service-to-service calls. Producer publishes "UserRegistered", consumer services independently react. This decouples services temporally and spatially - they don't need to be running at the same time or know each other's addresses.

42. Difference between event-driven and request-driven microservices.

Event-driven: services communicate via events/messages through a broker. Loose coupling, async, high throughput. Request-driven: service A calls service B directly and waits (HTTP/gRPC). Simple to understand, easier debugging. Request-driven works well for queries and commands needing immediate response. Event-driven works well for workflows and fan-out operations.

43. What are message brokers?

Message brokers are middleware that receive, store, and forward messages between services. They decouple producers and consumers - producer doesn't need to know who consumes, consumer doesn't need to be online when producer sends. Examples: Kafka (high-throughput streaming), RabbitMQ (flexible routing), AWS SQS (managed queue). Enable async, resilient communication.

44. Explain pub/sub and message queue patterns.

Pub/sub: publisher sends to a topic, multiple subscribers receive copies independently. One-to-many broadcast. Queue (point-to-point): message goes to one consumer in a group, processed once. Load balances work across consumers. Most systems use both: Kafka topics with consumer groups give pub/sub semantics plus competing consumer load balancing in the same system.
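The delivery-semantics difference can be shown with two tiny in-memory models (illustrative only - real brokers add persistence, acknowledgements, and ordering guarantees):

```python
import itertools

class Topic:
    """Pub/sub: every subscriber receives its own copy of each message."""
    def __init__(self):
        self.subscribers = []
    def subscribe(self, handler):
        self.subscribers.append(handler)
    def publish(self, message):
        for handler in self.subscribers:
            handler(message)

class Queue:
    """Point-to-point: each message goes to exactly one consumer,
    round-robin across the competing consumers."""
    def __init__(self):
        self.consumers = []
        self._rr = None
    def add_consumer(self, handler):
        self.consumers.append(handler)
        self._rr = itertools.cycle(self.consumers)
    def publish(self, message):
        next(self._rr)(message)
```

In Kafka terms: every consumer *group* on a topic behaves like a `Topic` subscriber (gets all messages), while consumers *within* one group behave like `Queue` consumers (split the messages).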

45. Explain Kafka and its advantages.

Kafka is a distributed streaming platform. Advantages: extremely high throughput (millions of events/second), durable (events stored on disk, replicated), replayable (consumers can re-read past events), ordered within partitions, horizontal scaling via partitions. Used for event sourcing, stream processing, activity tracking, and service-to-service async communication.

46. How do microservices ensure reliable messaging?

Reliable messaging strategies: at-least-once delivery (broker retries until ack - make consumers idempotent), exactly-once (Kafka transactions, harder to achieve), persistent storage in the broker (messages survive restarts), acknowledgements (consumer explicitly acks after processing), dead-letter queues for messages that repeatedly fail processing.

47. What is the transactional outbox pattern?

Transactional outbox: instead of publishing directly to Kafka (two operations - DB write and message publish can't be atomic), write the event to an "outbox" table in the same DB transaction as your data change. A separate relay process reads the outbox and publishes to Kafka, then marks as published. Guarantees at-least-once event delivery.
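The pattern can be sketched with SQLite standing in for the service's database and a callback standing in for the Kafka producer (table and event names are illustrative; a real relay also needs batching, ordering, and crash recovery):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount REAL)")
conn.execute("""CREATE TABLE outbox (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    payload TEXT NOT NULL,
    published INTEGER NOT NULL DEFAULT 0)""")

def place_order(order_id, amount):
    """Write the order and its event in ONE local transaction."""
    with conn:   # commits both inserts atomically, or neither
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, amount))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     (json.dumps({"type": "OrderPlaced", "id": order_id}),))

def relay(publish):
    """A separate process in real systems: read unpublished events,
    push them to the broker, then mark them published (at-least-once)."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))   # e.g. a Kafka producer send
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
```

If the relay crashes after publishing but before marking the row, the event is published again on restart - hence at-least-once delivery, and why consumers must be idempotent.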

48. How do microservices achieve scalability?

Microservices scale horizontally by running more instances. Each service scales independently based on its specific bottleneck. Auto-scaling reacts to metrics (CPU, memory, queue depth, custom metrics). Services are stateless (session in Redis not in-process), so any instance can handle any request. Load balancers distribute traffic across all instances.

49. Explain CQRS + Event Sourcing for scaling.

CQRS separates write model (handles commands, enforces business rules, appends events) from read model (denormalized projections optimized for queries). Event Sourcing provides the write model as an event log. Together: high-throughput writes, flexible querying, complete audit trail, and easy replay to rebuild or add new read models.

50. How does asynchronous communication improve microservices performance?

Async communication improves performance because the calling service doesn't block waiting for a response. It can handle other work while the downstream service processes. Message queues absorb traffic spikes - producers publish at their rate, consumers process at their rate. This smooths out load instead of letting spikes overwhelm downstream services.