Top 50 Microservices Interview Questions in 2026 (With Real Answers)

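After a decade of "split everything into microservices," the backlash was in full swing by 2025. Back in 2023, a Prime Video team famously moved its video-quality monitoring service from a distributed serverless design to a single-process monolith and reported cutting that service's infrastructure cost by about 90%. By 2026, the smart take is "use microservices when their costs pay for themselves, and don't before."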

These are the 50 microservices questions that come up in real interviews in 2026 - including the ones designed to catch candidates who've memorized buzzwords without understanding the tradeoffs.

Fundamentals (1-10)

1. What is a microservice?

A small, independently deployable service that owns its data and exposes a network API. The "small" is fuzzy - the right size is the smallest unit that one team can develop, deploy, and operate without constantly coordinating with others.

2. Microservices vs monolith - which is the right default in 2026?

Modular monolith for new projects. Microservices when you've measured monolith pain (deployment coupling, scaling bottlenecks, team coordination overhead) and the split addresses a real problem. The default is monolith because microservices add cost up front for benefits you may never need.

3. What is a modular monolith?

A single deployable with strong internal module boundaries. Modules talk through well-defined interfaces, but ship together. You get most of the benefits of microservices (clear domains, testable units) without the operational tax (network calls, distributed transactions, observability sprawl).

4. When should you actually adopt microservices?

(1) Multiple teams need to deploy independently. (2) Different services have wildly different scaling needs (one is CPU-bound, another is IO-bound). (3) Different services have different uptime SLAs. (4) Compliance or security requires isolation. If none of those apply, stay monolithic.

5. What is a "distributed monolith" and why is it bad?

Microservices that must be deployed in lockstep, share databases, and tightly couple their APIs. You get the operational complexity of microservices and the deployment coupling of monoliths. The worst of both. Usually the result of a bad split done for resume reasons.

6. Domain-Driven Design - how does it relate to microservices?

DDD gives you the tools to find good service boundaries. Bounded contexts define where models are consistent. Aggregates define transaction boundaries. A service usually maps to one bounded context. Without DDD, splits tend to follow technical layers (auth-service, db-service) and become a distributed monolith.

7. What's a bounded context?

A clearly delimited area of the domain where terms have specific meanings. "Customer" in billing means something different than "Customer" in shipping. Each context owns its model. Microservice boundaries should follow bounded context boundaries.

8. How do you decide service boundaries?

Look for high cohesion (things that change together) and low coupling (things that don't). A good service can be reasoned about by one team in isolation. If you're constantly coordinating changes across two services, they should probably be one.

9. What's the "two-pizza team" rule?

Amazon's heuristic - a team should be small enough that two pizzas can feed it (~6-8 people). Each team owns a service end to end. If a service needs more people than that, either the service or the team is too big.

10. Conway's Law - what does it say about microservices?

Organizations that design systems tend to produce designs that mirror their own communication structures. If your team structure doesn't match your service boundaries, you'll fight friction constantly. Either restructure teams or reconsider the boundaries.

Communication Patterns (11-20)

11. Synchronous vs asynchronous - when do you use each?

Sync (HTTP/gRPC) - when the caller needs the answer to proceed. Async (message queue, event bus) - when work can happen out-of-band. Default to async for cross-service calls; sync makes you vulnerable to cascading failures.

12. REST vs gRPC vs GraphQL - which do you pick?

REST - lowest barrier, broadest compat, fine for most public APIs.
gRPC - lowest latency, schema-first, great for service-to-service. Default for internal in 2026.
GraphQL - client-driven shape, good for frontends with diverse data needs. Adds server-side complexity.

13. What's gRPC and why is it popular for internal services?

HTTP/2-based RPC framework with Protobuf schemas. Multiplexed streams, binary encoding, generated clients for most major languages. Typically faster than JSON over HTTP/1.1, type-safe, with built-in deadlines and cancellation. The 2026 default for service-to-service inside a cluster.

14. What is an event bus?

A messaging infrastructure that decouples producers from consumers. Producers emit events; multiple consumers subscribe. Kafka, NATS, RabbitMQ, AWS EventBridge, GCP Pub/Sub - all event bus implementations.

15. Event-driven vs request-driven architecture?

Event-driven - producers emit, consumers react. Loose coupling, hard to trace.
Request-driven - explicit calls. Tight coupling, easy to reason about.

Most real systems are hybrid. Use events for "this happened" notifications, requests for "I need an answer."

16. What is the saga pattern?

A way to handle distributed transactions without 2PC. The transaction is a sequence of local transactions, each with a compensating action. If step 4 fails, you run compensations for 3, 2, 1. Coordinator-based (orchestration) or event-based (choreography).
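A minimal sketch of the orchestration flavor, assuming each step is an in-process callable paired with its compensation. The step names and the run_saga helper are hypothetical; a real orchestrator would also persist progress and retry compensations.

```python
from typing import Callable, List, Tuple

# (name, action, compensation) - all three are illustrative stand-ins
Step = Tuple[str, Callable[[], None], Callable[[], None]]

def run_saga(steps: List[Step]) -> bool:
    """Run local transactions in order; on failure, undo completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        _name, action, _compensate = step
        try:
            action()
            completed.append(step)
        except Exception:
            # The failed step changed nothing, so only earlier steps are compensated.
            for _done_name, _done_action, compensate in reversed(completed):
                compensate()
            return False
    return True

log: List[str] = []

def fail_payment() -> None:
    raise RuntimeError("card declined")

order_saga: List[Step] = [
    ("reserve_stock", lambda: log.append("stock reserved"), lambda: log.append("stock released")),
    ("create_shipment", lambda: log.append("shipment created"), lambda: log.append("shipment cancelled")),
    ("charge_card", fail_payment, lambda: log.append("charge refunded")),
]

succeeded = run_saga(order_saga)
```

Here the third step fails, so the shipment is cancelled and the stock released, in that order. Compensations should themselves be idempotent, since a crashed orchestrator may run them again.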

17. Choreography vs orchestration - which is better?

Choreography - services react to events. No central brain. Hard to reason about end-to-end.
Orchestration - one component drives the workflow. Centralized logic. Single point of coordination.

Default to orchestration for anything with more than three steps. Choreography looks elegant but becomes impossible to debug.

18. What's a service mesh and when do you need one?

A network layer (Istio, Linkerd) that handles service-to-service traffic - mTLS, retries, circuit breaking, observability. You don't need it for 5 services. You probably need it for 50. The threshold is when you're rebuilding cross-cutting features in every service.

19. What's mTLS and why does it matter?

Mutual TLS - both client and server present certificates. In a microservices setup, this gives you authenticated service-to-service identity by default. Service meshes provision and rotate the certs automatically.

20. How do you handle service discovery?

Three flavors. (1) DNS-based (Kubernetes services) - simple, default in K8s. (2) Client-side discovery - service queries a registry (Consul, Eureka), picks an instance. (3) Server-side - load balancer or API gateway routes by name. K8s + DNS is the most common in 2026.

Data Management (21-30)

21. Database-per-service - is it required?

Yes, in spirit. Each service owns its data. Other services don't read its tables directly. If you need shared data, expose an API or stream events. Without this rule, your "microservices" are a database monolith with API frontends.

22. How do you handle data that multiple services need?

(1) One service owns it, others call. (2) Replicate via events to read-optimized stores. (3) CQRS - one service writes, others build their own read models from the event stream. Choice depends on consistency tolerance and latency needs.

23. What's CQRS?

Command Query Responsibility Segregation. Writes go through one model; reads go through another. Often pairs with event sourcing. Useful when read and write workloads have different shapes. Overkill for simple CRUD.

24. Event sourcing - what is it and when do you use it?

Store every state change as an immutable event. Current state is derived by replaying events. Gives you audit logs, time travel, and easy CQRS read models. Cost: queries against current state require building projections.
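To make "current state is derived by replaying events" concrete, here is a small sketch for a bank-account style aggregate. The Event shape and event names are assumptions for illustration, not a standard.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass(frozen=True)  # events are immutable once recorded
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event types)
    amount: int  # cents

def balance(events: Iterable[Event]) -> int:
    """Fold the event stream into the current state."""
    total = 0
    for event in events:
        if event.kind == "deposited":
            total += event.amount
        elif event.kind == "withdrawn":
            total -= event.amount
    return total

event_log: List[Event] = [
    Event("deposited", 10_000),
    Event("withdrawn", 2_500),
    Event("deposited", 500),
]

current = balance(event_log)                 # state now
as_of_second_event = balance(event_log[:2])  # "time travel": replay a prefix
```

Replaying a prefix of the log is what gives you the audit and time-travel properties; a projection is the same fold, stored so queries don't have to replay every time.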

25. How do you handle distributed transactions?

You don't, mostly. You use sagas with compensating actions, or you accept eventual consistency. Two-phase commit exists but is rarely worth the operational cost. Design your domain so most transactions are local to one service.

26. What's eventual consistency and how do you reason about it?

Different services have different views of the data, briefly. Eventually they converge. The trick is making this acceptable to the business: "the order is placed; the inventory count updates within 2 seconds." Communicate the SLO explicitly.

27. The dual-write problem - what is it?

A service writes to its DB and emits an event as two separate operations. Two failure modes: the DB write succeeds but the event publish fails (consumers miss it), or the event is published but the DB transaction rolls back (consumers see a phantom). Solution: outbox pattern.

28. What is the outbox pattern?

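Write the event to an outbox table in the same DB transaction as the business write. A separate process reads the outbox and publishes events. The business write and the event record are atomic because both are in one transaction. Publishing is at-least-once (the relay can crash after publishing but before marking the row as sent), so consumers still need to be idempotent.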
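A sketch of the pattern using an in-memory SQLite database. The table names, the OrderPlaced payload, and the publish callback are assumptions; in production the relay would be a separate process or a CDC tool reading the same table.

```python
import json
import sqlite3
from typing import Callable, List

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")

def place_order(order_id: int, total_cents: int) -> None:
    # One transaction: both rows commit together or neither does.
    with conn:
        conn.execute("INSERT INTO orders (id, total_cents) VALUES (?, ?)",
                     (order_id, total_cents))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     (json.dumps({"type": "OrderPlaced", "order_id": order_id}),))

def relay_once(publish: Callable[[str], None]) -> int:
    """Publish pending rows in order, marking each one after publish returns."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id").fetchall()
    for row_id, payload in rows:
        publish(payload)  # a crash right after this re-sends the row on the next run
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)

sent: List[str] = []
place_order(1, 4999)
relay_once(sent.append)  # sent.append stands in for a broker client
```

The publish-then-mark order is deliberate: it can duplicate an event but never lose one.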

29. How do you handle data migrations across services?

Slowly. (1) Dual-write to old and new. (2) Backfill historical data. (3) Read from new while writing to both. (4) Stop writing to old. (5) Remove old. Each step needs to be deployable independently. The whole migration can take months.

30. Polyglot persistence - is it worth the complexity?

Sometimes. Different services use different databases optimized for their workload (Postgres for OLTP, Elasticsearch for search, Redis for cache, ClickHouse for analytics). The cost is operational diversity. Don't add a new database without strong justification.

Reliability and Resilience (31-40)

31. What's a circuit breaker?

A pattern that stops calling a failing dependency. After N failures, the circuit "opens" and calls fail fast for a period. After a cooldown, you "half-open" - allow one call through to test. Prevents cascading failure when downstream is dying.
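The closed / open / half-open cycle can be sketched in a few lines. This is a single-threaded illustration with hypothetical names (CircuitBreaker, CircuitOpenError); production libraries add locking, rolling failure windows, and metrics.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

class CircuitOpenError(Exception):
    """Raised instead of calling the dependency while the circuit is open."""

class CircuitBreaker:
    def __init__(self, max_failures: int = 3, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock  # injectable so the sketch is easy to test
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                raise CircuitOpenError("failing fast")
            # Cooldown elapsed: half-open, this call goes through as a probe.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a failed probe
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrap each dependency in its own breaker, for example `payments_breaker.call(lambda: client.charge(order))`, where the client is whatever you already use.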

32. Retry with backoff - how do you do it right?

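Exponential backoff with jitter: backoff = min(cap, base * 2^attempt), then delay = backoff + random(0, backoff/2). Without jitter, all clients retry at the same moment and the synchronized retries hammer the recovering service like a self-inflicted DDoS. Cap the number of retries, and retry only transient failures; a validation error will fail the same way every time.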
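A sketch of capped exponential backoff with additive jitter (up to half the backoff added on top). TransientError is a hypothetical marker for "worth retrying"; map your client's timeout and 503-style errors onto it.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 503s)."""

def backoff_delay(attempt: int, base: float = 0.1, cap: float = 10.0) -> float:
    backoff = min(cap, base * 2 ** attempt)
    # Jitter spreads clients out so they don't all retry at the same instant.
    return backoff + random.uniform(0, backoff / 2)

def retry(fn: Callable[[], T], max_attempts: int = 5,
          sleep: Callable[[float], None] = time.sleep) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the last error
            sleep(backoff_delay(attempt))
    raise RuntimeError("unreachable")
```

Anything that is not a TransientError propagates immediately, which is the "not every failure should retry" part. Only wrap operations that are safe to repeat.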

33. What's the difference between idempotency and at-least-once delivery?

At-least-once means a message might be delivered more than once - the network and the broker can't promise "exactly once." Idempotency means processing the same message twice has the same effect as once. The combination gives you effectively-once processing: duplicates may still arrive, but their effect is applied only once, so you don't need the impossible delivery guarantee.

34. How do you make an API idempotent?

(1) Idempotency keys - client supplies a UUID, server dedupes by it. (2) Natural idempotency - PUT with full state instead of POST with deltas. (3) Conditional updates - "set to X only if currently Y."
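Option (1) sketched with an in-memory store. PaymentService and its fields are hypothetical; a real service would keep the key and the state change in the same database transaction and expire old keys.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class PaymentService:
    balance_cents: int
    _responses: Dict[str, dict] = field(default_factory=dict)

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # Seen this key before: replay the stored response, change nothing.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        self.balance_cents -= amount_cents
        response = {"status": "charged", "balance_cents": self.balance_cents}
        self._responses[idempotency_key] = response
        return response

service = PaymentService(balance_cents=10_000)
key = str(uuid.uuid4())             # generated once by the client, reused on retry
first = service.charge(key, 2_500)
retried = service.charge(key, 2_500)  # e.g. the client timed out and retried
```

The retry returns the original response and the balance is debited once.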

35. What's a bulkhead?

Resource isolation between operations so one bad actor doesn't take everything down. Separate thread pools per dependency. Separate connection pools. If one dependency hangs, only that pool is exhausted, not all of them.
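One way to sketch a bulkhead is a semaphore per dependency that rejects work when its slots are taken. The Bulkhead class and the dependency names are assumptions for illustration.

```python
import threading
from typing import Callable, Dict, TypeVar

T = TypeVar("T")

class BulkheadFullError(Exception):
    """Raised when a dependency's concurrency slots are all in use."""

class Bulkhead:
    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn: Callable[[], T]) -> T:
        # Reject immediately rather than queue behind a hanging dependency.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("no free slot")
        try:
            return fn()
        finally:
            self._slots.release()

# One compartment per dependency: exhausting one leaves the other untouched.
bulkheads: Dict[str, Bulkhead] = {
    "payments": Bulkhead(max_concurrent=2),
    "recommendations": Bulkhead(max_concurrent=2),
}
```

If calls to payments hang and fill both slots, further payments calls fail fast while recommendations keeps serving from its own pool.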

36. Timeouts - what's the right number?

Lower than the caller's own timeout. If your API has a 5s SLO, your DB call timeout should be ~1s, so a slow query fails while there is still time to retry or degrade. Without explicit timeouts, slow dependencies become hangs that propagate. Always set them.

37. What's a deadline budget?

A total time budget for a request, propagated through every service call. Each downstream call gets a portion. When the budget runs out, the request fails fast instead of doing pointless work. gRPC has this built-in via deadlines.
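A sketch of a budget object you could pass down a call chain. The Deadline class is hypothetical; with gRPC you would rely on its built-in deadline propagation instead.

```python
import time
from typing import Callable

class DeadlineExceeded(Exception):
    """Raised when the request's total budget is already spent."""

class Deadline:
    def __init__(self, budget_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expires_at = clock() + budget_s

    def remaining(self) -> float:
        return self._expires_at - self._clock()

    def timeout_for_call(self, max_share_s: float) -> float:
        """Timeout for the next downstream call: never more than what is left."""
        left = self.remaining()
        if left <= 0:
            raise DeadlineExceeded("budget spent; failing fast")
        return min(left, max_share_s)
```

A handler would create `Deadline(2.0)` on entry and pass `deadline.timeout_for_call(0.5)` as the timeout of each outgoing call, so later calls automatically get less time as the budget drains.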

38. How do you do graceful degradation?

When a dependency is unavailable, return a partial response or a sensible default instead of failing. The home page should still render even if the recommendations service is down. Cache, fallback content, or "feature unavailable" beats a 500 error.
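The home page example as a sketch. The render_home_page function and the static fallback list are illustrative; the point is that the failure is caught at the edge of the optional feature, not at the top of the request.

```python
from typing import Callable, Dict, List

DEFAULT_RECOMMENDATIONS: List[str] = ["bestsellers"]  # hypothetical static fallback

def render_home_page(fetch_recommendations: Callable[[], List[str]]) -> Dict[str, object]:
    page: Dict[str, object] = {"header": "Welcome", "degraded": False}
    try:
        page["recommendations"] = fetch_recommendations()
    except Exception:
        # Optional feature failed: serve the fallback and flag it for monitoring.
        page["recommendations"] = DEFAULT_RECOMMENDATIONS
        page["degraded"] = True
    return page
```

Recording the degraded flag (as a metric or log field) matters; otherwise a permanently broken dependency can hide behind the fallback.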

39. What's chaos engineering?

Deliberately injecting failures (latency, errors, instance termination) to verify the system handles them. Netflix's Chaos Monkey is the canonical example. Run in pre-prod first, in prod once your system can take it. Catches bugs that load tests don't.

40. Blue-green vs canary vs rolling deploys?

Rolling - replace instances incrementally. Default in K8s.
Blue-green - two full environments, switch traffic at once. Fast rollback.
Canary - small percentage of traffic to new version, measure, expand.

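Canary is the safest for serious production. Rolling is the simplest. Blue-green is great when you need a clean, all-at-once cutover and instant rollback; shared state such as a database schema still has to work with both environments during the switch.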
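Canary routing is usually done by the load balancer or mesh, but the core decision can be sketched as a stable hash bucket per user. The routes_to_canary function is an assumption for illustration.

```python
import hashlib

def routes_to_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets; low buckets get the canary."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < canary_percent
```

Hashing the user ID (rather than rolling a die per request) keeps each user on one version, and raising the percentage only adds users to the canary; nobody flips back.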

Observability and Operations (41-50)

41. What are the three pillars of observability?

Logs, metrics, and traces. Logs - what happened. Metrics - how often and how slow. Traces - what path the request took. You need all three to debug a microservices system.

42. What is distributed tracing?

A trace ID propagates through every service call in a request. Each service emits spans with parent/child relationships. Tools (Jaeger, Tempo, Honeycomb, Datadog APM) reconstruct the call tree. Without traces, debugging cross-service issues is guessing.
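A hand-rolled sketch of the propagation, using the W3C traceparent header layout (version, 32-hex trace ID, 16-hex parent span ID, flags). The helper names are hypothetical; in practice the OpenTelemetry SDK does this for you.

```python
import secrets
from typing import Dict, Optional

def new_span(incoming_headers: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Continue the caller's trace if a traceparent header is present, else start one."""
    parent = incoming_headers.get("traceparent")
    if parent:
        _version, trace_id, parent_span_id, _flags = parent.split("-")
    else:
        trace_id, parent_span_id = secrets.token_hex(16), None
    return {
        "trace_id": trace_id,
        "span_id": secrets.token_hex(8),
        "parent_span_id": parent_span_id,
    }

def outgoing_headers(span: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Headers to attach to downstream calls so the next service joins the trace."""
    return {"traceparent": f"00-{span['trace_id']}-{span['span_id']}-01"}

gateway_span = new_span({})                              # edge service starts the trace
orders_span = new_span(outgoing_headers(gateway_span))   # downstream service joins it
```

Both spans share one trace ID, and the downstream span records the upstream span as its parent, which is all a backend needs to rebuild the call tree.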

43. OpenTelemetry - what is it and why does it matter?

The vendor-neutral observability standard. Defines APIs and SDKs for emitting traces, metrics, and logs. Lets you swap backends (Jaeger → Tempo → Honeycomb) without changing app code. By 2026 it's the default; vendor-specific instrumentation is a smell.

44. How do you do structured logging?

JSON logs with consistent fields: timestamp, level, service, trace_id, message, plus context-specific fields. Lets you query and aggregate. Plain text logs are unsearchable at scale.
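A sketch with the standard library's logging module. The JsonFormatter class and the "checkout" service name are assumptions; the in-memory stream is only there so the output is easy to inspect, and you would normally log to stdout.

```python
import io
import json
import logging
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "service": self.service,
            # trace_id arrives via logging's `extra` argument, if the caller set it
            "trace_id": getattr(record, "trace_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(entry)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter(service="checkout"))
logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the root logger from emitting a second, plain-text copy

logger.info("order placed", extra={"trace_id": "abc123"})
```

Each line is one JSON object, so a log pipeline can filter on trace_id or level without regex parsing.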

45. Correlation ID vs Trace ID - same thing?

Effectively yes. Trace ID is the term in OpenTelemetry; correlation ID is the older term from logging-only setups. Both are unique identifiers propagated across service boundaries to tie related logs together; a trace ID additionally links the spans of one request into a call tree.

46. What metrics should every service emit?

The RED method: Rate (requests per second), Errors (failed requests per second), Duration (request latency distribution). Plus the USE method for resources: Utilization, Saturation, Errors. Together they give you a quick read on health.
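A sketch of computing RED numbers from raw request records over a time window. The Request shape and red_summary are hypothetical; real services usually emit counters and histograms and let the metrics backend do this math.

```python
from dataclasses import dataclass
from typing import List, Optional, TypedDict

@dataclass
class Request:
    duration_ms: float
    ok: bool

class RedSummary(TypedDict):
    rate_rps: float
    errors_rps: float
    p95_ms: Optional[float]

def red_summary(requests: List[Request], window_s: float) -> RedSummary:
    if not requests:
        return {"rate_rps": 0.0, "errors_rps": 0.0, "p95_ms": None}
    durations = sorted(r.duration_ms for r in requests)
    # Nearest-rank percentile: rank = ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * len(durations) + 99) // 100
    return {
        "rate_rps": len(requests) / window_s,                               # Rate
        "errors_rps": sum(1 for r in requests if not r.ok) / window_s,      # Errors
        "p95_ms": durations[rank - 1],                                      # Duration
    }
```

Duration is reported as a percentile rather than a mean because a handful of slow requests can hide behind a healthy-looking average.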

47. How do you debug an issue spanning 8 services?

(1) Find the trace ID from the user report or logs. (2) Pull the full distributed trace - look for the slow or errored span. (3) Drill into that service's logs filtered by trace ID. (4) Check the dependency the slow span was calling. The first time you do this without traces, you'll never want to be without them again.

48. What's the "noisy neighbor" problem?

In shared infrastructure, one service's load (CPU, IO, network) impacts others on the same host. Solutions: resource limits per pod, isolation by node pool, dedicated nodes for hot services. Common in K8s clusters with mixed workloads.

49. How do you handle versioning across services?

Semantic versioning of APIs. Backwards-compatible changes are minor; breaking changes require a new major version. Run both versions during migration. Deprecate, don't delete. The N-1 compatibility rule: any service must work with both the previous and current version of its dependencies.

50. What's the most common reason microservices fail in real companies?

Premature adoption. Teams split a small monolith into 12 services for resume reasons or because some FAANG blog said so. They get all the costs (deployment overhead, distributed debugging, observability stack) without the benefits (independent scaling, team autonomy). Six months later they're either rolling back or hiring a platform team to make it tolerable.

Final thoughts

Microservices interviews in 2026 reward nuance. Interviewers know the pendulum has swung; they want to hear you reason about tradeoffs, not parrot the orthodoxy. "When wouldn't you use microservices?" is the question that separates candidates.

If you don't have a clear answer to that, you're not ready for the architecture round.