A complete, visual guide — core concepts, internals, partitioning, ISR, system design patterns, and a Spring Boot E2E walkthrough.
Start from zero. Understand the problem Kafka solves before touching any code.
Imagine you have 10 services. Each service needs to send data to 5 other services. That's 50 direct connections — a maintenance nightmare. Every new service multiplies the complexity.
Kafka introduces a central, durable event log that all services write to and read from. Instead of 50 connections, you have N + M connections (producers + consumers).
The building blocks you must understand cold before any interview.
A Topic is a named, ordered, immutable log of events. Think of it as a database table — but append-only and time-ordered.
// Creating a topic programmatically in Java
AdminClient admin = AdminClient.create(props);
NewTopic topic = new NewTopic(
"orders", // topic name
3, // partitions
(short) 2 // replication factor
);
admin.createTopics(List.of(topic));
An Offset is a unique, sequential ID for each message within a partition. Offsets are per-partition — not global across the topic.
// Manually committing offset after processing
@KafkaListener(topics = "orders")
public void consume(ConsumerRecord<String, String> record,
Acknowledgment ack) {
processOrder(record.value());
ack.acknowledge(); // commit offset
}
A Broker is a single Kafka server. It stores partitions, handles reads/writes, and replicates data to other brokers. A Kafka cluster is a group of brokers.
# broker.properties key settings
broker.id=0
log.dirs=/var/kafka/logs
num.partitions=3
default.replication.factor=3
min.insync.replicas=2
A Consumer Group is a set of consumers sharing work. Each partition is assigned to exactly ONE consumer in the group at a time — ensuring ordered, parallel processing.
Properties props = new Properties();
props.put("group.id", "order-processing-group");
props.put("auto.offset.reset", "earliest");
// All consumers with same group.id share work
This is the most important concept for scaling Kafka. Master it.
When a producer sends a message, Kafka decides which partition it goes to using a partitioning strategy. This is crucial for both ordering guarantees and load distribution.
partition = hash(key) % numPartitions. Same key ALWAYS goes to same partition — guarantees ordering per key. Use for: user events (key=userId), order events (key=orderId).ProducerRecord<String, String> record =
new ProducerRecord<>(
"orders", // topic
"user-123", // KEY → same partition always!
orderJson // value
);
producer.send(record);
// No key = round-robin distribution
ProducerRecord<String, String> record =
new ProducerRecord<>(
"metrics", // topic
null, // no key!
metricJson // value
);
producer.send(record);
public class RegionPartitioner
implements Partitioner {
@Override
public int partition(String topic, Object key,
byte[] keyBytes, Object value,
byte[] valueBytes, Cluster cluster) {
String region = key.toString();
return switch (region) {
case "US" -> 0;
case "EU" -> 1;
default -> 2;
};
}
}
Click a message block to see its metadata. See how keys determine partition assignment.
This is a common interview question. There's no magic formula, but here's the mental model:
partitions = max(throughput_needed / single_partition_throughput, consumers_needed)
You can't have more active consumers in a group than partitions. Plan for your max parallelism first.
Each partition handles ~10-100 MB/s depending on message size and broker. If you need 1 GB/s, you need at least 10–100 partitions.
Each partition costs memory, file handles, and replication overhead. More isn't always better. Over-partitioning increases leader election time.
Replication, ISR, leader election, log compaction — the internals that make Kafka reliable.
Every partition has one leader and zero or more followers. Producers and consumers always talk to the leader. Followers replicate data from the leader continuously.
The ISR (In-Sync Replica Set) is the list of replicas that are fully caught up with the leader. This is Kafka's durability guarantee mechanism.
// Producer durability settings
props.put("acks", "all"); // wait for all ISR
props.put("retries", Integer.MAX_VALUE); // retry forever
props.put("enable.idempotence", "true"); // exactly-once
props.put("max.in.flight.requests.per.connection", "5");
| acks setting | Durability | Latency | Use case |
|---|---|---|---|
| acks=0 | Fire & forget | Lowest | Metrics, logs (OK to lose) |
| acks=1 | Leader only | Medium | Default — balanced |
| acks=all | All ISR | Higher | Financial, orders, critical |
Kafka's default retention policy deletes old messages by time or size. But log compaction is different — it keeps the latest value per key, deleting old versions.
Before Kafka 3.0, ZooKeeper managed cluster metadata (leader election, broker registry). KRaft (Kafka Raft) replaces this with a built-in consensus mechanism — fewer moving parts, faster leader election, simpler operations.
Hands-on Java examples covering everything you need for interviews and production.
public class OrderProducer {
public static void main(String[] args) {
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer",
"org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer",
"org.apache.kafka.common.serialization.StringSerializer");
// Durability settings
props.put("acks", "all");
props.put("retries", 3);
props.put("enable.idempotence", "true");
try (KafkaProducer<String, String> producer =
new KafkaProducer<>(props)) {
for (int i = 0; i < 10; i++) {
ProducerRecord<String, String> record =
new ProducerRecord<>(
"orders", // topic
"order-" + i, // key
"{\"id\":" + i + "}" // value
);
producer.send(record);
}
}
}
}
// Custom serializer for domain objects
public class JsonSerializer<T> implements Serializer<T> {
private final ObjectMapper mapper = new ObjectMapper();
@Override
public byte[] serialize(String topic, T data) {
try {
return mapper.writeValueAsBytes(data);
} catch (Exception e) {
throw new SerializationException(e);
}
}
}
// Using it with an Order object
props.put("value.serializer", JsonSerializer.class);
ProducerRecord<String, Order> record =
new ProducerRecord<>("orders", order.getId(), order);
// Async send with callback — know this for interviews!
producer.send(record, (metadata, exception) -> {
if (exception != null) {
log.error("Failed to send: {}", exception.getMessage());
// handle retry or DLQ
} else {
log.info("Sent to partition={} offset={}",
metadata.partition(),
metadata.offset());
}
});
// For sync (blocking) confirmation:
RecordMetadata meta = producer.send(record).get();
// .get() blocks until broker acks
public class OrderConsumer {
public static void main(String[] args) {
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "order-service-group");
props.put("key.deserializer", "...StringDeserializer");
props.put("value.deserializer", "...StringDeserializer");
props.put("auto.offset.reset", "earliest"); // start from beginning
props.put("enable.auto.commit", "false"); // manual commit!
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(props)) {
consumer.subscribe(List.of("orders"));
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
System.out.printf(
"P:%d Offset:%d Key:%s Value:%s%n",
record.partition(),
record.offset(),
record.key(),
record.value()
);
processOrder(record.value());
}
// Commit AFTER processing (at-least-once semantics)
consumer.commitSync();
}
}
}
}
Simulate Kafka message flow — produce messages and watch consumers process them across partitions.
E-commerce Order Processing & Payment/Banking — with sharding diagrams.
Orders flow through multiple services: inventory check, payment, fulfillment, notification. Kafka decouples all of them with event-driven choreography.
If one user generates 80% of traffic (think a VIP or a bot), one partition becomes a hot partition — a bottleneck. Solutions:
userId + "_" + random(0,4). Distributes load but breaks strict per-user ordering.In banking, ordering per account is critical. A debit must be processed before the next credit check, or the balance could go negative.
// Producer with exactly-once semantics (transactions)
props.put("transactional.id", "payment-service-1");
props.put("enable.idempotence", "true");
props.put("acks", "all");
producer.initTransactions();
try {
producer.beginTransaction();
producer.send(debitRecord); // debit account A
producer.send(creditRecord); // credit account B
producer.commitTransaction(); // atomic!
} catch (ProducerFencedException e) {
producer.abortTransaction(); // roll back both
}
Full Spring Boot integration. Covers config, producer, consumer, and Docker compose.
# application.yml — Kafka Spring Boot config
spring:
kafka:
bootstrap-servers: localhost:9092
producer:
key-serializer: org.apache.kafka.common.serialization.StringSerializer
value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
acks: all
retries: 3
properties:
enable.idempotence: true
max.in.flight.requests.per.connection: 5
consumer:
group-id: order-service-group
auto-offset-reset: earliest
enable-auto-commit: false # manual commits!
key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
properties:
spring.json.trusted.packages: com.example.model
listener:
ack-mode: MANUAL_IMMEDIATE # commit after processing
# Topic config (auto-create in dev)
app:
kafka:
topic:
orders: orders
payments: payment-events
partitions: 3
replication-factor: 1 # 3 in prod
@Service
@RequiredArgsConstructor
public class OrderProducerService {
private final KafkaTemplate<String, OrderEvent> kafkaTemplate;
@Value("${app.kafka.topic.orders}")
private String ordersTopic;
public CompletableFuture<SendResult<String, OrderEvent>>
publishOrderCreated(Order order) {
OrderEvent event = OrderEvent.builder()
.orderId(order.getId())
.userId(order.getUserId())
.amount(order.getAmount())
.status("CREATED")
.timestamp(Instant.now())
.build();
// Key = userId ensures ordering per user
return kafkaTemplate
.send(ordersTopic, order.getUserId(), event)
.whenComplete((result, ex) -> {
if (ex != null) {
log.error("Failed to publish order {}: {}",
order.getId(), ex.getMessage());
} else {
log.info("Order {} sent to P:{} O:{}",
order.getId(),
result.getRecordMetadata().partition(),
result.getRecordMetadata().offset());
}
});
}
}
@Component
@Slf4j
public class OrderEventConsumer {
private final PaymentService paymentService;
private final MeterRegistry meterRegistry;
@KafkaListener(
topics = "${app.kafka.topic.orders}",
groupId = "${spring.kafka.consumer.group-id}",
concurrency = "3" // 3 threads = 3 partitions
)
public void handleOrderEvent(
@Payload OrderEvent event,
@Header(KafkaHeaders.RECEIVED_PARTITION) int partition,
@Header(KafkaHeaders.OFFSET) long offset,
Acknowledgment ack
) {
log.info("Processing order {} from P:{} O:{}",
event.getOrderId(), partition, offset);
try {
paymentService.initiatePayment(event);
ack.acknowledge(); // commit only on success
meterRegistry.counter("kafka.orders.processed").increment();
} catch (Exception e) {
log.error("Failed processing order {}",
event.getOrderId(), e);
// Don't ack — message will be retried
// Or route to DLQ after N retries
}
}
}
# docker-compose.yml — Local Kafka dev setup
version: '3.8'
services:
kafka:
image: confluentinc/cp-kafka:7.5.0
ports:
- "9092:9092"
environment:
# KRaft mode (no ZooKeeper!)
KAFKA_NODE_ID: 1
KAFKA_PROCESS_ROLES: broker,controller
KAFKA_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka:9093
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
CLUSTER_ID: MkU3OEVBNTcwNTJENDM2Qk
kafka-ui:
image: provectuslabs/kafka-ui:latest
ports:
- "8080:8080"
environment:
KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:9092
depends_on:
- kafka
docker compose up -d then open http://localhost:8080 for Kafka UI. You can visually inspect topics, partitions, and messages.The questions that actually get asked. Click each to reveal the model answer.
__consumer_offsets internal Kafka topic. (4) auto.offset.reset=earliest starts from offset 0; latest starts from new messages only.broker.id.min.insync.replicas=2 to prevent accepting writes with only 1 replica.replica.lag.time.max.ms). It matters because: (1) acks=all only waits for ISR members — if a slow follower falls out of ISR, it doesn't block producers. (2) min.insync.replicas defines the minimum ISR size — if ISR falls below this, Kafka refuses writes (NOT_ENOUGH_REPLICAS error). (3) During leader failure, only ISR members can become the new leader (unless unclean.leader.election.enable=true, which risks data loss). A replica is removed from ISR if it hasn't fetched from the leader within the lag timeout.enable.idempotence=true + transactional.id) + consumer with isolation.level=read_committed. Guarantees no duplicates AND no data loss. Most complex, slightly lower throughput. Required for financial transactions.max.poll.interval.ms. (4) Partitions are added to the topic. During rebalancing, all consumption stops (stop-the-world). Modern Kafka uses Cooperative Incremental Rebalancing (Kafka 2.4+) to reduce this impact — only partitions that need to move are reassigned. Use session.timeout.ms and heartbeat.interval.ms to tune liveness detection.cleanup.policy=compact. Note: compacted topics still preserve ordering within surviving offsets — they're not sorted by key. A message with offset 10 still comes after offset 5, even if they have the same key.topic-name.DLT. (4) A separate consumer monitors the DLQ for alerting/manual intervention. Spring Kafka's @RetryableTopic and DeadLetterPublishingRecoverer implement this automatically. Always include original partition, offset, and error details as headers in the DLQ message for debugging.max.in.flight.requests.per.connection=1 for strict ordering without idempotence, or use enable.idempotence=true with up to 5 in-flight requests. (3) The consumer should have only one active consumer per partition (enforced by consumer groups). Across partitions, there is no ordering guarantee. If you need total ordering, use a single-partition topic (but this kills scalability).replica.lag.time.max.ms). (2) Controller detects the leader is down (via ZooKeeper session expiry or KRaft mechanism). (3) Controller selects new leader from ISR (the most caught-up replica). (4) Controller updates partition metadata. (5) Producers and consumers are notified of new leader via metadata refresh. The failover typically completes in seconds. If ISR is empty and unclean.leader.election.enable=false (default), the partition becomes unavailable until an ISR member returns — this is the safe choice for critical data.max.poll.records, fetch.min.bytes. (5) Monitor with kafka-consumer-groups.sh --describe or Prometheus + Grafana. Alert when lag exceeds threshold (e.g., 1 million messages or 5 minutes behind)..log files — actual message data, append-only. (2) .index files — sparse index mapping offset → file position for O(log n) seeks. (3) .timeindex files — timestamp → offset mapping for time-based seeks. Kafka uses sequential I/O (append-only writes, sequential reads) — this is why it's fast even on spinning disks. Messages are also held in the OS page cache, so reads of recent messages hit memory. Log segments are rolled when they reach log.segment.bytes (default 1GB) or log.roll.ms.sendfile() syscall transfers data from page cache to network socket without copying to user space. (4) Batching — producers accumulate messages into batches (linger.ms, batch.size) before sending, amortizing network overhead. (5) Compression — batch-level compression (snappy, lz4, zstd) reduces network and storage I/O. (6) Partition parallelism — multiple partitions across multiple disks and brokers.enable.idempotence=true prevents producer-side duplicates caused by retries. (4) Transactional consumers — consume + produce + commit offset atomically. The best strategy depends on the cost of duplicates vs implementation complexity.order-created. (2) Inventory service consumes, reserves stock, publishes inventory-reserved or inventory-failed. (3) Payment service consumes inventory-reserved, processes payment, publishes payment-processed or payment-failed. (4) On any failure event, compensating transactions are published upstream. Challenges: (1) No global atomicity — compensating actions can also fail. (2) Hard to debug — need distributed tracing (correlation-id header). (3) Circular dependencies. For very complex flows, consider orchestration-based saga with Conductor or Temporal.