
Apache Kafka Interview Guide

1. What is Kafka?

Apache Kafka is a distributed streaming platform used for building real-time data pipelines and streaming applications. It is highly scalable, fault-tolerant, and delivers high throughput.

Main Use Cases:

- Real-time messaging

- Stream processing

- Event sourcing

- Log aggregation

2. Core Concepts

- **Producer**: Sends records to Kafka topics.

- **Consumer**: Reads records from topics.

- **Topic**: A category/feed name to which records are sent.

- **Partition**: A topic can be split into partitions for scalability.

- **Broker**: Kafka server that stores data and serves clients.

- **Zookeeper**: Coordinates and manages Kafka brokers (Kafka 3.0+ can run without it using KRaft mode).

- **Consumer Group**: A group of consumers that share the work of consuming records.
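
The producer decides which partition a record goes to, typically by hashing the record key. A minimal sketch of that idea (a hypothetical helper, not the Java client's actual murmur2-based partitioner):

```java
// Illustrative key-based partitioning: records with the same key always
// map to the same partition, which is what gives Kafka per-key ordering.
public class KeyPartitioning {
    static int partitionFor(String key, int numPartitions) {
        // Mask the sign bit so the result stays non-negative even if
        // hashCode() returns Integer.MIN_VALUE.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // Same key -> same partition, on every call:
        System.out.println(partitionFor("order-42", 6) == partitionFor("order-42", 6));
    }
}
```

Records with a null key are instead spread across partitions (round-robin or sticky batching in the real client).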

3. Kafka Architecture

- Topics are split into partitions.

- Each partition is replicated for fault-tolerance.

- Producers write records to topics; brokers store the partitions and serve them to clients.

- Consumers read from partitions in a consumer group.



- Kafka guarantees message ordering within a partition.
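
Within a consumer group, each partition is owned by exactly one consumer. A rough sketch of modulo-style assignment (illustrative only; the real client uses pluggable assignors such as RangeAssignor):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative partition assignment for a consumer group: every partition
// gets exactly one owner, so consumers beyond the partition count sit idle.
public class GroupAssignment {
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> owned = new LinkedHashMap<>();
        for (String c : consumers) owned.put(c, new ArrayList<>());
        for (int p = 0; p < numPartitions; p++) {
            owned.get(consumers.get(p % consumers.size())).add(p);
        }
        return owned;
    }
}
```

With 2 consumers and 4 partitions this yields {c1=[0, 2], c2=[1, 3]}; ordering is still guaranteed only within each partition, not across the topic.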

4. Key Features

- High throughput and low latency.

- Distributed and horizontally scalable.

- Persistent storage using commit logs.

- Exactly-once semantics (with idempotent producers and transactions enabled).

- Stream processing via Kafka Streams or ksqlDB.
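
The "persistent storage using commit logs" point can be pictured as an append-only list per partition, addressed by offsets. An in-memory toy model (not Kafka's actual on-disk segment format):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one partition's commit log: appends are ordered and each
// record gets a monotonically increasing offset, so any consumer can
// replay from an arbitrary offset.
public class PartitionLog {
    private final List<String> records = new ArrayList<>();

    long append(String record) {
        records.add(record);
        return records.size() - 1;        // offset of the record just written
    }

    List<String> readFrom(long offset) {  // replay everything from `offset` on
        return new ArrayList<>(records.subList((int) offset, records.size()));
    }
}
```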

5. Kafka CLI Commands

# Start Zookeeper (if needed)

bin/zookeeper-server-start.sh config/zookeeper.properties

# Start Kafka broker

bin/kafka-server-start.sh config/server.properties

# Create a topic

bin/kafka-topics.sh --create --topic test --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1

# List topics

bin/kafka-topics.sh --list --bootstrap-server localhost:9092

# Describe a topic

bin/kafka-topics.sh --describe --topic test --bootstrap-server localhost:9092



# Start a producer

bin/kafka-console-producer.sh --topic test --bootstrap-server localhost:9092

# Start a consumer

bin/kafka-console-consumer.sh --topic test --from-beginning --bootstrap-server localhost:9092

6. Kafka with Spring Boot

- Use Spring Kafka dependency:

<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
</dependency>

- Create Kafka producer and consumer configs

- Use @KafkaListener for consuming messages

- Use KafkaTemplate to send messages

Example:

@Service
public class MessageProducer {

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    public void sendGreeting() {
        kafkaTemplate.send("test", "Hello Kafka!");
    }
}

7. Common Kafka Interview Questions

- What is Kafka and why is it used?



- Difference between Kafka and traditional messaging systems?

- How does Kafka achieve high throughput?

- What happens if a Kafka consumer fails?

- What are Kafka offsets and how are they managed?

- Explain Kafka topic partitioning.

- Difference between at-least-once, at-most-once, and exactly-once delivery?

- How does Kafka ensure data durability?
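
The delivery-semantics question above comes down to when the consumer commits its offset relative to processing. A simulation of a crash between the two steps (illustrative; assumes a single partition and a crash exactly at record `crashAt`):

```java
import java.util.ArrayList;
import java.util.List;

// Committing BEFORE processing -> at-most-once (a crash can lose a record).
// Committing AFTER processing  -> at-least-once (a crash can duplicate one).
public class DeliverySemantics {
    static List<String> run(List<String> log, int crashAt, boolean commitFirst) {
        List<String> processed = new ArrayList<>();
        int committed = 0;
        for (int i = 0; i < log.size(); i++) {    // first run, crashes at crashAt
            if (commitFirst) {
                committed = i + 1;
                if (i == crashAt) break;          // crash: committed, never processed
                processed.add(log.get(i));
            } else {
                processed.add(log.get(i));
                if (i == crashAt) break;          // crash: processed, never committed
                committed = i + 1;
            }
        }
        for (int i = committed; i < log.size(); i++) {
            processed.add(log.get(i));            // restart from last committed offset
        }
        return processed;
    }
}
```

For the log [a, b, c] with a crash at record b, commit-first processes [a, c] (b is lost) while process-first yields [a, b, b, c] (b is duplicated); exactly-once closes this gap with idempotence and transactions.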

8. Best Practices

- Use multiple partitions for scalability.

- Use a replication factor of at least 2 (3 is typical in production) for fault tolerance.

- Monitor lag using Kafka tools.

- Avoid committing offsets too frequently or too late.

- Set retention policies wisely (log.retention.hours).

- Secure Kafka with SSL, SASL, and ACLs.
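
Consumer lag, the metric behind the "monitor lag" advice, is just the log-end offset minus the committed offset per partition. A hypothetical helper mirroring what `kafka-consumer-groups.sh --describe` reports:

```java
import java.util.Map;

// Total lag for a consumer group: how many records exist that the group
// has not yet committed, summed over partitions.
public class LagMonitor {
    static long totalLag(Map<Integer, Long> endOffsets, Map<Integer, Long> committed) {
        long lag = 0;
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            lag += e.getValue() - committed.getOrDefault(e.getKey(), 0L);
        }
        return lag;
    }
}
```

Steadily growing lag means consumers cannot keep up with producers and the group needs more consumers, faster processing, or more partitions.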
