Tutorial: Add and Remove Brokers with Auto Data Balancer in Confluent Platform¶
This tutorial runs Confluent Auto Data Balancer (ADB) on Confluent Server, which allows you to shift data to create an even workload across your cluster. By the end of this tutorial, you will have successfully run the Auto Data Balancer CLI tool to rebalance data after adding and removing brokers.
Starting in Confluent Platform 6.0.0, Manage Self-Balancing Kafka Clusters in Confluent Platform is the preferred alternative to Auto Data Balancer. For a detailed feature comparison, see Self-Balancing vs. Auto Data Balancer.
Installing and running Docker¶
For this tutorial, you run the Confluent Platform images using the Docker client and Docker Compose.
To get started, install Docker and get it running. The Confluent Platform Docker Images require Docker version 1.11 or greater.
- Prerequisites
- Docker version 1.11 or later installed and running.
- If you’re running on macOS, you must use Docker for Mac.
- If you’re running on Windows, you must use Docker for Windows.
- You must allocate at least 8 GB of RAM (2 GB is the default).
- Git
Note
In this tutorial, Kafka is configured to store data locally in the Docker containers. For production deployments, you should use mounted volumes for persisting data in the event that a container stops running or is restarted. Kafka relies heavily on the filesystem for storing and caching messages, so this is important when running Kafka on Docker. For an example of how to add mounted volumes to the host machine, see Mount Docker External Volumes in Confluent Platform.
Use Docker to set up a three-node Kafka cluster¶
Note
In the following steps, each Docker container runs in detached mode. You are shown how to access the logs for a running container. You can also run the containers in the foreground by replacing the -d flags with -it.
Clone the Git repository and navigate to the examples directory:
git clone [email protected]:confluentinc/kafka-images.git
cd kafka-images/examples/confluent-server
Start the services using the example Docker Compose file, which has configuration properties for one KRaft controller and three Kafka brokers. The brokers are spread across two racks. First, one rack with three brokers is started, a topic with sample data is created, and the Auto Data Balancer CLI tool runs to balance the cluster. After this step, you will add another rack of brokers and run the Auto Data Balancer CLI tool again to rebalance the data across the newly added brokers.
Create the controller and the first rack of brokers using the Docker Compose command.
docker compose create
You should see the following:

[+] Running 4/4
 ✔ kafka-1 Pulled                                    1.9s
 ✔ kafka-2 Pulled                                    1.9s
 ✔ kafka-3 Pulled                                    1.9s
 ✔ controller-1 Pulled                               1.9s
[+] Creating 4/4
 ✔ Container confluent-server-controller-1-1  Created  0.2s
 ✔ Container confluent-server-kafka-1-1       Created  0.3s
 ✔ Container confluent-server-kafka-2-1       Created  0.3s
 ✔ Container confluent-server-kafka-3-1       Created  0.3s
Start the services.
docker compose up -d
You should see the following:

[+] Running 4/4
 ✔ Container confluent-server-controller-1-1  Started  0.2s
 ✔ Container confluent-server-kafka-2-1       Started  0.1s
 ✔ Container confluent-server-kafka-3-1       Started  0.2s
 ✔ Container confluent-server-kafka-1-1       Started  0.2s
You can also run the following command to see the status of the containers:
docker compose ps
You should see the following:
confluent-server-controller-1-1   confluentinc/cp-server:latest   "/etc/confluent/dock…"   controller-1   8 minutes ago   Up 47 seconds
confluent-server-kafka-1-1        confluentinc/cp-server:latest   "/etc/confluent/dock…"   kafka-1        8 minutes ago   Up 47 seconds
confluent-server-kafka-2-1        confluentinc/cp-server:latest   "/etc/confluent/dock…"   kafka-2        8 minutes ago   Up 47 seconds
confluent-server-kafka-3-1        confluentinc/cp-server:latest   "/etc/confluent/dock…"   kafka-3        8 minutes ago   Up 47 seconds
Check the Kafka logs to verify that the brokers are healthy.

docker compose logs kafka-1 | grep -i started
You should see a message that looks like the following:

kafka-1-1  | [2025-04-24 21:38:22,482] INFO [ClusterLinkManager-broker-2] ClusterLinkManager has started up. (kafka.server.link.ClusterLinkManager)
kafka-1-1  | [2025-04-24 21:38:22,852] INFO [BrokerServer id=2] Waiting for all of the SocketServer Acceptors to be started (kafka.server.BrokerServer)
....
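To check all three brokers at once instead of one at a time, you can loop over the service names. This is a sketch; the service names are taken from the example Docker Compose file described above.

```shell
# Check each broker's logs for a startup message; report any broker
# that has not logged one yet.
for broker in kafka-1 kafka-2 kafka-3; do
  if docker compose logs "$broker" | grep -iq started; then
    echo "$broker: started"
  else
    echo "$broker: no startup message yet"
  fi
done
```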
Create a topic and generate data¶
Next, you will populate a topic with data and then run the Auto Data Balancer CLI tool to rebalance the data across the cluster.
Now that the brokers are up, create a test topic called adb-test.

docker run \
  --net=host \
  --rm confluentinc/cp-kafka:8.0.0 \
  kafka-topics --create --topic adb-test --partitions 20 --replication-factor 3 --if-not-exists --bootstrap-server localhost:19092
You should see the following output in your terminal window:
Created topic adb-test.
Optional: Verify that the topic was created successfully:
docker run \
  --net=host \
  --rm confluentinc/cp-kafka:8.0.0 \
  kafka-topics --describe --topic adb-test --bootstrap-server localhost:19092
You should see the following output in your terminal window:
Topic: adb-test	TopicId: cjGEOHiWRJKO_HEtVbSFDA	PartitionCount: 20	ReplicationFactor: 3	Configs:
	Topic: adb-test	Partition: 0	Leader: 3	Replicas: 3,4,2	Isr: 3,4,2	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 1	Leader: 4	Replicas: 4,2,3	Isr: 4,2,3	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 2	Leader: 2	Replicas: 2,3,4	Isr: 2,3,4	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 3	Leader: 4	Replicas: 4,2,3	Isr: 4,2,3	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 4	Leader: 2	Replicas: 2,3,4	Isr: 2,3,4	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 5	Leader: 3	Replicas: 3,4,2	Isr: 3,4,2	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 6	Leader: 3	Replicas: 3,2,4	Isr: 3,2,4	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 7	Leader: 2	Replicas: 2,4,3	Isr: 2,4,3	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 8	Leader: 4	Replicas: 4,3,2	Isr: 4,3,2	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 9	Leader: 4	Replicas: 4,2,3	Isr: 4,2,3	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 10	Leader: 2	Replicas: 2,3,4	Isr: 2,3,4	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 11	Leader: 3	Replicas: 3,4,2	Isr: 3,4,2	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 12	Leader: 4	Replicas: 4,2,3	Isr: 4,2,3	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 13	Leader: 2	Replicas: 2,3,4	Isr: 2,3,4	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 14	Leader: 3	Replicas: 3,4,2	Isr: 3,4,2	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 15	Leader: 3	Replicas: 3,4,2	Isr: 3,4,2	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 16	Leader: 4	Replicas: 4,2,3	Isr: 4,2,3	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 17	Leader: 2	Replicas: 2,3,4	Isr: 2,3,4	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 18	Leader: 2	Replicas: 2,3,4	Isr: 2,3,4	Elr: N/A	LastKnownElr: N/A
	Topic: adb-test	Partition: 19	Leader: 3	Replicas: 3,4,2	Isr: 3,4,2	Elr: N/A	LastKnownElr: N/A
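If you want a quick summary instead of scanning twenty partition lines, you can pipe the describe output through awk to count how many partitions each broker leads. This is a sketch; the field positions assume the kafka-topics output format shown above.

```shell
# Count partition leaders per broker. In each partition line,
# "Leader:" is field 5 and the leader's broker ID is field 6.
docker run --net=host --rm confluentinc/cp-kafka:8.0.0 \
  kafka-topics --describe --topic adb-test --bootstrap-server localhost:19092 \
  | awk '/Leader:/ {leaders[$6]++} END {for (b in leaders) print "broker " b " leads " leaders[b] " partitions"}'
```

A well-balanced 20-partition topic on three brokers should show roughly 6-7 leaders per broker.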
Now add data to the new topic. For this step, you will use the Kafka Producer Perf Test tool to generate sample data and send it to the topic.
docker run \
  --net=host \
  --rm \
  confluentinc/cp-kafka:8.0.0 \
  bash -c 'kafka-producer-perf-test --topic adb-test --num-records 2000000 --record-size 1000 --throughput 100000 --producer-props bootstrap.servers=localhost:19092'
This command uses the built-in Kafka Performance Producer to produce 2 GB of sample data to the topic. Upon running it, you should see the following:
77585 records sent, 15507.7 records/sec (14.79 MB/sec), 1634.9 ms avg latency, 2793.0 ms max latency.
108416 records sent, 21635.6 records/sec (20.63 MB/sec), 1504.6 ms avg latency, 2066.0 ms max latency.
125248 records sent, 24979.7 records/sec (23.82 MB/sec), 1182.5 ms avg latency, 3347.0 ms max latency.
81216 records sent, 16217.3 records/sec (15.47 MB/sec), 2126.8 ms avg latency, 2744.0 ms max latency.
107744 records sent, 21531.6 records/sec (20.53 MB/sec), 1546.7 ms avg latency, 2100.0 ms max latency.
139488 records sent, 27830.8 records/sec (26.54 MB/sec), 1210.0 ms avg latency, 2084.0 ms max latency.
146656 records sent, 29319.5 records/sec (27.96 MB/sec), 1154.8 ms avg latency, 1793.0 ms max latency.
219680 records sent, 43918.4 records/sec (41.88 MB/sec), 736.0 ms avg latency, 1882.0 ms max latency.
143056 records sent, 28588.3 records/sec (27.26 MB/sec), 1061.5 ms avg latency, 1713.0 ms max latency.
175200 records sent, 35026.0 records/sec (33.40 MB/sec), 923.0 ms avg latency, 1614.0 ms max latency.
178096 records sent, 35576.5 records/sec (33.93 MB/sec), 962.8 ms avg latency, 1433.0 ms max latency.
176752 records sent, 35308.0 records/sec (33.67 MB/sec), 887.7 ms avg latency, 1194.0 ms max latency.
227952 records sent, 45590.4 records/sec (43.48 MB/sec), 747.4 ms avg latency, 1250.0 ms max latency.
2000000 records sent, 29761.904762 records/sec (28.38 MB/sec), 1075.94 ms avg latency, 3347.00 ms max latency, 1006 ms 50th, 2043 ms 95th, 2598 ms 99th, 3149 ms 99.9th.
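The 2 GB figure follows directly from the perf-test arguments, which you can sanity-check with shell arithmetic:

```shell
# 2,000,000 records x 1,000 bytes each = 2 GB of raw data per producer run;
# with replication factor 3, the cluster stores roughly three times that.
records=2000000
record_size=1000
total_bytes=$((records * record_size))
echo "raw: $((total_bytes / 1000000000)) GB, replicated: $((3 * total_bytes / 1000000000)) GB"
```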
Run confluent-rebalancer to balance the data in the cluster.

docker run \
  --net=host \
  --rm \
  confluentinc/cp-enterprise-kafka:8.0.0 \
  bash -c "confluent-rebalancer execute --bootstrap-server localhost:19092 --metrics-bootstrap-server localhost:19092 --throttle 100000000 --force --verbose"
You should see the rebalancing start and should see the following:
Computing the rebalance plan (this may take a while) ...
You are about to move 6 replica(s) for 6 partitions to 1 broker(s) with total size 0.9 MB.
The preferred leader for 6 partition(s) will be changed.
In total, the assignment for 7 partitions will be changed.

The following brokers will require more disk space during the rebalance and, in some cases, after the rebalance:
Broker    Current (MB)    During Rebalance (MB)    After Rebalance (MB)
2         2,212.8         2,213.8                  2,213.8

Min/max stats for brokers (before -> after):
Type    Leader Count                Replica Count               Size (MB)
Min     8 (id: 2) -> 10 (id: 1)     21 (id: 2) -> 27 (id: 1)    2,069.6 (id: 1) -> 2,069.1 (id: 1)
Max     12 (id: 3) -> 11 (id: 2)    30 (id: 1) -> 27 (id: 1)    2,212.8 (id: 2) -> 2,213.8 (id: 2)

Rack stats (before -> after):
Rack      Leader Count    Replica Count    Size (MB)
rack-a    31 -> 31        81 -> 81         6,352 -> 6,352

Broker stats (before -> after):
Broker    Leader Count    Replica Count    Size (MB)
1         11 -> 10        30 -> 27         2,069.6 -> 2,069.1
2         8 -> 11         21 -> 27         2,212.8 -> 2,213.8
3         12 -> 10        30 -> 27         2,069.6 -> 2,069.1

The rebalance has been started, run ``status`` to check progress.
Warning: You must run the ``status`` or ``finish`` command periodically, until the rebalance completes, to ensure the throttle is removed. You can also alter the throttle by re-running the execute command passing a new value.
You can check the status of the rebalance operation by running the following command:
docker run \
  --net=host \
  --rm \
  confluentinc/cp-enterprise-kafka:8.0.0 \
  bash -c "confluent-rebalancer status --bootstrap-server localhost:19092"
If you see a message like 7 partitions are being rebalanced, wait 15-20 seconds and rerun the above command until you see No rebalance is currently in progress. This means that the rebalance action has completed successfully.

You can finish the rebalance action by running the following command, which ensures that the replication throttle is removed:
docker run \
  --net=host \
  --rm \
  confluentinc/cp-enterprise-kafka:8.0.0 \
  bash -c "confluent-rebalancer finish --bootstrap-server localhost:19092"
You should see the following in the logs:
The rebalance has completed and throttling has been disabled
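Rather than rerunning the status command by hand, the wait-and-recheck step can be scripted. This is a sketch; it keys off the status message text shown above.

```shell
# Poll rebalance status every 15 seconds until the rebalance completes.
while true; do
  status=$(docker run --net=host --rm confluentinc/cp-enterprise-kafka:8.0.0 \
    bash -c "confluent-rebalancer status --bootstrap-server localhost:19092")
  echo "$status"
  case "$status" in
    *"No rebalance is currently in progress"*) break ;;
  esac
  sleep 15
done
```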
Now you can try removing a broker and running the rebalance operation again.
Hint: You must notify the rebalancer to exclude the broker from the rebalance plan. For example, to remove broker 2 you must run the following command:
docker run \
  --net=host \
  --rm \
  confluentinc/cp-enterprise-kafka:8.0.0 \
  bash -c "confluent-rebalancer execute --bootstrap-server localhost:19092 --metrics-bootstrap-server localhost:19092 --throttle 100000000 --force --verbose --remove-broker-ids 2"
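After the removal rebalance completes, you can verify that broker 2 no longer holds replicas of the test topic. This is a sketch; the awk field position assumes the kafka-topics describe format shown earlier.

```shell
# Scan the replica assignments and flag any partition that still lists broker 2.
# In each partition line, "Replicas:" is field 7 and the replica list is field 8.
docker run --net=host --rm confluentinc/cp-kafka:8.0.0 \
  kafka-topics --describe --topic adb-test --bootstrap-server localhost:19092 \
  | awk '/Replicas:/ { n = split($8, r, ","); for (i = 1; i <= n; i++) if (r[i] == 2) found = 1 }
         END { print (found ? "broker 2 still holds replicas" : "broker 2 holds no replicas") }'
```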
You can optionally experiment further with the confluent-rebalancer command. When you are done, use the following commands to shut down all the components.

docker compose stop
If you want to remove all the containers, run:
docker compose rm