Created on 08-04-2016 11:03 AM - edited 08-17-2019 10:55 AM
This article is applicable for anyone deploying a Kafka cluster for production use. It highlights the key considerations you should think about before deploying your cluster.
It covers important configurations for the producer, broker, and consumer, plus tips for performance tuning.
The most important producer configurations:
The compression type for all data generated by the producer. The default is none (i.e. no compression). Valid values are none, gzip, snappy, or lz4. Compression is applied to full batches of data, so the efficacy of batching also impacts the compression ratio (more batching means better compression).
Name:
compression.type
Default:
none
Batching is one of the big drivers of efficiency, and to enable batching the Kafka producer will attempt to accumulate data in memory and to send out larger batches in a single request.
Name:
producer.type
Default:
sync
A small batch size will make batching less common and may reduce throughput (a batch size of zero will disable batching entirely). A very large batch size may use memory a bit more wastefully as we will always allocate a buffer of the specified batch size in anticipation of additional records.
Name:
batch.size
Default:
200
This is the largest message size Kafka will allow to be appended to this topic. Note that if you increase this size you must also increase your consumers' fetch size so they can fetch messages this large.
Account for this when doing disk sizing: roughly, average message size × message rate × retention period × replication factor.
Name:
max.message.bytes
Default:
1,000,000
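The sizing rule above can be sketched as simple arithmetic. The message size, rate, retention period, and replication factor below are hypothetical example values, not recommendations:

```python
# Rough Kafka disk-sizing estimate (all values are hypothetical examples).
avg_message_bytes = 1_000          # average message size
messages_per_sec = 50_000          # aggregate produce rate across the cluster
retention_secs = 7 * 24 * 3600     # 7-day retention window
replication_factor = 3

# Total disk needed across the cluster: bytes written per second,
# held for the retention window, stored replication_factor times.
total_bytes = (avg_message_bytes * messages_per_sec
               * retention_secs * replication_factor)

print(f"cluster disk needed: {total_bytes / 1024**4:.1f} TiB")
```

Divide the result by the number of brokers for a per-broker estimate, and leave generous headroom for partition imbalance and growth.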
The number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. The following settings are common:
acks=0 If set to zero then the producer will not wait for any acknowledgment from the server at all. The record will be immediately added to the socket buffer and considered sent. No guarantee can be made that the server has received the record in this case, and the retries configuration will not take effect (as the client won't generally know of any failures). The offset given back for each record will always be set to -1.
acks=1 This will mean the leader will write the record to its local log but will respond without awaiting full acknowledgement from all followers. In this case should the leader fail immediately after acknowledging the record but before the followers have replicated it then the record will be lost.
acks=all This means the leader will wait for the full set of in-sync replicas to acknowledge the record. This guarantees that the record will not be lost as long as at least one in-sync replica remains alive. This is the strongest available guarantee.
Name:
acks
Default:
1
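Taken together, the producer settings discussed above might look like this in a producer.properties file. The values are illustrative examples, not recommendations; tune them against your own durability and throughput requirements:

```properties
# Illustrative producer settings (example values only)
compression.type=snappy
batch.size=16384
acks=all
```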
To see the full list of producer configs:
http://kafka.apache.org/documentation.html#producerconfigs
The most important broker configurations are the following:
Specify the final compression type for a given topic. This configuration accepts the standard compression codecs ('gzip', 'snappy', 'lz4'). It additionally accepts 'uncompressed' which is equivalent to no compression; and 'producer' which means retain the original compression codec set by the producer.
Name:
compression.type
Default:
producer
Number of fetcher threads used to replicate messages from a source broker. Increasing this value can increase the degree of I/O parallelism in the follower broker.
Name:
num.replica.fetchers
Default:
1
The replication factor for the offsets topic (set higher to ensure availability). To ensure that the effective replication factor of the offsets topic is the configured value, the number of alive brokers has to be at least the replication factor at the time of the first request for the offsets topic. If not, either the offsets topic creation will fail or it will get a replication factor of min(alive brokers, configured replication factor).
Name:
offsets.topic.replication.factor
Default:
3
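The broker settings above could be combined in server.properties as follows. The num.replica.fetchers value is an illustrative example; the other two show the defaults described above:

```properties
# Illustrative broker settings
compression.type=producer
num.replica.fetchers=4
offsets.topic.replication.factor=3
```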
The most important consumer configuration is max.partition.fetch.bytes, the maximum amount of data the server returns per partition. The maximum memory a single fetch can use is therefore:
#partitions * max.partition.fetch.bytes
This size must be at least as large as the broker's message.max.bytes, or else it is possible for the producer to send messages larger than the consumer can fetch. If that happens, the consumer can get stuck trying to fetch a large message on a certain partition.
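The constraint above can be checked with simple arithmetic; the partition count and fetch sizes here are assumed example values:

```python
# Hypothetical example values for the consumer fetch-size constraint.
num_partitions = 50
max_partition_fetch_bytes = 1_048_576   # 1 MiB fetched per partition
message_max_bytes = 1_000_000           # broker-side max message size

# Worst-case memory one fetch response can consume in the consumer.
worst_case_fetch_bytes = num_partitions * max_partition_fetch_bytes
print(f"worst case per fetch: {worst_case_fetch_bytes / 1024**2:.0f} MiB")

# The consumer must be able to fetch the largest message the broker accepts,
# otherwise it can get stuck on a partition holding an oversized message.
assert max_partition_fetch_bytes >= message_max_bytes, \
    "max.partition.fetch.bytes must be >= message.max.bytes"
```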
num.io.threads should be greater than the number of disks dedicated for Kafka. I strongly recommend starting with the same number as disks first.
num.network.threads: adjust based on the number of producers + number of consumers + replication factor.
log.flush.interval.messages and log.flush.interval.ms.per.topic: how many messages, or how much time per topic, to accumulate on a log partition before messages are flushed to disk.
Source: http://queue.acm.org/detail.cfm?id=1563874
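As a concrete sketch, the tuning knobs above could be set in server.properties like so. The values assume a broker with 8 data disks and are illustrative only:

```properties
# Illustrative tuning for a broker with 8 data disks (example values)
num.io.threads=8
num.network.threads=6
log.flush.interval.messages=10000
```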
If enable.auto.commit is enabled, consider longer intervals for auto.commit.interval.ms (default is 500 ms).
SOURCE: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap...
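A consumer.properties sketch for the auto-commit tip; the interval shown is an illustrative example of a longer-than-default setting, not a recommendation:

```properties
# Illustrative consumer auto-commit settings (example values)
enable.auto.commit=true
auto.commit.interval.ms=5000
```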
Created on 02-27-2017 11:54 AM
Some configs you should set explicitly, such as unclean.leader.election.enable, min.insync.replicas, and num.replica.fetchers.
Created on 04-16-2018 05:23 PM
Nice Article.
I am trying to access the full list of producer configurations but am not able to; please update the article:
http://kafka.apache.org/documentation.html#producerconfigs
Created on 10-02-2018 01:14 PM
In the Kafka heap size section, 4 GB is mentioned as the ideal, but in the example it is given as 16 GB; please see below:
Try to keep the Kafka Heap size below 4GB.
Created on 05-01-2019 08:25 PM
Hi Vedant!
You state:
num.io.threads should be greater than the number of disks dedicated for Kafka. I strongly recommend to start with same number of disks first.
Is num.io.threads to be calculated as the number of disks per node allocated to Kafka, or the total number of disks for Kafka across the entire cluster?
I'm guessing disks per node dedicated for Kafka, but I wanted to confirm.
Thanks,
Jeff G.