
Kafka


Kafka Architecture

Kafka’s architecture is easy to understand when broken down into simple building blocks:

  • Producer: Sends messages (data) into Kafka.
  • Topic: A logical stream into which events are categorized.
  • Partition: Splits a topic into parallel logs for scalability.
  • Broker: A Kafka server that stores records/messages.
  • Consumer: Reads records from topics.
  • ZooKeeper: Coordinates Kafka brokers (optional in newer versions, which can run in KRaft mode without it).
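
The building blocks above can be sketched as a toy in-memory model. This is not how Kafka is implemented; it only illustrates how a keyed record lands in one partition and how a consumer tracks an offset per partition. The class names `Topic` and `Consumer` and the CRC32-based partition choice are assumptions made for this sketch (Kafka's Java client uses a murmur2-based hash for keyed records).

```python
import zlib


class Topic:
    """Toy stand-in for a topic: a fixed number of append-only partition logs."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption for this sketch: CRC32 of the key modulo the partition count.
        # The same key always maps to the same partition, which preserves per-key order.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        """Producer side: append a record and return (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1


class Consumer:
    """Toy consumer that remembers the next offset to read in each partition."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = [0] * len(topic.partitions)

    def poll(self):
        """Return every record appended since the previous poll."""
        records = []
        for partition, log in enumerate(self.topic.partitions):
            records.extend(log[self.offsets[partition]:])
            self.offsets[partition] = len(log)
        return records


if __name__ == "__main__":
    topic = Topic("order-events", num_partitions=3)
    consumer = Consumer(topic)
    topic.append("customer-1", "order created")
    topic.append("customer-1", "order paid")
    print(consumer.poll())
```

In this model a broker would simply be the process holding the partition lists; replication and consumer groups are left out to keep the sketch short.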


Usage

Create a new topic

kafka-topics --create --topic order-events --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1

List topics:

kafka-topics --list --bootstrap-server localhost:9092

Produce messages (simulate orders):

kafka-console-producer --topic order-events --bootstrap-server localhost:9092

Consume messages (read the topic from the beginning):

kafka-console-consumer --topic order-events --bootstrap-server localhost:9092 --from-beginning
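
Typing orders by hand into the console producer gets tedious. Since the console producer treats each line on standard input as one message, a small script can generate the simulated orders instead. The sketch below is one possible way to do that; the function name `make_orders` and the fields `order_id`, `customer_id`, and `amount` are hypothetical and not part of any Kafka schema.

```python
import json
import random


def make_orders(count, seed=0):
    """Return `count` simulated orders, each as a single-line JSON string."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same orders
    lines = []
    for order_id in range(1, count + 1):
        order = {
            "order_id": order_id,                      # hypothetical field
            "customer_id": rng.randint(1, 5),          # hypothetical field
            "amount": round(rng.uniform(5, 200), 2),   # hypothetical field
        }
        lines.append(json.dumps(order))
    return lines


if __name__ == "__main__":
    # One JSON document per line, ready to be piped into the console producer.
    for line in make_orders(10):
        print(line)
```

Assuming the script is saved as make_orders.py (a hypothetical filename), it can be piped into the producer like this:

python make_orders.py | kafka-console-producer --topic order-events --bootstrap-server localhost:9092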


Cost-efficient Kafka

  1. Continuously optimize: Start by eliminating inactive resources such as unused topics, idle consumer groups, and idle connections. They still consume CPU, memory, and storage on the cluster and add to the number of rebalances. If they’re not needed, remove them.
  2. Shrink your payload: Enable client-level compression and use compact data formats such as Avro or Protobuf. Protobuf has a steeper learning curve, but once implemented, the smaller payloads cut network and storage costs, which your CFO will appreciate.
  3. Avoid the default: Fine-tune your brokers to match your current workload by adjusting num.network.threads and num.io.threads (3 and 8 by default). There’s no one-size-fits-all configuration, so iterate and experiment. Finding the sweet spot for both increases your cluster’s throughput and responsiveness without adding more hardware, which would increase your spending.
  4. Adopt dynamic sizing: Shift from static to dynamic resource allocation, so that your Kafka clusters use only the hardware and resources they need at any given moment.
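
To get a feel for the "shrink your payload" tip, the following sketch compares the size of a batch of JSON records before and after gzip compression. It runs without a Kafka cluster and only approximates what a producer does when compression is enabled (in a real client this is controlled by the producer setting compression.type). The helper name `batch_sizes` and the sample order fields are assumptions for illustration, and actual savings depend on your data.

```python
import gzip
import json


def batch_sizes(records):
    """Return (raw_bytes, gzip_bytes) for a batch of JSON-serializable records."""
    raw = "\n".join(json.dumps(record) for record in records).encode("utf-8")
    compressed = gzip.compress(raw)
    return len(raw), len(compressed)


if __name__ == "__main__":
    # Hypothetical order events; repeated field names compress well in a batch.
    orders = [
        {"order_id": i, "status": "CREATED", "currency": "USD"}
        for i in range(1000)
    ]
    raw_size, gzip_size = batch_sizes(orders)
    print(f"raw: {raw_size} bytes, gzip: {gzip_size} bytes")
```

Compressing a whole batch tends to help more than compressing single messages, because repeated field names and values are shared across records.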

Reference

  • https://kafka.apache.org/
  • https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/kafka.html
  • https://www.datadoghq.com/knowledge-center/apache-kafka/
  • https://www.linkedin.com/posts/stanislavkozlovski_kafka-apachekafka-dataengineering-activity-7227972197183598592-JZBc/
  • https://stackoverflow.blog/2024/09/04/best-practices-for-cost-efficient-kafka-clusters/
  • https://medium.com/@akmuthumala/introduction-to-apache-kafka-a-hands-on-guide-with-docker-bc65ae1009e5

