Apache Flume Vs Apache Kafka

Kafka | Flume
A publish-subscribe messaging system. | A service for collecting, aggregating, and moving large amounts of data into Hadoop, or for processing the data and persisting it into a relational database system.
Messages are replicated across multiple broker nodes, so if a broker fails, the messages can still be retrieved from a replica. | Events are not replicated, so if an agent node fails, the events held on that node can be lost (a durable file channel keeps them on local disk, but they are unavailable until the node recovers).
A pull-based messaging system: messages remain available for a configurable retention period (typically some number of days), so clients in different consumer groups can each pull the same messages. | Data is pushed to the destination, which could be a logger, Hadoop, or a custom sink, so events are not retained after delivery the way they are in Kafka.
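
The pull-versus-push difference in the last row can be illustrated with a toy model. This is only a sketch in plain Python, not the Kafka or Flume API: the class names `RetainedLog` and `PushPipeline` are hypothetical, and time-based retention is left out so that the example stays short.

```python
from collections import defaultdict


class RetainedLog:
    """Toy stand-in for a pull-based, retained log (Kafka-like). Hypothetical class."""

    def __init__(self):
        self._messages = []               # messages stay here after being read
        self._offsets = defaultdict(int)  # next position to read, per consumer group

    def publish(self, message):
        self._messages.append(message)

    def poll(self, group, max_messages=10):
        # Each consumer group tracks its own position, so groups read independently.
        start = self._offsets[group]
        batch = self._messages[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch


class PushPipeline:
    """Toy stand-in for push delivery (Flume-like). Hypothetical class."""

    def __init__(self, sink):
        self._sink = sink

    def send(self, event):
        # The event is handed straight to the sink; the pipeline keeps no copy.
        self._sink(event)


log = RetainedLog()
for message in ["a", "b", "c"]:
    log.publish(message)

analytics = log.poll("analytics")  # first consumer group reads everything
audit = log.poll("audit")          # second group still sees the same messages

delivered = []
pipeline = PushPipeline(delivered.append)
for event in ["a", "b", "c"]:
    pipeline.send(event)
```

Here both consumer groups receive all three messages because the log keeps them and only the per-group position advances, whereas the push pipeline delivers each event once to its single destination.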

Both systems can be used together: messages published to Kafka can be consumed by a Flume agent through its Kafka source, and a Flume agent can also publish its events to Kafka through its Kafka sink.
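
As a rough sketch of that combination, the Flume configuration below defines two agents. It assumes Flume 1.7 or later (where the Kafka source and sink take `kafka.bootstrap.servers`) and a broker at `localhost:9092`. The agent names, component names, topic names, group id, and port are all made up for illustration.

```properties
# Agent "consumer": reads from Kafka through the Kafka source and logs each event.
consumer.sources = r1
consumer.channels = c1
consumer.sinks = k1

consumer.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
consumer.sources.r1.kafka.bootstrap.servers = localhost:9092
# Hypothetical topic and consumer group
consumer.sources.r1.kafka.topics = input-topic
consumer.sources.r1.kafka.consumer.group.id = flume-demo
consumer.sources.r1.channels = c1

consumer.channels.c1.type = memory

consumer.sinks.k1.type = logger
consumer.sinks.k1.channel = c1

# Agent "producer": accepts lines on a TCP port and publishes them to Kafka.
producer.sources = r1
producer.channels = c1
producer.sinks = k1

producer.sources.r1.type = netcat
producer.sources.r1.bind = localhost
producer.sources.r1.port = 44444
producer.sources.r1.channels = c1

producer.channels.c1.type = memory

producer.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
producer.sinks.k1.kafka.bootstrap.servers = localhost:9092
# Hypothetical topic
producer.sinks.k1.kafka.topic = output-topic
producer.sinks.k1.channel = c1
```

The two directions are kept in separate agents on purpose. The Kafka source typically stamps each event with a topic header, and the Kafka sink may honor that header, so chaining a Kafka source directly into a Kafka sink can send events back to the topic they came from.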
