Flume PDF

Flume is a distributed system for collecting, aggregating, and moving large amounts of log data from different sources to a centralized data store. It uses agents, each of which contains sources to collect data, channels to buffer it, and sinks to export it to destinations such as HDFS. Common sources include the Avro, exec, and spooling directory sources; common channels include the memory and file channels; and common sinks include the HDFS, logger, and file roll sinks. Flume is configured with text configuration files that define the agents and their components.

Uploaded by

chandra reddy

Flume

What is Flume?

• Is a distributed service for collecting, aggregating, and moving large amounts of log data to a centralized data store
• Is an Apache Software Foundation project
• Has the following features:
– Simple
– Reliable
– Fault tolerant
– Used for online analytic applications

Flume: Architecture

[Figure: Web Server -> Agent (Source -> Channel -> Sink) -> HDFS]
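
The flow in the figure can be sketched as a minimal agent definition. This is only an illustration: the agent name a1, the component names r1, c1, and k1, the log path, and the NameNode address are all assumptions, not values from the slides.

```properties
# Hypothetical agent "a1"; all names and paths below are illustrative.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: tail the web server's access log (path is an assumption)
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/httpd/access_log

# Channel: buffer events in memory between the source and the sink
a1.channels.c1.type = memory

# Sink: write events to HDFS (NameNode address is an assumption)
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/weblogs

# Wiring: a source may feed several channels ("channels"),
# while a sink drains exactly one ("channel")
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Note the plural `channels` on the source and the singular `channel` on the sink; mixing them up is a common reason an agent fails to start.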

Flume Sources (Consume Events)

• Avro source
• Exec source
• Spooling Directory source
• Sequence Generator source
• Syslog source
• HTTP source
• Custom source
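
As a rough sketch, two of the sources listed above might be declared as follows. The agent name a1, the source names, the directory, and the port are assumptions chosen for illustration, and the channel c1 is assumed to be defined elsewhere in the same file.

```properties
# Hypothetical agent "a1" with two sources feeding one channel
a1.sources = r1 r2

# Spooling Directory source: ingests files dropped into a directory
# (directory path is an assumption)
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /var/log/flume-spool
a1.sources.r1.channels = c1

# Avro source: listens for Avro events, for example from the
# Avro sink of another agent (port is an assumption)
a1.sources.r2.type = avro
a1.sources.r2.bind = 0.0.0.0
a1.sources.r2.port = 4141
a1.sources.r2.channels = c1
```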

Flume Channels (Hold Events)

• Memory channel
• JDBC channel
• File channel
• Custom channel
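
The main trade-off between the two most common channels is speed versus durability. The fragment below is a sketch of both; the agent name, channel names, capacities, and directories are assumptions.

```properties
# Hypothetical agent "a1" with one channel of each kind
a1.channels = c1 c2

# Memory channel: fast, but buffered events are lost if the agent
# process stops (capacities are illustrative)
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# File channel: slower, but events are persisted to local disk and
# survive an agent restart (directories are assumptions)
a1.channels.c2.type = file
a1.channels.c2.checkpointDir = /var/flume/checkpoint
a1.channels.c2.dataDirs = /var/flume/data
```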

Flume Sinks (Deliver Events)

• HDFS sink
• Logger sink
• Avro sink
• IRC sink
• File Roll sink
• Null sink
• HBase sink
• AsyncHBaseSink
• ElasticSearchSink
• Custom sink
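
A sketch of how two of these sinks might be configured is shown below. The agent name, sink names, HDFS path, and roll interval are assumptions, and the channels c1 and c2 are assumed to be defined elsewhere in the same file.

```properties
# Hypothetical agent "a1" with two sinks, each draining its own channel
a1.sinks = k1 k2

# HDFS sink: writes events as plain text into one directory per day
# (NameNode address and path are assumptions)
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
# Close the current file and start a new one every 300 seconds
a1.sinks.k1.hdfs.rollInterval = 300
# Use the agent's clock to resolve %Y-%m-%d when events carry no
# timestamp header
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1

# Logger sink: logs each event, which is useful for debugging
a1.sinks.k2.type = logger
a1.sinks.k2.channel = c2
```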

Configuring Flume

1. Create a configuration file (flume.conf).
2. Store the file in the flume-ng/conf directory.
3. Configure the individual components (sources, channels, and sinks) in the file.
4. (Optional) Edit flume-env.sh.
5. Verify the installation by running the following command:
$ flume-ng help
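
As an illustration of steps 1 to 3, a small flume.conf for a smoke test could look like the sketch below. It uses only component types named on the earlier slides; the agent name a1 and the component names are assumptions, and the start command in the comments assumes it is run from the flume-ng directory.

```properties
# flume.conf: a hypothetical single-agent smoke test.
# One way to start it (agent name must match the prefix used below):
#   flume-ng agent --conf conf --conf-file conf/flume.conf --name a1 -Dflume.root.logger=INFO,console

a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Sequence Generator source: emits a continuous stream of counter
# events, which is handy for testing
a1.sources.r1.type = seq
a1.sources.r1.channels = c1

# Memory channel with illustrative capacities
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Logger sink: prints each event to the agent's log
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
```

Because the Sequence Generator source keeps producing events, the agent will log continuously until it is stopped.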
