0% found this document useful (0 votes)

11 views

Flume Developer Guide

Uploaded by

aurshiv123

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

11 views

Flume Developer Guide

Uploaded by

aurshiv123

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 14

™

Apache
Flume ™

Flume 1.9.0 Developer Guide

Introduction

Overview
Apache Flume is a distributed, reliable, and available system for efficiently collecting,
aggregating and moving large amounts of log data from many different sources to a centralized
data store.

Apache Flume is a top-level project at the Apache Software Foundation. There are currently
two release code lines available, versions 0.9.x and 1.x. This documentation applies to the 1.x
codeline. For the 0.9.x codeline, please see the Flume 0.9.x Developer Guide.

Architecture

Data flow model

An Event is a unit of data that flows through a Flume agent. The Event flows from Source to
Channel to Sink , and is represented by an implementation of the Event interface. An Event
carries a payload (byte array) that is accompanied by an optional set of headers (string
attributes). A Flume agent is a process (JVM) that hosts the components that allow Event s to
flow from an external source to a external destination.

A Sourceconsumes Event s having a specific format, and those Event s are delivered to the
Source by an external source like a web server. For example, an AvroSource can be used to
receive Avro Event s from clients or from other Flume agents in the flow. When a Source
receives an Event , it stores it into one or more Channel s. The Channel is a passive store that
holds the Event until that Event is consumed by a Sink . One type of Channel available in Flume
is the FileChannel which uses the local filesystem as its backing store. A Sink is responsible for
removing an Event from the Channel and putting it into an external repository like HDFS (in the
case of an HDFSEventSink ) or forwarding it to the Source at the next hop of the flow. The Source
and Sink within the given agent run asynchronously with the Event s staged in the Channel .

Reliability

An Event is staged in a Flume agent’s Channel . Then it’s the Sink ‘s responsibility to deliver the
Event to the next agent or terminal repository (like HDFS) in the flow. The Sink removes an
Event from the Channel only after the Event is stored into the Channel of the next agent or stored
in the terminal repository. This is how the single-hop message delivery semantics in Flume
provide end-to-end reliability of the flow. Flume uses a transactional approach to guarantee the
reliable delivery of the Event s. The Source s and Sink s encapsulate the storage/retrieval of the
Event s in a Transaction provided by the Channel . This ensures that the set of Event s are reliably
passed from point to point in the flow. In the case of a multi-hop flow, the Sink from the previous
hop and the Source of the next hop both have their Transaction s open to ensure that the Event
data is safely stored in the Channel of the next hop.

Building Flume

Getting the source

Check-out the code using Git. Click here for the git repository root.

The Flume 1.x development happens under the branch “trunk” so this command line can be
used:

git clone https://git-wip-us.apache.org/repos/asf/flume.git

Compile/test Flume

The Flume build is mavenized. You can compile Flume using the standard Maven commands:

1. Compile only: mvn clean compile

2. Compile and run unit tests: mvn clean test
3. Run individual test(s): mvn clean test -Dtest=<Test1>,<Test2>,... -DfailIfNoTests=false
4. Create tarball package: mvn clean install
5. Create tarball package (skip unit tests): mvn clean install -DskipTests

Please note that Flume builds requires that the Google Protocol Buffers compiler be in the path.
You can download and install it by following the instructions here.

Updating Protocol Buffer Version

File channel has a dependency on Protocol Buffer. When updating the version of Protocol
Buffer used by Flume, it is necessary to regenerate the data access classes using the protoc
compiler that is part of Protocol Buffer as follows.

1. Install the desired version of Protocol Buffer on your local machine

2. Update version of Protocol Buffer in pom.xml
3. Generate new Protocol Buffer data access classes in Flume: cd flume-ng-channels/flume-
file-channel; mvn -P compile-proto clean package -DskipTests
4. Add Apache license header to any of the generated files that are missing it
5. Rebuild and test Flume: cd ../..; mvn clean install

Developing custom components

Client

The client operates at the point of origin of events and delivers them to a Flume agent. Clients
typically operate in the process space of the application they are consuming data from. Flume
currently supports Avro, log4j, syslog, and Http POST (with a JSON body) as ways to transfer
data from a external source. Additionally, there’s an ExecSource that can consume the output of
a local process as input to Flume.

It’s quite possible to have a use case where these existing options are not sufficient. In this
case you can build a custom mechanism to send data to Flume. There are two ways of
achieving this. The first option is to create a custom client that communicates with one of
Flume’s existing Source s like AvroSource or SyslogTcpSource . Here the client should convert its
data into messages understood by these Flume Source s. The other option is to write a custom
Flume Source that directly talks with your existing client application using some IPC or RPC
protocol, and then converts the client data into Flume Event s to be sent downstream. Note that
all events stored within the Channel of a Flume agent must exist as Flume Event s.

Client SDK

Though Flume contains a number of built-in mechanisms (i.e. Source s) to ingest data, often one
wants the ability to communicate with Flume directly from a custom application. The Flume
Client SDK is a library that enables applications to connect to Flume and send data into
Flume’s data flow over RPC.

RPC client interface

An implementation of Flume’s RpcClient interface encapsulates the RPC mechanism supported

by Flume. The user’s application can simply call the Flume Client SDK’s append(Event) or
appendBatch(List<Event>) to send data and not worry about the underlying message exchange
details. The user can provide the required Event arg by either directly implementing the Event
interface, by using a convenience implementation such as the SimpleEvent class, or by using
EventBuilder ‘s overloaded withBody() static helper methods.

RPC clients - Avro and Thrift

As of Flume 1.4.0, Avro is the default RPC protocol. The NettyAvroRpcClient and
ThriftRpcClient implement the RpcClient interface. The client needs to create this object with
the host and port of the target Flume agent, and can then use the RpcClient to send data into
the agent. The following example shows how to use the Flume Client SDK API within a user’s
data-generating application:
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;
import java.nio.charset.Charset;

public class MyApp {

public static void main(String[] args) {
MyRpcClientFacade client = new MyRpcClientFacade();
// Initialize client with the remote Flume agent's host and port
client.init("host.example.org", 41414);

// Send 10 events to the remote Flume agent. That agent should be

// configured to listen with an AvroSource.
String sampleData = "Hello Flume!";
for (int i = 0; i < 10; i++) {
client.sendDataToFlume(sampleData);
}

client.cleanUp();
}
}

class MyRpcClientFacade {
private RpcClient client;
private String hostname;
private int port;

public void init(String hostname, int port) {

// Setup the RPC connection
this.hostname = hostname;
this.port = port;
this.client = RpcClientFactory.getDefaultInstance(hostname, port);
// Use the following method to create a thrift client (instead of the above line):
// this.client = RpcClientFactory.getThriftInstance(hostname, port);
}

public void sendDataToFlume(String data) {

// Create a Flume Event object that encapsulates the sample data
Event event = EventBuilder.withBody(data, Charset.forName("UTF-8"));

// Send the event

try {
client.append(event);
} catch (EventDeliveryException e) {
// clean up and recreate the client
client.close();
client = null;
client = RpcClientFactory.getDefaultInstance(hostname, port);
// Use the following method to create a thrift client (instead of the above line
// this.client = RpcClientFactory.getThriftInstance(hostname, port);
}
}

public void cleanUp() {

// Close the RPC connection
client.close();
}

The remote Flume agent needs to have an AvroSource (or a ThriftSource if you are using a
Thrift client) listening on some port. Below is an example Flume agent configuration that’s
waiting for a connection from MyApp:
a1.channels = c1
a1.sources = r1
a1.sinks = k1

a1.channels.c1.type = memory

a1.sources.r1.channels = c1
a1.sources.r1.type = avro
# For using a thrift source set the following instead of the above line.
# a1.source.r1.type = thrift
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414

a1.sinks.k1.channel = c1
a1.sinks.k1.type = logger

For more flexibility, the default Flume client implementations ( NettyAvroRpcClient and
ThriftRpcClient ) can be configured with these properties:

client.type = default (for avro) or thrift (for thrift)

hosts = h1 # default client accepts only 1 host

# (additional hosts will be ignored)

hosts.h1 = host1.example.org:41414 # host and port must both be specified

# (neither has a default)

batch-size = 100 # Must be >=1 (default: 100)

connect-timeout = 20000 # Must be >=1000 (default: 20000)

request-timeout = 20000 # Must be >=1000 (default: 20000)

Secure RPC client - Thrift

As of Flume 1.6.0, Thrift source and sink supports kerberos based authentication. The client
needs to use the getThriftInstance method of SecureRpcClientFactory to get hold of a
SecureThriftRpcClient . SecureThriftRpcClient extends ThriftRpcClient which implements the
RpcClient interface. The kerberos authentication module resides in flume-ng-auth module
which is required in classpath, when using the SecureRpcClientFactory . Both the client principal
and the client keytab should be passed in as parameters through the properties and they reflect
the credentials of the client to authenticate against the kerberos KDC. In addition, the server
principal of the destination Thrift source to which this client is connecting to, should also be
provided. The following example shows how to use the SecureRpcClientFactory within a user’s
data-generating application:

import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.event.EventBuilder;
import org.apache.flume.api.SecureRpcClientFactory;
import org.apache.flume.api.RpcClientConfigurationConstants;
import org.apache.flume.api.RpcClient;
import java.nio.charset.Charset;
import java.util.Properties;

public class MyApp {

public static void main(String[] args) {
MySecureRpcClientFacade client = new MySecureRpcClientFacade();
// Initialize client with the remote Flume agent's host, port
Properties props = new Properties();
props.setProperty(RpcClientConfigurationConstants.CONFIG_CLIENT_TYPE, "thrift");
props.setProperty("hosts", "h1");
props.setProperty("hosts.h1", "client.example.org"+":"+ String.valueOf(41414));

// Initialize client with the kerberos authentication related properties

props.setProperty("kerberos", "true");
props.setProperty("client-principal", "flumeclient/[email protected]"
props.setProperty("client-keytab", "/tmp/flumeclient.keytab");
props.setProperty("server-principal", "flume/[email protected]");
client.init(props);

// Send 10 events to the remote Flume agent. That agent should be

// configured to listen with an AvroSource.
String sampleData = "Hello Flume!";
for (int i = 0; i < 10; i++) {
client.sendDataToFlume(sampleData);
}

client.cleanUp();
}
}

class MySecureRpcClientFacade {
private RpcClient client;
private Properties properties;

public void init(Properties properties) {

// Setup the RPC connection
this.properties = properties;
// Create the ThriftSecureRpcClient instance by using SecureRpcClientFactory
this.client = SecureRpcClientFactory.getThriftInstance(properties);
}

public void sendDataToFlume(String data) {

// Create a Flume Event object that encapsulates the sample data
Event event = EventBuilder.withBody(data, Charset.forName("UTF-8"));

// Send the event

try {
client.append(event);
} catch (EventDeliveryException e) {
// clean up and recreate the client
client.close();
client = null;
client = SecureRpcClientFactory.getThriftInstance(properties);
}
}

public void cleanUp() {

// Close the RPC connection
client.close();
}
}

The remote ThriftSource should be started in kerberos mode. Below is an example Flume
agent configuration that’s waiting for a connection from MyApp:

a1.channels = c1
a1.sources = r1
a1.sinks = k1

a1.channels.c1.type = memory

a1.sources.r1.channels = c1
a1.sources.r1.type = thrift
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414
a1.sources.r1.kerberos = true
a1.sources.r1.agent-principal = flume/[email protected]
a1.sources.r1.agent-keytab = /tmp/flume.keytab

a1.sinks.k1.channel = c1
a1.sinks.k1.type = logger

Failover Client

This class wraps the default Avro RPC client to provide failover handling capability to clients.
This takes a whitespace-separated list of <host>:<port> representing the Flume agents that
make-up a failover group. The Failover RPC Client currently does not support thrift. If there’s a
communication error with the currently selected host (i.e. agent) agent, then the failover client
automatically fails-over to the next host in the list. For example:

// Setup properties for the failover

Properties props = new Properties();
props.put("client.type", "default_failover");

// List of hosts (space-separated list of user-chosen host aliases)

props.put("hosts", "h1 h2 h3");

// host/port pair for each host alias

String host1 = "host1.example.org:41414";
String host2 = "host2.example.org:41414";
String host3 = "host3.example.org:41414";
props.put("hosts.h1", host1);
props.put("hosts.h2", host2);
props.put("hosts.h3", host3);

// create the client with failover properties

RpcClient client = RpcClientFactory.getInstance(props);

For more flexibility, the failover Flume client implementation ( FailoverRpcClient ) can be
configured with these properties:

client.type = default_failover

hosts = h1 h2 h3 # at least one is required, but 2 or

# more makes better sense

hosts.h1 = host1.example.org:41414

hosts.h2 = host2.example.org:41414

hosts.h3 = host3.example.org:41414

max-attempts = 3 # Must be >=0 (default: number of hosts

# specified, 3 in this case). A '0'
# value doesn't make much sense because
# it will just cause an append call to
# immmediately fail. A '1' value means
# that the failover client will try only
# once to send the Event, and if it
# fails then there will be no failover
# to a second client, so this value
# causes the failover client to
# degenerate into just a default client.
# It makes sense to set this value to at
# least the number of hosts that you
# specified.

batch-size = 100 # Must be >=1 (default: 100)

connect-timeout = 20000 # Must be >=1000 (default: 20000)

request-timeout = 20000 # Must be >=1000 (default: 20000)

LoadBalancing RPC client

The Flume Client SDK also supports an RpcClient which load-balances among multiple hosts.
This type of client takes a whitespace-separated list of <host>:<port> representing the Flume
agents that make-up a load-balancing group. This client can be configured with a load
balancing strategy that either randomly selects one of the configured hosts, or selects a host in
a round-robin fashion. You can also specify your own custom class that implements the
LoadBalancingRpcClient$HostSelector interface so that a custom selection order is used. In that
case, the FQCN of the custom class needs to be specified as the value of the host-selector
property. The LoadBalancing RPC Client currently does not support thrift.

If backoff is enabled then the client will temporarily blacklist hosts that fail, causing them to be
excluded from being selected as a failover host until a given timeout. When the timeout
elapses, if the host is still unresponsive then this is considered a sequential failure, and the
timeout is increased exponentially to avoid potentially getting stuck in long waits on
unresponsive hosts.

The maximum backoff time can be configured by setting maxBackoff (in milliseconds). The
maxBackoff default is 30 seconds (specified in the OrderSelector class that’s the superclass of
both load balancing strategies). The backoff timeout will increase exponentially with each
sequential failure up to the maximum possible backoff timeout. The maximum possible backoff
is limited to 65536 seconds (about 18.2 hours). For example:

// Setup properties for the load balancing

Properties props = new Properties();
props.put("client.type", "default_loadbalance");

// List of hosts (space-separated list of user-chosen host aliases)

props.put("hosts", "h1 h2 h3");

// host/port pair for each host alias

props.put("host-selector", "random"); // For random host selection

// props.put("host-selector", "round_robin"); // For round-robin host
// // selection
props.put("backoff", "true"); // Disabled by default.

props.put("maxBackoff", "10000"); // Defaults 0, which effectively

// becomes 30000 ms

// Create the client with load balancing properties

RpcClient client = RpcClientFactory.getInstance(props);

For more flexibility, the load-balancing Flume client implementation ( LoadBalancingRpcClient )

can be configured with these properties:

client.type = default_loadbalance

hosts = h1 h2 h3 # At least 2 hosts are required

hosts.h1 = host1.example.org:41414

hosts.h2 = host2.example.org:41414

hosts.h3 = host3.example.org:41414

backoff = false # Specifies whether the client should

# back-off from (i.e. temporarily
# blacklist) a failed host
# (default: false).

maxBackoff = 0 # Max timeout in millis that a will

# remain inactive due to a previous
# failure with that host (default: 0,
# which effectively becomes 30000)

host-selector = round_robin # The host selection strategy used

# when load-balancing among hosts
# (default: round_robin).
# Other values are include "random"
# or the FQCN of a custom class
# that implements
# LoadBalancingRpcClient$HostSelector

batch-size = 100 # Must be >=1 (default: 100)

connect-timeout = 20000 # Must be >=1000 (default: 20000)

request-timeout = 20000 # Must be >=1000 (default: 20000)

Embedded agent

Flume has an embedded agent api which allows users to embed an agent in their application.
This agent is meant to be lightweight and as such not all sources, sinks, and channels are
allowed. Specifically the source used is a special embedded source and events should be send
to the source via the put, putAll methods on the EmbeddedAgent object. Only File Channel and
Memory Channel are allowed as channels while Avro Sink is the only supported sink.
Interceptors are also supported by the embedded agent.

Note: The embedded agent has a dependency on hadoop-core.jar.

Configuration of an Embedded Agent is similar to configuration of a full Agent. The following is

an exhaustive list of configration options:

Required properties are in bold.

Property Name Default Description

source.type embedded The only available source is the embedded source.
channel.type – Either memory or file which correspond to
MemoryChannel and FileChannel respectively.
channel.* – Configuration options for the channel type requested,
see MemoryChannel or FileChannel user guide for an
exhaustive list.
sinks – List of sink names
sink.type – Property name must match a name in the list of sinks.
Value must be avro
sink.* – Configuration options for the sink. See AvroSink user
guide for an exhaustive list, however note AvroSink
requires at least hostname and port.
Property Name Default Description
processor.type – Either failover or load_balance which correspond to
FailoverSinksProcessor and
LoadBalancingSinkProcessor respectively.
processor.* – Configuration options for the sink processor selected.
See FailoverSinksProcessor and
LoadBalancingSinkProcessor user guide for an
exhaustive list.
source.interceptors – Space-separated list of interceptors
source.interceptors.* – Configuration options for individual interceptors specified
in the source.interceptors property

Below is an example of how to use the agent:

Map<String, String> properties = new HashMap<String, String>();

properties.put("channel.type", "memory");
properties.put("channel.capacity", "200");
properties.put("sinks", "sink1 sink2");
properties.put("sink1.type", "avro");
properties.put("sink2.type", "avro");
properties.put("sink1.hostname", "collector1.apache.org");
properties.put("sink1.port", "5564");
properties.put("sink2.hostname", "collector2.apache.org");
properties.put("sink2.port", "5565");
properties.put("processor.type", "load_balance");
properties.put("source.interceptors", "i1");
properties.put("source.interceptors.i1.type", "static");
properties.put("source.interceptors.i1.key", "key1");
properties.put("source.interceptors.i1.value", "value1");

EmbeddedAgent agent = new EmbeddedAgent("myagent");

agent.configure(properties);
agent.start();

List<Event> events = Lists.newArrayList();

events.add(event);
events.add(event);
events.add(event);
events.add(event);

agent.putAll(events);

...

agent.stop();

Transaction interface

The Transaction interface is the basis of reliability for Flume. All the major components (ie.
Source s, Sink s and Channel s) must use a Flume Transaction .
A Transaction is implemented within a Channel implementation. Each Source and Sink that is
connected to a Channel must obtain a Transaction object. The Source s use a ChannelProcessor
to manage the Transaction s, the Sink s manage them explicitly via their configured Channel . The
operation to stage an Event (put it into a Channel ) or extract an Event (take it out of a Channel ) is
done inside an active Transaction . For example:

Channel ch = new MemoryChannel();

Transaction txn = ch.getTransaction();
txn.begin();
try {
// This try clause includes whatever Channel operations you want to do

Event eventToStage = EventBuilder.withBody("Hello Flume!",

Charset.forName("UTF-8"));
ch.put(eventToStage);
// Event takenEvent = ch.take();
// ...
txn.commit();
} catch (Throwable t) {
txn.rollback();

// Log exception, handle individual exceptions as needed

// re-throw all Errors

if (t instanceof Error) {
throw (Error)t;
}
} finally {
txn.close();
}

Here we get hold of a Transaction from a Channel . After begin() returns, the Transaction is now
active/open and the Event is then put into the Channel . If the put is successful, then the
Transaction is committed and closed.

Sink

The purpose of a Sink to extract Event s from the Channel and forward them to the next Flume
Agent in the flow or store them in an external repository. A Sink is associated with exactly one
Channel s, as configured in the Flume properties file. There’s one SinkRunner instance
associated with every configured Sink , and when the Flume framework calls
SinkRunner.start() , a new thread is created to drive the Sink (using SinkRunner.PollingRunner
as the thread’s Runnable ). This thread manages the Sink ’s lifecycle. The Sink needs to
implement the start() and stop() methods that are part of the LifecycleAware interface. The
Sink.start() method should initialize the Sink and bring it to a state where it can forward the
Event s to its next destination. The Sink.process() method should do the core processing of
extracting the Event from the Channel and forwarding it. The Sink.stop() method should do the
necessary cleanup (e.g. releasing resources). The Sink implementation also needs to
implement the Configurable interface for processing its own configuration settings. For
example:

public class MySink extends AbstractSink implements Configurable {

private String myProp;

@Override
public void configure(Context context) {
String myProp = context.getString("myProp", "defaultValue");

// Process the myProp value (e.g. validation)

// Store myProp for later retrieval by process() method

this.myProp = myProp;
}

@Override
public void start() {
// Initialize the connection to the external repository (e.g. HDFS) that
// this Sink will forward Events to ..
}

@Override
public void stop () {
// Disconnect from the external respository and do any
// additional cleanup (e.g. releasing resources or nulling-out
// field values) ..
}

@Override
public Status process() throws EventDeliveryException {
Status status = null;

// Start transaction
Channel ch = getChannel();
Transaction txn = ch.getTransaction();
txn.begin();
try {
// This try clause includes whatever Channel operations you want to do

Event event = ch.take();

// Send the Event to the external repository.

// storeSomeData(e);

txn.commit();
status = Status.READY;
} catch (Throwable t) {
txn.rollback();

// Log exception, handle individual exceptions as needed

status = Status.BACKOFF;

// re-throw all Errors

if (t instanceof Error) {
throw (Error)t;
}
}
return status;
}
}

Source

The purpose of a Source is to receive data from an external client and store it into the
configured Channel s. A Source can get an instance of its own ChannelProcessor to process an
Event , commited within a Channel local transaction, in serial. In the case of an exception,
required Channel s will propagate the exception, all Channel s will rollback their transaction, but
events processed previously on other Channel s will remain committed.

Similar to the SinkRunner.PollingRunner Runnable , there’s a PollingRunner Runnable that

executes on a thread created when the Flume framework calls PollableSourceRunner.start() .
Each configured PollableSource is associated with its own thread that runs a PollingRunner .
This thread manages the PollableSource ’s lifecycle, such as starting and stopping. A
PollableSource implementation must implement the start() and stop() methods that are
declared in the LifecycleAware interface. The runner of a PollableSource invokes that Source ‘s
process() method. The process() method should check for new data and store it into the
Channel as Flume Event s.

Note that there are actually two types of Source s. The PollableSource was already mentioned.
The other is the EventDrivenSource . The EventDrivenSource , unlike the PollableSource , must
have its own callback mechanism that captures the new data and stores it into the Channel . The
EventDrivenSource s are not each driven by their own thread like the PollableSource s are. Below
is an example of a custom PollableSource :

public class MySource extends AbstractSource implements Configurable, PollableSource {

private String myProp;

@Override
public void configure(Context context) {
String myProp = context.getString("myProp", "defaultValue");

// Process the myProp value (e.g. validation, convert to another type, ...)

// Store myProp for later retrieval by process() method

this.myProp = myProp;
}

@Override
public void start() {
// Initialize the connection to the external client
}

@Override
public void stop () {
// Disconnect from external client and do any additional cleanup
// (e.g. releasing resources or nulling-out field values) ..
}

@Override
public Status process() throws EventDeliveryException {
Status status = null;

try {
// This try clause includes whatever Channel/Event operations you want to do
// Receive new data
Event e = getSomeData();

// Store the Event into this Source's associated Channel(s)

getChannelProcessor().processEvent(e);

status = Status.READY;
} catch (Throwable t) {
// Log exception, handle individual exceptions as needed

status = Status.BACKOFF;

// re-throw all Errors

if (t instanceof Error) {
throw (Error)t;
}
} finally {
txn.close();
}
return status;
}
}

Channel

TBD

Generic - End User Computing
No ratings yet
Generic - End User Computing
139 pages
Microservices For Java Developers
No ratings yet
Microservices For Java Developers
160 pages
KIA MapCare Manual Change Log
No ratings yet
KIA MapCare Manual Change Log
40 pages
Flume User Guide
No ratings yet
Flume User Guide
48 pages
Flume User Guide
No ratings yet
Flume User Guide
32 pages
Module 10 Flume - Massive Logs Aggregation
No ratings yet
Module 10 Flume - Massive Logs Aggregation
42 pages
Expose BDD
No ratings yet
Expose BDD
16 pages
What Is Apache Flume?: Collecting, Aggregating, and Moving Large Amounts of Log Data. in
No ratings yet
What Is Apache Flume?: Collecting, Aggregating, and Moving Large Amounts of Log Data. in
8 pages
Flume
No ratings yet
Flume
15 pages
8 - Big - Data Vivek
No ratings yet
8 - Big - Data Vivek
2 pages
Module 5_Flume
No ratings yet
Module 5_Flume
23 pages
Apache Flume Tutorial - What Is - Architecture
No ratings yet
Apache Flume Tutorial - What Is - Architecture
8 pages
Streaming Data Via Flume
No ratings yet
Streaming Data Via Flume
13 pages
Chapter 8 Flume - Massive Log Aggregation
No ratings yet
Chapter 8 Flume - Massive Log Aggregation
35 pages
Apache Flume Tutorial PDF
No ratings yet
Apache Flume Tutorial PDF
43 pages
Notes Bug Data and of Apache
No ratings yet
Notes Bug Data and of Apache
11 pages
06 - Acquire Data Using CLI and Flume
No ratings yet
06 - Acquire Data Using CLI and Flume
13 pages
Flume Case Study
No ratings yet
Flume Case Study
2 pages
Flume PDF
No ratings yet
Flume PDF
7 pages
Arinto Murdopo Josep Subirats Group 4 EEDC 2012
No ratings yet
Arinto Murdopo Josep Subirats Group 4 EEDC 2012
19 pages
A728542518 - 16469 - 30 - 2019 - Flume Complete
No ratings yet
A728542518 - 16469 - 30 - 2019 - Flume Complete
13 pages
Assignment
No ratings yet
Assignment
37 pages
Unit-3 (HDFS-II)
No ratings yet
Unit-3 (HDFS-II)
28 pages
Apache Flume
No ratings yet
Apache Flume
8 pages
Apache Flume
No ratings yet
Apache Flume
21 pages
6 Flume - Student - Datadotz
No ratings yet
6 Flume - Student - Datadotz
29 pages
Apache Flume: Distributed Log Collection For Hadoop - Second Edition - Sample Chapter
No ratings yet
Apache Flume: Distributed Log Collection For Hadoop - Second Edition - Sample Chapter
13 pages
2020300053_BDA_EXP7_CHINMAY
No ratings yet
2020300053_BDA_EXP7_CHINMAY
5 pages
Questions For CCA175
50% (2)
Questions For CCA175
33 pages
Sqoop & Flume: Issues With Data Load Into Hadoop
No ratings yet
Sqoop & Flume: Issues With Data Load Into Hadoop
6 pages
Big Data Ca
No ratings yet
Big Data Ca
14 pages
Understanding Software Engineering Vol 3: Programming Basic Software Functionalities.
From Everand
Understanding Software Engineering Vol 3: Programming Basic Software Functionalities.
Gabriel Clemente
No ratings yet
OGG Flume Integration
No ratings yet
OGG Flume Integration
12 pages
Presentation of Big Data
No ratings yet
Presentation of Big Data
4 pages
Illumio Core FlowLink Configuration and Usage Guide 1.1.2
No ratings yet
Illumio Core FlowLink Configuration and Usage Guide 1.1.2
36 pages
Aprende programación python aplicaciones web: python, #2
From Everand
Aprende programación python aplicaciones web: python, #2
Jesus Jonathan cuevas orozco
No ratings yet
Mule 2
No ratings yet
Mule 2
15 pages
Flume Set Up On CLoudera CDH3
No ratings yet
Flume Set Up On CLoudera CDH3
17 pages
Flume Installation Cdh3
No ratings yet
Flume Installation Cdh3
17 pages
FreeSWITCH 1.0.6
From Everand
FreeSWITCH 1.0.6
Anthony Minessale
No ratings yet
CompTIA Security+: Network Attacks
From Everand
CompTIA Security+: Network Attacks
AS Snipes
5/5 (1)
2018 06 Bryzek Qconny Compressed
No ratings yet
2018 06 Bryzek Qconny Compressed
59 pages
Foundations For High Scalability in Mule 4 PDF
No ratings yet
Foundations For High Scalability in Mule 4 PDF
33 pages
Random (1)
No ratings yet
Random (1)
3 pages
Akk A Stream and HTTP Java
No ratings yet
Akk A Stream and HTTP Java
138 pages
Interview 4
No ratings yet
Interview 4
25 pages
Hadoop 3
No ratings yet
Hadoop 3
52 pages
Ccreate Your First Flume Program HTML
No ratings yet
Ccreate Your First Flume Program HTML
17 pages
Evaluation of Some Cloud Based Virtual Private Server (VPS) Providers
From Everand
Evaluation of Some Cloud Based Virtual Private Server (VPS) Providers
Dr. Hidaia Mamood Alassouli
No ratings yet
ETHICAL HACKING GUIDE-Part 3: Comprehensive Guide to Ethical Hacking world
From Everand
ETHICAL HACKING GUIDE-Part 3: Comprehensive Guide to Ethical Hacking world
POONAM DEVI
No ratings yet
INDJCSE24-15-04-020
No ratings yet
INDJCSE24-15-04-020
13 pages
PHP & MySQL Practice It Learn It
From Everand
PHP & MySQL Practice It Learn It
Jitendra Patel
3/5 (2)
Akka Scala
No ratings yet
Akka Scala
771 pages
Big Data-2 Sourcing Data
No ratings yet
Big Data-2 Sourcing Data
38 pages
Api Microservice PDF
No ratings yet
Api Microservice PDF
3 pages
AkkaJava PDF
No ratings yet
AkkaJava PDF
761 pages
Primer
No ratings yet
Primer
2 pages
Project 1 - State-Wise Development Analysis in India
No ratings yet
Project 1 - State-Wise Development Analysis in India
8 pages
BDA Mid-2 Important Questions
No ratings yet
BDA Mid-2 Important Questions
19 pages
Footprinting, Reconnaissance, Scanning and Enumeration Techniques of Computer Networks
From Everand
Footprinting, Reconnaissance, Scanning and Enumeration Techniques of Computer Networks
Dr. Hidaia Mahmood Alassouli
No ratings yet
AkkaStreamAndHTTPScala PDF
No ratings yet
AkkaStreamAndHTTPScala PDF
100 pages
04 Building RESTful Services
No ratings yet
04 Building RESTful Services
78 pages
2: Installation & Configuration of NS2 and Introduction To TCL Hello Programming
No ratings yet
2: Installation & Configuration of NS2 and Introduction To TCL Hello Programming
4 pages
eBOX 3300 Manual
No ratings yet
eBOX 3300 Manual
28 pages
Hadoop Course Content PDF
No ratings yet
Hadoop Course Content PDF
9 pages
Assistant in Python
100% (1)
Assistant in Python
16 pages
Module 1 - Get Started With Web Programming
No ratings yet
Module 1 - Get Started With Web Programming
6 pages
Siemnes 7SJ85
No ratings yet
Siemnes 7SJ85
2 pages
Uci224f 311 TD en Rev A
No ratings yet
Uci224f 311 TD en Rev A
9 pages
Pleiger Process Control (PPC)
100% (1)
Pleiger Process Control (PPC)
8 pages
Hopewind - hopePowerBox-G03
No ratings yet
Hopewind - hopePowerBox-G03
1 page
IoT Based Biometric Voting System
No ratings yet
IoT Based Biometric Voting System
12 pages
Artificial Intelligence Overview
No ratings yet
Artificial Intelligence Overview
27 pages
FICT Class Code Degree 2021-02foundation 2021-02diploma 2021-02
No ratings yet
FICT Class Code Degree 2021-02foundation 2021-02diploma 2021-02
4 pages
Phison eMMC 153 Ball PSLC SPEC - v1.4 PDF
No ratings yet
Phison eMMC 153 Ball PSLC SPEC - v1.4 PDF
43 pages
Password Recovery Cisco ME3400
No ratings yet
Password Recovery Cisco ME3400
2 pages
Python For Accounting A Modern Guide Python Programming in Accounting 9789730338928 Compress
100% (2)
Python For Accounting A Modern Guide Python Programming in Accounting 9789730338928 Compress
395 pages
RTC
No ratings yet
RTC
10 pages
IP Version 4 Addressing
No ratings yet
IP Version 4 Addressing
53 pages
C Interview Questions: Click Here
No ratings yet
C Interview Questions: Click Here
31 pages
ICDL Computer Essentials Sample Part-Test
100% (1)
ICDL Computer Essentials Sample Part-Test
2 pages
Sti 22 NM 60 N
No ratings yet
Sti 22 NM 60 N
13 pages
Quiz 2 CIS
No ratings yet
Quiz 2 CIS
4 pages
Error Code20
No ratings yet
Error Code20
5 pages
Regent University College of Sci & Tech Mr. David Fiase Department of Engineering
No ratings yet
Regent University College of Sci & Tech Mr. David Fiase Department of Engineering
35 pages
Assignment 4 SE
No ratings yet
Assignment 4 SE
12 pages
Digital Logic Circuit Design
No ratings yet
Digital Logic Circuit Design
8 pages
M02 - Configuring and Administering Server
No ratings yet
M02 - Configuring and Administering Server
77 pages
8 8086 Interrupts
No ratings yet
8 8086 Interrupts
26 pages
IT WorkShop Lab Manual
No ratings yet
IT WorkShop Lab Manual
111 pages