NiFi Pyro

NiFi is an open source data flow platform developed by Apache that allows users to automate the movement of data between disparate systems in a configurable way. It was originally developed by the NSA and donated to Apache in 2014. NiFi provides real-time data processing, transformation and routing. It includes over 165 processors to interact with data, track data provenance, and prioritize data flows. NiFi runs on any device that supports Java and allows processing data from many sources in a flexible, extensible manner.

PRESENTED BY: Pradeep Devarasetty

Md Abdul Mujaheed
INTRODUCING APACHE NiFi
History

● NiFi (previously Niagara Files) was developed and used within the
National Security Agency (NSA), USA, for eight years before it was
open-sourced.
● It was donated to the Apache Software Foundation (ASF) in November
2014 through the NSA's Technology Transfer Program.
What is NiFi?

● NiFi stands for Niagara Files
● Apache NiFi is an easy-to-use, powerful and reliable
framework to process and distribute data
● It is a platform for automating the movement of data
between disparate systems
● It has a component-based extension model
● In simpler terms, NiFi is a system for moving, filtering, and
enhancing data
● We can trace data in NiFi, just as we track a delivery
package from Flipkart, FedEx, etc.
What is NiFi? Contd...

● NiFi allows users to send, receive, route,
transform, and sort data, as needed, in an
automated and configurable way
Why NiFi?

● NiFi was designed from the beginning to be field-ready:
flexible, extensible and suitable for a wide range of devices.
● It allows us to interact with the dataflow directly in the
browser.
● It provides real-time control, which makes it easy to
manage the movement of data from any data source
to any destination.
● It features fine-grained data provenance tools.
● NiFi has several extensions for dealing with file-based
dataflows such as FTP, SFTP, HTTP, etc.
What is Apache NiFi used for?

● Reliable and secure transfer of data between
systems
● Delivery of data from sources to analytic
platforms
● Enrichment and preparation of data:
-Conversion between formats
-Extraction/Parsing
-Routing decisions
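
The three preparation steps above (format conversion, extraction, routing decision) can be sketched in plain Python. This is only an illustration of the idea, not NiFi code: the function names, the sample CSV, and the "alerts"/"archive" destinations are all assumptions made for the example.

```python
import csv
import io
import json


def csv_to_json_records(csv_text):
    """Conversion between formats: CSV text with a header row -> JSON strings."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row, sort_keys=True) for row in reader]


def extract_attributes(json_text, fields):
    """Extraction/parsing: copy selected fields into a flat attribute map."""
    record = json.loads(json_text)
    return {name: str(record[name]) for name in fields if name in record}


def route(attributes):
    """Routing decision: choose a (hypothetical) destination from attributes."""
    return "alerts" if attributes.get("level") == "ERROR" else "archive"


raw = "level,message\nINFO,started\nERROR,disk full\n"
routed = [
    (route(extract_attributes(record, ["level"])), record)
    for record in csv_to_json_records(raw)
]
```

In NiFi itself these steps would be handled by configured processors rather than hand-written functions; the sketch only shows how the three stages chain together.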
Advantages

● Data source and destination-agnostic
● Provides connection processors for many data
sources
● Runs on any device that runs Java
● Build in one place, copy to anywhere else
● Apache NiFi is ideal for data sources sitting out
on the edge or sources with poor connectivity
and priority data
Terminology
● FlowFile
-Unit of data moving through the system.
-Content + Attributes (key/value pairs).

● Processor
-Performs the work; can access FlowFiles.

● Connection
-Links between processors.
-Queues that can be dynamically prioritized.

● Process Group
-Set of processors and their connections.
-Receives data via input ports, sends data via output ports.
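
To make the terms above concrete, here is a minimal Python model of a FlowFile, a Connection, and a Processor. It is a toy sketch under stated assumptions, not NiFi's actual classes: `FlowFile`, `Connection`, `uppercase_processor`, and `run_once` are names invented for this example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Unit of data: content bytes plus key/value attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


class Connection:
    """A queue linking one processor's output to the next one's input."""

    def __init__(self):
        self.queue = deque()

    def put(self, flowfile):
        self.queue.append(flowfile)

    def take(self):
        return self.queue.popleft() if self.queue else None


def uppercase_processor(flowfile):
    """A processor does the work: transform content, add an attribute."""
    attributes = {**flowfile.attributes, "uppercased": "true"}
    return FlowFile(flowfile.content.upper(), attributes)


def run_once(processor, inbound, outbound):
    """Move one FlowFile through a processor; False if nothing was queued."""
    flowfile = inbound.take()
    if flowfile is None:
        return False
    outbound.put(processor(flowfile))
    return True


source, sink = Connection(), Connection()
source.put(FlowFile(b"hello", {"filename": "greeting.txt"}))
run_once(uppercase_processor, source, sink)
result = sink.take()
```

A process group would then correspond to a set of such processors and connections wired together, with its own inbound and outbound connections acting as ports.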
NiFi - Provenance

● Tracks data at each point as
it flows through the system
● Records, indexes, and
makes events available for
display
● Handles merging and
splitting of data
● View attributes and content
at a given point in time
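
The provenance idea can be illustrated with a small event log that snapshots attributes and content at each step and links split children back to their parent. This is a simplified sketch, not NiFi's provenance repository: `TrackedFile`, `ProvenanceLog`, the event labels, and the record layout are assumptions made for the example.

```python
import itertools
import time
from dataclasses import dataclass, field

_ids = itertools.count(1)


@dataclass
class TrackedFile:
    content: bytes
    attributes: dict = field(default_factory=dict)
    file_id: int = field(default_factory=lambda: next(_ids))


class ProvenanceLog:
    """Records one event per step, with a snapshot of the file at that point."""

    def __init__(self):
        self.events = []

    def record(self, event_type, flowfile, parents=()):
        self.events.append({
            "type": event_type,
            "file_id": flowfile.file_id,
            "parent_ids": [p.file_id for p in parents],
            "attributes": dict(flowfile.attributes),  # snapshot, not a reference
            "content": flowfile.content,
            "timestamp": time.time(),
        })

    def history(self, file_id):
        """All events recorded for one file, in order."""
        return [e for e in self.events if e["file_id"] == file_id]

    def ancestors(self, file_id):
        """Follow parent links back through splits and merges."""
        found, pending = [], [file_id]
        while pending:
            current = pending.pop()
            for event in self.events:
                if event["file_id"] != current:
                    continue
                for parent in event["parent_ids"]:
                    if parent not in found:
                        found.append(parent)
                        pending.append(parent)
        return found


log = ProvenanceLog()
original = TrackedFile(b"a,b", {"filename": "pair.csv"})
log.record("RECEIVE", original)

# Splitting: each child records the file it came from.
children = [TrackedFile(part, dict(original.attributes))
            for part in original.content.split(b",")]
for child in children:
    log.record("FORK", child, parents=[original])
```

Because each event stores a copy of the attributes and content, the state of a file at any recorded point can be inspected later, which is the "view at a given point in time" behaviour described above.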
NiFi – Queue Prioritization
● Configure a prioritizer
per connection
● Determine what is
important for your data
– time based, arrival
order, importance of a
data set
● Funnel many
connections down to a
single connection to
prioritize across data
sets
● Develop your own
prioritizer if needed
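
A per-connection prioritizer can be modelled as a function that scores each queued item, with a heap returning the lowest score first. The sketch below is an assumption-laden illustration, not NiFi's prioritizer interface: `PrioritizedConnection`, `by_priority_attribute`, the `priority` attribute name, and the default of 100 are all made up for the example.

```python
import heapq
import itertools


class PrioritizedConnection:
    """A queue whose ordering is decided by a pluggable prioritizer function."""

    def __init__(self, prioritizer):
        self._prioritizer = prioritizer
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: arrival order

    def put(self, item):
        entry = (self._prioritizer(item), next(self._arrival), item)
        heapq.heappush(self._heap, entry)

    def take(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


def by_priority_attribute(item):
    """Lower number = more important; items without the attribute go last."""
    return int(item["attributes"].get("priority", 100))


# Funnel two feeds into one connection so priority applies across both.
funnel = PrioritizedConnection(by_priority_attribute)
sensor_feed = [{"attributes": {"priority": "5"}, "content": b"routine reading"}]
alarm_feed = [{"attributes": {"priority": "1"}, "content": b"alarm"}]
for item in sensor_feed + alarm_feed:
    funnel.put(item)

first = funnel.take()
```

Swapping in a different function (for example one that returns a constant, giving pure arrival order) corresponds to configuring or developing a different prioritizer.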
NiFi – Back Pressure
● Configure back-pressure per
connection
● Based on number of FlowFiles
or total size of FlowFiles
● Upstream processor is no longer
scheduled to run until the queue drops
below the threshold
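
Back pressure as described above can be sketched as a connection with two thresholds and a scheduler that skips the upstream producer while either threshold is reached. This is a toy model; `BoundedConnection`, `schedule`, and the threshold values are assumptions chosen for the example.

```python
from collections import deque


class BoundedConnection:
    """A queue with back-pressure thresholds on item count and total bytes."""

    def __init__(self, max_count, max_bytes):
        self.max_count = max_count
        self.max_bytes = max_bytes
        self.queue = deque()
        self.queued_bytes = 0

    def is_full(self):
        return (len(self.queue) >= self.max_count
                or self.queued_bytes >= self.max_bytes)

    def put(self, content):
        self.queue.append(content)
        self.queued_bytes += len(content)

    def take(self):
        content = self.queue.popleft()
        self.queued_bytes -= len(content)
        return content


def schedule(producer, connection):
    """Run the upstream producer only while the connection is below threshold."""
    if connection.is_full():
        return False  # back pressure: upstream is not scheduled this round
    connection.put(producer())
    return True


connection = BoundedConnection(max_count=2, max_bytes=1024)
attempts = [schedule(lambda: b"x" * 10, connection) for _ in range(3)]

connection.take()  # downstream consumes one item, relieving the pressure
resumed = schedule(lambda: b"x" * 10, connection)
```

The third attempt is skipped because the count threshold has been reached; once the downstream side drains an item, the producer is scheduled again.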
NiFi - Architecture
NiFi - Explaining Architecture
● NiFi executes within a JVM on a host operating system.
- The Primary Components are
● Web Server
-The purpose of the web server is to host NiFi’s HTTP-based command and
control API.
● Flow Controller
-The flow controller is the brains of the operation. It provides threads for
extensions to run on, and manages the schedule of when extensions receive
resources to execute.
● Extensions
-NiFi has several extensions for dealing with file-based dataflows such as FTP,
SFTP, HTTP, etc
NiFi – Architecture contd..

● FlowFile Repository
-The FlowFile Repository is where NiFi keeps track of
the state of what it knows about a given FlowFile
● Content Repository
-The Content Repository is where the actual content
bytes of a given FlowFile live.
● Provenance Repository
-The Provenance Repository is where all provenance
event data is stored.
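
The division of labour between the three repositories can be pictured with in-memory stand-ins: one store for content bytes, one for per-FlowFile state that points at the content, and one for events. This is a conceptual sketch only; the class names, the claim-id scheme, and the event labels are assumptions, and the real repositories persist to disk.

```python
import uuid


class ContentRepository:
    """Holds the raw content bytes, addressed by a claim id."""

    def __init__(self):
        self._blobs = {}

    def write(self, data):
        claim = str(uuid.uuid4())
        self._blobs[claim] = data
        return claim

    def read(self, claim):
        return self._blobs[claim]


class FlowFileRepository:
    """Holds per-FlowFile state: attributes plus a pointer to the content."""

    def __init__(self):
        self._records = {}

    def save(self, flowfile_id, attributes, claim):
        self._records[flowfile_id] = {
            "attributes": dict(attributes),
            "claim": claim,
        }

    def load(self, flowfile_id):
        return self._records[flowfile_id]


class ProvenanceRepository:
    """Append-only list of (event type, FlowFile id) pairs."""

    def __init__(self):
        self.events = []

    def record(self, event_type, flowfile_id):
        self.events.append((event_type, flowfile_id))


content_repo = ContentRepository()
flowfile_repo = FlowFileRepository()
provenance_repo = ProvenanceRepository()

claim = content_repo.write(b"payload")
flowfile_repo.save("ff-1", {"filename": "a.txt"}, claim)
provenance_repo.record("CREATE", "ff-1")

# An attribute-only change touches FlowFile state, not the content bytes.
flowfile_repo.save("ff-1", {"filename": "b.txt"}, claim)
provenance_repo.record("ATTRIBUTES_MODIFIED", "ff-1")

state = flowfile_repo.load("ff-1")
```

Keeping content separate from state means an attribute update does not require copying the payload, which is one plausible reason for the split shown on these slides.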
GETTING STARTED WITH
APACHE NiFi
Downloading

NiFi can be downloaded from the official Apache
website: https://nifi.apache.org/download.html
NiFi - Release
NiFi - Installing

● To run NiFi, the system needs JDK 1.8 or later installed
● Extract the NiFi tar file
● Open the terminal and navigate to the directory where NiFi is
installed
● To run NiFi in the foreground, run bin/nifi.sh run
● To run NiFi in the background, instead run bin/nifi.sh start
● To check the status and see if NiFi is currently running, execute
the command bin/nifi.sh status
● NiFi can be shut down by executing the command bin/nifi.sh stop
NiFi – User Interface

● Drag and drop processors to build a flow
● Start, stop, and configure components in real time
● View errors and corresponding error messages
● View statistics and health of the data flow
● Create templates of common processors and connections
NiFi - Processor

● There are 165 processors in total in the latest release
of NiFi, nifi-1.0.0
● Of these, we have explored the following processors:
– ConvertAvroToJson
– ExecuteSQL
– EvaluateJsonPath
– EvaluateXPath
– ExtractText
– HandleHttpRequest
Processor contd..

– HandleHttpResponse
– InvokeHttp
– LogAttribute
– PutSQL
– RouteOnAttribute
– ReplaceText
– SplitXML
– UpdateAttribute
Resources

● https://nifi.apache.org/docs.html
● http://hortonworks.com/apache/nifi/#section_1
● https://community.hortonworks.com/articles/7999/apache-nifi-part-1-introduction.html
● https://nifi.apache.org/developer-guide.html
● https://kisstechdocs.wordpress.com/2015/01/15/what-is-apache-nifi/
● http://www.ssglimited.com/what-is-apache-nifi/
● https://www.federallabs.org/index.php?tray=success_stories&tid=1FLtop55&cid=flcSS57
Thank You
