Apache Flink Getting Started
Getting Started
Flink SQL makes it simple to develop streaming applications using standard SQL. Because Flink remains ANSI-SQL 2011 compliant, it is easy to learn if you have ever worked with a database or a SQL-like system. This tutorial will help you get started quickly with a Flink SQL development environment.
Prerequisites
You only need to have basic knowledge of SQL to follow along. No other programming experience is
assumed.
Installation
There are multiple ways to install Flink. For experimentation, the most common option is to download
the binaries and run them locally. You can follow the steps in local installation to set up an
environment for the rest of the tutorial.
Once you’re all set, use the following command to start a local cluster from the installation folder:
./bin/start-cluster.sh
Once started, the Flink WebUI is available locally at localhost:8081, where you can monitor your running jobs.
SQL Client
The SQL Client is an interactive client to submit SQL queries to Flink and visualize the results. To start
the SQL client, run the sql-client script from the installation folder.
./bin/sql-client.sh
Hello World
Once the SQL client, our query editor, is up and running, it’s time to start writing queries. Let’s start
with printing ‘Hello World’, using the following simple query:
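SELECT 'Hello World';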
Running the HELP command lists the full set of supported SQL statements. Let’s run one such
command, SHOW, to see a full list of Flink built-in functions.
SHOW FUNCTIONS;
These functions provide users with a powerful toolbox of functionality when developing SQL queries.
For example, CURRENT_TIMESTAMP prints the current system time of the machine where it is executed.
SELECT CURRENT_TIMESTAMP;
Source Tables
As with all SQL engines, Flink queries operate on top of tables. Flink differs from a traditional database, however, because it does not manage data at rest locally; instead, its queries operate continuously over external tables.
Flink data processing pipelines begin with source tables. Source tables produce rows operated over
during the query’s execution; they are the tables referenced in the FROM clause of a query. These
could be Kafka topics, databases, filesystems, or any other system that Flink knows how to
consume.
Tables can be defined through the SQL client or using an environment config file. The SQL client supports SQL DDL commands similar to traditional SQL, and standard SQL DDL is used to create, alter, and drop tables.
Flink supports a variety of connectors and formats that can be used with tables. The following CREATE TABLE statement defines a source table backed by a CSV file with emp_id, name, and dept_id as columns:
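-- a sketch of the example described above; the connector, path, and format
-- values are illustrative placeholders for your own setup
CREATE TABLE employee_information (
    emp_id INT,
    name VARCHAR,
    dept_id INT
) WITH (
    'connector' = 'filesystem',
    'path' = '/path/to/something.csv',
    'format' = 'csv'
);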
A continuous query can be defined over this table that reads new rows as they become available and immediately outputs its results. For example, we can filter for just those employees who work in department 1:
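SELECT * FROM employee_information WHERE dept_id = 1;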
Continuous Queries
While not designed initially with streaming semantics in mind, SQL is a powerful tool for building continuous data pipelines. Where Flink SQL differs from traditional database queries is that it continuously consumes rows as they arrive and produces updates to its results.
A continuous query never terminates and produces a dynamic table as a result. Dynamic tables are
the core concept of Flink’s Table API and SQL support for streaming data.
Aggregations on continuous streams need to store aggregated results continuously during the
execution of the query. For example, suppose you need to count the number of employees for each
department from an incoming data stream. The query needs to maintain the most up to date count
for each department to output timely results as new rows are processed.
SELECT
dept_id,
COUNT(*) as emp_count
FROM employee_information
GROUP BY dept_id;
Such queries are considered stateful. Flink’s advanced fault-tolerance mechanism will maintain
internal state and consistency, so queries always return the correct result, even in the face of
hardware failure.
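As an illustrative aside (the SET statement below is an assumption about your setup, not part of the original walkthrough, and the interval is arbitrary), the SQL client can enable the periodic checkpointing that underpins this fault tolerance:
SET 'execution.checkpointing.interval' = '10s';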
Sink Tables
When running this query, the SQL client provides output in real time, but in a read-only fashion. Storing results - to power a report or dashboard - requires writing out to another table. This can be achieved using an INSERT INTO statement. The table referenced in this clause is known as a sink table. An INSERT INTO statement is submitted as a detached query to the Flink cluster. Once submitted, the query runs and stores its results directly in the sink table, rather than loading them into the system's memory.
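As a minimal sketch, assuming a sink table named department_counts (a hypothetical name) has already been created with matching columns and a suitable connector, the running count per department can be persisted like so:
INSERT INTO department_counts
SELECT
    dept_id,
    COUNT(*) as emp_count
FROM employee_information
GROUP BY dept_id;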
Looking for Help!
If you get stuck, check out the community support resources. In particular, Apache Flink's user mailing list consistently ranks as one of the most active of any Apache project, and it is a great way to get help quickly.