
Overview

KNIME Analytics Platform


What is KNIME Analytics Platform?

A tool for data analysis, manipulation, visualization, and reporting
Based on the graphical programming paradigm
Provides a diverse array of extensions:
Text Mining
Network Mining
Cheminformatics
Many integrations, such as Java, R, Python, Weka, Keras, Plotly, H2O, etc.

Visual KNIME Workflows

Nodes perform tasks on data

Workflows combine nodes to model data flow

Components encapsulate complexity & expertise

Data Access

Databases
MySQL, PostgreSQL, Oracle
Theobald
any JDBC (DB2, MS SQL Server)
Amazon DynamoDB
Files
CSV, txt, Excel, Word, PDF
SAS, SPSS
XML, JSON, PMML
Images, texts, networks
Other
Twitter, Google
Amazon S3, Azure Blob Store
Sharepoint, Salesforce
Kafka
REST, Web services

Big Data

Spark & Databricks


HDFS support
Hive
Impala
In-database processing

Transformation

Preprocessing
Row, column, matrix based
Data blending
Join, concatenate, append
Aggregation
Grouping, pivoting, binning
Feature Creation and Selection

Analysis & Data Mining

Regression
Linear, logistic
Classification
Decision tree, ensembles, SVM, MLP, Naïve Bayes
Clustering
k-means, DBSCAN, hierarchical
Validation
Cross-validation, scoring, ROC
Deep Learning
Keras, DL4J
External
R, Python, Weka, H2O, Keras

Visualization

Interactive Visualizations
JavaScript-based nodes
Scatter Plot, Box Plot, Line Plot
Networks, ROC Curve, Decision Tree
Plotly Integration
Adding more with each release!
Misc
Tag cloud, OpenStreetMap, molecules
Script-based visualizations
R, Python

Deployment

Database
Files
Excel, CSV, txt
XML
PMML
to: local, KNIME Server, Amazon S3, Azure Blob Store
BIRT Reporting

Over 2000 Native and Embedded Nodes
Included:

Data Access: MySQL, Oracle, …; SAS, SPSS, …; Excel, Flat, …; Hive, Impala, …; XML, JSON, PMML; Text, Doc, Image, …; Web Crawlers; Community / Industry Specific / 3rd party
Transformation: Row, Column, Matrix; Text, Image; Time Series; Java, Python; Community / 3rd party
Analysis & Mining: Statistics; Data Mining; Machine Learning; Web Analytics; Text Mining; Network Analysis; Social Media Analysis; R, Weka, Python; Community / 3rd party
Visualization: R; JFreeChart; JavaScript; Plotly; Community / 3rd party
Deployment: via BIRT; PMML; XML, JSON; Databases; Excel, Flat, etc.; Text, Doc, Image; Industry Specific; Community / 3rd party
Install KNIME Analytics Platform
Select the KNIME version for your computer:
Mac
Windows – 32 or 64 bit
Linux
Download the archive and extract the file, or download the installer package and run it

Start KNIME Analytics Platform

Use the shortcut created by the installer
Or go to the installation directory and launch KNIME via knime.exe

The KNIME Workspace

The workspace is the folder/directory in which workflows (and potentially data files) are stored for the current KNIME session
Workspaces are portable (just like KNIME)

The KNIME Analytics Platform Workbench

Panels: KNIME Explorer, Workflow Coach, Workflow Editor, Node Repository, Node Description, KNIME Hub view, Console & Node Monitor, Outline

KNIME Explorer

In LOCAL you can access your own workflow projects.
Other mountpoints allow you to connect to:
EXAMPLES Server
KNIME Hub
KNIME Server
The Explorer toolbar on the top has a search box and buttons to:
select the workflow displayed in the active editor
refresh the view
The KNIME Explorer can contain 4 types of content:
Workflows
Workflow groups
Data files
Shared Components

Creating New Workflows, Importing, and
Exporting
Right-click inside the KNIME Explorer to create a new workflow or a workflow group, or to import a workflow
Right-click the workflow or workflow group to export

Node Repository

The Node Repository lists all KNIME nodes
The search box has 2 modes:
Standard Search – exact match of node name
Fuzzy Search – finds the most similar node name

Description

The Description view provides information about:
Node functionality
Input & output
Node settings
Ports
References to literature

Workflow Description

When selecting the workflow, the Description view gives you information about the workflow:
Title
Description
Associated tags and links
Creation date
Author

Workflow Coach

Node recommendation engine
Gives hints about which node to use next in the workflow
Based on the KNIME community's usage statistics
Based on your own KNIME workflows

Node Monitor

By default the Node Monitor shows you the output table of the node selected in the workflow editor
Click on the three dots on the upper right to show the flow variables, configuration, etc.

Console and Other Views

The Console view prints out error and warning messages about what is going on under the hood
Click View and select Other… to add different views: Node Monitor, Licenses, etc.

Inserting and Connecting Nodes

Insert nodes into the workspace by dragging them from the Node Repository or by double-clicking in the Node Repository
Connect nodes by left-clicking the output port of Node A and dragging the cursor to the (matching) input port of Node B
Common port types: Data, Model, Flow Variable, Image, DB Connection, DB Data

More on Nodes…

A node can have 4 states:

Not Configured: the node is waiting for configuration or incoming data.
Configured: the node has been configured correctly, and can be executed.
Executed: the node has been successfully executed. Results may be viewed and used in downstream nodes.
Error: the node has encountered an error during execution.
Node Configuration

Most nodes need to be configured
To access a node configuration dialog:
Double-click the node
Right-click -> Configure

Node Execution

Right-click the node
Select Execute in the context menu
If execution is successful, the status shows a green light
If execution encounters errors, the status shows a red light

Tool Bar

The buttons in the toolbar can be used for the active workflow. The most important buttons are:

Execute selected and executable nodes (F7)
Execute all executable nodes
Execute selected nodes and open first view
Cancel all selected, running nodes (F9)
Cancel all running nodes

Node Views

Right-click a node to inspect the execution results:
select output ports (last option in the context menu) to inspect tables, images, etc.
select Interactive View to open visualization results in a browser

KNIME File Extensions

Dedicated file extensions for workflows and workflow groups associated with KNIME Analytics Platform:

*.knwf for KNIME Workflow Files
*.knar for KNIME Archive Files

Getting Started: KNIME Hub

Place to search and share:
Workflows
Nodes
Components
Extensions

https://hub.knime.com
Getting Started: KNIME Example Server

Connect via KNIME Explorer to a public repository with a large selection of example workflows for many, many applications

Hot Keys (for Future Reference)
Task | Hot key | Description
Node Configuration | F6 | opens the configuration window of the selected node
Node Execution | F7 | executes selected configured nodes
Node Execution | Shift + F7 | executes all configured nodes
Node Execution | Shift + F10 | executes all configured nodes and opens all views
Node Execution | F9 | cancels selected running nodes
Node Execution | Shift + F9 | cancels all running nodes
Node Connections | Ctrl + L | connects selected nodes
Node Connections | Ctrl + Shift + L | disconnects selected nodes
Move Nodes and Annotations | Ctrl + Shift + Arrow | moves the selected node in the arrow direction
Move Nodes and Annotations | Ctrl + Shift + PgUp/PgDown | moves the selected annotation in front of or behind all overlapping annotations
Workflow Operations | F8 | resets selected nodes
Workflow Operations | Ctrl + S | saves the workflow
Workflow Operations | Ctrl + Shift + S | saves all open workflows
Workflow Operations | Ctrl + Shift + W | closes all open workflows
Metanode | Shift + F12 | opens metanode wizard

Introduction to the Big Data Course
Goal of this Course

Become familiar with the Hadoop ecosystem and the KNIME Big Data Extensions

What you need:
KNIME Analytics Platform with:
KNIME Big Data Connectors
KNIME Extension for Apache Spark
KNIME Extension for Local Big Data Environment
KNIME File Handling Nodes

Installation of Big Data Extensions

…or install via drag-and-drop from the KNIME Hub

Big Data Resources (1)

SQL Syntax and Examples
https://www.w3schools.com

Apache Spark MLlib
https://spark.apache.org/docs/latest/ml-guide.html

KNIME Big Data Extensions (Hadoop + Spark)
https://www.knime.com/knime-big-data-extensions

Example workflows on KNIME Hub
https://www.knime.com/nodeguide/big-data

Big Data Resources (2)

Whitepaper “KNIME opens the Doors to Big Data”
https://www.knime.com/sites/default/files/inline-images/big_data_in_knime_1.pdf
Blog Posts
https://www.knime.org/blog/Hadoop-Hive-meets-Excel
https://www.knime.com/blog/SparkSQL-meets-HiveSQL
https://www.knime.com/blog/speaking-kerberos-with-knime-big-data-extensions
https://www.knime.com/blog/new-file-handling-out-of-labs-and-into-production
Video
https://www.knime.com/blog/scaling-analytics-with-knime-big-data-extensions

Overview

1. Use a traditional database and KNIME Analytics Platform native machine learning nodes
2. Move in-database processing to Hadoop Hive
3. Move in-database processing and machine learning to Spark

Today’s Example: Missing Values Strategy

Missing values are a big problem in data science!
Many strategies to deal with the problem (see “How to deal with missing values” on the KNIME Blog: https://www.knime.com/blog/how-to-deal-with-missing-values)
We adopt the strategy that predicts the missing values based on the other attributes in the same data row
CENSUS data set with missing COW values from http://www.census.gov/programs-surveys/acs/data/pums.html

CENSUS Data Set

CENSUS data contains questions to a sample of US residents (1%) over 10 years
CENSUS data set description: http://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMSDataDict15.pdf
ss13hme (60K rows) -> questions about housing to Maine residents
ss13pme (60K rows) -> questions about themselves to Maine residents
ss13hus (31M rows) -> questions about housing to all US residents in the sample
ss13pus (31M rows) -> questions about themselves to all US residents in the sample
Missing Values Strategy Implementation

Connect to the CENSUS data set
Separate data rows with COW from data rows with missing COW
Train a decision tree to predict COW (obviously only on data rows with COW)
Apply the decision tree to predict COW where COW is missing
Update the original data set with the new predicted COW values

Today’s Example: Missing Values Strategy

Let’s Practice First on a Traditional Database

Database Extension
Database Extension

Visually assemble complex SQL statements (no SQL coding needed)
Connect to all JDBC-compliant databases
Harness the power of your database within KNIME

Database Connectors

Many dedicated DB Connector nodes available
If a connector node is missing, use the DB Connector node with a JDBC driver

In-Database Processing

Database manipulation nodes generate a SQL query on top of the input SQL query (brown square port)
SQL operations are executed on the database!
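
To illustrate how this query stacking works, here is a minimal sketch in plain Python (not KNIME code; the stacked statements and helper names are hypothetical): each manipulation node wraps the incoming statement in a subquery instead of executing it.

# A minimal sketch (plain Python, not KNIME code; table and column names
# are hypothetical) of how DB manipulation nodes stack SQL: each node
# wraps the incoming statement in a subquery instead of executing it.
base_query = "SELECT * FROM ss13pme"

def row_filter(query, condition):
    # Corresponds to a DB Row Filter node: adds a WHERE clause on top.
    return f"SELECT * FROM ({query}) AS t WHERE {condition}"

def column_filter(query, columns):
    # Corresponds to a DB Column Filter node: restricts the SELECT list.
    return f"SELECT {', '.join(columns)} FROM ({query}) AS t"

stacked = column_filter(row_filter(base_query, "COW IS NOT NULL"),
                        ["SERIALNO", "AGEP", "COW"])
print(stacked)  # the whole statement only runs once a reader node executes it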

Export Data

Writing data back into the database
Exporting data into KNIME
Tip

SQL statements can be easily viewed from the DB node output ports.

Database Port Types
Database Port Types

Database Connection Port (brown):
Connection information
SQL statement

Database JDBC Connection Port (red):
Connection information

DB Connection Port View

DB Data Port View

Copy SQL statement

Connect to Database and Import Data
Database Connectors

Dedicated nodes to connect to specific databases:
Necessary JDBC driver included
Easy to use
Import DB-specific behavior/capability

The Hive and Impala connectors are part of the KNIME Big Data Connectors extension

General Database Connector:
Can connect to any JDBC source
Register a new JDBC driver via File -> Preferences -> KNIME -> Databases

Register JDBC Driver

Register single-jar-file JDBC drivers
Register new JDBC drivers with companion files
Open KNIME and go to File -> Preferences, then KNIME -> Databases

DB Connector Node

DB Connector Node – Type Mapping

KNIME will do its best to guess what type mappings are appropriate based on what it knows about your database
If you need more control, you can specify type mappings manually in two ways:
By name, for individual fields – or groups of fields using RegEx
By type
Two separate tabs govern input and output type mappings

Dedicated Database Connectors

MS SQL Server, MySQL, Postgres, SQLite, …
Propagate connection information to other DB nodes

Workflow Credentials – Usage

Replaces username and password fields
Supported by several nodes that require login credentials:
DB connectors
Remote file system connectors
Send mail

Credentials Configuration Node

Works together with all nodes that support workflow credentials

DB Table Selector Node

Takes connection information and constructs a query
Explore DB metadata
Outputs a SQL query

DB Reader Node

Executes the incoming SQL query on the database
Reads the results into a KNIME data table
(input: Database Connection Port; output: KNIME Data Table)

Section Exercise – 01_DB_Connect

Connect to the database (SQLite) newCensus.sqlite in folder 1_Data
Use the SQLite Connector (Note: the SQLite Connector supports the knime:// protocol)
Explore DB metadata
Select table ss13pme (person data in Maine)
Import the data into a KNIME data table

Optional: Create a Credentials Input node and use it in a MySQL Connector instead of user name and password.

You can download the training workflows from the KNIME Hub:
https://hub.knime.com/knime/spaces/Education/latest/Courses/

In-Database Processing
Query Nodes

Filter rows and columns
Join tables/queries
Extract samples
Bin numeric columns
Sort your data
Write your own query
Aggregate your data

Data Aggregation

Input table:
RowID | Group | Value
r1 | m | 2
r2 | f | 3
r3 | m | 1
r4 | f | 5
r5 | f | 7
r6 | m | 5

Aggregated on “Group” by method sum(“Value”):
RowID | Group | Sum(Value)
r1+r3+r6 | m | 8
r2+r4+r5 | f | 15
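
The same aggregation as plain SQL, run here with Python's built-in sqlite3 module for illustration (a sketch mirroring the slide's example table, not the node's exact generated SQL):

import sqlite3

# The aggregation above as a GROUP BY statement; the rows mirror the
# slide's example table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (row_id TEXT, grp TEXT, value INTEGER)")
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("r1", "m", 2), ("r2", "f", 3), ("r3", "m", 1),
     ("r4", "f", 5), ("r5", "f", 7), ("r6", "m", 5)],
)
for grp, total in con.execute("SELECT grp, SUM(value) FROM t GROUP BY grp"):
    print(grp, total)  # f 15 and m 8 (row order may vary)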

Database GroupBy Node

Aggregate to summarize data

DB GroupBy Node – Manual Aggregation

Returns the number of rows per group

Database GroupBy Node – Pattern-Based Aggregation

Tick this option if the search pattern is a regular expression; otherwise it is treated as a string with wildcards ('*' and '?')

Database GroupBy Node – Type-Based Aggregation

Matches all columns
Matches all numeric columns

Database GroupBy Node – DB-Specific Aggregation Methods

SQLite: 7 aggregation functions
PostgreSQL: 25 aggregation functions

DB GroupBy Node – Custom Aggregation Function

Joining Columns of Data
Join the left table and the right table by ID:
Inner Join
Left Outer Join: missing values in the right table
Right Outer Join: missing values in the left table

Joining Columns of Data
Join the left table and the right table by ID:
Full Outer Join: missing values in the right table and in the left table

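As a hedged sketch, the join variants above expressed as SQL strings (hypothetical tables left_t and right_t joined on a shared id column), roughly what a join node generates for each mode:

# Hedged sketch: the join variants above as SQL (hypothetical names);
# not the Database Joiner node's exact output.
joins = {
    "inner":       "SELECT * FROM left_t l INNER JOIN right_t r ON l.id = r.id",
    "left_outer":  "SELECT * FROM left_t l LEFT OUTER JOIN right_t r ON l.id = r.id",
    "right_outer": "SELECT * FROM left_t l RIGHT OUTER JOIN right_t r ON l.id = r.id",
    "full_outer":  "SELECT * FROM left_t l FULL OUTER JOIN right_t r ON l.id = r.id",
}
for mode, sql in joins.items():
    print(f"{mode}: {sql}")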
Database Joiner Node

Combines columns from 2 different tables
Top port contains the “left” data table
Bottom port contains the “right” data table

Joiner Configuration – Linking Rows

Values to join on. Multiple joining columns are allowed.

Joiner Configuration – Column Selection

Columns from the left table to include in the output table
Columns from the right table to include in the output table

Database Row Filter Node

Filters rows that do not match the filter criteria
Use the IS NULL or IS NOT NULL operator to filter missing values

Database Sorter Node

Sorts the input data by one or multiple columns

Database Query Node

Executes arbitrary SQL queries
#table# is replaced with the input query
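
A minimal sketch of the #table# substitution (plain Python; the queries are hypothetical): the incoming query from the input port is spliced in as a subquery.

# Sketch of the #table# substitution: the incoming query from the
# input port replaces the placeholder as a subquery.
incoming_query = "SELECT * FROM ss13pme"
user_query = "SELECT SEX, AVG(AGEP) AS avg_age FROM #table# GROUP BY SEX"

final_query = user_query.replace("#table#", f"({incoming_query}) AS t")
print(final_query)
# SELECT SEX, AVG(AGEP) AS avg_age FROM (SELECT * FROM ss13pme) AS t GROUP BY SEX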

Section Exercise – 02_DB_InDB_Processing

From tables ss13hme (house data) and ss13pme (person data) in database newCensus.sqlite create 4 tables:
1. Join ss13hme and ss13pme on SERIALNO. Remove all columns named PUMA* and PWGTP* from both tables.
2. Filter all rows from ss13pme where COW is NULL.
3. Filter all rows from ss13pme where COW is NOT NULL.
4. Calculate the average AGEP for the different SEX groups.
For all 4 tasks, at the end load the data into KNIME.

Optional: Sort the data rows by descending AGEP and extract the top 10 only. Hint: Use LIMIT to restrict the number of rows returned by the database.
Predicting COW Values with KNIME
Missing Value Implementation Approach

Remember that after we perform some in-database ETL on the data, a key task is to fill in missing values for the COW field in our dataset
We could try to do this by applying some simple business rules, but a more sophisticated approach is to build a model to predict COW
Therefore, we will train and apply a decision tree model for COW

Section Exercise – 03_DB_Modelling

Train a Decision Tree to predict COW where COW is not null
Apply the Decision Tree model to predict COW where COW is missing (null)

Write/Load Data into a Database
Database Writing Nodes

Create table as select
Insert/append data
Update values in table
Delete rows from table

DB Writer Node

Writes data from a KNIME data table directly into a database table
Append to or drop the existing table
Increase the batch size for better performance

DB Writer Node (continued)

Writes data from a KNIME data table directly into a database table
Apply custom variable types

DB Connection Table Writer Node

Creates a new database table based on the input SQL query

DB Delete Node

Deletes all database records that match the values of the selected columns
Increase the batch size for better performance

Database Update Node

Updates all database records that match the update criteria
Columns to update
Columns that identify the records to update

Utility

Drop table
missing table handling
cascade option
Execute any SQL statement, e.g. DDL
Manipulate existing queries

More Utility Nodes and Transaction Support
DB Connection Extractor
DB Connection Closer
DB Transaction Start/End

Take advantage of these nodes to group several database data manipulation operations into a single unit of work
The transaction either completes entirely or not at all
Uses the default isolation level of the connected database

The workflow is available on the KNIME Hub
Section Exercise – 04_DB_WritingToDB

Write the original table to the ss13pme_original table with a Database Connection Table Writer node ... just in case we mess up with the updates in the next step.
Update all rows in the ss13pme table with the output of the predictor node. That is, update all rows with a missing COW value with the predicted COW value, using column SERIALNO for the WHERE condition (SERIALNO uniquely identifies each person). Check the UpdateStatus column for success.

Optional: Write the learned Decision Tree model and the timestamp into a new table named "model"

Let’s Now Try the Same with Hadoop

A Quick Intro to Hadoop
Apache Hadoop

Open-source framework for distributed storage and processing of large data sets
Designed to scale up to thousands of machines
Does not rely on hardware to provide high availability
Handles failures at the application layer instead
First release in 2006
Rapid adoption, promoted to top-level Apache project in 2008
Inspired by the Google File System (2003) paper
Spawned a diverse ecosystem of products

Hadoop Ecosystem

Access: Hive
Processing: MapReduce, Tez, Spark
Resource Management: YARN
Storage: HDFS

HDFS

Hadoop distributed file system
Stores large files across multiple machines
A (large!) file is split into blocks (default: 64MB)
The blocks are distributed across DataNodes

HDFS – NameNode and DataNode

NameNode:
Master service that manages the file system namespace
Maintains metadata for all files and directories in the filesystem tree
Knows on which datanodes the blocks of a given file are located
The whole system depends on the availability of the NameNode

DataNodes:
Workers; store and retrieve blocks per request of client or namenode
Periodically report to the namenode that they are running and which blocks they are storing

HDFS – Data Replication and File Size

Data Replication:
All blocks of a file are stored as a sequence of blocks
Blocks of a file are replicated for fault tolerance (usually 3 replicas)
Aims: improve data reliability, availability, and network bandwidth utilization

(Figure: the NameNode places replicas of the blocks B1–B3 of a file across datanodes in three racks.)

HDFS – Access and File Size

Several ways to access HDFS data:
HDFS: direct transmission of data from nodes to client; needs access to all nodes in the cluster
WebHDFS: direct transmission of data from nodes to client via HTTP; needs access to all nodes in the cluster
HttpFS: all data is transmitted to the client via one single gateway node -> HttpFS service

File size:
Hadoop is designed to handle fewer large files instead of lots of small files
Small file: a file significantly smaller than the Hadoop block size
Problems: namenode memory, MapReduce performance

HDFS – Access

(Figure: a client, e.g. KNIME, talks to the Hadoop cluster and exchanges data directly with the HDFS DataNodes.)

YARN

Cluster resource management system
Two elements:
Resource Manager (one per cluster):
Knows where worker nodes are located and how many resources they have
Scheduler: decides how to allocate resources to applications
Node Manager (many per cluster):
Launches application containers
Monitors resource usage and reports to the Resource Manager

MapReduce

Input -> Splitting -> Mapping -> Shuffling -> Reducing -> Result

Map applies a function to each element
For each word, emit: (word, 1)
Reduce aggregates a list of values to one result
For all equal words, sum up the counts

(Worked example: the input lines "blue red orange", "yellow blue yellow", "blue" are split, mapped to (word, 1) pairs, shuffled by key, and reduced to the counts blue: 3, yellow: 2, orange: 1, red: 1. See the sketch below.)
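
The same word count in plain Python, mirroring the map, shuffle, and reduce phases (a compact sketch of the programming model, not a distributed implementation):

from collections import defaultdict

# The slide's word count in the MapReduce style: map emits (word, 1)
# pairs, shuffling groups them by key, reduce sums the counts per word.
lines = ["blue red orange", "yellow blue yellow", "blue"]

mapped = [(word, 1) for line in lines for word in line.split()]  # map

groups = defaultdict(list)                                       # shuffle
for word, count in mapped:
    groups[word].append(count)

result = {word: sum(counts) for word, counts in groups.items()}  # reduce
print(result)  # {'blue': 3, 'red': 1, 'orange': 1, 'yellow': 2}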

Hive

SQL-like database on top of files in HDFS
Provides data summarization, query, and analysis
Interprets a set of files as a database table (schema information to be provided)
Translates SQL queries to MapReduce, Tez, or Spark jobs
Supports various file formats:
Text/CSV
SequenceFile
Avro
ORC
Parquet

Hive

(Figure: a SQL query "select * from table" is translated into MapReduce / Tez / Spark jobs (MAP(...), REDUCE(...)) that read the table's files, e.g. table_1.csv, table_2.csv, table_3.csv, from the DataNodes and return a result table.)

Spark

Cluster computing framework for large-scale data processing
Keeps large working datasets in memory between jobs
No need to always load data from disk -> much (!) faster than MapReduce
Programmatic interface:
Scala, Java, Python, R
Functional programming paradigm: map, flatmap, filter, reduce, fold, …
Great for:
Iterative algorithms
Interactive analysis
Spark – Data Representation

DataFrame:
Table-like: a collection of rows, organized in columns with names and types (e.g. Name | Surname | Age: John | Doe | 35; Jane | Roe | 29)
Immutable: data manipulation = creating a new DataFrame from an existing one by applying a function to it
Lazily evaluated: functions are not executed until an action is triggered that requests to actually see the row data
Distributed: each row belongs to exactly one partition; each partition is held by a Spark Executor

Note: earlier versions of KNIME and Spark used RDDs (resilient distributed datasets). In KNIME, DataFrames are always used in Spark 2 and later.
Spark – Lazy Evaluation

Functions ("transformations") on DataFrames are not executed immediately
Spark keeps a record of the transformations for each DataFrame
The actual execution is only triggered once the data is needed (an action triggers evaluation)
Offers the possibility to optimize the transformation steps
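
For illustration, lazy evaluation in plain PySpark (a sketch assuming an existing SparkSession named `spark`; in KNIME, the context node provides the session under the hood):

# Sketch of lazy evaluation in plain PySpark; assumes an existing
# SparkSession named `spark`.
df = spark.createDataFrame(
    [("John", "Doe", 35), ("Jane", "Roe", 29)],
    ["name", "surname", "age"],
)

# Transformations: only recorded in the execution plan, nothing runs yet.
adults = df.filter(df.age > 30).select("name", "age")

# Action: triggers the actual, possibly optimized, execution.
adults.show()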

Spark Context

Main entry point for Spark functionality
Represents the connection to a Spark cluster
Allocates resources on the cluster

(Figure: the Spark Driver holds the Spark Context and distributes tasks to Spark Executors.)
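
In KNIME the context is created by a node (e.g. Create Local Big Data Environment); a rough plain-PySpark equivalent of what such a node does, as a hedged sketch:

from pyspark.sql import SparkSession

# Rough equivalent of a context-creating node: connect to a cluster
# (here: local mode) and allocate resources. Names and settings are
# illustrative, not what KNIME actually configures.
spark = (
    SparkSession.builder
    .master("local[*]")                     # local Spark, no cluster needed
    .appName("knime-style-context")
    .config("spark.executor.memory", "2g")  # example resource setting
    .getOrCreate()
)

sc = spark.sparkContext        # the underlying SparkContext
print(sc.defaultParallelism)   # resources visible to this context
spark.stop()                   # destroying the context frees its DataFrames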

Big Data Architecture with KNIME

(Figure: workflows are built graphically in KNIME Analytics Platform with Big Data Extensions and uploaded via HTTP(S) to KNIME Server with Big Data Extensions for scheduled execution and RESTful workflow submission. Both submit Impala queries via JDBC to Impala, submit Hive queries via JDBC to HiveServer2, and interact with Spark via Livy over HTTP(S) on the Hadoop cluster.)

In-Database Processing on Hadoop
KNIME Big Data Connectors

Package required drivers/libraries for HDFS, Hive, Impala access
Preconfigured database connectors:
Hive
Impala

Hive Connector

Creates a JDBC connection to Hive
On unsecured clusters, no password required

Preferences

Create Local Big Data Environment Node

Creates a fully functional big data environment on your local machine with:
Apache Hive
HDFS
Apache Spark
Try out Big Data nodes without a Hadoop cluster
Build and test workflows locally on sample data

Section Exercise – 01_Hive_Connect

Execute the workflow 00_Setup_Hive_Table to create a local big data environment with the data used in this training.

In the workflow implemented in the previous section to predict missing COW values, move execution from the database to Hive. That means: change this workflow to run on the ss13pme table on the Hive database in your local big data environment.

Write/Load Data into Hadoop
Loading Data into Hive/Impala

The connectors come from the KNIME Big Data Connectors extension
Use DB Table Creator and DB Loader from the regular DB framework
Other possible nodes

DB Loader

Hive Partitioning

(Figure: a Sales parent table is range-partitioned by date into monthly partitions, e.g. Jan2017 with date ≥ 01-01-2017 and date ≤ 01-31-2017, then Feb2017 and Mar2017; each monthly partition is list-subpartitioned by region into Europe, Asia, and USA.)
Partitioning

About partition columns:
Optional (!) performance optimization
Use columns that are often used in WHERE clauses
Use only categorical columns with a suitable value range, i.e. not too few distinct values (e.g. 2) and not too many distinct values (e.g. 10 million)
Partition columns should not contain missing values
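
As a hedged sketch, a partitioned Hive table could be declared like this (hypothetical table and column names; in KNIME, the DB Table Creator node generates such DDL for you):

# Hedged sketch: Hive DDL for a partitioned table (hypothetical names).
ddl = """
CREATE TABLE sales (
    id     INT,
    amount DOUBLE
)
PARTITIONED BY (sale_date STRING, region STRING)
STORED AS ORC
"""
# e.g. spark.sql(ddl) on a SparkSession with Hive support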

Section Exercise – 02_Hive_WritingToDB

Start from the workflow that implements the missing value strategy and write the results back into Hive. That is:

Write the results into a new table named "newTable" in Hive, using the HDFS Connection of the Local Big Data Environment along with both the DB Table Creator and DB Loader nodes.

HDFS File Handling
File Handling Nodes

Connection/Administration:
HDFS Connection
HDFS File Permission
Utilize the existing remote file handling nodes:
Transfer Files
Create Folders
Delete Files/Folders
Compress and Decompress
Full documentation available

Ready for Spark?

KNIME Extension for Apache Spark
Spark: Machine Learning on Hadoop

Runs on Hadoop
Supported Spark versions: 1.2, 1.3, 1.5, 1.6, 2.x
One KNIME extension for all Spark versions
Scalable machine learning library (Spark MLlib and spark.ml)
Algorithms for:
Classification (decision tree, naïve Bayes, logistic regression, …)
Regression (linear regression, …)
Clustering (k-means)
Collaborative filtering (ALS)
Dimensionality reduction (SVD, PCA)
Item sets / association rules

Spark Integration in KNIME

Spark Contexts: Creating

Three nodes to create a Spark context:

Create Local Big Data Environment
Runs Spark locally on your machine (no cluster required)
Good for workflow prototyping

Create Spark Context (Livy)
Requires a cluster that provides the Livy service
Good for production use

Create Databricks Environment
Runs Spark on a remote Databricks cluster
Good for large-scale production use
Spark Contexts: Using, Destroying

The Spark Context port is required by all Spark nodes
Destroying a Spark Context destroys all Spark DataFrames within the context

Create Spark Context (Livy)

Allows using Spark nodes on clusters with Apache Livy
Out-of-the-box compatibility with:
Hortonworks (v2.6.3 and higher)
Amazon EMR (v5.9.0 and higher)
Azure HDInsight (v3.6 and higher)
Also supported:

Import Data from KNIME or Hadoop
Import Data to Spark

From a KNIME data table (inputs: KNIME data table, Spark context)
From a CSV file in HDFS (inputs: remote file connection, Spark context)
From Hive (inputs: Hive query, Spark context)
From a database (inputs: database query, Spark context)
From other sources
Spark DataFrame Ports

The Spark DataFrame port points to a DataFrame in the Spark cluster
Data stays within Spark
The output port provides a data preview and column information

Reminder: lazy evaluation
A green node status does not always mean that computation has been performed!
Section Exercise – 01_Spark_Connect

To do:
Connect to Spark via the Create Local Big Data Environment node
Import the ss13pme table from Hive into Spark

Virtual Data Warehouse

Pre-Processing with Spark
Spark Column Filter Node

Spark Row Filter Node

Spark Joiner Node

Spark Missing Value Node

Spark GroupBy and Spark Pivot Nodes

Spark Sorter Node

Spark SQL Query Node
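
As an illustration of what this node does in plain PySpark (a hedged sketch; `spark`, `df`, and the view name are assumptions, not the node's actual internals):

# Hedged sketch: run a SQL query against a Spark DataFrame, roughly
# what the Spark SQL Query node does with its input DataFrame.
df.createOrReplaceTempView("input_table")
result = spark.sql(
    "SELECT SEX, AVG(AGEP) AS avg_age FROM input_table GROUP BY SEX"
)
result.show()  # the action that triggers execution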

Section Exercise – 02_Spark_Preprocessing

This exercise will demonstrate some data manipulation operations in Spark. It initially imports the ss13pme and ss13hme tables from Hive.

To do:
Column Filter to remove PWGTP* and PUMA* columns
Join with ss13hme on SERIALNO
Find the rows with the top ten AGEP in the dataset and import them into KNIME
Calculate the average AGEP per SEX group
Split the dataset into two:
One where COW is null
One where COW is not null

Mix & Match

Thanks to the transferring nodes (Hive to Spark and Spark to Hive, Table to Spark and Spark to Table) you can mix and match in-database processing operations

Modularize and Execute Your Own Spark Code: Java Snippets

Modularize and Execute Your Own Spark Code: PySpark Script

Machine Learning with Spark
MLlib Integration: Familiar Usage Model

Usage model and dialogs like existing nodes
No coding required
Various algorithms for classification, regression, and clustering supported

MLlib Integration: Spark MLlib Model Port

MLlib model ports for model transfer
Model ports provide more information about the model itself

MLlib Integration: Categorical Features

MLlib learner nodes only support numeric features and labels
String columns (with categorical values) need to be mapped to numeric first

MLlib Integration: Categorical Values for Decision Tree Algorithms

MLlib tree algorithms have an optional PMML input port
If connected, it hints to the Decision Tree algorithm which numeric columns are categorical in nature
Improves performance in some cases

Alternative nodes
Spark Predictor Node

Spark Predictor (MLlib) assigns labels based on an MLlib model
Supports all supervised classification & regression MLlib models
spark.ml models have a separate learner/predictor

Alternative nodes

Section Exercise – 03_Spark_Modelling

On the ss13pme table, the current workflow separates the rows where COW is null from those where COW is not null, and then modifies COW to be zero-based.

To do:
Where COW is not null:
Fix missing values in the feature columns
Train a decision tree on COW
Where COW is null:
Remove the COW column
Apply the decision tree model to predict COW
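
For orientation, the equivalent steps in plain pyspark.ml might look roughly like this (a sketch with hypothetical feature columns; the KNIME Spark learner/predictor nodes wrap this API):

from pyspark.ml import Pipeline
from pyspark.ml.classification import DecisionTreeClassifier
from pyspark.ml.feature import VectorAssembler

# Sketch: assumes `df` is the ss13pme Spark DataFrame with a numeric,
# zero-based COW label; the feature subset is a hypothetical choice.
train_df = df.filter(df.COW.isNotNull())
apply_df = df.filter(df.COW.isNull()).drop("COW")

assembler = VectorAssembler(
    inputCols=["AGEP", "SEX"],   # hypothetical feature subset
    outputCol="features",
    handleInvalid="keep",        # crude stand-in for missing value handling
)
tree = DecisionTreeClassifier(labelCol="COW", featuresCol="features")

model = Pipeline(stages=[assembler, tree]).fit(train_df)
predicted = model.transform(apply_df)  # adds a "prediction" column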

Mass Learning in Spark – Conversion to PMML

Mass learning on Hadoop
Convert supported MLlib models to PMML

Sophisticated Model Learning in KNIME – Mass Prediction in Spark

Supports KNIME models and pre-processing steps
Sophisticated model learning in KNIME
Mass prediction in Spark using the Spark PMML Model Predictor

Closing the Loop

(Figure: a closed loop: learn a model at scale in Spark (MLlib model) and apply it on demand in KNIME; or do sophisticated model learning in KNIME (PMML model) and apply the model at scale in Spark.)

Mix and Match

KNIME <-> Hive <-> Spark

Export Data back into KNIME/Hadoop
Export Data from Spark

To KNIME (Spark DataFrame -> KNIME data table)
To a CSV file in HDFS (Spark DataFrame + remote file connection)
To Hive (Spark DataFrame + Hive connection -> Hive table)
To other storages (Spark DataFrame)
To a database (Spark DataFrame + database connection -> database table)

Cloud & Big Data Connectivity: Databricks

Create Databricks Environment: connect to your Databricks cluster
Azure or AWS
Databricks Delta, Databricks File System, or Apache Spark

Cloud & Big Data Connectivity: Google

Connectivity to:
Google Cloud Storage
Google BigQuery (via DB nodes)
Google Cloud Dataproc

Section Exercise – 04_Spark_WritingToDB

This workflow provides a Spark predictor to predict COW values for the ss13pme data set. The model is applied to predict COW values where they are missing.

Now export the new data set without missing values to:
A KNIME table
A Parquet file in HDFS
A Hive table

Examples
Analyzing the Irish Meter Dataset Using Spark SQL

Analyzing the Irish Meter Dataset Using Spark SQL

Columnar File Formats

Available in KNIME Analytics Platform: ORC and Parquet

Benefits:
Efficient compression: stored as columns and compressed, which leads to smaller disk reads
Fast reads: data is split into multiple files. Files include a built-in index, min/max values, and other aggregates. In addition, predicate pushdown pushes filters into reads so that minimal rows are read.
Proven in large-scale deployments: Facebook uses the ORC file format for a 300+ PB deployment

Improves performance when Hive is reading, writing, and processing data in HDFS
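
A short PySpark sketch of writing and reading a columnar file, with a filter that columnar formats can push down into the scan (assumes an existing SparkSession `spark` and a DataFrame `df` with the ID/Gender/Age columns from the following example; the path is hypothetical):

# Hedged sketch: write a DataFrame as Parquet, then read it back with a
# filter that the columnar reader can push down into the scan.
df.write.mode("overwrite").parquet("/tmp/people.parquet")

people = spark.read.parquet("/tmp/people.parquet")
ids = people.filter((people.Age > 30) & (people.Gender == "female")).select("ID")
ids.show()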

Example: Columnar File Formats

(Figure: a table with columns ID, Gender, Age is split into three files of 3333–3334 rows each; every file carries per-column metainformation, e.g. file 1: ID min = 1, max = 3333; Gender: female, male; Age min = 20, max = 45.)

Example: Columnar File Formats
(Figure: for the query "Select ID from table where Age > 30 and Gender = female", the per-file metainformation lets the reader skip files whose min/max Age or Gender values cannot match, so only a subset of the files is actually scanned.)
Example: Write an ORC File

H2O Integration

KNIME integrates the H2O machine learning library
H2O: open source, focus on scalability and performance
Supports many different models:
Generalized Linear Model
Gradient Boosting Machine
Random Forest
k-Means, PCA, Naive Bayes, etc. and more to come!
Includes support for MOJO model objects for deployment
Sparkling Water = H2O on Spark

The H2O Sparkling Water Integration

Conclusions
SQLite

Hadoop Hive

Spark

Stay Up-To-Date and Contribute

Follow the KNIME Community Journal on Medium: Low Code for Advanced Data Science
Daily content on data stories, data science theory, getting started with KNIME, and more; for the community, by the community

Would you like to share your data story with the KNIME community?
Contributions are always welcome!

The End
Twitter: @KNIME
