Data Flow Tools & Properties v1.6
Sub-category
• Read from file
• Read from db
• Read from API
• Read from S3
• Read from ADLS
Tool : Read from File
Delimiter : Auto-populated
• File
Ports
-> No Input port
-> 1 output Port
Output/ Result
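The auto-populated delimiter can be pictured with a minimal sketch. It assumes the delimiter is derived from the selected file format (csv, tsv, psv); the source does not say how the value is actually chosen, and the names `DEFAULT_DELIMITERS`, `auto_populate_delimiter` and `read_delimited` are hypothetical.

```python
import csv
import io

# Assumed default delimiter per delimited format; other formats have none.
DEFAULT_DELIMITERS = {"csv": ",", "tsv": "\t", "psv": "|"}


def auto_populate_delimiter(file_format):
    """Return the assumed default delimiter for a format, or None if not delimited."""
    return DEFAULT_DELIMITERS.get(file_format.lower())


def read_delimited(text, file_format):
    """Parse delimited text into a list of dicts using the auto-populated delimiter."""
    delimiter = auto_populate_delimiter(file_format)
    if delimiter is None:
        raise ValueError(f"No default delimiter for format: {file_format}")
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
```

For example, `read_delimited("id|name\n1|Ann\n", "psv")` parses the pipe-separated sample without the user typing a delimiter.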
Database Name : Drop-down (lists the aliases of the configured databases)
Table Name : Drop-down list to select the database table
Tool : Input/output port Tool : Validation
• All
Ports
-> No Input port
-> 1 output Port
Output/ Result
API Source : Drop-down (lists the API aliases defined in the Data Source)
Tool : Input/output port Tool : Validation
• All
Ports
-> No Input port
-> 1 output Port
Output/ Result
Browse File : Browse file path button (lists the files available in file storage)
File Name : Non-editable field; automatically populated with the uploaded file name
File Format : Drop-down (1. excel 2. csv 3. tsv 4. psv 5. json 6. parquet 7. xml)
Automatically detected from the file extension, but the user can change it
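A minimal sketch of the File Format behaviour described above: the format is taken from the file extension unless the user picks another value. The extension mapping (including `.xlsx`/`.xls` for excel) and the function name `resolve_file_format` are assumptions for illustration, not part of the specification.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the seven formats in the drop-down.
EXTENSION_TO_FORMAT = {
    ".xlsx": "excel",
    ".xls": "excel",
    ".csv": "csv",
    ".tsv": "tsv",
    ".psv": "psv",
    ".json": "json",
    ".parquet": "parquet",
    ".xml": "xml",
}
SUPPORTED_FORMATS = set(EXTENSION_TO_FORMAT.values())


def resolve_file_format(file_name, user_choice=None):
    """Return the user's choice if given, else the format implied by the extension."""
    if user_choice is not None:
        if user_choice not in SUPPORTED_FORMATS:
            raise ValueError(f"Unsupported file format: {user_choice}")
        return user_choice
    extension = Path(file_name).suffix.lower()
    # None means the extension is unknown and the user has to pick a format.
    return EXTENSION_TO_FORMAT.get(extension)
```

Returning `None` for an unknown extension is one possible design; it leaves the drop-down empty so that validation on File Format can flag it.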
• S3 connection Name
• Browse File
• File Name
• File format
• Delimiter
Ports
-> No Input port
-> 1 output Port
Output/ Result
Name : Drop-down (lists the ADLS connections defined in the data source)
Browse File : Browse file path button (lists the files available in file storage)
File Name : Non-editable field; automatically populated with the uploaded file name
File Format : Drop-down (1. excel 2. csv 3. tsv 4. psv 5. json 6. parquet 7. xml)
Automatically detected from the file extension, but the user can change it
Output/ Result
Sub-category
• Write to file
• Write to db
• Write to S3
• Write to ADLS
Tool : Write to File
Configuration Panel
Configuration Properties
Browse Path : Browse file path button (allows saving the file to file storage and defining its format)
File Name : Editable field; pre-filled with the name provided while saving the file
File format : Editable drop-down (1. excel 2. csv 3. tsv 4. psv 5. json 6. parquet 7. xml)
Tool : Input/output port Tool : Validation
• Browse File
• File Name
• File format
Ports
-> 1 Input port
-> no output Port
Output/ Result
Configuration Panel
Configuration Properties
Database Name : Drop-down (lists the database names)
Table Name : Drop-down list to select the database table, plus a "+ New Table" option
• Browse File
• File Name
• File format
Ports
-> 1 Input port
-> no output Port
Output/ Result
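The Table Name choice for writing to a database (existing table or "+ New Table") can be sketched as follows. This is an illustration only: it uses SQLite from the Python standard library as a stand-in for whichever database the alias points to, and `write_to_table` and its `create_new` flag are hypothetical names.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table or column name for SQLite, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def write_to_table(conn, table_name, rows, create_new=False):
    """Insert rows (dicts sharing the same keys) into table_name.

    create_new=True mimics the "+ New Table" option by creating the table
    from the column names of the first row if it does not exist yet.
    """
    if not rows:
        return 0
    columns = list(rows[0].keys())
    table_sql = quote_identifier(table_name)
    column_sql = ", ".join(quote_identifier(c) for c in columns)
    if create_new:
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table_sql} ({column_sql})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO {table_sql} ({column_sql}) VALUES ({placeholders})",
        [tuple(row[c] for c in columns) for row in rows],
    )
    conn.commit()
    return len(rows)
```

A call such as `write_to_table(sqlite3.connect(":memory:"), "customers", rows, create_new=True)` creates the table on first use; with `create_new=False` the table must already exist.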
Configuration Panel
Configuration Properties
Browse Path : Browse file path button (allows saving the file to file storage and defining its format)
File Name : Editable field; pre-filled with the name provided while saving the file
File format : Editable drop-down (1. excel 2. csv 3. tsv 4. psv 5. json 6. parquet 7. xml)
Tool : Input/output port Tool : Validation
• Browse File
• File Name
• File format
Ports
-> 1 Input port
-> no output Port
Output/ Result
Configuration Panel
Configuration Properties
ADLS Connection : Drop-down (lists the ADLS connections defined in the data source)
Browse Path : Browse file path button (allows saving the file to file storage and defining its format)
File Name : Editable field; pre-filled with the name provided while saving the file
File format : Editable drop-down (1. excel 2. csv 3. tsv 4. psv 5. json 6. parquet 7. xml)
Tool : Input/output port Tool : Validation
• Browse File
• File Name
• File format
Ports
-> 1 Input port
-> no output Port
Output/ Result
Sub-category
• Join
• Set Operator
• Record ID
• Sort
• Filter rows
• Filter Attribute
Tool : Join
Configuration Panel
Configuration Properties
Join Type : Drop-down (Inner Join; Left Outer Join; Right Outer Join; Full Outer Join)
• ALL
Ports
-> 2 Input ports
-> 1 output Port
Output/ Result
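The four join types in the drop-down can be illustrated with a small sketch over lists of dicts (one list per input port). The function name `join_rows`, the single-key join and the rule that left values win on clashing column names are assumptions made for the example.

```python
def join_rows(left, right, key, join_type="inner"):
    """Join two lists of dicts on one key column.

    join_type is one of "inner", "left", "right", "full". Columns missing
    on one side are filled with None. On a column name clash (other than
    the key) the left value is kept, which is an arbitrary choice here.
    """
    if join_type not in {"inner", "left", "right", "full"}:
        raise ValueError(f"Unknown join type: {join_type}")
    left_columns = {c for row in left for c in row}
    right_columns = {c for row in right for c in row}

    # Index the right input by key value so each left row finds its matches.
    right_index = {}
    for row in right:
        right_index.setdefault(row[key], []).append(row)

    result = []
    matched_right = set()
    for left_row in left:
        matches = right_index.get(left_row[key], [])
        for right_row in matches:
            matched_right.add(id(right_row))
            result.append({**right_row, **left_row})
        if not matches and join_type in {"left", "full"}:
            result.append({**dict.fromkeys(right_columns), **left_row})

    if join_type in {"right", "full"}:
        for right_row in right:
            if id(right_row) not in matched_right:
                result.append({**dict.fromkeys(left_columns), **right_row})
    return result
```

The two input lists correspond to the two input ports, and the returned list to the single output port.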
Configuration Panel
Configuration Properties
Operator Type : Displays the list of set operators (Union, Union All, Intersect, Minus)
Config : Drop-down (Auto Config by Name; Auto Config by Position; Manual Config)
No panel is required for the auto config options; for manual config, see the reference below
Reference
Tool : Input/output port Tool : Validation
• ALL
Ports
-> 2 Input ports
-> 1 output Port
Output/ Result
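A minimal sketch of the set operators and of the difference between config by name and config by position. Rows are plain tuples; with config by position they are used as they are, while config by name first reorders the second input to match the column order of the first. The function names and the requirement that both inputs carry the same column names are assumptions.

```python
def align_by_name(columns_a, columns_b, rows_b):
    """Reorder the tuples in rows_b so their columns follow columns_a."""
    if set(columns_a) != set(columns_b):
        raise ValueError("Both inputs must have the same column names")
    positions = [columns_b.index(column) for column in columns_a]
    return [tuple(row[p] for p in positions) for row in rows_b]


def apply_set_operator(rows_a, rows_b, operator):
    """Apply union, union all, intersect or minus to two lists of tuples.

    All operators except "union all" remove duplicate rows and keep the
    order of first appearance.
    """
    operator = operator.lower()
    if operator == "union all":
        return list(rows_a) + list(rows_b)
    if operator == "union":
        return list(dict.fromkeys(list(rows_a) + list(rows_b)))
    in_b = set(rows_b)
    if operator == "intersect":
        return list(dict.fromkeys(row for row in rows_a if row in in_b))
    if operator == "minus":
        return list(dict.fromkeys(row for row in rows_a if row not in in_b))
    raise ValueError(f"Unknown set operator: {operator}")
```

Manual config would replace `align_by_name` with a user-supplied column mapping, which is not sketched here.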
Configuration Panel
Configuration Properties
Record ID Field : Editable text field with default value “Record ID”
Position : Drop-down with “First Column” and “Last Column”; default is “First Column”
Reference
Tool : Input/output port Tool : Validation
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
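The Record ID properties map onto a short sketch: a sequential ID is added under the configured field name, either as the first or the last column. Numbering from 1 and the function name `add_record_id` are assumptions; the source does not state the starting value.

```python
def add_record_id(rows, field_name="Record ID", position="first", start=1):
    """Return copies of the rows with a sequential ID column added.

    position is "first" or "last"; start=1 is an assumed starting value.
    """
    if position not in {"first", "last"}:
        raise ValueError(f"Unknown position: {position}")
    result = []
    for record_id, row in enumerate(rows, start=start):
        if field_name in row:
            raise ValueError(f"Field already exists: {field_name}")
        if position == "first":
            result.append({field_name: record_id, **row})
        else:
            result.append({**row, field_name: record_id})
    return result
```

Because Python dicts keep insertion order, the position setting is reflected directly in the column order of each output row.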
Configuration Panel
Configuration Properties
Allows changing the order of the sort fields and adding (+) or removing (-) any field
Reference
Tool : Input/output port Tool : Validation
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
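Why the order of the sort fields matters can be seen in a small sketch: the first field in the list has the highest priority, and later fields only break ties. The per-field "asc"/"desc" direction and the name `sort_rows` are assumptions, since the source only mentions the field sequence.

```python
def sort_rows(rows, sort_fields):
    """Sort a list of dicts by an ordered list of (field_name, direction).

    direction is "asc" or "desc" (an assumed option). The first entry in
    sort_fields has the highest priority.
    """
    result = list(rows)
    # Python's sort is stable, so sorting by the lowest-priority field first
    # and the highest-priority field last gives the combined ordering.
    for field, direction in reversed(sort_fields):
        result.sort(key=lambda row: row[field], reverse=(direction == "desc"))
    return result
```

Swapping two entries in `sort_fields` corresponds to changing the field sequence in the configuration panel, and adding or dropping an entry to the +/- controls.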
Configuration Panel
Configuration Properties
And/or + ( Add )
Reference
Tool : Input/output port Tool : Validation
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
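A rough sketch of row filtering with conditions combined by And/Or. It assumes each condition is a (field, operator, value) triple and that one combiner applies to all conditions; the source only shows "And/or" and an Add button, so the operator set and the name `filter_rows` are hypothetical.

```python
import operator

# Assumed comparison operators offered per condition.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def filter_rows(rows, conditions, combine="and"):
    """Keep the rows that satisfy the conditions.

    conditions is a list of (field, operator_symbol, value) triples and
    combine is "and" or "or". With no conditions every row is kept.
    """
    if combine not in {"and", "or"}:
        raise ValueError(f"Unknown combiner: {combine}")
    if not conditions:
        return list(rows)
    combiner = all if combine == "and" else any

    def matches(row):
        return combiner(OPERATORS[op](row[field], value) for field, op, value in conditions)

    return [row for row in rows if matches(row)]
```

Each click on "+ (Add)" would append one more triple to `conditions`.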
Configuration Panel
Configuration Properties
Metadata preview with “checkbox” : Lists the metadata of the source connection (Field Name; Data Type; Size)
with a checkbox per field, allowing the user to select and deselect the fields carried to the next tool
Reference
Tool : Input/output port Tool : Validation
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
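The checkbox selection can be sketched as a projection: only the checked fields are carried to the next tool, in their original metadata order. The metadata keys (`field_name`, `data_type`, `size`) and the function name `filter_attributes` are assumptions for illustration.

```python
def filter_attributes(rows, metadata, selected):
    """Keep only the selected fields of each row.

    metadata is a list of dicts with "field_name", "data_type" and "size";
    selected is the collection of checked field names.
    """
    known = [entry["field_name"] for entry in metadata]
    unknown = set(selected) - set(known)
    if unknown:
        raise ValueError(f"Unknown fields selected: {sorted(unknown)}")
    # Preserve the metadata order rather than the order of the clicks.
    kept = [name for name in known if name in selected]
    return [{name: row[name] for name in kept} for row in rows]
```

Keeping the metadata order is one possible choice; the source does not say whether the output order can differ from the input order.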
Sub-category
• Advance Match
• Dedupe
Tool : Advance Match
Configuration Panel
Configuration Properties
Join Type : Drop-down (Inner Join; Left Outer Join; Right Outer Join; Full Outer Join)
• ALL
Ports
-> 2 Input ports
-> 1 output Port
Output/ Result
Configuration Panel
Configuration Properties
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
Sub-category
• Filter
• Regex
• Replace
Tool : Replace
Configuration Panel
Configuration Properties
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
Configuration Panel
Configuration Properties
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
Sub-category
• Addition
• Subtraction
• Division
• Multiplication
Tool : Addition
Configuration Panel
Configuration Properties
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
Configuration Panel
Configuration Properties
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
Configuration Panel
Configuration Properties
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
Configuration Panel
Configuration Properties
• ALL
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
Sub-category
• AVG
• COUNT
• MAX
• MIN
• STDDEV
• SUM
• VARIANCE
• MEAN
Tool : Same for all other group functions
Configuration Panel
Configuration Properties
Select group fields : Drop-down list with multi-select; lists all the fields received from the previous port/node
Apply on : Drop-down list of all the fields received from the previous port/node
Reference
Tool : Input/output port Tool : Validation
• Apply on
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
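How "Select group fields" and "Apply on" work together for the group functions can be sketched with the standard library. Treating AVG and MEAN as the same function, using the sample forms of STDDEV and VARIANCE, and the output column name `FUNCTION(field)` are all assumptions; the source does not specify them.

```python
import statistics
from collections import defaultdict

AGGREGATES = {
    "AVG": statistics.mean,
    "MEAN": statistics.mean,          # assumed to be identical to AVG
    "COUNT": len,
    "MAX": max,
    "MIN": min,
    "SUM": sum,
    "STDDEV": statistics.stdev,       # sample standard deviation (assumption)
    "VARIANCE": statistics.variance,  # sample variance (assumption)
}


def group_aggregate(rows, group_fields, apply_on, function):
    """Group rows by group_fields and aggregate the apply_on field."""
    name = function.upper()
    aggregate = AGGREGATES[name]
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in group_fields)
        groups[key].append(row[apply_on])
    result = []
    for key, values in groups.items():
        out = dict(zip(group_fields, key))
        out[f"{name}({apply_on})"] = aggregate(values)
        result.append(out)
    return result
```

Note that the sample forms of STDDEV and VARIANCE need at least two values per group; with this sketch a single-row group raises `statistics.StatisticsError`.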
Sub-category
• Python
• PySpark
• SQL
Tools:
Sub-category
• Add attribute
• Condition
Tool : Add Attribute
Configuration Panel
Configuration Properties
• Apply on
Ports
-> 1 Input port
-> 1 output Port
Output/ Result
Configuration Panel
Configuration Properties
Port 2
Config : Default Value (radio button, selected automatically)
Reference
Tool : Input/output port Tool : Validation
• ALL
Ports
-> 1 Input port
-> 2 output Ports
Output/ Result
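The one-input, two-output shape of the Condition tool can be sketched as a simple router. This assumes rows that satisfy the condition go to the first output port and all remaining rows go to Port 2 as the default; the source does not spell out this behaviour, and `route_by_condition` is a hypothetical name.

```python
def route_by_condition(rows, predicate):
    """Split rows into (port_1, port_2).

    Rows for which predicate(row) is true go to port_1; every other row
    falls through to port_2, treated here as the default output.
    """
    port_1, port_2 = [], []
    for row in rows:
        if predicate(row):
            port_1.append(row)
        else:
            port_2.append(row)
    return port_1, port_2
```

For example, `route_by_condition(rows, lambda row: row["amount"] >= 100)` would send large amounts to the first port and everything else to the second.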