Ab Initio Components
Ab Initio graph building involves two distinct sets of components:
1. Dataset Components: These components act as containers, holding and managing data within the graph. They play a crucial role in data storage and retrieval.
   o Input File: This component holds the data your program will work with.
   o Output File: This component stores the processed data from your program.
   o Lookup File: This component allows you to quickly find specific data records based on certain criteria.
   o Intermediate File: This component represents one or multiple serial files or a multifile of intermediate results that a graph writes during execution and saves for your review afterward.
2. Program Components: These components are responsible for data processing and transformations. They manipulate the data according to your desired outcomes within the graph.
   o Partition Components: These components are used to divide large datasets into smaller, manageable subsets. This partitioning can significantly improve performance, especially when dealing with massive amounts of data.
      - Partition by Key: Reads records from the input port and distributes data records to its output flow partitions according to key values.
      - Partition by Round Robin: Distributes blocks of data records evenly to each output flow in a round-robin fashion, without requiring a partitioning key.
      - Partition by Expression: Distributes data records based on a specified expression.
      - Partition by Range: Distributes data records based on the ranges of key values specified for each partition.
   o Departition Components: These components are essentially the opposite of partitioning components. They combine multiple flow partitions of data records into a single flow.
      - Concatenate: This component appends multiple flow partitions one after another, creating a single flow of concatenated data.
      - Gather: This component collects data records from multiple flow partitions arbitrarily, without preserving any specific order.
      - Interleave: This component combines data records from multiple flow partitions in a round-robin fashion, alternating blocks of data from each partition.
      - Merge: This component combines data records from multiple flow partitions that have been sorted according to the same key specifier, maintaining the sort order.

Transform Components
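Before turning to transforms, the partition and departition components listed above can be pictured with a small sketch. This is plain Python, not Ab Initio DML or any Ab Initio API: every function name is hypothetical, the hash-based assignment in partition_by_key is an assumption (the text only says records are distributed "according to key values"), and the boundary convention in partition_by_range is likewise assumed for illustration.

```python
import bisect
import heapq
import zlib
from itertools import chain


def partition_by_key(records, key, n):
    """Send each record to a partition chosen from a hash of its key value."""
    parts = [[] for _ in range(n)]
    for rec in records:
        # crc32 is stable across runs, unlike Python's built-in hash() for strings.
        idx = zlib.crc32(str(rec[key]).encode("utf-8")) % n
        parts[idx].append(rec)
    return parts


def partition_by_round_robin(records, n, block_size=1):
    """Deal blocks of block_size records to each partition in turn."""
    parts = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[(i // block_size) % n].append(rec)
    return parts


def partition_by_range(records, key, split_points):
    """Partition i receives keys <= split_points[i]; the last partition gets the rest.

    split_points must be sorted ascending (assumed convention).
    """
    parts = [[] for _ in range(len(split_points) + 1)]
    for rec in records:
        parts[bisect.bisect_left(split_points, rec[key])].append(rec)
    return parts


def concatenate(parts):
    """Append the partitions one after another."""
    return list(chain.from_iterable(parts))


def interleave(parts, block_size=1):
    """Take block_size records from each partition in turn until all are drained."""
    out = []
    offset = 0
    while any(offset < len(p) for p in parts):
        for p in parts:
            out.extend(p[offset:offset + block_size])
        offset += block_size
    return out


def merge(parts, key):
    """Combine partitions that are each already sorted on key, keeping sort order."""
    return list(heapq.merge(*parts, key=lambda rec: rec[key]))


# Illustrative data only.
records = [{"id": i, "region": r}
           for i, r in enumerate(["east", "west", "east", "north", "west", "east"])]
round_robin_parts = partition_by_round_robin(records, 2)
restored = interleave(round_robin_parts)   # same order as the original flow
appended = concatenate(round_robin_parts)  # partition 0 first, then partition 1
```

In this sketch, interleave with the same block size undoes partition_by_round_robin, while concatenate keeps each partition contiguous. Gather is omitted because it promises no particular order; any of the departition functions here would be one valid result.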
Transform components are the workhorses of Ab Initio, responsible for manipulating
and transforming data within a graph. They perform a wide range of operations, from simple data cleaning to complex data analysis.
Data Manipulation Components:
   o Filter: Selects data records based on specified criteria.
   o Sort: Sorts data records according to a specified key.
   o Join: Combines data from multiple sources based on a common key.
   o Aggregate: Calculates summary statistics for groups of data records.
   o Project: Selects specific columns from a dataset.
   o Reformat: Converts data from one format to another (e.g., text to numeric).
   o Lookup: Retrieves data from a lookup table based on a specified key.
   o Replicate: Creates multiple copies of a dataset.
   o Union: Combines data from multiple datasets into a single dataset.
   o Set Operations: Performs set operations like intersection, union, and difference on datasets.

Data Transformation Components:
   o Decode: Decodes data that has been encoded using a specific format.
   o Encode: Encodes data into a specific format.
   o Validate: Validates data against predefined rules and constraints.
   o Derive: Creates new data fields based on existing data.
   o Mask: Masks sensitive data to protect privacy.
   o Encrypt: Encrypts data to secure it from unauthorized access.
   o Decrypt: Decrypts encrypted data.
   o Format: Formats data according to specific requirements (e.g., date formatting, number formatting).

Control Flow Components:
   o Decision: Branches the data flow based on a condition.
   o Loop: Repeats a set of operations until a condition is met.

Additional Utilities
   o Filter by Expression: This component is used to filter data records based on specific criteria.
   o Join: This component combines data from multiple sources based on a common key.
   o Reformat: This component allows you to rearrange data by adding, removing, combining, or changing fields.
   o Rollup: This component summarizes data into a smaller form by grouping and calculating totals.
   o Aggregate: Similar to Rollup but with less control over data processing.
   o Scan: This component generates a series of summary records that accumulate data over time.
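
The difference between Filter by Expression, Rollup, and Scan is easier to see on a few records. The sketch below is plain Python rather than Ab Initio DML; the function names, the field names (region, amount), and the choice of count and total as the summary are all assumptions made for illustration. It treats Rollup as producing one summary record per key group and Scan as producing one record per input record with a running total inside its group.

```python
from itertools import groupby
from operator import itemgetter


def filter_by_expression(records, predicate):
    """Keep only the records for which the predicate is true."""
    return [rec for rec in records if predicate(rec)]


def rollup(records, key, value):
    """Emit one summary record (count and total) per key group."""
    out = []
    ordered = sorted(records, key=itemgetter(key))
    for group_key, group in groupby(ordered, key=itemgetter(key)):
        rows = list(group)
        out.append({
            key: group_key,
            "count": len(rows),
            "total": sum(row[value] for row in rows),
        })
    return out


def scan(records, key, value):
    """Emit one record per input record, carrying a running total per key group."""
    out = []
    # sorted() is stable, so records keep their input order inside each group.
    ordered = sorted(records, key=itemgetter(key))
    for group_key, group in groupby(ordered, key=itemgetter(key)):
        running = 0
        for row in group:
            running += row[value]
            out.append({key: group_key, value: row[value], "running_total": running})
    return out


# Illustrative data only.
sales = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
    {"region": "west", "amount": 3},
]

large_sales = filter_by_expression(sales, lambda rec: rec["amount"] >= 5)
totals = rollup(sales, "region", "amount")
running = scan(sales, "region", "amount")
```

Here rollup returns two records (east with total 17, west with total 8), while scan returns four, one per input record, with the last record of each group carrying the same total that rollup reports.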