
Component 2. Pipeline 3. Data Component

There are 3 types of parallelism: component parallelism where different graph components run simultaneously on separate data, pipeline parallelism where downstream components can process records written by upstream components allowing both to run in parallel, and data parallelism achieved through multi-file systems (MFS) where a file is partitioned across multiple disks or nodes allowing parallel processing of each partition.

Uploaded by

Kv Kad
Copyright
© All Rights Reserved

Parallelism: There are 3 types

1. Component
2. Pipeline
3. Data
Component
Component Parallelism occurs when different components of a graph, such as Filter, Rollup,
or Join, run simultaneously on separate data in the same phase.
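
As a rough analogy outside Ab Initio, the sketch below runs two independent steps at the same time, each on its own data. The function names, the sample records, and the use of Python threads are all assumptions made for illustration; this is not Ab Initio code and does not reflect how the product schedules components.

```python
from concurrent.futures import ThreadPoolExecutor


def filter_component(records):
    """Hypothetical filter step: keep records with a positive amount."""
    return [r for r in records if r["amount"] > 0]


def rollup_component(records):
    """Hypothetical rollup step: sum the amounts per key."""
    totals = {}
    for r in records:
        totals[r["key"]] = totals.get(r["key"], 0) + r["amount"]
    return totals


# Two separate, made-up data sets, one per branch of the "graph".
orders = [{"key": "a", "amount": 5}, {"key": "b", "amount": -1}]
payments = [{"key": "x", "amount": 2}, {"key": "x", "amount": 3}]


def run_branches():
    # Both steps are submitted together, so neither waits for the other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        filtered = pool.submit(filter_component, orders)
        totals = pool.submit(rollup_component, payments)
        return filtered.result(), totals.result()
```

The two branches share no data, which is what lets them run side by side without coordination.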
Pipeline
Each component in the pipeline continuously reads from upstream components, processes data,
and writes to downstream components. Since a downstream component can process records
previously written by an upstream component, both components can operate in parallel.
Component and Pipeline Parallelism are the default in Ab Initio; the programmer does not have
any control over them.
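
The pipeline behaviour described above can be imitated with two threads joined by a queue. This is only a sketch under assumptions: the stage names, the doubling transform, and the buffer size are invented for the example, and it is written in Python rather than as an Ab Initio graph.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the record stream


def upstream(out_q, records):
    """Hypothetical upstream stage: writes records one at a time."""
    for rec in records:
        out_q.put(rec)
    out_q.put(SENTINEL)


def downstream(in_q, results):
    """Hypothetical downstream stage: processes each record as it arrives."""
    while True:
        rec = in_q.get()
        if rec is SENTINEL:
            break
        results.append(rec * 2)


def run_pipeline(records):
    # A small bounded buffer: the upstream stage blocks when it is full,
    # so the downstream stage has to consume while upstream is still writing.
    q = queue.Queue(maxsize=2)
    results = []
    producer = threading.Thread(target=upstream, args=(q, records))
    consumer = threading.Thread(target=downstream, args=(q, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

Calling `run_pipeline([1, 2, 3, 4])` returns the doubled records in their original order, because a single consumer reads from a first-in, first-out queue.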
Data
Data Parallelism is achieved using Multi File System (MFS). A multifile is a parallel file that is
composed of individual files on different disks and/or nodes. The individual files are partitions of
the multifile. Each multifile contains one control partition and one or more data partitions.
The control partition holds pointers to the data partitions.
If there are 4 data partitions, the MFS is called a 4-Way MFS.
If there are 8 data partitions, the MFS is called an 8-Way MFS, and so on.
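
The idea of an N-way split can be sketched in plain Python: records are divided into a fixed number of partitions and each partition is processed on its own. The round-robin split, the function names, and the per-partition sum are assumptions chosen for illustration; a real multifile system stores its partitions as files on disks or nodes, which this sketch does not attempt to model.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_round_robin(records, ways):
    """Split records into `ways` partitions, dealing them out in turn."""
    parts = [[] for _ in range(ways)]
    for i, rec in enumerate(records):
        parts[i % ways].append(rec)
    return parts


def process_partition(part):
    """Hypothetical per-partition work: sum the records in one partition."""
    return sum(part)


def run_data_parallel(records, ways=4):
    parts = partition_round_robin(records, ways)
    # One worker per partition; threads are used here only for simplicity.
    with ThreadPoolExecutor(max_workers=ways) as pool:
        partials = list(pool.map(process_partition, parts))
    return parts, partials
```

With `ways=4` this mirrors a 4-way layout: four partitions, each handled independently, with the partial results combined afterwards if a single total is needed.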
