Informatica Question - Answer
Here are the main differences between a Connected Lookup and an Unconnected Lookup:
Connected Lookup
A Connected Lookup is a Lookup transformation that is connected to the mapping pipeline. It receives
input data from the pipeline, performs the lookup, and returns the result to the pipeline.
Characteristics:
- Receives input data from the pipeline
- Performs the lookup in real-time
- Returns the result to the pipeline
- Supports caching for better performance
Unconnected Lookup
An Unconnected Lookup is a Lookup transformation that is not connected to the mapping pipeline. It is
called from another transformation (typically an Expression) with a :LKP expression and returns a single
value through its return port.
Characteristics:
- Does not receive input data directly from the pipeline; it receives arguments from the :LKP call
- Is invoked only when the calling expression executes, so it can be called conditionally
- Returns one value through its return port
- Supports a static cache only (a dynamic cache is not supported)
Key Differences:
1. Input Data: A Connected Lookup receives input data from the pipeline, while an Unconnected Lookup
receives arguments from a :LKP expression in another transformation.
2. Invocation: A Connected Lookup runs for every row that passes through it, while an Unconnected
Lookup runs only when it is called, so it can be invoked conditionally.
3. Caching: A Connected Lookup can use a static or dynamic cache, while an Unconnected Lookup
supports only a static cache.
4. Output: A Connected Lookup can return multiple columns to the pipeline, while an Unconnected
Lookup returns a single value through its return port.
When to use each:
- Use a Connected Lookup when you need to return multiple columns to the pipeline for every input row.
- Use an Unconnected Lookup when you need to call the lookup conditionally or reuse the same lookup
from several expressions, returning a single value.
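For illustration, an unconnected Lookup is called from another transformation with a :LKP expression;
the lookup, port, and column names below are hypothetical:
IIF(ISNULL(CUST_NAME), :LKP.LKP_GET_CUSTOMER_NAME(CUST_ID), CUST_NAME)
Here the lookup is invoked only for rows where CUST_NAME is NULL, which is the kind of conditional
call a Connected Lookup cannot make.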
Stop
- Stops the workflow or session gracefully: the Integration Service stops reading source data but finishes
processing and writing the data it has already read.
- Ensures data integrity by committing or rolling back transactions as needed.
- Is the normal way to end a running workflow or session so that it can be restarted cleanly later.
Abort
- Ends the workflow or session quickly: the Integration Service stops processing and, if the session does
not shut down within the timeout period (60 seconds by default), it kills the DTM process.
- May result in incomplete or uncommitted work that has to be recovered or reloaded.
- Should be used with caution, typically only when a Stop request does not respond.
Key differences:
1. Immediate termination: Abort stops the workflow or session immediately, while Stop allows currently
running tasks to complete.
2. Data integrity: Stop ensures data integrity by handling transactions properly, while Abort may lead to
data inconsistencies.
3. Usage: Stop is the normal way to end a running workflow or session cleanly, while Abort should be
reserved for cases where Stop does not respond or an immediate shutdown is required.
When to use each:
- Use Stop when you need to end a workflow or session cleanly while preserving data integrity.
- Use Abort when you must terminate a workflow or session immediately and are prepared to recover or
reload any incomplete data.
Static Cache
1. Built once: The lookup cache is built from the lookup source when the session first needs it (or at
session start if pre-building is enabled) and is not changed after that.
2. Read-only: The Integration Service only reads from the cache; it never inserts or updates rows in it.
3. No runtime updates: Rows written to the target during the session are not reflected in the cache.
4. Faster lookup: Because the cache never changes, lookups carry less maintenance overhead.
Dynamic Cache
1. Built and then updated: The cache is built like a static cache, but the Integration Service inserts or
updates rows in it as rows pass through the transformation.
2. NewLookupRow port: The transformation adds a NewLookupRow port that indicates whether each row
was inserted into the cache, updated in the cache, or left unchanged.
3. Stays in sync with the target: Typically used when the lookup table is also the target, so the cache
reflects rows loaded during the same session.
4. Slower lookup: Maintaining the cache adds overhead, so lookups can be slower than with a static
cache.
Key differences:
1. Cache updates: A static cache is never changed after it is built, while a dynamic cache is updated as
rows are processed.
2. Target synchronization: A dynamic cache can stay in sync with a target that is loaded in the same
session; a static cache cannot.
3. Supported lookups: Both connected and unconnected Lookups can use a static cache, but only a
connected Lookup can use a dynamic cache.
4. Lookup performance: A static cache generally provides faster lookups, while a dynamic cache adds
maintenance overhead.
When to use each:
1. Use Static Cache:
- When the lookup data is relatively small and doesn't change frequently.
- When fast lookup performance is critical.
2. Use Dynamic Cache:
- When the lookup table is also the target and you need to detect or avoid duplicate inserts.
- When the cache must reflect rows inserted or updated during the same session run.
Reusable Transformation
1. Single transformation: A Reusable Transformation is a single transformation that can be reused across
multiple mappings.
2. Specific function: It performs a specific function, such as aggregation, filtering, or data conversion.
3. Configurable: Reusable Transformations can be configured to accept input parameters and produce
output values.
4. Used within a mapping: Reusable Transformations are used within a mapping to perform a specific
task.
Mapplet
1. Group of transformations: A Mapplet is a group of transformations that can be reused across multiple
mappings.
2. Complex logic: Mapplets can contain complex logic, including multiple transformations, variables, and
parameters.
3. Reusable logic: Mapplets provide reusable logic that can be applied to different datasets and
mappings.
4. Used as a building block: Mapplets can be used as building blocks to create more complex mappings
and workflows.
Key differences
1. Scope: Reusable Transformations are single transformations, while Mapplets are groups of
transformations.
2. Complexity: Mapplets can contain complex logic, while Reusable Transformations are typically simpler.
3. Reusability: Both Reusable Transformations and Mapplets are reusable, but Mapplets provide more
flexibility and customization options.
4. Usage: Reusable Transformations are used within a mapping, while Mapplets can be used as building
blocks to create more complex mappings and workflows.
7. How does the Informatica server sort string values in a Rank transformation?
In Informatica, the Rank transformation sorts string values based on the session properties and the
database code page. Here's how it works:
Sorting Mechanism
1. Data Movement Mode: When the Integration Service runs in ASCII mode, it sorts string values using a
binary (ASCII) sort order.
2. Unicode Mode: When the Integration Service runs in Unicode mode, it sorts string values according to
the sort order configured for the session, which depends on the code page of the data.
3. Case Sensitivity: By default, the comparison is case sensitive. You can change this behavior with the
Case Sensitive String Comparison property in the Rank transformation.
Sorting Order
The sorting order for string values in the Rank transformation is as follows:
1. Null values: Null values are sorted first.
2. Empty strings: Empty strings are sorted next.
3. String values: String values are sorted according to the sort order described above (binary order in
ASCII mode, the session sort order in Unicode mode).
Example
Suppose you have a string column with the following values:
- Apple
- Banana
- Cherry
- Date
- NULL
- "" (empty string)
The Rank transformation would sort these values as follows:
1. NULL
2. "" (empty string)
3. Apple
4. Banana
5. Cherry
6. Date
Note that the sorting order may vary depending on the session properties and database code page.
Passive Sorter
A Sorter transformation is considered passive when:
1. No Distinct option: The Distinct option is not selected, so no rows are dropped.
2. No data rejection: All input rows are passed to the output.
3. Only sorting: The Sorter only reorders the data based on the specified sort keys.
In this case, the Sorter transformation is passive because it does not modify the data or reject any rows.
Active Sorter
A Sorter transformation is considered active when:
1. Distinct option: The Distinct option is selected, so duplicate rows are discarded.
2. Data rejection: Some input rows are not passed to the output, so the row count can change.
In this case, the Sorter transformation is active because it can change the number of rows; for this
reason, Informatica classifies the Sorter as an active transformation.
Key differences
1. Row count: An active Sorter can change the number of rows, while a passive Sorter passes every row
through unchanged.
2. Data rejection: An active Sorter drops duplicate rows, while a passive Sorter does not drop any rows.
3. Distinct option: Selecting the Distinct option is what makes the Sorter behave as an active
transformation.
Components
1. Client Tier: Informatica PowerCenter Client, Informatica Developer, or other client tools connect to the
Informatica Server.
2. Web Tier: Informatica Web Services provide a web-based interface for administering and monitoring
the Informatica Server.
3. Application Tier: Informatica Server, which includes the following components:
1. Integration Service: Executes data integration tasks, such as workflows and mappings.
2. Repository Service: Manages the Informatica Repository, which stores metadata, mappings, and
workflows.
3. Security Service: Handles authentication, authorization, and encryption for the Informatica Server.
4. Database Tier: Informatica Repository Database stores metadata, mappings, and workflows.
Architecture
1. N-Tier Architecture: Informatica Server follows an n-tier architecture, allowing for scalability and
flexibility.
2. Distributed Architecture: Informatica Server can be deployed in a distributed environment, with
multiple nodes and services.
3. Load Balancing: Informatica Server supports load balancing, ensuring high availability and
performance.
4. Failover: Informatica Server provides failover capabilities, minimizing downtime and ensuring business
continuity.
Benefits
1. Scalability: Informatica Server architecture allows for horizontal scaling, handling large volumes of
data and users.
2. High Performance: Optimized for performance, Informatica Server ensures fast data processing and
integration.
3. Security: Robust security features, including encryption and access control, protect sensitive data.
4. Flexibility: Informatica Server supports various data sources, targets, and formats, making it a versatile
data integration platform.
10. In update strategy Relational table or flat file which gives us more
performance? Why?
In Informatica, when using an Update Strategy transformation to update a relational table or a flat file,
the relational table typically provides better performance. Here's why:
11. What are the out put files that the Informatica server creates during running a
session?
When running a session, the Informatica server creates several output files that record the session
execution, errors, and data processing. Common output files include:
- Session log file: load statistics, errors, and thread activity for the session.
- Workflow log file: information about the workflow run that contains the session.
- Reject (bad) files: rows rejected by the writer or the target database.
- Cache files: index and data cache files for Lookup, Aggregator, Joiner, Rank, and Sorter transformations.
- Target output files: created when the session writes to flat file targets.
- Performance detail file: created when performance details are enabled for the session.
- Control and indicator files: created when the session uses an external loader or flat file targets
configured to produce them.
12. Can you explain what are error tables in Informatica are and how we do error
handling in Informatica?
Error tables in Informatica are tables that store data that fails validation or processing during a mapping
or workflow execution. These tables help in error handling and provide a way to track and manage
errors.
Types of Error Tables
1. Error Table: Stores data that fails validation or processing due to errors such as invalid data, data type
mismatch, or constraint violations.
2. Reject Table: Stores data that is rejected during processing, such as duplicate records or invalid data.
3. Exception Table: Stores data that fails processing due to exceptions such as database connectivity
issues or network errors.
Error Handling in Informatica
1. Error Handling Options: Informatica provides various error handling options, such as:
- Abort on Error: Stops the workflow or mapping execution when an error occurs.
- Continue on Error: Continues the workflow or mapping execution even if an error occurs.
- Use Error Table: Stores error data in an error table.
2. Error Tables Configuration: Error tables can be configured to store specific error data, such as error
codes, error messages, and error timestamps.
3. Error Handling Transformations: Informatica provides various transformations for error handling, such
as:
- Router Transformation: Routes data to different paths based on error conditions.
- Filter Transformation: Filters out data that meets specific error conditions.
- Expression Transformation: Uses expressions to handle error conditions.
Best Practices
1. Use Error Tables: Use error tables to store error data and track errors.
2. Configure Error Handling Options: Configure error handling options to suit your workflow or mapping
requirements.
3. Use Error Handling Transformations: Use error handling transformations to handle error conditions
and route data accordingly.
4. Monitor and Analyze Error Data: Monitor and analyze error data to identify and resolve errors.
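As a small sketch of transformation-based error handling (port, group, and target names are
hypothetical), a Router transformation could send invalid rows to an error target with a group condition
such as:
INVALID_ROWS: ISNULL(CUSTOMER_ID) OR NOT IS_DATE(ORDER_DATE_STR, 'YYYY-MM-DD')
Rows that do not match this condition fall into the Router's default group and continue to the main
target, while the INVALID_ROWS group is connected to the error table.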
13. Difference between constraint base loading and target load plan?
In Informatica, Constraint-Based Loading (CBL) and Target Load Plan (TLP) are two different approaches
to loading data into a target system.
Key differences
1. Loading order: CBL determines the loading order automatically, while TLP requires manual definition.
2. Complexity: CBL supports complex relationships, while TLP supports simple relationships.
3. Manual effort: CBL reduces manual effort, while TLP requires manual effort.
4. Flexibility: TLP provides more flexibility in defining the loading order.
IIF Function
1. Syntax: IIF(condition, true_value, false_value)
2. Usage: Evaluates a condition and returns one of two values based on the condition.
3. Flexibility: Limited to two possible outcomes (true or false).
4. Readability: Simple and easy to read.
DECODE Function
1. Syntax: DECODE(column, search1, result1, search2, result2, ..., default)
2. Usage: Compares a column value to multiple search values and returns a corresponding result.
3. Flexibility: Can handle multiple search values and return different results.
4. Readability: Can be more complex and harder to read, especially with multiple search values.
Key differences
1. Conditional logic: IIF uses a simple true/false condition, while DECODE uses multiple search values.
2. Flexibility: DECODE is more flexible, handling multiple search values and results.
3. Readability: IIF is simpler and easier to read, while DECODE can be more complex.
When to use each:
1. Use IIF:
- For simple true/false conditions.
- When you need to evaluate a single condition.
2. Use DECODE:
- For more complex conditional logic.
- When you need to compare a column value to multiple search values.
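As a quick illustration (column and value names are hypothetical), the two functions could be used like
this in an Expression transformation:
IIF(SALES_AMT > 10000, 'HIGH', 'LOW')
DECODE(REGION_CODE, 'N', 'North', 'S', 'South', 'E', 'East', 'W', 'West', 'Unknown')
The IIF expression can only split rows into two buckets, while the DECODE expression maps several codes
to values and supplies a default for anything else.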
By following these steps, you can import an Oracle sequence into Informatica using the method that best
suits your needs.
Types of Parameters
1. String parameters: Parameters that store string values.
2. Integer parameters: Parameters that store integer values.
3. Date parameters: Parameters that store date values.
Normal Load
1. Row-by-row loading: Normal Load loads data into the target system one row at a time.
2. SQL INSERT statements: Normal Load uses SQL INSERT statements to load data into the target system.
3. Target system logging: Normal Load logs each insert operation in the target system's transaction log.
4. Error handling: Normal Load provides detailed error messages for each row that fails to load.
5. Performance: Normal Load is generally slower than Bulk Load, especially for large volumes of data.
Bulk Load
1. Batch loading: Bulk Load loads data into the target system in batches, rather than one row at a time.
2. Native load utilities: Bulk Load uses native load utilities, such as Oracle's SQL*Loader or Microsoft's
BCP, to load data into the target system.
3. Minimal logging: Bulk Load minimizes logging in the target system's transaction log, which improves
performance.
4. Error handling: Bulk Load provides summary-level error messages, rather than detailed error messages
for each row.
5. Performance: Bulk Load is generally faster than Normal Load, especially for large volumes of data.
Key differences
1. Loading method: Normal Load loads data row-by-row, while Bulk Load loads data in batches.
2. SQL statements: Normal Load uses SQL INSERT statements, while Bulk Load uses native load utilities.
3. Logging: Normal Load logs each insert operation, while Bulk Load minimizes logging.
4. Error handling: Normal Load provides detailed error messages, while Bulk Load provides summary-
level error messages.
5. Performance: Bulk Load is generally faster than Normal Load.
18. How u will create header and footer in target using Informatica?
In Informatica, you can create a header and footer in a target file using a combination of transformations
and target properties. Here's a step-by-step guide:
Creating a Header
1. Create a new transformation: In the Informatica Mapping Designer, create a new transformation, such
as an Expression transformation or a Transaction Control transformation.
2. Add a header variable: In the transformation, add a new variable to store the header text. For
example, you can add a string variable called HEADER_TEXT.
3. Assign the header text: Assign the desired header text to the HEADER_TEXT variable. You can use a
constant value or an expression to generate the header text.
4. Use the header variable in the target: In the target definition, use the HEADER_TEXT variable as the
header text. You can do this by adding a new field to the target definition and assigning the
HEADER_TEXT variable to it.
Creating a Footer
1. Create a new transformation: In the Informatica Mapping Designer, create a new transformation, such
as an Expression transformation or a Transaction Control transformation.
2. Add a footer variable: In the transformation, add a new variable to store the footer text. For example,
you can add a string variable called FOOTER_TEXT.
3. Assign the footer text: Assign the desired footer text to the FOOTER_TEXT variable. You can use a
constant value or an expression to generate the footer text.
4. Use the footer variable in the target: In the target definition, use the FOOTER_TEXT variable as the
footer text. You can do this by adding a new field to the target definition and assigning the FOOTER_TEXT
variable to it.
Target Properties
1. Header and footer options: For flat file targets, the session properties include header options (such as
writing the field names as the first row) and Header Command and Footer Command settings whose
output is written before and after the data.
2. Alternative approach: You can also build the header, detail, and footer rows as separate pipelines or
files and concatenate them with a post-session command.
By following these steps, you can create a header and footer in your target file using Informatica.
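As a small sketch of the command-based option mentioned above (the column list and echo commands
are assumptions), the flat file target's session properties might be set to:
Header Command: echo "EMP_ID|EMP_NAME|LOAD_DATE"
Footer Command: echo "END OF FILE"
The output of each command is written to the target file before and after the data rows.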
20. Where does Informatica store rejected data? How do we view them?
In Informatica, rejected data is stored in a file or database table, depending on the configuration of the
session. Here are the common locations where Informatica stores rejected data:
1. Reject File (Bad File): By default, Informatica writes rejected rows to a reject file, a text file named
after the target instance with a .bad extension (for example, <target_instance_name>.bad). Reject files
are written to the directory specified by the $PMBadFileDir service process variable (the BadFiles
directory); you can change the name and location in the session properties.
2. Error Log Tables: If you enable relational row error logging for the session, Informatica stores rejected
rows and their error metadata in error log tables (such as PMERR_MSG and PMERR_DATA) in the
database connection you specify.
You can view rejected data by opening the reject file in a text editor or by querying the error log tables.
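For reference, each line of a reject (.bad) file starts with a row indicator (0 = insert, 1 = update,
2 = delete, 3 = reject), and each column value is preceded by a column indicator (D = valid data,
O = overflow, N = null, T = truncated). A rejected insert row might look something like this (the values
are hypothetical):
0,D,1001,D,John Smith,N,,D,2500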
21. What is difference between partitioning of relational target and file targets?
In Informatica, partitioning is a technique used to divide large amounts of data into smaller, more
manageable pieces, called partitions. Partitioning can be applied to both relational targets and file
targets, but there are differences in how partitioning is implemented and used for each type of target.
Key differences
1. Partitioning complexity: Relational target partitioning is more complex and powerful than file target
partitioning.
2. Partitioning methods: Relational targets support more advanced partitioning methods, such as range
and hash partitioning, while file targets support simpler methods.
3. Partitioning keys: Relational targets use column-level attributes as partitioning keys, while file targets
use file-level attributes.
4. Benefits: Relational target partitioning provides more benefits, such as improved query performance
and reduced storage requirements, while file target partitioning focuses on data organization and
exchange.
22. What are mapping parameters and variables in which situation we can use
them?
In Informatica, mapping parameters and variables are used to make mappings more flexible and
reusable. Here's a brief overview of each:
Mapping Parameters
1. Defined at the mapping level: Mapping parameters are defined at the mapping level and can be used
throughout the mapping.
2. Assigned from a parameter file: Mapping parameters receive their values from a parameter file (or
use their declared default values) and keep a constant value for the entire session run.
3. Used for filtering, sorting, and aggregating: Mapping parameters can be used for filtering, sorting, and
aggregating data within the mapping.
4. Example: A mapping parameter can be used to specify the date range for which data needs to be
extracted.
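For example (the parameter and column names are hypothetical), the date-range parameter mentioned
above could be used in a Source Qualifier source filter such as:
ORDER_DATE >= TO_DATE('$$LOAD_START_DATE', 'YYYY-MM-DD')
The value of $$LOAD_START_DATE comes from a parameter file and stays constant for the whole session
run.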
Mapping Variables
1. Defined within the mapping: Mapping variables are defined within the mapping and can be used
within the mapping.
2. Can change during the run: Mapping variables start from an initial value and can be updated during
the session with functions such as SETVARIABLE, SETMAXVARIABLE, or SETMINVARIABLE.
3. Persisted value: The Integration Service saves the final value of a mapping variable to the repository at
the end of a successful session and uses it as the starting value for the next run.
4. Example: A mapping variable can be used to store the result of a calculation, such as the total sales
amount.
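As a sketch of the common incremental-load pattern (names are hypothetical), an Expression
transformation could track the latest date processed with:
SETMAXVARIABLE($$LAST_LOAD_DATE, ORDER_DATE)
At the end of a successful run, the Integration Service saves the final value of $$LAST_LOAD_DATE to the
repository, and the next run can filter on it to pick up only new rows.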
Best practices
1. Use meaningful names: Use meaningful names for mapping parameters and variables to make them
easy to understand and use.
2. Document usage: Document the usage of mapping parameters and variables to ensure that they are
used correctly and consistently.
3. Test thoroughly: Test mappings thoroughly to ensure that mapping parameters and variables are
working as expected.
23. What do you mean by direct loading and Indirect loading in session
properties?
In Informatica, Direct and Indirect are the two source file types you can select in the session properties
for a flat file source. They control how the Integration Service interprets the source file name.
Direct Loading
1. The named file contains the data: With the Direct option, the file specified in the session properties is
the actual data file.
2. Single file per run: The session reads exactly one source file.
3. Typical use: Choose Direct when the session's data always arrives in a single flat file.
Indirect Loading
1. The named file is a file list: With the Indirect option, the file specified in the session properties is a list
file containing the names (and optionally paths) of the actual data files.
2. Multiple files, one session: The Integration Service reads each file in the list in turn, so many files with
the same structure can be processed in a single session run.
3. Typical use: Choose Indirect when you receive several identically structured files (for example, one per
region or per day) and want to load them together.
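For example, with the Indirect option the file named in the session might be a list file (the paths are
hypothetical) such as:
/data/incoming/sales_east_20240101.dat
/data/incoming/sales_west_20240101.dat
/data/incoming/sales_north_20240101.dat
The Integration Service then reads each listed file in turn during a single session run.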
1. Use checkpoints: Checkpoints are points in the batch process where the current state of the process is
saved. If a failure occurs, the batch can restart from the last checkpoint.
2. Implement transactional control: Use transactional control to ensure that multiple operations are
treated as a single, atomic unit. If any part of the transaction fails, the entire transaction can be rolled
back.
3. Use batch IDs and sequence numbers: Assign a unique batch ID and sequence number to each batch.
This allows you to track the progress of each batch and recover from failures.
4. Log batch progress: Log the progress of each batch, including any errors or exceptions that occur. This
log can be used to recover from failures and diagnose issues.
5. Implement retry logic: Implement retry logic to handle transient failures, such as network connectivity
issues.
6. Use a message queue or event-driven architecture: Consider using a message queue or event-driven
architecture to handle batch processing. This allows you to decouple the batch processing from the main
application and provides a more robust recovery mechanism.
7. Monitor batch progress: Monitor the progress of each batch and alert administrators if any issues
arise.
8. Implement a recovery process: Establish a recovery process that can be triggered in case of a failure.
This process should include steps to recover from the failure and restart the batch from the last
checkpoint.
Types of Batches
1. Scheduled Batch: A scheduled batch is a batch that is executed at a specific time or interval, such as
daily, weekly, or monthly. Scheduled batches are used for tasks that need to be performed regularly, such
as data backups or report generation.
2. On-Demand Batch: An on-demand batch is a batch that is executed manually by a user or
administrator. On-demand batches are used for tasks that need to be performed immediately, such as
data imports or exports.
3. Real-Time Batch: A real-time batch is a batch that is executed in real-time, as soon as the data is
available. Real-time batches are used for tasks that require immediate processing, such as transactional
data processing or event-driven processing.
4. Parallel Batch: A parallel batch is a batch that is executed concurrently, with multiple tasks or jobs
running simultaneously. Parallel batches are used for tasks that require high processing power, such as
data processing or scientific simulations.
5. Sequential Batch: A sequential batch is a batch that is executed one task or job at a time, in a
sequential order. Sequential batches are used for tasks that require a specific order of execution, such as
data imports or exports.
2. NoSQL Database
- Stores metadata in a variety of formats, such as key-value, document, graph, or column-family stores.
- Examples: MongoDB, Cassandra, Couchbase.
3. Cloud-based Repository
- Stores metadata in a cloud-based storage service, such as Amazon S3 or Microsoft Azure Blob Storage.
- Examples: AWS Glue, Azure Data Catalog.
4. Graph Database
- Stores metadata as nodes and edges in a graph structure.
- Examples: Neo4j, Amazon Neptune.
5. Object-oriented Database
- Stores metadata in an object-oriented format, using classes and objects.
- Examples: Gemstone, Matisse.
6. XML Database
- Stores metadata in an XML format, using XML documents and schemas.
- Examples: MarkLogic, Tamino.
7. Ontology-based Repository
- Stores metadata as ontologies, which are formal representations of knowledge.
- Examples: Protégé, TopBraid Composer.
9. Data Warehouse
- Stores metadata in a centralized repository, often used for business intelligence and data analytics.
- Examples: Amazon Redshift, Google BigQuery.
29. Can you use the mapping parameters or variables created in one mapping into
another mapping?
In Informatica, mapping parameters and variables are specific to each mapping. However, there are ways
to share and reuse them across multiple mappings:
Mapping Parameters
1. Global Parameters: You can create global parameters that can be shared across multiple mappings.
Global parameters are defined at the workflow or folder level.
2. Parameter Files: You can store mapping parameters in a parameter file, which can be shared across
multiple mappings.
3. Workflow Variables: You can create workflow variables that can be used to pass values between
mappings.
Mapping Variables
1. Workflow Variables: You can create workflow variables that can be used to pass values between
mappings.
2. Parameter Files: You can store mapping variables in a parameter file, which can be shared across
multiple mappings.
3. Global Variables: You can create global variables that can be shared across multiple mappings. Global
variables are defined at the workflow or folder level.
Best Practices
1. Use meaningful names: Use meaningful names for mapping parameters and variables to make them
easy to understand and reuse.
2. Document usage: Document the usage of mapping parameters and variables to ensure that they are
used correctly across multiple mappings.
3. Test thoroughly: Test mappings thoroughly to ensure that shared parameters and variables are
working as expected.
By using stored procedures in ETL applications, you can improve performance, data integrity, and
maintainability, while reducing network traffic and improving overall efficiency.
31.When we can join tables at the Source qualifier itself, why do we go for joiner
transformation?
While it's true that you can join tables at the Source Qualifier level, there are scenarios where using a
Joiner transformation is more suitable or even necessary. Here are some reasons why:
Reasons to use Joiner transformation
1. Complex join conditions: When you have complex join conditions, such as multiple join conditions,
conditional joins, or joins with aggregations, a Joiner transformation provides more flexibility and
control.
2. Multiple join types: The Joiner transformation supports a normal (inner) join, a master outer join, a
detail outer join, and a full outer join. In contrast, Source Qualifier joins are typically equi-joins generated
in the SQL statement.
3. Joining data from different sources: When you need to join data from different sources, such as
databases, files, or applications, a Joiner transformation provides a more flexible and scalable solution.
4. Data transformation and aggregation: Joiner transformation allows you to perform data
transformations and aggregations on the joined data, which may not be possible or efficient at the
Source Qualifier level.
5. Performance optimization: In some cases, using a Joiner transformation can improve performance by
reducing the amount of data being joined or by allowing for more efficient join algorithms.
6. Reusability and modularity: Joiner transformation can be reused across multiple mappings and
workflows, making it a more modular and maintainable solution.
7. Debugging and troubleshooting: Joiner transformation provides more detailed logging and debugging
capabilities, making it easier to troubleshoot join-related issues.
When you configure a connected Lookup transformation, its default behavior resembles a left outer join
between the input data and the lookup data: every input row passes through the transformation, and
rows with no match in the lookup source receive NULL values for the lookup ports.
Here's a step-by-step explanation of how the lookup works:
1. The Lookup transformation receives input data from the previous transformation or source.
2. The Lookup transformation searches for matching values in the lookup table.
3. If a match is found, the Lookup transformation returns the matched row(s) from the lookup table.
4. If no match is found, the Lookup transformation returns a null value or a default value specified in the
transformation properties.
Controlling the join behavior
The Lookup transformation does not expose explicit join-type options, but you can control the effective
behavior:
- Inner join: add a Filter or Router after the Lookup to discard rows whose lookup result is NULL.
- Multiple matches: use the Lookup Policy on Multiple Match property to return the first, last, or any
matching row (or all matching rows, where supported).
- No match: assign default values to the lookup output ports if you want something other than NULL
when no match is found.
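For instance (the port name is hypothetical), to emulate an inner join you could follow the Lookup with a
Filter transformation whose condition is:
NOT ISNULL(LKP_CUSTOMER_NAME)
Rows with no match in the lookup table are then dropped from the pipeline.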
34. In a Joiner transformation, you should specify the table with fewer rows as the
master table. Why?
In a Joiner transformation, specifying the source with fewer rows as the master is a best practice for
performance. During the join, the Integration Service builds its index and data caches from the master
rows and then streams the detail rows against that cache. A smaller master source therefore means a
smaller cache, less memory and disk usage, and fewer comparisons per detail row, so the join runs
faster.
Cached Lookup
1. Stores lookup data in memory: The lookup data is cached in memory, allowing for faster lookup
operations.
2. Faster lookup performance: Since the data is stored in memory, lookup operations are faster and more
efficient.
3. Reduced database queries: By storing the lookup data in memory, the number of database queries is
reduced, resulting in improved performance.
4. Suitable for static or infrequently changing data: Cached Lookup is suitable for lookup data that does
not change frequently, as the cached data can be reused.
Uncached Lookup
1. Retrieves lookup data from the database in real-time: The lookup data is retrieved from the database
in real-time, without caching.
2. Slower lookup performance: Since the data is retrieved from the database in real-time, lookup
operations can be slower.
3. Increased database queries: Uncached Lookup results in more database queries, which can impact
performance.
4. Suitable for dynamic or frequently changing data: Uncached Lookup is suitable for lookup data that
changes frequently, as the latest data is always retrieved from the database.
Key differences
1. Performance: Cached Lookup is generally faster than Uncached Lookup.
2. Data freshness: Uncached Lookup ensures that the latest data is always retrieved, while Cached
Lookup may use stale data if the cache is not updated.
3. Database queries: Cached Lookup reduces database queries, while Uncached Lookup increases them.
In summary, Cached Lookup is suitable for static or infrequently changing data, while Uncached
Lookup is suitable for dynamic or frequently changing data. The choice between Cached and Uncached
Lookup depends on the specific requirements of your ETL process.
Initialization
1. Workflow validation: DTM validates the workflow configuration, including the mappings,
transformations, and session properties.
2. Resource allocation: DTM allocates the necessary resources, such as memory and CPU, for the
workflow.
Session Creation
1. Session creation: DTM creates a new session for the workflow, which includes the session properties,
such as the source and target connections.
2. Session initialization: DTM initializes the session, including setting up the logging and error handling
mechanisms.
Data Processing
1. Data reading: DTM reads the data from the source systems, according to the mapping and session
configurations.
2. Data transformation: DTM applies the transformations, such as aggregations, filters, and joins, to the
data according to the mapping configuration.
3. Data writing: DTM writes the transformed data to the target systems, according to the mapping and
session configurations.
Completion
1. Workflow completion: DTM completes the workflow execution, including releasing the allocated
resources and updating the workflow status.
2. Post-session processing: DTM performs any post-session processing, such as sending notifications or
updating the workflow history.
37.Explain what Load Manager does when you start a work flow?
When you start a workflow in Informatica, the Load Manager performs the following steps:
Initialization
1. Workflow validation: Load Manager validates the workflow configuration, including the mappings,
transformations, and session properties.
2. Resource allocation: Load Manager allocates the necessary resources, such as memory and CPU, for
the workflow.
Load Balancing
1. Node selection: Load Manager selects the most suitable node(s) to execute the workflow, based on
factors such as node availability, workload, and resource utilization.
2. Load distribution: Load Manager distributes the workload across the selected nodes, ensuring optimal
resource utilization and minimizing bottlenecks.
Aggregate Functions
1. Sum: Calculates the total value of a column.
2. Average: Calculates the average value of a column.
3. Count: Counts the number of rows in a group.
4. Count Distinct: Counts the number of unique values in a column.
5. First / Last: Return the first or last value in a group. (Grouping itself is configured by marking group-by
ports, not by calling a function.)
6. Max: Returns the maximum value in a column.
7. Min: Returns the minimum value in a column.
8. Standard Deviation: Calculates the standard deviation of a column.
9. Variance: Calculates the variance of a column.
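As a small sketch (the port names are hypothetical), an Aggregator with CUSTOMER_ID marked as the
group-by port could define output ports such as:
TOTAL_SALES = SUM(SALE_AMT)
ORDER_COUNT = COUNT(ORDER_ID)
MAX_ORDER = MAX(SALE_AMT)
Each function is evaluated once per group of rows sharing the same CUSTOMER_ID.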
3. Use Indexes
- Create indexes on the columns used in the GROUP BY clause and aggregate functions.
- Ensure that the indexes are properly maintained and updated.
5. Partition Data
- Partition the data into smaller chunks to reduce the amount of data being processed.
- Use partitioning techniques, such as range partitioning or hash partitioning.
1. Lookup Cache
- Purpose: Improves the performance of Lookup transformations by storing lookup data in memory.
- How it works: The Lookup Cache stores the lookup data in memory, allowing the Integration Service to
quickly retrieve the data without having to query the lookup source.
- Benefits: Reduces the number of queries to the lookup source, improving performance and reducing
the load on the lookup source.
2. Aggregator Cache
- Purpose: Improves the performance of Aggregator transformations by storing aggregated data in
memory.
- How it works: The Aggregator Cache stores the aggregated data in memory, allowing the Integration
Service to quickly retrieve the data without having to re-aggregate it.
- Benefits: Reduces the time required to perform aggregations, improving performance and reducing the
load on the system.
3. Joiner Cache
- Purpose: Improves the performance of Joiner transformations by storing join data in memory.
- How it works: The Joiner Cache stores the join data in memory, allowing the Integration Service to
quickly retrieve the data without having to re-join it.
- Benefits: Reduces the time required to perform joins, improving performance and reducing the load on
the system.
4. Data Cache
- Purpose: Each caching transformation (Lookup, Aggregator, Joiner, Rank) uses a data cache to hold the
row data associated with the cached keys.
- How it works: The Integration Service keeps the data cache in memory and spills it to cache files on disk
when it exceeds the configured cache size.
- Benefits: Keeps the row values the transformation needs close at hand instead of re-reading them from
the source.
5. Index Cache
- Purpose: Each caching transformation also uses an index cache to hold the key values (lookup
conditions, group-by ports, or join keys) used to locate rows in the data cache.
- How it works: The Integration Service stores the keys in the index cache and uses them to find the
matching entries in the data cache.
- Benefits: Speeds up matching and grouping by avoiding scans of the full cached data.
6. Rank and Sorter Caches
- Purpose: The Rank transformation caches rows while it determines the top or bottom ranks, and the
Sorter transformation caches rows while it sorts them.
- How it works: Rows are held in memory up to the configured cache size and are spilled to cache files on
disk when necessary.
- Benefits: Allows ranking and sorting of data sets larger than the available memory.
In summary, Informatica provides several types of caches to optimize different aspects of data
processing. By using these caches, you can improve performance, reduce the load on the system, and
optimize data processing.
1. Data Volume Limitation: The Joiner transformation can handle large volumes of data, but it may not
perform well with extremely large datasets.
2. Memory Limitation: The Joiner transformation requires sufficient memory to store the data from both
sources. If the data is too large, it may not fit in memory, leading to performance issues.
3. Complex Join Conditions: The Joiner transformation supports simple join conditions, but complex join
conditions may not be supported.
4. Multiple Master Tables: The Joiner transformation only supports one master table. If you need to join
multiple master tables, you may need to use multiple Joiner transformations.
5. Self-Joins: A single Joiner transformation cannot take both of its inputs from the same pipeline branch
(the same upstream active source); to join a table to itself you need two source instances or sorted input.
6. Equi-Joins Only: The join condition supports only the equality operator, so non-equi joins are not
possible in a Joiner transformation.
7. Data Type Limitations: The Joiner transformation may not support all data types, such as LOB (Large
OBject) data types.
8. Performance Overhead: The Joiner transformation can introduce performance overhead, especially
when dealing with large datasets or complex join conditions.
9. Limited Support for Unstructured Data: The Joiner transformation may not support unstructured data,
such as XML or JSON data.
10. Limited Support for Real-Time Data: The Joiner transformation may not support real-time data, such
as data from messaging systems or streaming data sources.
It's essential to consider these limitations when designing your Informatica workflow and to choose the
most suitable transformation for your specific use case.
45.What is Mapplet?
In Informatica, a Mapplet is a reusable transformation object that contains a set of transformations and
mappings. It's a self-contained unit of work that can be used to perform a specific data transformation
task.
Think of a Mapplet as a mini-workflow that can be easily reused across multiple mappings and
workflows. Mapplets are particularly useful when you need to perform a common data transformation
task, such as data cleansing, data aggregation, or data formatting.
Key Features:
1. Reusability: Mapplets can be reused across multiple mappings and workflows.
2. Modularity: Mapplets are self-contained units of work that can be easily maintained and updated.
3. Flexibility: Mapplets can be used to perform a wide range of data transformation tasks.
Benefits:
1. Improved Productivity: Mapplets can save time and effort by reducing the need to recreate common
data transformation tasks.
2. Consistency: Mapplets can help ensure consistency across multiple mappings and workflows by
providing a standardized approach to data transformation.
3. Easier Maintenance: Mapplets can make it easier to maintain and update data transformation tasks by
providing a single point of maintenance.
Mapplets are a powerful feature in Informatica that can help simplify data transformation tasks,
improve productivity, and ensure consistency across multiple mappings and workflows.
Passive Transformations
Passive transformations, on the other hand, do not change the number of rows or columns in the data
flow. They only perform operations that modify the data values, such as conversions, calculations, and
data masking.
Examples of Passive transformations:
1. Expression
2. Sequence Generator
3. Stored Procedure
4. Data Masking
Key differences between Active and Passive transformations:
1. Data flow modification: Active transformations can modify the data flow, while Passive
transformations do not.
2. Row and column changes: Active transformations can change the number of rows or columns, while
Passive transformations do not.
3. Data value modification: Both Active and Passive transformations can modify data values, but Passive
transformations only perform this type of operation.
Understanding the difference between Active and Passive transformations is essential for designing
efficient and effective data integration workflows in Informatica.
Additional Options
1. Update as Update: Applies rows flagged for update as updates to the existing rows in the target table.
2. Update as Insert: Writes every row flagged for update as a new insert into the target table.
3. Update else Insert: Updates the row if it already exists in the target table; otherwise inserts it as a new
row.
Custom Options
1. Target Update Override: Lets you override the default UPDATE statement that the Integration Service
generates for the target, giving you full control over how rows are updated.
These options enable you to control how data is updated in the target table, ensuring that your data
integration workflow meets your specific business requirements.
Informatica supports various code pages, which can be categorized into the following types:
1. ASCII (American Standard Code for Information Interchange): A 7-bit code page that supports English
characters.
2. ISO 8859-1 (Latin-1): An 8-bit code page that supports Western European languages, such as French,
German, and Italian.
3. Windows Code Pages (e.g., Windows-1252): A set of code pages developed by Microsoft, which
support various languages, including English, European languages, and some Asian languages.
4. Unicode Code Pages (e.g., UTF-8, UTF-16): A set of code pages that support a wide range of languages,
including Asian languages, and can handle characters that require more than one byte to represent.
5. EBCDIC (Extended Binary Coded Decimal Interchange Code): An 8-bit code page used on IBM
mainframe systems, which supports a limited set of characters.
6. DBCS (Double-Byte Character Set) Code Pages: A set of code pages used to support Asian languages,
such as Chinese, Japanese, and Korean.
In summary, Rank Cache is a performance optimization technique used in the Rank transformation to
improve processing times and reduce memory usage.
50.How can you delete duplicate rows with out using Dynamic Lookup?
Tell me any other ways using lookup delete the duplicate rows?
Deleting duplicate rows without using a Dynamic Lookup can be achieved in several ways. A few
approaches using a Lookup transformation or other standard objects:
- Static Lookup on the target: look up each incoming key against the target and use a Filter or Router to
drop rows whose key already exists, so only new rows are loaded.
- Aggregator: group by the key columns so that only one row per key is passed downstream.
- Sorter with Distinct: sort on the key columns with the Distinct option enabled to remove duplicate rows.
- Source Qualifier: enable the Select Distinct option, or use a SQL override with DISTINCT, to eliminate
duplicates at the source.
To run a batch using pmcmd, you would use the following syntax:
pmcmd startbatch -b <batch_name>
Replace <batch_name> with the actual name of the batch you want to run.
For example:
pmcmd startbatch -b my_batch
This command would start the batch named "my_batch".
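In current PowerCenter versions, batches are implemented as workflows and worklets, so the equivalent
command is usually startworkflow; a typical invocation (the service, domain, credential, and folder
values are placeholders) looks like:
pmcmd startworkflow -sv IntegrationService_Name -d Domain_Name -u user_name -p password -f Folder_Name wf_workflow_name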
These objects are not supported within a Mapplet because Mapplets are designed to be reusable,
modular transformations that can be easily integrated into workflows. By limiting the types of objects
that can be included in a Mapplet, Informatica ensures that Mapplets remain simple, flexible, and easy
to maintain.
Why?
When you drag multiple relational sources from the same database into a single Source Qualifier, the
Source Qualifier joins them: the Integration Service generates one SQL statement that joins the tables,
using the default join on the related keys or a user-defined join condition, and produces a single data
stream.
So, to summarize:
1. A single Source Qualifier can represent several relational sources from the same database, joined in
the generated SQL.
2. To combine rows from separate pipelines (for example, heterogeneous sources), you add a Union
transformation explicitly; one is not created implicitly.
3. You then map the resulting single data stream to the target as usual.
Hope that helps clarify things!
Mapping
The Debugger requires a valid mapping to create a debug session. The mapping defines the data flow
and transformations that will be executed during the debug session.
When you create a debug session, the Debugger uses the mapping to:
1. Identify the sources and targets involved in the data flow.
2. Determine the transformations and operations that will be executed.
3. Set up the necessary connections and configurations for the debug session.
Without a valid mapping, the Debugger cannot create a debug session, and you will not be able to debug
your data integration workflow.
The Update Strategy transformation is used to flag each row for insert, update, delete, or reject for the
target, and it is normally placed as close to the target as possible, typically as the last transformation
before the target. As a general guideline, avoid placing transformations that change or drop rows
between the Update Strategy transformation and the target, because they can alter the flagged row set.
Session Level
Note that the update strategy set at the session level overrides the update strategy set at the mapping
level.
Best Practice
To avoid system hangs, consider the following:
1. Use Offline mode: When working on a personal laptop, use the Offline server connect option to
reduce the load on your system and network resources.
2. Optimize system resources: Ensure your laptop has sufficient CPU, memory, and disk resources to run
Informatica smoothly.
3. Check network connectivity: Verify that your network connection is stable and fast before using the
Online server connect option.
1. Data replication: CDC enables the replication of data changes to other systems, ensuring data
consistency and synchronization.
2. Data integration: CDC facilitates the integration of data changes with other systems, applications, or
processes.
3. Auditing and compliance: CDC provides a record of all changes made to data, supporting auditing,
compliance, and regulatory requirements.
4. Real-time analytics: CDC enables real-time analytics and reporting by providing timely and accurate
data changes.
1. Data source identification: Identify the data sources to be monitored for changes.
2. Change detection: Detect changes made to the data, such as inserts, updates, or deletes.
3. Change capture: Capture the changes and store them in a format suitable for processing.
4. Change delivery: Deliver the captured changes to target systems, applications, or processes.
64.Write a session parameter file which will change the source and
targets for every session. i.e different source and targets for each
session run ?
Here's an example of a session parameter file (a plain text file, commonly with a .prm or .txt extension)
that lets you change the source and targets for every session run. The file needs a heading that identifies
the folder, workflow, and session the parameters apply to (the names below are placeholders), followed
by the parameter assignments:
[MyFolder.WF:wf_load.ST:s_load]
$$SOURCE_DB=database_name
$$SOURCE_SCHEMA=schema_name
$$SOURCE_TABLE=table_name
$$TARGET_DB=database_name
$$TARGET_SCHEMA=schema_name
$$TARGET_TABLE=table_name
Point the session at this file using the Parameter Filename property in the session properties. Then,
before each run, you can modify the source and target parameters by editing the file.
For example:
- To change the source database, update the $$SOURCE_DB parameter in the file.
- To change the target table, update the $$TARGET_TABLE parameter in the file.
By using a session parameter file, you can easily switch between different sources and targets for each
session run.
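If you prefer not to edit the session, the parameter file can also be supplied when starting the workflow
from the command line; a hedged example (service, folder, and path names are placeholders):
pmcmd startworkflow -sv IntegrationService_Name -d Domain_Name -u user_name -p password -f Folder_Name -paramfile /infa/params/session_params.prm wf_load
This lets each run point to a different parameter file, and therefore to different sources and targets.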
1. Reader Thread
The Reader thread reads data from the source system, such as a database or file.
2. Transformation Thread
The Transformation thread applies the data transformation rules, such as aggregations, joins, and data
conversions.
3. Writer Thread
The Writer thread writes the transformed data to the target system, such as a database or file.
7. Master Thread
The master thread is the main thread of the DTM process; it creates and manages the other threads
(reader, transformation, writer, and pre- and post-session threads) and handles synchronization and
error handling.
These threads work together to efficiently process and transform data, improving overall performance
and scalability.
68.What is Transformation?
In Informatica, a transformation is a repository object that performs a specific operation on data, such as
aggregating, filtering, or converting data.
Transformations are used to manipulate and transform data as it flows through a mapping, which is a set
of instructions that defines how data is extracted, transformed, and loaded (ETL) into a target system.
There are several types of transformations in Informatica, including:
70.How do you recognize whether the newly added rows got inserted
or updated?
To recognize whether newly added rows got inserted or updated, you can use the following methods:
72.My flat file’s size is 400 MB and I want to see the data inside the FF
with out opening it? How do I do that?
You can use the head or tail command in Unix/Linux or the type command in Windows to view a portion
of the flat file without opening it.
Unix/Linux
1. Open a terminal and navigate to the directory where the flat file is located.
2. Use the head command to view the first few lines of the file:
head -n 10 filename
Replace filename with the name of your flat file, and 10 with the number of lines you want to view.
3. Alternatively, use the tail command to view the last few lines of the file:
tail -n 10 filename
Windows
1. Open a command prompt and navigate to the directory where the flat file is located.
2. Use the type command to view a portion of the file:
type filename | more
Replace filename with the name of your flat file. The more command will display the file one screen at a
time.
These methods allow you to view a portion of the flat file without opening it, which can be helpful for
large files like yours (400 MB).
Filter Transformation
1. Purpose: Filters out data that does not meet specific conditions.
2. Functionality: Evaluates each row of data against a condition, and if the condition is true, the row is
passed through; otherwise, it is discarded.
3. Output: Only one output is possible, which is the filtered data.
Router Transformation
1. Purpose: Routes data to different targets based on conditions.
2. Functionality: Evaluates each row of data against multiple conditions, and based on the condition met,
the row is routed to the corresponding target.
3. Output: Multiple outputs are possible, each corresponding to a specific condition.
Key differences:
- Filtering vs. Routing: Filter transformation filters out data, whereas Router transformation routes data
to different targets.
- Number of Outputs: Filter has only one output, whereas Router can have multiple outputs.
- Condition Evaluation: Filter evaluates data against a single condition, whereas Router evaluates data
against multiple conditions.
In summary, use Filter when you want to remove data that doesn't meet a specific condition, and
use Router when you want to direct data to different targets based on multiple conditions.
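As a quick sketch (port names and thresholds are hypothetical), a Filter transformation might use the
single condition:
ORDER_AMT > 0
while a Router transformation might define groups such as:
HIGH_VALUE: ORDER_AMT > 10000
LOW_VALUE: ORDER_AMT <= 10000
with any rows matching neither condition (for example, NULL amounts) falling into the Router's default
group.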
74.How do you handle the decimal places when you are importing the
flat file?
When importing a flat file, handling decimal places can be a bit tricky. Here are some ways to handle
decimal places:
1. Specify the decimal separator: In Informatica, you can specify the decimal separator in the flat file
definition. For example, if the decimal separator is a comma (,), you can specify it in the flat file
definition.
2. Use a data type with decimal places: When defining the flat file structure, use a data type that
supports decimal places, such as Decimal or Numeric.
3. Specify the precision and scale: When using a Decimal or Numeric data type, specify the precision
(total number of digits) and scale (number of digits after the decimal point).
4. Use a transformation to handle decimal places: If the flat file contains decimal places in a format that's
not supported by Informatica, you can use a transformation (such as an Expression transformation) to
handle the decimal places.
5. Use a format string: When importing the flat file, you can use a format string to specify the decimal
separator and other formatting options.
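For example (the port names are hypothetical), if the amount arrives as a string in the flat file, an
Expression transformation could convert it with:
OUT_AMOUNT = TO_DECIMAL(LTRIM(RTRIM(IN_AMOUNT_STR)), 2)
where 2 is the scale, i.e. the number of digits kept after the decimal point.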
$ (single dollar sign)
Used for built-in and session-level parameters and variables, such as connection, file, and service
parameters.
Example: $DBConnection_SRC, $PMSessionLogFile
$$ (double dollar sign)
Used for user-defined mapping and workflow parameters and variables, which you declare in the
mapping or workflow and assign values to in a parameter file.
Example: $$LOAD_START_DATE
Best practices:
- Declare $$ parameters and variables in the mapping or workflow and supply their values through a
parameter file so they can change between runs.
- Use $ session parameters (such as $DBConnection or $InputFile parameters) to switch connections and
files without editing the session.
By using $ and $$ correctly, you can effectively manage parameters in your Informatica mappings and
parameter files.
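To make the distinction concrete (all names are placeholders), a parameter file can contain both kinds
side by side:
[MyFolder.WF:wf_daily_load.ST:s_m_load_sales]
$DBConnection_SRC=ORA_SALES_DEV
$$LOAD_START_DATE=2024-01-01
The $DBConnection_SRC entry overrides a session connection parameter, while $$LOAD_START_DATE
supplies the value of a user-defined mapping parameter.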
This metadata is used to create a source definition in Informatica, which can then be used to extract data
from the relational database.
By importing this metadata, you can save time and effort in defining your source structure, and ensure
accuracy and consistency in your data integration processes.
PowerMart
1. Data mart-focused: Designed specifically for data mart and data warehouse environments.
2. Simplified architecture: Has a simpler architecture compared to PowerCenter.
3. Limited scalability: Suitable for smaller-scale data integration projects.
4. Fewer features: Offers a subset of features compared to PowerCenter.
PowerCenter
1. Enterprise-focused: Designed for large-scale, enterprise-wide data integration projects.
2. Robust architecture: Has a more robust and scalable architecture.
3. High scalability: Suitable for large-scale data integration projects.
4. Advanced features: Offers a wide range of advanced features, including real-time data integration,
data quality, and data governance.
In summary, PowerMart is ideal for smaller-scale data mart and data warehouse projects, while
PowerCenter is better suited for large-scale, enterprise-wide data integration initiatives.
A SQL override enables you to replace the default SQL query with a custom query that meets your
specific needs. You can enter your custom SQL query in the SQL override editor, and Informatica will use
this query instead of the default one.
SQL overrides can be applied in several places, including:
- Source Qualifier (SQL query override)
- Lookup transformation (lookup SQL override)
- Target (target update override)
By using SQL overrides, you can gain more control over the data extraction process, optimize query
performance, and implement custom business logic.
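A minimal sketch of a Source Qualifier SQL override (the table, column, and parameter names are
hypothetical):
SELECT C.CUSTOMER_ID, C.CUSTOMER_NAME, O.ORDER_AMT
FROM CUSTOMERS C
JOIN ORDERS O ON O.CUSTOMER_ID = C.CUSTOMER_ID
WHERE O.ORDER_DATE >= TO_DATE('$$LOAD_START_DATE', 'YYYY-MM-DD')
The columns selected must match the Source Qualifier's connected ports in number and order.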
By leveraging cache, these transformations can handle large datasets more efficiently and improve
overall mapping performance.
Reject Rows:
- Sends rows that do not meet the transformation or target conditions to a reject pipeline or a separate
reject file.
- Rows that fail transformation or loading are rejected.
Filtering Methods
1. Filter Transformation: Use a filter transformation to filter records based on conditions.
2. Source Qualifier: Use a source qualifier to filter records at the source level.
3. Aggregator Transformation: Use an aggregator transformation to filter records based on aggregated
values.
4. Router Transformation: Use a router transformation to filter records based on conditions and route
them to different targets.
5. Expression Transformation: Use an expression transformation to filter records based on complex
conditions.
6. Data Validation: Use data validation to filter records based on data quality rules.
Conditional Filtering
1. Conditional Statements: Use conditional statements (e.g., IF-THEN-ELSE) to filter records.
2. Case Statements: Use case statements to filter records based on specific conditions.
By using these methods, you can filter records in various ways to achieve your data processing goals.
Database Source
Method 1: Using a Transformation
1. Aggregator Transformation: Use an Aggregator transformation to group the data by the desired
columns and then use a Router transformation to route the duplicate records to a separate pipeline for
deletion.
2. Update Strategy Transformation: Use an Update Strategy transformation (with DD_DELETE) to flag the routed duplicate records for deletion from the target.
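Another common option is a post-SQL statement that removes duplicates directly in the database. A minimal Oracle-style sketch, assuming a hypothetical table EMPLOYEES keyed by EMP_ID:
DELETE FROM employees e
WHERE  e.rowid > (SELECT MIN(e2.rowid)
                  FROM   employees e2
                  WHERE  e2.emp_id = e.emp_id);
This keeps one row per EMP_ID and deletes the rest; the statement can be placed in the session's pre-SQL or post-SQL property.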
In summary, you can use various methods to delete duplicate records from a database or flat file source,
including transformations, post-SQL scripts, and pre-load scripts. Choose the method that best fits your
specific use case.
88.You are required to perform “bulk loading” using Informatica on
Oracle. What actions would you perform at the Informatica and Oracle
levels for a successful load?
To perform successful bulk loading using Informatica on Oracle, follow these steps:
Informatica Level
1. Create a Mapping: Design a mapping that extracts data from the source and loads it into the Oracle target.
2. Set the Target Load Type to Bulk: In the session properties, set the target load type for the Oracle target to Bulk instead of Normal. In bulk mode, the Integration Service uses Oracle's direct-path load interface and bypasses the database log, which speeds up the load.
3. Configure the Load Properties: Review the related session properties, such as the commit interval and error handling options, keeping in mind that some of them behave differently in bulk mode.
4. Account for the Loss of Recovery: Because bulk loading bypasses the database log, the session cannot be recovered; do not enable bulk mode for sessions that rely on recovery.
Oracle Level
1. Grant Necessary Privileges: Ensure that the Oracle user account used by Informatica has the privileges needed to load the target table, in particular INSERT (and SELECT where required).
2. Drop or Disable Indexes and Constraints: Direct-path (bulk) loads perform best when the target table has no indexes or enabled key constraints; drop or disable them before the load and rebuild or re-enable them afterwards.
3. Consider NOLOGGING: Putting the target table in NOLOGGING mode can further improve performance. However, because the load bypasses the redo log, the loaded data cannot be recovered from the log after a failure, so plan the backup strategy accordingly.
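A sketch of the Oracle-side preparation, with hypothetical schema, table, index, and user names:
GRANT INSERT, SELECT ON sales.orders_fact TO infa_etl;
ALTER TABLE sales.orders_fact NOLOGGING;
-- Optionally take indexes offline before the load and rebuild them afterwards:
ALTER INDEX sales.orders_fact_ix1 UNUSABLE;
-- ... run the Informatica bulk-load session ...
ALTER INDEX sales.orders_fact_ix1 REBUILD;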
Additional Considerations
1. Data File Format: Ensure that the data file format used by Informatica is compatible with Oracle's bulk
loading requirements.
2. Error Handling: Configure error handling options in Informatica and Oracle to handle errors that may
occur during the bulk load operation.
3. Performance Optimization: Optimize the bulk load performance by adjusting parameters such as batch
size, commit interval, and parallelism.
By following these steps and considering the additional factors, you can perform a successful bulk
loading operation using Informatica on Oracle.
Informatica-Specific Precautions
1. Use a Reusable Sequence Generator: Use a single reusable Sequence Generator transformation so that all sessions draw values from the same sequence.
2. Set Number of Cached Values: For a reusable Sequence Generator used by concurrent sessions, set the Number of Cached Values property to a value greater than 0. Each session then reserves its own block of values from the repository, which prevents duplicate sequence numbers across sessions.
3. Size the Cache Appropriately: A larger cache reduces repository round trips and improves performance, but values that are reserved and not used when a session ends are discarded, leaving gaps in the sequence.
By taking these precautions, you can ensure that your reusable Sequence Generator transformation
works correctly and efficiently in a concurrent session environment.
Example
Suppose you want to generate a sequence of values that starts at 10 and decreases by 1. The Sequence Generator's Increment By property is normally expected to be a positive integer, so the usual approach is to keep an ascending sequence and derive the decreasing value in a downstream Expression transformation:
- Sequence Generator: leave the default settings, so NEXTVAL produces 1, 2, 3, ...
- Expression transformation: add an output port with the expression 11 - NEXTVAL
This produces 10, 9, 8, 7, ... If your PowerCenter version accepts a negative Increment By value, you can configure the decrement directly on the Sequence Generator instead.
91.Which directory does Informatica look in for the parameter file, and
what happens if it is missing when you start the session? Does the
session stop after it starts?
The Integration Service does not search a fixed set of directories for the parameter file; it reads the file from the path specified in the session or workflow properties (the Parameter Filename attribute) or supplied with the pmcmd startworkflow -paramfile option. That path commonly references a service process variable such as $PMRootDir, pointing at a dedicated parameter file directory.
If the specified parameter file is missing or cannot be opened when the session starts, the session fails during initialization; it does not begin loading data and then stop partway. If no parameter file is specified at all, the session runs using the initial or default values of its parameters and variables.
93.You have more than five mappings that use the same lookup. How can
you manage the lookup?
When multiple mappings share the same lookup, it's essential to manage the lookup efficiently to avoid
duplication, inconsistencies, and performance issues. Here are some ways to manage the lookup:
Reusable Transformation
Create a reusable lookup transformation that can be shared across multiple mappings. This approach
ensures that the lookup logic is consistent and maintained in one place.
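If the shared lookup reads from a large table, keeping a lookup SQL override on the single reusable transformation also guarantees that every mapping caches the same restricted data set. A sketch with hypothetical table and column names:
SELECT cust_id,
       cust_name,
       cust_segment
FROM   customers
WHERE  active_flag = 'Y'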
Persistent Lookup Cache
Use a named, persistent lookup cache so the lookup data is built once and saved in cache files on disk. Sessions for the different mappings can then reuse the same cache instead of re-querying the lookup source, which reduces database load and improves performance.
Mapping Parameter
Use mapping parameters for the lookup connection and the lookup table name (or the lookup SQL override) so that every mapping points to the same lookup source. The values are maintained in one place, typically the parameter file, and passed to the mappings at run time.
Mapplet
Create a mapplet that contains the lookup transformation and other related transformations. This
approach allows you to reuse the lookup logic and other transformations across multiple mappings.
Benefits
Managing the lookup efficiently provides several benefits, including:
- Improved performance
- Reduced maintenance efforts
- Consistent lookup logic across multiple mappings
- Simplified debugging and troubleshooting
By using one of these approaches, you can manage the lookup efficiently and ensure that your
mappings are consistent, efficient, and easy to maintain.
94.What will happen if you copy the mapping from one repository to
another repository and if there is no identical source?
If you copy a mapping from one repository to another repository, and there is no identical source in the
target repository, Informatica will:
Copy the Source Definition
Because the mapping cannot exist without its source, the Copy Wizard copies the source definition (along with other dependent objects such as targets and reusable transformations) into the target repository together with the mapping. If an object with the same name but a different definition already exists there, the wizard prompts you to resolve the conflict.
Best Practices
To avoid conflicts and surprises, it's recommended to:
1. Use a consistent naming convention for sources across repositories.
2. Create or import the source definition in the target repository before copying the mapping, so the wizard can reuse it.
3. Resolve conflicts deliberately in the Copy Wizard, choosing Rename, Replace, or Reuse as appropriate.
By following these best practices, you can avoid issues related to missing or mismatched sources when
copying mappings between repositories.
Using a Semaphore
1. Create a semaphore variable in the workflow.
2. Set the semaphore value to the maximum number of sessions you want to run concurrently.
3. Use a decision task to check the semaphore value before starting a new session.
4. If the semaphore value is greater than 0, decrement the value and start the session.
5. If the semaphore value is 0, wait until one of the running sessions completes and then decrement the
value and start the new session.
Using a Counter
1. Create a counter variable in the workflow.
2. Initialize the counter value to 0.
3. Use a decision task to check the counter value before starting a new session.
4. If the counter value is less than the maximum number of sessions you want to run concurrently,
increment the counter value and start the session.
5. If the counter value is equal to the maximum number of sessions, wait until one of the running
sessions completes and then decrement the counter value and start the new session.
Using a Queue
1. Create a queue in the workflow.
2. Set the queue size to the maximum number of sessions you want to run concurrently.
3. Use a decision task to check the queue size before starting a new session.
4. If the queue size is less than the maximum number of sessions, add the new session to the queue and
start the session.
5. If the queue size is equal to the maximum number of sessions, wait until one of the running sessions
completes and then remove the completed session from the queue and add the new session to the
queue.
96.An Aggregator transformation has 4 ports (sum (col 1), group by col
2, col3); which port should be the output?
In an Aggregate transformation, the output port is typically the one that represents the aggregated
value.
In this case, the relevant ports are:
1. sum (col 1) - the aggregated value port (col 1 itself is the input port that feeds the aggregate expression).
2. col 2 - a group by port.
3. col3 - another group by port.
The output port should be the one that represents the aggregated value, which is:
- sum (col 1)
This port will contain the aggregated sum of column 1 for each group defined by columns 2 and 3.
By connecting this port to the next transformation or target, you'll be able to use the aggregated value in
your mapping.
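The Aggregator in this example behaves like the following SQL (the source table name is illustrative):
SELECT col2,
       col3,
       SUM(col1) AS sum_col1
FROM   source_table
GROUP BY col2, col3
The SUM(col1) column corresponds to the output port that you connect downstream.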
98.If you have more than one pipeline in your mapping, how will you
change the order of load?
If you have multiple pipelines in your mapping and you want to change the order in which they load, you can use the following methods:
1. Target Load Plan: In the Mapping Designer, open Mappings > Target Load Plan and reorder the target load order groups; the Integration Service processes the pipelines in the order you specify.
2. Constraint-Based Loading: For multiple targets fed by the same pipeline, enable constraint-based loading in the session properties so that parent tables are loaded before child tables.
Here are some of the key components that are typically included in the exported XML file:
1. Workflow definition: This includes the workflow's structure, tasks, and dependencies.
2. Task definitions: Detailed information about each task in the workflow, including task type, properties,
and settings.
3. Session properties: Configuration details for each session in the workflow, such as source and target
connections, mapping, and transformation settings.
4. Mapping and transformation metadata: Information about the mappings and transformations used in
the workflow, including source and target metadata.
5. Connection information: Details about the connections used in the workflow, including database
connections, file locations, and other relevant settings.
6. Workflow variables: Definitions of workflow variables, including their data types, default values, and
scope.
7. Security and access control: Information about the security settings and access controls applied to the
workflow, including user and group permissions.
So, while the exported XML file does contain the workflow definition, it also includes a wealth of
additional metadata and configuration information that provides context and supports the workflow's
execution.
100. Your session failed and when you try to open the log file, it complains
that the session details are not available. How would you trace the error?
Which log file would you look in?
When a session fails and you're unable to access the session details, tracing the error can be a challenge.
Here's a step-by-step approach to help you identify the issue:
Check the Service-Level Logs
- Informatica Server Log: This log provides information about the Informatica server, including startup and shutdown events, errors, and other significant events.
- Integration Service Log: This log contains information about the Integration Service, including session starts, stops, and errors.
Check the Workflow Log
Because the session log is unavailable, the workflow log is the main log file to seek out. In it, check:
- Session Start and End Times: Verify that the session started and ended as expected.
- Error Messages: Look for error messages related to the session failure.
- Transformation and Mapping Errors: Check for errors related to specific transformations or mappings.
- Workflow Start and End Times: Verify that the workflow started and ended as expected.
- Task Errors: Look for errors related to specific tasks within the workflow.
Check the Configuration
- Incorrect Settings: Verify that the configuration settings are correct and consistent.
- Missing or Corrupted Files: Check for missing or corrupted configuration files.
Consult Additional Resources
- Informatica Documentation: Review the Informatica documentation for troubleshooting guides and error messages.
- Informatica Support: Reach out to Informatica support for further assistance.
By following these steps, you should be able to identify the cause of the session failure and troubleshoot
the issue effectively.
Here's an example of how you can specify the attachment: in the Email Text of the Email Task, reference the file with the %a attachment variable, for example %a</informatica/output/daily_report.csv> (the path is illustrative; on a UNIX-based Integration Service this attaches the named file to the message).
By following these steps, you can attach a file as an email attachment from a particular directory using
the "Email Task" in Informatica.
102. You have a requirement to be alerted of any long-running sessions in
your workflow. How can you create a workflow that will send you an email
for sessions running more than 30 minutes? You can use any method:
shell script, procedure, Informatica mapping, or workflow control.
To create a workflow that sends an email alert for long-running sessions, you can use a combination of
Informatica workflow features and a shell script. Here's one approach:
Informatica Workflow
PowerCenter does not provide a dedicated session-monitor task, so the alert is built from standard workflow tasks:
1. Create a new workflow (or edit the existing one) in Informatica.
2. Place a Timer task in a branch parallel to the session you want to monitor and set it to a relative time of 30 minutes.
3. Link the Timer task to an Email task.
4. On the link (or in a Decision task) between the Timer task and the Email task, add a condition that uses the session's predefined Status workflow variable to confirm the session is still running.
5. If the session is still running when the timer expires, the Email task fires and sends the alert.
Shell Script
Alternatively, write a shell script that queries the repository for sessions that started more than 30 minutes ago and are still running, and sends an email for each one (see the sketch below).
Schedule this script to run at regular intervals (e.g., every 5 minutes) using a scheduler like cron.
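A minimal sketch of the query such a script could run against the repository. The MX view name (REP_SESS_LOG) and its columns are assumptions that vary by PowerCenter version and repository database, so verify them in your environment before relying on this:
SELECT workflow_name,
       session_name,
       actual_start
FROM   rep_sess_log                          -- assumed MX view; check your repository
WHERE  actual_start < SYSDATE - 30/1440      -- started more than 30 minutes ago (Oracle date arithmetic)
AND    session_timestamp IS NULL;            -- assumed marker for "still running"
The script would pipe any rows returned to a mail command such as mailx.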
By using either the Informatica workflow approach or the shell script approach, you can create a system
that alerts you to long-running sessions in your workflow.