Etl
3) Mention what are the types of data warehouse applications and what is the
difference between data mining and data warehousing?
Info Processing
Analytical Processing
Data Mining
Additive Facts
Semi-additive Facts
Non-additive Facts
Cubes are data processing units composed of fact tables and dimensions from
the data warehouse. They provide multi-dimensional analysis.
OLAP stands for Online Analytical Processing. An OLAP cube stores large
amounts of data in multi-dimensional form for reporting purposes. It consists
of facts, called measures, categorized by dimensions.
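As a rough illustration of measures categorized by dimensions, the sketch below
sums one measure over chosen dimensions in plain Python. The rows, the column
names (region, year, product, sales) and the roll_up helper are all
hypothetical, and a real OLAP engine pre-computes and indexes such aggregates
rather than scanning rows like this.

```python
from collections import defaultdict

# Hypothetical fact rows: dimension values plus one measure (sales).
FACT_ROWS = [
    {"region": "East", "year": 2023, "product": "A", "sales": 100},
    {"region": "East", "year": 2023, "product": "B", "sales": 50},
    {"region": "West", "year": 2023, "product": "A", "sales": 70},
    {"region": "East", "year": 2024, "product": "A", "sales": 120},
]


def roll_up(rows, dimensions, measure):
    """Sum a measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# The same facts viewed along one dimension, then along two.
by_region = roll_up(FACT_ROWS, ["region"], "sales")
by_region_year = roll_up(FACT_ROWS, ["region", "year"], "sales")
```

Choosing a different list of dimensions gives a different slice of the same
facts, which is the idea behind multi-dimensional analysis.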
Tracing level is the amount of data stored in the log files. Tracing levels can
be classified into two types: Normal and Verbose. The Normal level logs
information in a detailed manner, while the Verbose level logs information for
each and every row.
The grain of a fact can be defined as the level at which the fact information
is stored. It is also known as fact granularity.
A fact table without measures is known as a factless fact table. It can be used
to view the number of occurring events. For example, it is used to record an
event such as the employee count in a company.
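A minimal sketch of the idea, assuming a hypothetical attendance table whose
rows hold only dimension keys (a date and an employee id) and no numeric
measure. The only thing to analyze is how many rows exist.

```python
from collections import Counter

# Hypothetical factless fact rows: dimension keys only, no measure column.
# Each row records that an employee was present on a given date.
ATTENDANCE = [
    ("2024-01-02", "emp_1"),
    ("2024-01-02", "emp_2"),
    ("2024-01-03", "emp_1"),
]


def count_events(rows):
    """Count rows per date; the row count itself acts as the measure."""
    return Counter(date for date, _employee in rows)


events_per_day = count_events(ATTENDANCE)
```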
Round-Robin Partitioning:
Hash Partitioning:
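The two partitioning schemes named above can be sketched in a few lines. In
round-robin partitioning, rows are dealt out to the partitions in turn so that
each partition gets roughly the same number of rows. In hash partitioning, a
hash of a key column picks the partition, so rows with the same key land
together. The function names and the row layout are hypothetical, and an ETL
tool does this internally rather than in user code.

```python
import zlib


def round_robin_partition(rows, n_partitions):
    """Deal rows out in turn so partition sizes stay balanced."""
    partitions = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        partitions[i % n_partitions].append(row)
    return partitions


def hash_partition(rows, n_partitions, key):
    """Send rows with the same key value to the same partition."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        # crc32 is used instead of hash() so the result is stable across runs.
        bucket = zlib.crc32(str(row[key]).encode("utf-8")) % n_partitions
        partitions[bucket].append(row)
    return partitions
```

Round-robin balances load but scatters related rows, while hash partitioning
keeps groups together at the cost of possibly uneven partition sizes.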
14) Using SSIS (SQL Server Integration Services) what are the possible ways to
update a table?
15) In case you have non-OLEDB (Object Linking and Embedding Database)
source for the lookup what would you do?
If you have a non-OLEDB source for the lookup, you have to use a cache to load
the data and use it as the source.
16) In what case do you use dynamic cache and static cache in connected and
unconnected transformations?
Dynamic cache is used when you have to update a master table and for
slowly changing dimensions (SCD) type 1.
Static cache is used for flat files.
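To make the distinction concrete, here is a small sketch that models the lookup
cache as a Python dict. A static cache is built once and only read, while a
dynamic cache is inserted into or updated as rows flow through, which is what
an SCD type 1 overwrite needs. The function names and the action labels are
hypothetical and do not mirror any tool's API.

```python
def lookup_static(cache, key):
    """Static cache: read-only lookup; the cache never changes."""
    return cache.get(key)


def run_with_dynamic_cache(cache, incoming_rows):
    """Dynamic cache: insert or overwrite entries as rows pass through."""
    actions = []
    for key, value in incoming_rows:
        if key not in cache:
            cache[key] = value          # new key: insert into the cache
            actions.append((key, "insert"))
        elif cache[key] != value:
            cache[key] = value          # changed value: overwrite (type 1)
            actions.append((key, "update"))
        else:
            actions.append((key, "unchanged"))
    return actions
```

Because the dynamic cache changes during the run, a later row with the same key
sees the value written by an earlier row.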
17) Explain what are the differences between Unconnected and Connected
lookup?
Connected Lookup | Unconnected Lookup
Connected lookup participates in the mapping | It is used when a lookup function is used instead of an expression transformation while mapping
Static or dynamic cache can be used for connected lookup | Unconnected lookup has only static cache
Connected lookup supports user-defined default values | Unconnected lookup does not support user-defined default values
In connected lookup, multiple columns can be returned from the same row or inserted into the dynamic lookup cache | Unconnected lookup designates one return port and returns one column from each row
19) Explain what is the difference between OLAP tools and ETL tools?
An ETL tool is meant for extracting data from legacy systems and loading it
into a specified database, with some data cleansing along the way.
With the PowerConnect option, you can extract SAP data using Informatica.
Install and configure the PowerConnect tool.
Import the source into the Source Analyzer. PowerConnect acts as a
gateway between Informatica and SAP. The next step is to generate the
ABAP code for the mapping; only then can Informatica pull data from
SAP.
PowerConnect is used to connect to and import sources from external
systems.
21) Mention what is the difference between Power Mart and Power Center?
Power Center | Power Mart
Meant to process huge volumes of data | Meant to process low volumes of data
It supports ERP sources such as SAP, PeopleSoft, etc. | It does not support ERP sources
It converts a local repository into a global repository | It has no provision to convert a local repository into a global repository
22) Explain what staging area is and what is the purpose of a staging area?
Data staging is an area where you hold the data temporarily on the data
warehouse server. Data staging includes the following steps
A BUS schema is used to identify the common dimensions across the various
business processes. It comes with conformed dimensions along with a
standardized definition of information.
Data purging is the process of deleting data from the data warehouse. It
deletes junk data such as rows with null values or extra spaces.
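A minimal purging sketch in Python, under one assumed reading of the rule
above: rows that contain a null or an all-blank value are dropped, and
surrounding spaces are trimmed from the values that remain. The purge_junk
name and the dict-per-row layout are hypothetical.

```python
def is_junk(value):
    """Treat None and blank-only strings as junk values."""
    return value is None or (isinstance(value, str) and not value.strip())


def purge_junk(rows):
    """Drop rows that contain junk values; trim spaces in the kept rows."""
    cleaned = []
    for row in rows:
        if any(is_junk(v) for v in row.values()):
            continue  # purge the whole row
        cleaned.append(
            {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        )
    return cleaned
```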
Schema objects are the logical structures that directly refer to the database's
data. Schema objects include tables, views, sequences, synonyms, indexes,
clusters, functions, packages and database links.
How do you validate, for each and every record, whether the values in source and target are the same?
Extraction :
Take data from an external source and move it to the warehouse pre-processor
database.
Transformation:
Transform data task allows point-to-point generating, modifying and transforming
data.
Loading:
Load data task adds records to a database table in a warehouse.
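The three steps above can be sketched end to end with the Python standard
library. Everything concrete here is an assumption made for illustration: the
inline CSV stands in for the external source, an in-memory SQLite database
stands in for the warehouse, and the sales table and its columns are
hypothetical. The separate pre-processor database is skipped for brevity.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice it comes from an external system.
SOURCE_CSV = "id,name,amount\n1, alice ,10.5\n2,BOB,20\n"


def extract(text):
    """Extraction: read raw records from the external source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transformation: convert types and normalize the name field."""
    return [
        (int(r["id"]), r["name"].strip().title(), float(r["amount"]))
        for r in records
    ]


def load(conn, rows):
    """Loading: add the records to a table in the warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
```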
2. Question 2. What Is The Difference Between Etl Tool And Olap Tools?
Answer :
An ETL tool is meant for extracting data from legacy systems and loading it
into a specified database, with some data cleansing along the way.
e.g. Informatica, DataStage, etc.
1. Question 16. What Is A Mapping, Session, Worklet, Workflow, Mapplet?
Answer :
o A mapping represents dataflow from sources to targets.
o A mapplet creates or configures a set of transformations.
o A workflow is a set of instructions that tell the Informatica server how to
execute the tasks.
o A worklet is an object that represents a set of tasks.
o A session is a set of instructions that describe how and when to move
data from sources to targets.
1. Question 59. What Is The Difference Between Power Center & Power Mart?
Answer :
PowerCenter - ability to organize repositories into a data mart domain and share
metadata across repositories.
PowerMart - only local repository can be created.
2. Question 60. What Are Snapshots? What Are Materialized Views & Where
Do We Use Them? What Is A Materialized View Log?
Answer :
Snapshots are read-only copies of a master table located on a remote node.
They are periodically refreshed to reflect changes made to the master table.
Snapshots are mirrors or replicas of tables.
Views are built using the columns from one or more tables. A view based on a
single table can be updated, but a view based on multiple tables cannot.
That is, insert, update and delete are possible on a view only if it has one
base table; if the view is based on columns from more than one table, insert,
update and delete are generally not possible.
Materialized view
A pre-computed table comprising aggregated or joined data from fact and possibly
dimension tables. Also known as a summary or aggregate table.
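The pre-computed nature of a materialized view can be imitated in SQLite,
which has no native materialized views, by storing a query result as a table
and rebuilding it on demand. This is only a sketch of the concept: the
fact_sales and sales_by_region tables and the refresh_summary helper are
hypothetical, and databases with real materialized views use their own
CREATE and refresh statements instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)",
    [("East", 100.0), ("East", 50.0), ("West", 70.0)],
)


def refresh_summary(conn):
    """Rebuild the pre-computed aggregate table from the fact table."""
    conn.execute("DROP TABLE IF EXISTS sales_by_region")
    conn.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM fact_sales GROUP BY region"
    )


def read_summary(conn):
    """Read the stored aggregate without touching the fact table."""
    return conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()


refresh_summary(conn)
summary = read_summary(conn)
```

Unlike an ordinary view, the stored result does not change when the fact table
changes; it stays as it was until the next refresh.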