FAQ for ETL Testing

Vinayak has 3+ years of experience in manual testing and ETL testing. He currently works at Mindtree and has experience with tools like Informatica, Oracle database, Quality Centre, and basic UNIX commands. He describes his last project architecture which involved extracting data from various sources via FTP into a landing area, then validating, transforming, and loading the data into a data warehouse for reporting. He explains his testing approach of understanding requirements, writing test cases and queries, reviewing with the BA team, preparing test data, and executing tests to validate data loads.

Uploaded by

mustaq

FAQ

• Tell me about yourself.

I am Vinayak. I completed my BCA at Karnataka University, Dharwad, in 2012. I have 3+ years of
experience in manual testing and ETL testing, and I currently work at Mindtree. I have a good
understanding of ETL concepts and good knowledge of the ETL tool Informatica, Oracle database,
Quality Centre, and basic UNIX commands.
I have worked in different domains, including insurance and health care.
Coming to my family background, my family is from a pottery/agriculture background. That is all
about me.

• Explain the project architecture in detail.
My project architecture has different layers.

Data comes from the source systems, some from tables and some from flat files. The files are moved
from the source system to the landing area via FTP. The landing area has three folders: Root,
Reject, and Archive.
The files arrive in the root folder, and the ETL mechanism extracts the data from there. Before
extraction, the ETL mechanism runs pre-validations such as checking the file format, file naming
conventions, delimiters, data formats, and headers and footers. If all validations pass, the ETL
mechanism extracts the data and loads it into the staging area. In the staging area, the
transformations are applied to bring the data into the required business format/logic/rules, and
then the data is loaded into the data warehouse database. Reports are prepared from the data
warehouse.
After this process completes, the file is removed from the root folder and moved to the archive
folder, where it is kept for up to 30 days.
If the file format is not correct, the ETL mechanism does not extract the data, the job fails, and
the file is moved to the reject folder.
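
The pre-validation step described above can be sketched in Python. This is only an illustration: the file naming pattern, the pipe delimiter, the "H|"/"T|" header and footer layout, and the three-column row format are all hypothetical and are not taken from the project.

```python
import re

# Hypothetical conventions (assumptions for illustration only):
# file name like POLICY_20240131.csv, pipe-delimited rows,
# header "H|<date>", footer "T|<record count>".
NAME_PATTERN = re.compile(r"^[A-Z]+_\d{8}\.csv$")
DELIMITER = "|"
EXPECTED_COLUMNS = 3


def validate_file(file_name, lines):
    """Return a list of validation errors; an empty list means the file passes."""
    errors = []
    if not NAME_PATTERN.match(file_name):
        errors.append("bad file name")
    if len(lines) < 2:
        return errors + ["missing header or footer"]
    header, *body, footer = lines
    if not header.startswith("H" + DELIMITER):
        errors.append("bad header")
    if not footer.startswith("T" + DELIMITER):
        errors.append("bad footer")
    elif footer.split(DELIMITER)[1] != str(len(body)):
        errors.append("footer count mismatch")
    for number, line in enumerate(body, start=1):
        if len(line.split(DELIMITER)) != EXPECTED_COLUMNS:
            errors.append(f"row {number}: wrong column count")
    return errors


def route(file_name, lines):
    """Files that pass go on to the staging load; failures go to the reject folder."""
    return "reject" if validate_file(file_name, lines) else "load"


good = ["H|20240131", "1|Asha|100", "2|Ravi|200", "T|2"]
bad = ["H|20240131", "1,Asha,100", "T|5"]
print(route("POLICY_20240131.csv", good))  # load
print(route("policy.txt", bad))            # reject
```

In a real job the error list would normally be written to the log file, which is what the tester reads when checking rejected files.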

Note: Process for checking reject files

We check the log files, identify the errors, correct the file, move it back to the root folder,
and re-run the job.

• Explain your current/last project.

My project belongs to the insurance domain. My client is American Skyline Insurance Company (ASIC)
from the USA, one of the respected players in the health care insurance industry. Global health
reports are a new set of reports to measure the health of retail orders in the manufacturing
system. The client has multiple branches across the USA and sells different policies in each
branch through different agents, so it wants a centralized database (DWH) for those transactions.
From the DWH we generate reports that business analysts use to take business decisions. The
reports are generated branch-wise and agent-wise and show how many policies were sold and how many
claims were made on a quarterly, half-yearly, and yearly basis.

• Tell me some basic validations which you have done in your last project.
• I checked the table structure and data types.
• I checked the source and target record counts.
• I checked for duplicate records in the target table.
• I compared the source and target column mapping.
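
As a rough illustration of the record count check, the sketch below compares source and target counts. It uses Python's built-in sqlite3 module in place of the Oracle database, and the tables src_policy and tgt_policy are made-up names.

```python
import sqlite3

# In-memory SQLite stands in for the real source and target databases (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_policy (policy_id INTEGER, agent TEXT);
    CREATE TABLE tgt_policy (policy_id INTEGER, agent TEXT);
    INSERT INTO src_policy VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
    INSERT INTO tgt_policy VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
""")

source_count = conn.execute("SELECT COUNT(*) FROM src_policy").fetchone()[0]
target_count = conn.execute("SELECT COUNT(*) FROM tgt_policy").fetchone()[0]
counts_match = source_count == target_count
print(source_count, target_count, counts_match)  # 3 3 True
```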

• Apart from these, what other validations have you done?
• I validated surrogate key values.
• I validated lookup table data.
• I validated that fact values are calculated correctly as per the requirement.
• I validated initial and incremental load data.

• What is your testing approach?

• Once we get the requirement from the client, I study it. If there are any doubts, I discuss
them with the technical and BA teams to get them clarified.
• I identify all the test scenarios and then write the test cases.
• I write the SQL queries to validate the data.
• We do peer reviews. After that, we send the test cases to the BA team for final review.
• Then we prepare the test data.
• Once the code is deployed in the test environment, we run the ETL jobs and execute the test
cases.

• What does the test plan document contain?

• Testing approach.
• Testing methodologies to be followed.
• Software and tools used in the project.
• Resources required and the skill sets required of them.
• Entry criteria.

• What are the roles and responsibilities in your current project?
• Understanding requirements by referring to the BRD, FSS, and STM documents.
• Running the ETL jobs/workflows.
• Preparing the test scenarios and test cases.
• Reviewing the test cases with the BA.
• Preparing and running SQL queries to verify dimension and fact tables.
• Verifying in the target database whether the data was loaded successfully after the ETL
process.
• Writing SQL queries to validate the counts of source and target data.
• Performing the column data mapping between the source and target databases.
• Checking for duplicate records.
• Interacting with the BA and development teams to resolve issues.
• Reporting daily testing status.
• Analyzing defects and reporting them in Quality Centre.
Or: finding defects and reporting them to the developer in Quality Centre for closure.
• Writing UNIX commands to validate the data, if necessary.

• Write a query to find duplicate records.

SELECT column_name, COUNT(column_name) FROM table_name GROUP BY column_name
HAVING COUNT(column_name) > 1;

Ex: SELECT id, COUNT(id) FROM emp GROUP BY id HAVING COUNT(id) > 1;
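
To see the duplicate query in action, here is a small sketch that runs it against sample data. It assumes SQLite (through Python's sqlite3 module) rather than Oracle, and the emp rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER, name TEXT);
    INSERT INTO emp VALUES
        (1, 'Asha'), (2, 'Ravi'), (2, 'Ravi'),
        (3, 'Mina'), (3, 'Mina'), (3, 'Mina');
""")

# Each returned row is (id, number of times that id occurs).
duplicates = conn.execute(
    "SELECT id, COUNT(id) FROM emp GROUP BY id HAVING COUNT(id) > 1 ORDER BY id"
).fetchall()
print(duplicates)  # [(2, 2), (3, 3)]
```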

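• Difference between DECODE and NVL?

The DECODE function is used to convert a value from one form to another. It compares an
expression with a list of search values and returns the matching result, like an IF-THEN-ELSE.
The NVL function is used to replace NULL values with some other value.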
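
The sketch below shows the same two ideas on sample data. It is an approximation: SQLite has no DECODE or NVL, so a CASE expression and IFNULL stand in for them, with the Oracle forms shown only in comments. The emp table and its values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER, gender TEXT, commission INTEGER);
    INSERT INTO emp VALUES (1, 'M', 100), (2, 'F', NULL), (3, NULL, 50);
""")

# Oracle form (not run here):
#   DECODE(gender, 'M', 'Male', 'F', 'Female', 'Unknown')
#   NVL(commission, 0)
# SQLite stand-ins: a CASE expression and IFNULL.
rows = conn.execute("""
    SELECT id,
           CASE gender WHEN 'M' THEN 'Male' WHEN 'F' THEN 'Female' ELSE 'Unknown' END,
           IFNULL(commission, 0)
    FROM emp
    ORDER BY id
""").fetchall()
print(rows)  # [(1, 'Male', 100), (2, 'Female', 0), (3, 'Unknown', 50)]
```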

• How many types of joins are there?

• Equi join (inner join)
• Cross join or Cartesian product
• Self join
• Outer join
Outer joins are further divided into left outer join, right outer join, and full outer join.

• Difference between left outer join and right outer join?

• A left outer join returns all rows from the left table, even if there are no matches in the
right table.
Or
A left outer join returns all rows from the left table together with the matching rows from the
right table. Where a left-table row has no match in the right table, the right-table columns
are displayed as NULL.
• A right outer join returns all rows from the right table, even if there are no matches in the
left table.
Or
A right outer join returns all rows from the right table together with the matching rows from
the left table. Where a right-table row has no match in the left table, the left-table columns
are displayed as NULL.
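
A small sketch of both joins on invented emp and dept tables follows. It assumes SQLite via Python's sqlite3 module. Because older SQLite versions do not support RIGHT OUTER JOIN, the right outer join is written as the equivalent left outer join with the tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER, name TEXT, dept_id INTEGER);
    CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
    INSERT INTO emp VALUES (1, 'Asha', 10), (2, 'Ravi', NULL);
    INSERT INTO dept VALUES (10, 'Claims'), (20, 'Sales');
""")

# Left outer join: every employee, with NULL where no department matches.
left_rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM emp e LEFT OUTER JOIN dept d ON e.dept_id = d.dept_id
    ORDER BY e.id
""").fetchall()

# Same result as "emp e RIGHT OUTER JOIN dept d": every department,
# with NULL where no employee matches.
right_rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM dept d LEFT OUTER JOIN emp e ON e.dept_id = d.dept_id
    ORDER BY d.dept_id
""").fetchall()

print(left_rows)   # [('Asha', 'Claims'), ('Ravi', None)]
print(right_rows)  # [('Asha', 'Claims'), (None, 'Sales')]
```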

• What are the conditions for the column mapping (MINUS query)?
• Both queries should select the same number of columns.
• The data types of the corresponding columns should be the same.
• The column order should be the same in both queries.
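
A source-to-target comparison with a MINUS-style query could look like the sketch below. It runs on SQLite, where the operator is spelled EXCEPT rather than Oracle's MINUS, and the table names and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_policy (policy_id INTEGER, agent TEXT);
    CREATE TABLE tgt_policy (policy_id INTEGER, agent TEXT);
    INSERT INTO src_policy VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
    INSERT INTO tgt_policy VALUES (1, 'A1'), (2, 'AX'), (3, 'A3');
""")

# Oracle: SELECT ... FROM src_policy MINUS SELECT ... FROM tgt_policy
# Both sides select the same columns, in the same order, with the same types.
missing_in_target = conn.execute("""
    SELECT policy_id, agent FROM src_policy
    EXCEPT
    SELECT policy_id, agent FROM tgt_policy
""").fetchall()

# Running the comparison in the other direction finds rows only in the target.
extra_in_target = conn.execute("""
    SELECT policy_id, agent FROM tgt_policy
    EXCEPT
    SELECT policy_id, agent FROM src_policy
""").fetchall()

print(missing_in_target)  # [(2, 'A2')]
print(extra_in_target)    # [(2, 'AX')]
```

An empty result in both directions means the mapped columns match row for row.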

• What is the difference between DELETE and TRUNCATE?

Delete:
• Deletes data from the table; the table structure remains the same.
• The deleted data can be rolled back.
• Specific rows can be deleted by using a WHERE clause.
• Deletes the data row by row.
• Low performance.
• Does not release the space allocated to the table.
Truncate:
• Deletes the data from the table; the table structure remains the same.
• The data cannot be rolled back.
• Specific rows cannot be deleted, because TRUNCATE does not support a WHERE clause.
• Deletes all the data at once.
• High performance.
• Releases the space allocated to the table.

• What is the difference between a primary key and a surrogate key?
Primary key:
• A PK is used to maintain unique records in the OLTP database.
• PK values are entered by the user.
• PK values can be alphabetic or numeric.
• PK values belong to the business data/table data.
• PK values need not follow any sequence.
Surrogate key:
• An SK is used to maintain unique records in the data warehouse database.
• SK values are generated by the ETL mechanism/system.
• SK values are numeric only.
• SK values do not belong to the business data/table data.
• SK values are generated in sequential order.

• What is the difference between a function and a procedure?

Function:
• A function must return a value and can be nested inside another function call.
• A function can be called from a SQL statement.
• Used for computations.
• Returns a single value.

Procedure:
• A procedure may or may not return a value.
• A procedure cannot be called from a SQL statement.
• A procedure is normally used to execute business logic.
• Can return more than one value (through OUT parameters).
Note: Stored functions and procedures can improve application performance by reducing network
traffic and CPU load.

• What are the primary key and unique key constraints?
Unique: The unique constraint uniquely identifies each record in a database table.
Primary key: The primary key constraint uniquely identifies each record in a database table.
A primary key must contain unique values.
A primary key column cannot contain NULL values.
Each table should have a primary key, and each table can have only one primary key.

Note:
• The unique and primary key constraints both guarantee uniqueness for a column or set of
columns.
• A primary key constraint automatically has a unique and NOT NULL constraint defined on it.
• You can have many unique constraints per table, but only one primary key constraint per
table.

• What is a foreign key constraint?

A foreign key means that values in one table must also appear in another table.
• The referenced table is called the parent table, while the table with the foreign key is
called the child table. The foreign key in the child table generally references a primary key
in the parent table.
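
The parent/child rule can be demonstrated with a short sketch. It assumes SQLite via Python's sqlite3 module, where foreign key enforcement has to be switched on with a PRAGMA. The dept and emp tables are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);  -- parent table
    CREATE TABLE emp (                                                -- child table
        emp_id INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES dept (dept_id)
    );
    INSERT INTO dept VALUES (10, 'Claims');
    INSERT INTO emp VALUES (1, 10);  -- accepted: dept 10 exists in the parent
""")

try:
    # Rejected: dept 99 does not exist in the parent table.
    conn.execute("INSERT INTO emp VALUES (2, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

child_rows = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
print(orphan_rejected, child_rows)  # True 1
```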

• What is a NOT NULL constraint?

• NOT NULL: The NOT NULL constraint enforces that a column does not accept NULL values.
• The NOT NULL constraint forces a field to always contain a value. This means that you cannot
insert a new record or update a record without supplying a value for this field.

• What is a join?
A SQL join is a query that combines rows from two or more tables, based on a relationship
between certain columns in the tables.

Tables in a database are often related to each other by keys.

A primary key is a column with a unique value for each row. Each primary key value must be
unique within the table.

Join condition:
• Most join queries contain at least one join condition, either in the FROM clause or in
the WHERE clause.
• The join condition compares two columns, each from a different table.

• What is a cross join?
A cross join returns the Cartesian product of the rows from the tables in the join. In other
words, it produces rows that combine each row from the first table with each row from the
second table.
Syntax: SELECT * FROM emp, dept;
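
As a quick check of the Cartesian product, the sketch below runs the cross join on a three-row emp table and a two-row dept table (sample data, SQLite via Python's sqlite3 module) and counts the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT);
    CREATE TABLE dept (dept_name TEXT);
    INSERT INTO emp VALUES ('Asha'), ('Ravi'), ('Mina');
    INSERT INTO dept VALUES ('Claims'), ('Sales');
""")

# No join condition: every emp row is paired with every dept row.
rows = conn.execute("SELECT * FROM emp, dept").fetchall()
print(len(rows))  # 6, that is 3 emp rows x 2 dept rows
```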

• Difference between UNION and UNION ALL?

UNION: The UNION operator combines the result sets of two or more SELECT queries.
• UNION sorts the combined result set and removes the duplicate rows between the various
SELECT statements.
• Each SELECT statement within the UNION query must have the same number of fields in its
result set, with similar data types.
UNION ALL: The UNION ALL operator also combines the result sets of two or more SELECT queries.
• UNION ALL returns all rows from both queries, including duplicates.
• UNION ALL does not remove duplicate records.
• Each SELECT statement within the UNION ALL query must have the same number of fields in its
result set, with similar data types.

UNION ALL is faster than UNION, because UNION's duplicate elimination requires a sorting
operation, which takes time.
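
The difference is easy to see on two small tables that share one value. This sketch assumes SQLite via Python's sqlite3 module; the tables branch_a and branch_b are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch_a (city TEXT);
    CREATE TABLE branch_b (city TEXT);
    INSERT INTO branch_a VALUES ('Pune'), ('Delhi');
    INSERT INTO branch_b VALUES ('Delhi'), ('Goa');
""")

# UNION removes the duplicate 'Delhi'.
union_rows = conn.execute(
    "SELECT city FROM branch_a UNION SELECT city FROM branch_b ORDER BY city"
).fetchall()

# UNION ALL keeps every row from both queries.
union_all_rows = conn.execute(
    "SELECT city FROM branch_a UNION ALL SELECT city FROM branch_b ORDER BY city"
).fetchall()

print(union_rows)      # [('Delhi',), ('Goa',), ('Pune',)]
print(union_all_rows)  # [('Delhi',), ('Delhi',), ('Goa',), ('Pune',)]
```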

• GROUP BY clause, HAVING clause, and WHERE clause?

GROUP BY clause:
The GROUP BY clause can be used in a SELECT statement to collect data from multiple records,
group the records by one or more fields, and apply an aggregate function to each group.
• We normally use a GROUP BY clause with an aggregate expression such as SUM or COUNT.
HAVING clause: The HAVING clause is used in combination with the GROUP BY clause. It can be
used in a SELECT statement to filter the groups that GROUP BY returns.
Note: The HAVING clause was added to SQL because the WHERE keyword cannot be used with
aggregate functions.
WHERE clause: The WHERE clause allows you to filter the rows returned by a SQL statement.
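
The three clauses can be combined in one query, as in the sketch below: WHERE filters rows before grouping, GROUP BY forms the groups, and HAVING filters the groups. It assumes SQLite via Python's sqlite3 module, and the policy table is sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policy (branch TEXT, premium INTEGER, status TEXT);
    INSERT INTO policy VALUES
        ('Austin', 100, 'ACTIVE'), ('Austin', 200, 'ACTIVE'), ('Austin', 500, 'LAPSED'),
        ('Boston', 150, 'ACTIVE'), ('Boston', 50, 'ACTIVE');
""")

rows = conn.execute("""
    SELECT branch, SUM(premium)
    FROM policy
    WHERE status = 'ACTIVE'        -- row filter, applied before grouping
    GROUP BY branch                -- one group per branch
    HAVING SUM(premium) > 250      -- group filter, applied after aggregation
    ORDER BY branch
""").fetchall()
print(rows)  # [('Austin', 300)]
```
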

• LIKE condition or wildcards?
LIKE condition:
The LIKE condition allows you to use wildcards in the WHERE clause of a SQL statement. This
allows you to perform pattern matching. The LIKE condition can be used in any valid SQL
statement.

Wildcards:
% matches any string of any length (including zero length).
_ matches a single character.
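
Both wildcards are shown in the sketch below on a few made-up names. It assumes SQLite via Python's sqlite3 module. Note that SQLite's LIKE ignores case for ASCII letters by default, whereas Oracle's LIKE is case-sensitive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT);
    INSERT INTO emp VALUES ('Asha'), ('Ashok'), ('Ravi'), ('Rani');
""")

# % matches any run of characters after 'Ash'.
starts_with_ash = conn.execute(
    "SELECT name FROM emp WHERE name LIKE 'Ash%' ORDER BY name"
).fetchall()

# _ matches exactly one character in the third position.
ra_any_i = conn.execute(
    "SELECT name FROM emp WHERE name LIKE 'Ra_i' ORDER BY name"
).fetchall()

print(starts_with_ash)  # [('Asha',), ('Ashok',)]
print(ra_any_i)         # [('Rani',), ('Ravi',)]
```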

• Difference between RANK and DENSE_RANK?

RANK:
The RANK function generates rank values based on the ordering values. If two rows get the same
rank, the next rank value is skipped.

DENSE_RANK: DENSE_RANK works like RANK and also generates rank values, but it does not skip
any rank value after a tie.
NULL values are ranked according to where the ORDER BY places them (NULLS FIRST or NULLS LAST).
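
The gap after a tie is easiest to see side by side. The sketch below assumes SQLite 3.25 or later (which added window functions) through Python's sqlite3 module, and the scores table is sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (name TEXT, score INTEGER);
    INSERT INTO scores VALUES ('A', 90), ('B', 90), ('C', 80), ('D', 70);
""")

rows = conn.execute("""
    SELECT name,
           RANK()       OVER (ORDER BY score DESC),
           DENSE_RANK() OVER (ORDER BY score DESC)
    FROM scores
    ORDER BY score DESC, name
""").fetchall()

# A and B tie at rank 1. RANK then skips to 3, DENSE_RANK continues with 2.
print(rows)  # [('A', 1, 1), ('B', 1, 1), ('C', 3, 2), ('D', 4, 3)]
```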

• Inline query?
An inline query (inline view) is an inner query written in the FROM clause of a SQL statement.

• Subquery?
• A subquery (also called an inner query or nested query) is a query within a query. A
subquery is usually added in the WHERE clause of the SQL statement. Most of the time, a
subquery is used when you know how to find a value using a SELECT statement but do not know
the exact value.
• Subqueries are an alternative way of returning data from multiple tables.

• Correlated subquery?

• A subquery is called a correlated subquery when the inner query depends on values from the
outer query. The inner query is executed once for every row processed by the outer query,
because it needs a value from the current outer row before it can run.
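
The three query forms above (inline query, subquery, and correlated subquery) can be compared on one sample table. This sketch assumes SQLite via Python's sqlite3 module, and the emp data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO emp VALUES
        ('Asha', 'Claims', 100), ('Ravi', 'Claims', 200),
        ('Mina', 'Sales', 300), ('John', 'Sales', 500);
""")

# Subquery in WHERE: employees paid more than the overall average (275).
above_overall = conn.execute("""
    SELECT name FROM emp
    WHERE salary > (SELECT AVG(salary) FROM emp)
    ORDER BY name
""").fetchall()

# Inline query in FROM: departments whose average salary exceeds 275.
high_depts = conn.execute("""
    SELECT dept, avg_sal
    FROM (SELECT dept, AVG(salary) AS avg_sal FROM emp GROUP BY dept) d
    WHERE avg_sal > 275
""").fetchall()

# Correlated subquery: the inner query uses e.dept from the current outer row,
# so it is evaluated for each employee against that employee's own department.
above_dept_avg = conn.execute("""
    SELECT name FROM emp e
    WHERE salary > (SELECT AVG(salary) FROM emp WHERE dept = e.dept)
    ORDER BY name
""").fetchall()

print(above_overall)   # [('John',), ('Mina',)]
print(high_depts)      # [('Sales', 400.0)]
print(above_dept_avg)  # [('John',), ('Ravi',)]
```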

• What is a view?
A view is a virtual table that provides access to a subset of columns from one or more tables.
A view can derive its data from one or more tables; the query that produces the output is what
is stored in the view. A view acts like a table, but it does not physically take any space for
data. A view is a good way to present data to other users instead of giving them direct access
to the table. A view in Oracle is nothing but a stored SQL query. The view itself contains no
data.
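
The sketch below shows that a view stores only the query: a row inserted into the base table afterwards appears in the view automatically. It assumes SQLite via Python's sqlite3 module, and emp and emp_public are made-up names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO emp VALUES ('Asha', 'Claims', 100);
    -- The view exposes only a subset of columns and hides salary.
    CREATE VIEW emp_public AS SELECT name, dept FROM emp;
""")

before = conn.execute("SELECT * FROM emp_public ORDER BY name").fetchall()

# Insert into the base table only; the view is never written to.
conn.execute("INSERT INTO emp VALUES ('Ravi', 'Sales', 200)")
after = conn.execute("SELECT * FROM emp_public ORDER BY name").fetchall()

print(before)  # [('Asha', 'Claims')]
print(after)   # [('Asha', 'Claims'), ('Ravi', 'Sales')]
```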

• What are an index and a stored procedure?

INDEX: An index is a pointer to the location of the data, just like the index in a book. The
purpose of an index is to make SQL queries run faster and locate information faster.
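
One way to confirm that a query can use an index is to look at the query plan. The sketch below assumes SQLite via Python's sqlite3 module, where EXPLAIN QUERY PLAN plays the role that EXPLAIN PLAN plays in Oracle. The table and the index name idx_emp_name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER, name TEXT);
    CREATE INDEX idx_emp_name ON emp (name);
""")

# Each plan row ends with a text description of one step of the plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM emp WHERE name = 'Asha'"
).fetchall()
uses_index = any("idx_emp_name" in row[-1] for row in plan)
print(uses_index)  # True when the lookup on name goes through the index
```
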
STORED PROCEDURE:
• A stored procedure is a set of SQL commands that has been compiled and stored on the
database server.
• Once the stored procedure has been stored, client applications can execute it over and over
again without sending its SQL to the database server and without compiling it again.
• Stored procedures improve performance by reducing network traffic and CPU load.
