DATASTAGE

Q.

Source table:

name
----
A
A
B
B
B
C
C
D

The source table contains data like this, but I want the target table to look like this:

name count
---- -----
A    1
A    2
B    1
B    2
B    3
C    1
C    2
D    1

Soln:

Seq ---> Transformer ---> Seq

Go to the Transformer stage properties and create two stage variables, s2 and s1, in this order (s2 above s1, so that s2 compares against the previous row's value):

s2 = if input.name = s1 then s2 + 1 else 1
s1 = input.name

Create a new output column, count, and set its derivation to s2.
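The stage-variable logic above can be sketched in plain Python (this is a simulation of the Transformer behaviour, not DataStage code); the key point is that s2 is evaluated before s1 is updated, so it compares each row against the previous row's key:

```python
def add_count(names):
    """Input must be sorted on the key, as on a sorted DataStage link."""
    s1, s2 = None, 0                        # s1: previous name, s2: running count
    out = []
    for name in names:
        s2 = s2 + 1 if name == s1 else 1    # s2 = if input.name = s1 then s2 + 1 else 1
        s1 = name                           # s1 = input.name
        out.append((name, s2))              # output column count = s2
    return out
```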

Q.

If a parallel or sequential job does not have a configuration file, will it run successfully? (If the job throws an error, what type of error will it be?)

Soln:

Without a configuration file, the job won't even compile.


Q.

How can we delete duplicates using a Transformer (without a Remove Duplicates stage)?

Soln:

The prerequisite is that the data is partitioned and sorted on a key. Then define two stage variables, S2 above S1, with these derivations:

S2 = if input.column = S1 then 0 else 1
S1 = input.column

In the output constraint: S2 = 1

This keeps only the first row of each key group, giving the expected result.
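As with the running-count example, the constraint logic can be sketched in Python (a simulation of the Transformer, not DataStage code): S2 flags the first row of each key group, and the constraint drops the rest.

```python
def remove_duplicates(values):
    """Keep only the first row of each key group; input must be sorted on the key."""
    s1 = None
    out = []
    for col in values:
        s2 = 0 if col == s1 else 1   # S2 = if input.column = S1 then 0 else 1
        s1 = col                     # S1 = input.column
        if s2 == 1:                  # constraint: S2 = 1
            out.append(col)
    return out
```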

Q.

I have a source file having data like:


10
10
10
20
20
20
30
30
40
40
50
60
70
I want three outputs from the above input file:
1) Only unique values, with no duplicates, like:
10
20
30
40
50
60
70
2) Only the records whose value is duplicated, with no unique records, like:
10
10
10
20
20
20
30
30
40
40
3) Only the records that occur exactly once, like:
50
60
70
How can I achieve this using DataStage 8.5?

Soln:

Source file --> Copy stage --> 1st link --> Remove Duplicates stage --> output file 1 (10, 20, 30, 40, 50, 60, 70)

Copy stage --> 2nd link --> Aggregator stage (creates the row count per key) --> Filter stage with two output links:
Filter 1 (count > 1) --> output file 2 (10, 20, 30, 40)
Filter 2 (count = 1) --> output file 3 (50, 60, 70)
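The intended three-way split can be sketched in Python (a simulation of the job's logic, not DataStage code). Note that the Aggregator collapses the data to one row per key, so to emit every duplicate row, as in the requested output 2, the original rows must be retained alongside the counts:

```python
from collections import Counter

def split_three(rows):
    counts = Counter(rows)                         # Aggregator: row count per key
    unique = sorted(counts)                        # Remove Duplicates output
    dups = [r for r in rows if counts[r] > 1]      # Filter 1: key count > 1
    singles = [r for r in rows if counts[r] == 1]  # Filter 2: key count = 1
    return unique, dups, singles
```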

Q.

Details:
Col
-----
C1
C2
C3
C4
C5
C6
C7
C8
C9
C10

We want to generate output like this:


Col1 Col2 Col3
C1 C2 C3
C4 C5 C6
C7 C8 C9
C10

Can anyone please help me with this scenario?

Soln:

Use the constraints below in the Transformer to route the data into 3 different output links. Set the Transformer to run sequentially.

mod(@INROWNUM, 3) = 1
mod(@INROWNUM, 3) = 2
mod(@INROWNUM, 3) = 0

This routes the first record to the first output link, the second to the second, and the third to the third.
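Since the three constraints cycle the rows round-robin into three links, the combined result is equivalent to folding the single column into rows of three, which can be sketched in Python (a simulation of the result, not DataStage code):

```python
def to_three_columns(values):
    # mod(@INROWNUM, 3) = 1 / 2 / 0 routes rows 1, 2, 3 to columns 1, 2, 3;
    # equivalently, chunk the single column into rows of three values.
    return [values[i:i + 3] for i in range(0, len(values), 3)]
```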

Q.

I have a sequential file with some records, and the same file has a header and a footer. How do I count the records in the file without counting the header and footer records, transform the records into the target, and then attach the header and footer back to that file?

Soln:

Cat | sed 1d ; $d | wc -l

sed 1d --- for removing 1st row


sed $d ---- for removing last row

Q.

input
-------------
name | no
--------------------
Bose 1
Mani 2
Arun 3

Output
-------------
name | no
--------------------
Bose 1
Mani 2
Mani 2
Arun 3
Arun 3
Arun 3

How do we get this output using a Transformer stage?

Soln:

Seq ---> Transformer ---> Dataset

Seq: load the input file.

Transformer:
1) Sort: stage properties -> Input -> Partitioning -> partition type = Hash -> select one column -> tick the "Perform sort" check box.
2) In the loop condition: Loop While -> right-click, select system variable -> @ITERATION <= DSLink.no
3) In the link derivations:
DSLink.name --- Name
@ITERATION --- Required_NumRows

Target: give the target file; the extension is .ds.

O/p:
Name No Required_NumRows
BOSE 1 1
MANI 2 2
MANI 2 2
ARUN 3 3
ARUN 3 3
ARUN 3 3
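The loop condition above emits each input row once per iteration, i.e. `no` times per row. A Python sketch of that replication (a simulation of the Transformer loop, not DataStage code):

```python
def replicate(rows):
    """Each (name, no) row is emitted `no` times, like Loop While @ITERATION <= DSLink.no."""
    out = []
    for name, no in rows:
        for _iteration in range(1, no + 1):   # @ITERATION runs 1 .. no
            out.append((name, no))
    return out
```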

Q.

In the case of only dropping columns (without any transformations or business rules), we can use a Copy stage instead of a Transformer. But can anyone tell me exactly why the Copy stage performs better than the Transformer?

Soln:

Look Copy stage is made up for only copying data and transformer stage
is made up for multiple functionalities and also it is made up of c++ code.
So if you want to only copy data and rename datatypes so you can go with
copy stage because if you use transformer stage it will call all c++
functions for running single operation and it will consume too much time
for single functionality so here copy stage is better than transformer
stage. But you want to multiple functionalities like copy ,and want to use
mathematical functions, miscellaneous functions ,logical etc so
transformer stage is good for multitasking .

Q.

Which partitioning method should be used with Join, Merge, Lookup, and Remove Duplicates?

Soln:

For Join, Merge, and Remove Duplicates, the data on the input links should be hash-partitioned and sorted on the specified key columns. For Lookup, the primary link needs to be hash-partitioned and sorted, and the reference link has to use the Entire partitioning method.

Q.

How can performance be improved? (Give at least 5 performance-tuning points.)

Soln:

1. Filter first, then extract; do not extract and then filter. Use user-defined SQL instead of the table method when extracting. For example, if 1 million records come from the input table but the job has a filter condition (Acct_Type = 'S') per the business documents that yields only a few records, say 100, push that filter into the SQL.
2. Reduce the number of Transformer stages as much as possible.
3. Reduce the number of stage variables.
4. Use a Copy stage instead of a Transformer for simple operations such as:
   - a placeholder between stages
   - renaming columns
   - dropping columns
   - implicit (default) type conversions
5. Handle null and duplicate values properly.
6. Do not use more than 20 stages in a particular job.

Q.

input file A contains12345678910input file B


contains6789101112131415Output file X contains12345Output file y
contains678910Output file z contains1112131415How can we do in this in
a single ds job in px ?.

Soln:
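The source does not include the DataStage answer here; a common approach is a full outer Join on the value with Transformer constraints routing matched and unmatched rows. The required split logic itself can be sketched in Python (an illustration of the requirement, not the missing DataStage solution):

```python
def split_files(a, b):
    sa, sb = set(a), set(b)
    x = sorted(sa - sb)    # output X: only in A
    y = sorted(sa & sb)    # output Y: in both A and B
    z = sorted(sb - sa)    # output Z: only in B
    return x, y, z
```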
