DataStage Parallel Job Warnings and Their Solutions
By
Sreedevi Boddepalli
1 INTRODUCTION
Anyone who has worked on DataStage will know the kind of warnings that it gives.
Now, while the developer may be well aware of the nature of these warnings and their impact on the data in question, it is often difficult to convince the customer that these warnings are, after all, just warnings.
Personally, I find it easier to remove them from the jobs altogether and avoid a (possibly) unpleasant showdown with the client at a point when time is scarce.
In the course of this document, you will come across several common albeit interesting warnings that are relatively simple to deal with.
We will use this job as an example to walk through the most common DataStage warnings and see how they can be removed.
The first stage, SOURCE_TABLE, reads data from a table called TEST with the structure shown below.
We import the metadata for the table TEST into the SOURCE_TABLE stage.
The Columns tab in the stage is as follows.
Upon running the job, we encounter the following warnings in DataStage Director.
2 ADJUSTING SCALE WARNING
The Adjusting Scale (or Type Conversion) warning is one of the more common warnings occurring in DataStage.
It states that a column cannot be implicitly converted from a number or decimal format to a character format by DataStage internally.
It may also occur when the metadata of a column states the data type to be decimal(38,10) but the developer changes it to number or varchar.
Solution:
Experience shows that DataStage will give this warning whenever a column type is in the
number or decimal format.
This is because, as the warning states, DataStage does not support the floating point
decimal (number) type.
The key to avoiding or removing this error is to convert the column in question to VARCHAR in the SQL query itself and propagate it as a string until you reach a transformer stage, where the developer can easily do a StringToDecimal conversion.
NULL handling is a must here, or you will be stuck with a NULL handling warning.
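As a rough sketch (AMOUNT is a hypothetical decimal(38,10) column in TEST, and the exact cast function depends on your source database), the user-defined SQL in the source stage could read the column as a string:

    SELECT ID, NAME, CAST(AMOUNT AS VARCHAR(42)) AS AMOUNT FROM TEST

On Oracle, TO_CHAR(AMOUNT) achieves the same thing. The column is then defined as Varchar in the stage metadata and carried forward as a string until the transformer.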
Refer to the below screenshot for some insight into how we can rid ourselves of this
pesky warning:
3 IMPLICIT CONVERSION WARNING
We then manually edit the data type in the Columns tab of the SOURCE_TABLE stage.
If we run the job after this change, we will encounter the following Implicit Conversion warning:
The above warning appears if we simply change the data type in the stage without handling the conversion. The data type has to remain uniform until you reach a conveniently located transformer, where you can apply a StringToDecimal conversion as given below.
NULL Handling is strongly recommended in any type of conversion to avoid warnings related to
Nulls.
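A minimal sketch of such a derivation (lnk_src and AMOUNT are the hypothetical link and column names carried over from the SQL example above):

    If IsNull(lnk_src.AMOUNT) Then 0 Else StringToDecimal(lnk_src.AMOUNT)

IsNull and StringToDecimal are standard parallel transformer functions; replace 0 with whatever default suits your requirement, or use SetNull() if the output column is nullable.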
4 CONVERTING NULLABLE SOURCE TO NOT-NULL WARNING
When we propagate a Nullable column to a Not Null column, it will throw warnings like "Converting Nullable Source to Not-Null result".
Now, if we alter the Nullable field for any of the nullable columns as follows
We encounter a similar warning, as a column coming from a nullable source cannot simply have its Nullable field set to 'No'.
Solution:
After you import the metadata for any table and propagate it through the job, care must be taken to ensure that the Nullable property for each column is 'Yes' throughout, i.e. across all stages.
5 NULL HANDLING WARNING
This warning mostly occurs in the Transformer stage. If we use any nullable column in a stage variable without null handling, we will encounter this warning.
To remove this warning, ensure that you perform NULL handling whenever you apply a transformation to a column in DataStage.
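For instance, a stage variable that would otherwise raise the warning can be guarded like this (svName, lnk_in and NAME are hypothetical stage variable, link and column names):

    svName:  If IsNull(lnk_in.NAME) Then '' Else Trim(lnk_in.NAME)

IsNull and Trim are standard parallel transformer functions; the guarded stage variable can then be used safely in output derivations.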
6 TRUNCATION WARNING
Refer to the first screenshot of the column metadata in the SOURCE_TABLE stage.
Now, if we reduce the length of any column to match the length of the target column, we will get this Truncation warning.
To elaborate, assume you are inserting data from Table A into Table B using a straight-through mapping. The length of column 1 in A is 200 and its corresponding column in B has a length of 150. When we manually change the length of column 1 in the source stage itself from 200 to 150, as given below.
The change in the length of the column may not really affect the data being transferred, but it will throw the below warning:
Solution:
When you import the metadata of a column, e.g. of type VARCHAR(200), and you only require a length of up to 150, DO NOT change the length in the DataStage stage manually. Instead, do it in the next available transformer, as given below:
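A minimal sketch of such a derivation (lnk_in.ADDR stands for the hypothetical VARCHAR(200) input column being written to a VARCHAR(150) target):

    lnk_in.ADDR[1,150]

The substring operator [start,length] truncates the value explicitly in the transformer, so the output metadata can safely be defined as VARCHAR(150) without DataStage warning about an implicit truncation.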
7 AGGREGATOR WARNING
It is important to note that the output from the Aggregator should always be of type Double, else a warning will occur.
Therefore, no matter the input data type, change the data type of the aggregated column to Double.
We are counting the number of IDs and storing that value in ROW_NUM. However, we have given ROW_NUM the data type Integer, because of which we encounter the following warning:
Solution:
Always bear in mind that the output column from the Aggregator, i.e. the column storing the count, sum, etc., should have a Double data type.
To avoid/remove this warning, we simply change the data type of the column in question to Double.
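If the column ultimately has to land in a non-Double target, a reasonable pattern (a sketch, not taken from the original walkthrough; lnk_agg is a hypothetical link name) is to keep the Aggregator output as Double and convert it in a downstream transformer derivation, e.g.:

    DfloatToDecimal(lnk_agg.ROW_NUM)

DfloatToDecimal is one of the parallel transformer's type conversion functions; pick the conversion that matches your target type.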
8 DUPLICATE COLUMN NAMES WARNING
This warning appears when columns with the same names (here NAME, ADDR and PHONE) arrive on more than one input link. It should be avoided at all costs, else you may lose important data.
Solution:
The key to avoiding this warning (and the unpleasant data loss) is to give NAME, ADDR
and PHONE different names and then propagate them.
As seen, we rename the columns from one of the links and then propagate them, and the warning is removed.
9 IGNORING DUPLICATE WARNING
The Ignoring Duplicate warning occurs in the Lookup stage. It comes whenever the column being looked up on has the same value in more than one record, i.e. if we look up the table TEST on Name and there are 2 (or more) records with the same name, DataStage gives an Ignoring Duplicate warning.
Solution:
There is no sure-fire way to remove this warning, because more often than not there is a logical explanation for it: our table may have multiple columns (refer to the table description), but the column used for the lookup may have the same value for multiple records.
For example, assume we are required to perform a lookup on Name against our table TEST. If 2 or more records in the table have the same name, then we will get the Ignoring Duplicate warning even if the records as a whole are distinct.
However, using distinct in the SQL or setting ‘Allow Duplicates’ to True will remove
the warning.
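As a quick sketch of the SQL approach (assuming only the Name key is needed from the reference table; otherwise include the other required columns and accept that duplicates on Name may remain):

    SELECT DISTINCT NAME FROM TEST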
Bear in mind that these methods of warning removal are based purely on
requirements.
10 NULL FIELD EXPORT WARNING
This warning occurs because the column that is being exported to (written to) the flat file contains NULL values.
Click on Nullable for the column in question, then simply set the Null field value property, specifying the value in the textbox that appears, and you're good to go!
Alternatively, you can perform NULL handling in the transformer prior to the Sequential File stage. Then, in the Sequential File stage, go to the Format tab and click on Field Defaults; from the list appearing in the bottom-right area, click on Null field value and set it to '' or whatever you prefer.
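A minimal sketch of the transformer-side null handling (lnk_in and ADDR are hypothetical link and column names): the derivation for a nullable column could be

    NullToValue(lnk_in.ADDR, '')

or, equivalently, If IsNull(lnk_in.ADDR) Then '' Else lnk_in.ADDR. NullToValue and IsNull are standard parallel transformer null handling functions.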
11 HASH TABLE WARNING
This warning is encountered in jobs having an Aggregator stage, and it looks like this:
It occurs when the Method option in the Options property of the Aggregator is 'Hash'.
This is because, when we perform an aggregation on large volumes of data, the hash table (which stores temporary data) grows to a size beyond its capacity, resulting in the above warning.
Solution:
All you have to do is switch the Method option in the Options property of the Aggregator to 'Sort', as shown below. Note that the Sort method expects the input data to be sorted on the grouping keys, so make sure the data reaching the Aggregator is sorted if it is not already.
If you look at the Information panel (bottom right) in the above screenshot, it'll even tell you which option to use when!
12 PARTITIONING WARNING
This warning is a little dicey because removing it depends purely on your requirements.
It looks something like this:
This warning is basically the result of partitioning data on the same key in
successive stages.
In the warning illustrated above, we see that the warning occurs in the Sort stage because
we are partitioning on the same keys in successive stages, in this case the Join stage and
the Sort stage.
To remove this warning, go to the stage prior to the stage highlighted in the warning (in our case the Join stage) and, in its Stage tab, set Preserve Partitioning to 'Clear'.
Now click OK, compile and run your job, and the warning will be gone.
13 SUMMARY
To summarize the above, refer to the embedded Excel sheet, DataStage Warnings Summarized.xls.
14 CONCLUSION
I hope this document has provided you with enough insight into warning fixes in DataStage parallel jobs. By following the step-by-step methods provided in this document, you can easily develop warning-free designs in DataStage. All the best!