SAS Cleansing

This document contains SAS code to clean and standardize supplier name data from an Access database. It imports the data, examines the contents, identifies unique supplier names, cleans the names by removing abbreviations and special characters, links the cleaned names back to the original data, and outputs the results.

/*PROC IMPORT OUT= WORK.Products
    DATATABLE= "Products"
    DBMS=ACCESS2000 REPLACE;
    DATABASE="C:\DataWarehousing04s\SASDataQuality.mdb";
RUN;

Proc Contents Data= Products;
Run;

Proc Freq Data= Products;
    Tables Supplier / nopercent nocum;
Run;

Data Products2 Changed;
    Set Products;
    * Normalize corporate suffixes and punctuation in the supplier name;
    Supplier= Tranwrd(supplier, "Incorporated", "Inc.");
    Supplier= Tranwrd(supplier, "Inc.", "");
    Supplier= Tranwrd(supplier, "Company", "Co.");
    Supplier= Tranwrd(supplier, ",", " ");
    * Tranwrd matches substrings so "and" inside a longer word is replaced too;
    Supplier= Tranwrd(supplier, "and", "&");
    * If the name ends in Co. then every Co. is blanked out and the row is also written to Changed;
    If scan(Reverse(Trim(supplier)),1) = "oC" then do;
        Supplier= Tranwrd(supplier, "Co.", "");
        Output changed;
    End;
    * Same treatment for a name whose last word is a lone ampersand;
    If scan(Reverse(Trim(supplier)),1, " ") = "&" then do;
        Supplier= Tranwrd(supplier, "&", "");
        Output changed;
    End;
    Department= Substr(pcode,1,2);
    * Hand-written corrections for known variants and misspellings;
    If Supplier= "Trinkets and Things" then supplier= "Trinkets n' Things";
    If Supplier= "Trinkets & Things" then supplier= "Trinkets n' Things";
    If Supplier= "R.C.I." then supplier= "RCI";
    If Supplier= "JJ Higgins &" then supplier= "JJ Higgins";
    If Supplier= "Programmer's Coice" then supplier= "Programmer's Choice";
    If Supplier= "L&s Alive" then supplier= "L&s Alive!";
    Output Products2;
Run;

Proc Freq Data= Products2;
    Tables supplier / out= SupplierNames;
Run;

Data SupplierNames;
    Keep Supplier;
    Length Supplier $30;
    Set suppliernames;
Run;

Proc print data= suppliernames;
Run;
*/
Proc sort data= Products;
    By supplier;
Run;

Data GoodSupplier ErrorSupplier All;
    Keep supplier inall ingood;
    Length Supplier $30;
    Merge Products (in= all) SupplierNames (in= good);
    By supplier;
    * Copy the temporary IN= flags into variables that are kept in the output;
    Inall= all;
    Ingood= good;
    * Keep only rows that come from Products;
    If all;
    If good then output GoodSupplier;
    Else output ErrorSupplier;
    Output all;
Run;

Proc print data= GoodSupplier;
    Var supplier;
    Format supplier $30.;
Run;

Proc print data= ErrorSupplier;
    Var Supplier;
    Format supplier $30.;
Run;

Proc print data= all;
    Format supplier $30.;
Run;