Creating SAS Data Sets

Uploaded by

Gaurav Chandra
Chapter 6

Creating SAS Data Sets

Section 6.1
Reading Raw Data Files:
Column Input

Objectives

Create a temporary SAS data set from a raw data file.


Create a permanent SAS data set from a raw data file.
Explain how the DATA step processes data.
Read standard data using column input.

Accessing Data Sources


[Diagram: each data source reaches a SAS data set through its own conversion process.]

Source               Conversion Process
Data entry           FSEDIT or FSVIEW
Raw data file        DATA step
Other software file  SAS/ACCESS software

Reading Raw Data Files


Data for flights from New York to Dallas (DFW) and Los
Angeles (LAX) are stored in a raw data file. Create a SAS
data set from the raw data.

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
98212/12/00dfw  5 85
43912/13/00LAX 14196
98212/13/00DFW 15116
43112/14/00LaX 17166
98212/14/00DFW  7 88
11412/15/00LAX   187
98212/15/00DFW 14 31

Description             Column
Flight Number           1-3
Date                    4-11
Destination             12-14
First Class Passengers  15-17
Economy Passengers      18-20

Creating a SAS Data Set

In order to create a SAS data set from a raw data file, you must
1. start a DATA step and name the SAS data set being created (DATA statement)
2. identify the location of the raw data file to read (INFILE statement)
3. describe how to read the data fields from the raw data file (INPUT statement).

Raw Data File
         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

DATA Step
data SAS-data-set-name;
   infile 'raw-data-filename';
   input input-specifications;
run;

SAS Data Set
Flight  Date      Dest  FirstClass  Economy
439     12/11/00  LAX           20      137
921     12/11/00  DFW           20      131
114     12/12/00  LAX           15      170

Creating a SAS Data Set


General form of the DATA statement:
DATA libref.SAS-data-set(s);

Example: This DATA statement creates a temporary
SAS data set named dfwlax:
data work.dfwlax;

Example: This DATA statement creates a permanent
SAS data set named dfwlax:
libname ia 'SAS-data-library';
data ia.dfwlax;

Pointing to a Raw Data File


General form of the INFILE statement:
INFILE filename <options>;

Examples:
OS/390
infile 'edc.prog1.dfwlax';
UNIX
infile '/users/userid/dfwlax.dat';
Windows
infile 'c:\workshop\winsas\prog1\dfwlax.dat';

The PAD option in the INFILE statement is useful for
reading variable-length records typically found in
Windows and UNIX environments.
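
The effect of PAD can be sketched with a small file in which one record stops before the last two fields. The file name dfwlax_demo.dat, the data set name work.padded, and the shortened second record are assumptions made for this illustration; they are not part of the course data, and the path construction assumes a directory-based host such as Windows or UNIX.

```sas
/* Hypothetical demo file, written to the directory of the WORK library. */
%let rawfile = %sysfunc(pathname(work))/dfwlax_demo.dat;

data _null_;
   file "&rawfile";
   put '43912/11/00LAX 20137';
   put '11412/15/00LAX';          /* short record: ends at column 14 */
   put '98212/15/00DFW 14 31';
run;

/* PAD pads each record with blanks to the logical record length,  */
/* so columns 15-20 of the short record are read as missing values. */
data work.padded;
   infile "&rawfile" pad;
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
run;

proc print data=work.padded;
run;
```

Without PAD, the INPUT statement would reach past the end of the short record, and under the default FLOWOVER behavior SAS would move to the next record to look for the remaining values.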

Reading Data Fields


General form of the INPUT statement:
INPUT input-specifications;

input-specifications
names the SAS variables
identifies the variables as character or numeric
specifies the locations of the fields in the raw data
can be specified as column, formatted, list, or named input.

Reading Data Using Column Input


Column input is appropriate for reading
data in fixed columns
standard character and numeric data.
General form of a column INPUT statement:
INPUT variable <$> startcol-endcol . . . ;

Examples of standard numeric data:
15    -15    15.4    +1.23    1.23E3    -1.23E-3
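
A minimal sketch of column input follows. It reads three of the flight records as in-stream data with a DATALINES statement instead of an INFILE statement, which is an assumption made so that the step runs without an external file; the data set name work.colinput_demo is hypothetical.

```sas
data work.colinput_demo;
   /* $ marks a character variable; each pair of numbers gives the */
   /* starting and ending columns of the field.                    */
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
   datalines;
43912/11/00LAX 20137
98212/12/00dfw  5 85
11412/15/00LAX   187
;
run;

proc print data=work.colinput_demo;
run;
```

Columns 15-17 of the third record are blank, so FirstClass is assigned a missing value for that observation.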

The Raw Data


Description             Column
Flight Number           1-3
Date                    4-11
Destination             12-14
First Class Passengers  15-17
Economy Passengers      18-20

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
98212/12/00dfw  5 85
43912/13/00LAX 14196
98212/13/00DFW 15116
43112/14/00LaX 17166
98212/14/00DFW  7 88
11412/15/00LAX   187
98212/15/00DFW 14 31

Reading Data Using Column Input


Read the raw data file using column input.

Raw Data File
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

DATA Step
data SAS-data-set-name;
   infile 'raw-data-filename';
   input variable <$> startcol-endcol ...;
run;

SAS Data Set
Flight  Date      Dest  FirstClass  Economy
439     12/11/00  LAX           20      137
921     12/11/00  DFW           20      131
114     12/12/00  LAX           15      170

Reading Data Using Column Input


         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

input Flight $ 1-3 Date $ 4-11
      Dest $ 12-14 FirstClass 15-17
      Economy 18-20;

Create Temporary SAS Data Sets


Store the dfwlax data set in the work library.
data work.dfwlax;
infile 'raw-data-file';
input Flight $ 1-3 Date $ 4-11
Dest $ 12-14 FirstClass 15-17
Economy 18-20;
run;

NOTE: The data set WORK.DFWLAX has 10 observations and 5
      variables.

c06s1d1

Create Permanent SAS Data Sets


Alter the previous DATA step to permanently store the
dfwlax data set.
libname ia 'SAS-data-library';
data ia.dfwlax;
infile 'raw-data-file';
input Flight $ 1-3 Date $ 4-11
Dest $ 12-14 FirstClass 15-17
Economy 18-20;
run;
NOTE: The data set IA.DFWLAX has 10 observations and 5
variables.

20

c06s1d2

Looking Behind the Scenes


The DATA step is processed in two phases:
compilation
execution.
data work.dfwlax;
infile 'raw-data-file';
input Flight $ 1-3 Date $ 4-11
Dest $ 12-14 FirstClass 15-17
Economy 18-20;
run;

21

Looking Behind the Scenes


At compile time, SAS creates
an input buffer to hold the current raw data file record
that is being processed
a program data vector (PDV) to hold the current
SAS observation

Flight  Date  Dest  FirstClass  Economy
$ 3     $ 8   $ 3   N 8         N 8

the descriptor portion of the output data set.

Flight  Date  Dest  FirstClass  Economy
$ 3     $ 8   $ 3   N 8         N 8
Compiling the DATA Step


data work.dfwlax;
   infile 'raw-data-file';
   input Flight $ 1-3 Date $ 4-11
         Dest $ 12-14 FirstClass 15-17
         Economy 18-20;
run;

[Animation: SAS first creates the input buffer, then adds each variable in the INPUT statement to the PDV one at a time, in the order Flight, Date, Dest, FirstClass, Economy. At the end of compilation, the same five variables are recorded in the descriptor portion of dfwlax.]

Input Buffer

PDV
Flight  Date  Dest  FirstClass  Economy
$ 3     $ 8   $ 3   N 8         N 8

dfwlax descriptor portion
Flight  Date  Dest  FirstClass  Economy
$ 3     $ 8   $ 3   N 8         N 8

Executing the DATA Step


data work.dfwlax;
   infile 'raw-data-file';
   input Flight $ 1-3 Date $ 4-11
         Dest $ 12-14 FirstClass 15-17
         Economy 18-20;
run;

Raw Data
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

[Animation: the slides step through execution one action at a time, in this sequence.]
1. The variables in the PDV are initialized to missing: blank for the character variables and a period (.) for FirstClass and Economy.
2. The INPUT statement reads the first record into the input buffer.
3. The INPUT statement assigns values to the PDV one variable at a time: Flight is 439, Date is 12/11/00, Dest is LAX, FirstClass is 20, and Economy is 137.
4. Automatic output: the values in the PDV are written to dfwlax as the first observation.
5. Automatic return: control returns to the top of the DATA step.
6. The variables in the PDV are reinitialized to missing.
7. The second record is read, the PDV is filled (921, 12/11/00, DFW, 20, 131), and the second observation is written to dfwlax.
8. The same cycle repeats for the third record.

PDV after the INPUT statement executes for the first record
Flight  Date      Dest  FirstClass  Economy
439     12/11/00  LAX           20      137

dfwlax after the three records shown have been processed
Flight  Date      Dest  FirstClass  Economy
439     12/11/00  LAX           20      137
921     12/11/00  DFW           20      131
114     12/12/00  LAX           15      170
DATA Step Execution: Summary


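[Flowchart: Compile Program, then Initialize Variables to Missing (PDV), then Execute INPUT Statement, then Execute Other Statements, then Output to SAS Data Set, repeating from the initialization step for each record. The End of File? decision continues the loop on No and moves on to the Next Step on Yes.]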
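
One way to watch the PDV during execution is to write it to the SAS log before and after the INPUT statement. The sketch below assumes in-stream data (DATALINES) in place of the raw data file and uses the hypothetical data set name work.pdv_demo; PUT _ALL_ writes every variable in the PDV, including the automatic variables _ERROR_ and _N_.

```sas
data work.pdv_demo;
   put 'Before INPUT:';
   put _all_;                 /* variables are missing at this point */
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
   put 'After INPUT:';
   put _all_;                 /* variables hold the current record   */
   datalines;
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
;
run;
```

The log shows one more "Before INPUT" entry than "After INPUT" entries, because the step stops when the INPUT statement finds no more records to read.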

Access Temporary SAS Data Sets


proc print data=work.dfwlax;
run;
The SAS System

                                First
Obs  Flight  Date      Dest     Class  Economy
  1  439     12/11/00  LAX         20      137
  2  921     12/11/00  DFW         20      131
  3  114     12/12/00  LAX         15      170
  4  982     12/12/00  dfw          5       85
  5  439     12/13/00  LAX         14      196
  6  982     12/13/00  DFW         15      116
  7  431     12/14/00  LaX         17      166
  8  982     12/14/00  DFW          7       88
  9  114     12/15/00  LAX          .      187
 10  982     12/15/00  DFW         14       31

c06s1d1

Access Permanent SAS Data Sets


To access a permanently stored SAS data set,
submit a LIBNAME statement to assign a libref to the
SAS data library
use the libref as the first-level name of the SAS data
set.
The LIBNAME statement only needs to be submitted
once per SAS session.

48

Access Permanent SAS Data Sets


libname ia 'SAS-data-library';
proc print data=ia.dfwlax;
run;
The SAS System

                                First
Obs  Flight  Date      Dest     Class  Economy
  1  439     12/11/00  LAX         20      137
  2  921     12/11/00  DFW         20      131
  3  114     12/12/00  LAX         15      170
  4  982     12/12/00  dfw          5       85
  5  439     12/13/00  LAX         14      196
  6  982     12/13/00  DFW         15      116
  7  431     12/14/00  LaX         17      166
  8  982     12/14/00  DFW          7       88
  9  114     12/15/00  LAX          .      187
 10  982     12/15/00  DFW         14       31

c06s1d2

Objectives

Read standard and nonstandard character and
numeric data using formatted input.
Read date values and convert them to
SAS date values.

Reading Data Using Formatted Input


Formatted input is appropriate for reading
data in fixed columns
standard and nonstandard character and numeric
data
calendar values to be converted to SAS date values.

51

Reading Data Using Formatted Input


General form of the INPUT statement with formatted input:
INPUT pointer-control variable informat . . . ;
Formatted input is used to read data values by
moving the input pointer to the starting position of
the field
specifying a variable name
specifying an informat.

52

Reading Data Using Formatted Input


Pointer controls:
@n   moves the pointer to column n.
+n   moves the pointer n positions.

An informat specifies
the width of the input field
how to read the data values that are stored in the field.

What Is a SAS Informat?


An informat is an instruction that SAS uses to read data
values.
SAS informats have the following form:

<$>informat-namew.<d>

$              Indicates a character informat
informat-name  Informat name
w              Total width of the field to read
.              Required delimiter
d              Number of decimal places

Selected Informats

8. or 8.0   reads 8 columns of numeric data.
8.2         reads 8 columns of numeric data and may insert
            a decimal point in the value.
$8.         reads 8 columns of character data and removes
            leading blanks.
$CHAR8.     reads 8 columns of character data and preserves
            leading blanks.
COMMA7.     reads 7 columns of numeric data and removes
            selected nonnumeric characters such as dollar
            signs and commas.
MMDDYY8.    reads dates of the form 10/29/01.
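
The behavior of these informats can be tried without a raw data file. The sketch below is an assumption-laden illustration: it uses the INPUT function, which applies an informat to a character string, and the sample strings are made up for this example rather than taken from the slides.

```sas
data _null_;
   n1 = input('1234567', 8.);        /* standard numeric data               */
   n2 = input('1234567', 8.2);       /* no decimal point in the data, so    */
                                     /* one is inserted: 12345.67           */
   n3 = input('1234.567', 8.2);      /* a decimal point in the data is kept */
   n4 = input('$12,345', comma7.);   /* dollar sign and comma are removed   */
   c1 = input('  MONKEY', $8.);      /* leading blanks are removed          */
   c2 = input('  MONKEY', $char8.);  /* leading blanks are preserved        */
   put n1= n2= n3= n4=;
   /* Brackets make the position of the blanks visible in the log. */
   put '[' c1 $char8. ']  [' c2 $char8. ']';
run;
```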

Working with Date Values


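Date values that are stored as SAS dates are special
numeric values.
A SAS date value is interpreted as the number of days
between January 1, 1960, and a specific date.

Calendar Date   SAS Date Value   Displayed Value
01JAN1959            -365        01/01/1959
01JAN1960               0        01/01/1960
01JAN1961             366        01/01/1961

An informat converts a calendar date to a SAS date value.
A format displays a SAS date value as a calendar date.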
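
The day counts can be checked in the SAS log. This sketch uses SAS date constants of the form 'ddMONyyyy'd, which the slides do not cover, so treat them as an assumption of the example; the variable names are hypothetical.

```sas
data _null_;
   d1959 = '01JAN1959'd;   /* 365 days before 01JAN1960                 */
   d1960 = '01JAN1960'd;   /* day zero                                  */
   d1961 = '01JAN1961'd;   /* 1960 is a leap year, so 366 days later    */
   put d1959= d1960= d1961=;
   /* The same stored value, displayed with a date format. */
   put d1961= mmddyy10.;
run;
```

The first PUT statement writes d1959=-365 d1960=0 d1961=366, and the second writes d1961=01/01/1961.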

Convert Dates to SAS Date Values


SAS uses date informats to read and convert dates to
SAS date values.
Examples:

Raw Data Value   Informat     Converted Value
10/29/2001       MMDDYY10.    15277
10/29/01         MMDDYY8.     15277
29OCT2001        DATE9.       15277
29/10/2001       DDMMYY10.    15277

15277 is the number of days between 01JAN1960 and 29OCT2001.
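
A quick way to confirm these conversions is to apply each date informat with the INPUT function, which is an assumption of this sketch and is not shown in the slides. The two-digit year in 10/29/01 is resolved by the YEARCUTOFF= system option; the sketch assumes a setting under which 01 means 2001.

```sas
data _null_;
   a = input('10/29/2001', mmddyy10.);
   b = input('10/29/01',   mmddyy8.);   /* depends on YEARCUTOFF= */
   c = input('29OCT2001',  date9.);
   d = input('29/10/2001', ddmmyy10.);
   /* Each variable should hold the same SAS date value. */
   put a= b= c= d=;
run;
```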

Using Formatted Input

Read the raw data file using formatted input.

Raw Data File
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

DATA Step
data SAS-data-set-name;
   infile 'raw-data-filename';
   input pointer-control variable informat-name;
run;

SAS Data Set (Date holds SAS date values)
Flight      Date  Dest  FirstClass  Economy
439        14955  LAX           20      137
921        14955  DFW           20      131
114        14956  LAX           15      170

Reading Data: Formatted Input


         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

input @1 Flight $3. @4 Date mmddyy8.
      @12 Dest $3. @15 FirstClass 3.
      @18 Economy 3.;

Reading Data: Formatted Input


Raw Data File
         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

data work.dfwlax;
   infile 'raw-data-file';
   input @1 Flight $3. @4 Date mmddyy8.
         @12 Dest $3. @15 FirstClass 3.
         @18 Economy 3.;
run;

c06s2d1

Reading Data: Formatted Input


proc print data=work.dfwlax;
run;
The SAS System

                                First
Obs  Flight      Date  Dest     Class  Economy
  1  439        14955  LAX         20      137
  2  921        14955  DFW         20      131
  3  114        14956  LAX         15      170
  4  982        14956  dfw          5       85
  5  439        14957  LAX         14      196
  6  982        14957  DFW         15      116
  7  431        14958  LaX         17      166
  8  982        14958  DFW          7       88
  9  114        14959  LAX          .      187
 10  982        14959  DFW         14       31

The Date column shows SAS date values.

c06s2d1

Reading Data: Formatted Input


proc print data=work.dfwlax;
   format Date date9.;
run;

The SAS System

                                 First
Obs  Flight  Date       Dest     Class  Economy
  1  439     11DEC2000  LAX         20      137
  2  921     11DEC2000  DFW         20      131
  3  114     12DEC2000  LAX         15      170
  4  982     12DEC2000  dfw          5       85
  5  439     13DEC2000  LAX         14      196
  6  982     13DEC2000  DFW         15      116
  7  431     14DEC2000  LaX         17      166
  8  982     14DEC2000  DFW          7       88
  9  114     15DEC2000  LAX          .      187
 10  982     15DEC2000  DFW         14       31

The Date column shows formatted SAS date values.

c06s2d2

Objectives

69

Define types of data errors.


Identify data errors.

What Are Data Errors?


SAS detects data errors when
the INPUT statement encounters invalid data in a field
illegal arguments are used in functions
impossible mathematical operations are requested.

70

Examining Data Errors


When SAS encounters a data error,
1. a note that describes the error is printed in the
SAS log
2. the input record being read is displayed in the
SAS log (contents of the input buffer)
3. the values in the SAS observation being created
are displayed in the SAS log (contents of the PDV)
4. a missing value is assigned to the appropriate
SAS variable
5. execution continues.
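
A data error of the first kind can be reproduced with a deliberately bad record. In the sketch below, the second record is made up for illustration: columns 15-17 hold the characters XX where a numeric value is expected. The data set name work.error_demo and the use of in-stream data are also assumptions of the example.

```sas
data work.error_demo;
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
   datalines;
43912/11/00LAX 20137
92112/11/00DFW XX131
;
run;

/* FirstClass is missing in the second observation. */
proc print data=work.error_demo;
run;
```

The SAS log for this step should contain a note about invalid data for FirstClass, the input record, and the PDV values with _ERROR_=1, and the step still creates both observations.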

71

Objectives

72

Assign permanent attributes to SAS variables.


Override permanent variable attributes.

Default Variable Attributes


When a variable is created in a DATA step, the
name, type, and length of the variable are
automatically assigned
remaining attributes such as label and format are not
automatically assigned.
When the variable is used in a later step,
the name is displayed for identification purposes
its value is displayed using a system-determined
format.

73

Default Variable Attributes


Create the ia.dfwlax data set.
libname ia 'SAS-data-library';
data ia.dfwlax;
infile 'raw-data-file';
input @1 Flight $3. @4 Date mmddyy8.
@12 Dest $3. @15 FirstClass 3.
@18 Economy 3.;
run;

74

c06s4d1

Default Variable Attributes


Examine the descriptor portion of the ia.dfwlax
data set.
proc contents data=ia.dfwlax;
run;
Partial Output

-----Alphabetic List of Variables and Attributes-----

#    Variable      Type    Len    Pos
-------------------------------------
2    Date          Num       8      0
3    Dest          Char      3     27
5    Economy       Num       8     16
4    FirstClass    Num       8      8
1    Flight        Char      3     24
75

c06s4d1

Specifying Variable Attributes


Use LABEL and FORMAT statements in the
PROC step to temporarily assign the attributes (for the
duration of the step only)
DATA step to permanently assign the attributes (stored
in the data set descriptor portion).

76

Temporary Variable Attributes


Use LABEL and FORMAT statements in a PROC step to
temporarily assign attributes.
proc print data=ia.dfwlax label;
format Date mmddyy10.;
label Dest='Destination'
FirstClass='First Class Passengers'
Economy='Economy Passengers';
run;

77

c06s4d1

Temporary Variable Attributes


The SAS System

                                           First
                                           Class     Economy
Obs  Flight  Date        Destination  Passengers  Passengers
  1  439     12/11/2000  LAX                  20         137
  2  921     12/11/2000  DFW                  20         131
  3  114     12/12/2000  LAX                  15         170
  4  982     12/12/2000  dfw                   5          85
  5  439     12/13/2000  LAX                  14         196
  6  982     12/13/2000  DFW                  15         116
  7  431     12/14/2000  LaX                  17         166
  8  982     12/14/2000  DFW                   7          88
  9  114     12/15/2000  LAX                   .         187
 10  982     12/15/2000  DFW                  14          31

Permanent Variable Attributes


Assign labels and formats in the DATA step.
libname ia 'SAS-data-library';
data ia.dfwlax;
infile 'raw-data-file';
input @1 Flight $3. @4 Date mmddyy8.
@12 Dest $3. @15 FirstClass 3.
@18 Economy 3.;
format Date mmddyy10.;
label Dest='Destination'
FirstClass='First Class Passengers'
Economy='Economy Passengers';
run;

79

c06s4d2

Permanent Variable Attributes


Examine the descriptor portion of the ia.dfwlax
data set.
proc contents data=ia.dfwlax;
run;
Partial Output

-----Alphabetic List of Variables and Attributes-----

#    Variable      Type    Len    Pos    Format       Label
------------------------------------------------------------------------
2    Date          Num       8      0    MMDDYY10.
3    Dest          Char      3     27                 Destination
5    Economy       Num       8     16                 Economy Passengers
4    FirstClass    Num       8      8                 First Class Passengers
1    Flight        Char      3     24
80

c06s4d2

Permanent Variable Attributes


proc print data=ia.dfwlax label;
run;
The SAS System

                                           First
                                           Class     Economy
Obs  Flight  Date        Destination  Passengers  Passengers
  1  439     12/11/2000  LAX                  20         137
  2  921     12/11/2000  DFW                  20         131
  3  114     12/12/2000  LAX                  15         170
  4  982     12/12/2000  dfw                   5          85
  5  439     12/13/2000  LAX                  14         196
  6  982     12/13/2000  DFW                  15         116
  7  431     12/14/2000  LaX                  17         166
  8  982     12/14/2000  DFW                   7          88
  9  114     12/15/2000  LAX                   .         187
 10  982     12/15/2000  DFW                  14          31

c06s4d2

Override Permanent Attributes


Use a FORMAT statement in a PROC step to
temporarily override the format stored in the data set
descriptor.
proc print data=ia.dfwlax label;
format Date date9.;
run;

82

c06s4d3

Override Permanent Attributes


The SAS System

                                           First
                                           Class     Economy
Obs  Flight  Date        Destination  Passengers  Passengers
  1  439     11DEC2000   LAX                  20         137
  2  921     11DEC2000   DFW                  20         131
  3  114     12DEC2000   LAX                  15         170
  4  982     12DEC2000   dfw                   5          85
  5  439     13DEC2000   LAX                  14         196
  6  982     13DEC2000   DFW                  15         116
  7  431     14DEC2000   LaX                  17         166
  8  982     14DEC2000   DFW                   7          88
  9  114     15DEC2000   LAX                   .         187
 10  982     15DEC2000   DFW                  14          31

Section 6.5
Changing Variable Attributes

Objectives

Use features in the windowing environment to change
variable attributes.
Use programming statements to change variable
attributes.

Changing Variable Attributes

This demonstration illustrates using the SAS
windowing environment to change variable
attributes.

The DATASETS Procedure


You can use the DATASETS procedure to modify a
variable's
name
label
format
informat.

87

The DATASETS Procedure


General form of PROC DATASETS for changing variable
attributes:

PROC DATASETS LIBRARY=libref;
   MODIFY SAS-data-set;
   RENAME old-name-1=new-name-1
          <... old-name-n=new-name-n>;
   LABEL variable-1='label-1'
         <... variable-n='label-n'>;
   FORMAT variable-list-1 format-1
          <... variable-list-n format-n>;
   INFORMAT variable-list-1 informat-1
            <... variable-list-n informat-n>;
RUN;

Data Set Contents


Use the DATASETS procedure to change the name of the
variable Dest to Destination.
Look at the attributes of the variables in the ia.dfwlax
data set.
proc contents data=ia.dfwlax;
run;
-----Alphabetic List of Variables and Attributes-----

#    Variable      Type    Len    Pos
-------------------------------------
2    Date          Char      8     19
3    Dest          Char      3     27
5    Economy       Num       8      8
4    FirstClass    Num       8      0
1    Flight        Char      3     16
89

c06s5d1

The DATASETS Procedure


Rename the variable Dest to Destination.
proc datasets library=ia;
modify dfwlax;
rename Dest=Destination;
run;
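
The other MODIFY statements work the same way as RENAME. The sketch below is an illustration under stated assumptions: it builds a small copy of the flight data in the WORK library (the name work.dfwlax_demo is hypothetical) so that the permanent ia.dfwlax data set is left alone, and it adds the NOLIST option and a QUIT statement, neither of which appears in the slides.

```sas
data work.dfwlax_demo;
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
   datalines;
43912/11/00LAX 20137
92112/11/00DFW 20131
;
run;

proc datasets library=work nolist;   /* NOLIST suppresses the directory listing */
   modify dfwlax_demo;
   rename Dest=Destination;
   label FirstClass='First Class Passengers'
         Economy='Economy Passengers';
   format FirstClass Economy 4.;
run;
quit;                                /* ends the DATASETS procedure */

/* The descriptor portion now shows the new name, labels, and formats. */
proc contents data=work.dfwlax_demo;
run;
```

PROC DATASETS changes only the descriptor portion, so the data values themselves are not rewritten.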

90

c06s5d1

Data Set Contents


Look at the attributes of the variables in the ia.dfwlax
data set after running PROC DATASETS.
proc contents data=ia.dfwlax;
run;
-----Alphabetic List of Variables and Attributes-----

#    Variable       Type    Len    Pos
--------------------------------------
2    Date           Char      8     19
3    Destination    Char      3     27
5    Economy        Num       8      8
4    FirstClass     Num       8      0
1    Flight         Char      3     16

91

c06s5d1

Objectives

Create a SAS data set from an Excel spreadsheet
using the Import Wizard.
Create a SAS data set from an Excel spreadsheet
using PROC IMPORT.

Business Task
The flight data for Dallas and Los Angeles are in an
Excel spreadsheet. Read the data into a SAS data set.

[Figure: the Excel spreadsheet is converted to the SAS data set below.]

SAS Data Set
Flight  Date      Dest  FirstClass  Economy
439     12/11/00  LAX           20      137
921     12/11/00  DFW           20      131
114     12/12/00  LAX           15      170

The Import Wizard


The Import Wizard is a point-and-click graphical
interface that enables you to create a SAS data set
from several types of external files including
dBASE files (*.DBF)
Excel spreadsheets (*.XLS)
Microsoft Access tables (*.MDB)
delimited files (*.*)
comma-separated values (*.CSV).

94

Reading Raw Data with the Import Wizard
c06s6d1.sas

This demonstration illustrates using the Import
Wizard to create a SAS data set from an Excel
spreadsheet.

The IMPORT Procedure


General form of the IMPORT procedure:

PROC IMPORT OUT=SAS-data-set
            DATAFILE='external-file-name'
            DBMS=file-type;
   GETNAMES=YES;
RUN;

96

The IMPORT Procedure


Look at the file created by the Import Wizard.
PROC IMPORT OUT= WORK.DFWLAX
DATAFILE= "DallasLA.xls"
DBMS=EXCEL2000 REPLACE;
GETNAMES=YES;
RUN;
What if the data in the previous example were stored in
a tab-delimited file?

97

c06s6d2

The IMPORT Procedure


Change the PROC IMPORT code to read the
tab-delimited file.
PROC IMPORT OUT= WORK.DFWLAX
DATAFILE= "DallasLA.txt"
DBMS=TAB REPLACE;
GETNAMES=YES;
RUN;
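
The same pattern applies to a comma-separated file with DBMS=CSV. The sketch below is a self-contained illustration under several assumptions: the file name dallasla_demo.csv, its contents, and the output data set name work.dfwlax_csv are hypothetical, and the path construction assumes a directory-based host such as Windows or UNIX.

```sas
/* Write a small hypothetical CSV file to the WORK library's directory. */
%let csvfile = %sysfunc(pathname(work))/dallasla_demo.csv;

data _null_;
   file "&csvfile";
   put 'Flight,Date,Dest,FirstClass,Economy';   /* header row for GETNAMES=YES */
   put '439,12/11/00,LAX,20,137';
   put '921,12/11/00,DFW,20,131';
run;

proc import out=work.dfwlax_csv
            datafile="&csvfile"
            dbms=csv replace;
   getnames=yes;
run;

/* PROC IMPORT chooses variable types itself; check what it decided. */
proc contents data=work.dfwlax_csv;
run;
```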

98

c06s6d3
