SDA Lab 2
Q1: Open the dta file pslm_inc_edu.dta and provide the command in your do-file
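One possible way to write this in a do-file is sketched below. It assumes that pslm_inc_edu.dta sits in the current working directory; if it is stored elsewhere, the full path would be needed instead.

```stata
* Sketch only: assumes the file is in the current working directory
* "clear" drops any data already in memory before loading the new file
use "pslm_inc_edu.dta", clear
```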
• Byte
• Integer (int)
• Long
• Float (can store non-integer values)
• Double (can store non-integer values, with greater precision than float)
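The storage types listed above can be seen in a small illustration. This is a sketch on a made-up three-row dataset; the variable names small, approx and precise are hypothetical and are not part of the lab data.

```stata
* Illustration only: this clears whatever data is in memory
clear
set obs 3

* Hypothetical variables, each created with an explicit storage type
generate byte small = _n
generate float approx = 1/3
generate double precise = 1/3

* The "storage type" column of describe shows byte, float and double
describe
```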
Eeman S Qureshi SDA Lab 2
The following shows how the data appears in the browse window:
Q3) Import the gradebook.xls dataset again and provide the command in your do-files.
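A minimal sketch of the import is given below. It assumes that gradebook.xls is in the working directory and that its first row holds the variable names, which is why the firstrow option is used; if the first row contains something else, that option should be dropped.

```stata
* Sketch only: assumes the first row of the sheet contains variable names
import excel "gradebook.xls", firstrow clear
```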
Q4) What is the first row of this dataset telling us? Type the answer as a comment in your do-
file.
We will use the browse command for this. If we need more information about our variables, we
use the ‘describe’ command, which reports the properties of the dataset as well as what each variable
name represents.
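The two commands mentioned above can be typed as follows. This is a sketch that assumes the gradebook data is already loaded in memory.

```stata
* Open the data in a spreadsheet-style viewer
browse

* List variable names, storage types, display formats and labels
describe
```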
Distribution of Variables
Discrete Variables: These are counts and can only take values in jumps, e.g.
number of siblings (1, 2, 3).
Continuous Variables: These can take an infinite number of values within a given interval, e.g.
height, speed, exam marks.
CROSS-TABULATION/TWO-WAY TABULATION:
How can we see the school-wise distribution of females? We can do this through
cross-tabulation or by using relational operators with the tab command. Which table is
easier to interpret?
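Both approaches can be sketched as below. The variable names school and female are taken from the tables in this lab; the coding female == 1 for females is an assumption based on the 0/1 values shown.

```stata
* Two-way table: schools in rows, female (0/1) in columns
tabulate school female

* Same cross-tabulation with rows and columns swapped
tabulate female school

* One-way table with a relational operator: schools for females only
* (assumes female == 1 marks female students)
tabulate school if female == 1
```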
            female
school      0     1   Total
1           5     4       9
2           7     3      10
3           5     5      10
4           5     5      10
Total      22    17      39
            school
female      1     2     3     4   Total
0           5     7     5     5      22
1           4     3     5     5      17
Total       9    10    10    10      39
Q8: Interpret the following table and provide the interpretation as a comment in your
do-file
• histogram Mid (the vertical scale of a density histogram is chosen so that the
total area of all the bars adds to 1)
• histogram Mid, normal (overlays a normal density curve on the histogram)
Q9: How do we get the average/mean of Mids? Type the command in your
do-files.
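One way to obtain the mean is sketched below. It assumes the variable is named Mid, as in the commands used elsewhere in this lab; Stata variable names are case-sensitive.

```stata
* Mean, standard deviation, min and max of Mid
summarize Mid

* Adding the detail option also reports percentiles, skewness and kurtosis
summarize Mid, detail
```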
If we want summary statistics for all variables, we simply type sum, which shows us the following
table:
. sum
codebook Mid
Now suppose we want to find the mean Mid score for females only. To do
this we use the ‘if’ qualifier. We can give Stata the if condition using
relational and logical operators. Both can be used together.
Logical Operators
• & (this stands for ‘AND’: both conditions must hold at the same time)
• | (this stands for ‘OR’: at least one of the conditions must hold)
To check the average Mid score for females, we type the following
command:
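A sketch of such a command is given below. It assumes that female is coded 1 for females and 0 for males, as the 0/1 values in the cross-tabulations suggest. The second line is an extra illustration of combining a relational and a logical operator.

```stata
* Summary statistics of Mid for females only (assumes female == 1 means female)
summarize Mid if female == 1

* Combining conditions with &: females in school 1 only
summarize Mid if female == 1 & school == 1
```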
What if you want to get summary statistics of math scores for school 1 and
school 3?
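With relational and logical operators, this could be written as sketched below. The sketch assumes the score variable in question is Mid, the variable used in the other commands of this lab.

```stata
* Keep observations where school is 1 OR school is 3
* (the condition must be an OR: no observation has school equal to both 1 and 3)
summarize Mid if school == 1 | school == 3
```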
.
. *2) Inlist command
.
. sum Mid if inlist(school, 2,3,4)
.
. *3) Inrange command
.
. sum Mid if inrange(school, 2,4)
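For comparison, the sketch below shows three ways of selecting schools 2 through 4 that should give the same result, assuming school takes only the integer values 1 to 4 as in the tables above. Note that inlist() takes an explicit list of values, while inrange() takes a lower and an upper bound.

```stata
* Relational and logical operators
summarize Mid if school >= 2 & school <= 4

* inlist(): school equals one of the listed values
summarize Mid if inlist(school, 2, 3, 4)

* inrange(): school lies between 2 and 4, bounds included
summarize Mid if inrange(school, 2, 4)
```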
Q11: Find the summary statistics for Mid for schools 1, 3 and 4.