Statistics 1

Statistics involves collecting, organizing, and summarizing data so that valid conclusions can be drawn. Descriptive statistics describe the basic features of data through techniques like frequency counts, ranges, means, and standard deviations. Inferential statistics are then used to make predictions or generalizations from samples to populations. Key concepts in inferential statistics include populations, samples, parameters, and statistics. Samples are subsets of populations that are studied to make inferences about populations, which can be impractical to study entirely.

Uploaded by

kuashask2
Copyright
© Attribution Non-Commercial (BY-NC)

Lecture Sheet 1

Statistics is the science of collecting, organizing and summarizing data so that valid conclusions can be drawn from them. The collecting, organizing and summarizing part is called descriptive statistics, while drawing valid conclusions is called inferential statistics.

Descriptive statistics, not surprisingly, "describe" data that have been collected. Commonly used descriptive statistics include frequency counts, ranges (high and low scores or values), means, modes, medians, and standard deviations. Two concepts are essential to understanding descriptive statistics: variables and distributions.

Inferential statistics are used to draw conclusions and make predictions based on the descriptions of data. In this section, we explore inferential statistics by using an extended example of experimental studies. Key concepts used in our discussion are probability, populations, and sampling.

Population: the universal set of all objects under study.
Sample: any subset of the population.

A large population may be impractical and costly to study, because doing so means collecting data from every member of the population. A sample is more manageable and easier to study. After the data are collected and organized, a summary is made, such as average values. The hope is that valid conclusions about the whole population can be drawn from the sample data. It is therefore important that the sample be representative of the population; otherwise the conclusions may be invalid. Conclusions are only as reliable as the sampling process, and results can change from sample to sample.

A parameter is a numerical measurement that describes a characteristic of a population, while a statistic is a numerical measurement that describes a characteristic of a sample. In general, we will use a statistic to infer something about a parameter.

Collecting: data points; each data point is one element in a set of data.
Organizing: frequency distribution; a chart that lists each distinct value with the number of times it occurs.
Relative Frequency: the frequency of a value expressed as a percent of the total number of data points.

Classification of Variables:

A. According to continuity of values
1. Continuous variables. These are variables that can take fractional (decimal) values. Example: weight, length, height, school achievement.
2. Discrete or discontinuous variables. These are variables that cannot take fractional (decimal) values. Example: number of students, number of houses, size of a family, etc.

B. According to scale of measurement
1. Nominal variable. This property allows one to make statements of similarity or difference. Example: sex, where members of a population may be classified as male or female; socio-economic status, where members of a group may be classified as belonging to high, average or low socio-economic status (nominal only if the ordering of these levels is ignored).
2. Ordinal variable. This variable refers to a property whereby members of a group are ranked. Example: one can judge and rank the contestants in a beauty contest.
3. Interval variable. This property allows one to make statements of equality of intervals. Example: height, weight, temperature, test scores, etc.
4. Ratio variable. This property permits making statements of equality of ratios. Example: if Cora is 48 years old and Philline is 22 years old, their ages can be expressed in the ratio 48:22, or 24:11 (twenty-four is to eleven).

C. According to functional relationship
1. Independent variable. This is sometimes termed the predictor variable.
2. Dependent variable. This is sometimes called the criterion variable.
Example: if academic achievement depends on I.Q., then I.Q. is the independent variable and academic achievement is the dependent variable.

What is the difference between categorical, ordinal and interval variables?

In talking about variables, you will sometimes hear them described as categorical (or nominal), ordinal, or interval. Below we define these terms and explain why they are important.

Categorical

A categorical variable (sometimes called a nominal variable) is one that has two or more categories, but there is no intrinsic ordering to the categories. For example, gender is a categorical variable having two categories (male and female), and there is no intrinsic ordering to the categories. Hair color is also a categorical variable having a number of categories (blonde, brown, brunette, red, etc.), and again, there is no agreed way to order these from highest to lowest. A purely categorical variable is one that simply allows you to assign categories, but you cannot clearly order the categories. If the variable has a clear ordering, then that variable would be an ordinal variable, as described below.
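The frequency distribution and relative frequency defined earlier are the natural summaries for a categorical variable. The following is a minimal sketch in Python, assuming a small made-up sample of hair colors; the data values and variable names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample of a categorical (nominal) variable: hair color.
hair_colors = ["brown", "blonde", "brown", "red",
               "brown", "blonde", "black", "brown"]

# Frequency distribution: each distinct value with the number of times it occurs.
frequency = Counter(hair_colors)

# Relative frequency: each count expressed as a percent of the total
# number of data points.
total = len(hair_colors)
relative_frequency = {
    value: 100 * count / total for value, count in frequency.items()
}

# Print the table, most frequent category first. The order reflects counts
# only; the categories themselves have no intrinsic ordering.
for value, count in frequency.most_common():
    print(f"{value:<8}{count:>3}{relative_frequency[value]:>8.1f}%")
```

Because the variable is nominal, only counts and percentages are meaningful here; a mean or median of hair colors would not be.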
Ordinal

An ordinal variable is similar to a categorical variable. The difference between the two is that there is a clear ordering of the categories. For example, suppose you have a variable, economic status, with three categories (low, medium and high). In addition to being able to classify people into these three categories, you can order the categories as low, medium and high. Now consider a variable like educational experience (with values such as elementary school graduate, high school graduate, some college and college graduate). These can also be ordered as elementary school, high school, some college, and college graduate. Even though we can order these from lowest to highest, the spacing between the values may not be the same across the levels of the variable. Say we assign scores 1, 2, 3 and 4 to these four levels of educational experience, and we compare the difference in educational experience between categories one and two with the difference between categories two and three, or the difference between categories three and four. The difference between categories one and two (elementary and high school) is probably much bigger than the difference between categories two and three (high school and some college). In this example, we can order the people by level of educational experience, but the size of the difference between categories is inconsistent (because the spacing between categories one and two is bigger than the spacing between categories two and three). If these categories were equally spaced, then the variable would be an interval variable.

Interval

An interval variable is similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. For example, suppose you have a variable such as annual income that is measured in dollars, and we have three people who make $10,000, $15,000 and $20,000. The second person makes $5,000 more than the first person and $5,000 less than the third person, and the size of these intervals is the same. If there were two other people who make $90,000 and $95,000, the size of the interval between these two people is also the same ($5,000).

Cross-sectional data, or a cross section (of a study population), in statistics and econometrics is a type of one-dimensional data set. Cross-sectional data refers to data collected by observing many subjects (such as individuals, firms or countries/regions) at the same point in time, or without regard to differences in time. Analysis of cross-sectional data usually consists of comparing the differences among the subjects.

A time series is a sequence of observations which are ordered in time (or space). If observations are made on some phenomenon throughout time, it is most sensible to display the data in the order in which they arose, particularly since successive observations will probably be dependent. Time series are best displayed in a scatter plot of the series against time: the series value X is plotted on the vertical axis and time t on the horizontal axis. Time is called the independent variable (in this case, however, something over which you have little control).

Panel data is data from a (usually small) number of observations over time on a (usually large) number of cross-sectional units such as individuals, households, firms, or governments.
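The relationship between the three kinds of data set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the panel below is a tiny made-up data set (three people, three years), the 2020 incomes reuse the $10,000, $15,000 and $20,000 figures from the interval example, and the other values and the helper names `cross_section` and `time_series` are hypothetical.

```python
# Hypothetical panel: annual income (in dollars) for three people over
# three years. Each record is one observation on one unit at one time.
panel = [
    {"person": "A", "year": 2020, "income": 10_000},
    {"person": "B", "year": 2020, "income": 15_000},
    {"person": "C", "year": 2020, "income": 20_000},
    {"person": "A", "year": 2021, "income": 11_000},
    {"person": "B", "year": 2021, "income": 15_500},
    {"person": "C", "year": 2021, "income": 21_000},
    {"person": "A", "year": 2022, "income": 12_000},
    {"person": "B", "year": 2022, "income": 16_000},
    {"person": "C", "year": 2022, "income": 23_000},
]


def cross_section(records, year):
    """Cross-sectional data: many subjects observed at one point in time."""
    return {r["person"]: r["income"] for r in records if r["year"] == year}


def time_series(records, person):
    """Time series: one subject's observations, ordered in time."""
    rows = sorted(
        (r for r in records if r["person"] == person),
        key=lambda r: r["year"],
    )
    return [(r["year"], r["income"]) for r in rows]


print(cross_section(panel, 2020))  # compare subjects with each other
print(time_series(panel, "A"))     # follow one subject through time
```

Fixing the year gives a cross section, fixing the person gives a time series, and keeping both dimensions gives the panel. A real panel would typically cover many more cross-sectional units than time periods.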
