L1 Introduction 2023

econometric intro

The Nature of Regression Analysis

Historical Origin of the Term Regression
● Introduced by Francis Galton.

● Galton found that, although tall parents tend to have tall children and
short parents tend to have short children, the average height of children
born of parents of a given height tended to move or “regress” toward the
average height in the population as a whole.

● Galton’s law of universal regression was confirmed by Karl Pearson.


The Modern Interpretation of Regression
● Regression analysis is the study of the dependence of one variable, the
dependent variable, on one or more other variables, the explanatory
variables, with a view to estimating and/or predicting the (population)
mean or average value of the former in terms of the known or fixed (in
repeated sampling) values of the latter.

● Finance examples
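
As a rough illustration of the definition above, the following Python sketch fits a two-variable regression and uses it to estimate the mean of the dependent variable at a chosen value of the explanatory variable. The data and the names `x`, `y`, and `ols_fit` are hypothetical. Ordinary least squares is assumed as the estimation method, since this section does not name one.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data: x values are treated as fixed, y varies around a mean.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

intercept, slope = ols_fit(x, y)

# Estimated mean value of y for the known value x = 6.
predicted_mean = intercept + slope * 6
print(round(intercept, 3), round(slope, 3), round(predicted_mean, 3))
```

The fitted line gives an estimate of the average value of `y` at each `x`, not the exact value of any single observation.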
Statistical versus Deterministic Relationships
● In statistical relationships among variables we deal with random or
stochastic variables, that is, variables that have probability
distributions, e.g., the dependence of crop yield on temperature, rainfall,
sunshine, and fertilizer.

● In functional or deterministic dependency, on the other hand, we also
deal with variables, but these variables are not random or stochastic,
e.g., Newton’s law of gravity, which states:

● Every particle in the universe attracts every other particle with a force
directly proportional to the product of their masses and inversely
proportional to the square of the distance between them.
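
The contrast between the two kinds of relationship can be sketched in Python. The function `gravitational_force` follows Newton's law as stated above, with the proportionality constant `k` set to the gravitational constant in SI units. The function `crop_yield` and its coefficients are purely hypothetical: they stand in for a statistical relationship in which the same input can produce different outputs.

```python
import random

def gravitational_force(m1, m2, r, k=6.674e-11):
    """Deterministic: the same inputs always give the same force."""
    return k * m1 * m2 / r ** 2

def crop_yield(fertilizer, rng):
    """Statistical: an assumed mean response plus a random disturbance."""
    mean_yield = 2.0 + 0.5 * fertilizer  # illustrative coefficients only
    return mean_yield + rng.gauss(0, 0.3)

rng = random.Random(0)  # seeded so the run is reproducible

force_a = gravitational_force(5.0, 10.0, 2.0)
force_b = gravitational_force(5.0, 10.0, 2.0)
yield_a = crop_yield(4.0, rng)
yield_b = crop_yield(4.0, rng)

print(force_a == force_b)  # the deterministic law repeats exactly
print(yield_a == yield_b)  # the statistical relationship does not
```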
Regression versus Causation
● A statistical relationship in itself cannot logically imply causation.
Regression versus Correlation
● In correlation analysis, we measure the strength or degree of linear
association between two variables.

● For example, we may measure the correlation between scores on statistics
and mathematics examinations, between high school grades and college
grades, and so on.
Regression versus Correlation
● In regression analysis there is an asymmetry in the way the dependent
and explanatory variables are treated.

● The dependent variable is assumed to be statistical, random, or
stochastic, that is, to have a probability distribution.

● The explanatory variables, on the other hand, are assumed to have
fixed values (in repeated sampling).
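
The symmetry of correlation and the asymmetry of regression can be seen in a small Python sketch. The exam scores and the helper names `correlation` and `ols_slope` are hypothetical, and least squares is assumed as the fitting method. Swapping the two variables leaves the correlation coefficient unchanged, but it changes the regression slope.

```python
from statistics import mean

def correlation(x, y):
    """Sample correlation coefficient between x and y."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ols_slope(dependent, explanatory):
    """Least-squares slope from regressing `dependent` on `explanatory`."""
    e_bar, d_bar = mean(explanatory), mean(dependent)
    sxy = sum((e - e_bar) * (d - d_bar)
              for e, d in zip(explanatory, dependent))
    sxx = sum((e - e_bar) ** 2 for e in explanatory)
    return sxy / sxx

# Hypothetical examination scores for five students.
maths = [55, 62, 70, 78, 85]
stats = [50, 65, 68, 80, 82]

r_ms = correlation(maths, stats)
r_sm = correlation(stats, maths)
slope_stats_on_maths = ols_slope(stats, maths)
slope_maths_on_stats = ols_slope(maths, stats)

print(round(r_ms, 4), round(r_sm, 4))
print(round(slope_stats_on_maths, 4), round(slope_maths_on_stats, 4))
```

In the two-variable case the product of the two slopes equals the squared correlation coefficient, so the slopes coincide only in special cases.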
Terminology and Notation
● We can use the following notations
● The term random is a synonym for the term stochastic.

● A random or stochastic variable is a variable that can take on any set
of values, positive or negative, with a given probability.
The Nature and Sources of Data for Economic Analysis
Types of Data

● Time Series Data: A time series is a set of observations on the values
that a variable takes at different times.

● Most empirical work based on time series data assumes that the
underlying time series is stationary.

● In simple words, a time series is stationary if its mean and variance do
not vary systematically over time.
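
The loose definition of stationarity above suggests an informal check: split a series in half and compare the mean and variance of the two halves. The Python sketch below applies this to two simulated series, one pure noise and one with a deterministic upward trend. Both series, the seed, and the helper `half_sample_summary` are hypothetical, and the comparison is only a rough diagnostic, not a formal statistical test.

```python
import random
from statistics import mean, pvariance

def half_sample_summary(series):
    """Return ((mean1, mean2), (var1, var2)) for the two halves of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), mean(second)), (pvariance(first), pvariance(second))

rng = random.Random(42)  # seeded so the run is reproducible
n = 200

# Simulated series: noise around a constant mean vs. noise around a trend.
noise = [rng.gauss(0, 1) for _ in range(n)]
trending = [0.1 * t + rng.gauss(0, 1) for t in range(n)]

noise_means, noise_vars = half_sample_summary(noise)
trend_means, trend_vars = half_sample_summary(trending)

print("noise means:", [round(m, 2) for m in noise_means])
print("trend means:", [round(m, 2) for m in trend_means])
print("noise variances:", [round(v, 2) for v in noise_vars])
print("trend variances:", [round(v, 2) for v in trend_vars])
```

The half-sample means of the noise series stay close to each other, while those of the trending series differ sharply, which is the kind of systematic variation over time that the definition rules out.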
Cross-Section Data
● Cross-section data are data on one or more variables collected at the
same point in time.

● For example, the dividend payment by Nifty 50 companies in the year
2022.
Pooled Data
● Pooled, or combined, data contain elements of both time series and
cross-section data.

● For example, if for each Nifty 50 company we have dividend payment
data for the years 2020 and 2021, the combined data of 100
observations is called the pooled data set.
Panel, Longitudinal, or Micropanel Data
● This is a special type of pooled data in which the same cross-sectional
unit (say, a family or a firm) is surveyed over time.

● If all the companies have the same number of observations, we have
what is called a balanced panel.

● If the number of observations is not the same for each company, it is
called an unbalanced panel.
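
The balanced/unbalanced distinction amounts to counting observations per cross-sectional unit. A minimal Python sketch follows, assuming the panel is stored as a list of `(company, year, dividend)` records. The record layout, the company labels, the dividend figures, and the helper `is_balanced` are all hypothetical.

```python
from collections import Counter

def is_balanced(panel):
    """True if every unit in the panel has the same number of observations."""
    counts = Counter(unit for unit, _period, _value in panel)
    return len(set(counts.values())) <= 1

# Hypothetical dividend records: (company, year, dividend per share).
balanced_panel = [
    ("A", 2020, 1.0), ("A", 2021, 1.2),
    ("B", 2020, 0.5), ("B", 2021, 0.6),
]

# Dropping one record leaves company B with a single observation.
unbalanced_panel = balanced_panel[:-1]

print(is_balanced(balanced_panel))
print(is_balanced(unbalanced_panel))
```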
Measurement Scales of Variables
● Ratio scale,
● Interval scale,
● Ordinal scale, and
● Nominal scale.
Ratio Scale
● For a variable X, taking two values, X1 and X2, the ratio X1/X2 and the
distance (X2 - X1) are meaningful quantities.

● Also, there is a natural ordering (ascending or descending) of the
values along the scale.
Interval Scale
● An interval scale variable satisfies the last two properties of the ratio
scale variable but not the first.

● The distance between two time periods, say (2000–1995), is
meaningful, but not the ratio of two time periods (2000/1995).

● A temperature of 90 degrees is not 50% warmer than a temperature of
60 degrees, because 0 degrees (on the Fahrenheit or Celsius scale) is not
a natural base.
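
The temperature example can be checked numerically. The Python sketch below assumes the two readings are in degrees Fahrenheit (the text does not state the unit) and converts them to Celsius and to kelvin using the standard conversion formulas. The ratio of the two readings changes with the choice of scale, which is why it carries no meaning on an interval scale.

```python
def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9

def celsius_to_kelvin(c):
    return c + 273.15

# Assumed readings in degrees Fahrenheit.
low_f, high_f = 60.0, 90.0
low_c, high_c = fahrenheit_to_celsius(low_f), fahrenheit_to_celsius(high_f)
low_k, high_k = celsius_to_kelvin(low_c), celsius_to_kelvin(high_c)

ratio_f = high_f / low_f  # ratio on the Fahrenheit scale
ratio_c = high_c / low_c  # same two temperatures, Celsius scale
ratio_k = high_k / low_k  # kelvin has a true zero, so this ratio is meaningful

print(round(ratio_f, 4), round(ratio_c, 4), round(ratio_k, 4))
```

The differences, by contrast, stay consistent up to the unit conversion: a gap of 30 degrees Fahrenheit is always a gap of 30 × 5/9 degrees Celsius.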
Ordinal Scale
● A variable belongs to this category only if it satisfies the third property
of the ratio scale (i.e., natural ordering).

● Examples are grading systems (A, B, C grades) or income class
(upper, middle, lower).
Nominal Scale
● Variables in this category have none of the features of the ratio scale
variables.

● Variables such as gender (male, female) and marital status (married,
unmarried, divorced, separated) simply denote categories.
