DAM Theory

Important theory questions

1. Cross-sectional data vs time series data


Cross-sectional data and time series data are two different types of data used in statistics and
econometrics to analyze and draw insights from various phenomena. Let's delve into each
type:
Cross-sectional data: Cross-sectional data refers to data collected at a single point in time from
multiple individuals, entities, or units. In other words, it provides a snapshot of different
observations at a specific moment. Each observation in cross-sectional data represents a
separate entity, and the data is collected by surveying or observing a diverse set of entities. For
instance, if you collect data on the income of individuals in a country in a specific year, and
each observation corresponds to a different individual, you have cross-sectional data.
Example: Survey data collected from different people in a city to study their education levels,
incomes, and housing preferences at a particular time.
Time series data: Time series data, on the other hand, involves collecting data on a single
variable over a sequence of time intervals. This type of data is used to analyze trends, patterns,
and changes that occur over time. Each observation in time series data corresponds to a
specific time point, and the data is collected over a period of time. Time series data is essential
for understanding how a variable evolves, fluctuates, or grows over time.
Example: Monthly stock prices of a company recorded over several years, allowing analysts to
study its performance trends and make predictions.

2. Components of a time series


A time series is a sequence of data points collected at successive and equally spaced time
intervals. When analyzing a time series, it's often helpful to decompose it into different
components, each of which represents a distinct underlying pattern or source of variability.
The main components of a time series are:
1. Trend: The trend component represents the long-term movement or direction in the
data. It captures the overall upward or downward movement in the series over an
extended period. Trends can be linear, nonlinear, or even absent.
Example: The annual sales of a company increasing steadily over the past five years, indicating
a positive linear trend.
2. Seasonality: The seasonality component refers to regular patterns that repeat at fixed
intervals within the time series. These patterns could be daily, weekly, monthly, or any
other regular cycle.
Example: Ice cream sales showing higher values during the summer months and lower values
during the winter months, reflecting a seasonal pattern.
3. Cyclic Variation: The cyclic component represents fluctuations that are not of a fixed
frequency like seasonality but occur due to economic or business cycles. Cycles are
longer-term patterns that are less predictable and more irregular than seasonality.
Example: Real estate prices in a city showing a cycle of booms and busts every 8-10 years due
to economic cycles.
4. Irregular (Residual) Variation: The irregular component, also known as the residual or
noise, accounts for the random or unpredictable fluctuations in the time series that
cannot be explained by the other components. These fluctuations can result from
random variations, measurement errors, or unforeseen events.
Example: Sudden spikes or drops in a stock price due to unexpected news or events impacting
the market.
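The decomposition described above can be sketched in code. This is an illustrative example with made-up quarterly sales; a centered moving average is one common way to estimate the trend component, and subtracting it leaves the seasonal-plus-irregular part:

```python
# Made-up quarterly sales: an underlying upward trend plus a repeating
# within-year seasonal pattern.
sales = [10, 20, 30, 40, 14, 24, 34, 44, 18, 28, 38, 48]  # 3 years, quarterly

def centered_moving_average(series, period=4):
    """Trend estimate: average over `period` points, centered on each point."""
    half = period // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        # For an even period, average two overlapping windows to center it.
        window1 = sum(series[i - half:i + half]) / period
        window2 = sum(series[i - half + 1:i + half + 1]) / period
        trend[i] = (window1 + window2) / 2
    return trend

trend = centered_moving_average(sales)
# What is left after removing the trend contains seasonality plus noise.
detrended = [s - t for s, t in zip(sales, trend) if t is not None]
```

Note the trend estimate is undefined at the edges of the series, which is why the first and last values stay `None`.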

3. Differences between correlation and regression.

Aspect-by-aspect comparison:
a) Purpose and Objective: Correlation measures the strength and direction of a linear association; regression models relationships for prediction or analysis.
b) Type of Analysis: Correlation calculates correlation coefficients; regression fits a regression line or curve to data points.
c) Dependent and Independent Variables: In correlation, both variables are equal partners; in regression, one variable is dependent and the others are independent.
d) Causation: Correlation does not imply causation; regression can imply causation, depending on context.
e) Interpretation: Correlation coefficients range from -1 to 1; regression's slope and intercept have direct interpretations.
f) Linearity: Correlation measures linear relationships only; regression handles both linear and nonlinear relationships.
g) Calculation Method: Correlation uses correlation coefficients (e.g., Pearson's r); regression employs least squares to find the best-fit line.
h) Focus: Correlation measures the degree of association; regression aims at prediction, explanation, or analysis.
i) Example: Height and weight might be correlated; predicting house prices based on features is a regression problem.
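The two measures are numerically linked: with a single predictor, the OLS regression slope equals the correlation coefficient scaled by the ratio of standard deviations. A small sketch with invented data:

```python
import math

# Invented data to show the link between correlation and regression.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)  # Pearson's r: unitless, between -1 and 1
slope = sxy / sxx               # regression coefficient: units of y per unit of x
```

Here `slope == r * sqrt(syy / sxx)`, which makes the table's point concrete: correlation is a symmetric, unitless measure of association, while the regression slope carries units and a direction of prediction.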
4. PERT vs CPM

Aspect-by-aspect comparison:
a) Full Form: PERT stands for Program Evaluation and Review Technique; CPM stands for Critical Path Method.
b) Focus: PERT emphasizes time estimates and uncertainty; CPM focuses on identifying critical activities.
c) Purpose: PERT was initially designed for R&D and engineering projects; CPM was originally developed for construction projects.
d) Activity Duration: PERT uses three time estimates (optimistic, most likely, and pessimistic) to account for uncertainty; CPM uses a single duration estimate for each activity.
e) Calculation of Timing: PERT calculates the expected time as a weighted average of the three estimates, te = (O + 4M + P) / 6; CPM determines the critical path and the total project duration.
f) Path Variability: PERT focuses on the uncertainty in individual activity times and overall project duration; CPM emphasizes the identification of the critical path, which has no slack time.
g) Project Scheduling: PERT is more suitable for projects with uncertain activity durations; CPM is suited for projects with well-defined activities and known durations.
h) Modern Usage: PERT is still used in research-oriented projects with uncertain activity durations; CPM is widely used across industries for project management.

5. Why is multiple regression analysis important over simple linear regression?
Multiple regression analysis is important over simple linear regression when the relationship
between the dependent variable and independent variables is more complex and involves
multiple factors influencing the outcome. Here are five reasons why multiple regression
analysis is preferred over simple linear regression in certain scenarios:
a) Multiple Factors: In many real-world situations, the outcome of interest is influenced by
multiple independent variables rather than just one. Multiple regression allows you to
account for the joint effects of several variables on the dependent variable, providing a
more comprehensive understanding of the relationships.
b) Control for Confounding: When there are multiple independent variables, it's possible
for some of them to be correlated with each other. Multiple regression enables you to
control for these correlations and isolate the unique contribution of each variable to the
dependent variable's variation, reducing the risk of spurious or misleading results.
c) Increased Accuracy: Including additional relevant independent variables in the analysis
can lead to a more accurate model. By considering multiple factors, the model can
capture more of the underlying complexity in the data, resulting in better predictions
and more robust conclusions.
d) Realistic Modeling: Many real-world phenomena are influenced by a combination of
factors, not just a single one. For example, when predicting a person's income, you
would likely need to consider variables such as education, experience, and location
simultaneously to create a realistic model.
e) Enhanced Explanation: Multiple regression allows you to explore interactions and
relationships among various independent variables. This can lead to insights about how
different variables interact to affect the outcome. Simple linear regression can overlook
these interactions.

6. Mention the conditions for using qualitative and quantitative forecasting.
Qualitative and quantitative forecasting methods are used based on the availability of data, the
nature of the forecast, and the level of uncertainty. Here are the conditions for using each type
of forecasting:
Qualitative Forecasting:
 Conditions for Use: Qualitative forecasting is suitable when historical data is limited or
unavailable, and when the forecast relies on expert judgment, opinions, or subjective
assessments.
 Methods: Qualitative methods include Delphi method (experts provide independent
judgments), market research (surveys and focus groups), and scenario analysis
(developing multiple scenarios based on different assumptions).
Quantitative Forecasting:
 Conditions for Use: Quantitative forecasting is appropriate when historical data is
available and the forecast is based on quantitative relationships or patterns. It's used for
more structured and data-driven forecasts.
 Methods: Quantitative methods encompass time series methods (e.g., moving averages,
exponential smoothing) and causal methods (e.g., regression analysis, econometric
models) that utilize historical data and relationships among variables.
7. What is an index number? Why are index numbers called economic barometers?
An index number is a statistical measure designed to express changes in a variable, relative to a
base period or value. It is used to track the relative changes in quantities, prices, or other
economic or non-economic variables over time. Index numbers provide a way to analyze and
compare data by converting them into a standardized form, making it easier to identify trends,
patterns, and changes in various factors.
They're called "economic barometers" because, like barometers measure weather changes,
index numbers indicate shifts in economic conditions. They reflect economic changes, act as
early indicators, help monitor trends, provide predictive insights, and offer standardized
comparisons across time and categories, making them essential for understanding and
predicting economic shifts.
Example: Consider a Consumer Price Index (CPI), which measures the average change in prices
of a basket of goods and services consumed by households. If the CPI for the current year is
120 and the base year's CPI was 100, it indicates that prices have increased by 20% compared
to the base year. This percentage change serves as a snapshot of inflation trends, helping
economists and policymakers gauge changes in the cost of living.
Similarly, just as a barometer's reading can suggest upcoming weather changes, the CPI's value
can indicate whether an economy is experiencing inflation (rising prices) or deflation (falling
prices). This index number acts as an economic barometer, providing valuable information
about shifts in economic conditions.
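The CPI arithmetic above can be sketched as a simple aggregate price index. The goods and prices below are invented; this unweighted form just compares the total cost of the basket in the current year with its cost in the base year:

```python
# Invented basket: cost of the same goods in the base year and today.
base_prices    = {"rice": 50, "fuel": 80, "rent": 500}
current_prices = {"rice": 60, "fuel": 96, "rent": 600}

# Simple aggregate price index: base period = 100.
index = 100 * sum(current_prices.values()) / sum(base_prices.values())
inflation_pct = index - 100  # percentage rise in prices since the base year
```

With these numbers the index comes out at 120, i.e. a 20% rise over the base year, exactly as in the CPI example. Real CPIs weight each item by its share of household spending rather than summing raw prices.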

8. What is residual analysis? How does it help to decide the appropriateness of the model to fit the data?
Residual analysis is a critical step in evaluating the performance and appropriateness of a
statistical model in fitting data. It involves examining the differences between the observed
values and the predicted values (residuals) generated by the model. Residuals are the vertical
distances between each data point and the corresponding point on the model's fitted line or
curve.
The main goal of residual analysis is to assess whether the model adequately captures the
underlying patterns and variability in the data. It helps determine if the model fits the data
appropriately by checking assumptions (linearity, normality, etc.), identifying patterns that
might indicate model issues, detecting outliers and influential points, detecting autocorrelation, and guiding model improvements if needed. It ensures the model captures the
data's underlying patterns, leading to more accurate predictions and informed decisions.

9. What are the general assumptions of a regression model?


The assumptions of a regression model form the foundation for its validity and accuracy.
Violations of these assumptions can lead to biased or unreliable results. The general
assumptions of a regression model, particularly for linear regression, are as follows:
a) Linearity: The relationship between the independent variables and the dependent
variable is assumed to be linear, meaning a one-unit change in an independent
variable produces a constant change in the dependent variable.
b) Independence of Residuals: The residuals (the differences between observed and
predicted values) should be independent of each other. This assumption ensures that
the residuals are not correlated, indicating that one observation's residual doesn't
provide information about another's.
c) Normality of Residuals: The residuals are assumed to follow a normal distribution. This
assumption is crucial for hypothesis testing and confidence interval estimation.
Departures from normality might affect the reliability of statistical tests.
d) No Multicollinearity: The independent variables should not be highly correlated with
each other. Multicollinearity can make it challenging to distinguish the individual effects
of variables, leading to unstable coefficient estimates.
e) Zero Mean of Residuals: The mean of the residuals is assumed to be zero. This
assumption ensures that the model isn't systematically underestimating or
overestimating the dependent variable.
f) No Autocorrelation: The residuals are assumed to be independent of each other and
not correlated across time or observations. Autocorrelation can lead to inefficiency in
parameter estimates.

10. Write an application of regression modeling in the business industry with an example.
Regression modeling has numerous applications in the business industry, helping organizations
make informed decisions, optimize processes, and understand relationships between variables.
Here's an example of its application in predicting sales based on advertising expenditure:
Example: Predicting Sales from Advertising Expenditure
Imagine a retail company that wants to understand how its advertising spending impacts its
sales. The company collects data on advertising expenditure (in dollars) and corresponding
sales (in dollars) over several months. They decide to use a simple linear regression model to
predict sales based on advertising expenditure.
1. Data Collection: The company gathers data on advertising expenditure and sales for
each month.
2. Model Building: They build a linear regression model where sales is the dependent
variable, and advertising expenditure is the independent variable.
Regression Model: Sales = β0 + β1 * Advertising Expenditure + ε
 Sales: Dependent variable (the one being predicted)
 Advertising Expenditure: Independent variable (predictor)
 β0: Intercept
 β1: Coefficient for Advertising Expenditure
 ε: Error term
3. Analysis: Using statistical software, the company estimates the model's coefficients (β0
and β1) that best fit the data.
4. Interpretation: The coefficient β1 represents the change in sales for a one-unit change in
advertising expenditure, assuming all other factors are constant.
5. Prediction: With the model, the company can predict sales for a given advertising
expenditure.
6. Validation: They assess the model's performance using techniques like residual analysis,
R-squared (coefficient of determination), and hypothesis tests to ensure the model is
valid and reliable.
7. Decision Making: The company can use the model to make data-driven decisions. For
instance, they can use it to optimize their advertising budget to maximize sales or
predict sales for different advertising scenarios.
In this scenario, regression modeling helps the company understand the relationship between
advertising expenditure and sales, enabling them to allocate resources more effectively and
make informed marketing decisions.
Remember that this is just one example, and regression modeling has a wide range of
applications in business, including demand forecasting, pricing optimization, customer churn
prediction, risk assessment, and more. It's a versatile tool that assists businesses in extracting
insights from their data to drive better outcomes.
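Step 3 above (estimating β0 and β1) has a closed form for a single predictor. The advertising and sales figures below are invented, and the data are perfectly linear only to keep the arithmetic transparent:

```python
# Invented monthly data: advertising spend and the sales observed with it.
ads   = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

n = len(ads)
mean_x, mean_y = sum(ads) / n, sum(sales) / n

# Closed-form OLS estimates: b1 = Sxy / Sxx, b0 = ybar - b1 * xbar.
b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(ads, sales))
      / sum((x - mean_x) ** 2 for x in ads))
b0 = mean_y - b1 * mean_x

def predict(ad_spend):
    """Predicted sales for a given advertising expenditure (step 5)."""
    return b0 + b1 * ad_spend
```

With this data, b1 = 2 (each extra unit of advertising is associated with two extra units of sales) and b0 = 5 (baseline sales with zero advertising), matching the interpretation given in step 4.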

11. Simple vs Multiple regression

Aspect-by-aspect comparison:
a) Number of Variables: Simple regression involves one independent variable and one dependent variable; multiple regression involves one dependent variable and multiple independent variables.
b) Purpose: Simple regression analyzes the relationship between two variables; multiple regression analyzes the influence of multiple variables on one dependent variable.
c) Complexity: Simple regression is simpler to implement and interpret; multiple regression is more complex due to interactions among multiple variables.
d) Model Equation: Simple linear equation: Y = β0 + β1 * X + ε; multiple linear equation: Y = β0 + β1 * X1 + β2 * X2 + ... + ε.
e) Interpretation: In simple regression, the coefficient β1 represents the change in Y for a one-unit change in X; in multiple regression, each coefficient represents the impact of its X variable, holding the others constant.
f) Assumptions: Both share the same core assumptions (linearity, independence, etc.); multiple regression additionally requires no multicollinearity.
g) Applications: Simple regression is used when analyzing the relationship between two variables; multiple regression is used when multiple factors influence the dependent variable.
h) Overfitting: Simple regression is less prone to overfitting due to fewer variables; in multiple regression, more variables can lead to overfitting if not managed properly.
i) Example: Predicting sales based on advertising spending (simple); predicting house prices considering factors like size and location (multiple).

12. What do you mean by multicollinearity?


Multicollinearity refers to a statistical phenomenon in which two or more independent
variables in a regression model are highly correlated with each other. In other words, there's a
strong linear relationship among the predictor variables. This can cause problems when fitting
a regression model because it makes it difficult to distinguish the individual effects of each
independent variable on the dependent variable.
Multicollinearity can lead to unreliable and unstable coefficient estimates, making it
challenging to interpret the significance and impact of individual predictors. It doesn't directly
affect the predictive power of the model, but it can undermine the model's ability to provide
meaningful insights about the relationships between variables.
We can detect multicollinearity using the Variance Inflation Factor (VIF). VIF measures how much
the variance of an estimated regression coefficient is inflated due to multicollinearity. VIF
values greater than 10 (or sometimes 5, depending on the context) indicate significant
multicollinearity.
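In the special case of exactly two predictors, the VIF of each one reduces to 1 / (1 - r²), where r is their pairwise correlation. A sketch with made-up data where one predictor is nearly a multiple of the other:

```python
import math

# Invented predictors; x2 is close to 2 * x1, so they are highly collinear.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 6, 8, 11]

n = len(x1)
m1, m2 = sum(x1) / n, sum(x2) / n
s11 = sum((a - m1) ** 2 for a in x1)
s22 = sum((b - m2) ** 2 for b in x2)
s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))

r = s12 / math.sqrt(s11 * s22)  # pairwise correlation of the two predictors
vif = 1 / (1 - r ** 2)          # VIF for either predictor (two-predictor case)
```

Here the VIF comes out far above the usual cutoff of 10, flagging severe multicollinearity. With more than two predictors, each VIF is computed by regressing one predictor on all the others and using 1 / (1 - R²) from that auxiliary regression.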

13. Explain dummy variables in a regression model.


In a regression model, a dummy variable (also known as an indicator variable) is used to
incorporate categorical data or qualitative factors into the analysis. Since regression models are
based on mathematical equations, they require numerical input. Dummy variables are
introduced to represent categorical variables with two or more categories, allowing the model
to account for categorical effects.
Dummy variables can be assigned only binary values: 0 and 1. For example: Consider the
categorical variable “sex” with two categories: Male and Female. We may assign 0 to Male and
1 to Female (or vice-versa) and include the dummy variables as predictors in the regression
equation.
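The sex example above can be coded directly; the records below are invented:

```python
# Invented records with a categorical variable "sex".
records = [
    {"sex": "Male",   "income": 40},
    {"sex": "Female", "income": 45},
    {"sex": "Female", "income": 50},
]

# Dummy coding: Female -> 1, Male -> 0. The 0/1 column can now enter a
# regression equation as a numeric predictor.
for rec in records:
    rec["sex_female"] = 1 if rec["sex"] == "Female" else 0
```

For a categorical variable with k categories, only k - 1 dummy variables are included; the omitted category becomes the reference level absorbed by the intercept (this avoids the so-called dummy variable trap, a form of perfect multicollinearity).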

14. What is autocorrelation?


Autocorrelation, also known as serial correlation, refers to the correlation of a variable with
itself over different time intervals or lags. In other words, it's the degree of similarity between
observations in a time series with observations at previous time points. Autocorrelation is a
common phenomenon in time series data, and it can have significant implications for analysis
and modeling.
Positive autocorrelation occurs when a variable's values at one time are positively correlated
with its values at previous times. This suggests that past values influence future values, and
there is a trend or pattern in the data. Negative autocorrelation implies that past values are
inversely related to future values, while zero autocorrelation indicates no correlation between
current and past values.
To detect autocorrelation, we commonly use the Durbin-Watson test.
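The Durbin-Watson statistic is simple to compute from a model's residuals: values near 2 suggest no autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation. The residuals below are invented, with like-signed values clustered together to mimic positive autocorrelation:

```python
# Invented residuals: long runs of the same sign indicate positive
# autocorrelation (each residual resembles the one before it).
residuals = [1.0, 0.8, 0.9, 1.1, -1.0, -0.9, -1.1, -0.8]

def durbin_watson(e):
    """DW = sum of squared successive differences / sum of squared residuals."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(v ** 2 for v in e)
    return num / den

dw = durbin_watson(residuals)  # well below 2 for this clustered series
```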

15. ANOVA in regression analysis.


In the context of regression analysis, ANOVA serves two main purposes:
a) Global Test of Model Fit: ANOVA can be used to test the overall significance of the
regression model. This is done by comparing the variability explained by the regression
model (sum of squares regression) to the unexplained variability (sum of squares
residuals). The resulting F-statistic is used to determine whether the regression model as
a whole significantly improves the fit compared to a null model (no predictors).
b) Variable Significance Testing: ANOVA is also used to assess the significance of individual
predictor variables. It compares the variability explained by a specific predictor (sum of
squares regression for that predictor) to the unexplained variability (sum of squares
residuals) when that predictor is excluded from the model. This helps determine
whether adding a specific predictor significantly improves the model's fit.
In summary, ANOVA in regression analysis helps in testing the overall significance of the model
and the significance of individual predictor variables. It provides valuable insights into whether
the model explains a significant portion of the variability in the dependent variable and
whether each predictor variable contributes significantly to the model's predictive power.

16. What are the uses of index numbers in business?


Index numbers find versatile uses in business:
a) Price Analysis: Monitor cost changes for pricing strategies.
b) Performance: Evaluate industry and stock performance.
c) Economic Indicators: Gauge economic health and activity.
d) Market Research: Track consumer sentiment and trends.
e) Real Estate: Monitor property value changes.
f) Risk Management: Assess creditworthiness and risk.
g) Supply Chain: Analyze performance and efficiency.
h) Currency: Evaluate exchange rates for trade.
i) Employee Compensation: Benchmark salaries.
j) Inventory and Sales: Optimize management and marketing.
k) Productivity: Evaluate workforce efficiency.
Index numbers guide decisions, strategy, and analysis across diverse business functions.

17. What is a cost-of-living index number? What does it measure?


A Cost-of-Living Index (COLI) number is a statistical measure that quantifies the relative cost of
living between different geographical areas or time periods. It provides a way to compare the
overall expenses required to maintain a standard of living across different locations or over
time. The COLI is often presented as an index number that reflects the percentage difference in
the cost of living between a base area or period and the areas or periods being compared.
The Cost-of-Living Index measures the average price levels of a representative basket of goods
and services commonly consumed by households. These items typically include housing costs,
food, transportation, healthcare, education, and other essential goods and services. By
comparing the cost of this basket across different areas or time periods, the COLI offers insights
into the relative affordability of living in various places or during different time spans.

18. Why is Fisher's index number called an ideal index number?


Fisher's index number is called the ideal index number for the following reasons:
a) Fisher's index uses the geometric mean of the Laspeyres and Paasche index numbers,
which is considered best for constructing index numbers.
b) It satisfies both the time reversal and factor reversal tests.
c) It takes into account both current year as well as base year prices and quantities.
d) It is free from bias.
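The geometric-mean construction in point a) can be sketched with invented prices and quantities: Laspeyres weights current prices by base-year quantities, Paasche by current-year quantities, and Fisher takes the geometric mean of the two:

```python
import math

# Invented two-good basket: prices p and quantities q in the base year (0)
# and the current year (1).
p0, q0 = [10, 20], [5, 3]
p1, q1 = [12, 25], [4, 4]

# Laspeyres: current prices at base-year quantities.
laspeyres = 100 * sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))
# Paasche: current prices at current-year quantities.
paasche = 100 * sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))
# Fisher's ideal index: geometric mean of the two.
fisher = math.sqrt(laspeyres * paasche)
```

Since the geometric mean always lies between its two inputs, Fisher's index falls between the Laspeyres and Paasche values, balancing the base-year and current-year weightings.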

19. Describe the importance of time series analysis in business decision making.
Time series analysis is vital for business decisions:
a) Forecasting: Predict demand and trends.
b) Planning: Optimize budgets and resources.
c) Risk Management: Identify and mitigate risks.
d) Performance Evaluation: Measure growth and progress.
e) Marketing: Tailor campaigns to consumer behavior.
f) Supply Chain: Enhance efficiency and demand management.
g) Investments: Analyze market trends and stock prices.
h) Customer Insights: Understand preferences and loyalty.
i) Operations: Improve resource allocation and production planning.
20. What is forecasting error in time series? List the measures of
forecast accuracy.
Forecasting error in time series refers to the discrepancy between the predicted values
generated by a forecasting model and the actual observed values. It quantifies how well the
forecasted values match the real outcomes. A low forecasting error indicates a more accurate
forecast, while a higher error suggests that the model's predictions deviate from reality.
Measures of forecast accuracy are:
a) Mean Absolute Deviation: MAD = (1/n) Σ|et|
b) Mean Squared Error: MSE = (1/n) Σ et²
c) Mean Absolute Percentage Error: MAPE = (100/n) Σ|et / yt|
where et = yt - ŷt is the forecast error at time t. (Write the formula for each in the exam as well.)
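The three measures can be sketched directly; the actual and forecast values below are invented:

```python
# Invented actual observations and the forecasts a model produced for them.
actual   = [100, 110, 120, 130]
forecast = [ 98, 115, 118, 135]

errors = [a - f for a, f in zip(actual, forecast)]
n = len(errors)

mad  = sum(abs(e) for e in errors) / n                              # Mean Absolute Deviation
mse  = sum(e ** 2 for e in errors) / n                              # Mean Squared Error
mape = 100 * sum(abs(e) / a for e, a in zip(errors, actual)) / n    # Mean Abs. % Error
```

MSE penalizes large errors more heavily than MAD because errors are squared, while MAPE expresses accuracy as a percentage and so is comparable across series of different scales (but is undefined when an actual value is zero).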

21. What do you mean by active constraints in LPP?


In Linear Programming (LP), active constraints refer to the constraints that are "active" or
"binding" at the optimal solution of the LP problem. An active constraint is one that is satisfied
with equality at the optimal point and defines a boundary of the feasible region.
For example, consider the constraint 2x + 3y ≤ 10 with optimal solution (x, y) = (2, 2). Substituting x = 2 and y = 2 into the left-hand side:
2(2) + 3(2) = 4 + 6 = 10
Since LHS = RHS = 10, the constraint holds with equality and is therefore an active constraint.
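The activeness check in this example can be written as a small helper; the constraint and point are the ones used above:

```python
# A <= constraint is active at a point when its left-hand side equals its
# right-hand side; the example reuses 2x + 3y <= 10 at (2, 2).

def is_active(coeffs, point, rhs):
    """True if sum(c_i * x_i) equals rhs, i.e. the constraint binds."""
    lhs = sum(c * v for c, v in zip(coeffs, point))
    return lhs == rhs

binding     = is_active([2, 3], (2, 2), 10)  # 2(2) + 3(2) = 10, active
not_binding = is_active([2, 3], (1, 1), 10)  # 2 + 3 = 5 < 10, not active
```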

22. Primal vs Dual in LPP


Aspect-by-aspect comparison:
a) Objective: The primal minimizes cost or maximizes profit; the dual maximizes revenue or minimizes resource usage.
b) Variables: Primal decision variables represent activities or quantities; dual variables correspond to the primal constraints.
c) Constraints: Primal constraints define resource limitations; dual constraints reflect their economic interpretation.
d) Formulation: The primal uses inequality constraints with ≤ or ≥ signs; the dual uses the corresponding reversed ≥ or ≤ signs.
e) Objective Coefficients: Primal coefficients reflect costs or profits; dual coefficients reflect resource availabilities.
f) Relationship: For a maximization primal, any feasible primal objective value is ≤ any feasible dual objective value (weak duality); at the optimum the two values are equal.
g) Sensitivity Analysis: In the primal, shadow prices reflect resource sensitivity; in the dual, the variables themselves carry that sensitivity information.
23. Redundant constraints in LPP.
In Linear Programming (LP), redundant constraints are constraints that do not affect the
feasible region or the optimal solution of the problem. These constraints are essentially "extra"
and can be removed without changing the feasible region's boundaries or the optimal
solution's values. In other words, redundant constraints provide redundant information and
can be safely eliminated from the LP problem without altering the problem's optimal solution
or the set of feasible solutions.

24. Critical Path Method


CPM stands for Critical Path Method; a project management technique used to plan and
manage activities in complex projects. It helps identify the most critical tasks and the shortest
time in which a project can be completed. CPM involves creating a network diagram to
visualize task dependencies, durations, and critical paths.

25. Discuss why dummy activities are required in a network diagram.


Dummy activities are used in network diagrams in project management techniques like the
Critical Path Method (CPM) and the Program Evaluation and Review Technique (PERT). They
are introduced to represent certain types of dependencies between tasks that cannot be
shown using only regular activities. Dummy activities serve to maintain the correct sequence
and logic of the network diagram. Here's why dummy activities are required:
a) Shared Start and End Events: In an activity-on-arrow diagram, two parallel activities
may share the same start and end events, or one activity may depend on only some of
another activity's predecessors. Such relationships cannot be drawn with regular
arrows alone, so a zero-duration dummy activity is introduced to represent them.
b) Maintaining Sequence: Dummy activities help maintain the correct sequence of tasks
and ensure that the logical relationships between tasks are accurately reflected in the
network diagram.
c) Handling Parallel Activities: When multiple activities need to start simultaneously, a
dummy activity can be used to show their coordination and ensure the correct flow of
the network diagram.
d) Avoiding Loops: Dummy activities can prevent the formation of loops or circular
dependencies in the network diagram, which can create confusion and inaccuracies in
project scheduling.
e) Critical Path Calculation: Dummy activities influence the calculation of the critical path
in CPM. They can impact the total project duration and help identify which tasks are
truly critical for project completion.
f) Visual Clarity: Introducing dummy activities can improve the clarity of the network
diagram by representing complex relationships in a more understandable way.

26. Define slack and surplus value with suitable examples.


In Linear Programming (LP), "slack" and "surplus" are terms used to describe the difference
between the available resources and the resource usage or constraints within the solution of
the linear programming problem.
a) Slack: Slack refers to the amount by which a resource constraint can be relaxed
without affecting the optimal solution. It represents the surplus availability of a resource
beyond what is required to satisfy the constraints. It is associated with ≤type constraint.
For example, if the constraint is 3x + 2y ≤ 12 and the optimal solution is (x, y) = (2, 2), then
3(2) + 2(2) = 10 ≤ 12
Hence, slack = 12 - 10 = 2.
b) Surplus: Surplus is the excess of resources over the requirement stated in a
constraint. It represents the amount by which a constraint can be strengthened or
tightened without affecting the optimal solution. It is associated with ≥type constraint.
For example, if the constraint is 3x + 2y ≥ 12 and the optimal solution is (x, y) = (2, 4), then
3(2) + 2(4) = 14 ≥ 12
Hence, surplus = 14 - 12 = 2.
In summary, slack represents unused resources within a constraint, while surplus represents
excess resources beyond a constraint's requirement in a linear programming problem.
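The two worked examples can be verified in code; the numbers are the ones used above:

```python
# Slack for a <=-type constraint and surplus for a >=-type constraint.

def slack(coeffs, point, rhs):
    """Unused resource in a <= constraint: rhs - lhs."""
    return rhs - sum(c * v for c, v in zip(coeffs, point))

def surplus(coeffs, point, rhs):
    """Excess over the requirement in a >= constraint: lhs - rhs."""
    return sum(c * v for c, v in zip(coeffs, point)) - rhs

s = slack([3, 2], (2, 2), 12)     # 3x + 2y <= 12 at (2, 2): 12 - 10 = 2
e = surplus([3, 2], (2, 4), 12)   # 3x + 2y >= 12 at (2, 4): 14 - 12 = 2
```

A slack or surplus of zero means the constraint is active (binding) at that point, which ties this question back to the definition of active constraints above.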
