1 Introduction and Objectives: IEOR E4150 Introduction To Probability and Statistics CVN Fall 2019 Dr. A. B. Dieker


IEOR E4150 Introduction to Probability and Statistics CVN Fall 2019

Dr. A. B. Dieker

Final project

The syllabus contains the following information about the project:

Students are required to do a project using financial data. The project will require you
to do some computer programming. At the end of the semester, you are required to
submit a report on your project as well as the computer code. Your report cannot be
longer than 5 pages (excluding references and computer code). You are also required
to submit your computer code.

See the syllabus for details on grading.

1 Introduction and objectives

You just started a new job at an investment company, and your new boss gives you the following
assignment on your very first day: You need to carry out a statistical analysis of stocks to be sent
to clients. The clients are familiar with basic statistical tools, and they are interested in original
questions and insightful analysis.

The final project is open-ended to simulate a real-world setting. The goal is to investigate
statistically whether you can find evidence for an “old wives’ tale” of your choice. An example
would be that January stock returns predict annual returns; see here for more information (but
choosing this wouldn’t be original, since I’m mentioning this example here). You don’t have to use
stock data; you may also investigate mutual funds, bonds, etc. In this document I call all of this
‘stock’ data. The questions you formulate and what you discuss in your report are totally up to you!
This document contains the ‘bare-bones’ instructions, but originality and creativity
substantially affect your grade.

2 Report

In your report, make sure you: (1) Describe the data set. (2) Formulate a set of questions you want
to investigate, i.e., the goals of your project. (3) Analyze the data and draw conclusions.

You do not want to overwhelm clients with unimportant details, so you have to choose what you
want to say. Shorter is often better, so think carefully about the points you want to make. For
instance, clients will not be interested in an endless number of figures. Make what you write count!

I’ve uploaded an example project report from a related class (see Courseworks under “Files/project”).
The instructions were different for that class, but it is the best example that I can share. (This
example report doesn’t have any figures, so please note the above comment that carefully chosen
figures can be very helpful.)

3 Data

You need to use log-returns for your stock market data, which you may assume to form a random
sample. You can, for instance, download this data from Yahoo or Google Finance, or use
a convenient package.

Calculate log-returns for your stocks. The log-return is defined as

\[
\log\left(\frac{S_t^{\text{close}}}{S_t^{\text{open}}}\right),
\]

where $S_t^{\text{close}}$ and $S_t^{\text{open}}$ are the close and open prices, respectively, on day $t$. (Here log is natural log.)
Depending on the question you want to explore, you may alternatively measure time in minutes,
hours, days, weeks, etc.
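
As a minimal sketch of the definition above, the following Python snippet computes log-returns from matching open and close prices. The function name `log_returns` and the sample prices are illustrative assumptions, not part of the assignment; real prices would come from a source such as Yahoo Finance.

```python
import math


def log_returns(open_prices, close_prices):
    """Return log(close / open) for each period (natural log)."""
    if len(open_prices) != len(close_prices):
        raise ValueError("open and close series must have the same length")
    return [math.log(c / o) for o, c in zip(open_prices, close_prices)]


# Made-up prices, for illustration only.
opens = [100.0, 102.0, 101.0]
closes = [102.0, 101.0, 101.0]
returns = log_returns(opens, closes)
```

The same function applies unchanged if a period is a minute, an hour, or a week, as long as each open price is matched with the close price of the same period.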

4 Code

You can choose between coding in R or Python. Please write clean code, as it affects your grade.
Document your code. Your code needs to have the following capabilities at a minimum:

• Given one stock symbol, your code needs to be able to: (1) Display histograms for your data
by stock symbol. (2) Display a normal probability plot to see if the data is approximately
normal. (3) Create (approximate) confidence intervals for the means and variances given a
confidence level. (4) Perform a regression of the log-return on time.

• Given two stock symbols, your code needs to be able to: (1) Test the equality of the two
population means (use a test for paired data if appropriate). (2) Perform a regression of one
log-return on the other.
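
The interval and test requirements above could be sketched as follows. This is one possible approach, not the required one: it uses only the Python standard library and large-sample normal approximations (the instructions allow approximate intervals), whereas a statistics package would let you use exact t and chi-square quantiles. All function names here are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def mean_ci(x, level=0.95):
    """Large-sample (normal-approximation) interval for the population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(variance(x) / len(x))
    m = mean(x)
    return m - half_width, m + half_width


def variance_ci(x, level=0.95):
    """Large-sample interval for the population variance.

    Assumption: uses the asymptotic result Var(s^2) ~ (m4 - s^4) / n,
    where m4 is the fourth central sample moment; no normality is assumed.
    """
    n = len(x)
    m = mean(x)
    s2 = variance(x)
    m4 = sum((v - m) ** 4 for v in x) / n
    se = math.sqrt(max(m4 - s2 ** 2, 0.0) / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # A variance cannot be negative, so the lower end is clipped at zero.
    return max(s2 - z * se, 0.0), s2 + z * se


def paired_z_test(x, y):
    """Two-sided large-sample test of H0: equal means, for paired samples.

    Returns the test statistic and the p-value.
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x, y)]
    stat = mean(d) / math.sqrt(variance(d) / len(d))
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_value
```

Pairing is natural when both log-return series are observed on the same days. With only a few observations, the normal approximation is rough and t-based versions would be the safer choice.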

All regression output needs to include intercept and slope estimates, a diagram of the data with
the least-squares line, a graphical depiction of residuals, and $R^2$.
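
The numerical part of this regression output could be computed as in the sketch below, which applies the usual least-squares formulas by hand with the standard library only. The function name `least_squares` and the dictionary layout are illustrative assumptions; a statistics package would provide the same quantities directly.

```python
def least_squares(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Residuals: observed value minus fitted value.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in residuals)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": 1 - sse / sst,
        "residuals": residuals,
    }
```

For a regression of the log-return on time, `x` could simply be the period index `0, 1, 2, ...`; for two stocks, `x` and `y` would be the two log-return series. The fitted line and the residuals can then be passed to a plotting library of your choice for the required diagrams.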

5 Resources

The Anaconda distribution is recommended for your project. If you choose to use R, make sure
you use RStudio as your IDE.

6 Submission

You need to submit a report and code in one zip file. The file name of your zip file should be
UNI.zip, where you need to replace UNI with your UNI. The file should contain the following files:

1. Your report in PDF format.

2. All of the code. It needs to be ready to run on my machine. For instance, all dependent
data files must be included (and in the folder that is referenced in your code). I will set the
working directory to your code directory when I grade.

You need to submit this on Courseworks before December 6, 11:59pm.
