0% found this document useful (0 votes)
68 views4 pages

Extra Practice #3 Making A Movie

extra homework
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
68 views4 pages

Extra Practice #3 Making A Movie

extra homework
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 4

BUAD 425

Extra Practice #3 Making a Movie

Use JMP13, Excel and the data to answer the following questions. Where appropriate,
provide clear, succinct justifications of your rational.

Background

In 2016, the film industry in the US was over $11.5 Billion dollars.

Since talented directors and lead actors often consistently create blockbuster movies,
the name of the director and cast are generally good predictive variables for a film’s
success. However, a number of other variables may also be important to consider
including the film budget, country of origin, and plot (is the movie a sequel?).

In the past several years, social media has emerged as an important method of
marketing and advertising. In this assignment, we attempt to predict whether a movie
will be a “financial success” by considering a number of factors pertaining to the social
media presence, the duration of the movie, and the overall popularity of the cast and
director etc. We say a movie is a “financial success” if the movie yields a 30% profit or
higher.

The dataset movies.jmp consists of a subset of 2,823 movies produced from 1936 to
2010. A detailed description of the variables and their meanings are available at the
end of this document.

Instructions
Download the dataset “movies.jmp” from BB. The “.jmp” version has already been split
into a training set and a test set (the first 1000 movies are considered testing data).

You should use JMP, Excel or both to answer the following questions. Be sure to detail
your work and approach. Where appropriate, justify your responses quantitatively. You
may work in teams, but everyone must write-up their own solutions (and perform their
own computations) separately.

Logistic Regression (Linear Classification):


1. (2 pts) Build a logistic regression model to predict whether a movie will be a
financial success using all the variables EXCEPT:

 Movie_title
 Title_year
 Content_rating
 Country
 Director_name
NOTICE: This case and its solutions are COPYRIGHTED. They may not be copied, sold, published, disseminated, shared, or other wise
communicated to third parties whether in person, online or otherwise and whether or not for a profit or nonprofit purpose (2016).
1
BUAD 425

 Actor_main
 Testing_Data

What is the R^2 of this model?

2. (2 pts) Use stepwise and the “Go” option to build another logistic regression
model to predict whether a movie will be a financial success. Allow JMP to use
the same variables as in Q1. What variables does JMP ultimately pick? What is
the R^2 of this model?

3. (2 pts) Compare your answers in Q1, and Q2. Which do you think is a better
model for the business? Justify your response using the JMP output.

4. (6 pts) In the stepwise model you created in Q2, what is the coefficient (or
weight) of “director_facebook_likes”? Comparatively, what is the coefficient (or
weight) of the “actor_main_facebook_likes”?

Do these coefficients make sense? Justify your response. What conclusions


might you draw for the business?

5. (4 pts) Using the stepwise model from Q2 and a probability threshold of .57,
what is the resulting confusion matrix? (Remember to only use the testing data).

6. (4 pts) What is the accuracy of the classifier in Q5 (on the testing data)?

7. (6 pts) As a conservative estimate, let’s assume that we invest $1M for every
movie we predict is going to be a financial success, and on average, we earn a
profit of $4M every time we are correct (i.e. when we correctly invest in movies
which are financial successes).

In such a scenario, each time we incorrectly predict a movie is going to be a


financial success, we lose $1M and each time we predict a movie is not going to
be a success, and it is, we lose the opportunity to make $4M.

According to the testing data in our model, how much money would we have
lost given the scenario above? Can we adjust our model to lower this number?
Briefly explain.

NOTICE: This case and its solutions are COPYRIGHTED. They may not be copied, sold, published, disseminated, shared, or other wise
communicated to third parties whether in person, online or otherwise and whether or not for a profit or nonprofit purpose (2016).
2
BUAD 425

Decision Tree Questions:

8. (4 pts) Use the “Go” option and fit a decision tree model with the same variables
as in Q1. How many times did JMP split the data? Comment on why JMP
stopped at this number?

9. (4 pts) What is the R Square value of your model on the training data? On the
testing data?

10. (4 pts) A retired movie producer comments that social media has changed the
movie production world, and at this stage, the main actor having high facebook
likes is the most important indication of whether a movie will be successful or
not.

Based on the decision tree model in Q8, do you agree? If not, which variable is
the most important indication?

11. (4 pts) Describe the characteristics of a movie which is most likely NOT to be a
financial success (in other words, describe the characteristics of a movie which
would lead us to most confidently assume that the movie will not be a success)

12. (4pts) What is the confusion matrix for this model? Use a probability threshold
of .65.

13. (4 pts) What is the overall accuracy of this model? Is this model more accurate
than the Logistic Regression model from the previous parts?

Based on the assumptions stated in Q7, which model would you suggest the
company use? State clearly any assumptions you are making and justify your
response quantitatively.

Appendix:

Variable key for “movies.jmp”


 movietitle = the title of the movie
NOTICE: This case and its solutions are COPYRIGHTED. They may not be copied, sold, published, disseminated, shared, or other wise
communicated to third parties whether in person, online or otherwise and whether or not for a profit or nonprofit purpose (2016).
3
BUAD 425

 title_year = the year the movie was released


 duration = the length of the movie in minutes
 content_rating = the rating of the content of the movie
 country = the country in which the movie was produced
 director_name = the name of the director
 actor_main = the name of the main actor in the movie
 director_facebook_likes = the number of facebook likes the director received
during the promotion phase of the movie
 actor_main_facebook_likes = the number of facebook likes the main actor
received during the promotion phase of the movie
 cast_total_likes = the number of facebook likes the entire cast received during
the promotion phase of the movie
 movie_facebook_likes = the number of facebook likes the movie overall
received during the promotion phase of the movie
 num_voted_users = the number of users who voted in various social media
campaigns including facebook
 facenumber_in_poster = the number of people in the promotional posters for
the movie
 imdb_score = the IMDB score for the movie (movies are scored before being
released)
 Financial_Success = an indication of whether the movie was financially
successful (1 = financially successful or 0 = not financially successful)
 Testing = an indication of whether the record is training or testing data (1 =
testing data or 0 = training data)

NOTICE: This case and its solutions are COPYRIGHTED. They may not be copied, sold, published, disseminated, shared, or other wise
communicated to third parties whether in person, online or otherwise and whether or not for a profit or nonprofit purpose (2016).
4

You might also like