BUSANA 7001 - Predictive and Visual Analytics for Business

2025 S1

Group Assignment

Instructions

1. The assignment can be done in groups of one to three students. All team
members are expected to contribute approximately equally to a group assignment.
A group can eliminate an underperforming member who then will need to do
the assignment individually or join another group. Similarly, one can quit
an underperforming group to do the assignment individually or join another
group. All group members will get the same mark for the assignment.

2. The maximum score is 25 points.

3. The presentation of your write-up is important.

4. Whenever possible, numerical analysis (including data cleaning, etc.), as well
as all tables and figures, should be done using Python or SAS Visual Analytics.
However, you may use Excel or Word, etc., to make tables for regressions.

5. Please retain your Python code and make sure that it is user-friendly (use
comments where necessary). Using your submitted code, one should be able
to produce all your results, tables, and figures.

6. Please retain a copy of the problem set that is submitted.

7. Only one member of a group submits 4 files:

• `Assignment Cover Sheet', which must be signed (electronic signature is
okay) and dated

• the report (in doc, docx, or pdf format) for Tasks 1-4; the report should be
properly formatted and be similar to a business report; font: 12 pt Times
New Roman; maximum number of pages: 10 (no penalty for exceeding
this limit)

• Python code (in py format or Jupyter notebook, or Google Colab based
.ipynb document)

• Spreadsheet with the user reviews (used in Task 4).

8. The lecturer can refuse to accept assignments that do not have a signed
acknowledgment of the University's policy on plagiarism.

9. Any suspected plagiarism will be severely punished. This includes any student
that submits copied work or any student that allows their work to be copied.

10. You must acknowledge any external material you use in your answers, e.g.,
material from websites, textbooks, academic journals and newspaper articles.

11. All queries (including deadline extensions) for this project should be directed
to the Course Coordinator.

12. The submission deadline for the problem set is 6pm, Friday the 13th of June,
2025.

13. The submission must be done through MyUni.

14. Late submission will be penalized 2.5 points per day.

Agenda

Assume that you are a compensation consultant working at a leading consulting
firm. Your client is IBM Corporation, one of the United States' largest IT and
consulting companies. They need your help to determine the compensation of their
CEO in 2025.
You have been provided with 2 datasets. The dataset `salaries_2025_S1' contains
the following variables:

GVKEY Company ID Number

YEAR Fiscal Year

TDC1 Total Compensation (Salary + Bonus + Other Annual + Restricted
Stock Grants + LTIP Payouts + All Other + Value of Option Grants).

The dataset `companies_2025_S1' contains the following variables:

GVKEY Company ID Number

YEAR Fiscal Year

AT Assets - Total (in $ millions)

CONM Company Name

SALE Sales/Turnover (Net) (in $ millions)

debt_at Financial leverage (debt divided by assets)

roa Return on assets (net income divided by assets)

cash_at Cash holdings divided by assets

rd_at Research and development expenses divided by assets

capex_at Capital expenditure (investments) divided by assets

mb Market-to-book ratio

ppe_at Property, plant, and equipment divided by assets

sic4 Industry code.

First, you should prepare your data for the analysis:

• remove duplicates (if any)

• check for outliers and take the necessary actions to deal with them

• merge the datasets

• and so on.

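The preparation steps above could look roughly like the sketch below. It runs on made-up toy values, not the provided data. The column names follow the variable lists; the helper name `prepare`, the column `TDC1_w`, and the percentile cut-offs are assumptions for illustration, and other outlier treatments are equally defensible.

```python
import pandas as pd

def prepare(salaries: pd.DataFrame, companies: pd.DataFrame,
            lower: float = 0.01, upper: float = 0.99) -> pd.DataFrame:
    """Deduplicate, merge on GVKEY/YEAR, and winsorise TDC1 (hypothetical helper)."""
    # Keep the first row for each firm-year in each dataset.
    salaries = salaries.drop_duplicates(subset=["GVKEY", "YEAR"])
    companies = companies.drop_duplicates(subset=["GVKEY", "YEAR"])
    # Keep only firm-years present in both datasets.
    merged = salaries.merge(companies, on=["GVKEY", "YEAR"], how="inner",
                            validate="one_to_one")
    # Winsorise: cap TDC1 at the chosen percentiles (missing values stay missing).
    lo, hi = merged["TDC1"].quantile([lower, upper])
    merged["TDC1_w"] = merged["TDC1"].clip(lower=lo, upper=hi)
    return merged

# Toy inputs; GVKEY is kept as text so that leading zeros survive.
salaries = pd.DataFrame({
    "GVKEY": ["000001", "000001", "000002", "000003"],
    "YEAR": [2024, 2024, 2024, 2024],
    "TDC1": [1000.0, 1000.0, 2000.0, 90000.0],
})
companies = pd.DataFrame({
    "GVKEY": ["000001", "000002", "000003"],
    "YEAR": [2024, 2024, 2024],
    "AT": [500.0, 800.0, 1200.0],
})
# The cap at the median is deliberately aggressive so the effect shows on three rows.
clean = prepare(salaries, companies, lower=0.0, upper=0.5)
print(clean)
```

Winsorising keeps every firm-year in the sample, which matters here because the firm to be predicted must not be dropped; trimming or a log transform are alternatives worth comparing.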
1 OLS regressions (7 points)

This task needs to be done using Python. Briefly discuss your sample, including the
number of observations and outliers. Provide the descriptive statistics of the sample.
How you choose to do this is entirely at your discretion. However, it is recommended
that you consider using both summary statistics and graphical methods (this task
should include at least one properly formatted table, one pie chart, one histogram,
and one scatter plot) while also noting any peculiarities within the dataset. You
should put more emphasis on TDC1. Tables and figures should be included in the
report rather than in the appendix.
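A summary table of the kind described above might be built along the following lines. The values are made up, and the choice of statistics and the skewness comparison are assumptions about what is worth reporting, not a required layout.

```python
import numpy as np
import pandas as pd

# Toy firm-year sample (made-up values, for illustration only).
df = pd.DataFrame({
    "TDC1": [800.0, 1200.0, 1500.0, 2500.0, 40000.0],
    "AT": [300.0, 900.0, 1100.0, 5000.0, 150000.0],
    "roa": [0.02, 0.05, -0.01, 0.08, 0.10],
})

# Summary table: count, mean, std, min, median, max for each variable.
summary = df.describe().T[["count", "mean", "std", "min", "50%", "max"]]
summary = summary.rename(columns={"50%": "median"}).round(2)

# A long right tail shows up as positive skewness; taking logs often reduces it.
skew_raw = df["TDC1"].skew()
skew_log = np.log(df["TDC1"]).skew()

print(summary)
print(f"skewness of TDC1: {skew_raw:.2f}, of log(TDC1): {skew_log:.2f}")
```

A mean far above the median, as in the toy TDC1 column, is one of the peculiarities worth noting in the write-up.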
Find the determinants of the total compensation (TDC1). Estimate 3 different
OLS regressions (in order to ensure the robustness of results) with year and industry
fixed effects and several independent variables. To ensure that regression residuals
`behave well', you may need to scale or transform one or more variables, for
example, by using the natural logarithm of a variable instead of its raw value.
Provide a properly formatted table with the regression results in the report (not in
the appendix; however, you may put additional tables in the appendix if needed).
Discuss the determinants of the total compensation: which variables are statistically
significant; which variables increase and which variables decrease total
compensation; any insights from the coefficient estimates of year and industry fixed
effects; and so on.
Using the results from the 3 regressions, predict the total compensation in 2025
for IBM Corporation (GVKEY = 006066); this value has been deleted from the dataset
`salaries'. Are the predictions similar across the 3 models?
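One specification of this kind is sketched below on synthetic data: log pay regressed on log assets and roa, with year and industry dummies as fixed effects, followed by a prediction for the one firm-year whose pay is missing. The grouping variable `sic2`, the helper `design`, and the data-generating numbers are all assumptions. Plain NumPy least squares is used to keep the sketch dependency-light; a regression package would additionally report the standard errors and p-values needed to discuss significance.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic firm-year panel (made-up numbers, not the assignment data).
n = 200
df = pd.DataFrame({
    "YEAR": rng.choice([2023, 2024, 2025], size=n),
    "sic2": rng.choice(["35", "60", "73"], size=n),  # hypothetical industry groups
    "AT": np.exp(rng.normal(8.0, 1.0, size=n)),
    "roa": rng.normal(0.05, 0.10, size=n),
})
# Relationship assumed for the demo: log pay rises with log assets and roa.
df["log_TDC1"] = (5.0 + 0.4 * np.log(df["AT"]) + 2.0 * df["roa"]
                  + rng.normal(0.0, 0.1, size=n))
df.loc[0, "log_TDC1"] = np.nan  # the firm-year to be predicted

def design(frame: pd.DataFrame) -> pd.DataFrame:
    """Regressors plus year and industry dummies (one level of each dropped)."""
    X = pd.DataFrame({"const": 1.0,
                      "log_AT": np.log(frame["AT"]),
                      "roa": frame["roa"]}, index=frame.index)
    dummies = pd.get_dummies(frame[["YEAR", "sic2"]].astype(str),
                             drop_first=True, dtype=float)
    return pd.concat([X, dummies], axis=1)

# Build the design on the full sample so both parts share the same columns.
X = design(df)
train = df["log_TDC1"].notna()
beta, *_ = np.linalg.lstsq(X[train].to_numpy(),
                           df.loc[train, "log_TDC1"].to_numpy(), rcond=None)
coef = pd.Series(beta, index=X.columns)

# Prediction for the held-out row, converted back from logs to levels.
pred_log = (X.loc[~train].to_numpy() @ beta)[0]
pred_level = np.exp(pred_log)

print(coef.round(3))
print(f"predicted TDC1: {pred_level:.1f}")
```

Exponentiating a log prediction gives a median-type estimate and ignores the retransformation bias, so it is worth stating which convention the report uses. The other two specifications would change only the columns built in `design`.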

2 Time series analysis (7 points)

This task needs to be done using Python. Generate a time series of total
compensation (TDC1) values; that is, annual averages for each year.
Plot the obtained time series. Use ARMA-type models to predict its values for
the next 2 periods. Motivate and discuss the ARMA orders used in the analysis. Plot
the predicted values (a scatterplot against the actual values; then time series plots
of the actual and predicted values in the same figure). Do the actual and predicted time
series tend to move in the same direction over time?
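The sketch below shows the two mechanical steps on toy numbers: collapsing firm-year data to annual averages, then a 2-period-ahead forecast. It assumes the simplest member of the ARMA family, an AR(1), i.e. ARMA(1,0), fitted by least squares. That order is an assumption for illustration only; the actual orders should be motivated by the data (for example via autocorrelation plots or information criteria), and MA terms need a dedicated estimator such as the `ARIMA` class in statsmodels.

```python
import numpy as np
import pandas as pd

# Toy firm-year data (made-up values); in practice this is the merged sample.
df = pd.DataFrame({
    "YEAR": [2019, 2019, 2020, 2020, 2021, 2021,
             2022, 2022, 2023, 2023, 2024, 2024],
    "TDC1": [900.0, 1100.0, 1000.0, 1200.0, 940.0, 1140.0,
             980.0, 1180.0, 950.0, 1150.0, 970.0, 1170.0],
})

# Annual average of TDC1: one observation per year.
series = df.groupby("YEAR")["TDC1"].mean()

# AR(1) with intercept, y_t = c + phi * y_{t-1} + e_t, fitted by least squares.
y = series.to_numpy()
X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
(c, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)

# Iterate the fitted equation forward for a 2-period-ahead forecast.
forecasts = []
last = y[-1]
for _ in range(2):
    last = c + phi * last
    forecasts.append(last)
last_year = series.index[-1]
forecast = pd.Series(forecasts, index=[last_year + 1, last_year + 2])

print(series)
print(forecast.round(1))
```

A series with a clear trend would need differencing before an ARMA model is appropriate, which is worth checking before settling on the orders.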
Given the time series predictions for years 2025 and 2026, what would be the
predicted total compensation of the CEO of IBM Corporation, obtained in Task
1 (assume that the total compensation of the CEO of IBM Corporation evolves
similarly to the average CEO total compensation)?
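One possible reading of this question is that IBM's CEO pay grows at the same rate as the forecast average. Under that assumption the arithmetic is a simple rescaling, sketched here with placeholder numbers that are not results from the assignment data; all variable names are hypothetical.

```python
# All numbers below are placeholders, not results from the assignment data.
ibm_2025_from_ols = 20000.0   # Task 1 prediction for IBM in 2025 (hypothetical)
avg_2025_forecast = 1058.0    # forecast of average TDC1 for 2025 (hypothetical)
avg_2026_forecast = 1065.0    # forecast of average TDC1 for 2026 (hypothetical)

# Assumption: IBM's CEO pay grows at the same rate as the average CEO pay.
growth = avg_2026_forecast / avg_2025_forecast - 1.0
ibm_2026 = ibm_2025_from_ols * (1.0 + growth)

print(f"assumed growth: {growth:.4f}, implied IBM 2026 pay: {ibm_2026:.1f}")
```

Other readings are possible, for instance applying the average growth from the last observed year to IBM's last observed pay, so the report should state which one is used.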

3 Variable importance (4 points)

This task needs to be done using Python. Using decision trees, identify the 5 most
important determinants of TDC1. Try 3 different models. Are the results consistent?
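A ranking of this kind could be obtained as sketched below, using scikit-learn regression trees on synthetic data. Taking three tree depths as the "3 different models" is an assumption; different feature sets or tree-based ensembles would also qualify. The relationship between the features and the target is invented for the demo.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 500
features = ["AT", "SALE", "debt_at", "roa", "cash_at",
            "rd_at", "capex_at", "mb", "ppe_at"]

# Synthetic standardised features (made-up data, not the assignment sample).
X = pd.DataFrame(rng.normal(size=(n, len(features))), columns=features)
# Relationship assumed for the demo: pay driven mainly by AT, then roa.
y = 3.0 * X["AT"] + 1.0 * X["roa"] + rng.normal(0.0, 0.1, size=n)

rankings = {}
for depth in (3, 5, 8):  # three hypothetical tree configurations
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X, y)
    importance = pd.Series(tree.feature_importances_, index=features)
    top5 = importance.sort_values(ascending=False).head(5)
    rankings[f"depth={depth}"] = top5.index.tolist()

# One column per model, rows ordered from most to least important.
top5_table = pd.DataFrame(rankings)
print(top5_table)
```

Features that a tree never splits on get an importance of zero, so the lower ranks of a shallow tree can be arbitrary; comparing the columns of `top5_table` is one way to judge consistency.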
Then repeat the analysis using a categorical version of TDC1 defined as follows:

• `high' if TDC1 is in the top 20%

• `low' if TDC1 is in the bottom 50%

• and `moderate' otherwise.

Compare the results to those obtained when TDC1 was used as a continuous
variable.
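The three categories above can be derived from sample quantiles, as in this sketch on made-up values. How ties at the cut-offs are treated (strictly above the 80th percentile for `high`, at or below the median for `low`) is an assumption, and the name `tdc1_cat` is hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up compensation values, for illustration only.
tdc1 = pd.Series([500.0, 800.0, 900.0, 1200.0, 1500.0,
                  2000.0, 2600.0, 3000.0, 7000.0, 20000.0], name="TDC1")

p50 = tdc1.quantile(0.50)  # upper bound of the bottom 50%
p80 = tdc1.quantile(0.80)  # lower bound of the top 20%

# First matching condition wins; everything else is 'moderate'.
labels = np.select([tdc1 > p80, tdc1 <= p50], ["high", "low"], default="moderate")
tdc1_cat = pd.Series(labels, index=tdc1.index, name="TDC1_cat")

print(pd.concat([tdc1, tdc1_cat], axis=1))
```

With the labels in place, a classification tree (for example scikit-learn's `DecisionTreeClassifier`) can replace the regression tree, and its importance ranking can be set beside the continuous one.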

4 Sentiment Analysis (7 points)

This task needs to be done using the SAS Viya for Learners platform. Consider three
firms: Microsoft (GVKEY = 012141), Apple (GVKEY = 001690), and Amazon
(GVKEY = 064768), and their respective products: Microsoft Surface Go 4, Apple
iPad Mini, and Amazon Kindle (e.g., Kindle Scribe 2024 32GB with Premium Pen).
If you wish, you may choose different products.
From the Google website, download 20 reviews of each product made in the year
2024. Then repeat this task for reviews made in the year 2025. You should collect at
least 120 reviews. Create a spreadsheet with four columns: product, user_id (e.g.,
1, 2, ...), review, and year. The user_id needs to be unique. This spreadsheet
needs to be submitted as a separate document together with the report.
Next, import this spreadsheet into SAS Viya for Learners. Create one word
cloud for each product. Are they visually different or the same?
Then generate sentiment scores for each product in 2024 and 2025. This might
involve manually copying sentiment scores from each review to the spreadsheet and
calculating the mean sentiment scores there. You should get six values. Create a
table with these values and briefly discuss the results.
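The six mean sentiment scores can be computed in the spreadsheet itself; an equivalent calculation in Python is sketched below as a cross-check. It assumes a fifth column, `sentiment`, holding the scores copied from SAS Viya for Learners. The product labels, the scores, and the tiny sample size are made up.

```python
import pandas as pd

# Toy review sheet; 'sentiment' stands for scores copied from SAS Viya (made up).
reviews = pd.DataFrame({
    "product": ["Product A", "Product A", "Product B", "Product B",
                "Product C", "Product C"] * 2,
    "user_id": range(1, 13),
    "review": ["text"] * 12,
    "year": [2024] * 6 + [2025] * 6,
    "sentiment": [0.2, 0.4, 0.6, 0.8, 0.5, 0.7,
                  0.3, 0.5, 0.5, 0.7, 0.8, 0.8],
})

# user_id must be unique across the whole sheet.
assert reviews["user_id"].is_unique

# Mean sentiment per product and year: 3 products x 2 years = six values.
table = reviews.pivot_table(index="product", columns="year",
                            values="sentiment", aggfunc="mean")
# Change in mean sentiment between the two years.
table["change"] = table[2025] - table[2024]

print(table.round(2))
```

The `change` column is the quantity later set against CEO pay growth.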
Finally, plot the annual growth of CEO pay for each company (between 2024
and 2025) on the same graph as the change in the mean sentiment scores of each
product. Is there any relation between the change in customer satisfaction and CEO
pay growth?
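Before plotting, the two quantities can be lined up in one table, as in the following sketch. The firm labels and all numbers are placeholders rather than actual compensation or sentiment figures, and the column names are hypothetical.

```python
import pandas as pd

# Placeholder numbers only: CEO pay (TDC1) and mean sentiment by firm and year.
data = pd.DataFrame({
    "company": ["Firm A", "Firm B", "Firm C"],
    "tdc1_2024": [50000.0, 60000.0, 40000.0],
    "tdc1_2025": [55000.0, 57000.0, 44000.0],
    "sent_2024": [0.30, 0.60, 0.70],
    "sent_2025": [0.40, 0.80, 0.60],
}).set_index("company")

# Annual growth of CEO pay and change in mean sentiment between 2024 and 2025.
data["pay_growth"] = data["tdc1_2025"] / data["tdc1_2024"] - 1.0
data["sent_change"] = data["sent_2025"] - data["sent_2024"]

# Flag firms where pay and sentiment moved in the same direction.
comparison = data[["pay_growth", "sent_change"]]
comparison = comparison.assign(
    same_sign=(comparison["pay_growth"] * comparison["sent_change"]) > 0)

print(comparison.round(3))
```

With only three firms, any pattern in such a table or plot is anecdotal, which is worth saying explicitly when discussing the relation.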
Good luck!
