BI Case Study 3

This case study focuses on a text analysis project analyzing customer feedback for the iPhone 16, utilizing business intelligence techniques to derive insights for product improvement. The analysis revealed a predominantly neutral to positive sentiment among users, with notable concerns regarding battery life and screen refresh rate. Challenges included manual data collection limitations and natural language complexities, with proposed solutions involving automation and advanced NLP models.

Uploaded by

kenomeshack


BIA-5401-0LC

Professor Viktoria Varga

Case Study #3
Submitted Date: 2 April 2025
David Akporuere

Isha Ajit Phakatkar

Ismail Sahin

Meshack Oniera

Rakshini Prabu

Tran Bao Ngoc Nguyen

Contents
Summary of the case
Deliverables
Step 1: Text Analysis Application for Business
Step 2: Data Collection Strategy
Step 3: Data Storage Strategy
Step 4: Text Corpus Construction in Python
Step 5: Discussion of Challenges
Results and Key Findings
Conclusion

Summary of the case:

This assignment involves a text analysis project focused on customer feedback regarding the iPhone 16.
The project utilizes business intelligence techniques to extract insights from user-generated content,
facilitating data-driven decision-making for product improvement and marketing strategies. Key steps
include data collection from various tech platforms, sentiment analysis using tools like Python and NLTK,
and data storage in CSV format. The analysis revealed a predominantly neutral to positive sentiment
among users, with concerns over features like battery life and screen refresh rate. Challenges
encountered included manual data collection limitations and natural language complexities, with
proposed solutions involving automation and advanced NLP models.

Deliverables
Step 1: Text Analysis Application for Business:
● Data Source: User-generated content (reviews, forum posts, expert commentary)
pertaining to the iPhone 16.
● Business Value: Analyzing customer feedback provides direct insights into
customer perceptions, preferences, and pain points. This informs data-driven
decisions to improve products/services, enhance customer satisfaction, and
drive competitive advantage.
● Example: Identification of prevalent complaints regarding battery life serves as a
clear indicator for improvements in future iPhone models.

Step 2: Data Collection Strategy

● Sources: PCMag, Reddit, Medium, Apple Community, GSM Arena, The Verge, and
other tech platforms.
● Method: Manual curation of user and expert review snippets.
● Format: Plain text (.txt) file named "iPhone16_reviews.txt."
● Tools:
  ● Data Source: Websites & online tech review platforms
  ● Data Format: .txt file
  ● Data Tool: Python open() for reading, nltk for sentiment analysis
● Rationale: Manual collection ensured diverse and relevant content without API or
authentication complexities. The collected reviews were saved in a .txt file
containing 15 structured lines, each representing one review. These were read
into Python using a simple file handler (open()), and each line was treated as a
separate review record for analysis.
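
The file-reading step described above might look like the following minimal sketch. It assumes one review per line and UTF-8 encoding. The load_reviews helper and the sample lines are hypothetical, and a temporary file stands in for iPhone16_reviews.txt so the snippet runs on its own.

```python
import os
import tempfile


def load_reviews(path):
    """Return one review per non-empty line of a plain-text file."""
    with open(path, encoding="utf-8") as handle:
        # Strip surrounding whitespace and skip blank lines.
        return [line.strip() for line in handle if line.strip()]


# Hypothetical sample content; the real file holds 15 curated reviews.
sample = "Battery life could be better.\n\nThe camera is excellent.\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "iPhone16_reviews.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(sample)
    reviews = load_reviews(path)

print(len(reviews), reviews)
```

Skipping blank lines keeps a stray empty line in the file from being scored as a review.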
The pie chart of the sentiment distribution of iPhone 16 reviews shows the following breakdown:
● Neutral: 62.5% of the reviews had a neutral sentiment.
● Positive: 34.4% of the reviews were positive.
● Negative: Only 3.1% of the reviews were negative.
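
A distribution like this one can be computed from a list of sentiment labels, as in the sketch below. The source does not give the underlying counts. The counts used here are hypothetical, chosen only because 20, 11, and 1 out of 32 happen to round to the reported figures. The sentiment_distribution helper is likewise an illustrative name.

```python
from collections import Counter


def sentiment_distribution(labels):
    """Return each label's share of the total as a percentage (1 decimal)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical label counts (not taken from the source).
labels = ["Neutral"] * 20 + ["Positive"] * 11 + ["Negative"] * 1
distribution = sentiment_distribution(labels)
print(distribution)
```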

Step 3: Data Storage Strategy
1. Data Structure: The CSV file stores the data in a tabular format. Each row
represents a single review, and the columns represent different attributes of the
review and its analysis. Based on the notebook, the columns include:
● "Review #": The index or number of the review.
● "Review": The actual text of the iPhone 16 review.
● "Sentiment": The overall sentiment classification assigned to the review
(e.g., "Positive", "Negative", or "Neutral").
● "Score": The compound sentiment score generated by the VADER
sentiment analyzer (a numerical value indicating the intensity and
direction of the sentiment).
2. Implementation: We used the pandas library to create a DataFrame from the
sentiment analysis results and saved it to a CSV file with the to_csv()
function. The index=False argument prevents pandas from writing the
DataFrame index as a separate column in the CSV.
3. Rationale: The reasons for choosing CSV in this context are:

● Simplicity: CSV is a very simple file format, making it easy to understand,
create, and parse.
● Portability: CSV files can be opened and viewed in almost any spreadsheet
program (Excel, Google Sheets, etc.) or text editor.
● Ease of Use with Pandas: pandas provides excellent support for reading
and writing CSV files, making it a natural choice for data analysis
workflows.
● Suitable for Tabular Data: The sentiment analysis results are inherently
tabular, making CSV a good fit.

4. Advantages in this Specific Case:

● The case processes a small number of iPhone 16 reviews (15, one per line of
the input file). CSV is adequate at this scale.
● The analysis is relatively straightforward (sentiment classification and
scoring). Complex querying or data relationships aren't required.
● The focus is on analyzing the sentiment of the reviews and storing the
results for further examination. CSV provides a simple way to achieve this.
5. Limitations and Alternatives:
● Scalability: CSV is not suitable for very large datasets (millions or billions
of reviews).

● Querying: CSV files are not efficient for complex queries. We would need to
load the entire file into memory and then perform filtering or searching.
● Concurrency: CSV files are not designed for concurrent access by multiple
users or processes.
● Data Integrity: CSV files do not enforce data types or constraints.
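
The four-column layout described in this step can be sketched as follows. The snippet uses Python's built-in csv module as a stand-in so that it runs without extra dependencies. With pandas, the equivalent call would be DataFrame.to_csv(path, index=False). The source does not state which cut-offs map the compound score to a label, so the ±0.05 thresholds (the convention suggested in VADER's documentation) are an assumption, and the reviews and scores are hypothetical.

```python
import csv
import io


def label_from_compound(score, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label."""
    if score >= threshold:
        return "Positive"
    if score <= -threshold:
        return "Negative"
    return "Neutral"


# Hypothetical reviews and compound scores, for illustration only.
scored = [
    ("The camera is excellent.", 0.57),
    ("Battery drains too fast.", -0.34),
    ("It arrived on Tuesday.", 0.0),
]

buffer = io.StringIO()  # stands in for an output file on disk
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Review #", "Review", "Sentiment", "Score"])
for number, (text, score) in enumerate(scored, start=1):
    writer.writerow([number, text, label_from_compound(score), score])

csv_text = buffer.getvalue()
print(csv_text)
```

Every value is written as plain text, which illustrates the data-integrity limitation noted above: nothing in the file itself prevents a malformed score or an unknown label.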

Step 4: Text Corpus Construction in Python

In this step, we structured and cleaned our collected reviews to create a well-prepared text
corpus for analysis. The main objective was to transform raw, unstructured text into a format
suitable for natural language processing (NLP).

Text Preprocessing Steps:

To ensure consistency and accuracy in analysis, we applied several preprocessing techniques:

● Text Normalization – Convert text to a standard format by handling special characters
and encoding issues.
● Lowercasing – Standardize all text by converting it to lowercase.
● Tokenization – Split text into individual words for better processing.
● Stopword Removal – Eliminate common words that do not add value (e.g., “the,” “is,”
“and”).
● Punctuation & Special Character Removal – Clean unnecessary symbols and digits.
● Lemmatization – Reduce words to their root form (e.g., “running” → “run”).
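
The preprocessing steps listed above can be sketched in a few lines. This is a simplified, standard-library illustration rather than the team's notebook code: the short stopword set and the lemma lookup table are hypothetical stand-ins for the NLTK stopword list and a real lemmatizer, and the preprocess function name is an assumption.

```python
import re
import unicodedata

# Tiny illustrative stopword list; NLTK's English list is much longer.
STOPWORDS = {"the", "is", "and", "a", "an", "of", "to", "it", "but", "are"}

# Toy lemma lookup standing in for a real lemmatizer (hypothetical entries).
LEMMAS = {"running": "run", "drains": "drain", "photos": "photo"}


def preprocess(text):
    """Normalize, lowercase, tokenize, filter, and lemmatize one review."""
    # Normalization and lowercasing: fold Unicode compatibility forms.
    text = unicodedata.normalize("NFKC", text).lower()
    # Tokenization: keeping only runs of letters also drops punctuation,
    # digits, and other special characters.
    tokens = re.findall(r"[a-z]+", text)
    # Stopword removal.
    tokens = [token for token in tokens if token not in STOPWORDS]
    # Lemmatization via the lookup table; unknown words pass through.
    return [LEMMAS.get(token, token) for token in tokens]


tokens = preprocess("The battery drains fast, but the photos are GREAT!!! 10/10")
print(tokens)
```

Applying preprocess to every review yields the token lists that make up the corpus.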

Results & Key Changes

Step 5: Discussion of Challenges

The team encountered several challenges during the text analysis process, which
influenced the project's efficiency, accuracy, and scalability.
1. Limitations in Manual Data Collection
● Challenge: Labor-intensive and potentially biased data collection due to
the absence of APIs or web scraping tools.
● Proposed Solution: Employ web scraping tools (e.g., BeautifulSoup) or
leverage APIs (e.g., Reddit API, Google Search API) to automate data
collection, expand data volume, and ensure a more representative dataset.
2. Presence of Data Noise and Inconsistencies
● Challenge: Non-standard expressions, spelling errors, emojis, and
inconsistent formatting compromised analysis quality.
● Proposed Solution: Implement advanced text-cleaning functions, such as
emoji removal and typo correction with libraries like TextBlob, to enhance
data quality further.
3. Ambiguity and Complexity in Natural Language
● Challenge: Ambiguity in natural language, sarcasm, and mixed sentiments
complicated the interpretation of results from the VADER sentiment
analyzer.
● Proposed Solution: Combine VADER with other NLP models like TextBlob
or utilize transformer-based models like BERT (especially for larger
datasets) to enhance the accuracy of sentiment detection.

4. Limited Size of the Dataset
● Challenge: The small dataset (15 reviews) limited the generalizability of
sentiment findings and increased susceptibility to outliers.
● Proposed Solution: Automate data collection to increase the dataset to
hundreds or thousands of reviews, enabling more thorough statistical
analysis and trend visualization.
5. Constraints in Storage and Data Management
● Challenge: CSV files are poorly suited to larger or real-time data and
offer limited support for querying and simultaneous access.
● Proposed Solution: Implement structured databases like SQLite or NoSQL
options such as MongoDB to facilitate efficient storage, indexing, and
querying of unstructured text data for larger-scale analyses.
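
As an illustration of the SQLite option, the sketch below stores scored reviews in a typed table and filters them with a query instead of loading a whole file into memory. It uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the demo
conn.execute(
    """
    CREATE TABLE reviews (
        review_no INTEGER PRIMARY KEY,
        review    TEXT NOT NULL,
        sentiment TEXT NOT NULL
                  CHECK (sentiment IN ('Positive', 'Negative', 'Neutral')),
        score     REAL NOT NULL CHECK (score BETWEEN -1 AND 1)
    )
    """
)

# Hypothetical rows, for illustration only.
rows = [
    (1, "The camera is excellent.", "Positive", 0.57),
    (2, "Battery drains too fast.", "Negative", -0.34),
    (3, "It arrived on Tuesday.", "Neutral", 0.0),
]
conn.executemany("INSERT INTO reviews VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Filter inside the database rather than scanning a file in Python.
negative = conn.execute(
    "SELECT review_no, review FROM reviews WHERE sentiment = ? ORDER BY score",
    ("Negative",),
).fetchall()
print(negative)
conn.close()
```

The CHECK constraints reject out-of-range scores and unknown labels at insert time, which a CSV file cannot do.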

Results and Key Findings:

1. Sentiment Analysis Findings:
● The sentiment distribution of iPhone 16 reviews revealed that:
  ● 62.5% of the reviews were neutral.
  ● 34.4% were positive.
  ● Only 3.1% were negative.
● This indicates a predominantly neutral to positive perception of the iPhone
16, with minimal outright dissatisfaction.
2. Insights from Customer Feedback:
● Positive aspects highlighted include seamless functionality, excellent
camera performance, and high-quality construction.
● Negative aspects focused on issues such as battery life, notification
system usability, and occasional hardware failures like charging port
malfunctions.
3. Challenges in Data Processing:
● Manual data collection was labor-intensive and limited in scale.
● Ambiguities in natural language, such as sarcasm and mixed sentiments,
complicated sentiment classification.
● The dataset was small (15 reviews), limiting generalizability.

4. Proposed Solutions:
● Automation of data collection using web scraping or APIs to increase
dataset size and diversity.
● Use of advanced NLP models (e.g., BERT) for more nuanced sentiment
analysis.
● Transition to structured databases (e.g., SQLite, MongoDB) for efficient
storage and querying of larger datasets.

Conclusion:
The text analysis project successfully demonstrated the value of customer feedback in
identifying areas for improvement and guiding product development for the iPhone 16.
While the sentiment analysis provided actionable insights into customer perceptions,
scalability and accuracy challenges were evident due to the manual data collection
process and limited dataset size. Future iterations should leverage automation tools
and advanced NLP techniques to enhance efficiency and analytical depth, ensuring
more comprehensive and reliable insights for business decision-making.
