Edited

1) The document discusses how data science can be used to solve business problems by reframing them as data science questions and leveraging tools like the data analytics lifecycle. 2) The data analytics lifecycle is a six step process involving discovery, data preparation, model planning, model building, communicating results, and operationalizing models. 3) By applying this process and techniques like machine learning, data science can help businesses make better decisions by analyzing past data and predicting future trends from large datasets.

Uploaded by

Vivian Hoo
Copyright
© All Rights Reserved

Data Science in Business was a webinar presented by Dr. Alireza Manashty,
Director of the Data Science Laboratory at the University of Regina, Canada. It
showed how to reframe a business problem as a data science problem, and how to
use big data and data science to solve an actual business problem by leveraging
the data science lifecycle to automate some decisions with machine learning
models.

Data science is an emerging interdisciplinary field in which a data scientist
analyzes data to predict what is likely to happen in a line of business. An ideal
data scientist has both the engineering skills to acquire and manage large data
sets and the statistician's skills to extract value from them and present it to a
large audience (Woods, 2011). Data scientists can also use business intelligence
reports to analyze what has happened in the past. Business value can therefore be
increased by studying history (business intelligence) and predicting the future
(data science), provided the analytical problems are solved in the right way.

Big data is the large volume of fast-generated data that inundates a business on
a day-to-day basis and is hard to manage with conventional IT infrastructure.
There is no fixed threshold that defines how much data counts as big; the bar
shifts over time. According to Appendix 4, 80% of big data is unstructured, but a
data scientist can use either structured or unstructured data to accomplish a
data science task and solve a business problem. In the 21st century, many
multinational corporations (MNCs) and other businesses use big data analytics to
keep track of primary transactions and to run more efficiently by making better
decisions. Some examples of daily data volumes: Facebook (0.5 petabytes), Walmart
(40 petabytes), and Google (24 petabytes), where 1 petabyte (PB) = 1,000
terabytes (TB) = 10^6 gigabytes (GB). Businesses may also face challenges when
their technology has to handle data at this scale. For example, a company that
plans to generate and store 40 PB needs 4,000 drives at $200 each, which costs
$0.8 million plus tax and the budget for accessories. It must also consider the
hardware supply lead time, the physical space to house the drives, and the
transfer time: at 100 MB/sec, writing 40 PB to disk would take about 12.7 years.
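
The storage example above can be checked with a few lines of arithmetic. The
sketch below is only illustrative. It assumes decimal units (1 PB = 1,000 TB =
10^9 MB) and a 365-day year, neither of which is stated in the webinar notes,
and the function names are hypothetical.

```python
# Back-of-the-envelope check of the 40 PB storage example.
# Assumptions (not from the source): decimal units and a 365-day year.

TB_PER_PB = 1_000
MB_PER_PB = 1_000_000_000  # 1 PB = 10^9 MB in decimal units
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def drive_cost_usd(drive_count, price_per_drive_usd):
    """Total hardware cost before tax and accessories."""
    return drive_count * price_per_drive_usd


def implied_drive_capacity_tb(total_pb, drive_count):
    """Capacity each drive must have to hold the total volume."""
    return total_pb * TB_PER_PB / drive_count


def transfer_time_years(total_pb, rate_mb_per_s):
    """Years needed to write the total volume at a constant rate."""
    seconds = total_pb * MB_PER_PB / rate_mb_per_s
    return seconds / SECONDS_PER_YEAR


print("drive cost (USD):", drive_cost_usd(4_000, 200))
print("implied capacity per drive (TB):", implied_drive_capacity_tb(40, 4_000))
print("transfer time (years):", round(transfer_time_years(40, 100), 1))
```

Under these assumptions the drives cost $800,000, each drive must hold 10 TB,
and the transfer takes between 12 and 13 years, in line with the figures quoted
above.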

How can we reframe a business problem as a data science question and approach
the analytics problem in more depth? First, we must understand the business
issue, look into the available big data, analyze the useful validated data,
impute missing data, find the root cause, and prescribe a remedy. Next, we select
the framework we want to work with, either the “Data Analytic Lifecycle” or the
“Microsoft Team Data Science Process (TDSP)” (refer to Appendix 5). The rest of
this discussion focuses on the lifecycle and how it relates to solving a business
problem.

The Data Analytic Lifecycle (refer to Appendix 6) contains six steps, as shown
below:

1) Discovery

A necessary step to define and understand the business problem by conducting
interviews with the stakeholders.

2) Data Preparation

Establish the analytic sandbox and ETLT (extract, transform, load, and transform)
the data, followed by data exploration and conditioning (removing outliers and
missing data), and then summarize and visualize the data. First, get to know the
data by understanding what each data code means, then proceed to visualization.
Visualization is vital before analysis because data is easier to read in visual
form, and it applies to any domain, making the data easier to study and to
communicate to other people (refer to Appendix 7A & 7B). In the analysis stage,
we can explore the relationship between two variables and summarize the findings.
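
The conditioning and exploration described in this step can be sketched as
follows. This is a minimal illustration using only the Python standard library,
not the tooling shown in the webinar. The records, the field names `ad_spend`
and `sales`, and the 1.5 x IQR outlier rule are all assumptions made for the
example.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
RAW = [
    {"ad_spend": 10.0, "sales": 25.0},
    {"ad_spend": 12.0, "sales": 29.0},
    {"ad_spend": None, "sales": 31.0},   # missing value
    {"ad_spend": 14.0, "sales": 33.0},
    {"ad_spend": 16.0, "sales": 38.0},
    {"ad_spend": 18.0, "sales": 41.0},
    {"ad_spend": 20.0, "sales": 400.0},  # likely outlier
    {"ad_spend": 22.0, "sales": 49.0},
]


def drop_missing(rows):
    """Keep only rows in which every field has a value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def drop_outliers(rows, field, k=1.5):
    """Remove rows whose field lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [r[field] for r in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if low <= r[field] <= high]


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Conditioning: remove incomplete rows first, then outliers in "sales".
clean = drop_outliers(drop_missing(RAW), "sales")
spend = [r["ad_spend"] for r in clean]
sales = [r["sales"] for r in clean]

# Summaries that would normally accompany a chart of the two variables.
print(f"rows kept: {len(clean)} of {len(RAW)}")
print(f"mean sales: {statistics.fmean(sales):.1f}")
print(f"correlation(ad_spend, sales): {pearson(spend, sales):.3f}")
```

Dropping rows is the simplest conditioning choice. Imputing the missing value
instead, as mentioned earlier, would keep more rows at the cost of introducing
estimated data.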

3) Model Planning

After analyzing the data, select a suitable model for each specific business
problem. There are various categories of techniques to choose from in model
selection (refer to Appendix 8).

4) Model Building

Build training and test datasets: about 80% of the labeled data is used for
training, while the remaining 20% is held out for testing only and is never shown
to the model during training (its labels are withheld until evaluation). After
setting up the model, we train it, evaluate the fitted model on the test data,
and adjust it accordingly to get accurate results.
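
The 80/20 split and the train-then-evaluate loop can be sketched as follows.
The webinar notes do not say which model was used, so this example substitutes
a simple least-squares line as a stand-in. The synthetic data, the random seeds,
and all function names are assumptions made for illustration.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(pairs):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(model, pairs):
    """Average absolute gap between predictions and held-out labels."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in pairs) / len(pairs)


# Synthetic labeled data (made up): y is roughly 2x + 5 plus small noise.
rng = random.Random(0)
data = [(x, 2 * x + 5 + rng.uniform(-1, 1)) for x in range(100)]

train, test = train_test_split(data)
model = fit_line(train)                    # the model sees only training rows
error = mean_absolute_error(model, test)   # test labels are used only here

print(f"train rows: {len(train)}, test rows: {len(test)}")
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, test MAE={error:.2f}")
```

If the test error is too high, the adjustment described above would mean
changing the model or its inputs and repeating the fit, while still keeping the
test rows out of training.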

5) Communicate Results

The data scientist prepares different types of presentation for further use. The
target audience includes management, analysts, and other stakeholders. The
purpose is to show the findings, predictions, and recommendations for solving the
business problem.

6) Operationalize

Operationalize the model by providing the code and technical documentation to the
responsible department for deployment and further action once the communicated
results have been approved. At this point the lifecycle is considered complete,
but the work is not finished: the model must be monitored and refined further,
new attributes can be considered, and the delivery mechanism can be simplified
with self-service reporting.

Data science combines the scientific method, math and statistics, specialized
programming, advanced analytics, AI, and even storytelling to uncover and explain
the business insights buried in data (What Is Data Science?, 2020). Big data can
be sourced from communications, media, entertainment, financial services,
healthcare, social media, the internet of things (IoT), and more, so nearly every
business is exposed to it today. Big data has four dimensions that can be
analyzed for insights that lead to better decisions and business strategy, which
adds value for the company and creates valuable information for the customer. In
this way, data science matters to businesses: it helps analyze and predict
whether a company will grow or decline in the near future, and helps work out how
to reduce customer churn and acquire new customers.
