York University Adms2510 Data Analysis-C10
Why?
• Companies have started to transform their business processes to incorporate more and more
data-driven decision making (DDD).
• New job requirements are increasingly aligned with data, data processing, and data analytics.
• Disruptive technologies have forced all companies to rethink and redesign their business models.
Data Flow
1. Data Creation – ERP systems, CRM, emails, call centers, transactions, etc. Many manufacturing
companies also run devices that stream data continuously (IoT).
2. Data Capture – Part of the data created is captured and stored in different systems or databases.
3. Data Processing – Part of the captured data is processed and transformed into structured data
(tabular format).
4. Reporting and Visualization – Processed data is used for reports, graphs, trends, etc.
5. Analytics – Processed data, in conjunction with external information, is used for forecasting,
diagnostics, optimization, and other types of analytics.
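A minimal Python sketch of steps 2 to 4 above may help make the flow concrete. The captured records, the column names, and the helper functions `process` and `report_by_region` are all made up for illustration; a real pipeline would read from the source systems instead of an inline string.

```python
import csv
import io
from collections import defaultdict

# Step 2 (capture): raw records as they might be exported from a sales system.
# The data is hypothetical; note the stray spaces and the missing amount.
raw_capture = """order_id,region,amount
1001,East, 250.00
1002,West,
1003,East,100.50
1004,West,75.25
"""

def process(raw_text):
    """Step 3: turn captured text into clean, structured (tabular) rows."""
    rows = []
    for rec in csv.DictReader(io.StringIO(raw_text)):
        amount = rec["amount"].strip()
        if not amount:  # captured but incomplete, so it is not processed further
            continue
        rows.append({
            "order_id": int(rec["order_id"]),
            "region": rec["region"].strip(),
            "amount": float(amount),
        })
    return rows

def report_by_region(rows):
    """Step 4: aggregate processed rows into a simple report."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)

rows = process(raw_capture)
report = report_by_region(rows)
print(report)
```

Only three of the four captured records reach the report, which mirrors the point that only part of the captured data is ever processed.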
Data Structure
Data Source Type
• Structured data - data stored in relational (SQL) databases such as DB2, Oracle, etc. Data is
normalized and tabular, making it easy to input, store, and query. All records share the same
relational schema.
• Semi-structured data - data that is not located in a relational database but has some
organizational properties. Not all records have the same structure. Examples: invoices,
forms, web sites (XML format), NoSQL databases.
• Unstructured data - everything else. Examples: emails, text files, pictures, MP3s, social media.
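The difference between semi-structured and structured data can be sketched in a few lines of Python. The invoice records, the `SCHEMA` columns, and the `to_rows` helper below are hypothetical; the point is only that semi-structured records carry their own, varying fields, while a structured table forces every row into one fixed schema.

```python
import json

# Semi-structured: each record describes itself, but the fields differ.
invoices_json = """
[
  {"invoice": "A-1", "customer": "Acme", "total": 120.0},
  {"invoice": "A-2", "customer": "Bolt", "total": 80.0, "discount": 5.0},
  {"invoice": "A-3", "total": 40.0, "notes": ["rush order"]}
]
"""

# Structured: every row follows one fixed schema (an assumed column choice).
SCHEMA = ("invoice", "customer", "total", "discount")

def to_rows(records, schema=SCHEMA):
    """Project each record onto the schema; missing fields become None,
    and fields outside the schema (such as "notes") are dropped."""
    return [tuple(rec.get(col) for col in schema) for rec in records]

records = json.loads(invoices_json)
table = to_rows(records)
for row in table:
    print(row)
```

After the projection every row has the same four columns, so it could be loaded into a relational table, at the cost of losing the free-form `notes` field.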
Data Universe
Data Type
Big data - refers to very large data sets that are used to find hidden patterns such as customer
behaviors, correlations between KPIs, and market trends.
• Velocity – refers to the speed of data processing; with big data this often means real-time analysis or
an immediate response.
Small data - is a data set whose format and volume make it accessible, understandable, and easy to
process.
Difference: Big data is about machine processing; small data is about human understanding (small data sets
that can be understood by humans, like samples in market research).
Dark data – represents data collected but not used in any way.
Data has become extremely valuable; unanalyzed data therefore carries a very high opportunity cost in lost
insights and undiscovered risks.
Enterprise Resource Planning System
• Definition:
A fully integrated, full-service suite of software with a common (relational) database that can be used
to plan and control resources across an entire organization, and often its suppliers and/or customers.
An ERP system has a common database, or a set of relational databases, that integrates the systems for all parts
of the organization:
- Financial
- Manufacturing
- Sales
- Inventory
• By linking all systems with a relational database, an ERP system allows an organization to
• An ERP system can include not just the parts of an organization but also its suppliers and
customers.
• Consequently, an ERP system can integrate financial reporting, cost management, and
performance measurement with all other systems in an organization.
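To illustrate why a common relational database matters, here is a small sketch using Python's built-in `sqlite3` module. The `inventory` and `sales` tables, their columns, and the figures are invented for the example and do not reflect any particular ERP product; the idea is only that when modules share one database, a single query can combine sales, inventory, and cost information.

```python
import sqlite3

# One shared in-memory database stands in for the ERP's common database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (item TEXT PRIMARY KEY, on_hand INTEGER, unit_cost REAL);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY,
                    item TEXT REFERENCES inventory(item),
                    qty INTEGER, unit_price REAL);
INSERT INTO inventory VALUES ('widget', 100, 2.0), ('gadget', 50, 5.0);
INSERT INTO sales VALUES (1, 'widget', 10, 3.5),
                         (2, 'gadget', 4, 9.0),
                         (3, 'widget', 5, 3.5);
""")

# Sales, inventory, and cost data are joined in one query because
# both modules write to the same relational database.
query = """
SELECT i.item,
       SUM(s.qty)                                AS units_sold,
       i.on_hand - SUM(s.qty)                    AS units_left,
       SUM(s.qty * (s.unit_price - i.unit_cost)) AS gross_margin
FROM sales AS s
JOIN inventory AS i ON i.item = s.item
GROUP BY i.item
ORDER BY i.item
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If sales and inventory lived in separate, unconnected systems, the same report would need an export and manual matching step for each system.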
Data Analysis
1. Descriptive Analytics – summarizes the raw datasets to create information that is easily
consumable by humans. It describes the past; most corporate reporting and dashboarding falls
in this category. Keyword: describe.
2. Diagnostic Analytics – answers the question "why did that happen". It is used to understand the causes
of different outcomes, e.g. why a marketing campaign was successful (or not). Keyword: diagnose why it happened.
3. Predictive Analytics – uses big data to identify past patterns in order to predict the future. It answers the
question "what might happen". Properly tuned predictive analytics can be used to support sales,
marketing, or other types of complex forecasts. Keyword: predict.
4. Prescriptive Analytics – recommends one or more courses of action and shows the likely outcome of
each decision. It also prescribes the best course of action for a given situation. Examples: route
optimization, self-driving cars (which constantly decide the best response to any traffic situation).
Keyword: recommend.
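The four types of analytics above can be sketched on one tiny data set. Everything here is an assumption made for illustration: the monthly figures are made up, a Pearson correlation stands in for diagnostic analysis, a least-squares line stands in for a predictive model, and `MARGIN` and the candidate budgets are hypothetical. Real analytics work would use far more data and more careful models.

```python
from math import sqrt
from statistics import mean

# Made-up monthly figures (thousands of dollars), for illustration only.
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [125.0, 135.0, 165.0, 175.0, 200.0]

# 1. Descriptive: summarize what happened.
summary = {"total": sum(sales), "average": mean(sales), "best_month": max(sales)}

# 2. Diagnostic: look for a possible driver of the outcome.
def pearson(xs, ys):
    """Pearson correlation; a strong value suggests, but does not prove, a cause."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson(ad_spend, sales)

# 3. Predictive: fit sales = intercept + slope * ad_spend by least squares.
def fit_line(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

slope, intercept = fit_line(ad_spend, sales)

def predict_sales(spend):
    return intercept + slope * spend

# 4. Prescriptive: score each candidate action and recommend the best one.
MARGIN = 0.6  # assumed share of sales kept as profit before ad costs
options = {b: MARGIN * predict_sales(b) - b for b in (20.0, 40.0, 60.0)}
best_budget = max(options, key=options.get)

print(summary)
print(round(r, 3), round(slope, 2), round(intercept, 1))
print(options, best_budget)
```

Each step builds on the previous one: the prescriptive recommendation is only as good as the predictive model behind it, which in turn rests on patterns found in past data.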