
大數據分析 (Big Data Analysis) SOL04
Assigned: 2021/12/28    Due: 2022/01/04

Part I: Glossary (30 pts., 5 pts. each)

1. 5W1H
Who, What, When, Where, Why, How

2. Mind Map
A method of organizing a series of ideas using a radial branching approach

3. Cause and Effect Diagram / Fishbone Diagram

 In the analysis, the main causes of the problem are listed first, and then the secondary causes are
listed and categorized one by one, so that the analysis of the problem can be controlled effectively.
 The diagram can also be drawn in reverse, with the goal at the head, candidate solutions as the
main branches, and sub-solutions as smaller branches.
 Also known as the Cause and Effect Diagram, Fishbone Diagram, or Ishikawa Diagram.

4. Statistical Quality Control (SQC)

Provides a basis for quality improvement through statistical analysis and charts:
 Statistics: characterization and quantification of qualitative or quantitative items to present the
overall facts
 Quality: analysis of the causal relationship between production factors and product quality
 Control: optimization of projects (design + control + improvement + technology)
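The "control" part of SQC is often realized with a control chart: limits are estimated from in-control measurements, and later measurements outside those limits are flagged. A minimal sketch in Python (the baseline values, new points, and the 3-sigma rule are illustrative assumptions, not from the course material):

```python
from statistics import mean, stdev

def control_limits(samples, k=3):
    """Center line and lower/upper control limits (mean +/- k sigma)
    estimated from a baseline of in-control measurements."""
    m, s = mean(samples), stdev(samples)
    return m - k * s, m, m + k * s

# Baseline measurements taken while the process was known to be stable.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 10.0, 9.9]
lcl, cl, ucl = control_limits(baseline)

# New measurements are flagged when they fall outside the limits.
new_points = [10.0, 10.3, 13.5]
flags = [x for x in new_points if x < lcl or x > ucl]
print(flags)  # [13.5]
```

In practice the flagged points would trigger root-cause analysis (e.g., with a fishbone diagram) rather than be silently discarded.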

5. Hadoop cluster environment

 Hadoop is a cloud platform architecture for storing and managing large amounts of data, and is an
open-source project under the Apache Software Foundation.
 It is a cluster system that can be scaled from a single server to thousands of machines, which are
integrated to act like one supercomputer for applications.
 It provides a distributed file system and uses MapReduce programs for analysis and processing,
applying the concepts of Map and Reduce for distributed computing.
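The Map and Reduce concepts mentioned above can be sketched in plain Python with the classic word-count example; a real Hadoop job distributes the same three phases (map, shuffle, reduce) across the cluster (the function names and sample lines here are illustrative):

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word, 1) for word in line.lower().split()]

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the grouped values for one key (here, sum counts)."""
    return key, sum(values)

lines = ["big data big analysis", "data driven analysis"]
mapped = chain.from_iterable(map_phase(line) for line in lines)
result = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(result)  # {'big': 2, 'data': 2, 'analysis': 2, 'driven': 1}
```

Because each map call touches only one line and each reduce call only one key, the phases can run on different machines independently, which is what lets the cluster scale.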

6. Complex Event Processing (CEP)

 Complex event processing is an analysis technique based on the flow of events in a dynamic
environment.
 Events represent meaningful state changes.
 By analyzing the relationships between events with techniques such as filtering, correlation, and
aggregation, rules are established based on relationships such as time, space, dependency,
constraints, and cause and effect; sequences of events are continuously filtered out of the event
stream and analyzed to derive more complex composite events.
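A CEP rule of the kind described above, filtering a stream and correlating consecutive events into a composite event, can be sketched in a few lines of Python (the event format, threshold, and "overheat" rule are invented for illustration):

```python
from collections import deque

def detect_overheat(events, threshold=80, window=3):
    """Toy CEP rule: emit a composite 'overheat' event whenever `window`
    consecutive readings all exceed `threshold`."""
    recent = deque(maxlen=window)   # sliding window over the stream
    composites = []
    for ts, value in events:
        recent.append((ts, value))
        if len(recent) == window and all(v > threshold for _, v in recent):
            # Composite event spans from the first to the last reading in the window.
            composites.append(("overheat", recent[0][0], ts))
    return composites

stream = [(1, 75), (2, 85), (3, 90), (4, 88), (5, 70)]
print(detect_overheat(stream))  # [('overheat', 2, 4)]
```

Production CEP engines express such rules declaratively and evaluate many of them concurrently over unbounded streams, but the filter-correlate-aggregate pattern is the same.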
Part II: Short-Answer Questions (70 pts., 5 pts. each)

1. What is data processing?

 Most stored data, even after pre-processing and digitization, cannot be applied directly; it still
needs to be analyzed, retrieved, and converted in an appropriate way so that people or applications
can interpret and process the information in it more easily, and then analyze and apply it.
 The information obtained from data processing is then analyzed against the question objectives to
determine whether the hypothesis set for the question needs to be revised or whether the answer to
the question can be obtained.
 Data processing before data analysis also covers how to import data into the analysis tool for
processing.
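As a concrete illustration of converting stored data into a form an analysis tool can consume, here is a minimal Python sketch using only the standard library (the CSV fields and the drop-missing-values rule are invented for illustration):

```python
import csv
import io

raw = "date,temp\n2021-12-28,21.5\n2021-12-29,\n2021-12-30,19.0\n"

def load_clean(text):
    """Parse CSV text, convert field types, and drop records with missing values."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        if not rec["temp"]:
            continue  # skip records the analysis tool cannot use
        rows.append({"date": rec["date"], "temp": float(rec["temp"])})
    return rows

print(load_clean(raw))
# [{'date': '2021-12-28', 'temp': 21.5}, {'date': '2021-12-30', 'temp': 19.0}]
```

Type conversion and handling of missing values are exactly the "import into the analysis tool" step the answer refers to: analysis code downstream can then assume clean, typed records.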

2. Please list different types of "problem".

 Restore-the-status-quo problem: the present goal is to return to the status quo
 Ideal-seeking problem: the future goal is an ideal state
 Preserve-the-status-quo problem: the future goal is to maintain the status quo

3. Six steps of problem solving.

1. Identify, confirm, and define the problem (describe the problem)
2. Collect data, analyze data, and identify causes
3. Propose and select a solution (hypothesis)
4. Evaluate and implement the best solution
5. Verify the facts and evaluate the results
6. Identify process variation and negative impacts

4. Different levels of analysis.

 Manual report analysis (grasp the past and present)
 Integrating multiple data sources to assist analysis and diagnosis (discover behavior patterns)
 Model-based prediction and tracking of predictions after the process (prediction)
 Automated analysis and prediction systems with artificial intelligence for real-time adjustment and
correction (optimization)

5. Common investigation directions of "data value".

 Corporate value: how data and data analysis can help management and innovation to generate
revenue
 Value of the information economy: how to offer data, information, and knowledge as products and
services in the market
 Value of intelligent decision making: how to enhance the value of decisions or actions with the
help of data and data analysis
6. The main content of data value in corporation (corporation value).
 Business Monitoring: Monitor the current operation status and analyze abnormalities
 Business Insights: Use statistics, data mining, and predictive analysis to improve operations
 Business Optimization: Use prediction and optimization analysis to improve operation mode
 Data Monetization: Using data to assist in sales or market analysis to gain new sources of profit
 Business Metamorphosis: Create new business models or services to help companies transform

7. The main content of data value in infonomics (infonomics value).

 Quality: whether the information is complete and accurate
 Relevance: whether the information is easily interpreted and meaningful
 Timeliness: whether the information is meaningful for decision making in a given time frame
 Accessibility: whether the information is easy to obtain and manage
 Cost-effectiveness: whether the information meets the needs at a reasonable cost
 Marketability: whether the information can be converted into a service or product that provides
revenue benefits

8. Market characteristics of infonomics value.


 Increasing marginal revenue
 Network effects / Network externality
 Two-sided market

9. Steps/process of the data value cycle.


 Digitization and Data Collection
 Big Data Pool
 Data Processing and Analyzing
 Knowledge Base
 Data Driven Decision Making

10. Business analysis technologies across different eras.

[1970s] Decision Support: decision support using data analytics
[1980s] Executive Support: decision support using data analytics, extended to executive and
administrative decision support
[1990s] Online Analytical Processing (OLAP): methods and tools that present integrated decision
information based on multidimensional analysis and multidimensional data tables, flexible pivot
analysis, and related operations
[1980s–2000s] Business Intelligence: tools that support data-driven decision-making with reports and
related visualized data, with recent emphasis on the integrated application of data dashboards and
model-based analysis and prediction
[Late 2000s] Data Analytics: focus on statistics and mathematics for decision analysis, aiming to draw
conclusions through detailed and complex calculations
[2010s] Big Data Analytics: focus on the rapid processing of large amounts of unstructured data, aiming
to find common patterns in large volumes of data to draw conclusions
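The pivot analysis that OLAP tools provide can be illustrated with a minimal Python sketch; it aggregates a measure over two chosen dimensions, which is the core of a pivot table (the sales records and field names are invented for illustration):

```python
from collections import defaultdict

def pivot(rows, row_key, col_key, value_key):
    """Minimal pivot table: sum `value_key` grouped by (row_key, col_key)."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_key]][r[col_key]] += r[value_key]
    return {k: dict(v) for k, v in table.items()}

sales = [
    {"region": "north", "quarter": "Q1", "amount": 100},
    {"region": "north", "quarter": "Q2", "amount": 150},
    {"region": "south", "quarter": "Q1", "amount": 80},
    {"region": "north", "quarter": "Q1", "amount": 50},
]
print(pivot(sales, "region", "quarter", "amount"))
# {'north': {'Q1': 150, 'Q2': 150}, 'south': {'Q1': 80}}
```

The "flexible" part of pivot analysis is simply that the analyst can swap `row_key` and `col_key` for any dimensions of a multidimensional data table without re-collecting the data.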
11. Please list the technical products that the market requests for big data analysis.
Market demand for big data mainly includes data services, infrastructure, application software tools,
consulting and integration services, and other technology products.

12. Please list the major software product types of big data analysis.
 Data organization and management software: for data collection, organization, and storage
 Data analysis and visualization software: data mining models, statistical analysis, and
visualization techniques used to discover patterns, predict the future, and present graphs
 Other big data applications: process, analyze, and present data and analysis results for specific
areas or industry-specific applications

13. Please describe the technical architecture of industrial application.

 Application and presentation layer: applications and data mining tools used by data scientists,
business analysts, and domain analysts for model building, data analysis, and visual representation
(reports, dashboards, visualization), serving operations staff and commercial managers
 Data flow between layers: training data, analysis results, and data models are imported and
deployed between the application layer and the data platform
 Data collection and processing layer: data collectors for databases, logs, and events, plus a data
processing platform for integration, storage, management, and analysis, operated by data
engineers, data managers, system administrators, and programmers

14. Please describe the planning and implementation process of industrial applications.
 Business strategy (What is the value of big data? Where is it used or to be used?)
   Assist operation management and market innovation; where are the valuable data sources?
  How are they collected? Which business processes can be integrated?
 Application context
   Identify data collection and analysis strategies, analysis models, and possible technical
  frameworks from the perspective of business people, customers, and context
 Technology planning (can be carried out together with the context discussion)
   Technology groups, technology architecture, technology risks, IT team development, data
  sources, data integration, data processing, analysis applications
 Implementation and introduction
   Technical capabilities for big data processing and analysis can start from small projects
   Problem understanding, data understanding, data preparation, model building, model
  evaluation, application deployment
 Data governance
   Governance framework, data ownership, data privacy, data quality, data security
15. Please describe the roles and their relationships in the big data ecosystem.
 Consumers or enterprises: end users who use big data and do not necessarily need direct access
to it
 Value-added providers: organize and transform raw data for application service providers or end
users
 Application service providers: provide data analysis, models, or applications as a commodity to
customers
 Software providers: provide data processing software, analysis software, data analysis tools, etc.
 System integration and consulting service providers: help users and vendors make the best use of
analytical tools and equipment
 Infrastructure providers: provide storage equipment, servers, wiring, and other software and
hardware products

https://docs.google.com/document/d/1c581FId0TCkvOAOYWE2vNXRkTi_5OlxS/edit?usp=sharing&ouid=100235357095740738175&rtpof=true&sd=true
