The document discusses key concepts in data science including:
1. Data science transforms data into useful information through scientific methods, statistics, and algorithms.
2. The data processing cycle involves collecting data from various sources, preparing it, inputting it into a computer, processing it, outputting it to users, and storing it for future use.
3. The data value chain demonstrates obtaining value from data through acquisition, analysis, curation, storage, and usage. It involves gathering, cleaning, exploring, modeling, managing, recording, and accessing data for decision making.
Assignment 2: Data Science
1. What is Data Science?
● Data science is a field of study that uses scientific methods, statistics, and algorithms to extract information and insights from various types of data, including structured, semi-structured, and unstructured data. Put simply, data science transforms data into useful information. It matters because data by itself is of limited use until it is turned into information.
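To make the three kinds of data concrete, here is a minimal Python sketch that turns one small sample of each into a piece of information. The sample values and variable names are invented for illustration and are not part of the assignment.

```python
import csv
import io
import json

# Structured data: fixed columns, here CSV rows (hypothetical sample).
structured = "name,score\nAna,90\nBen,70\n"
rows = list(csv.DictReader(io.StringIO(structured)))
average_score = sum(int(r["score"]) for r in rows) / len(rows)

# Semi-structured data: self-describing but flexible, here JSON.
semi_structured = '{"user": "Ana", "tags": ["math", "stats"]}'
record = json.loads(semi_structured)
tag_count = len(record["tags"])

# Unstructured data: free text with no predefined schema.
unstructured = "Data by itself is not fully useful unless you turn it into information."
word_count = len(unstructured.split())

print(average_score, tag_count, word_count)
```

In each case the raw data says little on its own; the average, the tag count, and the word count are the "information" derived from it.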
2. Explain the Data Processing Cycle.
● The data processing cycle is the series of steps needed to transform data into information. It begins with data collection, the stage in which data is gathered from multiple sources using various collection methods and techniques. The second step is data preparation, in which the data is cleaned to remove inaccurate entries and organized for processing. The third step is data input, in which the cleaned data is entered into a computer and converted into machine-readable form. The fourth step is processing, in which the data is processed using various data processing methods and techniques. The fifth step is data output, in which the processed data is delivered to the user in a readable, summarized format. The final step is data storage, in which the data is stored for future use so that the user can easily retrieve it in any desired format.
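The six stages can be sketched as a small Python pipeline. This is only an illustration under assumed inputs: the temperature-like readings, the function names, and the choice of JSON for storage are all hypothetical.

```python
import json
import statistics

def collect():
    # Collection: gather raw readings from multiple (hypothetical) sources.
    source_a = ["21.5", "22.0", "n/a"]
    source_b = ["23.5", "", "21.0"]
    return source_a + source_b

def prepare(raw):
    # Preparation: drop entries that are not valid numbers.
    cleaned = []
    for item in raw:
        try:
            float(item)
        except ValueError:
            continue
        cleaned.append(item)
    return cleaned

def to_input(cleaned):
    # Input: convert the cleaned text into machine-readable numbers.
    return [float(x) for x in cleaned]

def process(values):
    # Processing: compute summary statistics.
    return {"count": len(values), "mean": statistics.mean(values), "max": max(values)}

def output(summary):
    # Output: present the result in a readable, summarized format.
    return f"{summary['count']} readings, mean {summary['mean']:.1f}, max {summary['max']:.1f}"

def store(summary):
    # Storage: serialize the result so it can be retrieved later.
    return json.dumps(summary)

raw = collect()
summary = process(to_input(prepare(raw)))
report = output(summary)
stored = store(summary)
print(report)
```

Each function corresponds to one stage, so the order of the calls mirrors the order of the cycle.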
3. Explain Data Value Chain.
● The data value chain is a conceptual framework used in data science that describes a series of activities for obtaining value and useful insights from data. It comprises the following activities:
○ Data Acquisition - The process of gathering and cleaning data before it is stored in a data warehouse or any other storage solution on which data analysis can be performed.
○ Data Analysis - The process of exploring, transforming, and modeling data to obtain useful information that can be used for decision-making.
○ Data Curation - The processes required to actively manage data so that it meets the requirements for effective usage. Data curation includes selection, categorization, transformation, validation, and preservation of data.
○ Data Storage - The process of recording data in a storage solution to ensure its security and accessibility for future use.
○ Data Usage - The process of accessing data, analyzing it, and applying the tools needed for business activities and decision-making.
4. What is Big Data?
● Big data is a vast and multifaceted collection of data that traditional data management tools cannot store, process, manage, and retrieve efficiently. Managing this fast-growing data therefore requires multiple computers and robust software for processing. Big data has three defining properties, known as the three Vs: volume, variety, and velocity. Volume refers to the large amount of data being produced at every moment. Variety refers to the many formats data can take, such as structured, semi-structured, and unstructured. Velocity refers to the speed at which data is generated.
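The idea of spreading work across multiple computers can be sketched in Python as a split, process, and combine pattern. This is an assumption-laden illustration: worker threads on one machine stand in for separate computers, the sample lines are invented, and a real big data system would use a distributed framework instead.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def count_words(partition):
    # "Map" step: each worker counts words in its own partition only.
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts

def split_into_partitions(lines, n):
    # Round-robin split so each worker receives a share of the data.
    return [lines[i::n] for i in range(n)]

# Hypothetical sample data; real big data would not fit in a list like this.
lines = [
    "volume velocity variety",
    "volume of data grows",
    "velocity of data generation",
    "variety of data formats",
]

partitions = split_into_partitions(lines, 2)
with ThreadPoolExecutor(max_workers=2) as pool:
    partial_counts = list(pool.map(count_words, partitions))

# "Reduce" step: merge the partial results into one total.
total = sum(partial_counts, Counter())
print(total.most_common(3))
```

Because each partition is counted independently, adding more workers (or machines) lets the same approach handle more data, which is the reason big data is processed this way.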