Data Science (1)
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and
systems to extract knowledge and insights from structured and unstructured data, and to apply
that knowledge across a broad range of application domains.
Data science is a "concept to unify statistics, data analysis, informatics, and their related methods"
in order to "understand and analyze actual phenomena" with data. It uses techniques and theories
drawn from many fields within the context of mathematics, statistics, computer science, information
science, and domain knowledge. Turing Award winner Jim Gray imagined data science as a "fourth
paradigm" of science (empirical, theoretical, computational and now data-driven) and asserted that
"everything about science is changing because of the impact of information technology" and the
data deluge.
Data science comprises preparing data for analysis, including cleansing, aggregating, and
manipulating the data to perform advanced data analysis. Analytic applications and data scientists
can then review the results to uncover patterns and enable business leaders to draw informed
insights.
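The preparation steps named above (cleansing and aggregating) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library. The `raw_sales` records, the field names, and the `cleanse` and `aggregate` helpers are hypothetical and exist only for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records; None marks a missing value.
raw_sales = [
    {"region": "north", "amount": 120.0},
    {"region": "north", "amount": None},
    {"region": "south", "amount": 80.0},
    {"region": "south", "amount": 100.0},
    {"region": " North ", "amount": 180.0},
]


def cleanse(records):
    """Drop rows with a missing amount and normalize region labels."""
    cleaned = []
    for row in records:
        if row["amount"] is None:
            continue  # cleansing: skip incomplete rows
        region = row["region"].strip().lower()
        cleaned.append({"region": region, "amount": row["amount"]})
    return cleaned


def aggregate(records):
    """Aggregate the cleansed rows into a mean amount per region."""
    groups = defaultdict(list)
    for row in records:
        groups[row["region"]].append(row["amount"])
    return {region: mean(values) for region, values in groups.items()}


summary = aggregate(cleanse(raw_sales))
print(summary)
```

Dropping incomplete rows is only one possible cleansing policy. Depending on the data, imputing a value or flagging the row for review may be more appropriate.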
A typical data science project works through the following steps:
1. Define the problem: What question are you trying to answer? What insights are you hoping to
gain?
2. Gather the data: What data do you need to answer your question? Where can you find this data?
3. Clean the data: Is your data accurate and complete? Do you need to remove any outliers or
missing values?
4. Explore the data: What patterns and trends can you see in your data? What visualizations can
5. Model the data: Can you build a model to predict or explain the patterns you see in your data?
6. Evaluate the model: How well does your model perform? Is it accurate and reliable?
7. Deploy the model: How can you use your model to make decisions or take actions? How can you
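The steps listed above can be walked through on a toy problem. The sketch below assumes a hypothetical question (how exam score relates to hours studied), made-up observations, and a simple least-squares line as the model. None of these come from the text, and a real project would likely use a dedicated library and far more data.

```python
from statistics import mean

# Step 1 (define): can hours studied predict an exam score? (hypothetical question)
# Step 2 (gather): made-up (hours_studied, exam_score) observations.
observations = [
    (1, 52), (2, 55), (3, 61), (4, 64), (None, 70),
    (5, 68), (6, 73), (7, 76), (8, 82),
]

# Step 3 (clean): drop rows with a missing value.
clean = [(x, y) for x, y in observations if x is not None and y is not None]

# Step 4 (explore): a first summary statistic.
print("mean score:", mean(y for _, y in clean))

# Hold out the last two rows so the model is evaluated on unseen data.
train, test = clean[:6], clean[6:]


def fit_line(points):
    """Step 5 (model): ordinary least squares for y = slope * x + intercept."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_absolute_error(points, slope, intercept):
    """Step 6 (evaluate): average absolute gap between prediction and truth."""
    return mean(abs(y - (slope * x + intercept)) for x, y in points)


slope, intercept = fit_line(train)
mae = mean_absolute_error(test, slope, intercept)
print("slope:", slope, "intercept:", intercept, "held-out MAE:", mae)


def predict(hours):
    """Step 7 (deploy): expose the fitted model as a plain function."""
    return slope * hours + intercept
```

Evaluating on rows the model never saw during fitting is the key design choice here: an error measured on the training rows alone would overstate how reliable the model is.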
Data science is a rapidly growing field with a major impact on a wide variety of industries. As
the amount of data continues to grow, the demand for data scientists is expected to increase.