
Seminar Report on

“DATA SCIENCE”

Submitted towards partial fulfilment of the requirement for the
award of the degree
in
Master of Computer Application (MCA)

Under the Supervision of
Dr. Sasmita Acharya
Assistant Professor,
Department of Computer Science and Engineering

Submitted by
Ashisha Kumar Baliarsingh
Roll No. – 2306151019

Department of Computer Science and Engineering


VEER SURENDRA SAI UNIVERSITY OF TECHNOLOGY
(Formerly, University College of Engineering, Burla)
Burla, Sambalpur, Odisha,768018

Department of Computer Science and Engineering
VEER SURENDRA SAI UNIVERSITY OF TECHNOLOGY
(Formerly, University College of Engineering, Burla)
Burla, Sambalpur, Odisha, 768018

CERTIFICATE

This is to certify that Ashisha Kumar Baliarsingh, a student of Master of
Computer Application (MCA), 3rd semester, bearing Roll No. 2306151019,
has submitted the seminar entitled "Data Science" towards partial
fulfilment of the requirement for the award of the degree of Master of
Computer Application (MCA) during the session 2024-25 under my guidance.

Guide

ACKNOWLEDGEMENT

I wish to express my heartfelt thanks to my seminar guide, Dr. Sasmita Acharya,
for valuable suggestions, keen interest and co-operation. I am greatly indebted
for the constructive and helpful guidance given from time to time during the
progress of the seminar, without which the seminar would not have been completed.

I also wish to thank the other faculty members who helped me directly or
indirectly to complete this seminar.

Finally, I express my sincere gratitude and thanks to the department fraternity
for their technical and non-technical help, encouragement and suggestions from
time to time during the tenure of this seminar. At last, I offer my gratitude
to my friends for their constant support, hearty help and interaction.

Ashisha Kumar Baliarsingh


Regd No: 2306151019

Index

Sr. No    Content

1         Abstract
2         Introduction
3         History of Data Science
4         Definition of Data Science
5         What is Data Science?
          • The Data Science Process
6         Applications / Uses of Data Science
7         Importance of Data Science
8         Advantages and Disadvantages
9         Conclusion

ABSTRACT

Data science is an interdisciplinary field that leverages statistical methods,
algorithms, and computational tools to extract insights and knowledge from
structured and unstructured data. It plays a crucial role in various industries, enabling
data-driven decision-making, predictive analytics, and automation. This paper
explores the core concepts and methodologies in data science, including data mining,
machine learning, and big data analytics. It also examines the practical applications
of data science across different domains, such as healthcare, finance, and marketing,
highlighting the transformative impact of data science on business operations and
societal outcomes. Furthermore, the paper discusses the ethical considerations and
challenges associated with data science, including data privacy, bias, and the
interpretability of models. Through a comprehensive review of the field, this study
provides insights into the current trends, future directions, and the evolving role of
data science in a data-centric world.

CHAPTER-1

INTRODUCTION

Data management and analysis are carried out through computer programming. In
data science, two programming languages are most popular: Python and R. Data is
manipulated to extract information from it. The mathematical foundation of data
science is statistics and probability. Data science has become very popular
because it helps businesses improve productivity. Not only multinational
companies but also small and medium enterprises can take advantage of data
science in their planning.

In a world that is increasingly becoming a digital space, organizations deal
with enormous volumes of structured and unstructured information every day,
with global data volumes now measured in zettabytes. Evolving technologies have
enabled cost savings and smarter ways to store critical data. In today's
industry, there is a huge need for skilled, certified data scientists. They are
among the highest-paid professionals in the IT industry.

Before we look into the definition of data science, let us look at its history.
Data science is nothing new that has been introduced only today. Data existed
in the 1940s as well; however, it was not viewed the way we see it today.
Statisticians played an important role during this period, and they used to do
data analysis manually. Because they lacked computers for this purpose, data
analysis carried less importance.

Data science is now widely used in industry, particularly in IT companies.
Organizations need to address their complex and expanding data environments in
order to identify new sources of value, to exploit future opportunities, and to
grow or optimise efficiently. The differentiating factor for an organization is
what value it extracts from its repository of data using analytics and how well
it presents that value.

Here we list some of the biggest and best companies that are hiring data
scientists at top-notch salaries. Google is by far the biggest company on a
hiring spree for top-notch data scientists. Since most of Google today is driven
by data science, artificial intelligence and machine learning, Google offers
some of the best data science salaries. Amazon is a global e-commerce and cloud
computing giant that is hiring data scientists on a big scale. It needs data
scientists to understand the customer mindset and to enhance the geographical
reach of both its e-commerce and cloud businesses, among other business-driven
goals. Visa is an online financial gateway for many companies and processes
hundreds of millions of transactions over the course of a regular day. As a
result, Visa has a huge requirement for data scientists to generate more
revenue, check fraudulent transactions, and customize products and services to
customer requirements, among other things.

(Fig:1.1- Data science Life cycle)

CHAPTER-2

HISTORY OF DATA SCIENCE


The term "data science" has been in use for several decades, but it only became an
established concept more recently. It was first proposed as a substitute for the term
"computer science" by Peter Naur in the 1960s. Naur later introduced the term "datalogy" and
published a work in 1974 that utilized the term "data science" to describe
contemporary data processing methods across various applications.
In 1996, the International Federation of Classification Societies (IFCS) included the
term "data science" in the title of their biennial conference for the first time, following
its introduction during a roundtable discussion by Chikio Hayashi. The term gained
further prominence in 1997 when C.F. Jeff Wu delivered an inaugural lecture titled
"Statistics = Data Science?" at the University of Michigan. Wu characterized
statistical work as a trilogy involving data collection, modeling and analysis, and
decision-making, advocating for the renaming of statistics as data science and
statisticians as data scientists. He later presented this concept in the P.C. Mahalanobis
Memorial Lectures.
The idea of data science as an independent discipline was further solidified in 2001
by William S. Cleveland, who expanded the field of statistics to include advances in
computing with data. In 2002, the International Council for Science's Committee on
Data for Science and Technology (CODATA) started the Data Science Journal,
which focused on data systems, their online publication, applications, and legal
issues. Soon after, Columbia University began publishing The Journal of Data
Science, creating a platform for data professionals to share and exchange ideas.
Around 2007, Turing Award winner Jim Gray envisioned "data-driven science" as a
"fourth paradigm" of science, which relies on the computational analysis of large
datasets as a primary scientific method. He foresaw a future where all scientific
literature and data would be online and interoperable.
The term "data scientist" gained widespread recognition in 2012 after being

described as "The Sexiest Job of the 21st Century" in a Harvard Business Review
article. This article credited DJ Patil and Jeff Hammerbacher with coining the term
in 2008 to describe their roles at LinkedIn and Facebook. The role of a data scientist
was framed as business-oriented, with an emphasis on the growing demand and
shortage of professionals in this field.
In 2013, the IEEE Task Force on Data Science and Advanced Analytics was
launched, followed by the first European Conference on Data Analysis in
Luxembourg, which established the European Association for Data Science. The first
international IEEE conference on Data Science and Advanced Analytics took place
in 2014. Around the same time, new educational opportunities emerged, such as
General Assembly's paid bootcamps and The Data Incubator's competitive free data
science fellowship.
In 2014, the American Statistical Association's Section on Statistical Learning and
Data Mining renamed its journal to Statistical Analysis and Data Mining: The ASA Data
Science Journal and later renamed itself the Section on "Statistical Learning and Data
Science." The International Journal on Data
Science and Analytics was launched in 2015, further solidifying data science as a
distinct and evolving discipline.

(Fig:2.1- History of Data Science)


CHAPTER-3

DEFINITION OF DATA SCIENCE

Data science is a multidisciplinary field that combines skills in software
engineering and statistics with domain experience to support the end-to-end
analysis of large and diverse data sets, ultimately uncovering value for an
organization and then communicating it to stakeholders as actionable results.

Examples
An example of the use of data science would be creating a machine learning model
that uses data from large amounts of Electronic Health Records (EHRs) to predict
if patients are at a higher risk for readmission after hospital discharge.
Another example would be an AI or neural network that analyzes millions of images
of skin lesions and learns to predict which lesions are most likely to turn cancerous.
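
To make the first example more concrete, the following is a minimal sketch of how a readmission-risk model could be trained. It is only an illustration under stated assumptions: the data is synthetic (no real EHRs are used), the two features (prior_admissions and stay_weeks) and the "true" risk formula used to label the data are invented, and logistic regression fitted by gradient descent is just one simple choice of model.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_risk(weights, bias, features):
    """Estimated probability of readmission for one patient."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def make_synthetic_patients(n, seed=0):
    """Generate made-up (features, label) pairs; this is NOT real EHR data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        prior_admissions = rng.randint(0, 5)   # hypothetical feature
        stay_weeks = rng.uniform(0.1, 2.0)     # hypothetical feature
        # Assumed relationship, used only to label the synthetic patients
        true_risk = sigmoid(0.9 * prior_admissions + 1.4 * stay_weeks - 3.5)
        label = 1 if rng.random() < true_risk else 0
        rows.append(([prior_admissions, stay_weeks], label))
    return rows

def train_logistic_regression(rows, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in rows:
            error = predict_risk(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

patients = make_synthetic_patients(300)
weights, bias = train_logistic_regression(patients)
low = predict_risk(weights, bias, [0, 0.3])    # no prior admissions, short stay
high = predict_risk(weights, bias, [5, 1.7])   # many prior admissions, long stay
print(f"low-risk profile: {low:.2f}, high-risk profile: {high:.2f}")
```

A real project would use far more features, a held-out test set, and an established library; the sketch only shows the idea of learning a risk score from past records.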

(Fig:3.1-Overview of data Science)

CHAPTER-4

INTRODUCTION TO DATA SCIENCE

Data management and analysis are carried out through computer programming. In
data science, two programming languages are most popular: Python and R. Data is
manipulated to extract information from it. The mathematical foundation of data
science is statistics and probability. Data science has become very popular
because it helps businesses improve productivity. Not only multinational
companies but also small and medium enterprises can take advantage of data
science in their planning.

Data science is the study of the flow of information from the colossal amounts
of data present in an organization's repository. It involves obtaining
meaningful insights from raw and unstructured data, which is processed through
analytical, programming, and business skills. Companies are focusing on data
analytics for their growth and to derive major benefits from the data they
already possess. We will also see a few examples that helped companies make the
best of data science.

It is also worth recalling the history of data science: it is nothing new that
has been introduced only today. Data existed in the 1940s and 1950s as well;
however, it was not viewed the way we see it today. Statisticians played an
important role during this period, and they used to do data analysis manually.
Because they lacked computers for this purpose, data analysis carried less
importance.

- In the 1940s and 1950s, data storage was a big issue.
- Today we have ample data storage options.

The following diagram shows the common disciplines that a data scientist may
draw upon. A data scientist's level of experience and knowledge in each often
varies along a scale ranging from beginner, to proficient, to expert.

(Fig:4.1- Characteristics of Data Science)

While these and other disciplines and areas of expertise are all characteristics
of the data scientist role, it helps to think of a data scientist's foundation
as being based on four pillars. More specific areas of expertise, such as data
engineering, the scientific method, visualization, and domain knowledge, can be
derived from these pillars.

The pillars of data science expertise:

- Business domain
- Statistics and probability
- Computer science and software programming
- Written and verbal communication

4.2 The Data Science Process
The data science process varies somewhat with the project goals and the approach
taken, but it generally involves the following phases:

• Data acquisition, collection, and storage

• Accessing, ingesting, and integrating data

• Processing and cleaning data

• Initial data investigation and exploratory data analysis (EDA)

• Choosing one or more potential models and algorithms

• Applying data science methods and techniques (e.g., machine learning,
statistical modelling, artificial intelligence)

• Measuring and improving results (validation and tuning)

• Delivering, communicating, and presenting the final results

• Repeating the process to solve a new problem
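
The phases listed above can be illustrated with a very small end-to-end sketch. Everything in it is an assumption made for illustration: the data set (advertising spend against sales) is invented, a short in-memory CSV stands in for a real data source, and a straight line fitted by ordinary least squares stands in for whatever model a real project would choose.

```python
import csv
import io
import statistics

# 1. Acquisition: a tiny in-memory CSV stands in for a real data source.
#    The column names and values are invented for illustration.
RAW_CSV = """ad_spend,sales
1,3.1
2,4.9
3,7.2
4,
5,11.0
6,12.8
7,15.1
8,17.0
"""

def load_rows(text):
    # 2. Ingestion: parse the CSV text into one dictionary per row
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    # 3. Cleaning: drop rows with missing values and convert strings to floats
    return [(float(r["ad_spend"]), float(r["sales"]))
            for r in rows if r["ad_spend"] and r["sales"]]

def explore(pairs):
    # 4. EDA: basic summary statistics of the target column
    sales = [y for _, y in pairs]
    return {"n": len(sales),
            "mean": statistics.mean(sales),
            "stdev": statistics.stdev(sales)}

def fit_line(pairs):
    # 5. Modelling: ordinary least squares for y = slope * x + intercept
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in pairs)
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def mean_absolute_error(model, pairs):
    # 6. Validation: average absolute prediction error on held-out rows
    slope, intercept = model
    return statistics.mean(abs(slope * x + intercept - y) for x, y in pairs)

pairs = clean(load_rows(RAW_CSV))
train, holdout = pairs[:5], pairs[5:]   # simple split: last two rows held out
summary = explore(pairs)
model = fit_line(train)
mae = mean_absolute_error(model, holdout)

# 7. Communication: report the result in a form a stakeholder can read
print(f"rows kept: {summary['n']}, slope: {model[0]:.2f}, hold-out MAE: {mae:.2f}")
```

In practice each phase is much larger, and the loop is repeated as new questions arise, but the order of the steps is the same.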

Here is a diagram for Data Science Process:

(Fig:4.2- Process of Data Science)

CHAPTER-6
APPLICATIONS / USES OF DATA SCIENCE

By applying data science to the large stores of data they hold, companies have
become intelligent enough to push and sell products according to customers'
purchasing power and interests.

➢ Internet Search
➢ Digital Advertisements(Targeted Advertising and re-targeting)
➢ Recommender System
➢ Image Recognition
➢ Speech Recognition
➢ Gaming
➢ Price Comparison Website
➢ Airline Route Planning
➢ Fraud and Risk Detection
➢ Delivery logistics
➢ Miscellaneous
➢ Coming up in Future
➢ Self-Driving Cars
➢ Robots
➢ Healthcare
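
As an illustration of one item from the list above, fraud and risk detection, the following sketch flags transaction amounts that are unusually far from the typical value. It is a deliberately simple stand-in, not how any particular company detects fraud: the amounts are invented, and the rule used (a modified z-score based on the median, with the commonly quoted cut-off of 3.5) is an assumption chosen because it is easy to follow.

```python
import statistics

def flag_unusual_amounts(amounts, threshold=3.5):
    """Return (index, amount) pairs whose modified z-score exceeds threshold.

    Uses the median and the median absolute deviation (MAD), which are less
    distorted by extreme values than the mean and standard deviation.
    """
    median = statistics.median(amounts)
    mad = statistics.median(abs(a - median) for a in amounts)
    if mad == 0:
        # At least half the values equal the median, so the score is undefined
        return []
    flagged = []
    for i, amount in enumerate(amounts):
        modified_z = 0.6745 * (amount - median) / mad
        if abs(modified_z) > threshold:
            flagged.append((i, amount))
    return flagged

# Invented transaction amounts for one hypothetical card; not real data
amounts = [25.0, 31.5, 18.2, 27.9, 22.4, 30.1, 950.0, 26.3, 24.8, 29.0]
suspicious = flag_unusual_amounts(amounts)
print(suspicious)
```

Real fraud systems combine many signals (location, merchant, time of day) and learned models; a single-variable rule like this only shows the basic idea of spotting outliers.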

(Fig:6.1- Application of Data Science)


CHAPTER – 7

IMPORTANCE OF DATA SCIENCE


In the last few years, data science has come a long way, and it has become
integral to understanding the workings of many industries. The following are
examples of why data science will remain an integral part of the world's
culture and economy.
• Data science helps brands understand their customers in a much improved and
more powerful way. Customers are the base of any brand and play a big role
in its success or failure. With data science, a brand can connect with its
customers individually, which builds better brand power and engagement.
• One of the reasons data science is attracting so much attention is that it
allows brands to tell their story in a compelling and powerful way. When
companies use their data comprehensively, they can share their story with
their audience and create a good connection. Nothing connects with
customers like a story that can evoke human emotions.

(Chart: shares of Computer Science, Statistics and Mathematics, Economic and
Social Science, Data Science and Analysis, and Natural Science)

(Fig: 7.1- Visualization of Data)

CHAPTER-8
ADVANTAGES AND DISADVANTAGES
➢ 8.1 Advantages

• Data science competence can be developed more easily thanks to the possible
knowledge transfer between colleagues.
• Specialization is possible within the team.
• The team lead has data science competence.
• Data science is the science of systematically discovering patterns and
useful knowledge in data and predicting something of value.

➢ 8.2 Disadvantages

• Business and process understanding might suffer.
• Longer distances for coordination with departments.
• High risk that data science tasks come off badly in the daily business
(competing tasks).
• Knowledge exchange between various departments is difficult.

CHAPTER – 9

CONCLUSION

Data science has emerged as a transformative discipline that plays a pivotal role in
today's data-driven world. By integrating skills in software engineering, statistics,
and domain knowledge, data science enables organizations to harness the power of
large and complex datasets. Through advanced analytical techniques and machine
learning, data scientists uncover patterns, make predictions, and provide actionable
insights that drive informed decision-making. As data continues to grow
exponentially, the importance of data science will only increase, shaping the future
of industries and society by unlocking the potential hidden within data.
