CE880_Lecture_1_slides
Haider Raza
Tuesday, 17 Jan 2023
About Myself
Assessment Information
The aim of the module is to develop quantitative skills in AI and data science for
professionals working in areas where these topics are now being embedded. The module
will enable those future professionals to take a knowledgeable approach to their use
of AI and data science.
The deadlines are as follows:
- Students are required to complete their weekly lab work and submit all the lab
work notebooks to Faser before 14/04/2023 (11:59:59).
- If for any reason you are unable to submit the coursework (labs / case
study) on time, please see
https://www.essex.ac.uk/student/exams-and-coursework/extenuating-circumstances
What is Data?
Types of Data
What is Data Science?
Data Science Venn Diagram
Advantages and Disadvantages of Data Science
Advantages
- Multiple job options
- Business benefits
- Highly paid jobs and career opportunities
- Data science makes data better
- Data science is versatile and can be applied to any business
- No more boring tasks
- Learning something new every day
Disadvantages
- Data science is a blurry term
- Mastering data science is near to impossible
- Good domain knowledge is required
- Arbitrary data may yield unexpected results
- Data privacy problems
5 Reasons to Study Data Science
- Learning about data science gives you an opportunity to reinvent yourself.
- We live in a digital world in which everything is data-driven.
- Data science is a very promising field with many high-paying job
opportunities.
- Basic data science skills are important for personal use.
- You can use your knowledge of data science to generate side income.
History of Data Science
Figure 1: History of data science
Source: https://towardsdatascience.com/the-history-of-data-science-dfe789499d50
Data Science Workflow
Source: Microsoft
Data Science: Healthcare
Source: Microsoft
Data Science: Finance
Source: Microsoft
Data Science: Journalism
Source: BBC
Data Science: Sports
Source: Orreco
Data Science: Crime Prevention
Source: IBM
Data Science: How the UK Government Is Using It
Source: blog.gov.uk
Data Science: Tools and Techniques
Source: https://becomingadatascientist.wordpress.com/2013/07/26/choosing-a-data-science-technology-stack-w-survey/
What Tools and Packages Are We Going to Use?
Introduction to NumPy
What is NumPy?
NumPy is a Python library used for working with arrays.
- It also has functions for working in the domains of linear algebra, Fourier
transforms, and matrices.
- NumPy was created in 2005 by Travis Oliphant. It is an open-source project and
you can use it freely.
- NumPy stands for Numerical Python.
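The points above can be illustrated with a minimal sketch. The array values and variable names below are made up for illustration; only the NumPy calls themselves (np.array, np.linalg.solve, np.fft.fft) come from the library.

```python
import numpy as np

# Build a 1-D array and a 2-D array (matrix) from Python lists.
# The numbers are arbitrary illustration values.
v = np.array([1.0, 2.0, 3.0])
A = np.array([[2.0, 0.0],
              [1.0, 3.0]])

# Arithmetic applies to every element at once, with no explicit loop.
doubled = v * 2
mean_value = v.mean()

# Linear algebra: solve the system A @ x = b for x.
b = np.array([4.0, 11.0])
x = np.linalg.solve(A, b)

# Fourier transform: a constant signal has all its energy at frequency 0.
signal = np.ones(4)
spectrum = np.fft.fft(signal)

print(doubled, mean_value, x, spectrum)
```

Working on whole arrays like this, rather than looping over elements in Python, is the main habit to pick up when starting with NumPy.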
Introduction to NumPy
Source: Nature, "Array programming with NumPy"
Introduction to Pandas
What is Pandas?
Pandas is a Python library used for working with data sets.
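As a rough sketch of what working with a data set looks like, the example below builds a small table, filters its rows, and summarises it by group. The names, groups, and marks are hypothetical values invented for illustration.

```python
import pandas as pd

# A small, hypothetical data set: one row per student.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "group": ["A", "A", "B", "B"],
    "mark": [72, 58, 65, 81],
})

# Select the rows that satisfy a condition.
high_marks = df[df["mark"] >= 65]

# Summarise a column per group.
mean_by_group = df.groupby("group")["mark"].mean()

print(df)
print(high_marks)
print(mean_by_group)
```

The table object here is a DataFrame: labelled columns, one row per record, with filtering and grouping available as one-line operations.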
What Does a DataFrame Look Like?
Source: W3resource
Introduction to Matplotlib
What is Matplotlib?
Matplotlib is a low-level graph-plotting library in Python that serves as a visualization
utility.
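A minimal plotting sketch, assuming we simply want two curves on one set of axes. The data (sine and cosine), the labels, and the title are illustrative choices, and the Agg backend is selected only so that the script also runs on machines without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Sample 100 points between 0 and 2*pi.
x = np.linspace(0, 2 * np.pi, 100)

# Create a figure with a single set of axes and draw two lines on it.
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)")

# Label the axes, add a title, and show the legend.
ax.set_xlabel("x")
ax.set_ylabel("value")
ax.set_title("Sine and cosine")
ax.legend()
```

To keep the result, call fig.savefig with a file name of your choice; in a notebook the figure is normally displayed automatically.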
What Can Matplotlib Do?
Source: https://towardsdatascience.com/python-data-visualization-with-matplotlib-part-2-66f1307d42fb
Introduction to Scikit-Learn
What is Scikit-Learn?
Scikit-learn is a Python library that provides many unsupervised and supervised
learning algorithms. It is built on technology you might already be familiar with:
NumPy, SciPy, and Matplotlib, and it accepts pandas data as input. The functionality
that scikit-learn provides includes:
Introduction to scikit-learn
Source: https://scikit-learn.org/
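To make the supervised/unsupervised distinction concrete, here is a minimal sketch using the Iris data set bundled with scikit-learn. The choice of data set, of a k-nearest-neighbours classifier, of k-means clustering, and of every parameter value is an assumption made for illustration, not something prescribed by the module.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load features X and class labels y from the bundled Iris data set.
X, y = load_iris(return_X_y=True)

# Supervised learning: fit on labelled training data, evaluate on held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # fraction of test samples classified correctly

# Unsupervised learning: group the samples without looking at the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

print(f"Test accuracy: {accuracy:.2f}")
print("Cluster assignment of the first five samples:", cluster_labels[:5])
```

Most scikit-learn estimators follow the same pattern: create the object, call fit, then call predict or score.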
Introduction to GitHub
What is GitHub?
GitHub is a code hosting platform for collaboration and version control. GitHub lets
you (and others) work together on projects.
What Can a GitHub Repository Do?
What Can GitHub Do?
Source: https://www.coursereport.com/
Thank you!