Python For Data Science

Uploaded by

Eric Ichaura

• What is data science?
• Data science is the field of study that works with large
volumes of data, using modern tools and techniques to find
unseen patterns, derive meaningful information, and inform
business decisions.
• Data science often uses machine learning algorithms to
build predictive models.
• The data used for analysis can come from many different
sources and be presented in various formats.
• In its most basic form, data science is the extraction of valuable
information or insights from structured or unstructured data using
business, programming, and analysis skills.
• It is a field with many different components, including
mathematics, statistics, and computer science.
How data science operates
• A data science project goes through several stages, described below.

Problem statement:
• A problem statement is a clear and concise description of the
problem that needs to be solved.
• It is crucial to state the problem accurately and distinctly.
• It defines the scope of the project and sets the direction for
the analysis.
• A well-defined problem statement helps data scientists
focus on the relevant data, choose appropriate methods,
and measure the success of the project.
• Steps for creating a problem statement:
• Identify the problem.
• Define the scope.
• State the objective.
• Formulate the question.
• Review and refine.
Data Collection:
• The next logical step after defining the problem statement is
to look for the data that your model might need.
• Research thoroughly and gather all the information you require.
Data can exist in both structured and unstructured forms.
• It can take many different shapes, including videos,
spreadsheets, coded forms, etc.
• You must compile the data from all of these sources.
Cleaning of Data:
• The goal of data cleaning is to remove duplicate and redundant
records and to deal with missing values in your collection.
• Numerous tools are available for this in both Python and R.
Which one you select is entirely up to you.
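The cleaning step described above can be sketched with pandas, one of several possible tools. This is a minimal sketch: the table, its column names, and the decision to drop (rather than fill in) rows with missing values are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical raw records: one exact duplicate row and one missing age.
raw = pd.DataFrame(
    {
        "name": ["Ann", "Ben", "Ben", "Cara"],
        "age": [28, 35, 35, None],
    }
)

# Drop exact duplicate rows, then drop rows that still have missing values.
cleaned = raw.drop_duplicates().dropna().reset_index(drop=True)

print(cleaned)
```

Dropping rows is only one way to treat missing data; depending on the problem, filling the gaps with a default or an average may be more appropriate.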
Data analysis and exploration:
• Data analysis involves looking for hidden patterns,
observing behaviors, showing the effect of one variable
relative to others, and drawing conclusions.
• We can explore the data with graphs created using the
plotting libraries of a programming language, for example:
• Matplotlib in Python.
• ggplot2 in R.
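Besides graphs, the relationship between two variables can also be summarized with a number. The sketch below computes the Pearson correlation coefficient in plain Python. The data and the helper name pearson are made up for illustration.

```python
from statistics import mean

# Hypothetical observations: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
score = [52, 58, 65, 70, 78]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


r = pearson(hours, score)
print(round(r, 3))
```

A value close to 1 suggests that the two variables rise together; a value close to 0 suggests no linear relationship.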
Modeling Data:
• In this stage you build a model that will allow you to
make accurate predictions in the future.
• Here, you must pick an algorithm that suits your problem
best. There are numerous types of algorithms, including
SVMs (support vector machines), clustering, regression,
and classification.
• You might use a machine learning algorithm as your model.
• Your model is trained using the training data, and it is then
tested using the test data.
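The train-then-test idea can be sketched in plain Python with a simple regression model. Everything here is an assumption chosen for illustration: the synthetic data, the 75/25 split, the helper names fit_line and mean_squared_error, and the choice of a straight-line model.

```python
import random

# Hypothetical data: y is roughly 2*x + 1 with a little noise.
random.seed(0)
data = [(x, 2 * x + 1 + random.uniform(-0.5, 0.5)) for x in range(20)]

# Shuffle, then hold out 25% of the rows as test data.
random.shuffle(data)
split = int(len(data) * 0.75)
train, test = data[:split], data[split:]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(points, slope, intercept):
    """Average squared gap between predictions and true values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points) / len(points)


# Train on the training data only, then evaluate on the unseen test data.
slope, intercept = fit_line(train)
test_mse = mean_squared_error(test, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} test MSE={test_mse:.3f}")
```

Keeping the test rows out of training is what makes the test error a fair estimate of how the model behaves on new data.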
Implementation and Optimisation
• This stage lets you assess how well your model is
performing so that it can be tuned.
Python for data science
• Python is a high-level, open-source, interpreted language with
good support for object-oriented programming.
• Python has excellent capabilities for working with
mathematical, statistical, and scientific functions. It offers
excellent libraries for data science applications.
• Because of its simplicity and ease of use, Python is one of the
most popular programming languages in the scientific and
research sectors.
• The Python language has the useful features listed below:
• It uses an elegant syntax, so programs are simpler to read.
• The language is easy to learn, which makes it simple to get an
application running.
• It has an extensive standard library and community support.
• Python's interactive mode makes it easy to test code.
• Python makes it easy to extend existing code with new modules written
in a compiled language such as C or C++.
• Python can be embedded into other programs to provide a
customizable interface.
• It lets developers run their code on Linux, Windows, Mac OS X, UNIX,
and other operating systems.
Python libraries frequently used in data science.
• NumPy:
• NumPy is short for Numerical Python. It offers many practical
features for n-dimensional array and matrix operations in Python.
• It offers mathematical functions for working with
high-dimensional arrays.
• It offers numerous array, matrix, and linear
algebra methods and functions.
• NumPy makes it simple to manipulate large
multidimensional arrays and matrices.
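As a brief illustration of the array and matrix operations mentioned above, here is a minimal NumPy sketch. The values are arbitrary and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

# A 2 x 3 array (two rows, three columns) of hypothetical measurements.
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(a.shape)         # dimensions of the array
print(a.mean(axis=0))  # mean of each column
doubled = a * 2        # element-wise arithmetic, no explicit loop

# Linear algebra: matrix product of a (2 x 3) with its transpose (3 x 2).
gram = a @ a.T
print(gram)
```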
Pandas
• It is one of the most widely used Python libraries for data
manipulation and analysis.
• Pandas offers practical tools for working with large amounts of
structured data and a simple way to conduct analysis.
• It offers rich data structures and allows for the
manipulation of time series data and numerical tables.
• Pandas is well suited to handling tabular data.
• Pandas is designed to make data manipulation, aggregation, and
visualization fast and simple.
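The aggregation and time series features mentioned above might be used as in the following sketch. The sales table and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical daily sales table; the column names are made up for illustration.
sales = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"]
        ),
        "product": ["apples", "pears", "apples", "pears"],
        "units": [10, 4, 7, 9],
    }
)

# Aggregation: total units sold per product.
units_per_product = sales.groupby("product")["units"].sum()

# Time series: total units sold per day, indexed by date.
units_per_day = sales.set_index("date")["units"].resample("D").sum()

print(units_per_product)
print(units_per_day)
```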
Matplotlib
• Matplotlib offers a number of ways to visualize data
effectively. It makes it simple to create line graphs, pie charts,
histograms, and other professional-quality graphics.
• The interactive tools in Matplotlib include zooming, panning,
and saving the graph to an image file.
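A line graph and a histogram can be drawn and saved roughly as follows. This is a sketch with invented data. It selects the non-interactive "Agg" backend so that it runs without a display, and it writes the image to an in-memory buffer instead of a file.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Hypothetical data for illustration.
months = [1, 2, 3, 4, 5, 6]
revenue = [12, 15, 14, 18, 21, 19]

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(months, revenue, marker="o")  # line graph
ax_line.set_xlabel("Month")
ax_line.set_ylabel("Revenue")

ax_hist.hist(revenue, bins=3)  # histogram
ax_hist.set_xlabel("Revenue")

# Save the figure as PNG; passing a file name instead of a buffer also works.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

In an interactive session, calling plt.show() instead of saving opens a window with the zoom and pan tools.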
SciPy
• SciPy offers excellent capabilities for scientific computing
and mathematics.
• SciPy has sub-modules for common tasks in science and
engineering such as optimization, linear algebra, integration,
interpolation, special functions, FFT, signal and image
processing, ODE solvers, and statistics.
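Two of the sub-modules listed above, integration and optimization, can be tried with a short sketch. The functions being integrated and minimized are arbitrary examples picked because their exact answers are known.

```python
import math

from scipy import integrate, optimize

# Integration: area under sin(x) between 0 and pi (the exact value is 2).
area, error_estimate = integrate.quad(math.sin, 0, math.pi)

# Optimization: minimum of a simple quadratic, f(x) = (x - 3)^2 + 1,
# which has its minimum value 1 at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3) ** 2 + 1)

print(area)
print(result.x, result.fun)
```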
Python syntax
• Python indentation.
• Indentation refers to the spaces at the beginning of a code
line.
• Where in other programming languages the indentation in
code is for readability only, the indentation in Python is very
important.
• Python uses indentation to indicate a block of code.
• Example:
if 5 > 2:
    print("Five is greater than two!")
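Blocks can be nested, with each extra level of indentation opening a new block. The sketch below uses made-up numbers to show a loop that contains an if/else. Leaving out the indentation after a line that ends in a colon makes Python raise an IndentationError.

```python
# Each extra level of indentation opens a new block.
numbers = [3, 8, 1]
large = []

for n in numbers:      # block 1: the loop body
    if n > 2:          # block 2: runs only when the condition holds
        large.append(n)
    else:
        print(n, "is not greater than two")

# Back at the left margin, so this runs once, after the loop has finished.
print(large)
```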
• Python Variables.
• The term "variables" refers to memory locations that are
reserved for storing values.
• In Python, you do not need to declare variables before using
them, nor do you need to declare their type.
• A variable is like a container that stores values that you can
access or change.
• It is a way of pointing to a memory location used by a
program. You can use variables to instruct the computer to
save or retrieve data to and from this memory location.
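The points above can be seen in a few lines. This is a minimal sketch; the names count and total are arbitrary.

```python
# No declaration is needed: a name is created when a value is assigned to it.
count = 5
print(type(count).__name__)   # int

# The same name can later refer to a value of a different type.
count = "five"
print(type(count).__name__)   # str

# Reading and changing a stored value.
total = 10
total = total + 2
print(total)                  # 12
```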
Python variable
• Example of variables in Python:
x = 5                  # an integer
y = "Hello, World!"    # a string

x = 5
y = "John"
print(x)               # prints 5
print(y)               # prints John
• Casting: to set the data type of a value explicitly, use str(), int(), or float().
x = str(3)    # x will be '3'
y = int(3)    # y will be 3
z = float(3)  # z will be 3.0
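Casting is often needed when numbers arrive as text, for example from a file or from user input. The sketch below uses a hypothetical list of input strings and shows what happens when a string cannot be converted.

```python
# Text input usually arrives as strings, so it is cast before doing arithmetic.
raw_values = ["3", "4.5", "abc"]   # hypothetical input
numbers = []

for text in raw_values:
    try:
        numbers.append(float(text))
    except ValueError:
        # float() raises ValueError when the text is not a valid number.
        print(f"skipping {text!r}")

print(numbers)          # [3.0, 4.5]
print(int(numbers[1]))  # 4, because int() truncates toward zero
```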

• String variables can be declared using either single or
double quotes:
x = "John"
# is the same as
x = 'John'
