R-Python-MATLAB 7/19/2023
Collaborative Online Training Program on
“Prospects and Applications of Artificial Intelligence in Livestock Sector”
INTRODUCTION TO PYTHON, R AND
MATLAB
CS Mukhopadhyay, PhD
Senior Scientist
College of Animal Biotechnology,
Guru Angad Dev Veterinary and Animal Sciences University
INTRODUCTION
Programming language (PL): a set of instructions, written in a defined
notation, that a computer system can be made to execute to perform some
task
Two components of a PL:
Semantics: the meaning carried by a syntactically valid piece of code
Syntax: the writing rules, including the use of symbols, indentation, etc.
Low- vs. high-level languages:
Low-level PL: close to the hardware, e.g. binary machine code understood directly by the computer
High-level PL: human-readable, English-like code that must be
compiled or interpreted into machine code
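The translation step can be glimpsed from within Python itself. The sketch below uses the standard-library dis module to list the lower-level instructions that CPython generates for one high-level statement. Note the assumption: CPython produces bytecode for its own virtual machine, not the binary machine code of the processor, so this only illustrates the idea of translation. The function name add is hypothetical.

```python
import dis

def add(a, b):
    # One high-level, human-readable statement
    return a + b

# Collect the names of the lower-level instructions CPython generated for add()
opnames = [ins.opname for ins in dis.get_instructions(add)]
for name in opnames:
    print(name)
```

The exact instruction names differ between Python versions, which is why no expected output is shown.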
GENERATIONS OF PROG. LANGUAGES
First Generation PL (1GL): machine-level languages, now rarely
written by hand
Originally entered through the front panel of the computer
system
Second Generation PL (2GL): assembly languages
Code can be read & written by the programmer
2GLs are still used in kernels & device drivers, and in tasks that need
extremely intensive processing (viz. games, graphics, etc.)
Third Generation PL (3GL): more programmer-friendly & refined than
the 2GLs
Also called high-level languages
Examples: ALGOL, COBOL, FORTRAN; later C, C++, C#,
Java, etc.
GENERATIONS OF PROG. LANGUAGES
Fourth Generation PL (4GL): an improved version of the 3GLs that attempts to get
closer to human language, form of thinking &
conceptualization
4GLs are high-level specification programming languages or
environments
Examples: SQL, R, SPSS, SAS, etc.
Fifth Generation PL (5GL): a programming language based on solving problems
using constraints given to the program, rather than using
an algorithm written by a programmer
AI is the main application area driving the development of 5GLs
Examples: Prolog, OPS5 & Mercury
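The constraint-based idea can be sketched in Python, with the caveat that Python is not a 5GL. The helper solve below is hypothetical: a brute-force search standing in for the engine that a language such as Prolog provides. The user states only what must hold (the constraints), not how to find the answer.

```python
from itertools import product

def solve(variables, domain, constraints):
    """Return every assignment of domain values to variables that satisfies all constraints."""
    solutions = []
    for values in product(domain, repeat=len(variables)):
        candidate = dict(zip(variables, values))
        if all(check(candidate) for check in constraints):
            solutions.append(candidate)
    return solutions

# The "program" is only a statement of what must be true
constraints = [
    lambda v: v["x"] + v["y"] == 10,
    lambda v: v["x"] - v["y"] == 2,
]

# Search the whole numbers 0..10 for values of x and y
solutions = solve(["x", "y"], range(11), constraints)
print(solutions)  # [{'x': 6, 'y': 4}]
```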
R PROGRAMMING ENVIRONMENT
R has been used primarily in academia & research
It is mainly used for statistical analysis and exploratory data analysis
R is a widely used tool for making beautiful graphs & visualizations
R packages are available from:
Comprehensive R Archive Network (CRAN)
Bioconductor: an open-source repository of packages for bioinformatics
GitHub: a web-based Git repository hosting service
Most R packages depend on other packages
R is less popular than Python for deep learning & NLP
R PROGRAMMING ENVIRONMENT
Development environment: R can be run either from the command line or from
RStudio (an integrated development environment (IDE) comprising a data editor, debugging
support & a graphics window).
Anaconda can also be used for R, but this is less popular than using Anaconda for Python
(Jupyter Notebook and Spyder, the Scientific PYthon Development EnviRonment)
Jupyter Notebook: an application for creating and editing documents that
display the input and output of a Python or R script. The notebook
files
R is preferred for standalone computing or related analysis
To use R in Jupyter: open the Anaconda Prompt (Anaconda 3) and install the R kernel by typing the
following command after the prompt (shown here as C:\Users\csmbi>):
C:\Users\csmbi> conda install -c r r-irkernel
R PROGRAMMING ENVIRONMENT
RStudio
R Console
PYTHON LANGUAGE
1991: Python was released as a general-purpose, interpreted
high-level programming language whose design philosophy
emphasizes code readability.
It was developed by Guido van Rossum
PYTHON LANGUAGE
Python is preferred by programmers, data analysts, data scientists
and developers.
Coding & debugging are easy because of the simple syntax.
Indentation is part of the syntax: changing the indentation of a line can alter the
meaning of the code
Python is preferred when data analysis tasks need to be integrated
into web apps
Python is widely used because of its code readability,
speed & many functionalities; its syntax is clear & elegant, easily
interpretable & easy to type
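The point about indentation can be shown with a minimal sketch. The two hypothetical functions below differ only in the indentation of a single line, yet they return different results.

```python
def total_indented(values):
    total = 0
    for v in values:
        total += v
        total += 1      # indented: inside the loop, runs once per value
    return total

def total_dedented(values):
    total = 0
    for v in values:
        total += v
    total += 1          # dedented: outside the loop, runs only once
    return total

data = [1, 2, 3]
print(total_indented(data))  # 9
print(total_dedented(data))  # 7
```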
PYTHON LANGUAGE
Python is suitable for most mathematical computation & for learning how
algorithms work
The Python Package Index (PyPI) & Anaconda are repositories of Python
software and libraries
Biopython is used for bioinformatics analysis
Data visualizations are more convoluted to produce in Python than in R, & the results are
often less eye-pleasing or informative
Jupyter Notebook & Spyder are popular and widely preferred environments for Python
The Python package rpy2 is 'an interface to R running embedded in a
Python process', i.e. it allows you to call R from Python
MATLAB
1984: MATLAB was released commercially by MathWorks
MATLAB allows matrix manipulations, plotting of functions &
data, and implementation of algorithms
It is also used for creating user interfaces & for interfacing with
programs written in other languages, including C, C++, Java
& Fortran
It is commercial software
ANACONDA NAVIGATOR
JUPYTER NOTEBOOK
CONFUSION MATRIX GENERATED (BY PYTHON)
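The figure from this slide is not reproduced here. As a rough illustration of what a confusion matrix contains, the sketch below counts actual versus predicted class labels using only the Python standard library. The labels and data are made up for the example, and confusion_matrix here is a hypothetical helper; libraries such as scikit-learn provide a ready-made function for the same purpose.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Made-up example data: true status vs. status predicted by a classifier
actual    = ["sick", "sick", "healthy", "healthy", "sick", "healthy"]
predicted = ["sick", "healthy", "healthy", "healthy", "sick", "sick"]
labels = ["healthy", "sick"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
# healthy [2, 1]
# sick [1, 2]
```

The diagonal cells count correct predictions; the off-diagonal cells count the two kinds of misclassification.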
POPULAR LIBRARIES AND PACKAGES: R
dplyr, tidyr and data.table to easily manipulate data
stringr to manipulate strings
zoo to work with regular and irregular time series
ggplot2 to visualize data
caret for machine learning, and many more
POPULAR LIBRARIES AND PACKAGES: PYTHON
pandas to easily manipulate data
statsmodels to explore data, estimate statistical models, and
perform statistical tests and unit tests
scipy and numpy for scientific computing
scikit-learn for machine learning
matplotlib and seaborn to make graphics
biopython for sequence analysis
MATLAB TOOLBOXES: AI-ML-DL
Bioinformatics Toolbox
Database Toolbox
Deep Learning Toolbox
Medical Imaging Toolbox
Signal Processing Toolbox
Statistics and Machine Learning Toolbox
COMPARISON OF R & PYTHON CODING
R-Coding Python
LET US DO SOME PRACTICAL HANDS ON
1. Installation of R and Anaconda
2. Create one dataframe in R and Python
3. Import *.csv file using R and Python
1. INSTALLATION OF R AND ANACONDA
R Download & Installation (Windows OS):
https://cran.r-project.org/bin/windows/base/
Install R (Linux OS):
sudo apt update
sudo apt install r-base r-base-dev -y
Download and Install Anaconda (Windows OS):
https://www.anaconda.com/
Install Anaconda (Linux OS):
Multiple steps for various kinds of Linux at:
https://docs.anaconda.com/free/anaconda/install/linux/
2. CREATE ONE DATAFRAME IN R AND PYTHON
R:
# Build a data frame from three equal-length column vectors
DF_Name <- data.frame(
  Name  = c("AA", "BB", "CC", "DD", "EE", "FF"),
  Age   = c(20, 25, 30, 20, 20, 30),
  Ph.D. = c(TRUE, FALSE, TRUE, TRUE, TRUE, FALSE))
Python:
import pandas as pd
# Build the same table from a dictionary of equal-length lists
DF_Name_0 = pd.DataFrame({
    "Name" : ["AA", "BB", "CC", "DD", "EE", "FF"],
    "Age"  : [20, 25, 30, 20, 20, 30],
    "Ph.D.": [True, False, True, True, True, False]})
3. IMPORT *.CSV FILE USING R AND PYTHON
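R:
# header = FALSE and sep = "\t" suit a tab-separated file without a header row;
# for a standard comma-separated file with a header use header = TRUE, sep = ","
DF_Name <- read.csv("path/file_name.csv",
                    header = FALSE, sep = "\t")
Python:
import pandas
DF_Name = pandas.read_csv("path/1_Age_Wt_Ht.csv")
DF_Name  # in a notebook, this displays the data frame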
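The role of the header row and the separator can be illustrated without any data file. The sketch below is an assumption-laden stand-in: it uses the standard-library csv module instead of pandas, and reads made-up text from memory rather than from path/1_Age_Wt_Ht.csv, whose contents are not shown in the slides.

```python
import csv
import io

# Hypothetical file contents: a header row, then comma-separated values
text = "Name,Age\nAA,20\nBB,25\n"

# DictReader treats the first row as the header; delimiter sets the separator
rows = list(csv.DictReader(io.StringIO(text), delimiter=","))
print(rows[0]["Name"], rows[0]["Age"])  # AA 20

# With the wrong separator, each line is read as one single field
wrong = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
print(list(wrong[0].keys()))  # ['Name,Age']
```

Note that the csv module returns every value as a string, whereas pandas.read_csv infers numeric column types.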
IASRI Talk: GWAS Hands On 22/02/2016
ACKNOWLEDGMENTS: PROF. JAMES REECY, DR. JAMES KOLTES, DR. J. R. TAIT & ERIC
FRITZ, ALONG WITH TWO OTHER TRAINEES FROM PAKISTAN (DR. SAIF-UR RAHMAN) AND
CHINA (XEURONG YANG)
Thanks for your patient hearing