3.1. Statistics in Python
Author: Gaël Varoquaux
Requirements
+ Standard scientific Python environment (numpy, scipy, matplotlib)
+ Pandas
+ Statsmodels
+ Seaborn
To install Python and these dependencies, we recommend that you download Anaconda Python or
Enthought Canopy, or preferably use the package manager if you are under Ubuntu or another Linux distribution.
See also:
+ Bayesian statistics in Python: This chapter does not cover tools for Bayesian statistics. Of
particular interest for Bayesian modelling is PyMC, which implements a probabilistic
programming language in Python,
+ Read a statistics book: The Think Stats book is available as a free PDF or in print and is a great
introduction to statistics.
Why Python for statistics?
R is a language dedicated to statistics. Python is a general-purpose language with statistics modules. R has
more statistical analysis features than Python, and specialized syntaxes. However, when it comes to building
complex analysis pipelines that mix statistics with e.g. image analysis, text mining, or control of a physical
experiment, the richness of Python is an invaluable asset.
Contents
+ Data representation and interaction
  - Data as a table
  - The pandas data-frame
+ Hypothesis testing: comparing two groups
  - Student's t-test: the simplest statistical test
  - Paired tests: repeated measurements on the same individuals
+ Linear models, multiple factors, and analysis of variance
  - "formulas" to specify statistical models in Python
  - Multiple Regression: including multiple factors
  - Post-hoc hypothesis testing: analysis of variance (ANOVA)
+ More visualization: seaborn for statistical exploration
  - Pairplot: scatter matrices
  - lmplot: plotting a univariate regression
+ Testing for interactions
+ Full code for the figures
+ Solutions to this chapter's exercises
In this document, the Python inputs are represented with the sign ">>>".
Disclaimer: Gender questions
Some of the examples of this tutorial are chosen around gender questions. The reason is that on such
questions controlling the truth of a claim actually matters to many people.
3.1.1. Data representation and interaction
3.1.1.1. Data as a table
The setting that we consider for statistical analysis is that of multiple observations or samples described
by a set of different attributes or features. The data can then be seen as a 2D table, or matrix, with
columns giving the different attributes of the data, and rows the observations. For instance, the data
contained in examples/brain_size.csv:
jeight"; "Height"; "MRI_Count"
71243 "118" ;"64.5" 7816932
245"."7"72.5"7 100112
emale";133;1
Le"; 1407150
opn143ey "73 38437
72"; "68.8"; 965353
3741327134; "2477 "65.0"; 951545,
3.1.1.2. The pandas data-frame
We will store and manipulate this data in a pandas.DataFrame, from the pandas module. It is the Python
equivalent of the spreadsheet table. It is different from a 2D numpy array as it has named columns, can contain
a mixture of different data types by column, and has elaborate selection and pivotal mechanisms.
Creating dataframes: reading data files or converting arrays
Separator
It is a CSV file, but the separator is ";"
Reading from a CSV file: Using the above CSV file that gives observations of brain size and weight and
IQ (Willerman et al. 1991), the data are a mixture of numerical and categorical values:
>>> import pandas
>>> data = pandas.read_csv('examples/brain_size.csv', sep=';', na_values=".")
>>> data
   Unnamed: 0  Gender  FSIQ  VIQ  PIQ  Weight  Height  MRI_Count
0           1  Female   133  132  124   118.0    64.5     816932
1           2    Male   140  150  124     NaN    72.5    1001121
2           3    Male   139  123  150   143.0    73.3    1038437
3           4    Male   133  129  128   172.0    68.8     965353
4           5  Female   137  132  134   147.0    65.0     951545
...
The weight of the second individual is missing in the CSV file. If we don't specify the missing value (NA
= not available) marker, we will not be able to do statistical analysis.
Creating from arrays: A pandas.DataFrame can also be seen as a dictionary of 1D 'series', e.g. arrays
or lists. If we have 3 numpy arrays:
>>> import numpy as np
>>> t = np.linspace(-6, 6, 20)
>>> sin_t = np.sin(t)
>>> cos_t = np.cos(t)
We can expose them as a pandas.DataFrame:
>>> pandas.DataFrame({'t': t, 'sin': sin_t, 'cos': cos_t})
        cos       sin         t
0  0.960170  0.279415 -6.000000
1  0.609977  0.792419 -5.368421
2  0.024451  0.999701 -4.736842
3 -0.570509  0.821291 -4.105263
4 -0.945363  0.326021 -3.473684
5 -0.955488 -0.295030 -2.842105
6 -0.596979 -0.802257 -2.210526
7 -0.008151 -0.999967 -1.578947
8  0.583822 -0.811882 -0.947368
...
Other inputs: pandas can input data from SQL, Excel files, or other formats. See the pandas
documentation.
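For instance, reading an Excel sheet follows the same pattern as read_csv; a minimal sketch, assuming a hypothetical file name:

>>> # 'examples/brain_size.xlsx' is a hypothetical file, for illustration only
>>> data_xls = pandas.read_excel('examples/brain_size.xlsx')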
Manipulating data
data is a pandas.DataFrame that resembles R's dataframe:
>>> data.shape  # 40 rows and 8 columns
(40, 8)
>>> data.columns  # It has columns
Index([u'Unnamed: 0', u'Gender', u'FSIQ', u'VIQ', u'PIQ', u'Weight',
       u'Height', u'MRI_Count'], dtype='object')
>>> print(data['Gender'])  # Columns can be addressed by name
0    Female
1      Male
2      Male
3      Male
4    Female
...
>>> # Simpler selector
>>> data[data['Gender'] == 'Female']['VIQ'].mean()
109.45
Note: For a quick view on a large dataframe, use its describe method:
pandas .DataFrame.describe()
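For instance, on the brain-size dataframe loaded above (a minimal sketch):

>>> data.describe()  # count, mean, std, min, quartiles and max of each numerical column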
groupby: splitting a dataframe on values of categorical variables:
>>> groupby_gender = data.groupby('Gender')
>>> for gender, value in groupby_gender['VIQ']:
...     print((gender, value.mean()))
('Female', 109.45)
('Male', 115.25)
groupby_gender is a powerful object that exposes many operations on the resulting group of dataframes:
>>> groupby_gender.mean()
        Unnamed: 0   FSIQ     VIQ     PIQ      Weight     Height  MRI_Count
Gender
Female       19.65  111.9  109.45  110.45  137.200000  65.765000   862654.6
Male         21.35  115.0  115.25  111.60  166.444444  71.431579   954855.4
Use tab-completion on groupby_gender to find more. Other common grouping functions are median, count
(useful for checking the amount of missing values in different subsets) or sum. Groupby evaluation is
lazy: no work is done until an aggregation function is applied.
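For instance, on the groupby_gender object defined above, a minimal sketch:

>>> groupby_gender['VIQ'].median()   # per-group medians, computed lazily at this call
>>> groupby_gender.count()           # non-null entries per column and group, revealing missing values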
Exercise

+ What is the mean value for VIQ for the full population?
+ How many males/females were included in this study?
  Hint: use 'tab completion' to find out the methods that can be called, instead of 'mean' in the above example.
+ What is the average value of MRI counts expressed in log units, for males and females?

[Figure: boxplots of the dataframe columns, split by gender]
Note: groupby_gender.boxplot is used for the plots above (see this example).
Plotting data
Pandas comes with some plotting tools (pandas.tools.plotting, using matplotlib behind the scenes) to
display statistics of the data in dataframes:
Scatter matrices:
>>> from pandas.tools import plotting
>>> plotting.scatter_matrix(data[['Weight', 'Height', 'MRI_Count']])

[Figure: scatter matrix of Weight, Height and MRI_Count]
>>> plotting.scatter_matrix(data[['PIQ', 'VIQ', 'FSIQ']])
Two populations
The IQ metrics are bimodal, as if there
are 2 sub-populations.
[Figure: scatter matrix of PIQ, VIQ and FSIQ]
Exercise
Plot the scatter matrix for males only, and for females only. Do you think that the 2 sub-populations
correspond to gender?
3.1.2. Hypothesis testing: comparing two groups
For simple statistical tests, we will use the scipy.stats sub-module of scipy:
>>> from scipy import stats
See also: Scipy is a vast library. For a quick summary of the whole library, see the scipy chapter.
3.1.2.1. Student’s t-test: the simplest statistical test
1-sample t-test: testing the value of a population mean
scipy.stats.ttest_1samp() tests if the population mean of data is likely to be equal to a given value
(technically if observations are drawn from a Gaussian distribution of given population mean). It returns
the T statistic, and the p-value (see the function's help):
>>> stats.ttest_1samp(data['VIQ'], 0)
Ttest_1sampResult(statistic=30.088099970..., pvalue=1.32891964...e-28)
With a p-value of 10^-28 we can claim that the population mean for the
IQ (VIQ measure) is not 0.
2-sample t-test: testing for difference across populations
We have seen above that the mean VIQ in the male and female populations were different. To test if this
is significant, we do a 2-sample t-test with scipy.stats.ttest_ind():
>>> female_viq = data[data['Gender'] == 'Female']['VIQ']
>>> male_viq = data[data['Gender'] == 'Male']['VIQ']
>>> stats.ttest_ind(female_viq, male_viq)
Ttest_indResult(statistic=-0.77261617232..., pvalue=0.4445287677858...)
3.1.2.2. Paired tests: repeated measurements on the same individuals
PIQ, VIQ, and FSIQ give 3 measures of IQ. Let us test if
FSIQ and PIQ are significantly different. We can use a 2-sample test:

>>> stats.ttest_ind(data['FSIQ'], data['PIQ'])
Ttest_indResult(statistic=0.46563759638..., pvalue=0.64277250...)

[Figure: boxplots of FSIQ and PIQ]

The problem with this approach is that it forgets that there
are links between observations: FSIQ and PIQ are
measured on the same individuals. Thus the variance due to inter-subject variability is confounding, and
can be removed, using a "paired test", or "repeated measures test":
>>> stats.ttest_rel(data['FSIQ'], data['PIQ'])
Ttest_relResult(statistic=1.784201940..., pvalue=0.082172638183...)
This is equivalent to a 1-sample test on the difference:

>>> stats.ttest_1samp(data['FSIQ'] - data['PIQ'], 0)
Ttest_1sampResult(statistic=1.784201940..., pvalue=0.082172638...)

[Figure: boxplot of the paired difference FSIQ - PIQ]
T-tests assume Gaussian errors. We can use a Wilcoxon signed-rank test, which relaxes this assumption:
>>> stats.wilcoxon(data['FSIQ'], data['PIQ'])
WilcoxonResult(statistic=..., pvalue=0.106594927...)
Note: The corresponding test in the non-paired case is the Mann-Whitney U test,
scipy.stats.mannwhitneyu().
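A minimal sketch of its usage on simulated samples (the arrays below are illustrative, not drawn from the brain-size data):

>>> import numpy as np
>>> from scipy import stats
>>> np.random.seed(0)
>>> group_a = np.random.normal(loc=0, size=30)   # simulated measurements, group A
>>> group_b = np.random.normal(loc=1, size=30)   # simulated measurements, group B
>>> stats.mannwhitneyu(group_a, group_b)         # rank-based test, no Gaussian assumption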
Exercise
+ Test the difference between weights in males and females.
+ Use non-parametric statistics to test the difference between VIQ in males and females.
Conclusion: we find that the data does not support the hypothesis that males and females have
different VIQ.
3.1.3. Linear models, multiple factors, and analysis of
variance
3.1.3.1. “formulas” to specify statistical models in Python
A simple linear regression
Given two sets of observations, x and y, we want to test
the hypothesis that y is a linear function of x. In other
terms:

y = x * coef + intercept + e
where e is observation noise. We will use the statsmodels module to:
1. Fit a linear model. We will use the simplest strategy, ordinary least squares (OLS).
2. Test that coef is non zero.
First, we generate simulated data according to the model:
>>> import numpy as np
>>> x = np.linspace(-5, 5, 20)
>>> np.random.seed(1)
>>> # normal distributed noise
>>> y = -5 + 3*x + 4 * np.random.normal(size=x.shape)
>>> # Create a data frame containing all the relevant variables
>>> data = pandas.DataFrame({'x': x, 'y': y})
“formulas” for statistics in Python
See the statsmodels documentation
Then we specify an OLS model and fit it:
>>> from statsmodels.formula.api import ols
>>> model = ols("y ~ x", data).fit()
We can inspect the various statistics derived from the fit:

>>> print(model.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.804
Model:                            OLS   Adj. R-squared:                  0.794
Method:                 Least Squares   F-statistic:                     74.03
Date:                             ...   Prob (F-statistic):           8.56e-08
Time:                             ...   Log-Likelihood:                -57.988
No. Observations:                  20   AIC:                             120.0
Df Residuals:                      18   BIC:                             122.0
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept     -5.5335      1.036     -5.342      0.000        -7.710    -3.357
x              2.9369      0.341      8.604      0.000         2.220     3.654
==============================================================================
Omnibus:                        0.100   Durbin-Watson:                   2.956
Prob(Omnibus):                  0.951   Jarque-Bera (JB):                0.322
Skew:                          -0.058   Prob(JB):                        0.851
Kurtosis:                       2.390   Cond. No.                         3.03
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is
correctly specified.
Terminology:
Statsmodels uses a statistical terminology: the y variable in statsmodels is called ‘endogenous’ while
the x variable is called exogenous. This is discussed in more detail here.
To simplify, y (endogenous) is the value you are trying to predict, while x (exogenous) represents the
features you are using to make the prediction.
Exercise
Retrieve the estimated parameters from the model above. Hint: use tab-completion to find the relevant
attribute.
Categorical variables: comparing groups or multiple categories
Let us go back to the data on brain size:
>>> data = pandas.read_csv('examples/brain_size.csv', sep=';', na_values=".")
We can write a comparison between IQ of males and females using a linear model:
>>> model = ols("VIQ ~ Gender + 1", data).fit()
>>> print(model.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                    VIQ   R-squared:                       0.015
Model:                            OLS   Adj. R-squared:                 -0.010
Method:                 Least Squares   F-statistic:                    0.5969
Date:                             ...   Prob (F-statistic):              0.445
Time:                             ...   Log-Likelihood:                -182.42
No. Observations:                  40   AIC:                             368.8
Df Residuals:                      38   BIC:                             372.2
Df Model:                           1
Covariance Type:            nonrobust
==================================================================================
                     coef    std err          t      P>|t|    [95.0% Conf. Int.]
----------------------------------------------------------------------------------
Intercept        109.4500      5.308     20.619      0.000      98.704   120.196
Gender[T.Male]     5.8000      7.507      0.773      0.445      -9.397    20.997
==============================================================================
Omnibus:                       26.188   Durbin-Watson:                   1.709
Prob(Omnibus):                  0.000   Jarque-Bera (JB):                3.703
Skew:                           0.010   Prob(JB):                        0.157
Kurtosis:                       1.510   Cond. No.                         2.62
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is
correctly specified.
Tips on specifying models
Forcing categorical: the 'Gender' is automatically detected as a categorical variable, and thus each of
its different values is treated as a different entity.
An integer column can be forced to be treated as categorical using:
>>> model = ols('VIQ ~ C(Gender)', data).fit()
Intercept: We can remove the intercept using - 1 in the formula, or force the use of an intercept using
+ 1.
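For instance, on the brain-size data, a minimal sketch of both variants:

>>> model_no_intercept = ols('VIQ ~ Gender - 1', data).fit()   # one coefficient per gender, no intercept
>>> model_intercept = ols('VIQ ~ Gender + 1', data).fit()      # explicit intercept (the default)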
By default, statsmodels treats a categorical variable with K possible values as K-1 'dummy' boolean
variables (the last level being absorbed into the intercept term). This is almost always a good default choice;
however, it is possible to specify different encodings for categorical variables
(http://statsmodels.sourceforge.net/devel/contrasts.html), as sketched below.
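As a minimal sketch of one such alternative encoding (assuming the default patsy formula machinery used by statsmodels), sum coding replaces the default treatment coding:

>>> # Sum (deviation) coding: coefficients are deviations from the grand mean
>>> model_sum = ols('VIQ ~ C(Gender, Sum)', data).fit()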
Link to t-tests between different FSIQ and PIQ
To compare different types of IQ, we need to create a “long-form” table, listing IQs, where the type of
IQ is indicated by a categorical variable:
>>> data_fisq = pandas.DataFrame({'iq': data['FSIQ'], 'type': 'fsiq'})
>>> data_piq = pandas.DataFrame({'iq': data['PIQ'], 'type': 'piq'})
>>> data_long = pandas.concat((data_fisq, data_piq))
>>> print(data_long)
     iq  type
0   133  fsiq
1   140  fsiq
2   139  fsiq
...
>>> model = ols("iq ~ type", data_long).fit()
>>> print(model.summary())
                            OLS Regression Results
...
                  coef    std err          t      P>|t|    [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept     113.4500      3.683     30.807      0.000     106.119   120.781
type[T.piq]    -2.4250      5.208     -0.466      0.643     -12.793     7.943
We can see that we retrieve the same values for the t-test and corresponding p-values for the effect of the
type of IQ as the previous t-test:
>>> stats.ttest_ind(data['FSIQ'], data['PIQ'])
Ttest_indResult(statistic=0.46563759638..., pvalue=0.64277250...)
3.1.3.2. Multiple Regression: including multiple factors
Consider a linear model explaining a variable
z (the dependent variable) with 2 variables x
and y:

z = x c_1 + y c_2 + i + e
Such a model can be seen in 3D as fitting a plane to a cloud of (x, y, z) points.
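A minimal sketch of fitting such a plane on simulated data (all names and coefficient values here are illustrative):

>>> import numpy as np
>>> import pandas
>>> from statsmodels.formula.api import ols
>>> np.random.seed(0)
>>> x = np.random.normal(size=50)
>>> y = np.random.normal(size=50)
>>> z = 1 + 2 * x - 0.5 * y + np.random.normal(size=50)   # plane plus noise
>>> df = pandas.DataFrame({'x': x, 'y': y, 'z': z})
>>> ols('z ~ x + y', df).fit().params                     # estimates near (1, 2, -0.5)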
Example: the iris data (examples/iris.csv)
Sepal and petal size tend to be related: bigger flowers are bigger! But is there in addition a systematic effect of
species?
[Figure: scatter matrix of sepal_length, sepal_width, petal_length and petal_width;
blue: setosa, green: versicolor, red: virginica]
>>> data = pandas.read_csv('examples/iris.csv')
>>> model = ols('sepal_width ~ name + petal_length', data).fit()
>>> print(model.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:            sepal_width   R-squared:                       0.478
Model:                            OLS   Adj. R-squared:                  0.468
Method:                 Least Squares   F-statistic:                     44.63
Date:                             ...   Prob (F-statistic):           1.58e-20
Time:                             ...   Log-Likelihood:                -38.185
No. Observations:                 150   AIC:                             84.37
Df Residuals:                     146   BIC:                             96.41
Df Model:                           3
Covariance Type:            nonrobust
======================================================================================
                         coef    std err          t      P>|t|    [95.0% Conf. Int.]
--------------------------------------------------------------------------------------
Intercept              2.9813      0.099     29.989      0.000       2.785     3.178
name[T.versicolor]    -1.4821      0.181     -8.190      0.000      -1.840    -1.124
name[T.virginica]     -1.6635      0.256     -6.502      0.000      -2.169    -1.158
petal_length           0.2983      0.062      4.920      0.000       0.178     0.418
==============================================================================
Omnibus:                        2.868   Durbin-Watson:                   1.783
Prob(Omnibus):                  0.238   Jarque-Bera (JB):                2.885
Skew:                          -0.082   Prob(JB):                        0.236
Kurtosis:                       3.659   Cond. No.                         54.0
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is
correctly specified.
3.1.3.3. Post-hoc hypothesis testing: analysis of variance (ANOVA)
In the above iris example, we wish to test if the petal length is different between versicolor and virginica,
after removing the effect of sepal width. This can be formulated as testing the difference between the
coefficient associated to versicolor and virginica in the linear model estimated above (it is an Analysis of
Variance, ANOVA). For this, we write a vector of ‘contrast’ on the parameters estimated: we want to test
“name[T. versicolor] - name[T.virginica]", with an F-test:
>>> print(model.f_test([0, 1, -1, 0]))
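The returned contrast-result object also exposes the statistic and p-value as attributes; a minimal sketch:

>>> ftest = model.f_test([0, 1, -1, 0])   # contrast: versicolor coefficient minus virginica coefficient
>>> ftest.fvalue, ftest.pvalue            # F statistic and associated p-value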
3.1.4. More visualization: seaborn for statistical exploration

Seaborn combines simple statistical fits with plotting on pandas dataframes. Let us consider a dataset
giving wages and other personal information on several hundred individuals:

>>> print(data)
     EDUCATION  SOUTH  SEX  EXPERIENCE  UNION      WAGE  AGE  RACE  \
0            8      0    1          21      0  0.707570   35     2
1            9      0    1          42      0  0.694605   57     3
2           12      0    0           1      0  0.824126   19     3
3           12      0    0           4      0  0.602060   22     3
...
3.1.4.1. Pairplot: scatter matrices
We can easily have an intuition on the interactions between continuous variables using
seaborn.pairplot() to display a scatter matrix:
>>> import seaborn
>>> seaborn.pairplot(data, vars=['WAGE', 'AGE', 'EDUCATION'], kind='reg')

[Figure: scatter matrix of WAGE, AGE and EDUCATION with regression fits]
Categorical variables can be plotted as the hue:
>>> seaborn.pairplot(data, vars=['WAGE', 'AGE', 'EDUCATION'], kind='reg', hue='SEX')

[Figure: the same scatter matrix, colored by SEX]
Look and feel and matplotlib settings
Seaborn changes the default of matplotlib figures to achieve a more "modern", "excel-like" look. It does
that upon import. You can reset the default using:
>>> from matplotlib import pyplot as plt
>>> plt.rcdefaults()
To switch back to seaborn settings, or to understand better styling in seaborn, see the relevant section of the
seaborn documentation.
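A minimal sketch of restoring the seaborn look after such a reset:

>>> import seaborn
>>> seaborn.set()   # reapply seaborn's default style to matplotlib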
3.1.4.2. lmplot: plotting a univariate regression
A regression capturing the relation between one variable and another, e.g. wage and education, can be
plotted using seaborn.lmplot():
>>> seaborn.lmplot(y='WAGE', x='EDUCATION', data=data)
[Figure: WAGE as a function of EDUCATION, with a regression fit]
Robust regression
Given that, in the above plot, there seem to be a couple of data points that are outside of the main cloud to
the right, they might be outliers, not representative of the population, but driving the regression.
To compute a regression that is less sensitive to outliers, one must use a robust model. This is done in
seaborn using robust=True in the plotting functions, or in statsmodels by replacing the use of the
OLS by a "Robust Linear Model", statsmodels.formula.api.rlm(), sketched below.
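A minimal sketch of the statsmodels variant on the wages data:

>>> from statsmodels.formula.api import rlm
>>> robust_model = rlm('WAGE ~ EDUCATION', data).fit()   # iteratively reweighted fit, downweights outliers
>>> robust_model.params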
3.1.5. Testing for interactions
[Figure: WAGE versus EDUCATION, with separate regression fits for males and females]
Do wages increase more with education for males than females?
The plot above is made of two different fits. We need to formulate a single model that tests for a variance of
slope across the two populations. This is done via an "interaction".
>>> import statsmodels.formula.api as sm
>>> result = sm.ols(formula='wage ~ education + gender + education * gender',
...                 data=data).fit()
>>> print(result.summary())
                                coef    std err          t      P>|t|    [95.0% Conf. Int.]
---------------------------------------------------------------------------------------------
Intercept                     0.2998      0.072      4.173      0.000       0.158     0.441
gender[T.male]                0.2750      0.093      2.972      0.003       0.093     0.457
education                     0.0415      0.005      7.647      0.000       0.031     0.052
education:gender[T.male]     -0.0134      0.007     -1.919      0.056      -0.027     0.000
Can we conclude that education benefits males more than females?
Take home messages

+ Hypothesis testing and p-values give you the significance of an effect / difference.
+ Formulas (with categorical variables) enable you to express rich links in your data.
+ Visualizing your data and simple model fits matters!
+ Conditioning (adding factors that can explain all or part of the variation) is an important modeling
aspect that changes the interpretation.
3.1.6. Full code for the figures
Code examples for the statistics chapter.
+ Boxplots and paired differences
+ Plotting simple quantities of a pandas dataframe
+ Analysis of Iris petal and sepal sizes
+ Simple Regression
+ Multiple Regression
+ Test for an education/gender interaction in wages
+ Visualizing factors influencing wages
+ Air fares before and after 9/11
3.1.7. Solutions to this chapter’s exercises
Relating Gender and IQ
Download all examples in Python source code: auto_examples_python.zip
Download all examples in Jupyter notebooks: auto_examples_jupyter.zip
Generated by Sphinx-Gallery