Pandas Py

The document demonstrates the use of pandas and numpy libraries in Python for data manipulation and analysis. It includes creating DataFrames, reading from and writing to CSV files, and performing basic operations like describing data, indexing, and modifying values. Additionally, it showcases handling of large datasets and provides examples of generating random data.

Uploaded by

vinaysikarwar199

In [1]: import numpy as np

import pandas as pd

In [2]: dict1 = {
"name":['harry', 'rohan','skillf','shubh'],
"marks":[92,34,24,17],
"city":['rampur','kolkata','barelly','antarctica']
}

In [3]: df = pd.DataFrame(dict1)

In [4]: df

Out[4]: name marks city

0 harry 92 rampur

1 rohan 34 kolkata

2 skillf 24 barelly

3 shubh 17 antarctica

In [5]: df.to_csv('friends.csv')

In [6]: df.to_csv('friends_index_false.csv', index = False)

In [7]: # With millions of rows, preview a few with head() or tail() instead of printing the whole frame

In [8]: df.head(2)

Out[8]: name marks city

0 harry 92 rampur

1 rohan 34 kolkata

In [9]: df.tail(2)

Out[9]: name marks city

2 skillf 24 barelly

3 shubh 17 antarctica

In [10]: df.describe()

Out[10]: marks

count 4.00000

mean 41.75000

std 34.21866

min 17.00000

25% 22.25000

50% 29.00000

75% 48.50000

max 92.00000
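
The figures in Out[10] can be reproduced by hand. The sketch below assumes only the four marks from cell 2; the variable names are illustrative. It suggests that the `std` row is the sample standard deviation (`ddof=1`), which differs from NumPy's default, and that the quartiles are linearly interpolated.

```python
import numpy as np
import pandas as pd

marks = pd.Series([92, 34, 24, 17])

mean = marks.mean()                          # 41.75
sample_std = marks.std()                     # pandas defaults to ddof=1
population_std = np.std(marks.to_numpy())    # NumPy defaults to ddof=0
quartiles = marks.quantile([0.25, 0.5, 0.75])

print(mean, round(sample_std, 5))
print(population_std)                        # smaller than the sample value
print(quartiles.tolist())                    # [22.25, 29.0, 48.5]
```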

In [11]: vinay = pd.read_csv('vinay.csv') # to read data

In [12]: vinay

Out[12]: Unnamed: 0.1 Unnamed: 0 train no. speed city

0 0 0 1521644 50 rampur

1 1 1 24165 34 kolkata

2 2 2 54876 24 barelly

3 3 3 5157 17 antarctica

In [13]: vinay['speed'][0] = 50

C:\Users\vinay\AppData\Local\Temp\ipykernel_12824\473427975.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
vinay['speed'][0] = 50

In [14]: vinay

Out[14]: Unnamed: 0.1 Unnamed: 0 train no. speed city

0 0 0 1521644 50 rampur

1 1 1 24165 34 kolkata

2 2 2 54876 24 barelly

3 3 3 5157 17 antarctica
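
The SettingWithCopyWarning in cell 13 comes from chained indexing (`vinay['speed'][0]`), which selects a column first and then assigns into that intermediate result. A minimal sketch of the single-step alternative follows. The frame is a stand-in, since vinay.csv is only known from the output above, and the starting speed of 60 is made up for illustration.

```python
import pandas as pd

# Stand-in data (hypothetical); only the column names follow the notebook.
trains = pd.DataFrame({
    "speed": [60, 34, 24, 17],
    "city": ["rampur", "kolkata", "barelly", "antarctica"],
})

# One indexing operation: row label 0, column label "speed".
trains.loc[0, "speed"] = 50

# .at is the scalar-only equivalent of .loc.
trains.at[1, "speed"] = 35

print(trains)
```

Because `.loc` addresses the row and the column in one call, pandas writes directly into `trains` and no warning is expected.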

In [15]: vinay.to_csv('vinay.csv')
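
The `Unnamed: 0` and `Unnamed: 0.1` columns in Out[12] are consistent with a frame being written with its index and read back more than once. The sketch below uses in-memory buffers instead of files and a small made-up frame; it shows the default behaviour and two ways to avoid the extra column.

```python
import io
import pandas as pd

df = pd.DataFrame({"speed": [50, 34], "city": ["rampur", "kolkata"]})

# Default: to_csv() writes the index as an unnamed first column.
buf = io.StringIO()
df.to_csv(buf)
buf.seek(0)
reread = pd.read_csv(buf)
print(list(reread.columns))       # ['Unnamed: 0', 'speed', 'city']

# Option 1: do not write the index at all.
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
clean = pd.read_csv(buf)
print(list(clean.columns))        # ['speed', 'city']

# Option 2: write the index, then read column 0 back as the index.
buf = io.StringIO()
df.to_csv(buf)
buf.seek(0)
restored = pd.read_csv(buf, index_col=0)
print(list(restored.columns))     # ['speed', 'city']
```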

In [16]: vinay.index = ['first','second','third','fourth']

In [17]: vinay

Out[17]: Unnamed: 0.1 Unnamed: 0 train no. speed city

first 0 0 1521644 50 rampur

second 1 1 24165 34 kolkata

third 2 2 54876 24 barelly

fourth 3 3 5157 17 antarctica

In [18]: ser = pd.Series(np.random.rand(34))

In [19]: type(ser)

Out[19]: pandas.core.series.Series

In [20]: newdf = pd.DataFrame(np.random.rand(334,5), index=np.arange(334))

In [21]: newdf.head()

Out[21]: 0 1 2 3 4

0 0.192439 0.483302 0.182232 0.109495 0.346556

1 0.072344 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [22]: type(newdf)

Out[22]: pandas.core.frame.DataFrame

In [23]: newdf.describe()

Out[23]: 0 1 2 3 4

count 334.000000 334.000000 334.000000 334.000000 334.000000

mean 0.511220 0.502170 0.515231 0.514727 0.501599

std 0.289863 0.280761 0.293557 0.272481 0.291661

min 0.009534 0.000230 0.009467 0.004386 0.002222

25% 0.275510 0.284415 0.254842 0.301877 0.229439

50% 0.526640 0.513685 0.526909 0.508590 0.516764

75% 0.767359 0.739335 0.781234 0.742986 0.753503

max 0.997394 0.996811 0.999285 0.998439 0.999803

In [24]: newdf.dtypes

Out[24]: 0 float64
1 float64
2 float64
3 float64
4 float64
dtype: object

In [25]: newdf[0][1] = 'vinay'

C:\Users\vinay\AppData\Local\Temp\ipykernel_12824\4287450646.py:1: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'vinay' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
newdf[0][1] = 'vinay'

In [26]: newdf.head()

Out[26]: 0 1 2 3 4

0 0.192439 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510
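
The FutureWarning in cell 25 appears because a string is written into a float64 column. Below is a hedged sketch of the explicit cast the warning asks for; the small frame and the seed are illustrative, not the notebook's data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.random((3, 2)))
print(frame.dtypes.tolist())      # both columns start as float64

# Cast the column to object first, then store the string in one .loc call.
frame[0] = frame[0].astype(object)
frame.loc[1, 0] = "vinay"

print(frame.dtypes.tolist())      # column 0 is now object, column 1 is still float64
```

An object column can hold mixed values, but it no longer behaves like a numeric column, which is why `to_numpy()` in cell 29 reports `dtype=object`.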

In [27]: newdf.index

Out[27]: Index([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
...
324, 325, 326, 327, 328, 329, 330, 331, 332, 333],
dtype='int32', length=334)

In [28]: newdf.columns

Out[28]: RangeIndex(start=0, stop=5, step=1)

In [29]: newdf.to_numpy()

Out[29]: array([[0.19243897629678863, 0.4833016054951558, 0.18223248119149482,
0.10949487441382522, 0.346555717762674],
['vinay', 0.358511057712223, 0.8361359540599419,
0.38920071958360003, 0.6622558371512339],
[0.3511260190014004, 0.4535179465768121, 0.5329625751629071,
0.8060513324243946, 0.8801421656725747],
...,
[0.2833121022519195, 0.8041833005905062, 0.30184328883816447,
0.33450997341497823, 0.09415712001759435],
[0.6543592257723887, 0.5571194761629852, 0.24589863402724477,
0.9873811670345046, 0.7192368401412679],
[0.6643166221995344, 0.725229517706132, 0.19252707794502544,
0.38162343584405134, 0.4854364965153011]], dtype=object)

In [30]: newdf[0][0]= 0.3

In [31]: newdf.head()

Out[31]: 0 1 2 3 4

0 0.3 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [32]: newdf.T

Out[32]: 0 1 2 3 4 5 6 7 8 9 ...

0 0.3 vinay 0.351126 0.808912 0.121119 0.541671 0.810778 0.013301 0.970215 0.834933 ... 0.443

1 0.483302 0.358511 0.453518 0.194086 0.840377 0.332581 0.49378 0.546343 0.357016 0.844727 ... 0.215

2 0.182232 0.836136 0.532963 0.2441 0.933503 0.743576 0.173255 0.78586 0.456049 0.842426 ... 0.821

3 0.109495 0.389201 0.806051 0.224745 0.33241 0.498823 0.027296 0.580119 0.22295 0.937127 ... 0.761

4 0.346556 0.662256 0.880142 0.603455 0.57951 0.498658 0.963489 0.033478 0.524955 0.784691 ... 0.611

5 rows × 334 columns

In [33]: newdf.head()

Out[33]: 0 1 2 3 4

0 0.3 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [34]: newdf.sort_index(axis=0, ascending=False)

Out[34]: 0 1 2 3 4

333 0.664317 0.725230 0.192527 0.381623 0.485436

332 0.654359 0.557119 0.245899 0.987381 0.719237

331 0.283312 0.804183 0.301843 0.334510 0.094157

330 0.168163 0.853079 0.751411 0.833227 0.176438

329 0.759106 0.047294 0.450999 0.568085 0.224133

... ... ... ... ... ...

4 0.121119 0.840377 0.933503 0.332410 0.579510

3 0.808912 0.194086 0.244100 0.224745 0.603455

2 0.351126 0.453518 0.532963 0.806051 0.880142

1 vinay 0.358511 0.836136 0.389201 0.662256

0 0.3 0.483302 0.182232 0.109495 0.346556

334 rows × 5 columns

In [35]: newdf.head()

Out[35]: 0 1 2 3 4

0 0.3 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [36]: type(newdf[0])

Out[36]: pandas.core.series.Series

In [37]: newdf2 = newdf # newdf2 is another name for the same DataFrame; no copy is made

In [38]: newdf2[0][0]= 5498

In [39]: newdf

Out[39]: 0 1 2 3 4

0 5498 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133

330 0.168163 0.853079 0.751411 0.833227 0.176438

331 0.283312 0.804183 0.301843 0.334510 0.094157

332 0.654359 0.557119 0.245899 0.987381 0.719237

333 0.664317 0.725230 0.192527 0.381623 0.485436

334 rows × 5 columns

In [40]: # To get an independent copy, use .copy()

In [41]: newdf2 = newdf.copy()

In [42]: newdf2[0][0] = 2

C:\Users\vinay\AppData\Local\Temp\ipykernel_12824\2252306501.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
newdf2[0][0] = 2

In [43]: newdf

Out[43]: 0 1 2 3 4

0 5498 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133

330 0.168163 0.853079 0.751411 0.833227 0.176438

331 0.283312 0.804183 0.301843 0.334510 0.094157

332 0.654359 0.557119 0.245899 0.987381 0.719237

333 0.664317 0.725230 0.192527 0.381623 0.485436

334 rows × 5 columns
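
Cells 37 to 43 contrast plain assignment with `.copy()`. The same behaviour can be checked on a small frame; the names below are illustrative.

```python
import pandas as pd

original = pd.DataFrame({"x": [1.0, 2.0, 3.0]})

# Plain assignment binds a second name to the same object.
alias = original
alias.loc[0, "x"] = 5498.0
print(alias is original)          # True
print(original.loc[0, "x"])       # 5498.0, the change is visible through both names

# .copy() creates an independent DataFrame.
clone = original.copy()
clone.loc[0, "x"] = 2.0
print(original.loc[0, "x"])       # still 5498.0
```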

In [44]: newdf.loc[0,2] = 654

In [45]: newdf.head(3)

Out[45]: 0 1 2 3 4

0 5498 0.483302 654.000000 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

In [46]: newdf.columns = list('ABCDE')

In [47]: newdf.head()

Out[47]: A B C D E

0 5498 0.483302 654.000000 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [48]: newdf.loc[0,0] = 654 # after the rename no column is labelled 0, so this adds a new column named 0

newdf

Out[48]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 vinay 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

... ... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

332 0.654359 0.557119 0.245899 0.987381 0.719237 NaN

333 0.664317 0.725230 0.192527 0.381623 0.485436 NaN

334 rows × 6 columns

In [49]: newdf.loc[1,'A'] = 654541

newdf

Out[49]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

... ... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

332 0.654359 0.557119 0.245899 0.987381 0.719237 NaN

333 0.664317 0.725230 0.192527 0.381623 0.485436 NaN

334 rows × 6 columns
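
Out[48] and Out[49] show `.loc` behaving in two ways: an existing label is overwritten, while a label that does not exist is created. A minimal sketch on a made-up frame:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.zeros((3, 2)), columns=list("AB"))

# "A" exists, so this overwrites a single cell.
frame.loc[0, "A"] = 654

# The integer label 0 is not a column here, so .loc creates it
# and fills the remaining rows with NaN.
frame.loc[0, 0] = 654

print(frame.columns.tolist())     # ['A', 'B', 0]
print(frame[0].isna().sum())      # 2

# Positional access with .iloc does not depend on the labels.
frame.iloc[0, 1] = 7
print(frame.loc[0, "B"])          # 7.0
```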

In [50]: newdf = newdf.drop(1, axis=1)

---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[50], line 1
----> 1 newdf = newdf.drop(1, axis=1)

File ~\anaconda3\Lib\site-packages\pandas\core\frame.py:5347, in DataFrame.drop(self, labels, axis, index, columns, level, inplace, errors)
5199 def drop(
5200 self,
5201 labels: IndexLabel | None = None,
(...)
5208 errors: IgnoreRaise = "raise",
5209 ) -> DataFrame | None:
5210 """
5211 Drop specified labels from rows or columns.
5212
(...)
5345 weight 1.0 0.8
5346 """
-> 5347 return super().drop(
5348 labels=labels,
5349 axis=axis,
5350 index=index,
5351 columns=columns,
5352 level=level,
5353 inplace=inplace,
5354 errors=errors,
5355 )

File ~\anaconda3\Lib\site-packages\pandas\core\generic.py:4711, in NDFrame.drop(self, labels, axis, index, columns, level, inplace, errors)
4709 for axis, labels in axes.items():
4710 if labels is not None:
-> 4711 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
4713 if inplace:
4714 self._update_inplace(obj)

File ~\anaconda3\Lib\site-packages\pandas\core\generic.py:4753, in NDFrame._drop_axis(self, labels, axis, level, errors, only_slice)
4751 new_axis = axis.drop(labels, level=level, errors=errors)
4752 else:
-> 4753 new_axis = axis.drop(labels, errors=errors)
4754 indexer = axis.get_indexer(new_axis)
4756 # Case for non-unique axis
4757 else:

File ~\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:6992, in Index.drop(self, labels, errors)
6990 if mask.any():
6991 if errors != "ignore":
-> 6992 raise KeyError(f"{labels[mask].tolist()} not found in axis")
6993 indexer = indexer[~mask]
6994 return self.delete(indexer)

KeyError: '[1] not found in axis'
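
The KeyError is raised because no column is labelled 1; the extra column created in cell 48 is labelled 0. The sketch below, on a made-up frame with the same kind of mixed labels, drops by the label that exists and shows `errors="ignore"` for labels that may be absent.

```python
import pandas as pd

frame = pd.DataFrame({"A": [1, 2], "B": [3, 4], 0: [654.0, None]})
print(frame.columns.tolist())            # ['A', 'B', 0]

# Drop by the label that actually exists.
trimmed = frame.drop(columns=[0])
print(trimmed.columns.tolist())          # ['A', 'B']

# errors="ignore" skips missing labels instead of raising KeyError.
unchanged = frame.drop(columns=[1], errors="ignore")
print(unchanged.columns.tolist())        # ['A', 'B', 0]
```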

In [51]: newdf.head()

Out[51]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

In [52]: newdf.loc[[1,2],['C','D']]

Out[52]: C D

1 0.836136 0.389201

2 0.532963 0.806051

In [53]: newdf.head()

Out[53]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

In [54]: newdf.loc[[1,2],:]

Out[54]: A B C D E 0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

In [55]: newdf.loc[:,['C','D']]

Out[55]: C D

0 654.000000 0.109495

1 0.836136 0.389201

2 0.532963 0.806051

3 0.244100 0.224745

4 0.933503 0.332410

... ... ...

329 0.450999 0.568085

330 0.751411 0.833227

331 0.301843 0.334510

332 0.245899 0.987381

333 0.192527 0.381623

334 rows × 2 columns

In [56]: newdf.loc[(newdf['A']<0.3)]

Out[56]: A B C D E 0

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

7 0.013301 0.546343 0.785860 0.580119 0.033478 NaN

10 0.135702 0.660754 0.382900 0.996195 0.144280 NaN

11 0.119561 0.370444 0.343563 0.792946 0.889031 NaN

17 0.264975 0.796818 0.150061 0.508361 0.895146 NaN

... ... ... ... ... ... ...

322 0.158642 0.768554 0.455983 0.236494 0.321771 NaN

323 0.024951 0.461243 0.380886 0.816249 0.067329 NaN

327 0.2804 0.002557 0.094892 0.759649 0.311843 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

92 rows × 6 columns

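In [57]: newdf.loc[(newdf['A']<0.3) & newdf['C']>0.1] # newdf['C']>0.1 is missing its parentheses: & binds tighter than >, so the C filter is not applied and the result matches Out[56]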

Out[57]: A B C D E 0

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

7 0.013301 0.546343 0.785860 0.580119 0.033478 NaN

10 0.135702 0.660754 0.382900 0.996195 0.144280 NaN

11 0.119561 0.370444 0.343563 0.792946 0.889031 NaN

17 0.264975 0.796818 0.150061 0.508361 0.895146 NaN

... ... ... ... ... ... ...

322 0.158642 0.768554 0.455983 0.236494 0.321771 NaN

323 0.024951 0.461243 0.380886 0.816249 0.067329 NaN

327 0.2804 0.002557 0.094892 0.759649 0.311843 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

92 rows × 6 columns
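
Out[57] lists the same 92 rows as Out[56], including row 327 whose C value is below 0.1, so the second condition had no effect. In Python, `&` binds more tightly than comparison operators, so each comparison needs its own parentheses. A minimal sketch on made-up data:

```python
import pandas as pd

frame = pd.DataFrame({
    "A": [0.10, 0.20, 0.50, 0.25],
    "C": [0.05, 0.90, 0.90, 0.30],
})

# Each comparison is parenthesised before the masks are combined with &.
both = frame.loc[(frame["A"] < 0.3) & (frame["C"] > 0.1)]
print(both.index.tolist())        # [1, 3]

# Without the second pair of parentheses Python parses the expression as
# ((frame["A"] < 0.3) & frame["C"]) > 0.1, which is not the intended filter.
```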

In [58]: newdf.head(2)

Out[58]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

In [59]: newdf.iloc[0,4]

Out[59]: 0.346555717762674

In [60]: newdf.iloc[[0,5],[1,2]]

Out[60]: B C

0 0.483302 654.000000

5 0.332581 0.743576

In [61]: newdf.head(3)

Out[61]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

In [62]: newdf.drop([0])

Out[62]: A B C D E 0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

5 0.541671 0.332581 0.743576 0.498823 0.498658 NaN

... ... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

332 0.654359 0.557119 0.245899 0.987381 0.719237 NaN

333 0.664317 0.725230 0.192527 0.381623 0.485436 NaN

333 rows × 6 columns

In [63]: newdf.head(2)

Out[63]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

In [64]: newdf.iloc[0,4]

Out[64]: 0.346555717762674

In [65]: newdf.iloc[[0,1],[1,2]]

Out[65]: B C

0 0.483302 654.000000

1 0.358511 0.836136

In [66]: newdf.head(3)

Out[66]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

In [67]: newdf.drop([0])

Out[67]: A B C D E 0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

5 0.541671 0.332581 0.743576 0.498823 0.498658 NaN

... ... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

332 0.654359 0.557119 0.245899 0.987381 0.719237 NaN

333 0.664317 0.725230 0.192527 0.381623 0.485436 NaN

333 rows × 6 columns

In [69]: newdf.drop(['A','C'],axis=1) # newdf is not affected

Out[69]: B D E 0

0 0.483302 0.109495 0.346556 654.0

1 0.358511 0.389201 0.662256 NaN

2 0.453518 0.806051 0.880142 NaN

3 0.194086 0.224745 0.603455 NaN

4 0.840377 0.332410 0.579510 NaN

... ... ... ... ...

329 0.047294 0.568085 0.224133 NaN

330 0.853079 0.833227 0.176438 NaN

331 0.804183 0.334510 0.094157 NaN

332 0.557119 0.987381 0.719237 NaN

333 0.725230 0.381623 0.485436 NaN

334 rows × 4 columns

In [74]: newdf.drop([1,5], axis=0, inplace= True) # inplace=True removes the rows from newdf itself

# With inplace=True, drop() returns None instead of a new DataFrame

In [75]: newdf.head(3)

Out[75]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

2 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

3 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

In [76]: newdf.reset_index(drop=True, inplace=True)

In [77]: newdf.head(3)

Out[77]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

2 0.121119 0.840377 0.933503 0.332410 0.579510 NaN
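
After rows are dropped, index labels and row positions no longer line up until `reset_index` renumbers them. The sketch below, on a small hypothetical frame, shows how `.loc` (labels) and `.iloc` (positions) differ in between.

```python
import pandas as pd

frame = pd.DataFrame({"value": [10, 11, 12, 13]})

kept = frame.drop([1])            # returns a new frame; frame itself is unchanged
print(kept.index.tolist())        # [0, 2, 3]

# .loc uses labels, .iloc uses positions; after the drop they disagree.
print(kept.loc[2, "value"])       # 12 (row labelled 2)
print(kept.iloc[1, 0])            # 12 (second row by position)

renumbered = kept.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1, 2]
```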

In [78]: newdf.loc[:, ['B']]= 5

In [80]: newdf.head()

Out[80]: A B C D E 0

0 5498 5.0 654.000000 0.109495 0.346556 654.0

1 0.808912 5.0 0.244100 0.224745 0.603455 NaN

2 0.121119 5.0 0.933503 0.332410 0.579510 NaN

3 0.810778 5.0 0.173255 0.027296 0.963489 NaN

4 0.970215 5.0 0.456049 0.222950 0.524955 NaN

NUMPY
In [81]: import numpy as np

In [91]: myarr = np.array([[14,6,32,7]], np.int8)

myarr

# The second argument sets the dtype: np.int8, np.int16, np.int32 or np.int64 store each element in 8, 16, 32 or 64 bits

Out[91]: array([[14, 6, 32, 7]], dtype=int8)

In [92]: myarr.shape

Out[92]: (1, 4)

In [93]: myarr.dtype

Out[93]: dtype('int8')

In [94]: myarr[0,1]

Out[94]: 6

In [95]: myarr[0,1] =45

myarr

Out[95]: array([[14, 45, 32, 7]], dtype=int8)
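
The dtype chosen in cell 91 fixes both the range of values each element can hold and the memory used per element. A short sketch using `np.iinfo`; the array values follow the notebook, the variable names are illustrative.

```python
import numpy as np

info = np.iinfo(np.int8)
print(info.min, info.max)             # -128 127

small = np.array([14, 45, 32, 7], dtype=np.int8)
wide = small.astype(np.int64)         # same values, 8 bytes per element

print(small.itemsize, small.nbytes)   # 1 4
print(wide.itemsize, wide.nbytes)     # 8 32
```

A narrow dtype saves memory, at the cost that values outside the range reported by `np.iinfo` cannot be stored.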

Array creation: Conversion from other Python structures

In [96]: listarry = np.array([[1,2,3],[8,6,4],[2,6,7]])

In [97]: listarry

Out[97]: array([[1, 2, 3],
[8, 6, 4],
[2, 6, 7]])

In [99]: listarry.shape

Out[99]: (3, 3)

In [100]: listarry.size

Out[100]: 9

In [102]: zeros = np.zeros((2,5))

In [103]: zeros

Out[103]: array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]])

In [105]: rng = np.arange(15)

rng

Out[105]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

In [109]: ispace = np.linspace(1,5,9)

ispace

Out[109]: array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ])
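
`np.arange` and `np.linspace` both produce evenly spaced values but are parameterised differently: `arange` takes a step and excludes the stop value, while `linspace` takes a number of points and includes both endpoints by default. A small sketch reproducing the array above both ways:

```python
import numpy as np

# linspace: fixed number of points; retstep=True also returns the spacing.
points, step = np.linspace(1, 5, 9, retstep=True)
print(step)                           # 0.5

# arange: fixed step; the stop value itself is excluded, so stop is 5.5 here.
stepped = np.arange(1, 5.5, 0.5)

print(np.allclose(points, stepped))   # True
```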

In [112]: emp = np.empty((4,6)) # allocates memory without initialising it, so the values shown are arbitrary

emp

Out[112]: array([[6.23042070e-307, 1.86918699e-306, 1.69121096e-306,
1.33511562e-306, 7.56587585e-307, 1.12503450e-311],
[0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000],
[0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
nan, 0.00000000e+000, 0.00000000e+000],
[0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
0.00000000e+000, 8.34451715e-308, 2.22507386e-306]])

In [114]: emp_like = np.empty_like(ispace) # same shape and dtype as ispace, also uninitialised

emp_like

Out[114]: array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ])
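
The values printed for `emp` and `emp_like` are whatever happened to be in memory, since `np.empty` and `np.empty_like` do not initialise their result. A hedged sketch of the usual pattern, with illustrative names: fill the buffer before reading it, or use an initialising constructor when defined starting values matter.

```python
import numpy as np

template = np.linspace(1, 5, 9)

# Only allocates; treat the contents as undefined until they are written.
scratch = np.empty_like(template)
scratch[:] = template * 2             # overwrite every element before reading

# Initialising alternatives with the same shape and dtype.
zeros = np.zeros_like(template)
sevens = np.full_like(template, 7.0)

print(scratch)
print(zeros.sum(), sevens.sum())      # 0.0 63.0
```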

In [116]: ide = np.identity(45)

ide

Out[116]: array([[1., 0., 0., ..., 0., 0., 0.],
[0., 1., 0., ..., 0., 0., 0.],
[0., 0., 1., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 1., 0., 0.],
[0., 0., 0., ..., 0., 1., 0.],
[0., 0., 0., ..., 0., 0., 1.]])

In [117]: ide.shape

Out[117]: (45, 45)

In [119]: arr = np.arange(99)

arr

Out[119]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98])

In [120]: arr.reshape(3,33)

Out[120]: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32],
[33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64,
65],
[66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81,
82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
98]])

In [121]: arr.reshape(3,31)

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[121], line 1
----> 1 arr.reshape(3,31)

ValueError: cannot reshape array of size 99 into shape (3,31)
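
The reshape fails because 3 × 31 = 93, which does not equal the 99 elements in `arr`. A short sketch of the rule, plus the `-1` placeholder that lets NumPy infer one dimension:

```python
import numpy as np

arr = np.arange(99)

# -1 asks NumPy to work out that dimension from the total size.
three_rows = arr.reshape(3, -1)
print(three_rows.shape)               # (3, 33)

# A target shape is valid only if its dimensions multiply to arr.size.
print(arr.size == 3 * 33)             # True
print(arr.size == 3 * 31)             # False, hence the ValueError

try:
    arr.reshape(3, 31)
except ValueError as exc:
    print("reshape failed:", exc)
```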

In [122]: arr.ravel()

Out[122]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98])

In [123]: x = [[1,2,3],[4,5,6],[7,1,0]]

In [126]: ar = np.array(x)
ar

Out[126]: array([[1, 2, 3],
[4, 5, 6],
[7, 1, 0]])

In [127]: ar.sum(axis=0)

Out[127]: array([12, 8, 9])

In [128]: ar.sum(axis=1)

Out[128]: array([ 6, 15, 8])
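
The `axis` argument names the axis that is collapsed: `axis=0` sums down the rows to give one total per column, and `axis=1` sums across the columns to give one total per row. The sketch below repeats the two sums and adds `keepdims=True`, which is an extra not used in the notebook.

```python
import numpy as np

ar = np.array([[1, 2, 3], [4, 5, 6], [7, 1, 0]])

col_totals = ar.sum(axis=0)           # collapse rows: one total per column
row_totals = ar.sum(axis=1)           # collapse columns: one total per row
print(col_totals)                     # [12  8  9]
print(row_totals)                     # [ 6 15  8]

# keepdims=True keeps the collapsed axis with length 1, so the result
# broadcasts against ar, for example to compute each row's shares.
row_share = ar / ar.sum(axis=1, keepdims=True)
print(row_share.sum(axis=1))          # every row sums to 1 (up to rounding)
```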

In [130]: ar.T

Out[130]: array([[1, 4, 7],
[2, 5, 1],
[3, 6, 0]])

In [131]: ar.flat

Out[131]: <numpy.flatiter at 0x21230427be0>

In [132]: for item in ar.flat:
    print(item)

1
2
3
4
5
6
7
1
0

In [134]: ar.ndim # No. of dimensions

Out[134]: 2

In [135]: ar.size

Out[135]: 9

In [136]: ar.nbytes

Out[136]: 36

In [137]: one = np.array([1,3,4,634,2])

In [140]: one.argmax() # Returns index

Out[140]: 3

In [142]: one.argmin()

Out[142]: 0

In [143]: one.argsort()

Out[143]: array([0, 4, 1, 2, 3], dtype=int64)

In [144]: ar

Out[144]: array([[1, 2, 3],
[4, 5, 6],
[7, 1, 0]])

In [146]: ar.argmin()

Out[146]: 8

In [147]: ar.argmax(axis=0)

Out[147]: array([2, 1, 1], dtype=int64)

In [148]: ar.argmax(axis=1)

Out[148]: array([2, 2, 0], dtype=int64)

In [149]: ar.argsort(axis=0)

Out[149]: array([[0, 2, 2],
[1, 0, 0],
[2, 1, 1]], dtype=int64)
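
The `arg*` methods return positions rather than values. Two common follow-ups are sketched below: indexing with the result of `argsort` to obtain the sorted array, and converting the flat position from `argmin` into a (row, column) pair with `np.unravel_index`.

```python
import numpy as np

one = np.array([1, 3, 4, 634, 2])
order = one.argsort()                 # positions that would sort the array
print(one[order])                     # [  1   2   3   4 634]

ar = np.array([[1, 2, 3], [4, 5, 6], [7, 1, 0]])
flat_pos = ar.argmin()                # position in the flattened array
row, col = np.unravel_index(flat_pos, ar.shape)
print(flat_pos, row, col)             # 8 2 2
```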

In [150]: ar.ravel()

Out[150]: array([1, 2, 3, 4, 5, 6, 7, 1, 0])

In [151]: ar.reshape((9,1))

Out[151]: array([[1],
[2],
[3],
[4],
[5],
[6],
[7],
[1],
[0]])

In [152]: ar

Out[152]: array([[1, 2, 3],
[4, 5, 6],
[7, 1, 0]])

In [157]: ar2 = np.array([[1,2,1],[8,5,12],[4,0,6]])

ar2

Out[157]: array([[ 1, 2, 1],
[ 8, 5, 12],
[ 4, 0, 6]])

In [156]: ar + ar2

Out[156]: array([[ 2, 4, 4],
[12, 10, 18],
[11, 1, 6]])

In [158]: ar * ar2

Out[158]: array([[ 1, 4, 3],
[32, 25, 72],
[28, 0, 0]])
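
Note that `ar * ar2` multiplies element by element. The matrix product is a different operation, written with the `@` operator; the sketch below contrasts the two on the same arrays.

```python
import numpy as np

ar = np.array([[1, 2, 3], [4, 5, 6], [7, 1, 0]])
ar2 = np.array([[1, 2, 1], [8, 5, 12], [4, 0, 6]])

elementwise = ar * ar2                # multiplies matching positions
matrix_product = ar @ ar2             # rows of ar times columns of ar2

print(elementwise)
print(matrix_product)
```

For example, the top-left entry of the matrix product is 1·1 + 2·8 + 3·4 = 29, whereas the element-wise result there is simply 1·1 = 1.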

In [159]: np.sqrt(ar)

Out[159]: array([[1. , 1.41421356, 1.73205081],
[2. , 2.23606798, 2.44948974],
[2.64575131, 1. , 0. ]])

In [160]: ar.sum()

Out[160]: 29

In [161]: ar.max()

Out[161]: 7

In [162]: ar.min()

Out[162]: 0

In [163]: ar

Out[163]: array([[1, 2, 3],
[4, 5, 6],
[7, 1, 0]])

In [164]: np.where(ar>5)

Out[164]: (array([1, 2], dtype=int64), array([2, 0], dtype=int64))
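
`np.where(ar>5)` returns one index array per dimension, here rows `[1, 2]` and columns `[2, 0]`, meaning the matches sit at (1, 2) and (2, 0). A short sketch of how that result can be used, together with the three-argument form of `np.where`:

```python
import numpy as np

ar = np.array([[1, 2, 3], [4, 5, 6], [7, 1, 0]])

rows, cols = np.where(ar > 5)
print(ar[rows, cols])                 # [6 7]
print(ar[ar > 5])                     # same values via a boolean mask

# Three-argument form: keep values where the condition holds, else use 0.
masked = np.where(ar > 5, ar, 0)
print(masked)
```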

In [165]: np.count_nonzero(ar)

Out[165]: 8

In [166]: np.nonzero(ar)

Out[166]: (array([0, 0, 0, 1, 1, 1, 2, 2], dtype=int64),
array([0, 1, 2, 0, 1, 2, 0, 1], dtype=int64))

In [167]: ar[1,2] = 0

In [168]: np.nonzero(ar)

Out[168]: (array([0, 0, 0, 1, 1, 2, 2], dtype=int64),
array([0, 1, 2, 0, 1, 0, 1], dtype=int64))

In [169]: import sys

In [170]: py_ar = [0,4,55,2]

In [171]: np_ar = np.array(py_ar)

In [172]: sys.getsizeof(1)*len(py_ar)

Out[172]: 112

In [174]: np_ar.itemsize * np_ar.size

Out[174]: 16

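The two results above show that the NumPy array stores the same four integers in much less memory: 16 bytes of array data versus 112 bytes for four Python int objects.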
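
A slightly fuller version of the comparison is sketched below. It assumes CPython, where `sys.getsizeof` reports per-object sizes, and it fixes the dtype to `np.int32` so the array figure does not depend on the platform default. The list figure counts the list container plus one int object per element and will vary between Python versions.

```python
import sys
import numpy as np

py_ar = [0, 4, 55, 2]
np_ar = np.array(py_ar, dtype=np.int32)   # dtype fixed here for a reproducible size

# List: the container itself plus one int object per element.
list_bytes = sys.getsizeof(py_ar) + sum(sys.getsizeof(v) for v in py_ar)

# Array: raw data buffer only (sys.getsizeof(np_ar) would add the array header).
data_bytes = np_ar.nbytes

print(list_bytes, data_bytes)
```

The gap grows with the number of elements, because each additional list entry costs a pointer plus an int object, while each additional array entry costs only `itemsize` bytes.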
