PythonForDataSciences_Week2

The document covers the second week of a Python for Data Sciences course, focusing on sequence data types, including strings, lists, tuples, arrays, dictionaries, sets, and ranges. It explains their initialization, indexing, slicing, concatenation, and multiplication operations, along with examples for each type. The content is structured into lectures, providing a comprehensive overview of how to work with various sequence data structures in Python.

Uploaded by

abiwagh
PythonForDataSciences_Week2 18/01/25, 5:11 PM

Python for Data Sciences


Week 2

Lec 7: Jupyter Setup


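In each cell, the number within the square brackets next to the 'In' label is the execution count: it shows the order in which the cell was run in the current session and increases each time any cell is executed.
To add text, change the cell type from 'code' to 'markdown'. In a markdown cell, '#' turns the text that follows into a heading.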

Lec 8: Sequence Data - Part 1

Sequence Data Types


1. Sequences allow you to store multiple values in an organised and efficient fashion
2. There are several sequence types: Strings, lists, tuples, arrays and range objects
3. Dictionaries and sets are containers for non-sequential data

Strings - Sequence of characters, written in single or double quotes: '' or ""
Tuples - Sequence of compound data. Elements cannot be changed once assigned: ()
Lists - Sequence of objects that may have different data types: []
Arrays - Sequence of objects that all share the same data type, created with the array class from the array module
Dictionary - Collection of key-value pairs, accessed by key rather than by position (insertion order is preserved since Python 3.7): {}
Sets - Unordered collection of unique elements. A list cannot be an element of a set because lists are not hashable: {}
Range - Immutable sequence of integers, mostly used for looping, created with the built-in range() function

These types let a single variable hold and handle more than one value at a time. The sequence types support
operations such as indexing, slicing, concatenation and multiplication, although not every type supports every operation.
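
The differences between these types can be checked directly. The following is a minimal sketch; the variable names are illustrative and are not part of the course notebook.

```python
# Illustrative names only; not taken from the course notebook.
tup = (1, 2, 3)
try:
    tup[0] = 99                 # tuples are immutable, so assignment fails
except TypeError as exc:
    tuple_error = str(exc)

try:
    bad_set = {[1, 2], 3}       # a list is unhashable, so it cannot be a set element
except TypeError as exc:
    set_error = str(exc)

lst = [1, 'a', 2.5]             # lists may mix data types and can be modified
lst[0] = 99

print(tuple_error)
print(set_error)
print(lst)
```

Both failures raise TypeError, while the list assignment succeeds because lists are mutable.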

Sequence Object Initialization

In [21]: strsample = 'learning' #string


print(strsample)

learning

In [22]: lstnumbers = [1,2,3,3,3,4,5]


#list with numbers having duplicate values

print(lstnumbers)

[1, 2, 3, 3, 3, 4, 5]

In [266… lstsample = [1,2, 'a', 'sam', 2]


#list with mixed data type (having numbers and string)

print(lstsample)

[1, 2, 'a', 'sam', 2]

In [7]: from array import * #importing array module

arrsample = array('i',[1,2,3,4]) #array


print(arrsample)
for x in arrsample: print(x) #printing values of array

array('i', [1, 2, 3, 4])


1
2
3
4

In [8]: tupsample = (1,2,3,4,3,'py') #tuple


print(tupsample)

(1, 2, 3, 4, 3, 'py')

In [9]: tuplesample = 1,2,'sample'


#tuple packing: assigning values of elements without parentheses

print(tuplesample)


(1, 2, 'sample')

In [20]: dictsample = {1:'first', 'second':2, 3:3, 'four':'4'} #dictionary


dictsample
#only unique keys are allowed, but duplicated values are allowed

Out[20]: {1: 'first', 'second': 2, 3: 3, 'four': '4'}

In [11]: # Creating a dictionary using the built-in dict() constructor


dict_list = dict([('first',1),('second',2),('four',4)])
dict_list

Out[11]: {'first': 1, 'second': 2, 'four': 4}

In [24]: setsample = {'example', 24, 87.5, 'data', 24, 'data'} #set


print(setsample)
set('example')

{24, 'data', 87.5, 'example'}


Out[24]: {'a', 'e', 'l', 'm', 'p', 'x'}

In [25]: rangesample = range(1,12,4)


#built-in sequence type used for looping

print(rangesample)
for x in rangesample: print(x)
#print the values of 'rangesample'

range(1, 12, 4)
1
5
9

Lec 9: Sequence Data Type - Part 2

Sequence data operations: Indexing


Indexing means accessing elements. Square brackets can be used to access the elements. There are many methods to access
elements in Python.

index() method finds the first occurrence of the specified value and returns its position

Syntax: object.index(sub[, start[, end]]), object[index]

Index of the element is used to access an element from ordered sequences


The index starts from 0
Negative indexing is used to access elements from the end of a list
In negative indexing, the last element of a list has the index '-1'
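
As a small sketch of the rules above (variable names are illustrative), a negative index -k refers to the same element as index len(seq) - k, and the optional start argument of index() lets the search skip earlier occurrences:

```python
word = 'learning'

# A negative index -k is equivalent to len(word) - k.
last_char = word[-1]
same_char = word[len(word) - 1]

# index() returns the first occurrence; 'start' moves the search window.
first_n = word.index('n')
second_n = word.index('n', first_n + 1)

print(last_char, same_char, first_n, second_n)
```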

String Indexing

In [27]: strSample = 'learning' #string


strSample.index('l')
#to find the index of substring 'l' in the string 'learning'

Out[27]: 0

In [30]: strSample.index('ning')
#to find the index of substring 'ning' from the string 'learning'

Out[30]: 4

In [35]: strSample[4]
#to find the character at the 5th position (index 4)

Out[35]: 'n'

In [34]: strSample[-3]
#to find the substring corresponding to 3rd last position

Out[34]: 'i'

In [37]: strSample[-9]
#IndexError: string index out of range


---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[37], line 1
----> 1 strSample[-9]

IndexError: string index out of range
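
The error above occurs because valid indices for a sequence of length n run from -n to n - 1; 'learning' has 8 characters, so -9 is out of range. A minimal sketch of a bounds check follows. The helper name safe_index is hypothetical and is not part of Python or of the course notebook.

```python
def safe_index(seq, i, default=None):
    """Return seq[i] if i is a valid index, otherwise return default."""
    if -len(seq) <= i < len(seq):
        return seq[i]
    return default

word = 'learning'
print(safe_index(word, -8))   # first character, the lowest valid negative index
print(safe_index(word, -9))   # out of range, so the default is returned
```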

List Indexing
Syntax: list_name.index(element, start, end)

In [38]: lstSample = [1,2, 'a', 'sam', 2] #list


lstSample.index('sam')
#to find the index of element 'sam'

Out[38]: 3

In [39]: lstSample[2]
#to find the element corresponding to the 3rd position in the list

Out[39]: 'a'

In [40]: lstSample[-1]

Out[40]: 2

Array Indexing

In [48]: from array import * #importing array module


arrSample = array('i',[1,2,3,4])
#array with integer type

for x in arrSample: print(x)

1
2
3
4

In [54]: arrSample.index(2)

Out[54]: 1

In [49]: arrSample[-2]

Out[49]: 3

Tuple Indexing

In [50]: tupSample = (1,2,3,4,3,'py') #tuple


tupSample.index('py') #to find the position of the element 'py'

Out[50]: 5

In [55]: tupSample[2] #to find the 3rd element of the 'tupSample'

Out[55]: 3

Set Indexing

In [58]: setSample = {'example', 24, 87.5, 'data', 24, 'data'} #sets


setSample[4] #TypeError: 'set' object is not subscriptable

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[58], line 2
1 setSample = {'example', 24, 87.5, 'data', 24, 'data'} #sets
----> 2 setSample[4]

TypeError: 'set' object is not subscriptable

Dictionary Indexing
The Python dictionary object provides key-value indexing. Values are looked up by key, not by position. Since Python 3.7 dictionaries preserve insertion order, but they still cannot be indexed by position.

In [59]: dictSample = {1:'first', 'second':2, 3:3, 'four':'4'} #dictionary


dictSample[2] #KeyError: 2 - there is no key 2; indexing by value is not applicable in a dictionary

---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[59], line 2
1 dictSample = {1:'first', 'second':2, 3:3, 'four':'4'} #dictionary
----> 2 dictSample[2]

KeyError: 2

In [62]: dictSample[1] #to find the value that corresponds to key 1

Out[62]: 'first'

In [63]: dictSample['second'] #to find the value that corresponds to key 'second'

Out[63]: 2
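
Because a dictionary is indexed by key, a membership test with `in` also checks keys, not values. A short sketch using the same data (the variable names are illustrative):

```python
dict_sample = {1: 'first', 'second': 2, 3: 3, 'four': '4'}

key_present = 'second' in dict_sample          # checks the keys
two_is_key = 2 in dict_sample                  # 2 is a value here, not a key
two_is_value = 2 in dict_sample.values()       # values must be searched explicitly

print(key_present, two_is_key, two_is_value)
```

Checking `key in dictionary` before indexing is one way to avoid a KeyError.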

Range Indexing

In [64]: rangeSample = range(1,12,4) #built-in sequence type used for looping


for x in rangeSample: print(x) #print the values of 'rangeSample'

1
5
9

In [65]: rangeSample.index(0) #ValueError: 0 is not in range

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[65], line 1
----> 1 rangeSample.index(0)

ValueError: 0 is not in range

In [67]: rangeSample.index(9) #to find index of element '9'

Out[67]: 2

In [68]: rangeSample[1] #given the index, returns the element at that index

Out[68]: 5

In [69]: rangeSample[9] #IndexError: range object index out of range

---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[69], line 1
----> 1 rangeSample[9]

IndexError: range object index out of range

Lec 10: Sequence Data Types - Part 3


Sequence Data Operations: Slicing
A slice object is used to slice a given sequence (string, bytes, tuple, list or range) or any object that supports the sequence protocol

The syntax of slice is: slice(stop) or slice(start,stop,step)
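
A slice object and the bracket notation start:stop:step are interchangeable. The sketch below (illustrative names) shows the equivalence, together with a negative step that reverses the sequence:

```python
text = 'learning'

s = slice(1, 4, 2)            # same as the bracket form 1:4:2
with_object = text[s]
with_brackets = text[1:4:2]

reversed_text = text[::-1]    # a step of -1 walks the string backwards
resolved = s.indices(len(text))   # concrete (start, stop, step) for this length

print(with_object, with_brackets, reversed_text, resolved)
```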

In [72]: print(strSample)
strSample[slice(1,4,2)] #getting substring 'er'

learning
Out[72]: 'er'

In [73]: strSample[:] #output: learning

Out[73]: 'learning'

In [74]: print(lstSample)
print(lstSample[:3]) #slicing starts at the beginning index of the list


print(lstSample[2:]) #slicing continues till end index of the list

[1, 2, 'a', 'sam', 2]


[1, 2, 'a']
['a', 'sam', 2]

In [75]: lstSample[2:4]

Out[75]: ['a', 'sam']

In [77]: dictSample[1:'second'] #TypeError: unhashable type: 'slice'

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[77], line 1
----> 1 dictSample[1:'second']

TypeError: unhashable type: 'slice'

In [78]: setSample[1:2] #TypeError: 'set' object is not subscriptable

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[78], line 1
----> 1 setSample[1:2]

TypeError: 'set' object is not subscriptable

In [79]: for x in arrSample: print(x)


arrSample[1:] #array('i',[2,3,4])

1
2
3
4
Out[79]: array('i', [2, 3, 4])

In [80]: arrSample[1:-1] #array('i',[2,3])

Out[80]: array('i', [2, 3])

Sequence Data Operations: Concatenation


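Syntax: '+' and '+=' concatenate two sequences of the same type. A comma (',') does not concatenate: it packs the values into a tuple, or passes them as separate arguments to a function such as print()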

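In [84]: print(strSample+' ', 'python') #the comma passes two separate arguments to print(); it does not concatenate them


print(strSample)
newString = strSample+' ', 'python' #the comma packs the two values into a tuple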
print(newString)

learning python
learning
('learning ', 'python')
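
To make the difference explicit, here is a short sketch (illustrative names) contrasting true concatenation with tuple packing, plus str.join() as an alternative for joining several strings:

```python
base = 'learning'

joined = base + ' ' + 'python'         # '+' builds one new string
packed = base + ' ', 'python'          # the comma builds a 2-element tuple
via_join = ' '.join([base, 'python'])  # join() inserts the separator between items

print(joined)
print(packed)
print(via_join)
```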

In [97]: lstSample=[1, 2, 'a', 'sam', 2]


print(lstSample)
lstSample+['py'] #[1, 2, 'a', 'sam', 2, 'py']

[1, 2, 'a', 'sam', 2]


Out[97]: [1, 2, 'a', 'sam', 2, 'py']

In [89]: print(arrSample)
arrSample+[50,60]
#TypeError: can only append array (not 'list') to array

array('i', [1, 2, 3, 4])


---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[89], line 2
1 print(arrSample)
----> 2 arrSample+[50,60]

TypeError: can only append array (not "list") to array

In [98]: arrSample+array('i',[50,60])
#array('i',[1,2,3,4,50,60])

Out[98]: array('i', [1, 2, 3, 4, 50, 60])
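
The '+' operator needs two arrays, but the extend() method of an array also accepts an ordinary iterable such as a list, provided the elements fit the type code. A minimal sketch with illustrative names:

```python
from array import array

arr = array('i', [1, 2, 3, 4])

combined = arr + array('i', [50, 60])   # '+' returns a new array; arr is unchanged
arr.extend([50, 60])                    # extend() modifies arr in place and accepts a list

print(combined)
print(arr)
```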


In [99]: tupSample += ('th','on')


print(tupSample) #(1,2,3,4,3,'py','th','on')

(1, 2, 3, 4, 3, 'py', 'th', 'on')

In [100… print(setSample)
setSample = setSample,24
#the comma packs the set and 24 into a tuple; it does not add 24 to the set
print(setSample)

{24, 'data', 87.5, 'example'}


({24, 'data', 87.5, 'example'}, 24)

Sequence Data Operations: Multiplication


Syntax: object*integer

In [101… strSample *= 3 #Concatenate thrice


print(strSample)

learninglearninglearning

In [102… lstSample*2 #[1,2,'a','sam',2,1,2,'a','sam',2]

Out[102… [1, 2, 'a', 'sam', 2, 1, 2, 'a', 'sam', 2]

In [103… lstSample[1]*2 #4

Out[103… 4

In [105… print(tupSample)
tupSample[2:4]*2
#(3,4,3,4) : Concatenate sliced tuple twice

(1, 2, 3, 4, 3, 'py', 'th', 'on')


Out[105… (3, 4, 3, 4)

In [106… arrSample*2 #array('i',[1,2,3,4,1,2,3,4])

Out[106… array('i', [1, 2, 3, 4, 1, 2, 3, 4])

In [107… rangeSample*2
#TypeError: unsupported operand type(s) for *: 'range' and 'int'

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[107], line 1
----> 1 rangeSample*2

TypeError: unsupported operand type(s) for *: 'range' and 'int'
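
A range can be repeated after converting it to a list. The sketch below (illustrative names) also shows a common pitfall: multiplying a list that contains a mutable object repeats references to the same object, not copies of it.

```python
range_sample = range(1, 12, 4)
doubled = list(range_sample) * 2      # convert first, then repeat

shared = [[0]] * 3                    # three references to the SAME inner list
shared[0].append(1)                   # the change shows up in every slot

independent = [[0] for _ in range(3)] # a comprehension creates separate inner lists
independent[0].append(1)

print(doubled)
print(shared)
print(independent)
```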

Lec 11: Sequence Data Types - Part 4


String Methods

In [108… strSample = 'learning is fun !'


print(strSample)

learning is fun !

In [109… strSample.capitalize()
#returns the string with its
#first character capitalised and the rest lowercased

Out[109… 'Learning is fun !'

In [110… strSample.casefold()
#return a casefold copy of the string
#it is intended to remove all case distinctions in a string

Out[110… 'learning is fun !'

In [111… strSample.title()
#to capitalise the first character of each word

Out[111… 'Learning Is Fun !'


In [112… strSample.swapcase()
#to swap the case of strings

Out[112… 'LEARNING IS FUN !'

In [115… strSample.find('n')
#to find the index of the given letter

Out[115… 4

In [116… strSample.count('a')
#to count total number of 'a' in the string

Out[116… 1

In [117… strSample.replace('fun','joyful')
#to replace the letters/word

Out[117… 'learning is joyful !'

In [118… strSample.isalnum()
#returns True if all characters in the string are alphanumeric
#and there is at least one character, False otherwise
#(here the spaces and '!' make the result False)

Out[118… False
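
A few more inputs make the behaviour of isalnum() clearer. This is a small sketch; the test strings other than 'learning is fun !' are illustrative.

```python
samples = ['learning is fun !', 'learning', 'week2', '']

# Map each string to the result of isalnum().
checks = {s: s.isalnum() for s in samples}

for text, result in checks.items():
    print(repr(text), result)
```

Spaces and punctuation are not alphanumeric, and the empty string returns False because it has no characters at all.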

In [119… name1 = 'GITAA'


name2 = 'Pvt'
name3 = 'Ltd'

In [122… name = '{} {}. {}.'.format(name1, name2, name3)


print(name)

GITAA Pvt. Ltd.
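
The same string can be built in other ways. The sketch below compares str.format() with positional indices and with an f-string (available from Python 3.6); which style to use is a matter of preference.

```python
name1, name2, name3 = 'GITAA', 'Pvt', 'Ltd'

with_format = '{} {}. {}.'.format(name1, name2, name3)       # placeholders filled in order
with_index = '{0} {1}. {2}.'.format(name1, name2, name3)     # explicit positions
with_fstring = f'{name1} {name2}. {name3}.'                  # expressions inline

print(with_format)
print(with_index)
print(with_fstring)
```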

The code below uses dir() to list all the attributes and methods available for a particular object:

In [123… print(dir(name))

['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
'__getattribute__', '__getitem__', '__getnewargs__', '__getstate__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find',
'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier',
'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip',
'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition',
'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
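
The names that start and end with double underscores are special methods used internally by Python. As a small sketch (the variable name public is illustrative), the listing can be filtered down to the ordinary string methods:

```python
# Keep only the names that do not start with an underscore.
public = [name for name in dir(str) if not name.startswith('_')]

print(len(public))
print(public[:5])
```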

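In [124… print(help(str)) #help() prints the documentation itself and returns None, so the outer print() adds the trailing 'None'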

Help on class str in module builtins:

class str(object)
| str(object='') -> str
| str(bytes_or_buffer[, encoding[, errors]]) -> str
|
| Create a new string object from the given object. If encoding or
| errors is specified, then the object must expose a data buffer
| that will be decoded using the given encoding and error handler.
| Otherwise, returns the result of object.__str__() (if defined)
| or repr(object).
| encoding defaults to sys.getdefaultencoding().
| errors defaults to 'strict'.
|
| Methods defined here:
|
| __add__(self, value, /)
| Return self+value.
|
| __contains__(self, key, /)
| Return key in self.
|
| __eq__(self, value, /)
| Return self==value.
|
| __format__(self, format_spec, /)


| Return a formatted version of the string as described by format_spec.


|
| __ge__(self, value, /)
| Return self>=value.
|
| __getattribute__(self, name, /)
| Return getattr(self, name).
|
| __getitem__(self, key, /)
| Return self[key].
|
| __getnewargs__(...)
|
| __gt__(self, value, /)
| Return self>value.
|
| __hash__(self, /)
| Return hash(self).
|
| __iter__(self, /)
| Implement iter(self).
|
| __le__(self, value, /)
| Return self<=value.
|
| __len__(self, /)
| Return len(self).
|
| __lt__(self, value, /)
| Return self<value.
|
| __mod__(self, value, /)
| Return self%value.
|
| __mul__(self, value, /)
| Return self*value.
|
| __ne__(self, value, /)
| Return self!=value.
|
| __repr__(self, /)
| Return repr(self).
|
| __rmod__(self, value, /)
| Return value%self.
|
| __rmul__(self, value, /)
| Return value*self.
|
| __sizeof__(self, /)
| Return the size of the string in memory, in bytes.
|
| __str__(self, /)
| Return str(self).
|
| capitalize(self, /)
| Return a capitalized version of the string.
|
| More specifically, make the first character have upper case and the rest lower
| case.
|
| casefold(self, /)
| Return a version of the string suitable for caseless comparisons.
|
| center(self, width, fillchar=' ', /)
| Return a centered string of length width.
|
| Padding is done using the specified fill character (default is a space).
|
| count(...)
| S.count(sub[, start[, end]]) -> int
|
| Return the number of non-overlapping occurrences of substring sub in
| string S[start:end]. Optional arguments start and end are
| interpreted as in slice notation.
|
| encode(self, /, encoding='utf-8', errors='strict')
| Encode the string using the codec registered for encoding.
|


| encoding
| The encoding in which to encode the string.
| errors
| The error handling scheme to use for encoding errors.
| The default is 'strict' meaning that encoding errors raise a
| UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
| 'xmlcharrefreplace' as well as any other name registered with
| codecs.register_error that can handle UnicodeEncodeErrors.
|
| endswith(...)
| S.endswith(suffix[, start[, end]]) -> bool
|
| Return True if S ends with the specified suffix, False otherwise.
| With optional start, test S beginning at that position.
| With optional end, stop comparing S at that position.
| suffix can also be a tuple of strings to try.
|
| expandtabs(self, /, tabsize=8)
| Return a copy where all tab characters are expanded using spaces.
|
| If tabsize is not given, a tab size of 8 characters is assumed.
|
| find(...)
| S.find(sub[, start[, end]]) -> int
|
| Return the lowest index in S where substring sub is found,
| such that sub is contained within S[start:end]. Optional
| arguments start and end are interpreted as in slice notation.
|
| Return -1 on failure.
|
| format(...)
| S.format(*args, **kwargs) -> str
|
| Return a formatted version of S, using substitutions from args and kwargs.
| The substitutions are identified by braces ('{' and '}').
|
| format_map(...)
| S.format_map(mapping) -> str
|
| Return a formatted version of S, using substitutions from mapping.
| The substitutions are identified by braces ('{' and '}').
|
| index(...)
| S.index(sub[, start[, end]]) -> int
|
| Return the lowest index in S where substring sub is found,
| such that sub is contained within S[start:end]. Optional
| arguments start and end are interpreted as in slice notation.
|
| Raises ValueError when the substring is not found.
|
| isalnum(self, /)
| Return True if the string is an alpha-numeric string, False otherwise.
|
| A string is alpha-numeric if all characters in the string are alpha-numeric and
| there is at least one character in the string.
|
| isalpha(self, /)
| Return True if the string is an alphabetic string, False otherwise.
|
| A string is alphabetic if all characters in the string are alphabetic and there
| is at least one character in the string.
|
| isascii(self, /)
| Return True if all characters in the string are ASCII, False otherwise.
|
| ASCII characters have code points in the range U+0000-U+007F.
| Empty string is ASCII too.
|
| isdecimal(self, /)
| Return True if the string is a decimal string, False otherwise.
|
| A string is a decimal string if all characters in the string are decimal and
| there is at least one character in the string.
|
| isdigit(self, /)
| Return True if the string is a digit string, False otherwise.
|


| A string is a digit string if all characters in the string are digits and there
| is at least one character in the string.
|
| isidentifier(self, /)
| Return True if the string is a valid Python identifier, False otherwise.
|
| Call keyword.iskeyword(s) to test whether string s is a reserved identifier,
| such as "def" or "class".
|
| islower(self, /)
| Return True if the string is a lowercase string, False otherwise.
|
| A string is lowercase if all cased characters in the string are lowercase and
| there is at least one cased character in the string.
|
| isnumeric(self, /)
| Return True if the string is a numeric string, False otherwise.
|
| A string is numeric if all characters in the string are numeric and there is at
| least one character in the string.
|
| isprintable(self, /)
| Return True if the string is printable, False otherwise.
|
| A string is printable if all of its characters are considered printable in
| repr() or if it is empty.
|
| isspace(self, /)
| Return True if the string is a whitespace string, False otherwise.
|
| A string is whitespace if all characters in the string are whitespace and there
| is at least one character in the string.
|
| istitle(self, /)
| Return True if the string is a title-cased string, False otherwise.
|
| In a title-cased string, upper- and title-case characters may only
| follow uncased characters and lowercase characters only cased ones.
|
| isupper(self, /)
| Return True if the string is an uppercase string, False otherwise.
|
| A string is uppercase if all cased characters in the string are uppercase and
| there is at least one cased character in the string.
|
| join(self, iterable, /)
| Concatenate any number of strings.
|
| The string whose method is called is inserted in between each given string.
| The result is returned as a new string.
|
| Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'
|
| ljust(self, width, fillchar=' ', /)
| Return a left-justified string of length width.
|
| Padding is done using the specified fill character (default is a space).
|
| lower(self, /)
| Return a copy of the string converted to lowercase.
|
| lstrip(self, chars=None, /)
| Return a copy of the string with leading whitespace removed.
|
| If chars is given and not None, remove characters in chars instead.
|
| partition(self, sep, /)
| Partition the string into three parts using the given separator.
|
| This will search for the separator in the string. If the separator is found,
| returns a 3-tuple containing the part before the separator, the separator
| itself, and the part after it.
|
| If the separator is not found, returns a 3-tuple containing the original string
| and two empty strings.
|
| removeprefix(self, prefix, /)
| Return a str with the given prefix string removed if present.
|


| If the string starts with the prefix string, return string[len(prefix):].


| Otherwise, return a copy of the original string.
|
| removesuffix(self, suffix, /)
| Return a str with the given suffix string removed if present.
|
| If the string ends with the suffix string and that suffix is not empty,
| return string[:-len(suffix)]. Otherwise, return a copy of the original
| string.
|
| replace(self, old, new, count=-1, /)
| Return a copy with all occurrences of substring old replaced by new.
|
| count
| Maximum number of occurrences to replace.
| -1 (the default value) means replace all occurrences.
|
| If the optional argument count is given, only the first count occurrences are
| replaced.
|
| rfind(...)
| S.rfind(sub[, start[, end]]) -> int
|
| Return the highest index in S where substring sub is found,
| such that sub is contained within S[start:end]. Optional
| arguments start and end are interpreted as in slice notation.
|
| Return -1 on failure.
|
| rindex(...)
| S.rindex(sub[, start[, end]]) -> int
|
| Return the highest index in S where substring sub is found,
| such that sub is contained within S[start:end]. Optional
| arguments start and end are interpreted as in slice notation.
|
| Raises ValueError when the substring is not found.
|
| rjust(self, width, fillchar=' ', /)
| Return a right-justified string of length width.
|
| Padding is done using the specified fill character (default is a space).
|
| rpartition(self, sep, /)
| Partition the string into three parts using the given separator.
|
| This will search for the separator in the string, starting at the end. If
| the separator is found, returns a 3-tuple containing the part before the
| separator, the separator itself, and the part after it.
|
| If the separator is not found, returns a 3-tuple containing two empty strings
| and the original string.
|
| rsplit(self, /, sep=None, maxsplit=-1)
| Return a list of the substrings in the string, using sep as the separator string.
|
| sep
| The separator used to split the string.
|
| When set to None (the default value), will split on any whitespace
| character (including \n \r \t \f and spaces) and will discard
| empty strings from the result.
| maxsplit
| Maximum number of splits (starting from the left).
| -1 (the default value) means no limit.
|
| Splitting starts at the end of the string and works to the front.
|
| rstrip(self, chars=None, /)
| Return a copy of the string with trailing whitespace removed.
|
| If chars is given and not None, remove characters in chars instead.
|
| split(self, /, sep=None, maxsplit=-1)
| Return a list of the substrings in the string, using sep as the separator string.
|
| sep
| The separator used to split the string.
|


| When set to None (the default value), will split on any whitespace
| character (including \n \r \t \f and spaces) and will discard
| empty strings from the result.
| maxsplit
| Maximum number of splits (starting from the left).
| -1 (the default value) means no limit.
|
| Note, str.split() is mainly useful for data that has been intentionally
| delimited. With natural text that includes punctuation, consider using
| the regular expression module.
|
| splitlines(self, /, keepends=False)
| Return a list of the lines in the string, breaking at line boundaries.
|
| Line breaks are not included in the resulting list unless keepends is given and
| true.
|
| startswith(...)
| S.startswith(prefix[, start[, end]]) -> bool
|
| Return True if S starts with the specified prefix, False otherwise.
| With optional start, test S beginning at that position.
| With optional end, stop comparing S at that position.
| prefix can also be a tuple of strings to try.
|
| strip(self, chars=None, /)
| Return a copy of the string with leading and trailing whitespace removed.
|
| If chars is given and not None, remove characters in chars instead.
|
| swapcase(self, /)
| Convert uppercase characters to lowercase and lowercase characters to uppercase.
|
| title(self, /)
| Return a version of the string where each word is titlecased.
|
| More specifically, words start with uppercased characters and all remaining
| cased characters have lower case.
|
| translate(self, table, /)
| Replace each character in the string using the given translation table.
|
| table
| Translation table, which must be a mapping of Unicode ordinals to
| Unicode ordinals, strings, or None.
|
| The table must implement lookup/indexing via __getitem__, for instance a
| dictionary or list. If this operation raises LookupError, the character is
| left untouched. Characters mapped to None are deleted.
|
| upper(self, /)
| Return a copy of the string converted to uppercase.
|
| zfill(self, width, /)
| Pad a numeric string with zeros on the left, to fill a field of the given width.
|
| The string is never truncated.
|
| ----------------------------------------------------------------------
| Static methods defined here:
|
| __new__(*args, **kwargs) from builtins.type
| Create and return a new object. See help(type) for accurate signature.
|
| maketrans(...)
| Return a translation table usable for str.translate().
|
| If there is only one argument, it must be a dictionary mapping Unicode
| ordinals (integers) or characters to Unicode ordinals, strings or None.
| Character keys will be then converted to ordinals.
| If there are two arguments, they must be strings of equal length, and
| in the resulting dictionary, each character in x will be mapped to the
| character at the same position in y. If there is a third argument, it
| must be a string, whose characters will be mapped to None in the result.

None

In [125… print(help(str.find)) #as above, the trailing 'None' is the return value of help()


Help on method_descriptor:

find(...)
S.find(sub[, start[, end]]) -> int

Return the lowest index in S where substring sub is found,


such that sub is contained within S[start:end]. Optional
arguments start and end are interpreted as in slice notation.

Return -1 on failure.

None

Sequence Datatype object Initializations

In [145… strSample = 'learning is fun !' #STRING

In [146… lstSample = [1,2,'a','sam',2] #List

In [128… from array import *

In [195… arrSample = array('i',[1,2,3,4]) #Array

In [147… tupSample = (1,2,3,4,3,'py') #Tuple

In [271… dictSample = {1:'first' , 'second':2, 3:3, 'four':'4'}


#Dictionary

In [268… setSample = {'example', 24, 87.5, 'data',24,'data'}


#set

keys = (1,'second', 3, 'four')


rangeSample = range(1,12,4) #built-in sequence type used for looping
for x in rangeSample: print(x)

1
5
9

len(object) returns number of elements in the object


Accepted data types are: string, list, array, tuple, dictionary, set, range

In [139… print("No. of elements in the object:")


print("string = {}, list = {}, array = {}, tuple = {}, dictionary = {}, set = {}, range = {}".format(len(strSample),
len(lstSample), len(arrSample), len(tupSample),
len(dictSample), len(setSample), len(rangeSample)))

No. of elements in the object:


string = 17, list = 5, array = 4, tuple = 6, dictionary = 4, set = 4, range = 3

In [140… lstSample.reverse()
#reverses the order of the list
print(lstSample)

[2, 'sam', 'a', 2, 1]

The clear() method removes all items from the object


Supported sequence data: list, dictionary, set

In [142… lstSample.clear()
print(lstSample)

[]

In [143… dictSample.clear()
print(dictSample)

{}

In [144… setSample.clear()
print(setSample)

set()
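
clear() empties the object in place, which is different from assigning a new empty object to the name. The sketch below (illustrative names) shows the difference when a second name refers to the same list:

```python
original = [1, 2, 'a']
alias = original          # both names refer to the same list
original.clear()          # empties that list in place, so alias sees it too

rebound = [1, 2, 'a']
other = rebound
rebound = []              # only rebinds the name; the old list is untouched

print(alias)
print(other)
```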

append() adds an element at the end of the object

Supported datatypes are: array, list. Sets do not support append(); use add() instead

In [151… arrSample.append(3) #adding an element '3' to the 'arrSample'


print(arrSample) #updated array, arrSample

array('i', [1, 2, 3, 4, 3])

In [152… print(lstSample)
lstSample.append([2,4])
#adding [2,4] list to lstSample

print(lstSample) #updated list

[1, 2, 'a', 'sam', 2]


[1, 2, 'a', 'sam', 2, [2, 4]]
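
As the output shows, append() adds the whole list [2, 4] as a single nested element. To add the individual items, extend() can be used. A short sketch with illustrative names:

```python
appended = [1, 2, 'a', 'sam', 2]
appended.append([2, 4])     # the list becomes ONE new element

extended = [1, 2, 'a', 'sam', 2]
extended.extend([2, 4])     # each item is added separately

print(appended, len(appended))
print(extended, len(extended))
```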

In [155… setSample.append(20)
#AttributeError: 'set' object has no attribute 'append'

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[155], line 1
----> 1 setSample.append(20)

AttributeError: 'set' object has no attribute 'append'

In [269… setSample.add(20)
#add() takes single parameter(element) which needs to be added in the set

print(setSample)

{'example', 20, 87.5, 24, 'data'}

update() adds the elements of one or more iterables (passed as arguments) to the set
Each argument can be a set, list, tuple, or dictionary (for a dictionary, only the keys are added)
Duplicates are dropped automatically, since a set holds only unique elements

In [156… setSample.update([5,10])
#adding a list of elements to the set

print(setSample) #updated set

{5, 'example', 10, 20, 87.5, 24, 'data'}
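
As a minimal sketch of how set.update() treats different kinds of arguments, the example below uses a small hypothetical set `s` (not the notebook's setSample). Sets print in arbitrary order, so the results are checked with comparisons rather than by eye.

```python
# Hypothetical sample set, used only for illustration
s = {1, 2}

# update() accepts several iterables in one call
s.update([2, 3], (4, 5), {"a": 10, "b": 20})

# Only the dictionary's keys are added; the duplicate 2 is ignored
print(s == {1, 2, 3, 4, 5, "a", "b"})  # True

# A string is an iterable of characters, so each character is added separately
s.update("hi")
print("h" in s and "i" in s)  # True
print("hi" in s)              # False
```

To add a whole string as one element, add() is the safer choice.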

Dictionary Methods

In [157… print(dictSample)
dictSample["five"] = 5
print(dictSample)

{1: 'first', 'second': 2, 3: 3, 'four': '4'}


{1: 'first', 'second': 2, 3: 3, 'four': '4', 'five': 5}

In [272… dictSample.update(five = 5)
#updates the dictionary with the given key/value pairs, overwriting existing keys

print(dictSample)
#updated dictionary

{1: 'first', 'second': 2, 3: 3, 'four': '4', 'five': 5}

In [160… list(dictSample)
#returns a list of all the keys used in the dictionary dictSample

Out[160… [1, 'second', 3, 'four', 'five']

In [161… len(dictSample)
#returns the number of items in the dictionary

Out[161… 5

In [162… dictSample.get("five")
#get() returns the value for a key, or None if the key is missing

Out[162… 5


In [163… dictSample.keys()
#returns a view object (dict_keys) of the keys in the dictionary

Out[163… dict_keys([1, 'second', 3, 'four', 'five'])

In [164… dictSample.items()
#returns a view object (dict_items) of (key, value) tuple pairs

Out[164… dict_items([(1, 'first'), ('second', 2), (3, 3), ('four', '4'), ('five', 5)])
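
Two details of the dictionary methods above can be sketched with a small hypothetical dictionary `d`: get() accepts an optional default for missing keys, and keys() returns a live view that reflects later changes rather than a snapshot list.

```python
# Hypothetical dictionary, used only for illustration
d = {"a": 1, "b": 2}

# get() returns None (or a supplied default) instead of raising KeyError
print(d.get("c"))     # None
print(d.get("c", 0))  # 0

# keys() returns a view that tracks the dictionary
keys = d.keys()
d["c"] = 3
print(list(keys))     # ['a', 'b', 'c']
```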

insert() - inserts the element at the specified index of the object


Syntax: object.insert(index, value) Supported datatypes: array, list

In [166… print(arrSample)

array('i', [1, 2, 3, 4, 3])

In [167… arrSample.insert(1,100)
#inserting the element '100' at 2nd position

print(arrSample) #printing array

array('i', [1, 100, 2, 3, 4, 3])

In [168… lstSample.insert(5,24)
#inserting the element '24' at 6th position

print(lstSample)
#printing list

[1, 2, 'a', 'sam', 2, 24, [2, 4]]

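pop() - Removes the element at the given index from the object and returns it
Syntax: object.pop(index)

For lists and arrays, the index defaults to -1, so pop() removes and returns the last item
Dictionaries require a key (dict.pop(key)); set.pop() takes no argument and removes an arbitrary element
Supported datatypes: array, list, set, dictionary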

In [169… arrSample.pop()
#deletes the last element and returns it

Out[169… 3

In [176… print(lstSample)
lstSample.pop(4)
#deleting the 5th element

[1, 2, 'a', 'sam', 2, [2, 4]]


Out[176… 2

In [172… print(dictSample)
dictSample.pop('second')
#deleting the key 'second'

{1: 'first', 'second': 2, 3: 3, 'four': '4', 'five': 5}


Out[172… 2

In [173… dictSample.pop(3) #deleting the key - '3'

Out[173… 3

A set is an unordered collection, so pop() removes an arbitrary element and is not usually used
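
A minimal sketch of pop() on a set and on a dictionary, using hypothetical values. Which element a set gives up is not something to rely on, so the example only checks that one element was removed.

```python
# Hypothetical set, used only for illustration
s = {10, 20, 30}
removed = s.pop()             # removes and returns an arbitrary element
print(removed in {10, 20, 30})  # True
print(len(s))                   # 2

# Popping from an empty set raises KeyError
try:
    set().pop()
except KeyError:
    print("cannot pop from an empty set")

# dict.pop() accepts a default that is returned when the key is missing
d = {"a": 1}
print(d.pop("missing", None))   # None
print(d.pop("a"))               # 1
```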

remove() - Removes the first occurrence of the element with the specified value
Syntax: object.remove(value)
Supported data types: array, list, set (dictionaries have no remove(); use pop(key) or del instead)

In [177… print(arrSample)

array('i', [1, 100, 2, 3, 4])

In [179… arrSample.remove(0)
#ValueError: array.remove(x): x not in array


---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[179], line 1
----> 1 arrSample.remove(0)

ValueError: array.remove(x): x not in array

In [180… arrSample.remove(2)
#removes the element 2 from the array

In [181… print(arrSample)

array('i', [1, 100, 3, 4])

In [182… print(lstSample)
lstSample.remove('sam')
#removes the element 'sam' from the list
print(lstSample)

[1, 2, 'a', 'sam', [2, 4]]


[1, 2, 'a', [2, 4]]

In [183… print(setSample)
setSample.remove(57)
#KeyError: 57

{5, 'example', 10, 20, 87.5, 24, 'data'}


---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[183], line 2
1 print(setSample)
----> 2 setSample.remove(57)

KeyError: 57

In [184… setSample.discard(57)
#The set remains unchanged if
#the element passed to discard() method doesnt exist

print(setSample)

{5, 'example', 10, 20, 87.5, 24, 'data'}

del - Deletes the entire object of any data type


Syntax: del object

del is a Python keyword


object name can be variables, user-defined objects, lists, items within lists, dictionaries, etc.

In [185… del setSample


#deleting the set 'setSample'
print(setSample)
#NameError: name 'setSample' is not defined

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[185], line 2
1 del setSample #deleting the set 'setSample'
----> 2 print(setSample)

NameError: name 'setSample' is not defined

In [186… del arrSample


#deleting the array 'arrSample'

print(arrSample)
#NameError: name 'arrSample' is not defined

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[186], line 2
1 del arrSample #deleting the array 'arrSample'
----> 2 print(arrSample)

NameError: name 'arrSample' is not defined

In [187… del lstSample[2]


#deleting the 3rd item


print(lstSample)

[1, 2, [2, 4]]

In [188… del lstSample[1:3]


#deleting elements from 2nd to 3rd

print(lstSample)

[1]

In [189… del lstSample[:]


#deleting all elements from the list

print(lstSample)

[]

In [191… del dictSample


#deleting the dictionary 'dictSample'

print(dictSample)
#NameError: name 'dictSample' is not defined

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[191], line 1
----> 1 del dictSample #deleting the dictionary 'dictSample'
2 print(dictSample)

NameError: name 'dictSample' is not defined
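
The difference between del and clear() can be sketched as follows, using hypothetical variable names. del removes a name, while clear() empties the object itself, which matters when two names refer to the same list.

```python
# Two names bound to the same list object (hypothetical example)
first = [1, 2, 3]
alias = first

# del removes only the name 'first'; the list survives through 'alias'
del first
print(alias)   # [1, 2, 3]

# clear() empties the object in place, so every name bound to it sees the change
other = alias
alias.clear()
print(other)   # []
```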

extend() adds the elements of the specified iterable (list, set, tuple, etc.) to the end of the current list or array

In [193… print(lstSample)
lstSample.extend([1,2,3])
print(lstSample)

[]
[1, 2, 3]

In [196… arrSample.extend((4,5,3,5))
#add a tuple to the 'arrSample' array

print(arrSample)

array('i', [1, 2, 3, 4, 4, 5, 3, 5])

In [198… arrSample.extend(['sam'])
print(arrSample)
#TypeError: 'str' object cannot be interpreted as an integer

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[198], line 1
----> 1 arrSample.extend(['sam'])
2 print(arrSample)

TypeError: 'str' object cannot be interpreted as an integer

In [199… arrSample.fromlist([3,4])
#add values from a list to an array
print(arrSample)

array('i', [1, 2, 3, 4, 4, 5, 3, 5, 3, 4])

In [200… arrSample.tolist()
#to convert an array into an ordinary list with the same items

Out[200… [1, 2, 3, 4, 4, 5, 3, 5, 3, 4]
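
As a short sketch of how extend() differs from append() on lists, using hypothetical lists: append() adds its argument as a single element, while extend() adds each element of the iterable.

```python
# Hypothetical lists, used only for illustration
appended = [1, 2]
appended.append([3, 4])   # the list [3, 4] becomes one nested element
print(appended)           # [1, 2, [3, 4]]

extended = [1, 2]
extended.extend([3, 4])   # each element is added individually
print(extended)           # [1, 2, 3, 4]

print(len(appended), len(extended))  # 3 4
```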

Set Operations
A set is an unordered collection of items
Every element is unique (no duplicates)
Sets can be used to perform mathematical set operations like union, intersection, symmetric difference, etc.

In [201… A = {'example', 24, 87.5, 'data',24,'data'}


#set of mixed data types

print(A)

{24, 'data', 87.5, 'example'}

In [203… B = {24,100}
#set of integers

print(B)

{24, 100}

In [212… print(A | B)
#Union of A and B is a set of all elements from both sets

{100, 87.5, 24, 'data', 'example'}

In [208… A.union(B)
#using the union() method of A with B as the argument

Out[208… {100, 24, 87.5, 'data', 'example'}

In [213… print(A & B)


#intersection of A and B is a set of elements
#that are common in both sets

{24}

In [214… A.intersection(B)
#using the intersection() method of A with B as the argument

Out[214… {24}
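
Difference and symmetric difference are mentioned above but not demonstrated. A minimal sketch, redefining A and B with the same values as in the notebook so the example is self-contained:

```python
A = {'example', 24, 87.5, 'data'}
B = {24, 100}

# Difference: elements of A that are not in B
print(A - B == {'example', 87.5, 'data'})            # True
print(A.difference(B) == A - B)                      # True

# Symmetric difference: elements in exactly one of the two sets
print(A ^ B == {'example', 87.5, 'data', 100})       # True
print(A.symmetric_difference(B) == A ^ B)            # True
```

Results are compared with == because the print order of a set is arbitrary.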

Lec 12: NumPy

Introduction to NumPy
NumPy is a Python package and it stands for Numerical Python
Fundamental package for numerical computations in Python
Supports N-dimensional array objects that can be used for processing multidimensional data
Supports different data types

Array
An array is a data structure that stores values of same data type
Lists can contain values corresponding to different data types
Arrays in Python can only contain values corresponding to same data type

NumPy Array
A NumPy Array is a grid of values, all of the same type, and is indexed by a tuple of non-negative integers
The number of dimensions is the rank of the array
The shape of an array is a tuple of integers giving the size of the array along each dimension

Creation of Array

In [215… my_list = [1,2,3,4,5,6]


print(my_list)

[1, 2, 3, 4, 5, 6]

To create NumPy array, we first need to import the NumPy package

In [216… import numpy as np

In [224… array = np.array(my_list, dtype = int)


print(array)

[1 2 3 4 5 6]

In [219… print(type(array)) #prints the data type of NumPy array


print(len(array)) #prints the number of elements in the array
print(array.ndim) #prints the number of dimensions of the NumPy array
print(array.shape) #prints the shape: the size of the array along each dimension


<class 'numpy.ndarray'>
6
1
(6,)

reshape() - Reshapes the NumPy array into 'i' number of rows containing 'j' number of elements
Syntax: array.reshape(i,j)

In [226… array2 = array.reshape(3,2)


#creates a new array 'array2'

print(array2)
print(array2.shape)

[[1 2]
[3 4]
[5 6]]
(3, 2)

In [228… array3 = array.reshape(3,-1)


#-1 makes NumPy infer the number of columns automatically
#from the total number of elements and the number of rows specified

print(array3)
print(array3.ndim)

[[1 2]
[3 4]
[5 6]]
2
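
A minimal sketch of how -1 behaves in reshape(), assuming NumPy is installed and using a hypothetical 6-element array. The inferred dimension must divide the total size evenly, otherwise NumPy raises a ValueError.

```python
import numpy as np

arr = np.arange(6)  # hypothetical array: 0, 1, 2, 3, 4, 5

# -1 can stand in for either dimension
print(arr.reshape(2, -1).shape)  # (2, 3)
print(arr.reshape(-1, 1).shape)  # (6, 1)

# 6 elements cannot be split evenly into 4 rows
try:
    arr.reshape(4, -1)
except ValueError as err:
    print("ValueError:", err)
```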

In [230… #Initializing NumPy arrays from nested Python lists

my_list2 = [1,2,3,4,5]
my_list3 = [2,3,4,5,6]
my_list4 = [9,7,6,8,9]

mul_arr = np.array([my_list2, my_list3, my_list4])


print(mul_arr)

print(mul_arr.shape)
print(mul_arr.ndim)

[[1 2 3 4 5]
[2 3 4 5 6]
[9 7 6 8 9]]
(3, 5)
2

In [241… print(mul_arr.reshape(1,15))

[[1 2 3 4 5 2 3 4 5 6 9 7 6 8 9]]

NumPy Attributes

In [247… a = np.array([[1,2,3],[4,5,6]])
print(a)
print(a.shape)

[[1 2 3]
[4 5 6]]
(2, 3)

In [248… #Reshaping the ndarray


a.shape = (3,2)
print(a)

[[1 2]
[3 4]
[5 6]]

In [249… #Using reshape() to change the shape of an array


b = a.reshape(3,2)
print(b)

[[1 2]
[3 4]
[5 6]]
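
The two approaches above differ in one respect worth sketching: assigning to .shape changes the array in place, while reshape() leaves the original shape alone and returns another array object. In the simple contiguous case shown here (an assumption of this example), that returned array is a view sharing the same data.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# reshape() returns a new array object; 'a' keeps its own shape
b = a.reshape(3, 2)
print(a.shape, b.shape)         # (2, 3) (3, 2)

# Here b is a view on the same data, so writing to b also changes a
b[0, 0] = 99
print(a[0, 0])                  # 99
print(np.shares_memory(a, b))   # True

# copy() gives an independent array when that is what is wanted
c = a.reshape(3, 2).copy()
c[0, 0] = 1
print(a[0, 0])                  # 99
```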


In [250… r = range(24)
print(r)
#printing a range shows the range object, not its individual values

range(0, 24)

In [252… #Creating an array of evenly spaced numbers


a = np.arange(24)
print(a)
print(a.ndim)

[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
1

In [254… #Reshaping the array 'a'


b = a.reshape(6,4,1)
print(b)

[[[ 0]
[ 1]
[ 2]
[ 3]]

[[ 4]
[ 5]
[ 6]
[ 7]]

[[ 8]
[ 9]
[10]
[11]]

[[12]
[13]
[14]
[15]]

[[16]
[17]
[18]
[19]]

[[20]
[21]
[22]
[23]]]

NumPy Arithmetic Operations

In [255… x = np.array([[1,2],[3,4]], dtype = np.float64)


y = np.array([[5,6],[7,8]], dtype = np.float64)
print(x)
print(y)

[[1. 2.]
[3. 4.]]
[[5. 6.]
[7. 8.]]

numpy.add() - Performs elementwise addition between two arrays


Syntax: numpy.add(array_1, array_2)

In [257… print(x+y)
print(np.add(x,y))

[[ 6. 8.]
[10. 12.]]
[[ 6. 8.]
[10. 12.]]

numpy.subtract() - Performs elementwise subtraction between two arrays


Syntax: numpy.subtract(array_1, array_2)

In [258… print(x - y)
print(np.subtract(x,y))


[[-4. -4.]
[-4. -4.]]
[[-4. -4.]
[-4. -4.]]

numpy.multiply() - Performs elementwise multiplication between two arrays


Syntax: numpy.multiply(array_1, array_2)

In [259… print(x * y)
print(np.multiply(x,y))

[[ 5. 12.]
[21. 32.]]
[[ 5. 12.]
[21. 32.]]

In [260… print(x.dot(y))
print(np.dot(x,y))

[[19. 22.]
[43. 50.]]
[[19. 22.]
[43. 50.]]
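
The dot() cell above computes the matrix product, which is not the same as the elementwise product given by * or numpy.multiply(). A minimal sketch, redefining x and y with the notebook's values so it runs on its own:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

# Matrix product: row i of x combined with column j of y
print(np.dot(x, y).tolist())   # [[19.0, 22.0], [43.0, 50.0]]

# The @ operator gives the same result for 2-D arrays
print(np.array_equal(x @ y, np.dot(x, y)))   # True

# Elementwise product is a different operation
print((x * y).tolist())        # [[5.0, 12.0], [21.0, 32.0]]
print(np.array_equal(x * y, x @ y))          # False
```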

In [262… print(x / y)
print(np.divide(x,y))

[[0.2 0.33333333]
[0.42857143 0.5 ]]
[[0.2 0.33333333]
[0.42857143 0.5 ]]

Function Name         Description

numpy.add             Add arguments element-wise
numpy.subtract        Subtract arguments element-wise
numpy.multiply        Multiply arguments element-wise
numpy.divide          Returns a true division of the inputs element-wise
numpy.logaddexp       Logarithm of the sum of exponentiations of the inputs
numpy.logaddexp2      Logarithm of the sum of exponentiations of the inputs in base-2
numpy.true_divide     Returns a true division of the inputs element-wise
numpy.floor_divide    Returns the largest integer less than or equal to the division of the inputs
numpy.negative        Numerical negative, element-wise
numpy.positive        Numerical positive, element-wise
numpy.power           First array elements raised to powers from second array, element-wise
numpy.remainder       Returns element-wise remainder of division
numpy.mod             Returns element-wise remainder of division
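
A few of the functions in the table that are not demonstrated in the cells above can be sketched as follows, using two small hypothetical integer arrays `p` and `q`.

```python
import numpy as np

# Hypothetical arrays, used only for illustration
p = np.array([7, 8, 9])
q = np.array([2, 3, 4])

print(np.floor_divide(p, q).tolist())  # [3, 2, 2]
print(np.remainder(p, q).tolist())     # [1, 2, 1]
print(np.power(p, q).tolist())         # [49, 512, 6561]
print(np.negative(p).tolist())         # [-7, -8, -9]

# floor_divide and remainder together reconstruct the original values
print(np.array_equal(np.floor_divide(p, q) * q + np.remainder(p, q), p))  # True
```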

numpy.sum() - Returns sum of all array elements or sum of all array elements over a given axis
Syntax: numpy.sum(array,axis)

In [263… print(np.sum(x))
#Computes overall sum (axis = None)

10.0

In [264… print(np.sum(x, axis=0))


#Computes sum of each column

[4. 6.]

In [265… print(np.sum(x, axis=1))


#Computes sum of each row

[3. 7.]
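
One way to read the axis argument: the axis you name is the one that gets collapsed. A minimal sketch on a hypothetical 2 x 3 array `m`, where the shapes make the effect easier to see than on a square array:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# axis=0 collapses the rows, leaving one sum per column
col_sums = np.sum(m, axis=0)
print(col_sums.tolist(), col_sums.shape)   # [3, 5, 7] (3,)

# axis=1 collapses the columns, leaving one sum per row
row_sums = np.sum(m, axis=1)
print(row_sums.tolist(), row_sums.shape)   # [3, 12] (2,)

# No axis: sum over every element
print(int(np.sum(m)))                      # 15
```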
