1 Introduction Recalling List, Dictionary and Regex.ipynb
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "dWSm_5DRBb3U"
},
"source": [
"<h3>Hi, Welcome to the <font color='green'><b>\"Python For Data Science\"</b></font> <u>Course</u>. </h3>\n",
"\n",
"Before moving ahead:\n",
"1. Make sure you recall these **data structures** in Python: **list and dictionary**.\n",
"2. Basic **Regex** (regular expressions) for daily business task automation.\n",
"3. Loops, specifically the ***for*** loop.\n",
"\n",
"We covered all of this and much more in <font color='green'><b>\"Python Core Programming Fundamentals\"</b></font>. This notebook will help you quickly **`recall`** all of the <u>above</u> **`prerequisite concepts`**.\n",
"\n",
"<img src=\"https://drive.google.com/uc?
id=1YY7xshR1kA7OaUgWUudlGPmGWtDiujX0\" />"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "nkbLYjnz6ukT"
},
"source": [
"The **entire course** uses a **\"notebook\" coding environment**. In case you are unfamiliar with notebooks, we have a [90-second intro video](https://www.youtube.com/watch?v=4C2qMnaIKL4)."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "0UNZWjs5HfKH"
},
"source": [
"The two most used Python data structures: **List & Dictionary**\n",
"--\n",
"<hr>\n",
"<h3><b>A. <u>List</u> data Structure</b></h3>\n",
"\n",
"Lists are the most versatile of Python's compound data types. A list contains items separated by commas and enclosed within square brackets (**[ ]**). *To some extent, lists are similar to arrays in C*. One difference is that the items in a single list can be of different data types.\n",
"\n",
"A single item is read with an index (**[i]**), and a range of items with the slice operator (**[start : stop]**). Indexes start at 0 at the beginning of the list, and a slice runs from `start` up to `stop-1`. The plus (+) sign is the list concatenation operator, and the asterisk (*) is the repetition operator. \n",
"\n",
"**code examples −**\n"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 118
},
"colab_type": "code",
"id": "NgjCBzaQIvDz",
"outputId": "30d038f4-7759-42eb-f09c-1b82628be3da"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['scikit', 123, 2.23, 'suven', (72+3j)]\n",
"scikit\n",
"[123, 2.23]\n",
"[2.23, 'suven', (72+3j)]\n",
"[123, 'Technology', 123, 'Technology']\n",
"['scikit', 123, 2.23, 'suven', (72+3j), 123, 'Technology']\n"
]
}
],
"source": [
"#list can hold different types of data values\n",
"list = [ 'scikit', 123 , 2.23, 'suven', 72+3j ]\n",
"tinylist = [123, 'Technology']\n",
"\n",
"print(list) \n",
"print(list[0]) \n",
"print(list[1:3]) # from index 1 to (3-1), i.e 2 \n",
"print(list[2:]) # from index 2 onwards\n",
"print(tinylist * 2) # Prints list two times\n",
"print(list + tinylist) # Prints concatenated lists"
]
},
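{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"A small sketch of slicing beyond the basic `[start : stop]` form, using negative indexes and a step value. The name `items` is only illustrative and is not part of the course material.

```python
# Illustrative list (hypothetical name); one list may mix types.
items = ['scikit', 123, 2.23, 'suven', 72 + 3j]

first_two = items[:2]        # indexes 0 and 1; the stop index 2 is excluded
last_one = items[-1]         # negative indexes count from the end
every_other = items[::2]     # a third slice value sets the step
reversed_copy = items[::-1]  # step -1 builds a reversed copy; items is unchanged

print(first_two)
print(last_one)
print(every_other)
print(reversed_copy)
```

Slicing always builds a new list, so none of these lines modify `items`."
]
},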
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "ubjPUJJ_PEvk"
},
"source": [
"<font color='green'><b>Few important <u>List functions</u></b></font>"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 101
},
"colab_type": "code",
"id": "QNX968kMI6kd",
"outputId": "ae4af879-d699-45fc-f137-008e29b98339"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"stuff[1] -> Java\n",
"stuff[-1] -> JavaScript\n",
"stuff.pop() -> JavaScript\n",
"Join all elements of stuff with space in between -> Python Java SQL R PHP\n",
"Join elements at index 3 and 4 with # in between -> R#PHP\n"
]
}
],
"source": [
"#--applying a few functions on the list --\n",
"few_things = \"Python Java SQL R PHP JavaScript\"\n",
"\n",
"stuff = few_things.split(' ') # split() breaks the string into words (tokens);\n",
" # here it splits on a space.\n",
"\n",
"print(\"stuff[1] -> \", stuff[1])\n",
"print(\"stuff[-1] -> \",stuff[-1]) # negative indexes count from the right, so -1 is the last element\n",
"\n",
"print(\"stuff.pop() -> \" ,stuff.pop()) # pops (removes and returns) the last element\n",
"\n",
"print(\"Join all elements of stuff with space in between -> \", ' '.join(stuff)) # using space as a separator\n",
"print(\"Join elements at index 3 and 4 with # in between -> \", '#'.join(stuff[3:5])) # using hash as a separator"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "GHDleLL8NX-L"
},
"source": [
"**See the difference between append() and extend() method**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 84
},
"colab_type": "code",
"id": "d3bPJZ5wMAAY",
"outputId": "7da11d8e-7c90-4fe8-9d8c-3ff9cab79661"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"after appending : ['sql', 'Java', 'Python', ['hackerrank', 'Challenge']]\
n",
"-------------\n",
"after extending : ['sql', 'Java', 'Python', 'hackerrank', 'Challenge']\n",
"-------------\n"
]
}
],
"source": [
"#-- append() appends object to the list \n",
"list1 = ['sql', 'Java', 'Python']\n",
"list1.append(['hackerrank', 'Challenge'])\n",
"print(\"after appending : \", list1)\n",
"print(\"-------------\")\n",
"\n",
"\n",
"#-- extend() adds each element of the iterable to the list one at a time. \n",
"list2 = ['sql', 'Java', 'Python']\n",
"list2.extend(['hackerrank', 'Challenge'])\n",
"print(\"after extending : \", list2)\n",
"print(\"-------------\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "IL40AVFqNpvQ",
"outputId": "d770a1ee-2265-494a-b8af-d4fd66481f3e"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Count of Java : 1\n"
]
}
],
"source": [
"#-- count() counts object in the list \n",
"list2 = ['Java Script', 'Java', 'Python']\n",
"print(\"Count of Java : \", list2.count('Java')) # Note \"Java\" is being
searched as an element , not part of the element.\n",
" # hence \"Java\" won't match
with \"Java Script\""
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "uz0GUmzfRmaY",
"outputId": "20b3b1e7-9b97-4479-8772-0e7e291c8ae8"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Final list : ['physics', 'Biology', 'chemistry', 'maths']\n"
]
}
],
"source": [
"#--insert() method inserts object obj into list at offset index.\n",
"list3 = ['physics', 'chemistry', 'maths']\n",
"list3.insert(1, 'Biology')\n",
"print ('Final list : ', list3)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 50
},
"colab_type": "code",
"id": "TvnmqFgGR4Df",
"outputId": "096cb071-c165-4f68-d38e-c65802ed03a2"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"pop once : ['physics', 'maths']\n",
"pop again : ['physics']\n"
]
}
],
"source": [
"#--pop() removes and returns the last item; pop(i) removes and returns the item at index i\n",
"list4 = ['physics', 'Biology', 'maths']\n",
"list4.pop(1)\n",
"print ('pop once : ', list4)\n",
"list4.pop()\n",
"print ('pop again : ', list4)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "W4wcLFaMSI1g",
"outputId": "46e59386-b093-42e6-92df-980cc5f0aaef"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"after reversing : ['maths', 'chemistry', 'physics']\n"
]
}
],
"source": [
"#--reverse() the list\n",
"list5 = ['physics', 'chemistry', 'maths']\n",
"list5.reverse()\n",
"print(\"after reversing : \", list5)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "hz3n6VIZSMv4",
"outputId": "6087ea6f-0acd-447d-f123-8a40d5225802"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"after sorting : ['chemistry', 'maths', 'physics']\n"
]
}
],
"source": [
"#--sort() the list\n",
"list6 = ['physics', 'chemistry', 'maths']\n",
"list6.sort()\n",
"print(\"after sorting : \", list6)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "0BhqGUy5STVQ",
"outputId": "448a645c-4bde-4c65-ffd4-372b871d28d0"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"after reverse sorting : ['physics', 'maths', 'chemistry']\n"
]
}
],
"source": [
"# -- for reverse sorting\n",
"# -- Setting reverse=True sorts the list in the descending order.\n",
"list7 = ['physics', 'chemistry', 'maths']\n",
"list7.sort(reverse=True)\n",
"print(\"after reverse sorting : \", list7) "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "fSicCVyYS0Bs",
"outputId": "f4cc8e06-310c-41d7-f394-6728a33c2d2f"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Sorted list: [(4, 1), (2, 2), (1, 3), (3, 4)]\n"
]
}
],
"source": [
"# -- How to sort using your own function with key parameter?\n",
"# -- If you want your own implementation for sorting, sort() also accepts a
key function as an optional parameter.\n",
"# -- Based on the results of the key function, you can sort the given list.\
n",
"\n",
"# take second element for sort\n",
"def takeSecond(elem):\n",
" return elem[1]\n",
"\n",
"# a list of tuples\n",
"random = [(2, 2), (3, 4), (4, 1), (1, 3)]\n",
"\n",
"# sort list with key\n",
"random.sort(key=takeSecond)\n",
"\n",
"# print list\n",
"print('Sorted list:', random)"
]
},
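{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"`sort()` reorders the list in place. As a hedged sketch, the built-in `sorted()` accepts the same `key` and `reverse` arguments but returns a new list and leaves the original alone. The name `pairs` and the lambda are illustrative choices, not part of the course code.

```python
pairs = [(2, 2), (3, 4), (4, 1), (1, 3)]

# sorted() returns a new list; pairs itself is not modified.
by_second = sorted(pairs, key=lambda elem: elem[1])
by_second_desc = sorted(pairs, key=lambda elem: elem[1], reverse=True)

print(by_second)
print(by_second_desc)
print(pairs)
```

Use `sort()` when you no longer need the original order, and `sorted()` when you do."
]
},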
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Pn-5ovq_YiQ8"
},
"source": [
"<h3><b><font color='green'> Wait a minute.... </font></b></h3>\n",
"\n",
"I am assuming you know Python is the most popular language for Data Science.
**Why ???** \n",
"\n",
"**See this Short Video** , every thing would be <u>very clear</u> !!\n",
"\n",
"<a href=\"https://drive.google.com/open?
id=1uqKrXftNnOcc4zmqygcZk0lV0cLnCpC7\">\n",
" <img src=\"https://drive.google.com/uc?
id=1o1bc4NNHizaKGGVodefB3UytMXmIuOgH\" alt=\"Why Python is popular for Data Science
?\" width=\"130\" height=\"70\">\n",
"</a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "aqBt02frTVTH"
},
"source": [
"<h3><font color ='red'><b>Coding <u>Exercises</u> on <u>list</u>:
</b></font></h3>\n"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "ZY7r2-uOVaDo"
},
"outputs": [],
"source": [
"# Q1. Count the occurrences of 'a' in the list.\n",
"\n",
"# (named letters so that the built-in list() is not shadowed)\n",
"letters = [\"d\", \"a\", \"t\", \"a\", \"c\", \"a\", \"m\", \"p\"]\n",
"\n",
"# hint : uncomment the line of code below\n",
"# print(letters.count(\"a\"))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "MRSGXwQnVoV4"
},
"outputs": [],
"source": [
"# Q2. Try it out and explain what the code does. (fully coded)\n",
"\n",
"# Import 'Counter' from the 'collections' module\n",
"from collections import Counter\n",
"\n",
"# This is your list (named letters so that the built-in list() is not shadowed)\n",
"letters = [\"a\",\"b\",\"b\",\"c\",\"d\",\"e\",\"e\"]\n",
"\n",
"# Pass 'letters' to 'Counter()'\n",
"num = Counter(letters)\n",
"print(num) "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "5VTfg3cnV1la"
},
"outputs": [],
"source": [
"# Q3. Loop over your list and print all elements that are of size 3 ?\n",
"\n",
"# This is your list\n",
"mylist = [[1,2,3],[4,5,6,7],[8,9,10]]\n",
"\n",
"# Loop over your list and print all elements that are of size 3\n",
"# code yourself\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "gaiFXn3OWFtv"
},
"outputs": [],
"source": [
"# Q4. Loop over \"myList\" and print tuples of all indices and values. (fully
coded)\n",
"\n",
"# This is your list\n",
"myList = [3,4,5,6]\n",
"\n",
"# Loop over \"myList\" and print tuples of all indices and values \n",
"for i, val in enumerate(myList):\n",
" print(i, val)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "in__YjdKWat_"
},
"outputs": [],
"source": [
"# Q5. Get the common unique elements, i.e. the intersection (fully coded)\n",
"\n",
"list1 = [1,2,3,4,5,6,6]\n",
"list2 = [3,3,4,7,8]\n",
"\n",
"#-- first way to get intersection\n",
"# Make use of the list and set data structures\n",
"print(list(set(list1) & set(list2)))\n",
"\n",
"#-- second way to get intersection\n",
"# Use intersection()\n",
"print(list(set(list1).intersection(list2)))\n",
"\n",
"# Remember : intersection() is defined on set objects, not on lists"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "5p9rOMqTWvBc"
},
"outputs": [],
"source": [
"# Q6. Print the unique elements from List. (fully coded)\n",
"\n",
"# Your list with duplicate values\n",
"duplicates = [1, 2, 3, 1, 2, 5, 6, 7, 8]\n",
"\n",
"#--Print the unique elements from \"duplicates\" list\n",
"print(list(set(duplicates)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "V5soGP7-fLME"
},
"source": [
"**`Important` : zip() : You should know how to zip and unzip values**\n",
"\n",
"<font color='blue'>\n",
"The purpose of zip() <b>is to pair up the elements at the same index of multiple containers</b> so that they can be used as a single entity.\n",
"</font>"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "lLsIR35GdiAV",
"outputId": "bd616cae-ff0b-4ce6-f3cf-780ba032ae50"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The zipped result is : {('Neha', 1, 50), ('Snehar', 3, 60), ('Akshay', 4,
40), ('Vipin', 2, 70)}\n"
]
}
],
"source": [
"# --use of zip()\n",
"# zip() pairs up the elements at the same index of multiple containers so that they can be used as a single entity.\n",
"\n",
"# initializing lists \n",
"name = [ \"Akshay\", \"Neha\", \"Snehar\", \"Vipin\" ] \n",
"roll_no = [ 4, 1, 3, 2 ] \n",
"marks = [ 40, 50, 60, 70 ] \n",
" \n",
"# using zip() to map values; zip() returns a lazy iterator \n",
"mapped = zip(name, roll_no, marks)\n",
" \n",
"# converting to a set so the values can be printed; printing the zip object directly\n",
"# shows only something like <zip object at 0x...>. Note that a set does not keep the original order.\n",
"mapped = set(mapped)\n",
" \n",
"# printing resultant values \n",
"print(\"The zipped result is : \",end=\"\") \n",
"print(mapped) \n"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 84
},
"colab_type": "code",
"id": "TFgKaKuEgcu9",
"outputId": "9a7b1c58-db17-44f6-a96b-fb7173bd6892"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The unzipped result:\n",
"The name list is : ('Neha', 'Snehar', 'Akshay', 'Vipin')\n",
"The roll_no list is : (1, 3, 4, 2)\n",
"The marks list is : (50, 60, 40, 70)\n"
]
}
],
"source": [
"#unzipping values \n",
"namz, roll_noz, marksz = zip(*mapped) \n",
" \n",
"print(\"The unzipped result:\") \n",
" \n",
"#printing the unzipped values (each one comes back as a tuple) \n",
"print(\"The name list is : \",end=\"\") \n",
"print(namz) \n",
" \n",
"print(\"The roll_no list is : \",end=\"\") \n",
"print(roll_noz) \n",
" \n",
"print(\"The marks list is : \",end=\"\") \n",
"print(marksz)"
]
},
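{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"A set does not preserve order, which is why the unzipped tuples above come back in a different order than the original lists. Here is a minimal sketch, assuming the same three lists, that materializes the zip object as a list so the order survives the round trip. The names `records`, `names_back`, `rolls_back` and `marks_back` are illustrative.

```python
name = ['Akshay', 'Neha', 'Snehar', 'Vipin']
roll_no = [4, 1, 3, 2]
marks = [40, 50, 60, 70]

# zip() returns a lazy iterator; list() materializes it and keeps the order.
records = list(zip(name, roll_no, marks))
print(records)

# zip(*records) transposes the rows back into one tuple per original list.
names_back, rolls_back, marks_back = zip(*records)
print(names_back)
print(rolls_back)
print(marks_back)
```

Prefer a list over a set whenever the position of each record matters."
]
},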
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "h6cipNDWZJx2"
},
"source": [
"<hr>\n",
"<h3><b>B. Python <u>Dictionary</u> data Structure</b></h3>\n",
"\n",
"Python's dictionaries are implemented as **hash tables**. They work like the associative arrays or hashes found in Perl and consist of **key-value pairs**. A dictionary key can be any hashable (in practice, immutable) Python object, but keys are usually numbers or strings. Values, on the other hand, can be any arbitrary Python object.\n",
"\n",
"Dictionaries are enclosed by curly braces ({ }), and values can be assigned and accessed using square brackets ([ ]). \n",
"\n",
"**code examples −**\n"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 101
},
"colab_type": "code",
"id": "xkBzr8_gXC22",
"outputId": "445377e8-0999-4e36-f316-29c27553dc9a"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"IIT-B\n",
"Chembur\n",
"{'name': 'suven', 'code': 1234, 'dept': 'IT sales'}\n",
"dict_keys(['name', 'code', 'dept'])\n",
"dict_values(['suven', 1234, 'IT sales'])\n"
]
}
],
"source": [
"#--Python Dictionary : holds key:value pairs --\n",
"#--enclosed by curly braces { } --\n",
"#--since Python 3.7 a dict keeps its keys in insertion order\n",
"#--(named campus_info so that the built-in dict() is not shadowed)\n",
"campus_info = {}\n",
"campus_info['campus'] = \"IIT-B\"\n",
"campus_info[400071]= \"Chembur\"\n",
"\n",
"tinydict={'name':'suven','code':1234,'dept':'IT sales'}\n",
"\n",
"print(campus_info['campus']) \n",
"print(campus_info[400071])\n",
"print(tinydict) # keys appear in insertion order\n",
"print(tinydict.keys())\n",
"print(tinydict.values()) "
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "mLY7nAa1aaOn"
},
"source": [
"**Next code example is very important :**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "V7UNy8eWaIbf",
"outputId": "30af174f-2436-4541-e595-f5e81498dbf0"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{1: ('A', 'B'), 2: ('C',)}\n"
]
}
],
"source": [
"#--a dict can also hold multiple data values per key--\n",
"#--here the multiple values for a key are held in a tuple--\n",
"# \n",
"#-- convert this list of [key, value] pairs into a dict --\n",
"l=[ [1, 'A'], [1, 'B'], [2, 'C'] ]\n",
"d={}\n",
"\n",
"#--extracting from the list, making a one-element tuple and adding it to the dict\n",
"#--(pair[1],) is used instead of tuple(pair[1]) so that a longer string such as 'AB' is not split into characters\n",
"for pair in l:\n",
"    if pair[0] in d:\n",
"        d[pair[0]]=d[pair[0]]+(pair[1],)\n",
"    else:\n",
"        d[pair[0]]=(pair[1],)\n",
"\n",
"print(d) # tuples and lists look similar, but tuples are immutable "
]
},
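{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"The same grouping can be sketched with `dict.setdefault`, which removes the explicit `if`/`else`. This variant collects the values in lists rather than tuples; the names `pairs` and `grouped` are illustrative and not from the course code.

```python
pairs = [[1, 'A'], [1, 'B'], [2, 'C']]
grouped = {}

for key, value in pairs:
    # setdefault returns the list already stored under key,
    # or stores a new empty list there first and returns that.
    grouped.setdefault(key, []).append(value)

print(grouped)
```

Lists are a natural fit here because they can grow in place, while a tuple has to be rebuilt on every addition."
]
},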
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "-q0yUzA7fpBi",
"outputId": "1fad20fe-74b2-414d-ffd5-c98d5c6101bf"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Name': 'Suven', 'Age': 15}\n"
]
}
],
"source": [
"#-- very important Property of Dictionary Keys \n",
"\n",
"# 1. More than one entry per key is not allowed.\n",
"dictt = {'Name':'SCTPL','Age':15,'Name':'Suven'}\n",
"print(dictt)\n",
"\n",
"#--Note : for duplicate keys, the last value wins"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "4gQYMJl-g46c",
"outputId": "e561fdda-94b6-444d-e0a3-1be147778d48"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{}\n"
]
}
],
"source": [
"#clear() Removes all elements of dictionary dictt\n",
"dictt = {'Name':'SCTPL','Age':15,'Name':'Suven'}\n",
"dictt.clear()\n",
"print(dictt)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "IWrS-m0fhJML",
"outputId": "288f8aa2-4e38-4f11-a070-b4f6a31c4ea5"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Name': 'Suven Consultants', 'headOffice': 'Mumbai'}\n"
]
}
],
"source": [
"#copy() : Returns a shallow copy of dictionary dictt\n",
"dictt = {'Name':'Suven Consultants','headOffice':'Mumbai' }\n",
"dictt2 = dictt.copy() # a new dict object; its values are references to the same objects as in dictt\n",
"print(dictt2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "2a8Fuj8qhajF"
},
"source": [
"**`Difference between shallow and deep copy`**\n",
"\n",
"> A shallow copy constructs a new collection object and fills it with references to the child objects of the original. The container itself is new, but the children are shared, not copied.\n",
"\n",
"> A deep copy makes the copying process recursive. It means first constructing a new collection object and then recursively populating it with copies of the child objects found in the original. Copying an object this way walks the whole object tree to create a fully independent clone of the original object and all of its children. ***So here you copy the contents, not just the references.***\n",
"\n",
"<font color='green'><b>Next example explains all </b></font>"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 151
},
"colab_type": "code",
"id": "yWP6ohPIiI-I",
"outputId": "583e7405-8ca4-407c-b180-685039c2f233"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[1, 2, 3], [4, 5, 6], [7, 8, 9]]\n",
"--------------------\n",
"xs is : [[1, 2, 3], [4, 5, 6], [7, 8, 9], ['new sublist']]\n",
"ys is : [[1, 2, 3], [4, 5, 6], [7, 8, 9], ['new sublist']]\n",
"--------------------\n",
"updated xs is : [[1, 2, 3], ['X', 5, 6], [7, 8, 9], ['new sublist']]\n",
"updated ys is : [[1, 2, 3], ['X', 5, 6], [7, 8, 9], ['new sublist']]\n",
"--------------------\n"
]
}
],
"source": [
"#--assignment (aliasing) example --\n",
"xs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]\n",
"ys = xs # No copy is made here: ys is a second name for the same list object as xs.\n",
"# (A shallow copy would be made with xs.copy() or copy.copy(xs).)\n",
"print(ys)\n",
"print(\"--------------------\")\n",
"xs.append(['new sublist'])\n",
"print(\"xs is : \", xs)\n",
"print(\"ys is : \", ys)\n",
"print(\"--------------------\")\n",
"xs[1][0] = 'X'\n",
"print(\"updated xs is : \", xs)\n",
"print(\"updated ys is : \", ys)\n",
"print(\"--------------------\")\n"
]
},
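{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"Plain assignment such as `ys = xs` only binds a second name to the same list, so every change shows up under both names. A true shallow copy behaves differently. Here is a minimal sketch using `copy.copy` from the standard library; the names `outer` and `shallow` are illustrative.

```python
import copy

outer = [[1, 2, 3], [4, 5, 6]]
shallow = copy.copy(outer)     # new outer list, same inner lists

outer.append(['new sublist'])  # affects only outer: the two outer lists are separate
outer[1][0] = 'X'              # visible in both: the inner lists are shared

print(outer)
print(shallow)
print(shallow is outer)        # False: different outer objects
print(shallow[1] is outer[1])  # True: the same inner list object
```

So a shallow copy protects you from changes to the outer container, but not from changes made inside the shared child objects. That last gap is what `copy.deepcopy` closes."
]
},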
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 84
},
"colab_type": "code",
"id": "O7u-AW5ZjDpA",
"outputId": "1b4211dc-5cca-4ae5-9881-b25a0474a9cf"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"zs is : [[1, 2, 3], [4, 5, 6], [7, 8, 9]]\n",
"--------------------\n",
"updated xs is : [[1, 2, 3], ['X', 5, 6], [7, 8, 9]]\n",
"zs remains same : [[1, 2, 3], [4, 5, 6], [7, 8, 9]]\n"
]
}
],
"source": [
"#--deep copy example\n",
"import copy\n",
"xs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]\n",
"zs = copy.deepcopy(xs) # all child objects are copied\n",
"\n",
"print(\"zs is : \", zs)\n",
"print(\"--------------------\")\n",
"\n",
"xs[1][0] = 'X'\n",
"\n",
"print(\"updated xs is : \", xs)\n",
"print(\"zs remains same : \", zs)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 134
},
"colab_type": "code",
"id": "1GNKmh1JjcLA",
"outputId": "a4a85ea6-0fd0-4ad0-c1a3-1aa06e5cb904"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dict_keys(['Name', 'Age'])\n",
"---------------------\n",
"values are dict_values(['Suven', 15])\n",
"---------------------\n",
"key:value pairs are : dict_items([('Name', 'Suven'), ('Age', 15)])\n",
"---------------------\n",
"get value of a key 'Name': Suven\n"
]
}
],
"source": [
"#--get all keys, values and key:value pairs\n",
"dictt = {'Name':'SCTPL','Age':15,'Name':'Suven'}\n",
"\n",
"# .keys() will fetch all keys for you.\n",
"print(dictt.keys())\n",
"print(\"---------------------\")\n",
"\n",
"# .values() will fetch all values for you.\n",
"print(\"values are \", dictt.values())\n",
"print(\"---------------------\")\n",
"\n",
"# .items() will fetch all key:value pairs for you.\n",
"print(\"key:value pairs are :\", dictt.items()) # returns a view of (key, value) tuples\n",
"print(\"---------------------\")\n",
"\n",
"# get value of a key\n",
"print(\"get value of a key 'Name':\",dictt.get('Name'))"
]
},
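{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"Two everyday patterns follow from these methods, sketched here with an illustrative dictionary named `profile`: `get()` with a default for keys that may be missing, and a `for` loop over `items()`.

```python
profile = {'Name': 'Suven', 'Age': 15}

# get() returns None, or the supplied default, when the key is missing.
city = profile.get('City')
city_or_default = profile.get('City', 'unknown')

# items() yields (key, value) pairs that unpack directly in a for loop.
lines = []
for key, value in profile.items():
    lines.append(f'{key} -> {value}')

print(city)
print(city_or_default)
print(lines)
```

By contrast, `profile['City']` would raise a `KeyError`, so `get()` is the safer choice when a key is optional."
]
},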
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 50
},
"colab_type": "code",
"id": "XCHXNqFXj5JR",
"outputId": "aa9e56df-7255-4fe2-e9fc-da7de87de0af"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n",
"False\n"
]
}
],
"source": [
"#--to check for a key ? -- use in operator.\n",
"# -- 'in' is a membership operator \n",
"\n",
"dictt = {'Name':'SCTPL','Age':15,'Name':'Suven'}\n",
"print('Name' in dictt)\n",
"print('Names' in dictt)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "p_3YC_2ckAMN",
"outputId": "2780e3f6-e564-400e-ce49-b6c7ce20f05b"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"updated dict : {'Name': 'Suven', 'course': 'Eco and Stats', 'Location':
'IITB'}\n"
]
}
],
"source": [
"#--update updates the dictionary \n",
"dict1 = {'Name':'Suven', 'course':'Eco and Stats'}\n",
"dict2 = {'Location':'IITB'}\n",
"\n",
"dict1.update(dict2)\n",
"print (\"updated dict : \", dict1)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 118
},
"colab_type": "code",
"id": "P5h6hNDqkGOv",
"outputId": "94d0a1c5-4e1c-44bb-9e1b-5eae9dbe6625"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"value of the popped key : Neha\n",
"dict after pop : {'course': 'Web Tech'}\n",
"null\n",
"first time ('course', 'Web Tech')\n",
"second time ('Name', 'Neha')\n",
"{}\n"
]
}
],
"source": [
"#--pop removes a key and returns its value\n",
"dict3 = {'Name':'Neha', 'course':'Web Tech'}\n",
"print(\"value of the popped key :\",dict3.pop('Name'))\n",
"print(\"dict after pop :\", dict3)\n",
"\n",
"#--The 2nd parameter in pop() is the default value. Prevents Exception when
key not found\n",
"print(dict3.pop('location', 'null'))\n",
"\n",
"#--to pop key:value pair , do :\n",
"dict3 = {'Name':'Neha', 'course':'Web Tech'}\n",
"print(\"first time \",dict3.popitem())\n",
"print(\"second time \",dict3.popitem())\n",
"print(dict3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "D55SKQwOkpJG"
},
"source": [
"<h3><font color ='red'><b>Coding <u>Test</u> on simple Python (Core) Concepts:
</b></font></h3>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "8zKH4eSJmjSW"
},
"source": [
"**`First`** : I am assuming you remember your loop syntax\n",
"<hr>\n",
"\n",
"\n",
"\n",
"<font color='red'><b>Task_1 to be solved</b></font>\n",
"\n",
"Read an integer **`N`**. For all non-negative integers **`i<N`**, print the square of **`i`** (written `i**2` in Python, where `^` means bitwise XOR). See the sample for details.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "adrrQhQYpO79"
},
"source": [
"**`Input Format`**\n",
"\n",
"The first and only line contains the integer, **`N`**.\n",
"\n",
"**`Output Format`**\n",
"\n",
"Print **`N`** lines, one corresponding to each **`i`**.\n",
"\n",
"**`Sample Input 0`** <br>\n",
"5\n",
"\n",
"**`Sample Output 0`**\n",
"\n",
"0 <br>\n",
"1 <br>\n",
"4 <br>\n",
"9 <br>\n",
"16"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "lzQqEKaikhWu"
},
"outputs": [],
"source": [
"# please code and solve here\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "qm_K1nCQqGE9"
},
"source": [
"<font color='red'><b>Task_2 to be solved</b></font>\n",
"\n",
"The **zip([iterable, ...])** function returns an iterator of tuples (in Python 3; wrap it in `list()` to get a list). The ***ith*** tuple contains the ***ith*** element from each of the argument sequences or iterables.\n",
"\n",
"If the argument sequences are of unequal lengths, then the result is truncated to the length of the shortest argument sequence.\n",
"\n",
"<font color='red'><b>Task</b></font> : The `National University` conducts an examination of **`N`** students in **`X`** subjects. **Your task is to compute the average score of each student.**"
]
},
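{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"A quick sketch of the truncation rule, using two illustrative lists of unequal length (`subjects` and `scores` are made-up names and values). For comparison it also shows `itertools.zip_longest` from the standard library, which pads instead of truncating.

```python
from itertools import zip_longest

subjects = ['maths', 'physics', 'chemistry']
scores = [89, 90]

# zip() stops at the shortest input, so 'chemistry' is dropped.
paired = list(zip(subjects, scores))

# zip_longest() pads the shorter input with fillvalue instead of truncating.
padded = list(zip_longest(subjects, scores, fillvalue=None))

print(paired)
print(padded)
```

Silent truncation is easy to miss, so check the input lengths first whenever every element has to be paired."
]
},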
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "L8GiV_fsr2R2"
},
"source": [
"<img src = 'https://drive.google.com/uc?id=1py-UNLDdH4sRW-
Z8hvirTYcEO0VrGzFa' />"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "7uJ6bEqpsplT"
},
"source": [
"**`Input Format`**\n",
"\n",
"> The first line contains **`N`** and **`X`** separated by a space, where **`N`** is the number of students and **`X`** is the number of subjects.\n",
"\n",
"> Each of the next **`X`** lines contains the space-separated marks obtained by the students in a particular subject.\n",
"\n",
"**`Output Format`**\n",
"\n",
"> Print the averages of all students on separate lines.\n",
"> *The averages must be correct up to 1 decimal place.*\n",
"\n",
"<img src='https://drive.google.com/uc?id=1lIAaQvF8O94xBnx0LNqM62rff_0pw7ci'
/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "TWWb4Dh9tEtw"
},
"source": [
"<img src='https://drive.google.com/uc?id=1K1dWMfzk-FRRWP7P9JYLqDjOFzWMpcgd'
/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "S1T8VJ9xu-XR"
},
"source": [
"**Great Reading Resource** : https://realpython.com/python-zip-function/\n",
"\n",
"It will greatly help in solving any problem that uses the zip function."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "cvwbG4zQrybA"
},
"outputs": [],
"source": [
"# please code and solve here\n",
"\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "SvrHkoXdb2AL"
},
"source": [
"**Don't worry:** If you get stuck somewhere while solving, try harder and think it through again. If things still don't progress, ping me on the WhatsApp group. <b><font color='green'>The invite link to join the <u>WhatsApp group</u> was sent in the enrollment mail.</font></b>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Tvb8IfS3udTG"
},
"source": [
"<h3><b>Recalling <u>Regex</u></b></h3>\n",
"\n",
"You may be familiar with searching for text by pressing Ctrl-F and typing in the words you're looking for. \n",
"\n",
"Regular expressions go one step further: they allow you to specify a pattern of text to search for. You may not know a business's exact phone number, but if you live in India, you know it will be two or three digits of state code, followed by a hyphen or space, and then eight more digits. \n",
"\n",
"<font color='green'>This is how you, a human, recognize a phone number when you see it: 022-25277413 is a phone number, but 2,225,277,413 is not. </font>\n",
"\n",
"A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. \n",
"\n",
"The module **re** provides full support for regular expressions in Python.\n"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "dnwzjXiqwfcJ",
"outputId": "a14ca976-b640-4721-cfe1-775d6d4a4b11"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Phone number found: 022-2527-7413\n"
]
}
],
"source": [
"import re\n",
"\n",
"phoneNumRegex = re.compile(r'\\d\\d\\d-\\d\\d\\d\\d-\\d\\d\\d\\d')\n",
"mo = phoneNumRegex.search('My number is 022-2527-7413.')\n",
"\n",
"# search() returns a match object, or None when nothing matches\n",
"if mo :\n",
"    print('Phone number found: ' + mo.group())"
]
},
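{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"Repeating the digit class by hand gets long quickly. As a hedged sketch, a `{n}` quantifier expresses the same pattern more compactly, and `findall()` returns every match instead of only the first. The digit class is written as `[0-9]` here, and both the sample text and its second number are made up for illustration.

```python
import re

# [0-9]{3} means exactly three ASCII digits in a row.
phone_pattern = re.compile(r'[0-9]{3}-[0-9]{4}-[0-9]{4}')

text = 'Office: 022-2527-7413, helpdesk: 022-2527-7414.'

# With no groups in the pattern, findall() returns the matched strings.
numbers = phone_pattern.findall(text)

print(numbers)
```

`search()` stops at the first match, so reach for `findall()` when a text may contain several numbers."
]
},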
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 67
},
"colab_type": "code",
"id": "PPThHpzQwy4Z",
"outputId": "424dc73f-8774-4803-a880-a0cbbc69e08d"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"State code is : 022\n",
"And the number is : 2527-7413\n",
"and the entire no. is : 022-2527-7413\n"
]
}
],
"source": [
"#--form groups by using ()\n",
"#--printing state code and the no. \n",
"import re\n",
"\n",
"phoneNumRegex = re.compile(r'(\\d\\d\\d)-(\\d\\d\\d\\d-\\d\\d\\d\\d)')\n",
"mo = phoneNumRegex.search('My number is 022-2527-7413.')\n",
"\n",
"if mo :\n",
" print('State code is : ' + mo.group(1))\n",
" print('And the number is : ' + mo.group(2))\n",
" print('and the entire no. is : ', mo.group())"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "Cf23BJLvxOSV",
"outputId": "962062fc-bab0-4680-c267-0c57ecfc004e"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"('022', '2527-7413')\n"
]
}
],
"source": [
"#--rewriting the above code and printing all groups\n",
"import re\n",
"\n",
"phoneNumRegex = re.compile(r'(\\d\\d\\d)-(\\d\\d\\d\\d-\\d\\d\\d\\d)')\n",
"mo = phoneNumRegex.search('My number is 022-2527-7413.')\n",
"\n",
"if mo :\n",
" print(mo.groups()) #prints all groups as tuples\n"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 50
},
"colab_type": "code",
"id": "QEQfS_x3xUYd",
"outputId": "65149366-7b4d-4204-cca3-eb964cb34cc4"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"91-98700-14450\n",
"98700-14450\n"
]
}
],
"source": [
"#-- Suppose if state-code 022 is optional then \n",
"#-- use (pattern)? -> this means pattern is optional\n",
"import re\n",
"\n",
"phoneRegex = re.compile(r'(\\d\\d-)?\\d\\d\\d\\d\\d-\\d\\d\\d\\d\\d')\n",
"mo1 = phoneRegex.search('My number is 91-98700-14450')\n",
"if mo1 :\n",
" print(mo1.group())\n",
"\n",
"mo2 = phoneRegex.search('My number is 98700-14450')\n",
"if mo2 :\n",
" print(mo2.group())"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 67
},
"colab_type": "code",
"id": "dRt-UrVLxtyO",
"outputId": "95000cde-4242-406c-fcb3-19eee489e6b6"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Batman\n",
"Batwoman\n",
"Batwowowowoman\n"
]
}
],
"source": [
"#--Matching Zero or More with the Star\n",
"\n",
"import re\n",
"\n",
"batRegex = re.compile(r'Bat(wo)*man')\n",
"mo1 = batRegex.search('The Adventures of Batman')\n",
"if mo1 :\n",
" print(mo1.group())\n",
"\n",
"mo2 = batRegex.search('The Adventures of Batwoman')\n",
"if mo2 :\n",
" print(mo2.group())\n",
" \n",
"mo3 = batRegex.search('The Adventures of Batwowowowoman')\n",
"if mo3 :\n",
" print(mo3.group())"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 50
},
"colab_type": "code",
"id": "SdIAMnOax0QX",
"outputId": "f8874af0-279d-4d26-f7fa-8db7114819b6"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Batwoman\n",
"input string does not contain pattern\n"
]
}
],
"source": [
"#--Matching One or More with the Plus\n",
"import re \n",
"\n",
"batRegex = re.compile(r'Bat(wo)+man')\n",
"mo1 = batRegex.search('The Adventures of Batwoman')\n",
"if mo1 :\n",
" print(mo1.group())\n",
"else :\n",
" print(\"input string does not contain pattern\")\n",
"\n",
"\n",
"mo2 = batRegex.search('The Adventures of Batman')\n",
"if mo2 :\n",
" print(mo2.group())\n",
"else :\n",
" print(\"input string does not contain pattern\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 50
},
"colab_type": "code",
"id": "fEl9IhJEx8uv",
"outputId": "03be66d0-6f21-463a-ab99-ca4f38bdf209"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"98925-44177\n",
"['98925-44177', '98700-14450']\n"
]
}
],
"source": [
"#---findall() Method\n",
"#-- search() will return a Match object of the first matched text in the
searched string, \n",
"#-- the findall() method will return the strings of every match in the
searched string.\n",
"\n",
"#--------- ex : findall() \n",
"import re\n",
"\n",
"phoneNumRegex = re.compile(r'\\d\\d\\d\\d\\d-\\d\\d\\d\\d\\d')\n",
"mo1 = phoneNumRegex.search('Cell: 98925-44177 Work: 98700-14450')\n",
"if mo1 :\n",
" print(mo1.group())\n",
"\n",
"phoneNumRegex = re.compile(r'\\d\\d\\d\\d\\d-\\d\\d\\d\\d\\d')\n",
"mo2 = phoneNumRegex.findall('Cell: 98925-44177 Work: 98700-14450')\n",
"\n",
"#return type of findall() is a list. \n",
"if mo2 :\n",
" print(mo2)"
]
},
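One detail worth keeping in mind with `findall()`: when the pattern contains groups, the result is a list of tuples rather than a list of strings. The following is a minimal sketch reusing the sample text from the cell above; the variable names `no_groups` and `with_groups` are made up for illustration.

```python
import re

text = 'Cell: 98925-44177 Work: 98700-14450'

# No groups in the pattern: findall() returns whole-match strings.
no_groups = re.findall(r'\d{5}-\d{5}', text)

# Two groups in the pattern: findall() returns one tuple of groups per match.
with_groups = re.findall(r'(\d{5})-(\d{5})', text)

print(no_groups)    # ['98925-44177', '98700-14450']
print(with_groups)  # [('98925', '44177'), ('98700', '14450')]
```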
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "S-4YZr_0yRHe",
"outputId": "9863c609-38ab-46b5-ded9-2c2956895ef3"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['1 Java', '2 SQL', '33 Web', '34 Textile']\n"
]
}
],
"source": [
"#--Character Classes : \\d, \\D, \\w, \\W, \\s, \\S\n",
"#-----------------------------------------------\n",
"#--using Character Classes----\n",
"import re\n",
"\n",
"coursesRegex = re.compile(r'\\d+\\s\\w+')\n",
"mo = coursesRegex.findall('1 Java-1z0-808, 2 SQL Oracle-1z0-061, 33 Web-
Technologies, 34 Textile Designing')\n",
"\n",
"if mo :\n",
" print(mo)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "aOT5P6bdyhpE"
},
"source": [
"**`Explaination to the above code`**\n",
"\n",
"> The regular expression **\\d+\\s\\w+** will match text that has one or more
numeric digits **(\\d+)**, followed by a whitespace character **(\\s)**, followed
by one or more letter/digit/underscore characters **(\\w+)**. \n",
"\n",
"> The findall() method returns all matching strings of the regex pattern in a
list.\n"
]
},
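The comment in the code cell above also lists the uppercase classes `\D`, `\W` and `\S`, which match the opposite of `\d`, `\w` and `\s`. A small sketch of how they behave, using a made-up sample string (`'Room 42, Block-B'` is an assumption, not course data):

```python
import re

text = 'Room 42, Block-B'   # hypothetical sample text

non_digits = re.findall(r'\D+', text)   # runs of characters that are not digits
non_words = re.findall(r'\W+', text)    # runs that are not letters, digits or underscore
non_spaces = re.findall(r'\S+', text)   # runs of non-whitespace characters

print(non_digits)   # ['Room ', ', Block-B']
print(non_words)    # [' ', ', ', '-']
print(non_spaces)   # ['Room', '42,', 'Block-B']
```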
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "ywCG5iLvy-mv"
},
"source": [
"**Making Your Own Character Classes**\n",
"\n",
"There are times when you want to match a set of characters but the shorthand
character classes (\\d, \\w, \\s, and so on) are too broad. \n",
"\n",
"You can **define your own character class using square brackets**. \n",
"\n",
"For example, the character class [aeiouAEIOU] will match any vowel, both
lowercase and uppercase. "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "YqQeXIMvzMsH",
"outputId": "7bf184b6-5e4b-4c4e-aad9-8fe9c34a2e62"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']\n"
]
}
],
"source": [
"#--- find all vowels in a sentence\n",
"import re\n",
"\n",
"vowelRegex = re.compile(r'[aeiouAEIOU]')\n",
"mo = vowelRegex.findall('RoboCop eats baby food. BABY FOOD.')\n",
"\n",
"if mo :\n",
" print(mo)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "tdc5mKRuzcaQ"
},
"source": [
"By placing a caret character **(^)** just after the character class’s opening
bracket, you can make a negative character class. \n",
"\n",
"> A **negative character class** will match all the characters that are not in
the character class.\n"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "bFAm-RnezuSf",
"outputId": "bca855c2-e7b9-4068-e844-601432902ce6"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['R', 'b', 'C', 'p', ' ', 't', 's', ' ', 'b', 'b', 'y', ' ', 'f', 'd', '.',
' ', 'B', 'B', 'Y', ' ', 'F', 'D', '.']\n"
]
}
],
"source": [
"# using ^ makes a negative character class\n",
"import re\n",
"\n",
"vowelRegex = re.compile(r'[^aeiouAEIOU]')\n",
"mo = vowelRegex.findall('RoboCop eats baby food. BABY FOOD.')\n",
"\n",
"if mo :\n",
" print(mo)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "9ahkodzoz4MK"
},
"source": [
"**The Caret and Dollar Sign Characters**\n",
"\n",
"You can also use the caret symbol **(^)** at the start of a regex to indicate
that a match must occur at the beginning of the searched text. \n",
"\n",
"Likewise, you can put a **dollar sign ($)** at the end of the regex to
indicate the string must end with this regex pattern. \n",
"\n",
"> And you can use the **^ and $ together** to indicate that the entire string
must match the regex."
]
},
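Before combining the two anchors, it may help to see each one on its own. This is a minimal sketch with invented sample strings; the pattern names `starts_with_hello` and `ends_with_digit` are hypothetical.

```python
import re

starts_with_hello = re.compile(r'^Hello')  # match must begin at the start of the text
ends_with_digit = re.compile(r'\d$')       # match must finish at the end of the text

print(starts_with_hello.search('Hello world!') is not None)     # True
print(starts_with_hello.search('He said Hello.') is not None)   # False
print(ends_with_digit.search('Your number is 42') is not None)  # True
print(ends_with_digit.search('42 is your number') is not None)  # False
```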
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 67
},
"colab_type": "code",
"id": "16RAqfvD0JoO",
"outputId": "4a678ac7-9e05-4119-c482-b43e4c014579"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1234567890\n",
"None\n",
"None\n"
]
}
],
"source": [
"# using ^(starts with), +(1 or more), $(ends with)\n",
"import re\n",
"\n",
"wholeStringIsNum = re.compile(r'^\\d+$') # i want to start with a digit,
followed with 1 or more digits and ending with digit.\n",
"mo1 = wholeStringIsNum.search('1234567890')\n",
"if mo1 :\n",
" print(mo1.group())\n",
"\n",
"mo2 = wholeStringIsNum.search('12345xyz67890')\n",
"if mo2 == None:\n",
" print(mo2) # the i/p does not match with regex, as it has non-digits in
between.\n",
" \n",
"mo3 = wholeStringIsNum.search('12 34567890') \n",
"if mo3 == None:\n",
" print(mo3) # the i/p does not match with regex, as it has space in
between."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"colab_type": "code",
"id": "X9TTX5Xx02mw",
"outputId": "6a35a7f7-e5f0-46bd-b419-28cef1ac8268"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['cat', 'hat', 'sat', 'lat', 'mat']\n"
]
}
],
"source": [
"#- .(dot) the Wildcard Character\n",
"# .(dot) will match any character except newline.\n",
"import re\n",
"\n",
"atRegex = re.compile(r'.at')\n",
"mo1 = atRegex.findall('The cat in the hat sat on the flat mat.')\n",
"if mo1 :\n",
" print(mo1)"
]
},
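The output above contains `'lat'` rather than `'flat'` because the dot stands for exactly one character. If whole words are wanted, one possible variation (an illustrative sketch, not the course's solution) swaps the dot for `\w+`, using only constructs already covered:

```python
import re

sentence = 'The cat in the hat sat on the flat mat.'

one_char_before_at = re.findall(r'.at', sentence)    # exactly one character, then 'at'
word_ending_in_at = re.findall(r'\w+at', sentence)   # one or more word characters, then 'at'

print(one_char_before_at)  # ['cat', 'hat', 'sat', 'lat', 'mat']
print(word_ending_in_at)   # ['cat', 'hat', 'sat', 'flat', 'mat']
```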
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "jj-sJA-P3XoN"
},
"source": [
"<h3><font color ='red'><b>Coding <u>Test</u> on Regex: </b></font></h3>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "kTQ4Ijl13bK_"
},
"source": [
"<img src='https://drive.google.com/uc?id=1-_o3s1giusGVYmpy3yz3goH-bEFUvL_7'
/>"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "et0nXEEp379T"
},
"outputs": [],
"source": [
"regex_pattern = r\"\"\t # Do not delete 'r'.\n",
"\n",
"import re\n",
"print(\"\\n\".join(re.split(regex_pattern, input())))\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "r8i7yxdS6c75"
},
"source": [
"This Notebook is <u>one</u> of the **`10`** Notebooks which form **`Python for
Data Science` Course** as listed on
https://datascience.suvenconsultants.com/elearning/\n",
"\n",
"\n",
"<hr>\n",
"<h3><font color='green'>You have solved above problems .. that means, Your
python concepts are Awesome !!</font></h3>\n",
"<hr>\n",
"\n",
"**Now , from the next lesson we would start learning all the packages (in
python) used for Data Science applications.**\n",
"\n",
"Thank you for going through the Notebook. I am sure it was a fruitful learning
exprience. Even you can earn your **`\"Masters in Data Science\"`** certification
followed with Internships and Placement calls. Do look at
https://datascience.suven.net for classroom training programmes or the
https://datascience.suven.net/elearning for **`online learning with continuous
support`** from <u>Rocky Sir & his team of data scientist for doubt solving over
whats-group and video calls</u>.\n",
"\n",
""
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "1_Introduction - Recalling List, dictionary and Regex",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 1
}