
PDS Exp 1 To 3

This document is a laboratory manual for the subject "Python for Data Science" for 5th semester BE students of Computer Engineering at a Government Engineering College in Gujarat, India. It contains instructions for 16 experiments involving Python programming and data science concepts like data structures, data analysis, visualization, machine learning etc. The preface explains the purpose of practical skills and how this manual aims to enhance industry-relevant competencies in students. Guidelines are provided for faculty on conducting the experiments and developing the targeted skills in students.


A Laboratory Manual for

Python for Data Science


(3150713)

B.E. Semester 5
(Computer Engineering)

Directorate of Technical Education, Gandhinagar, Gujarat

Government Engineering College, Rajkot

Certificate

This is to certify that Mr./Ms. ___________________________________________
Enrollment No. _______________ of B.E. Semester _____
Computer Engineering of this Institute (GTU Code: _____ ) has satisfactorily
completed the Practical / Tutorial work for the subject Python for Data
Science (3150713) for the academic year 2023-24.

Place: __________
Date: __________

Name and Sign of Faculty member

Head of the Department


Python for Data Science (3150713)

Preface

The main aim of any laboratory, practical, or field work is to enhance the required skills and to
build students' ability to solve real-world problems by developing relevant competencies in the
psychomotor domain. With this in view, GTU has designed a competency-focused, outcome-based
curriculum for engineering degree programs in which sufficient weightage is given to practical
work. This underlines the importance of skill enhancement among students and encourages
students, instructors, and faculty members to use every second of the time allotted for practicals
to achieve relevant outcomes by performing experiments rather than by merely studying them.
For effective implementation of a competency-focused, outcome-based curriculum, every practical
must be carefully designed to serve as a tool for developing and enhancing in every student the
competencies required by industry. These psychomotor skills are very difficult to develop through
the traditional chalk-and-board method of content delivery in the classroom. Accordingly, this lab
manual is designed to focus on industry-defined outcomes rather than on the old practice of
conducting practicals to prove concepts and theory.

By using this lab manual, students can go through the relevant theory and procedure before the
actual performance, which creates interest and gives them a basic idea of the experiment in
advance. This in turn helps students achieve the pre-determined outcomes. Each experiment in this
manual begins with the competency, industry-relevant skills, course outcomes, and practical
outcomes (objectives). Students will also learn the safety measures and necessary precautions to
be taken while performing the practical.

This manual also provides guidelines to faculty members for facilitating student-centric lab
activities in each experiment by arranging and managing the necessary resources, so that students
follow the procedures with the required safety and necessary precautions to achieve the outcomes.
It also indicates how students will be assessed by providing rubrics.

Data Science is about data gathering, analysis, and decision-making. It is about finding patterns
in data through analysis and making predictions about the future. By using Data Science, companies
are able to make:

• Better decisions (should we choose A or B?)
• Predictive analysis (what will happen next?)
• Pattern discoveries (find patterns, or maybe hidden information, in the data)

Data Science is used in many industries today, e.g. banking, consultancy, healthcare, and
manufacturing. Python is an open-source, interpreted, high-level language that is well suited to
data science, machine learning, and research. It is one of the most widely used languages for data
science applications and projects, and it is particularly useful for mathematical, statistical, and
scientific computing.

Utmost care has been taken while preparing this lab manual; however, there is always scope for
improvement. We therefore welcome constructive suggestions for improvement and for the removal
of errors, if any.

Practical – Course Outcome matrix

Course Outcomes (COs):


1. Apply various Python data structures to effectively manage various types of data.
2. Explore the various steps of the data science pipeline and the role of Python in them.
3. Design applications applying various operations for data cleansing and transformation.
4. Use various data visualization tools for effective interpretation of and insight into data.
5. Perform data wrangling with Scikit-learn, applying exploratory data analysis.

Sr. No. | Objective(s) of Experiment | COs mapped (CO1 to CO5)
1 | Develop a program to understand the control structures of Python. | √
2 | Develop a program to learn different types of structures (list, dictionary, tuples) in Python. | √
3 | Develop a program that reads a .csv dataset file using the Pandas library and displays the following content of the dataset: a) First five rows of the dataset b) Complete data of the dataset c) Summary or metadata of the dataset. | √ √
4 | Develop a program that shows application of slicing and dicing over the rows and columns of the dataset. | √ √
5 | Develop a program that shows usage of aggregate functions over the input dataset: a) describe b) max c) min d) mean e) median f) count g) std h) corr | √ √
6 | Develop a program that applies split and merge operations on the datasets. | √ √
7 | Develop a program that shows the various data cleaning tasks over the dataset: a) Identifying the null values b) Identifying the empty values c) Identifying the incorrect timestamp | √ √ √
8 | Develop a program that shows usage of the following NumPy array operations: a) any() b) all() c) isnan() d) isinf() e) isfinite() f) isinf() g) zeros() h) isreal() i) iscomplex() j) isscalar() k) less() l) greater() m) less_equal() n) greater_equal() | √
9 | Develop a program that shows usage of the following NumPy library vector functions: a) arange() b) reshape() c) linspace() d) randint() e) dot() | √
10 | Write a program to display the below plot using the matplotlib library. For values of X: [1, 2, 3, ..., 49], values of Y (thrice of X): [3, 6, 9, 12, ..., 144, 147] | √
11 | Write a program to display the below bar plot using the matplotlib library. For values: Languages = ['Java', 'Python', 'PHP', 'JavaScript', 'C#', 'C++'], popularity = [22.2, 17.6, 8.8, 8, 7.7, 6.7] | √
12 | Write a program to display the below plot using the matplotlib library. For the below data, display a pie plot: languages = ['Java', 'Python', 'PHP', 'JavaScript', 'C#', 'C++'], popularity = [22.2, 17.6, 8.8, 8, 7.7, 6.7], colors = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b"] | √
13 | Write a program to display the below plot using the matplotlib library. For 200 random points for both X and Y, display a scatter plot. | √
14 | Develop a program that reads the .csv file from the URL (https://github.com/chris1610/pbpython/blob/master/data/sample-salesv3.xlsx?raw=true) and plots the data of the dataset stored in the .csv file. | √ √
15 | Write a text classification pipeline using a custom preprocessor and CharNGramAnalyzer using data from Wikipedia articles as a training set. • Evaluate the performance on some held-out test sets. | √ √ √
16 | Write a text classification pipeline to classify movie reviews as either positive or negative. • Find a good set of parameters using grid search. • Evaluate the performance on a held-out test set. | √ √ √

Industry Relevant Skills

The following industry-relevant competencies are expected to be developed in the student by
undertaking the practical work of this laboratory.
1. Programming Languages
2. Mathematics, Statistical Analysis, and Probability
3. Data Mining
4. Machine Learning and AI
5. Data Visualization

Guidelines for Faculty members


1. The teacher should provide guidelines, with a demonstration of the practical and all its
features, to the students.
2. The teacher shall explain the basic concepts and theory related to the experiment to the
students before starting each practical.
3. Involve all the students in the performance of each experiment.
4. The teacher is expected to share the skills and competencies to be developed in the
students and to ensure that these skills and competencies have been developed after
completion of the experiment.
5. Teachers should give students the opportunity for hands-on experience after the
demonstration.
6. The teacher may provide additional knowledge and skills that are not covered in the
manual but are expected of the students by the concerned industry.
7. Give practical assignments and assess the performance of students on the assigned task
to check whether it is as per the instructions.
8. The teacher is expected to refer to the complete curriculum of the course and follow the
guidelines for implementation.

Instructions for Students


1. Students are expected to listen carefully to all the theory classes delivered by the faculty
members and to understand the COs, content of the course, teaching and examination scheme,
skill set to be developed, etc.
2. Students shall organize the work in groups and make a record of all observations.
3. Students shall develop maintenance skills as expected by industry.
4. Students shall attempt to develop related hands-on skills and build confidence.
5. Students shall make a small project/application in Python.
6. Students shall develop the habit of evolving more ideas, innovations, skills, etc. apart from
those included in the scope of the manual.
7. Students shall refer to technical magazines and data books.
8. Students should develop a habit of submitting the experimentation work as per the schedule
and should be well prepared for the same.

Common Safety Instructions


Students are expected to:
1. Switch on the PC carefully (do not use wet hands).
2. Shut down the PC properly at the end of the lab session.
3. Handle the peripherals (mouse, keyboard, network cable, etc.) carefully.
4. Use a laptop in the lab only after getting permission from the teacher.

Index
(Progressive Assessment Sheet)

Sr. No. | Objective(s) of Experiment | Page No. | Date of performance | Date of submission | Assessment Marks | Sign. of Teacher with date | Remarks

Total

Experiment No: 1

Develop a program to understand the control structures of Python.

a) Write a program to print the grade of a student.
b) Write a program to find the factorial of a given number using a for loop.
c) Write a program to access list elements using a for loop.
d) Write a program to find the sum of the first n numbers using a while loop.
e) Write a program to print 10 random numbers between 1 and 100 using a for loop.

Date:

Competency and Practical Skills:


Competency skills:

• Basic knowledge of computer systems, operating systems, and file systems.
• Familiarity with command-line interfaces (CLI) and graphical user interfaces (GUI).
• Understanding of programming languages, syntax, and logic.

Practical skills:

• Basic understanding of Python programming language
• Understanding of Python control structures
• Ability to use Python's built-in functions and libraries
• Familiarity with Python's syntax
• Problem-solving skills

Relevant CO: CO1

Objectives: (a) To learn and understand the different control structures in Python, such as loops,
conditional statements, and functions.

Equipment/Instruments: Personal Computer, Internet, Python

Theory:
Conditional statements: Conditional statements in Python allow you to execute certain blocks of
code based on whether a condition is true or false. The main forms are the "if" statement, the
"if-else" statement, and the "if-elif-else" chain, which tests several conditions in order.
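
As an illustration of conditional statements, part (a) of this experiment could be sketched as follows. The marks value and the grade cut-offs are assumptions chosen for the example; they are not prescribed by this manual.

```python
# Hypothetical percentage score; replace with int(input(...)) for interactive use.
marks = 72

# The conditions are tested from top to bottom; the first true branch runs.
if marks >= 90:
    grade = "A"
elif marks >= 75:
    grade = "B"
elif marks >= 60:
    grade = "C"
elif marks >= 40:
    grade = "D"
else:
    grade = "F"

print(f"Marks: {marks}, Grade: {grade}")
```

Because the branches are checked in order, each cut-off only needs a lower bound.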

Loops: Loops in Python allow you to repeat a block of code multiple times, either for a fixed number
of times or until a certain condition is met. The two main types of loops in Python are "for" loops
and "while" loops.
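
The loop-based parts (b) to (e) of this experiment might look like the sketch below. The value of n and the sample list are assumptions made for illustration.

```python
import random

# (b) Factorial of n using a for loop.
n = 5  # hypothetical input value
factorial = 1
for i in range(1, n + 1):
    factorial *= i
print("Factorial of", n, "is", factorial)

# (c) Accessing list elements with a for loop.
fruits = ["apple", "banana", "cherry"]  # hypothetical sample list
for position, fruit in enumerate(fruits):
    print(position, fruit)

# (d) Sum of the first n natural numbers using a while loop.
total = 0
counter = 1
while counter <= n:
    total += counter
    counter += 1
print("Sum of the first", n, "numbers is", total)

# (e) Ten random integers between 1 and 100 (both ends inclusive).
random_numbers = []
for _ in range(10):
    random_numbers.append(random.randint(1, 100))
print(random_numbers)
```

A for loop suits cases where the number of iterations is known in advance, while the while loop in part (d) repeats until its condition becomes false.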

Functions: Functions in Python allow you to encapsulate blocks of code and reuse them throughout
your program. Functions can accept parameters and return values, making them a powerful tool for
organizing and structuring your code.
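
A minimal sketch of a function with a parameter, a default value, and a return value is given below. The function name rectangle_area is hypothetical and used only for illustration.

```python
def rectangle_area(width, height=1.0):
    """Return the area of a rectangle; height defaults to 1.0 if omitted."""
    return width * height


# Positional arguments are matched in order.
print(rectangle_area(3, 4))     # prints 12

# Keyword arguments name the parameter explicitly; height uses its default.
print(rectangle_area(width=5))  # prints 5.0
```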

Scope: Scope in Python refers to the region of your program where a variable or function is visible
and accessible. Understanding scope is critical for avoiding errors and ensuring that your code is
organized and easy to maintain.
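
The difference between local and global scope can be sketched as follows. All names here are hypothetical examples.

```python
counter = 0  # module-level (global) name


def set_local():
    # Assignment inside a function creates a new local name;
    # the global counter is left untouched.
    counter = 100
    return counter


def increment_global():
    # The global statement makes the assignment target the module-level name.
    global counter
    counter += 1
    return counter


local_result = set_local()
global_result = increment_global()
print(local_result, global_result, counter)  # prints 100 1 1
```

Relying on the global statement is usually avoided in larger programs; passing values in as parameters and returning results keeps functions easier to test.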

Error handling: Error handling in Python involves detecting and responding to errors that may occur
during program execution. Proper error handling can help you avoid crashes and ensure that your
program continues to run smoothly.
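
One common way to handle errors in Python is the try/except statement. The sketch below assumes a hypothetical helper named safe_divide and shows how specific exceptions can be caught without stopping the program.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None if the division cannot be done."""
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        print("Cannot divide by zero.")
        return None
    except TypeError:
        print("Both arguments must be numbers.")
        return None
    else:
        # Runs only when no exception was raised in the try block.
        return result
    finally:
        # Runs in every case, whether or not an exception occurred.
        print("Division attempt finished.")


print(safe_divide(10, 2))
print(safe_divide(1, 0))
print(safe_divide("a", 2))
```

Catching specific exception types, rather than a bare except, keeps unrelated bugs from being silently hidden.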

Safety and necessary Precautions:

1. Data validation.
2. Check the data types.
3. Input sanitization.
4. Error Handling and Secure coding practices.
5. Use comments.
6. Test your code.

Procedure:

1. Plan the program structure and flow: Develop a plan for the program structure, including
the control structures that will be included, and the flow of the program logic.

2. Implement the control structures in Python: Write the code to implement the different
control structures in Python, including conditional statements, loops, and functions.

3. Test and debug the program: Conduct thorough testing of the program to ensure that it is
functioning correctly and identify and troubleshoot any errors or bugs.

4. Refine and optimize the program: Refine the program as needed to improve performance
and optimize its functionality, based on user feedback and testing results.

5. Document the program: Provide clear documentation of the program's purpose,
functionality, and limitations, as well as any potential security risks or necessary
precautions.

6. Deploy and maintain the program: Deploy the program for use by users, and maintain it by
addressing any issues or bugs that arise and providing updates and new features as needed.

Observations: Put Output of the program

Conclusion:

Quiz:
1. What is a conditional statement in Python?
2. What is a loop in Python?
3. What is the difference between a "for" loop and a "while" loop in Python?
4. What is a function in Python?
5. What is scope in Python?

Suggested Reference:
1. https://docs.python.org/3/library/
2. https://www.tutorialspoint.com/python/
3. https://www.geeksforgeeks.org/
4. https://realpython.com/
5. https://www.w3schools.com/python/

References used by the students:

Rubric wise marks obtained:

Rubric | Good | Average / Satisfactory
1. Knowledge of subject (2) | Good (2) | Average (1)
2. Programming Skill (2) | Good (2) | Average (1)
3. Team work (2) | Good (2) | Satisfactory (1)
4. Communication Skill (2) | Good (2) | Satisfactory (1)
5. Ethics (2) | Good (2) | Average (1)
Total Marks:

Experiment No: 2

Develop a program to learn different types of structures (list, dictionary,
tuples) in Python.

Date:

Competency and Practical Skills:


Competency skills:

• Basic knowledge of computer systems, operating systems, and file systems.
• Familiarity with command-line interfaces (CLI) and graphical user interfaces (GUI).
• Understanding of programming languages, syntax, and logic.

Practical skills:

• Basic programming concepts: You should have a good grasp of basic programming concepts
such as variables, data types, conditional statements, loops, and functions.
• Python programming language: You should have a good understanding of Python syntax,
data structures, and standard library functions.
• Sequences and collections: Sequences are ordered collections of elements that can be
accessed by index. You should have a good understanding of the sequence types string,
tuple, and list, of the dictionary (a mapping accessed by key), and of the set, and their
respective properties.
• String manipulation: You should know how to manipulate strings using operations such as
slicing, concatenation, and formatting.
• Collection manipulation: Mutable collections can be modified in place: lists with methods
such as append, insert, remove, pop, and sort, and dictionaries and sets with their own
methods such as update, add, and pop. Tuples are immutable and cannot be modified.
• Iteration: You should know how to use for loops and list comprehensions to iterate over
sequences.
• Conditional statements: You should know how to use conditional statements to check for
specific conditions in sequences.
• Functions: You should know how to define functions that operate on sequences and return
values.

Relevant CO: CO1

Objectives: (a) To learn how to access and manipulate the elements of Python's built-in data
structures, iterate over them, perform conditional operations on them, and use them in functions.

(b) To learn how to select the appropriate sequence type for a given task based on its properties and
performance characteristics.

Equipment/Instruments: Personal Computer, Internet, Python

Theory:
1. Python's built-in sequence types include strings, lists, tuples, and ranges. Additionally,
Python includes the set and dictionary data structures: a set is an unordered collection of
unique elements, and a dictionary is a collection of key-value pairs with unique keys.

2. The string data type in Python represents a sequence of characters and is immutable,
meaning its contents cannot be changed once it is created. Strings can be manipulated using
various methods such as slicing, concatenation, and formatting.

3. Lists and tuples are similar in many ways, but tuples are immutable, whereas lists are
mutable. Lists and tuples can hold elements of any data type and can be indexed and sliced
like strings. However, lists offer additional methods such as append, insert, remove, and pop
that allow for manipulation of the list's contents.

4. Dictionaries are not sequences but mappings: collections of key-value pairs in which each
key is unique. Since Python 3.7, a dictionary preserves the order in which its keys were
inserted. Each element in a dictionary consists of a key and a corresponding value.
Dictionaries can be used to store and retrieve data quickly based on the key.

5. Sets are collections of unique elements that are unordered and mutable. Sets are often used
to perform set operations such as union, intersection, and difference.
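
The mutability rules described above can be checked with a short sketch. All variable names and values are assumptions made for illustration.

```python
# Strings are immutable: item assignment raises TypeError.
text = "data science"
try:
    text[0] = "D"
except TypeError as error:
    print("Strings are immutable:", error)
title = text.title()   # builds a new string, "Data Science"
first_word = text[0:4]  # slicing, "data"

# Lists are mutable: their contents can change in place.
scores = [70, 85]
scores.append(90)      # [70, 85, 90]
scores.insert(0, 60)   # [60, 70, 85, 90]
last = scores.pop()    # removes and returns 90

# Tuples support indexing and slicing but not modification.
point = (3, 4)
try:
    point[0] = 10
except TypeError as error:
    print("Tuples are immutable:", error)

# A range is a lazy sequence; list() materialises its values.
evens = list(range(0, 10, 2))

print(title, first_word, scores, last, point, evens)
```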

Safety and necessary Precautions:

1. Use of proper data validation.
2. Secure data storage.
3. Proper error handling.
4. Testing and debugging.
5. Keeping software up to date.
6. Proper code formatting and documentation.

Procedure:
1. Create a string variable using single or double quotes.
Use string methods like upper(), lower(), strip(), split(), join(), and replace() to manipulate the
string as needed.
Use indexing and slicing to access specific characters or substrings within the string.
2. Create a tuple variable using parentheses.
Use indexing and slicing to access specific elements or subsets within the tuple.
Tuples are immutable, so you cannot add, remove or modify elements once created.
3. Create a list variable using square brackets.
Use indexing and slicing to access specific elements or subsets within the list.
Use list methods like append(), insert(), remove(), pop(), extend(), and sort() to modify the list
as needed.
Lists are mutable, so you can add, remove or modify elements once created.
4. Create a dictionary variable using curly braces or the dict() constructor.
Use keys to access values within the dictionary.
Use dictionary methods like keys(), values(), and items() to access different parts of the
dictionary.
Use del or pop() to remove elements from the dictionary.
Use assignment to add or modify elements in the dictionary.
5. Create a set variable using curly braces or the set() constructor.
Use set methods like add(), remove(), pop(), union(), and intersection() to modify or perform
operations on the set.
Sets do not allow duplicate elements, so adding the same element multiple times will only add
it once.
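
Steps 4 and 5 of the procedure can be sketched as follows. The keys, names, and values are hypothetical sample data, not part of this manual.

```python
# Step 4: dictionary created with curly braces.
student = {"name": "Asha", "semester": 5}
student["branch"] = "Computer Engineering"  # assignment adds a new key
student["semester"] = 6                     # assignment modifies an existing key
removed_name = student.pop("name")          # pop() removes a key and returns its value
del student["branch"]                       # del removes a key without returning it

for key, value in student.items():
    print(key, "->", value)

# Step 5: set created with curly braces.
languages = {"Python", "Java"}
languages.add("Python")  # duplicate, so the set is unchanged
languages.add("C++")

web = {"JavaScript", "Python", "PHP"}
common = languages.intersection(web)  # elements present in both sets
combined = languages.union(web)       # elements present in either set

print(sorted(languages), sorted(common), sorted(combined))
```

Note that empty curly braces create an empty dictionary; an empty set has to be created with set().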

Observations: Put Output of the program

Conclusion:

Quiz:
1. What method can you use to convert a string to uppercase in Python?
2. What is the difference between a tuple and a list in Python?
3. How do you add an element to a list in Python?
4. How do you access a value in a dictionary using its key in Python?
5. What is a set in Python?

Suggested Reference:
1. https://docs.python.org/3/library/
2. https://www.tutorialspoint.com/python/
3. https://www.geeksforgeeks.org/
4. https://realpython.com/
5. https://www.w3schools.com/python/

References used by the students:

Rubric wise marks obtained:

Rubric | Good | Average / Satisfactory
1. Knowledge of subject (2) | Good (2) | Average (1)
2. Programming Skill (2) | Good (2) | Average (1)
3. Team work (2) | Good (2) | Satisfactory (1)
4. Communication Skill (2) | Good (2) | Satisfactory (1)
5. Ethics (2) | Good (2) | Average (1)
Total Marks:

Experiment No: 3

Develop a program that reads a .csv dataset file using Pandas library and display
the following content of the dataset.
a) First five rows of the dataset
b) Complete data of the dataset
c) Summary or metadata of the dataset.
Date:

Competency and Practical Skills:


Competency skills:

• Knowledge of Python programming language and its libraries, particularly the Pandas
library.
• Understanding of the structure of .csv files and how to read and manipulate them using
Pandas.
• Familiarity with the methods and functions used to inspect a DataFrame, such as the Pandas
methods "head()", "info()", and "describe()", Python's built-in "print()" function, and the
"display()" function available in Jupyter/IPython.
• Ability to write and debug code, and troubleshoot errors that may arise when working with
datasets.
• Experience in working with datasets, including data cleaning, data wrangling, and data
analysis.
• Ability to understand the content and structure of datasets, and use them to derive insights
and information.

Practical skills:

• Writing code to load a .csv dataset file into a Pandas DataFrame using the "read_csv()"
function.
• Using the "head()" method to display the first five rows of the dataset.
• Using the "print()" function or, in a Jupyter Notebook, the "display()" function to display
the complete data of the dataset.
• Using the "info()" method or "describe()" method to display the summary or metadata of the
dataset.
• Handling errors and exceptions that may arise when working with datasets.
• Writing clean and efficient code that is easy to read and maintain.
• Testing the program with different datasets to ensure its accuracy and reliability.

Relevant CO: CO1, CO2

Objectives: (a) To read and load the .csv dataset file into a Pandas DataFrame.
(b) To display the first five rows of the dataset using the "head()" method.
(c) To display the complete data of the dataset using the "print()" function or the "display()" function.
(d) To display the summary or metadata of the dataset using the "info()" method or "describe()"
method.

Equipment/Instruments: Personal Computer, Internet, Python

Theory:

Pandas is a popular data manipulation library for Python, widely used in data science and machine
learning. It provides a powerful and flexible toolset for loading, manipulating, and analyzing
structured datasets in various formats, including .csv files.

Safety and necessary Precautions:

1. Data security, quality and privacy.
2. Memory and performance optimization.
3. Error handling and exception handling.
4. Use comments.
5. Test your code.

Procedure:
1. Import the Pandas library: To use the Pandas library in Python, it is essential to import it
into your program. You can do this by using the "import pandas as pd" statement.

2. Load the dataset: The next step is to load the dataset into a Pandas DataFrame using the
"read_csv()" function. This function takes the path to the .csv file as an argument and returns
a DataFrame object that contains the data from the file.

3. Display the first five rows: To display the first five rows of the dataset, you can use the
"head()" method. This method returns the first five rows of the DataFrame by default, but
you can specify the number of rows you want to display as an argument.

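4. Display the complete data: To display the complete data of the dataset, you can use the
"print()" function or, in a Jupyter Notebook, the "display()" function. Note that Pandas
truncates the output of large DataFrames by default; to output every row and column,
print the result of the "to_string()" method or raise the "display.max_rows" and
"display.max_columns" options.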

5. Display summary or metadata: To display the summary or metadata of the dataset, you can
use the "info()" method or "describe()" method. The "info()" method provides information
about the DataFrame, including the number of rows and columns, column data types,
non-null counts, and memory usage. The "describe()" method provides a statistical summary
of the numeric columns by default, including count, mean, standard deviation, minimum,
maximum, and quartiles for each column.
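
The five steps above can be sketched as follows. To keep the example self-contained, it reads a small in-memory CSV through io.StringIO instead of a file on disk. The column names and rows are hypothetical sample data; with a real dataset, pass the file path to read_csv() instead.

```python
import io

import pandas as pd  # Step 1: import the Pandas library

# Hypothetical CSV content standing in for a dataset file.
csv_text = """name,age,city
Asha,21,Rajkot
Ravi,22,Surat
Meera,20,Vadodara
Kiran,23,Ahmedabad
Dev,21,Gandhinagar
Nita,22,Bhavnagar
"""

# Step 2: load the dataset into a DataFrame.
# With a real file this would be, for example, pd.read_csv("path/to/file.csv").
df = pd.read_csv(io.StringIO(csv_text))

# Step 3: first five rows (head() returns five rows by default).
first_five = df.head()
print(first_five)

# Step 4: complete data; to_string() avoids truncation of large DataFrames.
print(df.to_string())

# Step 5: metadata and statistical summary.
df.info()                # prints column names, dtypes, non-null counts, memory usage
summary = df.describe()  # statistics for the numeric columns only
print(summary)
```

In this sample only the age column is numeric, so describe() reports statistics for that column alone; calling describe(include="all") would also summarise the text columns.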

Observations: Put Output of the program

Conclusion:

Quiz:
1. What library should be used to read a .csv dataset file in Python?
2. Which function is used to read a .csv file using the Pandas library?
3. How can you display the first five rows of the dataset using Pandas?
4. How can you display the complete data of the dataset using Pandas?
5. How can you display the summary or metadata of the dataset using Pandas?

Suggested Reference:
1. Official Pandas documentation: https://pandas.pydata.org/docs/
2. "Python for Data Analysis" by Wes McKinney:
https://www.oreilly.com/library/view/python-for-data/9781491957653/
3. "Python Data Science Handbook" by Jake VanderPlas:
https://jakevdp.github.io/PythonDataScienceHandbook/
4. Pandas tutorial by DataCamp:
https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python

References used by the students:

Rubric wise marks obtained:

Rubric | Good | Average / Satisfactory
1. Knowledge of subject (2) | Good (2) | Average (1)
2. Programming Skill (2) | Good (2) | Average (1)
3. Team work (2) | Good (2) | Satisfactory (1)
4. Communication Skill (2) | Good (2) | Satisfactory (1)
5. Ethics (2) | Good (2) | Average (1)
Total Marks:
