Week 1

Introduction and overview.


Variables, Data Types and Output.
Course Information

• Course Management System:


• Moodle
• Course Hours:
• 2 hours lecture / 2 hours lab
• Assessments:
• 1 midterm 40%
• 1 final 40%
• 10 lab assignments 20%

Lab Assignments

• Students must attend their scheduled lab and complete weekly hands-on
exercises during the lab session.
• There are 10 lab assignments to be completed during the weekly lab sessions.
• You will be graded for your performance and submission of your lab solutions.
• You must attend the lab session and upload your solution by the end of the
session to receive a grade.
• There are no make-ups for lab assignments. Special permission and medical
reports will not be accepted.
• You may not attend a lab other than the one to which you are registered.

Lab Grading and Attendance
Lab Attendance:
• If you do not attend a lab, your grade for that lab will be zero.
• There are no make-up assignments for lab sessions. However, when your lab
average is calculated, your lowest lab score is discarded and the average is
taken over your top 9 lab scores.
Lab Grading:
• Students who do not attend the weekly lectures cannot receive full points for
the lab sessions.
• To receive full points for the lab assignments each week, you must attend the
lecture.
• Students who DO NOT attend the lecture but DO attend the lab session may
receive AT MOST 50% for the lab assignment.
Textbook

There are 2 required textbooks for this course:

1. Python Data Science Handbook, Jake VanderPlas, O’Reilly. [download]


2. Fundamentals of Python Programming, Richard L. Halterman [download]

Both books are open source and can be downloaded for free from the links given
above.

What is Data Analysis?

• Data analysis is the process of examining, cleaning, transforming, and
interpreting data to discover useful information, draw conclusions, and support
decision-making.
• It involves applying statistical and computational techniques to identify
patterns, trends, and relationships within data sets.
• Helps organizations or individuals make informed choices, solve problems, and
uncover insights based on the data at hand.
• Goal of data science: gain insights and knowledge from data.
• Almost every field uses data science to analyze and interpret large sets of data.

Why Learn Programming and Data Analysis?

• Problem solving and critical thinking: helps you break down complex problems
into smaller, manageable parts.
• Data-driven decision making: understanding how to analyze data allows you to
interpret patterns, trends, and correlations that guide better decisions.
• Career Opportunities: skills open up numerous career paths and opportunities.
• Versatility: techniques can be applied to almost any field. Being able to program
and analyze data allows you to solve specific problems in your area of interest.

What is Programming?

• Computer programs are used to analyze data.


• A computer program is a set of instructions for a computer to execute.
• Programs are written in a variety of programming languages.
• A programming language is like any other language: it has a vocabulary and a set
of grammatical rules, which are used to instruct a computer or computing device to
perform specific tasks.

What is Python?

• Python is a general-purpose programming language that can be used for a variety
of applications.
• Python is popular because there are many ready-made tools that are available
for data analysis.
• In this course we will use Python version 3.12, and third-party tools for extra
functionality such as numpy, pandas and matplotlib.

Software
For the course, the tools that we need are included in the Anaconda Python 3.12
Distribution.
Anaconda contains:
• Python: a programming language in which we write computer programs.

• Python packages: For scientific computing and computational modelling, we need
additional libraries (packages) that are not part of the Python standard library.
These allow us to create plots, operate on matrices, analyze data and use
specialized numerical methods, e.g. numpy, matplotlib (pyplot), pandas, etc.

• Spyder: the default development environment used to write and execute Python
programs.

• Jupyter Notebook: another tool to develop Python programs.


How to Download Anaconda

• We recommend that you install the Anaconda Python 3.12 distribution using the
instructions found at the following link: Anaconda Python

• When you download Anaconda, you will have access to the tools used in the
course: Python, Spyder, Numpy, Pandas, Matplotlib.

Online Development Environment

• During the lab sessions you must use Spyder (installed on all lab
computers) to complete your work.
• For practice outside the labs, when you are away from your own
computer, we recommend the online tool replit.com. Replit requires
you to create an account, but it is free to use.
• We do not recommend using Anaconda Cloud for practice, as you will
not have access to this tool during labs and exams.

First Python Commands

• A Python program is made up of one or more statements/commands, which are
instructions telling the computer to do something.
• Like in any language, statements must be meaningful and follow certain rules.
• When we type a command, the Python interpreter must decide how to carry out the
command.
• If our command is not written correctly, the interpreter cannot understand what we
are asking, and will display an error message.
• Errors in the commands must be corrected for our statements to run (execute)
correctly.

First Python Commands – shell vs. script
Interactive Shell:
• Python commands (instructions) can be executed interactively using the shell (bottom
right corner of the Spyder window).
• The shell is useful for simple statements or to debug (find errors in) a larger program.
• The results of commands executed in the shell/console are automatically displayed.
Script Window:
• Python commands can also be placed in a file called a script. A script contains one or
more Python commands that are executed when the program is run.
• The results of commands placed in the script are only displayed to the console if the
programmer chooses to display them (by printing).
First Commands: Examples
Step 1: Open the development environment (Spyder).
Step 2: Type the command(s) in the shell.
Step 3: Evaluate the results.
Try the following:
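A few commands of the kind this step has in mind, as an illustrative sketch. The specific expressions and variable names are assumptions, chosen to match the arithmetic discussed on the next slide. In the shell you can type just the expression (for example 3 + 5) and the result is displayed automatically; here the results are stored and printed so the same lines also work in a script.

```python
# Illustrative expressions (assumed examples, not the original slide's).
total = 3 + 5        # addition
product = 3 * 5      # multiplication uses the * symbol
quotient = 7 / 2     # division with / always gives a float

print(total, product, quotient)   # 8 15 3.5
```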

Things to Notice
• Whitespace characters:
• Include spaces, newlines and tabs; they make programs and commands easier to read.
• Notice in the examples, we put spaces before and after the arithmetic operators ( 3 + 5 )
• This is an example of using whitespace to improve readability.
• The Python interpreter ignores these extra spaces between values and operators, as
they are for the reader of the program (us!) rather than the interpreter. (Spaces at the
start of a line are different: indentation has a meaning in Python.)
• With the spaces around the operators removed, the program would behave in the same way.

• Also notice that for multiplication, we use the ‘*’ symbol: we write 3 * 5, not 3 x 5.
• These are examples of the syntax rules of Python, which we will discuss in
detail next week.
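A minimal sketch of the whitespace point, assuming two hypothetical variable names: the same expression written with and without spaces around the operator gives the same value.

```python
spaced = 3 * 5     # written with spaces, easier to read
compact = 3*5      # same expression without spaces

print(spaced, compact)   # 15 15
```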
print() function

Python provides functions to carry out common tasks; the print function
is an example of this.
The print() function allows us to print messages to the console
window from our scripts.
print() can display text strings, numbers or the results of calculations.
• Text strings should be between single or double quotes.
• To output multiple values, we can separate them with a comma.

First Commands: print()
• Try the following in a script:
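A short script along these lines could be used; the messages and the names label and result are illustrative assumptions. It shows a string in each kind of quote, a number, a calculation, and several values separated by commas.

```python
print("Hello World!")     # text string in double quotes
print('Week 1')           # single quotes work the same way
print(42)                 # a number
print(3 + 5)              # a calculation: the result is printed

label = "3 + 5 ="
result = 3 + 5
print(label, result)      # values separated by a comma are printed with a space between them
```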

Data in Data Science
Data science is an attempt to understand, model, and manipulate the world around
us through quantifiable observations (data).

Data is an abstraction: a symbol or number that represents something more
complex.

In Python, this data is stored in objects and every piece of data, regardless of its
type, is represented internally as an object.

Objects

All data in Python is stored as objects and programs manipulate objects.


Every object has a type that defines the kinds of things programs can do with it.
For example:
• The value 5 is an integer and it can be multiplied, added, etc.

• ‘Hello world’ is a string that can be searched, sliced.

There are two general types of objects:


• scalar (cannot be subdivided)

• non-scalar (have internal structure that can be accessed)
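The idea that the type decides what can be done with an object can be sketched as follows. The variable names are assumptions; type(), the in operator and slicing with [start:end] are standard Python.

```python
number = 5
text = 'Hello world'

number_type = type(number)     # int
text_type = type(text)         # str

doubled = number * 2           # integers can be multiplied
has_world = 'world' in text    # strings can be searched
first_word = text[0:5]         # strings can be sliced (non-scalar)

print(number_type, text_type)
print(doubled, has_world, first_word)   # 10 True Hello
```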

Numeric Data Types – int, float

Python has two commonly used numeric data types, which store integer and floating-point values.
int is the data type of objects that store integer values. Examples: 4, 11, 73, 1243.
float is the data type that stores floating point values. Examples: 44.2, 79.435, 2.6.
Numeric objects (int, float) are used in arithmetic operations.
int and float objects are called scalar types, because they store a single value.
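A small illustration with assumed names and values: mixing an int and a float in arithmetic produces a float.

```python
count = 4              # int
price = 44.2           # float
mixed = count + 0.5    # int + float gives a float

print(type(count), type(price), type(mixed))
print(mixed)           # 4.5
```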

Text/String Data Type – str

Text values such as ‘CS125’, ‘Hello World!’, ‘4’ are called strings in python.
The data type determines what we can do with the object: strings can be joined,
but we cannot subtract one string from another.
Any values inside double or single quotes are interpreted as strings and not
numbers.
We cannot use ‘4’ in the same way we use 4.
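The difference between ‘4’ and 4 can be seen in a sketch like this one (names are assumptions). The try/except lines are only there so the example keeps running after the error; they are not needed to understand the point.

```python
course = 'CS' + '125'     # strings can be joined
as_text = '4' + '4'       # joining two strings
as_number = 4 + 4         # adding two integers

try:
    'CS125' - '125'       # subtraction is not defined for strings
except TypeError as error:
    subtraction_error = type(error).__name__

print(course, as_text, as_number)   # CS125 44 8
print(subtraction_error)            # TypeError
```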

Variables
Our programs (scripts) store data that will be used and manipulated.
In order to access the data, we should store it somewhere and give it a name.
Variables are named locations that store values used in a program.
Variables have a name, a value and a type.

name of the variable: a
value of the variable: 2
type of the variable: int
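In code, the variable described above could be created and inspected like this (a minimal sketch):

```python
a = 2              # name: a, value: 2

print(a)           # 2
print(type(a))     # <class 'int'>
```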

Naming Variables

Python has strict rules for naming variables (identifiers).


Rules are:
• Identifiers must contain at least one character.

• The first character of an identifier must be an alphabetic letter (upper or lowercase)
or an underscore.
• The remaining characters (if any) may be alphabetic characters (upper or
lowercase), the underscore, or a digit.
• No other characters (including spaces) are permitted in identifiers.

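• A reserved word cannot be used as an identifier. A reserved word (keyword) is a
word with a special meaning in the language, such as if, for, while, etc.
• If you accidentally use a reserved word as a variable name, Python will give an
error.
• Names of built-in types and functions, such as int, float and print, are not
reserved words. Python will not give an error if you use them as variable names,
but doing so hides the built-in, so avoid it.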
Valid Variable Names
Which of the following names are valid variable names:
• first_name
• sub-total
• first entry
• Section1
• 4all
• *2
• classSize
• LOCATION
• int

Valid Variable Names
Which of the following names are valid variable names:
• first_name (VALID)
• sub-total (INVALID - dash is not a legal symbol in an identifier)
• first entry (INVALID - space is not a legal symbol in an identifier)
• Section1 (VALID)
• 4all (INVALID - begins with a digit)
• *2 (INVALID - the asterisk is not a legal symbol in an identifier)
• classSize (VALID)
• LOCATION (VALID)
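• int (VALID, but should be avoided - int is the name of a built-in type, not a reserved word; using it as a variable name hides the built-in)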
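The rules can be checked with Python itself. This is a sketch, assuming the standard str.isidentifier() method and the standard keyword module; the name for is added to the list as an extra, hypothetical example of a real reserved word. Note that int is reported as usable: it is a built-in name rather than a reserved word, so Python accepts it, although assigning to it hides the built-in int() function.

```python
import keyword

candidates = ['first_name', 'sub-total', 'first entry', 'Section1',
              '4all', '*2', 'classSize', 'LOCATION', 'int', 'for']

results = {}
for name in candidates:
    follows_rules = name.isidentifier()       # characters and first-character rules
    is_reserved = keyword.iskeyword(name)     # reserved words such as for, if, while
    results[name] = follows_rules and not is_reserved
    print(name, '->', 'valid' if results[name] else 'invalid')
```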

Naming Guidelines

To make your programs more understandable, it is recommended you use the
following guidelines:
• Give your variables meaningful names, which indicate the purpose of the data
stored.
• Name your variables using lowercase letters.
• If your name includes multiple words, separate the words with an underscore.
• Name should not be too long or too short (meaningful!)

Assignment – storing values in variables
Variables store data (objects).
To store an object with a given name, we assign the value to the variable using the
assignment operator (=).
An assignment statement assigns a value to a variable.
Note: The meaning of the assignment operator is different from equality in mathematics.
In Python, the = sign assigns the value of the expression on the right to the variable on the left.
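A statement that would make no sense as a mathematical equation shows the difference. In this sketch (the name count is an assumption), the right-hand side is evaluated first and the result is then stored under the name on the left.

```python
count = 10           # store 10 under the name count
count = count + 1    # evaluate 10 + 1, then store the result in count

print(count)         # 11
```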

Data Types – Variable Explorer
The value stored in each variable has a type.
We can see the type of a value (how Python will interpret the value) in the Variable
Explorer window.
Example:
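A sketch of the kind of variables one might inspect; the names and values are illustrative assumptions. The built-in type() function reports the same type information that the Variable Explorer shows.

```python
age = 20           # int
height = 1.75      # float: the dot is the decimal point
name = 'Ada'       # str
pair = 1,75        # with a comma, Python stores two separate integers (a tuple)

print(type(age), type(height), type(name), type(pair))
print(pair)        # (1, 75)
```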

Note: Python recognizes the dot (.) as the decimal point. Don’t use a comma:
the interpreter does not interpret it as a decimal point.
Assignment - Variables
Variable: an element that may vary or change.
The values stored in variables may change throughout the execution of a program.
If the new value is of a different type, the type of a variable will change.
When a variable is assigned a new value, the new value overwrites the existing
value.
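Reassignment can be sketched as follows (the name score and its values are assumptions): the old value is overwritten, and the type follows the new value.

```python
score = 90
first_type = type(score)     # int

score = 'ninety'             # the new value overwrites 90
second_type = type(score)    # str

print(first_type, second_type, score)
```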

Type Casting – Changing the Type of a Value
Sometimes we need to convert a value from one type to another to change how
it is used.
Each type has a special command (called a function) that we can use to convert a
value to that type.
For example, if we want to convert the string value ‘5’ to an integer value, we
use the int() function.
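A minimal sketch of why the conversion matters, with assumed variable names: the same + operator joins strings but adds integers.

```python
text_value = '5'
number_value = int(text_value)       # convert the string '5' to the integer 5

joined = text_value + text_value     # strings are joined
added = number_value + number_value  # integers are added

print(joined, added)                 # 55 10
```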

Type Casting – Changing the Type of a Value
Use caution when type casting: each value can only be cast to a suitable type.
For example, the string value ‘5’ has the integer representation 5 and the float
representation 5.0.
The string value ‘abc’ has no integer representation, so trying to convert it to an int
causes an error (a ValueError).
IMPORTANT: If we convert a float value to an integer, the value is truncated:
it is NOT rounded, and any decimals are dropped.
Examples:
int(5.2) -> 5
int(5.8) -> 5
float(3) -> 3.0
str(5) -> ‘5’
str(3.7) -> ‘3.7’
int(‘abc’) -> ERROR (ValueError)
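The same examples as runnable code, as a sketch with assumed variable names. round() is included only for contrast with truncation, and the try/except lines are there so the script continues after the failed conversion.

```python
truncated_low = int(5.2)      # 5
truncated_high = int(5.8)     # 5, the decimals are dropped, not rounded
rounded = round(5.8)          # 6, if rounding is what you actually want
as_float = float(3)           # 3.0
as_text = str(3.7)            # '3.7'

try:
    int('abc')                # 'abc' has no integer representation
except ValueError as error:
    message = str(error)

print(truncated_low, truncated_high, rounded, as_float, as_text)
print('ValueError:', message)
```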
