01 - CM2015 - Introduction To Data Programming (2022-10)
Topic 1
Introduction to Data Programming
Learning Outcomes
After completing this topic and the recommended reading, you should be able
to:
• Set up and run Jupyter Notebook on a Windows, Mac or Linux operating
system.
• Use Jupyter Notebook to write and edit code.
• Write and explain simple Python programs using variables and
mathematical operators.
Data (definition)
• “Facts and statistics collected together for reference or analysis.”
[Oxford English Dictionary]
Information (definition)
• “Facts provided or learned about something or someone.”
[Oxford English Dictionary]
Data Science
• Programming
o The process of producing an executable computer program that
performs a specific task.
o The purpose is to find a sequence of instructions that automate the
implementation of the task for solving a given problem.
• Programming Language
o The source code of a program is written in one or more languages
that are intelligible to humans, rather than machine code, which is
directly executed by the CPU.
o Python
§ https://www.python.org/
Source-code Editors
• A source-code editor, or programming text editor, is a fundamental
programming tool designed specifically for editing the source code of
computer programs.
• It highlights the syntax elements of your programs, and provides many
features that aid in your program development.
• Examples:
o Visual Studio Code [https://code.visualstudio.com/]
o Notepad++ (Windows only) [https://notepad-plus-plus.org/]
o Vim [https://www.vim.org/]
o Sublime Text (not open source) [https://www.sublimetext.com/]
o Atom [https://atom.io/]
o Emacs [https://www.gnu.org/software/emacs/]
o TextMate (Macs only) [https://macromates.com/]
o Jupyter
§ https://jupyter.org/
o RStudio [https://rstudio.com]
o Eclipse [https://www.eclipse.org/]
o Microsoft Visual Studio [https://visualstudio.microsoft.com/vs/]
o Wing Python IDE [https://wingware.com]
o Git
§ https://git-scm.com/
o GitHub
§ https://github.com/
Package/Environment Manager
• A package manager, or package management system, is a collection of
software tools that automates the process of installing, upgrading,
configuring, and removing computer programs on a computer in a
consistent manner. It works with packages: distributions of software
and data in archive files.
• An environment manager creates and switches between isolated
environments, each with its own set of packages and versions, so that
projects on the same computer do not interfere with one another.
• Example:
o Anaconda
§ https://www.anaconda.com/
Installing Anaconda
• Go to the Anaconda website and download Anaconda Individual Edition
o https://www.anaconda.com/products/distribution
• Packages include
o conda
§ package management system
3. Introduction to Python
Variables
• A variable is a named piece of memory whose value can change while the
program runs; a constant is a value that cannot change as the program
runs.
o Python has no built-in constants; by convention, names written in
upper case signal values that should not be changed.
• We use variable names to represent objects (numbers, data structures,
functions, etc.) in our program, to make our program more readable.
o All variable names must be one word; spaces are never allowed.
o Can only contain alphanumeric characters and underscores.
o Must start with a letter or the underscore character.
o Cannot begin with a number.
o Case-sensitive
o The standard style for most names in Python is lower_with_under
§ Lower case with separate words joined by an underscore
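The naming rules above can be tried out with a short sketch. It uses the built-in str.isidentifier() method to test each candidate name; the names themselves are hypothetical and chosen only for illustration. Note that isidentifier() checks the syntax of a name only, so it does not reject reserved words such as "for".

```python
# Hypothetical example names, chosen only to illustrate the rules above.
candidates = ["total_count", "_hidden", "value2", "2nd_value", "my value", "my-value"]

for name in candidates:
    # str.isidentifier() reports whether the string is a syntactically valid name.
    print(name, name.isidentifier())

# Names are case-sensitive: rate and Rate are two different variables.
rate = 0.5
Rate = 2
print(rate, Rate)
```

The first three names pass; the last three fail because they start with a digit, contain a space, or contain a hyphen.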
Comments
Python Operations
• Assignment Operator
o “=”
o Example:
§ a = 67890/12345
# compute the ratio, store the result in memory, assign it to a
# the value of a is approximately 5.499392
§ b = a
# b now refers to the same value as a
• Output
o “print()”
o Example:
§ print('Hello World!') # print the string literal
§ print(a) # print the value of a
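As a minimal runnable sketch of assignment and output together, the following uses the same values as the examples above. The "is" check is an addition for illustration: it shows that b = a makes both names refer to the same object and does not copy the value.

```python
a = 67890 / 12345  # compute the ratio and bind the name a to the result
b = a              # b now refers to the same object as a

print("Hello World!")  # print a string literal
print(a)               # print the value of a
print(round(a, 6))     # the same value rounded to six decimal places
print(b is a)          # True: both names refer to one object
```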
• Float
o Stores real numbers
o a = 4.6
o print(type(a))
• Integer
o Stores integers
o b = 10
o print(type(b))
• Conversion
o int(a) # convert float to int => 4
o float(b) # convert int to float => 10.0
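The float, integer and conversion examples above can be run as one short sketch. The negative value and the round() call are additions for illustration: they show that int() truncates toward zero and does not round.

```python
a = 4.6
b = 10

print(type(a))  # <class 'float'>
print(type(b))  # <class 'int'>

print(int(a))    # 4: the fractional part is discarded
print(float(b))  # 10.0

# int() truncates toward zero; use round() for the nearest integer.
print(int(-4.6))  # -4
print(round(a))   # 5
```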
• String
o Stores strings
o phrase = 'All models are wrong, but some are useful.'
o phrase[0:3] # slice characters 0 up to 2
=> 'All'
o phrase.find('models') # find the starting index of the word
=> 4
o phrase.find('right') # word not found
=> -1
o phrase.lower() # convert to lower case
=> 'all models are wrong, but some are useful.'
o phrase.upper() # convert to upper case
=> 'ALL MODELS ARE WRONG, BUT SOME ARE USEFUL.'
o phrase.split(',') # split the string into a list, based on a delimiter
=> ['All models are wrong', ' but some are useful.']
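The string operations above can be collected into a single runnable sketch. The strip() step at the end is an addition for illustration: it removes the leading space that split(',') leaves on the second piece.

```python
phrase = "All models are wrong, but some are useful."

print(phrase[0:3])            # All
print(phrase.find("models"))  # 4
print(phrase.find("right"))   # -1
print(phrase.lower())
print(phrase.upper())

parts = phrase.split(",")
print(parts)

# The second piece keeps its leading space; strip() removes it.
cleaned = [part.strip() for part in parts]
print(cleaned)

# Strings are immutable: lower() and upper() return new strings,
# so phrase itself is unchanged.
print(phrase)
```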
• Boolean
o Stores the logical (Boolean) values True or False
o k = 1 > 3
o print(k)
o print(type(k))
• Logical operators
o Conjunction (AND): “and”
o Disjunction (OR): “or”
o Negation (NOT): “not”
a    b    a and b    a or b    not a
T    T    T          T         F
T    F    F          T         F
F    T    F          T         T
F    F    F          F         T
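The Boolean example and the truth table can be reproduced with a short sketch. It uses itertools.product from the standard library to generate every combination of a and b; the variable name rows is chosen only for illustration.

```python
from itertools import product

k = 1 > 3
print(k)        # False
print(type(k))  # <class 'bool'>

# Build one row per combination of a and b, in the same order as the table.
rows = [(a, b, a and b, a or b, not a)
        for a, b in product([True, False], repeat=2)]

print("a", "b", "a and b", "a or b", "not a", sep="\t")
for row in rows:
    print(*row, sep="\t")
```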
• Lists
• Sets
o Store unordered, unindexed collections of objects with no duplicates
o Written with curly brackets “{ }”
§ set1 = {"apple", "banana", "cherry"}
§ set2 = {"apple", "samsung"}
o Set operations
§ set1.union(set2) # union of both sets
=> {'apple', 'banana', 'cherry', 'samsung'}
§ set1.intersection(set2) # intersection of both sets
=> {'apple'}
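A minimal sketch of the set operations above, using the same example sets. The duplicate check and the sorted() call are additions for illustration: because sets are unordered, the printed order of a set may differ between runs, while sorted() gives a predictable listing.

```python
set1 = {"apple", "banana", "cherry"}
set2 = {"apple", "samsung"}

combined = set1.union(set2)
common = set1.intersection(set2)

print(combined)          # element order may vary
print(common)            # {'apple'}
print(sorted(combined))  # a list in alphabetical order

# Duplicates are dropped when a set is created.
fruit = {"apple", "apple", "banana"}
print(len(fruit))        # 2
```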
• Dictionaries
o Store collections of key:value pairs (insertion order is preserved
from Python 3.7 onwards)
o Written with curly brackets “{ }” and “key: value” pairs
§ thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}
o Accessing/modifying elements by key name
§ thisdict["model"]
=> 'Mustang'
§ thisdict["year"] = 2018
§ thisdict["color"] = "red"
=> {'brand': 'Ford', 'model': 'Mustang', 'year': 2018, 'color': 'red'}
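The dictionary example can be run as a short sketch. The get() call and the "owner" key are additions for illustration only: get() returns a default value for a missing key, whereas indexing with square brackets raises a KeyError.

```python
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}

print(thisdict["model"])  # Mustang

thisdict["year"] = 2018    # modify an existing key
thisdict["color"] = "red"  # add a new key
print(thisdict)

# "owner" is a hypothetical key that is not in the dictionary.
owner = thisdict.get("owner", "unknown")
print(owner)  # unknown

# Loop over the key:value pairs.
for key, value in thisdict.items():
    print(key, value)
```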
5. Exercises
6. Practice Quiz
• Work on Practice Quiz 01 posted on Canvas.
Useful Resources
•
o http://