BUS 601 Python for Analytics

Course Introduction

By Dr. Qi Li

BUS 601
Computer and Data Analytics
• Computers can perform calculations and make logical decisions
phenomenally faster than human beings can
• Today’s personal computers can perform billions of calculations in one
second
• Supercomputers are already performing thousands of trillions (quadrillions) of
instructions per second!
• Computers process data under the control of sequences of instructions called
computer programs
• Programs guide the computer through ordered actions specified by computer
programmers
• A computer consists of various physical devices referred to as hardware
(keyboard, screen, mouse, solid-state disks, hard disks, memory, DVD drives
and processing units)
• Moore’s Law
– For many decades, hardware costs have fallen rapidly
– Every year or two, the capacities of computers have approximately doubled inexpensively

Computer Architecture
• Von Neumann architecture

Input and Output
• Input Unit:
– Obtains information (data and computer programs)
from input devices and places it at the disposal of the
other units for processing
• Output Unit:
– Takes information the computer has processed and
places it on various output devices to make it
available for use outside the computer

Memory Unit
• Rapid-access, relatively low-capacity “warehouse” section
retains information that has been entered through the input unit,
making it immediately available for processing when needed
• Also retains processed information until it can be placed on
output devices by the output unit
• Information in the memory unit is volatile—it’s typically lost
when the computer’s power is turned off
• Called memory, primary memory or RAM (Random Access
Memory)
• Main memories on desktop and notebook computers contain as
much as 128 GB of RAM, though 8 to 16 GB is most common
– A gigabyte is approximately one billion bytes
– A byte is eight bits
– A bit is either a 0 or a 1
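The unit relationships above can be checked with a little arithmetic. This is a minimal sketch: the constant names are illustrative, and it assumes the decimal convention (1 GB = 10^9 bytes) that "approximately one billion bytes" suggests.

```python
BITS_PER_BYTE = 8
BYTES_PER_GB = 10**9  # decimal gigabyte; a binary gibibyte would be 2**30 bytes

ram_gb = 16  # a common amount of RAM on desktop and notebook computers
ram_bytes = ram_gb * BYTES_PER_GB
ram_bits = ram_bytes * BITS_PER_BYTE

print(f"{ram_gb} GB = {ram_bytes:,} bytes = {ram_bits:,} bits")
```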
Arithmetic and Logic Unit (ALU)
• “Manufacturing” section
• Performs calculations, such as addition,
subtraction, multiplication and division
• Also contains the decision mechanisms that
allow the computer, for example, to compare
two items from the memory unit to determine
whether they’re equal
• In today’s systems, the ALU is part of the next
logical unit, the CPU

Central Processing Unit (CPU)
• “Administrative” section
• Coordinates and supervises the operation of the other sections
• Tells the input unit when information should be read into the
memory unit
• Tells the ALU when information from the memory unit should be
used in calculations
• Tells the output unit when to send information from the memory
unit to specific output devices
• Most computers have multicore processors that implement
multiple processors on a single integrated-circuit chip
• A dual-core processor has two CPUs, a quad-core processor has
four and an octa-core processor has eight
– Intel has some processors with up to 72 cores
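To see how many processors your own machine exposes, Python's standard library offers a quick check. A small sketch; note that the value is the number of logical CPUs, which can be higher than the number of physical cores.

```python
import os

# os.cpu_count() reports logical CPUs; it may return None if the count
# cannot be determined on this platform.
logical_cpus = os.cpu_count()
print(f"Logical CPUs visible to Python: {logical_cpus}")
```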
Secondary Storage Unit
• Long-term, high-capacity “warehousing” section
• Programs or data not actively being used by the other units
normally are placed on secondary storage devices until they’re
needed
• Information on secondary storage devices is persistent—it’s
preserved even when the computer’s power is turned off
• Secondary storage information takes much longer to access than
information in primary memory, but its cost per unit is much less
• Many current drives hold terabytes (TB) of data
– A terabyte is approximately one trillion bytes
– Typical hard drives on desktop and notebook computers hold up to 4 TB,
and some recent desktop-computer hard drives hold up to 15 TB

Data Hierarchy

Bits
• A bit (short for “binary digit”—a digit that can
assume one of two values) is the smallest data
item in a computer
• Can have the value 0 or 1
• Bits are the basis of the binary number system
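Python can show how bits combine into binary numbers. The short sketch below uses only built-in features; the variable names are illustrative.

```python
value = 0b1011               # binary literal: 1*8 + 0*4 + 1*2 + 1*1
print(value)                 # 11
print(bin(11))               # 0b1011
print(int("1011", 2))        # 11 (parse a string of bits as base 2)
print(value.bit_length())    # 4 (bits needed to represent 11)

values_in_one_byte = 2 ** 8  # eight bits can represent 256 distinct patterns
print(values_in_one_byte)    # 256
```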

Characters
• Decimal digits (0–9), letters (A–Z and a–z) and special symbols
such as $ @ % & * ( ) – + " : ; , ? /
• A computer’s character set contains the characters used to write
programs and represent data items
• Computers process only 1s and 0s, so a character set represents
every character as a pattern of 1s and 0s
• Python uses Unicode® characters composed of one, two, three or
four bytes (8, 16, 24 or 32 bits, respectively)—known as UTF-8
encoding
• The ASCII (American Standard Code for Information
Interchange) character set is a subset of Unicode that represents
letters (a–z and A–Z), digits and some common special
characters
• ASCII subset of Unicode
• Unicode charts for all languages, symbols, emojis and more
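The one-to-four-byte range of UTF-8 is easy to observe from Python. In this sketch the four sample characters are chosen for illustration (an ASCII letter, an accented letter, the euro sign and an emoji), written as escapes so the code does not depend on the file's own encoding.

```python
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji

byte_lengths = []
for ch in samples:
    encoded = ch.encode("utf-8")       # the character's UTF-8 bytes
    byte_lengths.append(len(encoded))
    print(f"U+{ord(ch):04X} uses {len(encoded)} byte(s): {encoded.hex()}")

# ASCII characters keep their one-byte pattern in UTF-8
ascii_bits = format(ord("A"), "08b")
print(ascii_bits)  # 01000001
```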
Fields
• Fields are composed of characters or bytes
• A field is a group of characters or bytes that conveys
meaning
– a person’s name
– a person’s age
– etc.

Records
• A record is a group of related fields
• A record for an employee might consist of
– Employee identification number (a whole number)
– Name (a string of characters)
– Address (a string of characters)
– Hourly pay rate (a number with a decimal point)
– Year-to-date earnings (a number with a decimal point)
– Amount of taxes withheld (a number with a decimal point)
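One way to picture the employee record in Python is as a data class whose attributes are the fields. This is only a sketch: the class name, attribute names and sample values are invented for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class EmployeeRecord:
    employee_id: int        # a whole number
    name: str               # a string of characters
    address: str            # a string of characters
    hourly_pay_rate: float  # a number with a decimal point
    ytd_earnings: float     # a number with a decimal point
    taxes_withheld: float   # a number with a decimal point


# Hypothetical sample data
record = EmployeeRecord(101, "Ana Diaz", "12 Elm St", 32.50, 41250.00, 8662.50)

field_names = [f.name for f in fields(record)]
print(field_names)
print(record.name, record.hourly_pay_rate)
```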

Files
• A file is a group of related records
• More generally, a file contains arbitrary data in arbitrary
formats
• Any organization of the bytes in a file, such as
organizing the data into records, is a view created by
the application programmer
• Not unusual for an organization to have many files,
some containing billions, or even trillions, of characters
of information
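The idea that records are a view imposed by the programmer can be sketched with the standard csv module. The text below stands in for a file's contents (an in-memory buffer is assumed so that the example needs no file on disk), and the column names and values are invented.

```python
import csv
import io

# To the computer this is just a sequence of characters (bytes on disk)
raw_text = (
    "employee_id,name,hourly_pay_rate\n"
    "101,Ana Diaz,32.50\n"
    "102,Raj Patel,41.00\n"
)

# The program chooses to view those characters as records made of fields
file_like = io.StringIO(raw_text)
records = list(csv.DictReader(file_like))

print(len(raw_text.encode("utf-8")), "bytes in the file")
for rec in records:
    print(rec["employee_id"], rec["name"], rec["hourly_pay_rate"])
```

Note that csv reads every field as a string; converting the pay rate to a number is up to the program.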

Databases
• A database is a collection of data organized for
easy access and manipulation
• Most popular model is the relational database, in
which data is stored in simple tables
• A table includes records and fields
• You can search, sort and otherwise manipulate
the data, based on its relationship to multiple
tables or databases
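A relational table can be tried out with Python's built-in sqlite3 module. The sketch below assumes an in-memory database and an invented employees table; it shows inserting records and then searching and sorting them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database held in RAM
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, hourly_pay_rate REAL)"
)

# Each tuple is one record; each position is one field (hypothetical data)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(101, "Ana Diaz", 32.5), (102, "Raj Patel", 41.0), (103, "Li Wei", 28.75)],
)

# Search (WHERE) and sort (ORDER BY) the table
rows = conn.execute(
    "SELECT name, hourly_pay_rate FROM employees "
    "WHERE hourly_pay_rate > ? ORDER BY hourly_pay_rate DESC",
    (30,),
).fetchall()
print(rows)  # [('Raj Patel', 41.0), ('Ana Diaz', 32.5)]

conn.close()
```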

Programming Language
• Programmers write instructions in various
programming languages
– Some directly understandable by computers
– Others require intermediate translation steps
• Three general types
– Machine languages
– Assembly languages
– High-level languages

Machine Language
• Any computer understands only its own machine
language, defined by its hardware design
• Generally consist of strings of numbers (ultimately 1s
and 0s) that instruct computers to perform their most
elementary operations
• Cumbersome for humans
• Section of an early machine-language payroll program
that adds overtime pay to base pay and stores the result
in gross pay
– +1300042774
– +1400593419
– +1200274027
Assembly Languages and Assemblers
• English-like abbreviations to represent elementary
operations
• Formed the basis of assembly languages
• Assemblers were developed to convert assembly-
language programs to machine language at computer
speeds
• Section of an assembly-language payroll program that
adds overtime pay to base pay and stores the result in
gross pay
– load basepay
– add overpay
– store grosspay
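To make the three assembly instructions concrete, here is a toy simulation in Python. It is not a real assembler or any real machine: the single accumulator register, the dictionary standing in for memory and the pay amounts are all assumptions made for illustration.

```python
# "Memory" holding named values (hypothetical amounts)
memory = {"basepay": 800.0, "overpay": 150.0, "grosspay": 0.0}

program = ["load basepay", "add overpay", "store grosspay"]

accumulator = 0.0  # a single working register
for instruction in program:
    operation, operand = instruction.split()
    if operation == "load":      # copy a value from memory into the register
        accumulator = memory[operand]
    elif operation == "add":     # add a value from memory to the register
        accumulator += memory[operand]
    elif operation == "store":   # copy the register back into memory
        memory[operand] = accumulator
    else:
        raise ValueError(f"Unknown operation: {operation}")

print(memory["grosspay"])  # 950.0
```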
High-Level Languages and Compilers
• With the advent of assembly languages, computer usage increased
• Programmers still needed numerous instructions to accomplish even
simple tasks. High-level languages enable single statements to
accomplish substantial tasks.
• A typical high-level-language program contains many statements,
known as the program’s source code
• Compilers convert high-level language programs into machine language.
• High-level languages look almost like everyday English and contain
commonly used mathematical notations.
• Payroll program written in a high-level language might contain a
statement such as grossPay = basePay + overTimePay
• Python is among the world’s most widely used high-level
programming languages
Interpreters
• Interpreter programs execute high-level language
programs directly and avoid the delay of compilation
• Interpreted programs generally run slower than compiled
programs
• Most widely used Python implementation—CPython—
uses a clever mixture of compilation and interpretation
to run programs
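CPython's two steps can be observed with the built-in compile and exec functions. This sketch compiles a one-line payroll statement to bytecode and then executes it; the variable names and amounts are invented, and the bytecode's size and contents vary between Python versions.

```python
source = "gross_pay = base_pay + overtime_pay"

# Step 1: compilation. CPython translates the source text into bytecode.
code_object = compile(source, "<payroll>", "exec")
print(type(code_object).__name__)                   # code
print(len(code_object.co_code), "bytes of bytecode")

# Step 2: interpretation. The bytecode is executed by the interpreter.
namespace = {"base_pay": 800.0, "overtime_pay": 150.0}
exec(code_object, namespace)
print(namespace["gross_pay"])                       # 950.0
```

For a readable listing of the bytecode instructions, the standard dis module provides dis.dis(code_object).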

Operating Systems
• Make using computers more convenient for users, application
developers and system administrators
• Provide services that allow each application to execute safely,
efficiently and concurrently with other applications
• Core components of the operating system are implemented in the
kernel
• Linux, Windows and macOS are popular desktop computer
operating systems
• Google’s Android and Apple’s iOS are the most popular mobile
operating systems

Why Python
• Open source, free and widely available with a massive open-source
community
• Massive numbers of free open-source Python applications
• Easier to learn than many other languages, enabling novices and professional
developers to get up to speed quickly
• Easier to read than many other popular programming languages
• Widely used in education, web development (e.g., Django, Flask), the financial
community and artificial intelligence
• Enhances developer productivity with extensive standard libraries and third-
party open-source libraries
– Programmers can write code faster and perform complex tasks with minimal code
• Supports the popular procedural, functional-style and object-oriented
programming styles
• Build anything from simple scripts to complex apps with massive numbers of
users, such as Dropbox, YouTube, Reddit, Instagram and Quora
• Extensive job market for Python programmers across
many disciplines, especially in data-science-oriented positions, and Python
jobs are among the highest paid of all programming jobs
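The three programming styles mentioned above can be compared on one small task. The sketch below totals overtime hours three ways; the 40-hour threshold, the sample hours and the Timesheet class are assumptions made for illustration.

```python
OVERTIME_THRESHOLD = 40          # assumed weekly threshold
weekly_hours = [38, 45, 40, 42]  # hypothetical data

# Procedural style: an explicit loop and a running total
overtime_procedural = 0
for hours in weekly_hours:
    if hours > OVERTIME_THRESHOLD:
        overtime_procedural += hours - OVERTIME_THRESHOLD

# Functional style: apply a function to every item, then reduce with sum
overtime_functional = sum(
    map(lambda hours: max(hours - OVERTIME_THRESHOLD, 0), weekly_hours)
)


# Object-oriented style: bundle the data with the behavior that uses it
class Timesheet:
    def __init__(self, weekly_hours):
        self.weekly_hours = list(weekly_hours)

    def overtime_hours(self):
        return sum(max(h - OVERTIME_THRESHOLD, 0) for h in self.weekly_hours)


overtime_oo = Timesheet(weekly_hours).overtime_hours()

print(overtime_procedural, overtime_functional, overtime_oo)  # 7 7 7
```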
Anaconda Python Distribution
• Easy to install on Windows, macOS and Linux and
supports the latest versions of Python, the IPython
interpreter and Jupyter Notebooks
• Also includes other software packages and libraries
commonly used in Python programming and data
science
• IPython interpreter

Other Popular Programming Languages
• Basic
– Developed in the 1960s at Dartmouth College to familiarize novices with programming techniques
– Many of its latest versions are object-oriented
• C
– Developed in the early 1970s by Dennis Ritchie at Bell Laboratories
– Initially became widely known as the UNIX operating system’s development language
– General-purpose operating systems and other performance-critical systems often are written in C or C++
• C++
– Based on C
– Developed by Bjarne Stroustrup in the early 1980s at Bell Laboratories
– Enhances C and adds capabilities for object-oriented programming
• Java
– Sun Microsystems in 1991 funded an internal corporate research project led by James Gosling, which
resulted in the C++-based object-oriented programming language called Java
– “Write once, run anywhere”: enables developers to write programs that will run on a great variety of
computer systems
– Used in enterprise applications, to enhance the functionality of web servers, to provide applications for
consumer devices (e.g., smartphones, tablets, television set-top boxes, appliances, automobiles and more)
and for many other purposes
– Originally the key language for developing Android smartphone and tablet apps, though several other
languages are now supported
• C#
– Based on C++ and Java
– One of Microsoft’s three primary object-oriented programming languages—others are Visual C++ and
Visual Basic
– Developed to integrate the web into computer applications and is now widely used to develop many types
of applications
– Microsoft now offers open-source versions of C# and Visual Basic
• JavaScript
– Most widely used scripting language
– Primarily used to add programmability to web pages
– All major web browsers support it
– Many Python visualization libraries output JavaScript as part of visualizations that you can interact with
in your web browser
– Tools like NodeJS also enable JavaScript to run outside of web browsers
• Swift
– Introduced in 2014
– Apple’s programming language for developing iOS and macOS apps
– A contemporary language that includes popular features from languages such as Objective-C, Java, C#,
Ruby, Python and others
– Open source, so it can be used on non-Apple platforms as well
• R
– A popular open-source programming language for statistical applications and visualization
– Python and R are the two most widely used data-science languages

Test-Drives: Using IPython and Jupyter Notebooks

• Test-drive the IPython interpreter in two modes:
– interactive mode—enter small bits of code called
snippets and immediately see their results
– script mode—execute code loaded from a file that
has the .py extension (short for Python)
• Such files are called scripts or programs
• Use browser-based Jupyter Notebook for writing
and executing Python code

IPython Interactive Mode
• Entering IPython in Interactive Mode
– Open a command-line window on your system
• On macOS, open a Terminal from the Applications folder’s Utilities
subfolder
• On Windows, open the Anaconda Prompt from the Start
menu
• On Linux, open your system’s Terminal or shell (this varies by Linux
distribution)
– Type ipython, then press Enter (or Return)
• Exiting Interactive Mode
– Type exit and press Enter to exit immediately
– Or press Ctrl + d (or control + d), then press Enter to confirm
– Or press Ctrl + d (or control + d) twice
Executing a Python Program Using the
IPython Interpreter
• Execute a script named RollDieDynamic.py that you’ll write in
Chapter 6
• .py extension indicates the file contains Python source code
• RollDieDynamic.py simulates rolling a six-sided die, presenting
a colorful animated visualization that dynamically graphs the
frequencies of each die face
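The core of a die-rolling simulation can be sketched in a few lines. This is not the RollDieDynamic.py script itself (it has no animation or graph); it only counts face frequencies, and the seed value and number of rolls are arbitrary choices.

```python
import random
from collections import Counter

random.seed(601)  # fixed seed so repeated runs produce the same rolls

# Roll a six-sided die 6000 times; randrange(1, 7) yields 1 through 6
rolls = [random.randrange(1, 7) for _ in range(6000)]
frequencies = Counter(rolls)

for face in range(1, 7):
    print(f"Face {face}: {frequencies[face]} rolls")
```

Saved in a file with the .py extension, code like this could be run in script mode instead of being typed at the interactive prompt.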

Summary
• Basic computer architecture
• Why Python
• Python Running Environment
• Python Lab
