BSM 461 Introduction To Big Data: Kevser Ovaz Akpınar, PHD
kovaz.sakarya.edu.tr
[email protected]
Agenda
• Numeric Types
• Strings
• Boolean Types
• Special Types
• Some keywords are reserved, such as ‘and’, ‘assert’, ‘break’, and ‘lambda’. A list of keywords is available at
https://docs.python.org/2.5/ref/keywords.html
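The reserved words can also be checked from inside Python. The sketch below uses the standard-library keyword module; the exact list depends on the interpreter version, and the name fnc is included only as a non-reserved example.

```python
import keyword

# keyword.kwlist holds the reserved words of the running interpreter.
reserved = keyword.kwlist
print(len(reserved), "reserved words")

# keyword.iskeyword() reports whether a given string is reserved.
for name in ["and", "assert", "break", "lambda", "fnc"]:
    print(name, keyword.iskeyword(name))
```

A reserved word cannot be used as a variable or function name.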
• Lambda functions
lambda parameters : expression
#lambda function 1
fnc = lambda x : x + 1
print(fnc(1))
#Output: 2
print(fnc(fnc(1)))
#Output: 3
#lambda function 2
fnc2 = lambda x, y : x + y
print(fnc2(4,7))
#Output: 11
print(fnc2(4,fnc(1)))
#Output: 6
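Lambda functions are often passed directly to other functions instead of being given a name. A minimal sketch, assuming a hypothetical list of (name, score) pairs made up for illustration:

```python
# Hypothetical sample data: (name, score) pairs.
records = [("ayse", 72), ("mehmet", 95), ("zeynep", 88)]

# A lambda as the sort key: order the pairs by score, highest first.
by_score = sorted(records, key=lambda pair: pair[1], reverse=True)
print(by_score)

# A lambda passed to map(): add 1 to every score.
bumped = list(map(lambda pair: pair[1] + 1, records))
print(bumped)
```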
• The Python pip toolkit: pip installs packages from the Python Package Index
(PyPI), an open-source repository that programmers contribute to. Sample packages
include ones that read and write JSON, and requests for working with web services.
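A minimal sketch of reading and writing JSON. It uses the json module that ships with Python, so nothing has to be installed; the course record is a hypothetical example. requests is a third-party package and is only mentioned in a comment here.

```python
import json

# Hypothetical record used only for illustration.
course = {
    "code": "BSM 461",
    "title": "Introduction To Big Data",
    "topics": ["Numeric Types", "Strings"],
}

# Write: Python dict -> JSON text.
text = json.dumps(course)
print(text)

# Read: JSON text -> Python dict.
restored = json.loads(text)
print(restored["topics"])

# requests is installed separately from the command line:
#   pip install requests
```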
• Pandas: an open-source library for transforming data from one format to another.
Running algorithms at scale means running them across a cluster. Older libraries
that existed before distributed computing (i.e., big data), such as scikit-learn,
do not work with distributed data frames and other objects spread across a
cluster; they are designed to work with one file on one computer. Keep this in
mind when choosing a framework. With Pandas, very large data sets might call
for a hybrid of tools.
No built-in support for parallel processing!
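A minimal single-machine sketch of a format-to-format transformation with Pandas, assuming Pandas is installed (pip install pandas). The CSV content and the column names are hypothetical.

```python
import io
import json

import pandas as pd

# Hypothetical CSV content; in practice this would be a file on disk.
csv_text = "city,temp_c\nSakarya,21.5\nIstanbul,19.0\n"

# Read CSV into a DataFrame (held in memory on one computer).
df = pd.read_csv(io.StringIO(csv_text))

# Derive a new column from an existing one.
df["temp_f"] = df["temp_c"] * 9 / 5 + 32

# Write the same data out in another format: JSON, one object per row.
json_text = df.to_json(orient="records")
records = json.loads(json_text)
print(records)
```

Everything here runs in one process on one machine, which is why very large data sets may need other tools alongside Pandas.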
• macOS High Sierra ships with Python 2.7 preinstalled. On macOS you do not
have to install or configure anything else in order to use Python 2. If you want to
use Python 3, you need to install it.
• Python doesn’t come prepackaged with Windows. Download the installer and follow
the wizard.
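After installation, a short script can confirm which Python version is actually running. A minimal sketch using only the standard library:

```python
import sys

# sys.version_info describes the interpreter that runs this script.
version = sys.version_info
print("Running Python %d.%d" % (version.major, version.minor))

if version.major < 3:
    print("This is Python 2; install Python 3 to use Python 3 features.")
```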
Matplotlib Tutorials:
https://matplotlib.org/tutorials/introductory/pyplot.html
© Kevser Ovaz Akpınar