Extending Python in C 
Cluj.py meetup, Nov 19th 
Steffen Wenz, CTO TrustYou
Goals of today’s talk 
● Look behind the scenes of the CPython interpreter - 
gain insights into “how Python works” 
● Explore the CPython C API 
● Build a Python extension in C 
● Introduction to Cython
Who are we? 
● For each hotel on the 
planet, provide a 
summary of all reviews 
● Expertise: 
○ NLP 
○ Machine Learning 
○ Big Data 
● Clients: …
TrustYou Tech Stack 
Batch Layer 
● Hadoop (HDP 2.1) 
● Python 
● Pig 
● Luigi 
Service Layer 
● PostgreSQL 
● MongoDB 
● Redis 
● Cassandra 
[Diagram labels: Data, Queries, Hadoop cluster (100 nodes), Application machines]
Let’s dive in! Assigning an integer 
Python:
a = 4
C API:
PyObject* a = PyInt_FromLong(4);
// what's the difference to int a = 4?
Documentation: PyInt_FromLong
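One way to get a feel for the difference from within Python is to compare sizes. This is a small sketch written for Python 3 (where the C-level constructor is PyLong_FromLong rather than PyInt_FromLong); the exact byte counts depend on platform and interpreter version.

```python
import ctypes
import sys

a = 4

# A C int is just the raw value, typically 4 bytes.
c_int_size = ctypes.sizeof(ctypes.c_int)

# A Python int is a full heap object: reference count, type pointer, and digits.
py_int_size = sys.getsizeof(a)

print("C int:", c_int_size, "bytes")
print("Python int object:", py_int_size, "bytes")
print("type is looked up at runtime:", type(a).__name__)
```

The extra bytes are the object header that every PyObject carries, which is part of what the C API functions manage for you.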
List item access 
Python:
x = xs[i]
C API:
PyObject* x = PyList_GetItem(xs, i);
Documentation: PyList_GetItem
Returning None … 
Python:
return None
C API:
Py_INCREF(Py_None);
return Py_None;
Documentation: Py_INCREF
Calling a function 
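Python:
foo(1337, "bar")
C API:
// argument list
PyObject *args = Py_BuildValue("is", 1337, "bar");
// make call (returns a new reference, or NULL on error)
PyObject *result = PyObject_CallObject(foo, args);
// release arguments and the result
Py_DECREF(args);
Py_XDECREF(result);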
Documentation: Py_BuildValue, 
PyObject_CallObject
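In Python terms, the C sequence above builds an argument tuple and then calls the function with that tuple unpacked. A minimal sketch, where the body of foo is a hypothetical stand-in since the slide does not define it:

```python
def foo(number, text):
    """Hypothetical callee standing in for the slide's foo."""
    return "%d:%s" % (number, text)

# Py_BuildValue("is", 1337, "bar") builds the tuple (1337, "bar") ...
args = (1337, "bar")

# ... and PyObject_CallObject(foo, args) calls foo with that tuple unpacked.
result = foo(*args)
print(result)
```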
What’s the CPython C API? 
● API to manipulate Python objects, and interact with 
Python code, from C/C++ 
● Purpose: Extend Python with new modules/types 
● Why?
CPython internals 
def slangify(s):
    return s + ", yo!"
C API
Compiler Interpreter
>>> slangify("hey")
'hey, yo!'
|\x00\x00d\x01\x00\x17S
Not true for Jython, IronPython, PyPy, Stackless …
Why is Python slow? 
Python:
a = 1
a = a + 1
C:
int a = 1;
a++;
Why is Python slow? 
Python:
class Point:
    def __init__(self, x, y):
        self.x = x; self.y = y
p = Point(1, 2)
print p.x
C:
#include <stdio.h>
typedef struct { int x, y; } point;
int main() {
    point p = {1, 2};
    printf("%i", p.x);
    return 0;
}
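The contrast between the two versions can be made visible from Python itself. The sketch below (Python 3 syntax) assumes ctypes is an acceptable stand-in for the C struct; CPoint is a name introduced here for illustration.

```python
import ctypes

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

class CPoint(ctypes.Structure):
    # Mirrors: typedef struct { int x, y; } point;
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]

p = Point(1, 2)
cp = CPoint(1, 2)

# Conceptually, p.x is a lookup by name in the instance's dictionary at runtime.
print(p.__dict__)

# The struct is two ints laid out back to back at fixed offsets.
print(ctypes.sizeof(CPoint), CPoint.x.offset, CPoint.y.offset)
```

In the C version the compiler resolves p.x to a fixed memory offset, while the Python version has to find the attribute by name every time the line runs.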
Why is Python slow? 
The GIL
Writing in C 
● No OOP:
typedef struct { /* ... */ } complexType;
void fun(complexType* obj, int x, char* y) {
    // ...
}
● Macros for code generation: 
#define SWAP(x,y) {int tmp = x; x = y; y = tmp;} 
SWAP(a, b);
Writing in C 
● Manual memory management: 
○ C: static, stack, malloc/free 
○ Python C API: Reference counting 
● No exceptions 
○ Error handling via returning values 
○ CPython: return NULL; signals an error
Reference Counting 
void Py_INCREF(PyObject *o) 
Increment the reference count for object o. The object must not be NULL; if you aren’t sure that it isn’t NULL, use 
Py_XINCREF(). 
= I want to hold on to this object and use it again after a while*
*) i.e. after any interaction with the Python interpreter that may invalidate my reference
void Py_DECREF(PyObject *o) 
Decrement the reference count for object o. The object must not be NULL; if you aren’t sure that it isn’t NULL, use Py_XDECREF(). If the reference count reaches zero, the object’s type’s deallocation function (which must not be NULL) is invoked.
= I’m done, and don’t care if the object is discarded at this call 
See documentation
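Reference counts can be watched from Python with sys.getrefcount. The sketch below only compares differences, because the absolute numbers include temporary references and vary between CPython versions; refcount_delta is a helper name made up for this example.

```python
import sys

def refcount_delta():
    obj = object()                      # fresh object, not shared with anything else
    before = sys.getrefcount(obj)
    holder = [obj]                      # the list now owns a reference (like Py_INCREF)
    during = sys.getrefcount(obj)
    del holder                          # the list is destroyed and releases it (like Py_DECREF)
    after = sys.getrefcount(obj)
    return during - before, after - before

print(refcount_delta())
```

Storing the object in a container raises its count by one, and destroying the container brings it back down, which is the bookkeeping that Py_INCREF and Py_DECREF make explicit in C.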
Anatomy of a refcount bug 
void buggy(PyObject *list)
{
    PyObject *item = PyList_GetItem(list, 0); // borrowed ref.
    PyList_SetItem(list, 1, PyInt_FromLong(0L)); // calls destructor of previous element
    PyObject_Print(item, stdout, 0); // BUG!
}
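Why replacing element 1 can invalidate a reference to element 0 is easier to see in Python. In this sketch (Item and Evil are hypothetical classes, and the behaviour relies on CPython's immediate, refcount-driven destruction), the destructor of the replaced element runs arbitrary code that deletes element 0:

```python
import weakref

class Item(object):
    """Plain object we can take a weak reference to."""

class Evil(object):
    """Its destructor runs arbitrary code: it removes element 0 of the list."""
    def __init__(self, lst):
        self.lst = lst
    def __del__(self):
        del self.lst[0]

def demo():
    lst = [Item(), None]
    lst[1] = Evil(lst)
    first_alive = weakref.ref(lst[0])   # lets us check whether element 0 still exists
    lst[1] = 0                          # drops the last reference to Evil, so __del__ runs
    return lst, first_alive() is None

print(demo())
```

In the C function, item would be a borrowed pointer to exactly the object that has just been freed. Taking an own reference with Py_INCREF(item) before the PyList_SetItem call, and releasing it afterwards, avoids the problem.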
Our First Extension Module
Adding integers in C 
>>> import arithmetic 
>>> arithmetic.add(1, 1337) 
1338
#include <Python.h>
static PyObject*
arithmetic_add(PyObject* self, PyObject* args)
{
    int i, j;
    PyArg_ParseTuple(args, "ii", &i, &j);
    PyObject* sum = PyInt_FromLong(i + j);
    return sum;
}
static PyObject*
arithmetic_add(PyObject* self, PyObject* args)
{
    int i, j;
    PyObject* sum = NULL;
    if (!PyArg_ParseTuple(args, "ii", &i, &j))
        goto error;
    sum = PyInt_FromLong(i + j);
    if (sum == NULL)
        goto error;
    return sum;
error:
    Py_XDECREF(sum);
    return NULL;
}
Boilerplate² 
static PyMethodDef ArithmeticMethods[] = {
    {"add", arithmetic_add, METH_VARARGS, "Add two integers."},
    {NULL, NULL, 0, NULL} // sentinel
};
PyMODINIT_FUNC
initarithmetic(void)
{
    (void) Py_InitModule("arithmetic", ArithmeticMethods);
}
… and build your module 
from distutils.core import setup, Extension 
module = Extension("arithmetic", sources=["arithmeticmodule.c"]) 
setup(
    name="Arithmetic",
    version="1.0",
    ext_modules=[module]
)
$ sudo python setup.py install 
# build with gcc, any compiler errors & warnings are shown here 
$ python
>>> import arithmetic
>>> arithmetic
<module 'arithmetic' from '/usr/local/lib/python2.7/dist-packages/arithmetic.so'>
>>> arithmetic.add
<built-in function add>
>>> arithmetic.add(1, "1337")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required
Why on earth 
would I do that?
Why go through all this trouble? 
● Performance 
○ C extensions & Cython optimize CPU-bound code 
(vs. memory-bound, IO-bound) 
○ Pareto principle: 20% of the code responsible for 
80% of the runtime 
● Also: Interfacing with existing C/C++ code
Is my Python code performance-critical? 
import cProfile, pstats, sys 
pr = cProfile.Profile() 
pr.enable() 
setup() 
# run code you want to profile 
pr.disable() 
stats = pstats.Stats(pr, stream=sys.stdout).sort_stats("time") 
stats.print_stats()
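A self-contained variant of the same profiling recipe, written for Python 3, might look like the sketch below. The workload function is a hypothetical stand-in for the code being profiled, and the report goes to a string buffer instead of stdout.

```python
import cProfile
import io
import pstats

def workload():
    """Hypothetical CPU-bound stand-in for the code you want to profile."""
    return sum(i * i for i in range(100000))

pr = cProfile.Profile()
pr.enable()
result = workload()
pr.disable()

# Write the report to a string buffer so it can be logged or inspected.
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("time")
stats.print_stats(5)   # only the five most expensive entries
report = buf.getvalue()
print(report)
```

Sorting by "time" orders entries by the time spent inside each function itself, which is usually the quickest way to spot candidates for a C or Cython rewrite.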
55705589 function calls (55688041 primitive calls) in 69.216 seconds 
Ordered by: internal time 
ncalls tottime percall cumtime percall filename:lineno(function) 
45413 21.856 0.000 21.856 0.000 {method 'get' of 'pytc.HDB' objects} 
32275 9.490 0.000 9.656 0.000 /usr/local/lib/python2.7/dist-packages/simplejson/decoder.py:376(raw_decode) 
18760 6.403 0.000 12.797 0.001 /home/steffen/apps/group/lib/util/timeseries.py:29(reindex_pad) 
56992 2.586 0.000 2.624 0.000 {sorted} 
1383832 2.244 0.000 2.244 0.000 /home/steffen/apps/group/lib/hotel/index.py:231(<lambda>) 
2708692 1.845 0.000 5.657 0.000 /home/steffen/apps/group/lib/hotel/index.py:21(<genexpr>) 
497989 1.718 0.000 2.456 0.000 {_heapq.heapreplace} 
4734466 1.624 0.000 2.491 0.000 /home/steffen/apps/group/lib/util/timeseries.py:43(<genexpr>) 
346738 1.475 0.000 1.475 0.000 /usr/lib/python2.7/json/decoder.py:371(raw_decode) 
510726 1.354 0.000 10.432 0.000 /usr/lib/python2.7/heapq.py:357(merge) 
2691966 1.310 0.000 1.310 0.000 /home/steffen/apps/group/lib/util/timeseries.py:21(float_parse) 
357260 1.160 0.000 5.122 0.000 /home/steffen/apps/group/lib/hotel/index.py:471(<genexpr>) 
5348564 0.912 0.000 0.912 0.000 /home/steffen/apps/group/lib/util/timeseries.py:90(<genexpr>) 
758026 0.882 0.000 0.882 0.000 {method 'match' of '_sre.SRE_Pattern' objects} 
9470443 0.868 0.000 0.868 0.000 {method 'append' of 'list' objects} 
4715746 0.867 0.000 0.867 0.000 /home/steffen/apps/group/lib/util/timeseries.py:31(bound) 
1 0.857 0.857 69.220 69.220 /home/steffen/apps/group/lib/pages/table_page.py:37(calculate) 
644766 0.839 0.000 1.752 0.000 {sum}
You can’t observe without changing … 
import timeit 
def setup():
    pass
def stmt():
    pass
print timeit.timeit(stmt=stmt, setup=setup, number=100)
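With the placeholders filled in, the timing skeleton could look like this Python 3 sketch. The reversed-list workload is an assumption chosen only to have something to measure.

```python
import timeit

data = []

def setup():
    # Runs once before timing starts; rebuilds the input so every run is comparable.
    data[:] = range(1000, 0, -1)

def stmt():
    # The code under test, executed `number` times.
    sorted(data)

elapsed = timeit.timeit(stmt=stmt, setup=setup, number=100)
print("100 runs took %.6f seconds" % elapsed)
```

Unlike the profiler, timeit adds almost no overhead per call, so it is the better tool for comparing two implementations of the same function.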
Example: QuickSort
Pythonic QuickSort 
def quicksort(xs):
    if len(xs) <= 1:
        return xs
    middle = len(xs) // 2
    pivot = xs[middle]
    del xs[middle]  # note: this also removes the pivot from the caller's list
    left, right = [], []
    for x in xs:
        append_to = left if x < pivot else right
        append_to.append(x)
    return quicksort(left) + [pivot] + quicksort(right)
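The slide's version deletes the pivot from its input list, so the caller's list is modified as a side effect. A possible variant that keeps the same algorithm but leaves the input untouched is sketched below for Python 3; quicksort_copy is a name introduced here and is not the code that was benchmarked.

```python
def quicksort_copy(xs):
    """Same algorithm as on the slide, but leaves the caller's list untouched."""
    if len(xs) <= 1:
        return list(xs)
    middle = len(xs) // 2          # floor division also works on Python 3
    pivot = xs[middle]
    left, right = [], []
    for index, x in enumerate(xs):
        if index == middle:
            continue               # skip the pivot instead of deleting it
        (left if x < pivot else right).append(x)
    return quicksort_copy(left) + [pivot] + quicksort_copy(right)

values = [5, 3, 8, 1, 9, 2, 8]
print(quicksort_copy(values))
print(values)
```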
Results: Python vs. C extension 
Pythonic QuickSort: 2.0s 
C extension module: 0.092s
Cython
Adding integers in Cython 
# add.pyx 
def add(i, j):
    return i + j
# main.py
import pyximport; pyximport.install()
import add
if __name__ == "__main__":
    print add.add(1, 1337)
What is Cython? 
● Compiles Python to C code 
● “Superset” of Python: Accepts type annotations to 
compile more efficient code (optional!) 
cdef int i = 2 
● No reference counting, error handling, boilerplate … 
plus nicer compiling workflows
Results: 
Pythonic QuickSort: 2.0s 
C extension module: 0.092s 
Cython QuickSort (unchanged): 0.82s
cdef partition(xs, int left, int right, int pivot_index):
    cdef int pivot = xs[pivot_index]
    cdef int el
    xs[pivot_index], xs[right] = xs[right], xs[pivot_index]
    pivot_index = left
    for i in xrange(left, right):
        el = xs[i]
        if el <= pivot:
            xs[i], xs[pivot_index] = xs[pivot_index], xs[i]
            pivot_index += 1
    xs[pivot_index], xs[right] = xs[right], xs[pivot_index]
    return pivot_index
def quicksort(xs, left=0, right=None):
    if right is None:
        right = len(xs) - 1
    if left < right:
        middle = (left + right) // 2
        pivot_index = partition(xs, left, right, middle)
        quicksort(xs, left, pivot_index - 1)
        quicksort(xs, pivot_index + 1, right)
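To check the in-place algorithm without a Cython toolchain, the same logic can be written in plain Python 3 by dropping the cdef declarations. This is a sketch for verifying correctness only; it says nothing about the timings on the slides, and the random test data is an assumption.

```python
import random

def partition(xs, left, right, pivot_index):
    pivot = xs[pivot_index]
    # Move the pivot out of the way, to the right end of the range.
    xs[pivot_index], xs[right] = xs[right], xs[pivot_index]
    store = left
    for i in range(left, right):
        if xs[i] <= pivot:
            xs[i], xs[store] = xs[store], xs[i]
            store += 1
    # Put the pivot into its final position.
    xs[store], xs[right] = xs[right], xs[store]
    return store

def quicksort(xs, left=0, right=None):
    if right is None:
        right = len(xs) - 1
    if left < right:
        middle = (left + right) // 2
        pivot_index = partition(xs, left, right, middle)
        quicksort(xs, left, pivot_index - 1)
        quicksort(xs, pivot_index + 1, right)

random.seed(0)   # fixed seed so the check is repeatable
xs = [random.randint(0, 1000) for _ in range(500)]
expected = sorted(xs)
quicksort(xs)
print(xs == expected)
```

Because this version swaps elements in place instead of building new lists, the typed Cython variant can keep its loop counters and comparisons in C integers.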
Results: 
Pythonic QuickSort: 2.0s 
C extension module: 0.092s 
Cython QuickSort (unchanged): 0.82s 
Cython QuickSort (C-like): 0.37s 
● Unscientific result. Cython can be faster than hand-written 
C extensions!
Further Reading on Cython 
● O’Reilly Book
See code samples on TrustYou GitHub account:
https://github.com/trustyou/meetups/tree/master/python-c
TrustYou wants you! 
We offer positions 
in Cluj & Munich: 
● Data engineer 
● Application developer 
● Crawling engineer 
Write me at swenz@trustyou.net, check out our website, 
or see you at the next meetup!
Thank you!
Python Bytecode
>>> def slangify(s):
...     return s + ", yo!"
...
>>> slangify.func_code.co_code
'|\x00\x00d\x01\x00\x17S'
>>> import dis
>>> dis.dis(slangify)
  2           0 LOAD_FAST                0 (s)
              3 LOAD_CONST               1 (', yo!')
              6 BINARY_ADD
              7 RETURN_VALUE
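The session above is from Python 2.7. On Python 3 the same inspection could be sketched as follows; opcode names and byte layout differ between CPython versions, so the example lists them rather than assuming a particular output.

```python
import dis

def slangify(s):
    return s + ", yo!"

# On Python 3 the code object lives in __code__ (func_code is Python 2 only).
raw = slangify.__code__.co_code
print(type(raw).__name__, len(raw))

# Opcode names differ between CPython versions, so list them rather than assume them.
opnames = [instr.opname for instr in dis.get_instructions(slangify)]
print(opnames)
```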
Anatomy of a memory leak 
PyObject *buggier(void)
{
    PyObject *lst = PyList_New(10);
    return Py_BuildValue("Oi", lst, 10); // "O" increments lst's refcount; our own reference is never released
}
// read the doc carefully before using *any* C API function
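The effect of a single unbalanced increment can be observed from Python by calling the C API through ctypes. This is a sketch for CPython only; leak_demo is a helper name made up here, and Py_IncRef / Py_DecRef are the function forms of the Py_INCREF / Py_DECREF macros.

```python
import ctypes
import sys

# Declare the signatures of the two C API functions we call through ctypes.
incref = ctypes.pythonapi.Py_IncRef
incref.argtypes = [ctypes.py_object]
incref.restype = None
decref = ctypes.pythonapi.Py_DecRef
decref.argtypes = [ctypes.py_object]
decref.restype = None

def leak_demo():
    obj = object()
    base = sys.getrefcount(obj)
    incref(obj)                          # an extra reference nobody will release: a leak
    leaked = sys.getrefcount(obj) - base
    decref(obj)                          # the matching release that the leaky code forgot
    fixed = sys.getrefcount(obj) - base
    return leaked, fixed

print(leak_demo())
```

In buggier, the missing release is a Py_DECREF(lst) after Py_BuildValue has taken its own reference. Alternatively, the "N" format code hands over the existing reference without incrementing it.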
