Pygrunn 2014

Gevent is a Python library that uses greenlets to provide a synchronous API on top of an asynchronous core. It lets blocking-style code run without blocking the whole application by switching between greenlets at I/O operations. The document gives an overview of gevent, its status, and its compatibility with Python versions and the standard library.


gevent, threads & async frameworks

Denis Bilenko
What is gevent?
• green threads for Python
  – Cooperative, only switch at I/O
• greenlet for context switching
  – No syscalls
• libev for the event loop
  – Pretty fast
status
• current stable: 1.0.1
  – Runs on Python 2.5 – 2.7
• master branch: 1.1
  – Support for PyPy: 85% of tests pass
  – Support for Python 3: 53% of tests pass
    • Basics and sockets work
    • Subprocess and file objects: to do
import thread, socket


def send(host):
    sock = socket.create_connection((host, 80))
    sock.sendall("GET / HTTP/1.0\r\n\r\n")

# do two requests in parallel
thread.start_new_thread(send, ('python.org',))
thread.start_new_thread(send, ('gevent.org',))
# import thread, socket
from gevent import thread, socket


def send(host):
    sock = socket.create_connection((host, 80))
    sock.sendall("GET / HTTP/1.0\r\n\r\n")

# do two requests in parallel
thread.start_new_thread(send, ('python.org',))
thread.start_new_thread(send, ('gevent.org',))
drop-in modules

from gevent import socket, ssl, subprocess, thread, local, queue
stdlib compatibility

• less to learn
• API is more stable across gevent versions
• we can use stdlib's tests to verify semantics
• trivial to port libraries and apps to gevent
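Because the drop-in modules match the stdlib API, code can also prefer gevent opportunistically and fall back to the stdlib. A minimal sketch (the try/except pattern is mine, not from the slides):

```python
# Hedged sketch: prefer gevent's cooperative socket module when gevent
# is installed, fall back to the stdlib otherwise. Because the APIs
# match, the rest of the code does not change.
try:
    from gevent import socket  # cooperative, switches at I/O
except ImportError:
    import socket              # stdlib fallback, same interface

# Either way, the familiar stdlib names are available:
print(hasattr(socket, "create_connection"))  # True
```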
from gevent import monkey; monkey.patch_all()

import requests
from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return requests.get('http://python.org').content

app.run()
A TCP server

def echo(socket, address):
    for line in socket.makefile():
        socket.sendall(line)

gevent.server.StreamServer(':5000', echo).start()
gevent.wait([… objects …])

• wait for any gevent object
• extendable through rawlink(callback)

gevent.wait([… objects …], timeout=5)

• limit waiting to a certain time

gevent.wait([… objects …], count=N)

• only wait for N objects
• return value is a list of ready objects

gevent.wait()

• wait for everything
• "background" watchers: ref=False
• graceful shutdown:
  – stop accepting new requests
  – gevent.wait()
geventserver

• pre-fork, <999 LOC
• supports any gevent server, not just HTTP
• graceful shutdown
• can be embedded: import geventserver
• reports long loop iterations

github.com/surfly/geventserver
Why?

• Avoid complexities of event loops
• Avoid costs of real threads
Complexities of event loops

A simple

sock = socket.create_connection((host, 80))
sock.sendall("GET / HTTP/1.0\r\n\r\n")
data = sock.recv(1024)

becomes
Complexities of event loops

def connectionMade(self):
    …
def dataReceived(self, data):
    …
def connectionError(self, error):
    …
async vs. sync
exceptions for I/O errors

sync:
try:
    connect()
except IOError:
    …

async:
def connectionMade(self):
    …
def connectionError(self, error):
    …
async vs. sync
context managers

sync:
with open('log', 'w') as log:
    io_operation()
    log("connected")
    io_operation()
    log("sent")
# log object is closed by "with"

async: an explicit state machine
async vs. sync
synchronous programming model

sync:
handle_result(func(params))

async:
d = Deferred()
func(d)
d.add_callback(handle_result)
d.add_errback(handle_error)
Giving up
• exception handling
• context managers
• synchronous programming style
• 3rd-party libraries

A subset of Python without batteries.
Generators / PEP-3156?
• Help somewhat
• But not enough:

with conn.cursor() as curs:
    curs.execute(SQL)
# cursor() needs to do I/O in __exit__

• No compatibility with threading / 3rd-party libraries
Coroutines
• generators: non-stackful coroutines
  – Only yield to the parent
  – Need special syntax
  – Only save the top frame

• greenlet: stackful coroutines
  – Switching is just a function call
  – Switching to any target
  – Saves the whole stack, like a thread
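The "only yield to the parent / only saves the top frame" limitation can be seen with plain generators (stdlib-only sketch; helper() and gen() are illustrative names, not from the slides):

```python
# A plain generator can only suspend its own frame: the yield must be
# written in the generator itself. A helper function it calls cannot
# suspend the generator on its behalf (that needs special syntax such
# as yield from / await, or a stackful coroutine like greenlet, which
# saves the whole call stack and can switch from any depth).

def helper():
    return 42   # ordinary call: no way to yield to gen()'s consumer

def gen():
    yield helper()       # the suspension point must live in this frame
    yield helper() + 1

print(list(gen()))  # [42, 43]
```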
Why?

• Avoid complexities of event loops
• Avoid costs of real threads
Threads vs green threads
• creation
  – thread.start_new_thread: 28 µsec
  – gevent.spawn: 5 µsec
  – gevent.spawn_raw: 1 µsec

• does not matter if used via a pool
Threads vs green threads
memory
• threads: 8 MB of stack by default
• greenlet: only allocates what's actually used
  – 350 bytes per Python frame
  – 10–15 KB typical

• does not matter much since the memory is virtual
  – but it limits the number of threads on 32-bit architectures
Gevent server
def handle(socket, addr):
    # read out the request
    for line in socket.makefile():
        if not line.strip():
            break
    # send the response
    socket.sendall(HTTP_RESPONSE)
    socket.close()

from gevent.server import StreamServer
StreamServer(':5000', handle).serve_forever()
Threaded server
queue = Queue()

def worker():
    while True:
        socket = queue.get()
        handle(socket)

for _ in xrange(1000):
    thread.start_new_thread(worker, ())

while True:
    socket, addr = listener.accept()
    queue.put(socket)
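The worker-pool pattern above can be sketched in a self-contained, runnable form (modernized to Python 3: threading + queue instead of thread/xrange; plain integers stand in for accepted sockets, and handle() stands in for the request handler — the names are illustrative, not gevent API):

```python
# Minimal worker-pool sketch: N worker threads pull items off a shared
# queue and process them, while the "accept loop" just enqueues work.
import queue
import threading

tasks = queue.Queue()
results = []
results_lock = threading.Lock()

def handle(item):
    # stands in for request handling; records a result
    with results_lock:
        results.append(item * 2)

def worker():
    while True:
        item = tasks.get()
        if item is None:        # sentinel: shut this worker down
            break
        handle(item)
        tasks.task_done()

workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()

for i in range(10):             # "accept loop": enqueue work
    tasks.put(i)
tasks.join()                    # wait until every item is handled

for _ in workers:               # stop the pool
    tasks.put(None)
for w in workers:
    w.join()

print(sorted(results))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```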
Benchmark
ab -n 1000 -c 100 http://localhost:5000/
• threaded: 7.1K RPS, latency 14 ms
• gevent: 9.3K RPS, latency 11 ms

• the threaded server can probably be improved
Mutex / context switch benchmark
def func(source, dest, finished):
    source_id = id(source)
    for _ in xrange(COUNT):
        source.acquire()
        dest.release()
    finished.release()

thread.start_new_thread(func, (sem1, sem2, a_finished))
thread.start_new_thread(func, (sem2, sem1, b_finished))
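A runnable version of this ping-pong benchmark, using Python 3's threading module (the slide elides the setup, so the semaphore initial values and COUNT below are my choices):

```python
# Two threads pass a "turn" back and forth via two semaphores, forcing
# a context switch on every iteration. sem1 starts with one token so
# thread A takes the first turn.
import threading

COUNT = 1000
sem1 = threading.Semaphore(1)   # thread A starts holding the turn
sem2 = threading.Semaphore(0)
a_finished = threading.Semaphore(0)
b_finished = threading.Semaphore(0)

def func(source, dest, finished):
    for _ in range(COUNT):
        source.acquire()   # wait for our turn
        dest.release()     # hand the turn to the other thread
    finished.release()

threading.Thread(target=func, args=(sem1, sem2, a_finished)).start()
threading.Thread(target=func, args=(sem2, sem1, b_finished)).start()

a_finished.acquire()       # blocks until both threads complete
b_finished.acquire()
print("done:", 2 * COUNT, "context switches")
```

Dividing the elapsed wall-clock time by 2 * COUNT gives the per-switch cost the following slides compare.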
Threads vs green threads
• context switch
  – 2 threads switch to each other using 2 mutexes
• gevent threads: 15 ns
• real threads: 60 ns
  – 2 CPUs – GIL contention

to avoid GIL contention, pin to one CPU: "taskset 1 python …"
Threads vs green threads
(pinned to one CPU)
• gevent threads: 15 ns
• real threads: 12 ns

PyPy:
• gevent threads: 11 ns
• real threads: 7 ns
• requires warmup; with only 100 iterations: 115 ns / 35 ns

http://www.mailinator.com/tymaPaulMultithreaded.pdf
Threads on Linux
• Threads used to have awful performance
• But since Linux 2.6 / NPTL they've improved a lot
  – idle threads cost almost nothing
  – context switching is much faster
  – you can create lots of them
• (in Java) sync I/O is 30% faster than async I/O
cooperative OS threads?
• Already exclusive due to the GIL
• sys.setswitchinterval(2 ** 31)   # Python 3
• sys.setcheckinterval(2 ** 30)    # Python 2

would not work if you have CPU-hungry threads, but neither would async frameworks
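The Python 3 half of this trick can be demonstrated directly (a sketch: the huge interval makes the interpreter essentially never preempt a running thread, so the GIL changes hands only at blocking calls, i.e. at I/O):

```python
# Raise the interpreter's thread switch interval so preemptive GIL
# handoffs effectively stop; threads then switch cooperatively at I/O.
import sys

default = sys.getswitchinterval()   # normally 0.005 seconds
sys.setswitchinterval(2 ** 31)      # effectively: never preempt
print(sys.getswitchinterval())      # 2147483648.0

sys.setswitchinterval(default)      # restore the default
```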
Can gevent be sped up?

• Switching and primitives are in Python
  – Let's try C
• Switching is done through the Hub
  – Let's try switching directly
libgevent

• stacklet: the C module that PyPy uses to implement greenlet
• libuv: Node.js's event loop
libgevent

gevent_cothread t1;
gevent_cothread_init(hub, &t1, sleep0);
gevent_cothread_spawn(&t1);
gevent_wait(hub);
libgevent

• It's only a prototype
• spawn(), sleep(), wait()
• channels, semaphores
• getaddrinfo()
• naive Python wrapper

https://github.com/denik/libgevent
Threads vs green threads
• context switch
  – 2 threads switch to each other using 2 semaphores
• gevent threads: 15 ns
• real threads: 12 ns
• libgevent/gevent2: 1.8 ns
Conclusions
• A thread pool is a deployment option worth considering

• Gevent's performance can be pushed further

• Avoid framework lock-in
  – Migrating between gevent and threads is easy
  – Migrating between async and sync models is not

• The better threads get, the more irrelevant async frameworks become (gevent included)
  – And threads are already pretty fast
References

gevent:
http://gevent.org

faster gevent experiment:
https://github.com/denik/libgevent

pre-fork server for gevent:
https://github.com/surfly/geventserver

"Thousands of Threads and Blocking I/O" by Paul Tyma:
http://www.mailinator.com/tymaPaulMultithreaded.pdf