Learning IPython for Interactive Computing and Data Visualization, Second Edition: Get started with Python for data analysis and numerical computing in the Jupyter notebook
About this ebook
Cyrille Rossant
Cyrille Rossant, PhD, is a neuroscience researcher and software engineer at University College London. He is a graduate of École Normale Supérieure, Paris, where he studied mathematics and computer science. He has also worked at Princeton University and Collège de France. While working on data science and software engineering projects, he gained experience in numerical computing, parallel computing, and high-performance data visualization. He is the author of Learning IPython for Interactive Computing and Data Visualization, Second Edition, Packt Publishing.
Table of Contents
Learning IPython for Interactive Computing and Data Visualization Second Edition
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Getting Started with IPython
What are Python, IPython, and Jupyter?
Jupyter and IPython
What this book covers
References
Installing Python with Anaconda
Downloading Anaconda
Installing Anaconda
Before you get started...
Opening a terminal
Finding your home directory
Manipulating your system path
Testing your installation
Managing environments
Common conda commands
References
Downloading the notebooks
Introducing the Notebook
Launching the IPython console
Launching the Jupyter Notebook
The Notebook dashboard
The Notebook user interface
Structure of a notebook cell
Markdown cells
Code cells
The Notebook modal interface
Keyboard shortcuts available in both modes
Keyboard shortcuts available in the edit mode
Keyboard shortcuts available in the command mode
References
A crash course on Python
Hello world
Variables
String escaping
Lists
Loops
Indentation
Conditional branches
Functions
Positional and keyword arguments
Passage by assignment
Errors
Object-oriented programming
Functional programming
Python 2 and 3
Going beyond the basics
Ten Jupyter/IPython essentials
Using IPython as an extended shell
Learning magic commands
Mastering tab completion
Writing interactive documents in the Notebook with Markdown
Creating interactive widgets in the Notebook
Running Python scripts from IPython
Introspecting Python objects
Debugging Python code
Benchmarking Python code
Profiling Python code
Summary
2. Interactive Data Analysis with pandas
Exploring a dataset in the Notebook
Provenance of the data
Downloading and loading a dataset
Making plots with matplotlib
Descriptive statistics with pandas and seaborn
Manipulating data
Selecting data
Selecting columns
Selecting rows
Filtering with boolean indexing
Computing with numbers
Working with text
Working with dates and times
Handling missing data
Complex operations
Group-by
Joins
Summary
3. Numerical Computing with NumPy
A primer to vector computing
Multidimensional arrays
The ndarray
Vector operations on ndarrays
How fast are vector computations in NumPy?
How an ndarray is stored in memory
Why operations on ndarrays are fast
Creating and loading arrays
Creating arrays
Loading arrays from files
Basic array manipulations
Computing with NumPy arrays
Selection and indexing
Boolean operations on arrays
Mathematical operations on arrays
A density map with NumPy
Other topics
Summary
4. Interactive Plotting and Graphical Interfaces
Choosing a plotting backend
Inline plots
Exported figures
GUI toolkits
Dynamic inline plots
Web-based visualization
matplotlib and seaborn essentials
Common plots with matplotlib
Customizing matplotlib figures
Interacting with matplotlib figures in the Notebook
High-level plotting with seaborn
Image processing
Further plotting and visualization libraries
High-level plotting
Bokeh
Vincent and Vega
Plotly
Maps and geometry
The matplotlib Basemap toolkit
GeoPandas
Leaflet wrappers: folium and mplleaflet
3D visualization
Mayavi
VisPy
Summary
5. High-Performance and Parallel Computing
Accelerating Python code with Numba
Random walk
Universal functions
Writing C in Python with Cython
Installing Cython and a C compiler for Python
Implementing the Eratosthenes Sieve in Python and Cython
Distributing tasks on several cores with IPython.parallel
Direct interface
Load-balanced interface
Further high-performance computing techniques
MPI
Distributed computing
C/C++ with Python
GPU computing
PyPy
Julia
Summary
6. Customizing IPython
Creating a custom magic command in an IPython extension
Writing a new Jupyter kernel
Displaying rich HTML elements in the Notebook
Displaying SVG in the Notebook
JavaScript and D3 in the Notebook
Customizing the Notebook interface with JavaScript
Summary
Index
Learning IPython for Interactive Computing and Data Visualization Second Edition
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: April 2013
Second edition: October 2015
Production reference: 1151015
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78398-698-9
www.packtpub.com
Credits
Author
Cyrille Rossant
Reviewers
Damián Avila
Nicola Rainiero
G Scott Stukey
Commissioning Editor
Kartikey Pandey
Acquisition Editors
Kartikey Pandey
Richard Brookes-Bland
Content Development Editor
Arun Nadar
Technical Editor
Pranil Pathare
Copy Editor
Stephen Copestake
Project Coordinator
Shweta H Birwatkar
Proofreader
Safis Editing
Indexer
Monica Ajmera Mehta
Production Coordinator
Conidon Miranda
Cover Work
Conidon Miranda
About the Author
Cyrille Rossant is a researcher in neuroinformatics, and is a graduate of École Normale Supérieure, Paris, where he studied mathematics and computer science. He has worked at Princeton University, University College London, and Collège de France. As part of his data science and software engineering projects, he gained experience in machine learning, high-performance computing, parallel computing, and big data visualization.
He is one of the main developers of VisPy, a high-performance visualization package in Python. He is the author of the IPython Interactive Computing and Visualization Cookbook, Packt Publishing, an advanced-level guide to data science and numerical computing with Python, and the sequel to this book.
I am grateful to Nick Fiorentini for his help during the revision of the book. I would also like to thank my family and notably my wife Claire for their support.
About the Reviewers
Damián Avila is a software developer and data scientist (formerly a biochemist) from Córdoba, Argentina.
His main interests are data science, visualization, finance, and IPython/Jupyter-related projects.
In the open source area, he is a core developer for several popular projects, such as IPython/Jupyter, Bokeh, and Nikola. He has also started his own projects, the most popular being RISE, an extension that enables live slides in the Jupyter notebook. He has also written several tutorials about the Scientific Python tools (available on GitHub) and presented several talks at international conferences.
Currently, he is working at Continuum Analytics.
Nicola Rainiero is a civil geotechnical engineer with a background in the construction industry as a self-employed design engineer. He also specializes in the renewable energy field and has collaborated with the Sant'Anna University of Pisa on two European projects, REGEOCITIES and PRISCA, using qualitative and quantitative data analysis techniques.
He aims to simplify his work with open software, both by using existing tools and by developing new ones, sometimes with good results and sometimes with poor ones. You can reach Nicola on his website at http://rainnic.altervista.org.
A special thanks to Packt Publishing for this opportunity to participate in the reviewing of this book. I thank my family, especially my parents, for their physical and moral support.
www.PacktPub.com
Support files, eBooks, discount offers, and more
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Free access for Packt account holders
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
Preface
Data analysis skills are now essential in scientific research, engineering, finance, economics, journalism, and many other domains. With its high accessibility and vibrant ecosystem, Python is one of the most appreciated open source languages for data science.
This book is a beginner-friendly introduction to the Python data analysis platform, focusing on IPython (Interactive Python) and its Notebook. While IPython is an enhanced interactive Python terminal specifically designed for scientific computing and data analysis, the Notebook is a graphical interface that combines code, text, equations, and plots in a unified interactive environment.
The first edition of Learning IPython for Interactive Computing and Data Visualization was published in April 2013, several months before the release of IPython 1.0. This new edition targets IPython 4.0, released in August 2015. In addition to reflecting the novelties of this new version of IPython, the present book is also more accessible to non-programmer beginners. The first chapter contains a brand new crash course on Python programming, as well as detailed installation instructions.
Since the first edition of this book, IPython's popularity has grown significantly, with an estimated user base of several million people and ongoing collaborations with large companies like Microsoft, Google, IBM, and others. The project itself has undergone important changes, with a refactoring into a language-independent interface called the Jupyter Notebook, and a set of backend kernels in various languages. The Notebook is no longer reserved for Python; it can now also be used with R, Julia, Ruby, Haskell, and many more languages (50 at the time of this writing!).
The Jupyter project received significant funding in 2015 from the Leona M. and Harry B. Helmsley Charitable Trust, the Gordon and Betty Moore Foundation, and the Alfred P. Sloan Foundation, which will allow the developers to focus on the growth and maturity of the project in the years to come.
Here are a few references:
Home page for the Jupyter project at http://jupyter.org/
Announcement of the funding for Jupyter at https://blog.jupyter.org/2015/07/07/jupyter-funding-2015/
Detail of the project's grant at https://blog.jupyter.org/2015/07/07/project-jupyter-computational-narratives-as-the-engine-of-collaborative-data-science/
What this book covers
Chapter 1, Getting Started with IPython, is a thorough and beginner-friendly introduction to Anaconda (a popular Python distribution), the Python language, the Jupyter Notebook, and IPython.
Chapter 2, Interactive Data Analysis with pandas, is a hands-on introduction to interactive data analysis and visualization in the Notebook with pandas, matplotlib, and seaborn.
Chapter 3, Numerical Computing with NumPy, details how to use NumPy for efficient computing on multidimensional numerical arrays.
Chapter 4, Interactive Plotting and Graphical Interfaces, explores many capabilities of Python for interactive plotting, graphics, image processing, and interactive graphical interfaces in the Jupyter Notebook.
Chapter 5, High-Performance and Parallel Computing, introduces the various techniques you can employ to accelerate your numerical computing code, namely parallel computing and compilation of Python code.
Chapter 6, Customizing IPython, shows how IPython and the Jupyter Notebook can be extended for customized use-cases.
What you need for this book
The following software is required for the book:
Anaconda with Python 3
Windows, Linux, or OS X can be used as a platform
Who this book is for
This book targets anyone who wants to analyze data or perform numerical simulations of mathematical models.
Since our world is becoming more and more data-driven, knowing how to analyze data effectively is an essential skill to learn. If you're used to spreadsheet programs like Microsoft Excel, you will appreciate Python for its much larger range of analysis and visualization possibilities. Knowing this general-purpose language will also let you share your data and analysis with other programs and libraries.
In conclusion, this book will be useful to students, scientists, engineers, analysts, journalists, statisticians, economists, hobbyists, and all data enthusiasts.
Conventions
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: Run it with a command like bash Anaconda3-2.3.0-Linux-x86_64.sh (if necessary, replace the filename with the one you downloaded).
A block of code is set as follows:
def load_ipython_extension(ipython):
    """This function is called when the extension is loaded.
    It accepts an IPython InteractiveShell instance.
    We can register the magic with the `register_magic_function`
    method of the shell instance."""
    ipython.register_magic_function(cpp, 'cell')
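The style sample above registers a cell magic named cpp that is not defined in this excerpt. As a minimal sketch, assuming a hypothetical placeholder cpp function that simply returns the cell body (a real version would presumably compile and run the C++ code), a complete extension module might look like this:

```python
def cpp(line, cell):
    """Hypothetical placeholder cell magic (an assumption, not the book's code).

    `line` is the text after %%cpp on the first line; `cell` is the cell body.
    This stand-in returns the cell body unchanged.
    """
    return cell


def load_ipython_extension(ipython):
    """Called by IPython when the extension is loaded with %load_ext."""
    # Register `cpp` as a cell magic so it can be invoked as %%cpp.
    ipython.register_magic_function(cpp, 'cell')
```

If this were saved as a module on the Python path (the name cppmagic.py is hypothetical), it could be loaded in an IPython session with %load_ext cppmagic.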
Any command-line input or output is written as follows:
$ python
Python 3.4.3