Software Environment
4.2.1 Overview
The Anaconda distribution comes with more than 1,500 packages as well as the conda package
and virtual environment manager. It also includes a GUI, Anaconda Navigator, as a graphical
alternative to the command line interface (CLI).
The big difference between conda and the pip package manager is in how package
dependencies are managed, which is a significant challenge for Python data science and the
reason conda exists.
When pip installs a package, it automatically installs any dependent Python packages without
checking whether these conflict with previously installed packages. It will install a package
and any of its dependencies regardless of the state of the existing installation. Because of
this, a user with a working installation of, for example, Google TensorFlow can find that it
stops working after using pip to install a different package that requires a different version
of the numpy library than the one used by TensorFlow. In some cases, the package may
appear to work but produce subtly different results.
In contrast, conda analyses the current environment, including everything currently installed,
and, together with any version limitations specified (e.g. the user may wish to have
TensorFlow version 2.0 or higher), works out how to install a compatible set of dependencies.
It shows a warning if this cannot be done.
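The difference between the two approaches can be illustrated with a toy model. The sketch
below is not how conda's solver is implemented; it only assumes that every package accepts a
half-open range of numpy versions and checks whether one version can satisfy all of them. The
package names and version ranges are made up for illustration.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, int]          # (major, minor)
Range = Tuple[Version, Version]    # half-open range: low <= version < high


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the overlap of two version ranges, or None if they do not overlap."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low < high else None


def compatible_range(requirements: Dict[str, Range]) -> Optional[Range]:
    """Return the numpy range accepted by every package, or None on a conflict."""
    result: Optional[Range] = ((0, 0), (999, 0))  # start with "any version"
    for accepted in requirements.values():
        result = intersect(result, accepted)
        if result is None:
            return None
    return result


# Hypothetical packages and the numpy versions they accept.
installed = {"package_a": ((1, 16), (1, 19))}
conflicting = {"package_b": ((1, 20), (2, 0))}
harmless = {"package_c": ((1, 17), (1, 25))}

# Checking the whole environment first exposes the conflict before anything changes.
print(compatible_range({**installed, **conflicting}))  # None
print(compatible_range({**installed, **harmless}))     # ((1, 17), (1, 19))
```

An installer that skips this check would simply replace numpy with whatever the new package
asks for, which is the failure mode described above.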
Open source packages can be individually installed from the Anaconda repository[8],
Anaconda Cloud (anaconda.org), or your own private repository or mirror, using the conda
install command. Anaconda Inc. compiles and builds all the packages in the Anaconda
repository itself, and provides binaries for Windows (32-bit and 64-bit), Linux (64-bit) and
macOS (64-bit). Anything available on PyPI may be installed into a conda environment using
pip, and conda will keep track of what it has installed itself and what pip has installed.
Custom packages can be made using the conda build command, and can be shared with
others by uploading them to Anaconda Cloud,[9] PyPI or other repositories.
The default installation of Anaconda2 includes Python 2.7 and Anaconda3 includes Python
3.7. However, it is possible to create new environments that include any version of Python
packaged with conda.
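As a minimal sketch of creating such an environment, the snippet below builds the conda create
command for a chosen Python version. The environment name py36-demo and the helper function
are hypothetical; the snippet only prints the command, so it does not require conda to be
installed.

```python
import shlex
from typing import List


def conda_create_command(env_name: str, python_version: str) -> List[str]:
    """Build the argument list that creates a conda environment with a pinned Python."""
    return ["conda", "create", "--yes", "--name", env_name, f"python={python_version}"]


# "py36-demo" is an illustrative environment name, not one used by the source.
command = conda_create_command("py36-demo", "3.6")
print(shlex.join(command))  # conda create --yes --name py36-demo python=3.6

# To really create the environment, hand the list to subprocess:
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems if the
environment name contains spaces.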
JupyterLab
Jupyter Notebook
QtConsole
Spyder
Glue
Orange
RStudio
Conda
4.2.3 Anaconda Cloud
Anaconda Cloud is a package management service by Anaconda where you can find, access,
store and share public and private notebooks, environments, and conda and PyPI packages.[20]
Cloud hosts useful Python packages, notebooks and environments for a wide variety of
applications. You do not need to log in or have a Cloud account to search for public
packages, download them and install them.
You can build new packages using the Anaconda Client command line interface (CLI), then
manually or automatically upload the packages to Cloud.
This tutorial explains how to install, run, and use Jupyter Notebooks for data science,
including tips, best practices, and examples.
The Jupyter Notebook is a web application in which you can create and share documents that
contain live code, equations, visualizations and text, which makes it one of the ideal tools to
help you gain the data science skills you need.
In this context, "notebook" or "notebook documents" denote documents that contain both code
and rich text elements, such as figures, links and equations. Because of the mix of code and
text elements, these documents are the ideal place to bring together an analysis description
and its results, and they can also be executed to perform the data analysis in real time.
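To make the mix of code and text concrete, the sketch below builds a minimal notebook document
using only the standard library. It assumes version 4 of the notebook file format, in which a
notebook is a JSON object holding a list of cells; the helper name and the cell contents are
illustrative only.

```python
import json


def make_notebook(markdown_text: str, code_text: str) -> dict:
    """Return a minimal notebook with one rich text cell and one code cell."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            # Rich text element: rendered as formatted text by the notebook application.
            {"cell_type": "markdown", "metadata": {}, "source": markdown_text},
            # Code element: not run yet, so it has no execution count and no outputs.
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": code_text,
            },
        ],
    }


notebook = make_notebook("# Analysis\nMean of three values.", "print(sum([1, 2, 3]) / 3)")
serialized = json.dumps(notebook, indent=1)  # this text is what an .ipynb file contains
print([cell["cell_type"] for cell in notebook["cells"]])  # ['markdown', 'code']
```

Writing the serialized string to a file with the .ipynb extension should give a document the
notebook application can open, although in practice notebooks are created through the
application itself.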
For now, you should know that "Jupyter" is a loose acronym meaning Julia, Python, and R.
These programming languages were the first target languages of the Jupyter application, but
nowadays, the notebook technology also supports many other languages.
As you just saw, the main components of the whole environment are, on the one hand, the
notebooks themselves and the application. On the other hand, you also have a notebook
kernel and a notebook dashboard.
As a server-client application, the Jupyter Notebook App allows you to edit and run your
notebooks via a web browser. The application can be executed on a PC without Internet
access, or it can be installed on a remote server, where you can access it through the Internet.
A kernel is a program that runs and introspects the user’s code. The Jupyter Notebook App
has a kernel for Python code, but there are also kernels available for other programming
languages.
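The idea of a kernel can be sketched with a toy class that keeps one namespace alive between
executions, runs code in it and answers questions about the variables it holds. This is an
illustration only: a real Jupyter kernel runs as a separate process and talks to the
application through a messaging protocol, which this sketch does not model. The class and
method names are hypothetical.

```python
import contextlib
import io


class ToyKernel:
    """Holds one namespace, runs code in it and reports on what it contains."""

    def __init__(self) -> None:
        self.namespace: dict = {}

    def execute(self, code: str) -> str:
        """Run code in the shared namespace and return whatever it printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()

    def inspect(self, name: str) -> str:
        """Introspection: report the type of a variable defined by earlier code."""
        return type(self.namespace[name]).__name__


kernel = ToyKernel()
kernel.execute("total = sum(range(5))")          # first "cell" defines a variable
print(kernel.execute("print(total)"), end="")    # second "cell" still sees it: 10
print(kernel.inspect("total"))                   # int
```

Because the namespace outlives each call, later cells can use the results of earlier ones,
which is the behaviour users rely on in a notebook.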
The dashboard of the application not only shows you the notebook documents that you have
made and can reopen but can also be used to manage the kernels: you can see which ones are
running and shut them down if necessary.
To fully understand what the Jupyter Notebook is and what functionality it has to offer, you
need to know how it originated.
Let's back up briefly to the late 1980s. Guido van Rossum began to work on Python at the
National Research Institute for Mathematics and Computer Science in the Netherlands.
Fast forward to 2007: the IPython team had kept on working and formulated another attempt
at implementing a notebook-type system. By October 2010, there was a prototype of a web
notebook, and in the summer of 2011, this prototype was incorporated into IPython. It was
released with version 0.12 on December 21, 2011. In subsequent years, the team received
awards, such as the Award for the Advancement of Free Software for Fernando Pérez on
23 March 2013 and the Jolt Productivity Award, and funding from the Alfred P. Sloan
Foundation, among others.
Lastly, in 2014, Project Jupyter started as a spin-off project from IPython. IPython is now the
name of the Python backend, which is also known as the kernel. Recently, the next
generation of Jupyter Notebooks has been introduced to the community. It's called
JupyterLab.
After all this, you might wonder where the idea of notebooks originated or how it occurred
to the creators.
A brief look into the history of these notebooks shows that Fernando Pérez and Robert
Kern were working on a notebook at the same time as the Sage notebook was a work in
progress. Since the layout of the Sage notebook was based on the layout of Google
notebooks, you can also conclude that Google had a notebook feature around that time.
As for the idea of the notebook, Fernando Pérez and William Stein, one of the creators of
the Sage notebook, have both confirmed that they were avid users of the Mathematica
notebooks and Maple worksheets. The Mathematica notebooks were created as a front end or
GUI in 1988 by Theodore Gray.
The concept of a notebook, which contains ordinary text and calculation and/or graphics, was
definitely not new.
Also, the developers had close contact with one another. This, together with other failed
attempts at GUIs for IPython, and the use of "AJAX" web applications, which did not require
users to refresh the whole page every time they did something, were two other motivations
for the team of William Stein to start developing the Sage notebooks.