datascience

A library for introductory data science.

Written by Professor John DeNero, Professor David Culler, Sam Lau, and Alvin Wan.

Installation

Use pip:

pip install datascience
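To confirm that the install worked, a quick check along these lines may help. This is a minimal sketch that uses only the Python standard library; the helper name `is_installed` is hypothetical and is not part of datascience.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    # find_spec returns None when no such top-level package exists.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    print("datascience installed:", is_installed("datascience"))
```

If this reports `False`, pip may have installed the package into a different Python environment than the one running the check.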

Developing

The recommended environment for installation and tests is the Anaconda Python 3 distribution.

If you encounter an "Image not found" error on Mac OS X, you may need to upgrade XQuartz.

Start by cloning this repository:

git clone https://github.com/dsten/datascience

Install it locally with:

make install

Then, run the tests:

make test

After that, go ahead and start hacking!

Documentation is generated from the docstrings in the methods and is pushed online at http://data8.org/datascience/ automatically. If you want to preview the docs locally, use these commands:

make docs       # Generates docs inside doc/ folder
make serve_docs # Starts a local server to view docs

Using Zenhub

We use Zenhub to organize development on this library. To get started, go ahead and install the Zenhub Chrome Extension.

Then navigate to the issue board or press b. The board organizes issues into these pipelines:

New Issues are issues that are just created and haven't been prioritized.
Backlogged issues are issues that are not high priority, like nice-to-have features.
To Do issues are high priority and should get done ASAP, such as breaking bugs or functionality that we need to lecture on soon.
Once someone has been assigned to an issue, that issue should be moved into the In Progress column.
When the task is complete, we close the related issue.

Example Workflow

John creates an issue called "Everything is breaking". It goes into the New Issues pipeline at first.
This issue is important, so John immediately moves it into the To Do pipeline. Since he has to go lecture for 61A, he doesn't assign it to himself right away.
Sam sees the issue, assigns himself to it, and moves it into the In Progress pipeline.
After everything is fixed, Sam closes the issue.

Here's another example.

Ani creates an issue asking for beautiful histograms. Like before, it goes into the New Issues pipeline.
John decides that the issue is not as high priority right now because other things are breaking, so he moves it into the Backlog pipeline.
When he has some more time, John assigns himself the issue and moves it into the In Progress pipeline.
Once the issue is finished, he closes the issue.
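The two workflows above can be sketched as a small state machine. This is an illustration only: the `Issue` class and the `ALLOWED_MOVES` table are hypothetical, the set of allowed moves is an assumption inferred from the examples, and Zenhub itself does not enforce any of these rules.

```python
# Pipelines named in this README; "Closed" stands for a closed issue.
# Assumption: the allowed moves below are inferred from the two examples.
ALLOWED_MOVES = {
    "New Issues": {"Backlog", "To Do"},
    "Backlog": {"To Do", "In Progress"},
    "To Do": {"In Progress"},
    "In Progress": {"Closed"},
    "Closed": set(),
}


class Issue:
    """A hypothetical model of an issue moving across the board."""

    def __init__(self, title):
        self.title = title
        self.pipeline = "New Issues"  # every new issue starts here
        self.assignee = None

    def assign(self, person):
        self.assignee = person

    def move(self, pipeline):
        if pipeline not in ALLOWED_MOVES[self.pipeline]:
            raise ValueError(
                "cannot move from %r to %r" % (self.pipeline, pipeline)
            )
        # An issue moves to In Progress only once someone is assigned.
        if pipeline == "In Progress" and self.assignee is None:
            raise ValueError("assign someone before moving to In Progress")
        self.pipeline = pipeline


# First example: John files the issue, Sam picks it up and closes it.
issue = Issue("Everything is breaking")
issue.move("To Do")
issue.assign("Sam")
issue.move("In Progress")
issue.move("Closed")
```

The second example follows the same pattern, except the issue goes to `"Backlog"` first and John assigns himself before moving it to `"In Progress"`.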

Publishing

python setup.py sdist upload -r pypi
