Convert monolithic Jupyter notebooks into Ploomber pipelines.
Note: Soorgeon is in alpha; help us make it better.
Compatible with Python 3.7 and higher.
pip install soorgeon
Before refactoring, you can optionally test if the original notebook or script runs without exceptions:
# works with ipynb files
soorgeon test path/to/notebook.ipynb
# and notebooks in percent format
soorgeon test path/to/notebook.py
Optionally, set the path to the output notebook:
soorgeon test path/to/notebook.ipynb path/to/output.ipynb
soorgeon test path/to/notebook.py path/to/output.ipynb
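For reference, a notebook in percent format is a plain .py script whose cells are delimited by `# %%` comments (the convention used by Jupytext and several editors). The sketch below is a hypothetical two-section script, not one of the Soorgeon examples; the section titles and variable names are made up for illustration.

```python
# %% [markdown]
# ## Load

# %%
# A stand-in for loading data; a real notebook would read a file here.
values = [3, 1, 2]

# %% [markdown]
# ## Summarize

# %%
# Uses the variable defined in the previous section.
total = sum(values)
largest = max(values)
print(total, largest)
```

Saved as notebook.py, a script like this can be passed to `soorgeon test` in the same way as an .ipynb file.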
To refactor your notebook:
# refactor notebook
soorgeon refactor nb.ipynb
# all variables with the df prefix are stored in csv files
soorgeon refactor nb.ipynb --df-format csv
# all variables with the df prefix are stored in parquet files
soorgeon refactor nb.ipynb --df-format parquet
# store task output in 'some-directory' (if missing, this defaults to 'output')
soorgeon refactor nb.ipynb --product-prefix some-directory
# generate tasks in .py format
soorgeon refactor nb.ipynb --file-format py
# use alternative serializer (cloudpickle or dill) if notebook
# contains variables that cannot be serialized using pickle
soorgeon refactor nb.ipynb --serializer cloudpickle
soorgeon refactor nb.ipynb --serializer dill
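The --serializer option matters because the standard pickle module cannot serialize every Python object. The sketch below uses only the standard library to check whether a value survives `pickle.dumps`; the helper name `is_picklable` is hypothetical and not part of Soorgeon. Values for which it returns False (lambdas are a common case) are the kind that may call for cloudpickle or dill.

```python
import pickle


def is_picklable(value):
    """Return True if the standard pickle module can serialize `value`."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain containers of built-in types serialize without trouble.
plain_data = {"rows": [1, 2, 3]}

# pickle stores functions by qualified name, and a lambda has no importable name.
transform = lambda x: x * 2

print(is_picklable(plain_data))
print(is_picklable(transform))
```

If the variables passed between sections of your notebook fail a check like this, try refactoring with one of the alternative serializers shown above.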
To learn more, check out our guide.
Soorgeon has a clean command that applies black to .ipynb and .py files:
soorgeon clean path/to/notebook.ipynb
or
soorgeon clean path/to/script.py
Soorgeon has a lint command that applies flake8:
soorgeon lint path/to/notebook.ipynb
or
soorgeon lint path/to/script.py
git clone https://github.com/ploomber/soorgeon
Exploratory data analysis notebook:
cd soorgeon/examples/exploratory
soorgeon refactor nb.ipynb
# to run the pipeline
pip install -r requirements.txt
ploomber build
Machine learning notebook:
cd soorgeon/examples/machine-learning
soorgeon refactor nb.ipynb
# to run the pipeline
pip install -r requirements.txt
ploomber build
To learn more, check out our guide.
Ploomber is a big community of data enthusiasts pushing the boundaries of Data Science and Machine Learning tooling.
Whatever your skillset is, you can contribute to our mission. So whether you're a beginner or an experienced professional, you're welcome to join us on this journey!
Click here to learn how you can contribute to Ploomber.
We collect anonymous statistics to understand and improve usage. For details, see here.