.. _related_projects:

=====================================
Related Projects
=====================================

Below is a list of sister projects, extensions and domain-specific packages.

Interoperability and framework enhancements
-------------------------------------------

These tools adapt scikit-learn for use with other technologies or otherwise
enhance the functionality of scikit-learn's estimators.

- `sklearn_pandas <https://github.com/paulgb/sklearn-pandas/>`_ A bridge
  between scikit-learn pipelines and pandas data frames, with dedicated
  transformers.

- `Scikit-Learn Laboratory
  <https://skll.readthedocs.org/en/latest/index.html>`_ A command-line
  wrapper around scikit-learn that makes it easy to run machine learning
  experiments with multiple learners and large feature sets.

- `auto-sklearn <https://github.com/automl/auto-sklearn/blob/master/source/index.rst>`_
  An automated machine learning toolkit and a drop-in replacement for a
  scikit-learn estimator.

- `sklearn-pmml <https://github.com/alex-pirozhenko/sklearn-pmml>`_
  Serialization of (some) scikit-learn estimators into PMML.

Other estimators and tasks
--------------------------

Not everything belongs in, or is mature enough for, the central scikit-learn
project. The following projects provide interfaces similar to
scikit-learn for additional learning algorithms, infrastructures
and tasks.

- `pylearn2 <http://deeplearning.net/software/pylearn2/>`_ A deep learning and
  neural network library built on Theano with a scikit-learn-like interface.

- `sklearn_theano <http://sklearn-theano.github.io/>`_ scikit-learn compatible
  estimators, transformers, and datasets which use Theano internally.

- `lightning <http://www.mblondel.org/lightning/>`_ Fast state-of-the-art
  linear model solvers (SDCA, AdaGrad, SVRG, SAG, etc.).

- `Seqlearn <https://github.com/larsmans/seqlearn>`_ Sequence classification
  using HMMs or structured perceptron.

- `HMMLearn <https://github.com/hmmlearn/hmmlearn>`_ Implementation of hidden
  Markov models that was previously part of scikit-learn.

- `PyStruct <https://pystruct.github.io>`_ General conditional random fields
  and structured prediction.

- `py-earth <https://github.com/jcrudy/py-earth>`_ Multivariate adaptive
  regression splines.

- `sklearn-compiledtrees <https://github.com/ajtulloch/sklearn-compiledtrees/>`_
  Generates a C++ implementation of the predict function for decision trees (and
  ensembles) trained by scikit-learn. Useful for latency-sensitive production
  environments.

- `lda <https://github.com/ariddell/lda/>`_ Fast implementation of Latent
  Dirichlet Allocation in Cython.

- `Sparse Filtering <https://github.com/jmetzen/sparse-filtering>`_
  Unsupervised feature learning based on sparse filtering.

- `Kernel Regression <https://github.com/jmetzen/kernel_regression>`_
  Implementation of Nadaraya-Watson kernel regression with automatic bandwidth
  selection.

- `gplearn <https://github.com/trevorstephens/gplearn>`_ Genetic Programming
  for symbolic regression tasks.

- `nolearn <https://github.com/dnouri/nolearn>`_ A number of wrappers and
  abstractions around existing neural network libraries.

- `sparkit-learn <https://github.com/lensacom/sparkit-learn>`_ Scikit-learn
  functionality and API on PySpark.

- `keras <https://github.com/fchollet/keras>`_ Theano-based deep learning
  library.

- `mlxtend <https://github.com/rasbt/mlxtend>`_ Includes a number of additional
  estimators as well as model visualization utilities.

Statistical learning with Python
--------------------------------

Other packages useful for data analysis and machine learning.

- `Pandas <http://pandas.pydata.org>`_ Tools for working with heterogeneous and
  columnar data, relational queries, time series and basic statistics.

- `theano <http://deeplearning.net/software/theano/>`_ A CPU/GPU array
  processing framework geared towards deep learning research.

- `Statsmodels <http://statsmodels.sourceforge.net/>`_ Estimating and analysing
  statistical models. More focused on statistical tests and less on prediction
  than scikit-learn.

- `PyMC <http://pymc-devs.github.io/pymc/>`_ Bayesian statistical models and
  fitting algorithms.

- `REP <https://github.com/yandex/REP>`_ Environment for conducting data-driven
  research in a consistent and reproducible way.

- `Sacred <https://github.com/IDSIA/Sacred>`_ Tool to help you configure,
  organize, log and reproduce experiments.

- `gensim <https://radimrehurek.com/gensim/>`_ A library for topic modelling,
  document indexing and similarity retrieval.

- `Seaborn <http://stanford.edu/~mwaskom/software/seaborn/>`_ Visualization
  library based on matplotlib. It provides a high-level interface for drawing
  attractive statistical graphics.

- `Deep Learning <http://deeplearning.net/software_links/>`_ A curated list of
  deep learning software libraries.

Domain-specific packages
~~~~~~~~~~~~~~~~~~~~~~~~

- `scikit-image <http://scikit-image.org/>`_ Image processing and computer
  vision in Python.

- `Natural language toolkit (nltk) <http://www.nltk.org/>`_ Natural language
  processing and some machine learning.

- `NiLearn <https://nilearn.github.io/>`_ Machine learning for neuro-imaging.

- `AstroML <http://www.astroml.org/>`_ Machine learning for astronomy.

- `MSMBuilder <http://www.msmbuilder.org/>`_ Machine learning for protein
  conformational dynamics time series.

Snippets and tidbits
---------------------

The `wiki <https://github.com/scikit-learn/scikit-learn/wiki/Third-party-projects-and-code-snippets>`_ has more!