Jarvis Auto ML
Corentin Macqueron
ORANO (ex-AREVA)
All content following this page was uploaded by Corentin Macqueron on 02 November 2022.
JARVIS is designed to be:
- state-of-the-art
- easy to build, with automated preprocessing and hyperparameter optimization
- easy to use
- fast to train, capable of using multiple CPUs and GPUs
- not a black box
- able to deal with regression, classification, steady and transient states, and vision
- able to deal with numerical and textual data
WHAT’S JARVIS?
JARVIS is a Python script based on Scikit-Learn [1] and Keras [2] that requires only a few
inputs from the user and automatically builds a machine learning model
JARVIS requires the data in the TSV format, the type of model to look for, the
validation strategy, the 'intensity' of the hyperparameter optimization and whether the
model should be explained or not (and a few other inputs)
The JARVIS Graphical User Interface is either a text file that just needs a copy-paste to be
run in a Python console, or a Jupyter Notebook
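As an illustration, those few inputs could be gathered in a small settings block like the following (a hypothetical sketch: the key names and values are illustrative assumptions, not the actual JARVIS interface):

```python
# Hypothetical sketch of the kind of inputs JARVIS expects
# (names and values are illustrative, not the actual JARVIS API)
settings = {
    "data": "dataset.tsv",           # the data must be in TSV format
    "task": "regression",            # or "classification", "transient", "vision"
    "validation": "train_val_test",  # the validation strategy
    "hpo_intensity": 2,              # 'intensity' of hyperparameter optimization
    "explain": True,                 # whether the model should be explained
}
```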
DATA LAB
JARVIS offers a DATA LAB providing insights on the data prior to the machine learning
phase
The DATA LAB analyzes the data and shows the statistical distributions, the correlations,
and the relative importance of the inputs for the outputs prediction (according to linear
correlation coefficients and to a quick gradient boosting feature importance), and allows
the user to select only the most relevant inputs
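The two importance measures can be sketched with pandas and Scikit-Learn (a minimal illustration on synthetic data, not the actual JARVIS code):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data: y depends strongly on x1, weakly on x2, not at all on x3
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(500, 3)), columns=["x1", "x2", "x3"])
df["y"] = 3.0 * df["x1"] + 0.3 * df["x2"] + rng.normal(scale=0.1, size=500)

# Linear correlation coefficients between each input and the output
corr = df.corr()["y"].drop("y")

# Quick gradient boosting feature importance
gb = GradientBoostingRegressor(n_estimators=50).fit(df[["x1", "x2", "x3"]], df["y"])
importance = dict(zip(["x1", "x2", "x3"], gb.feature_importances_))
```

Both measures rank x1 far above x3 here, which is the kind of signal that lets the user keep only the relevant inputs.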
PREPROCESSING
JARVIS automatically:
PREPROCESSING – TIME-SERIES
For transient states, JARVIS uses the ‘sliding window’
approach and recursive integration to make
predictions on any arbitrarily long sequence
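The sliding-window training and recursive prediction idea can be sketched as follows (a toy illustration; the window width and the random forest are illustrative choices, not JARVIS internals):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def make_windows(series, width):
    """Slice a 1-D series into (window, next value) training pairs."""
    X = np.array([series[i:i + width] for i in range(len(series) - width)])
    y = series[width:]
    return X, y

# Toy transient: a damped oscillation
t = np.linspace(0, 10, 400)
series = np.exp(-0.2 * t) * np.sin(2 * t)

width = 10
X, y = make_windows(series, width)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Recursive integration: feed each prediction back into the window,
# so the sequence can be extended arbitrarily far
window = list(series[-width:])
preds = []
for _ in range(50):
    nxt = model.predict([window])[0]
    preds.append(nxt)
    window = window[1:] + [nxt]
```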
TIME-SERIES
For transient states, JARVIS can use
both machine learning and deep
learning techniques
PREPROCESSING – NATURAL LANGUAGE PROCESSING
For text data, JARVIS automatically:
VALIDATION
JARVIS offers several validation strategies:
- training-validation split
- training-validation-test split
- 3-fold cross-validation split
- 3-fold cross-validation + test split
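These strategies map directly onto standard Scikit-Learn utilities; a minimal sketch of the first and third (not the actual JARVIS code):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# Training-validation split, with an optional held-out test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

# 3-fold cross-validation (run on X_train only to keep X_test untouched)
scores = cross_val_score(LinearRegression(), X_train, y_train, cv=KFold(n_splits=3))
```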
ALGORITHMS
JARVIS offers several regression algorithms [1][2]:
- Dummy (Scikit-Learn)
- Linear (Scikit-Learn)
- Polynomial (Scikit-Learn)
- kNN (Scikit-Learn)
- Spline (EARTH)
- Decision Tree (Scikit-Learn)
- Random Forest (Scikit-Learn)
- AdaBoost (Scikit-Learn)
- Gradient Boosting (Scikit-Learn)
- XGBoost (DMLC)
- LightGBM (Microsoft)
- Gaussian processes (Scikit-Learn)
- Neural Network (Keras-TensorFlow)
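The benchmarking idea behind such a catalogue, fitting several candidates and comparing validation scores, can be sketched with three of the listed algorithms on synthetic data (an illustration, not the actual JARVIS code):

```python
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, n_features=4, noise=1.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Fit each candidate and keep its validation R^2 score;
# the Dummy baseline shows what 'no learning at all' looks like
candidates = {
    "Dummy": DummyRegressor(),
    "Linear": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
}
scores = {name: m.fit(X_train, y_train).score(X_val, y_val)
          for name, m in candidates.items()}
```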
ALGORITHMS
JARVIS offers several classification algorithms [1][2]:
- Dummy (Scikit-Learn)
- Logistic Regression (Scikit-Learn)
- kNN (Scikit-Learn)
- Naive Bayes (Scikit-Learn)
- Linear Discriminant Analysis (Scikit-Learn)
- Decision Tree (Scikit-Learn)
- Random Forest (Scikit-Learn)
- AdaBoost (Scikit-Learn)
- Gradient Boosting (Scikit-Learn)
- XGBoost (DMLC)
- LightGBM (Microsoft)
- Gaussian processes (Scikit-Learn)
- Neural Network (Keras-TensorFlow)
HYPERPARAMETER OPTIMIZATION
JARVIS offers 4 ‘intensity’ levels for
hyperparameter optimization
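One plausible way to implement such intensity levels is with progressively larger search grids (an assumption for illustration; the actual JARVIS grids are not documented here):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=5, noise=0.5, random_state=0)

# Hypothetical 'intensity' levels: bigger grids mean longer, deeper searches
grids = {
    0: {"n_estimators": [100]},  # no real search
    1: {"n_estimators": [50, 100]},
    2: {"n_estimators": [50, 100], "max_depth": [3, None]},
    3: {"n_estimators": [50, 100, 200], "max_depth": [3, 6, None]},
}

intensity = 2
# n_jobs=-1 parallelizes at the grid-search level, as JARVIS does
search = GridSearchCV(RandomForestRegressor(random_state=0),
                      grids[intensity], cv=3, n_jobs=-1)
search.fit(X, y)
```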
MODEL EXPLANATION
JARVIS offers 3 ways to explain the model:
These 3 approaches show how the model behaves, both quantitatively and
qualitatively, can 'explain' the value of the outputs as a function of each input and
can even show how the model behaves in extrapolation
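One common technique of this kind is permutation importance (used here as an illustrative assumption; the three approaches JARVIS actually uses are not listed on this page):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

X, y = make_regression(n_samples=300, n_features=3, n_informative=3,
                       noise=0.5, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Permutation importance: shuffle one input at a time and measure the score
# drop, which 'explains' how much each input drives the outputs
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
```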
COMPUTING
JARVIS can run the training on one or several CPU cores, or on one or several GPUs
The training is parallelized at the grid search level and not at the model level
(tiny models would not benefit from parallel computing)
CONVERGENCE
JARVIS automatically performs a convergence study to show the effect of the size of
the dataset on the machine learning score or error: it performs 10 subsamplings,
from 10% to 100% of the training set, and plots the validation score or error for
each subsampling, so the user can see whether more data would help
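A study of this kind can be sketched with Scikit-Learn's `learning_curve` (a minimal illustration, not the actual JARVIS code):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import learning_curve

X, y = make_regression(n_samples=500, n_features=5, noise=1.0, random_state=0)

# Validation score for 10 subsamplings, 10% to 100% of the training set
sizes, train_scores, val_scores = learning_curve(
    RandomForestRegressor(n_estimators=30, random_state=0),
    X, y, train_sizes=np.linspace(0.1, 1.0, 10), cv=3,
)
mean_val = val_scores.mean(axis=1)  # one point per subsampling, ready to plot
```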
GRAPHS AND RESULTS
JARVIS automatically saves the database with the original and predicted
results so the user can check the error case by case
COMPUTER VISION
JARVIS also offers image processing capabilities for vision (classification learning)
JARVIS builds and trains convolutional neural layers followed by fully connected layers, following the user's
requests for the following parameters:
It is possible to load a pre-trained convolutional neural network such as VGG [7], ResNet [8] or Inception [9]
with a single keyword to reduce training time and increase performance
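The conv-then-dense pattern can be sketched in Keras (layer counts, sizes and the 10-class output are illustrative assumptions, not JARVIS defaults):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Convolutional layers followed by fully connected layers
model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # e.g. 10 classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy")

# A pre-trained backbone such as VGG16 can replace the convolutional part:
# base = keras.applications.VGG16(include_top=False, input_shape=(64, 64, 3))
```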
UNSUPERVISED LEARNING
- Clustering with Gaussian Mixture and DBSCAN [1] (the number of clusters to be looked
for is automatically determined with the Elbow method)
- Anomaly detection (DBSCAN, One-Class SVM, Z-Score, Isolation Forest and Auto-Encoding) [1][2]
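Both capabilities can be sketched with Scikit-Learn (KMeans inertia is used here as the classic Elbow illustration, and Isolation Forest as one of the listed detectors; this is not the actual JARVIS code):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import IsolationForest

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# Elbow method: inertia for increasing k; the 'elbow' in this curve
# suggests the number of clusters to look for
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in range(1, 7)]

# Anomaly detection with an Isolation Forest
iso = IsolationForest(random_state=0).fit(X)
labels = iso.predict(X)  # +1 for inliers, -1 for anomalies
```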
OPTIMIZATION
Models built with JARVIS can be coupled
to an optimizer program built with SciPy
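The coupling idea, minimizing a trained surrogate's prediction with a SciPy optimizer, can be sketched as follows (the Gaussian process surrogate and the Nelder-Mead method are illustrative choices, not the actual JARVIS setup):

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor

# Toy surrogate: learn y = (x0 - 1)^2 + (x1 + 2)^2 from random samples
rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, size=(200, 2))
y = (X[:, 0] - 1) ** 2 + (X[:, 1] + 2) ** 2
surrogate = GaussianProcessRegressor().fit(X, y)

# Couple the surrogate to a SciPy optimizer: minimize the model's prediction
res = minimize(lambda x: surrogate.predict(x.reshape(1, -1))[0],
               x0=np.array([4.0, 4.0]), method="Nelder-Mead")
```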
USE CASES
JARVIS is being used to build surrogate models in fluid dynamics, structural mechanics and
fire safety, to understand phenomena (such as the weather impact on industrial cooling
systems), to build decision-helping tools in various engineering domains such as piping or
IC, to perform topology optimization for complex systems such as thermoelectric
generators, and to build digital twins for process engineering in transient states
LIMITATIONS
What JARVIS doesn’t do (yet):
REFERENCES
[1] Pedregosa, F., et al. (2011), Scikit-Learn: Machine learning in Python, https://scikit-learn.org/stable
[2] Chollet, F. (2015), Keras: Deep Learning Library for Theano and TensorFlow, https://keras.io/
[4] Pak, M., et al. (2017), Scikit-Optimize: Sequential model-based optimization in Python, https://scikit-optimize.github.io/stable/
[5] Iwanaga, T., et al. (2022), SALib - Sensitivity Analysis Library in Python, https://salib.readthedocs.io/en/latest/
[7] Simonyan, K., Zisserman, A. (2014), Very Deep Convolutional Networks for Large-Scale Image Recognition,
https://arxiv.org/abs/1409.1556
[8] He, K., et al. (2015), Deep Residual Learning for Image Recognition, https://arxiv.org/abs/1512.03385
[9] Szegedy, C., et al. (2014), Going deeper with convolutions, https://arxiv.org/pdf/1409.4842v1.pdf
JARVIS was mainly developed by Corentin Macqueron, with the much appreciated help of Kenza Hammou for Computer Vision