


JID: NEUCOM
ARTICLE IN PRESS [m5G;May 16, 2018;13:36]
Neurocomputing 0 0 0 (2018) 1–6


Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package)
Maximilian Christ a, Nils Braun b, Julius Neuffer a, Andreas W. Kempa-Liehr c,d,∗

a Blue Yonder GmbH, Karlsruhe, Germany
b Institute of Experimental Particle Physics, Karlsruhe Institute of Technology, Germany
c Department of Engineering Science, University of Auckland, New Zealand
d Freiburg Materials Research Center, University of Freiburg, Germany

Article history:
Received 31 May 2017
Revised 22 March 2018
Accepted 23 March 2018
Available online xxx
Communicated by Dr. Francesco Dinuzzo

Keywords: Feature engineering; Time series; Feature extraction; Feature selection; Machine learning

Abstract: Time series feature engineering is a time-consuming process because scientists and engineers have to consider the multifarious algorithms of signal processing and time series analysis for identifying and extracting meaningful features from time series. The Python package tsfresh (Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests) accelerates this process by combining 63 time series characterization methods, which by default compute a total of 794 time series features, with feature selection on basis of automatically configured hypothesis tests. By identifying statistically significant time series characteristics in an early stage of the data science process, tsfresh closes feedback loops with domain experts and fosters the development of domain specific features early on. The package implements standard APIs of time series and machine learning libraries (e.g. pandas and scikit-learn) and is designed for both exploratory analyses as well as straightforward integration into operational data science applications.

© 2018 Published by Elsevier B.V.

1. Introduction

Trends such as the Internet of Things (IoT) [1], Industry 4.0 [2], and precision medicine [3] are driven by the availability of cheap sensors and advancing connectivity, which among others increases the availability of temporally annotated data. The resulting time series are the basis for machine learning applications like the classification of hard drives into risk classes concerning a specific defect [4], the analysis of the human heart beat [5], the optimization of production lines [6], the log analysis of server farms for detecting intruders [7], or the identification of patients with high infection risk [8]. Examples for regression tasks are the prediction of the remaining useful life of machinery [9] or the estimation of conditional event occurrence in complex event processing applications [10]. Other frequently occurring temporal data are event series from processes, which could be transformed to uniformly sampled time series via process evolution functions [11]. Time series feature extraction plays a major role during the early phases of data science projects in order to rapidly extract and explore different time series features and evaluate their statistical significance for predicting the target. The Python package tsfresh supports this process by providing automated time series feature extraction and selection on basis of the FRESH algorithm [12].

2. Problems and background

A time series is a sequence of observations taken sequentially in time [13]. In order to use a set of time series D = {χ_i}_{i=1}^N as input for supervised or unsupervised machine learning algorithms, each time series χ_i needs to be mapped into a well-defined feature space with problem specific dimensionality M and feature vector x_i = (x_{i,1}, x_{i,2}, ..., x_{i,M}). In principle, one might decide to map the time series of set D into a design matrix of N rows and M columns by choosing M data points from each time series χ_i as elements of feature vector x_i. However, from the perspective of pattern recognition [14], it is much more efficient and effective to characterize the time series with respect to the distribution of data points, correlation properties, stationarity, entropy, and nonlinear time series analysis [15]. Therefore the feature vector x_i can be constructed by applying time series characterization methods f_j: χ_i → x_{i,j} to the respective time series χ_i, which results in the feature vector x_i = (f_1(χ_i), f_2(χ_i), ..., f_M(χ_i)).

∗ Corresponding author at: Department of Engineering Science, University of Auckland, New Zealand.
E-mail addresses: [email protected] (M. Christ), [email protected] (N. Braun), [email protected] (J. Neuffer), [email protected] (A.W. Kempa-Liehr).

https://doi.org/10.1016/j.neucom.2018.03.067
0925-2312/© 2018 Published by Elsevier B.V.

Please cite this article as: M. Christ et al., Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package), Neurocomputing (2018), https://doi.org/10.1016/j.neucom.2018.03.067

Fig. 1. The three steps of the tsfresh algorithm are feature extraction (1.), calculation of p-values (2.), and a multiple testing procedure (3.) [12]. Steps 1. and 2. are highly parallelized in tsfresh, and step 3. has a negligible runtime. For step 1, the public function extract_features is provided; steps 2. and 3. can be executed by the public function select_features. The function extract_relevant_features combines all three steps. By default, the hypothesis tests of step 2 are configured automatically depending on the type of supervised machine learning problem (classification/regression) and the feature type (categorical/continuous) [12].
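The division of labor described in this caption can be mirrored with stand-in functions. This is a toy sketch for orientation only: the real tsfresh functions operate on pandas DataFrames and run actual hypothesis tests, while here extraction computes three arbitrary summary statistics and the relevance decision is faked with a fixed set of names.

```python
# Step 1: feature extraction (stand-in for tsfresh's extract_features).
def extract_features(series):
    return {"mean": sum(series) / len(series), "max": max(series), "min": min(series)}

# Steps 2 and 3: p-value calculation plus multiple testing (stand-in for
# select_features); the statistical test is replaced by a fixed name set.
def select_features(features, relevant_names):
    return {name: value for name, value in features.items() if name in relevant_names}

# All three steps combined (stand-in for extract_relevant_features).
def extract_relevant_features(series, relevant_names):
    return select_features(extract_features(series), relevant_names)
```

The composition mirrors the package layout: extraction and selection are usable on their own, and the convenience function simply chains them.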

This feature vector can be extended by additional univariate attributes {a_{i,1}, a_{i,2}, ..., a_{i,U}}_{i=1}^N and feature vectors from other types of time series. If a machine learning problem had a total of K different time series and U univariate variables per sample i, the resulting design matrix would have N rows and (K · M + U) columns. Here, M denotes the number of features from time series characterization methods. The processed time series do not need to have the same number of data points.

For classification and regression tasks the significance of extracted features is of high importance, because too many irrelevant features will impair the ability of the algorithm to generalize beyond the training set and result in overfitting [12]. Therefore, tsfresh provides a highly parallel feature selection algorithm on basis of statistical hypothesis tests, which by default are configured automatically depending on the type of supervised machine learning problem (classification/regression) and the feature type (categorical/continuous) [12].

3. Software framework

3.1. Software architecture

By widely deploying pandas.DataFrames, e.g. as input and output objects, and providing scikit-learn compatible transformer classes, tsfresh implements the application programming interfaces of the most popular Python machine learning and data analysis frameworks such as scikit-learn [16], numpy [17], pandas [18], scipy [19], keras [20] or tensorflow [21]. This enables users to seamlessly integrate tsfresh into complex machine learning systems that rely on state-of-the-art Python data analysis packages.

The feature_extraction submodule (Fig. 1) contains both the collection of feature calculators and the logic to apply them efficiently to the time series data. The main public function of this submodule is extract_features. The number and parameters of all extracted features are controlled by a settings dictionary, which can be filled manually, instantiated using one of the predefined objects, or reconstructed from the column names of an existing feature matrix. The feature_selection submodule provides the function select_features, which implements the highly parallel feature selection algorithm [12]. Additionally, tsfresh contains several minor submodules: utilities provides helper functions used all over the package; convenience contains the extract_relevant_features function, which combines the extraction and selection with an additional imputing step in between; transformers enables the usage of tsfresh as part of scikit-learn [16] pipelines. The test suite covers 97% of the code base.

The feature selection and the calculation of features in tsfresh are parallelized, and unnecessary calculations are prevented by calculating groups of similar features and sharing auxiliary results. For example, if multiple features return the coefficients of a fitted autoregressive model (AR), the AR model is only fitted once and shared. Local parallelization is realized on basis of the Python module multiprocessing, which is used both for feature selection and feature extraction. Distributed computing on a cluster is supported on basis of Dask [22].

The parallelization of the extraction and selection of features enables significant runtime improvement (Fig. 2). However, the memory temporarily used by the feature calculators quickly adds up in a parallel regime. Hence, the memory consumption of the parallelized calculations can be high, which can make the usage of a high number of processes infeasible on machines with low memory.

3.2. Software functionalities

tsfresh provides 63 time series characterization methods, which compute a total of 794 time series features. A design matrix of univariate attributes can be extended by time series features from one or more associated time series. Alternatively, a design matrix can be generated from a set of time series, which might have different numbers of data points and could comprise different types of time series. The resulting design matrix can be used either for unsupervised machine learning or for supervised machine learning, in which case statistically significant features can be selected with respect to the classification or regression problem at hand. For this purpose hypothesis tests are automatically configured depending on the type of supervised machine learning problem (classification/regression) and the feature type (categorical/continuous) [12].

Rows of the design matrix correspond to samples identified by their id, while its columns correspond to the extracted features. Column names uniquely identify the corresponding features with respect to the following three aspects: (1) the kind of time series the feature is based on, (2) the name of the feature calculator which has been used to extract the feature, and (3) key-value pairs of parameters configuring the respective feature calculator:

[kind]_[calculator]_[parameterA]_[valueA]_[parameterB]_[valueB]

For supervised machine learning tasks, for which an instance of the sklearn compatible transformer FeatureSelector had been fitted, the feature importance can be inspected; it is reported as (1 − p-value) with respect to the result of the automatically configured hypothesis tests.

4. Illustrative example and empirical results

For this example, we will inspect 15 force and torque sensors on a robot that allow the detection of failures in the robot's


Fig. 2. Memory consumption of extracting and selecting time series features from 30 time series on a MacBook Pro, 2.7 GHz Intel Core i5, with tsfresh v0.11.0 (Table 1). Each time series has a length of 1000 data points. (a) One core; (b) four cores.

execution [23,24]:

In line 5 we load the dataset. Then, in line 6, the features are extracted from the pandas.DataFrame df containing the time series. The resulting pandas.DataFrame X, the design matrix, comprises 3270 time series features. We have to replace non-finite values (e.g. NaN or inf) by imputing in line 7. Finally, the feature selection of tsfresh is used to filter out irrelevant features. The final design matrix X_filtered contains 623 time series features, which can now be used for training a classifier (e.g. a RandomForest from the scikit-learn package) to predict a robot execution failure based on the sensor time series. This example is extended in the Jupyter notebook robot_failure_example.ipynb.¹

The runtime analysis given in the appendix (Fig. A.1) is summarized in Fig. 3. It shows the histogram and boxplot of the logarithm of the average runtimes for the feature calculators of tsfresh for a time series of 1000 data points. The average has been obtained from a sample of 30 different time series for which all features had been computed three times. The time series were simulated beforehand from the sequence x_{t+1} = x_t + 0.0045 [(y − 1/0.3) x_t − 325 x_t^3 + 6.75 · 10^{−5} η_t] with η_t ∼ N(0, 1) [25, p. 164] and y being the target. The figure shows that all but 9 of the time series feature extraction methods have an average runtime below 25 ms.

Fig. 3. Average runtime of feature calculation algorithms without parallelization for time series with 1000 data points on a MacBook Pro, 2.7 GHz Intel Core i5, with tsfresh v0.11.0 (Table 1). The runtime varies between 1.7 ms and 4.8 s with a median of 3.6 ms.

On a selection of 31 datasets from the UCR time series classification archive, as well as an industrial dataset from the production of steel billets by a German steel manufacturer, the FRESH algorithm was able to automatically extract relevant features for time series classification tasks [12]. The FRESH algorithm scales linearly with the number of used features, the number of devices/samples, and the number of different time series [12]. Its scaling with respect to the length of the time series depends on the individual feature calculators: some features, such as the maximal value, scale linearly with the length of the time series, while other, more complex features have higher costs, such as the calculation of the sample entropy (Fig. A.1). So, adjusting the calculated features can change the total runtime of tsfresh drastically.

¹ https://github.com/blue-yonder/tsfresh/blob/v0.11.0/notebooks/robot_failure_example.ipynb


5. Conclusions

The Python based library tsfresh is a fast and standardized machine learning library for automatic time series feature extraction and selection. It is the only Python based machine learning library for this purpose; the only alternative is the Matlab based package hctsa [26], which extracts more than 7700 time series features. Because tsfresh implements the application programming interface of scikit-learn, it can be easily integrated into complex machine learning pipelines.

The widespread adoption of the tsfresh package shows that there is a pressing need to automatically extract features, originating from e.g. financial, biological or industrial applications. We expect that, due to the increasing availability of temporally annotated data, the interest in such tools will continue to grow.

Acknowledgments

The authors would like to thank M. Feindt, who inspired the development of the FRESH algorithm, U. Kerzel and F. Kienle for their valuable feedback, Blue Yonder GmbH and its CTO J. Karstens, who approved the open sourcing of the tsfresh package, and the contributors to tsfresh: M. Frey, earthgecko, N. Haas, St. Müller, M. Gelb, B. Sang, V. Tang, D. Spathis, Ch. Holdgraf, H. Swaffield, D. Gomez, A. Loosley, F. Malkowski, Ch. Chow, E. Kruglick, T. Klerx, G. Koehler, M. Tomlein, F. Aspart, S. Shepelev, J. White, jkleint. The development of tsfresh was funded in part by the German Federal Ministry of Education and Research under Grant number 01IS14004 (project iPRODICT).

Appendix A. Detailed runtime of time series feature extraction

The average runtime has been obtained from a sample of 30 different time series for which all features had been computed three times. The time series were simulated beforehand from the following sequence:

x_{t+1} = x_t + 0.0045 [(y − 1/0.3) x_t − 325 x_t^3 + 6.75 · 10^{−5} η_t]    (A.1)

with η_t ∼ N(0, 1) [25, p. 164] and y being the target.
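Eq. (A.1) is straightforward to reproduce. The script below is a sketch: the step count, initial value, and random seed are assumptions not stated in the excerpt, and the bracket placement (noise term inside the 0.0045 factor) follows the reconstruction of Eq. (A.1) above.

```python
import random

def simulate_series(y, n_steps=1000, x0=0.0, seed=0):
    """Iterate x_{t+1} = x_t + 0.0045*[(y - 1/0.3)*x_t - 325*x_t**3 + 6.75e-5*eta_t]
    with eta_t ~ N(0, 1), as in Eq. (A.1); y is the target value."""
    rng = random.Random(seed)
    x, series = x0, []
    for _ in range(n_steps):
        x = x + 0.0045 * ((y - 1 / 0.3) * x - 325 * x**3 + 6.75e-5 * rng.gauss(0.0, 1.0))
        series.append(x)
    return series
```

For y < 1/0.3 the linear term is damping and the series fluctuates close to zero; for y > 1/0.3 it drifts towards one of the two symmetric fixed points near ±sqrt((y − 1/0.3)/325).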

Table 1
Code metadata of tsfresh.

C1 Current code version: v0.11.0
C2 Permanent link to code/repository used for this code version: https://github.com/blue-yonder/tsfresh/releases/tag/v0.11.0
C3 Legal code license: MIT
C4 Code versioning system used: git
C5 Software code languages, tools, and services used: Python
C6 Compilation requirements, operating environments & dependencies: OS X, Unix-like (e.g. Linux), Microsoft Windows; a Python interpreter (2.7 or 3.6.3) and the following Python packages: requests (2.9.1), numpy (1.10.4), pandas (0.20.3), scipy (0.17.0), statsmodels (0.6.1), patsy (0.4.1), scikit-learn (0.17.1), future (0.16.0), six (1.10.0), tqdm (4.10.0), ipaddress (1.0), dask (0.15.2), distributed (1.18.3)
C7 Link to developer documentation/manual: http://tsfresh.readthedocs.io
C8 Support email for questions: [email protected]
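A pinned dependency list like the one in row C6 can be checked against the running environment with a short helper. This is a generic sketch (the pinned subset below is abbreviated for illustration; it uses importlib.metadata, which requires Python 3.8+, newer than the interpreters named in C6):

```python
from importlib.metadata import version  # stdlib since Python 3.8

# Abbreviated subset of the C6 pins, for illustration.
pinned = {"numpy": "1.10.4", "pandas": "0.20.3", "scipy": "0.17.0"}

def check_versions(pins):
    """Return {package: (pinned, installed)} for every missing or mismatched package."""
    mismatches = {}
    for pkg, want in pins.items():
        try:
            have = version(pkg)
        except Exception:  # package not installed
            have = None
        if have != want:
            mismatches[pkg] = (want, have)
    return mismatches
```

Calling check_versions(pinned) on a modern environment would report all three packages as mismatched, which is expected: the pins document the environment the paper's benchmarks were run in, not a requirement for current use.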


Fig. A.1. Average runtime of time series feature extraction methods documented at http://tsfresh.readthedocs.io/en/latest/text/list_of_features.html.


References

[1] J. Gubbi, R. Buyya, S. Marusic, M. Palaniswami, Internet of Things (IoT): a vision, architectural elements, and future directions, Future Gener. Comput. Syst. 29 (7) (2013) 1645–1660.
[2] M. Hermann, T. Pentek, B. Otto, Design principles for Industrie 4.0 scenarios, in: Proceedings of 2016 49th Hawaii International Conference on System Sciences (HICSS), 2016, p. 3928.
[3] F.S. Collins, H. Varmus, A new initiative on precision medicine, N. Engl. J. Med. 372 (9) (2015) 793–795, doi:10.1056/NEJMp1500523.
[4] R.K. Mobley, An Introduction to Predictive Maintenance, second ed., Elsevier Inc., Woburn, MA, 2002.
[5] B.D. Fulcher, M.A. Little, N.S. Jones, Highly comparative time-series analysis: the empirical structure of time series and their methods, J. R. Soc. Interface 10 (83) (2013) 20130048.
[6] M. Christ, F. Kienle, A.W. Kempa-Liehr, Time series analysis in industrial applications, in: Proceedings of Workshop on Extreme Value and Time Series Analysis, KIT Karlsruhe, 2016, doi:10.13140/RG.2.1.3130.7922.
[7] A.L. Buczak, E. Guven, A survey of data mining and machine learning methods for cyber security intrusion detection, IEEE Commun. Surv. Tutor. 18 (2) (2016) 1153–1176.
[8] J. Wiens, E. Horvitz, J.V. Guttag, Patient risk stratification for hospital-associated C. diff as a time-series classification task, in: F. Pereira, C.J.C. Burges, L. Bottou, K.Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25, Curran Associates, Inc., 2012, pp. 467–475.
[9] J. Yan, Machinery Prognostics and Prognosis Oriented Maintenance Management, John Wiley & Sons, Singapore, 2015.
[10] M. Christ, J. Krumeich, A.W. Kempa-Liehr, Integrating predictive analytics into complex event processing by using conditional density estimations, in: Proceedings of IEEE 20th International Enterprise Distributed Object Computing Workshop (EDOCW), IEEE Computer Society, Los Alamitos, CA, USA, 2016, pp. 1–8, doi:10.1109/EDOCW.2016.7584363.
[11] A. Kempa-Liehr, Performance analysis of concurrent workflows, J. Big Data 2 (10) (2015) 1–14, doi:10.1186/s40537-015-0017-0.
[12] M. Christ, A.W. Kempa-Liehr, M. Feindt, Distributed and parallel time series feature extraction for industrial big data applications, Asian Machine Learning Conference (ACML) 2016, Workshop on Learning on Big Data (WLBD), Hamilton (New Zealand), arXiv preprint arXiv:1610.07717v1.
[13] G.E.P. Box, G.M. Jenkins, G.C. Reinsel, G.M. Ljung, Time Series Analysis: Forecasting and Control, fifth ed., John Wiley & Sons, Hoboken, New Jersey, 2016.
[14] C.M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, New York, 2006.
[15] B.D. Fulcher, Feature-Based Time-Series Analysis, Cornell University Library, 2017, arXiv:1709.08055v2.
[16] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: machine learning in Python, J. Mach. Learn. Res. 12 (2011) 2825–2830.
[17] S.v.d. Walt, S.C. Colbert, G. Varoquaux, The NumPy array: a structure for efficient numerical computation, Comput. Sci. Eng. 13 (2) (2011) 22–30.
[18] W. McKinney, Data structures for statistical computing in Python, in: S. van der Walt, J. Millman (Eds.), Proceedings of the 9th Python in Science Conference, 2010, pp. 51–56.
[19] E. Jones, T. Oliphant, P. Peterson, et al., SciPy: open source scientific tools for Python, 2001. http://www.scipy.org/.
[20] F. Chollet, Keras, 2015. https://github.com/fchollet/keras.
[21] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G.S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, X. Zheng, TensorFlow: large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
[22] M. Rocklin, Dask: parallel computation with blocked algorithms and task scheduling, in: K. Huff, J. Bergstra (Eds.), Proceedings of the 14th Python in Science Conference, 2015, pp. 130–136.
[23] L.S. Lopes, Robot learning at the task level: a study in the assembly domain, Ph.D. thesis, Universidade Nova de Lisboa, Portugal, 1997.
[24] M. Lichman, UCI Machine Learning Repository, 2013. http://archive.ics.uci.edu/ml.
[25] A.W. Liehr, Dissipative Solitons in Reaction Diffusion Systems: Mechanisms, Dynamics, Interaction, Springer Series in Synergetics, vol. 70, Springer, Berlin, 2013, doi:10.1007/978-3-642-31251-9.
[26] B.D. Fulcher, N.S. Jones, hctsa: a computational framework for automated time-series phenotyping using massive feature extraction, Cell Syst. 5 (5) (2017) 527–531.e3, doi:10.1016/j.cels.2017.10.001.

Maximilian Christ received his M.S. degree in Mathematics and Statistics from Heinrich-Heine-Universität Düsseldorf, Germany, in 2014. In his daily work as a Data Science Consultant at Blue Yonder GmbH he optimizes business processes through data driven decisions. Beside his business work he is pursuing a Ph.D. in collaboration with the University of Kaiserslautern, Germany. His research interests relate to how to deliver business value through Machine Learning based optimizations.

Nils Braun is a Ph.D. student in High Energy Particle Physics at Karlsruhe Institute of Technology (KIT), Germany. His research is focused on developing and optimizing scientific software for analysing and processing large amounts of recorded data efficiently. He received his M.S. degree in Physics at KIT in 2016. He has worked at Blue Yonder GmbH as a Data Science Engineer, where he developed platforms and utilities for various data science tasks.

Julius Neuffer is a Software Developer at Blue Yonder GmbH. He has a background in computer science and philosophy. Aside from software development, he takes interest in applying machine learning to real-world problems.

Andreas W. Kempa-Liehr is a Senior Lecturer at the Department of Engineering Science of the University of Auckland, New Zealand, and an Associate Member of the Freiburg Materials Research Center (FMF) at the University of Freiburg, Germany. Andreas received his doctorate from the University of Münster in 2004 and continued his research as head of the service group Scientific Information Processing at FMF. From 2009 to 2016 he was working in different data science roles at EnBW Energie Baden-Württemberg AG and Blue Yonder GmbH.

