LAB BASED PROJECT 

TIME SERIES CLASSIFICATION 


ANIKET SUJAY - 17122006 
SANAT BHARGAVA - 17122023 

Introduction 

Time series classification is one of the most common problems in the modern data science paradigm. Time series represent many important real-world events, and with the advent of technology, the amount of data available for exploitation has exploded in the past decade. Classic statistical machine learning algorithms are no longer sufficient to analyze this coming data wave. Neural networks, although invented around the mid-1900s, have seen a boom of interest in the data science community. This is due to their performance on big datasets and their applicability across multiple paradigms, such as computer vision and sequential data analysis. One variation of the classical neural network used for the analysis of sequential data is the RNN (Recurrent Neural Network). These networks are extremely good at learning the underlying long-range dependencies in sequential data, which makes them ideal for time series analysis.

But one of the newer techniques for analyzing such data is to encode it in other forms and use different models to get better performance. One such architecture is the CNN. CNNs have proven to be among the most capable models in the field of computer vision. In this project, we explore one such technique, where a 1-D time series is encoded into a 2-D image and a CNN is then used to learn from it and classify.

About the Data 

In this project, we decided to use the PLAsTiCC astronomy dataset available on Kaggle. PLAsTiCC is intended to simulate sources that vary with time in the night sky, as seen during the first three years of operation of the Large Synoptic Survey Telescope (LSST). The LSST is a telescope with an 8.4-meter primary mirror being built high in the Atacama desert of Chile, on a mountain called Cerro Pachon.

The dataset provided has two parts: the training set and the metadata information.

Training set: 

We have 6 columns in this file: object_id, mjd, passband, flux, flux_err, and detected. The first column, object_id, is a unique identifier for each observed object. The second column, mjd (Modified Julian Date), is the time of observation; its exact form is of no consequence for our purposes. The third column, passband, indicates the filter used to gather the data; there are 6 passbands for each object, so we have 6 different time series per unique object. The fourth column, flux, is the measured brightness of the object under observation. The fifth column, flux_err, denotes the 68% confidence interval for the flux values. The sixth column, detected, is a binary flag marking whether the observation was a significant detection.
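To make this layout concrete, here is a minimal pandas sketch (assuming the Kaggle file names training_set.csv and training_set_metadata.csv) that loads the files and splits each object's observations into one series per passband:

import pandas as pd

# Load the PLAsTiCC training files (file names as provided on Kaggle).
train = pd.read_csv("training_set.csv")
meta = pd.read_csv("training_set_metadata.csv")

print(train.columns.tolist())
# ['object_id', 'mjd', 'passband', 'flux', 'flux_err', 'detected']

# One light curve per (object_id, passband): 6 separate time series per object.
curves = {
    key: grp.sort_values("mjd")["flux"].to_numpy()
    for key, grp in train.groupby(["object_id", "passband"])
}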

Training_set_metadata: 

This file contains additional information about the training_set data: the orientation of the telescope, redshift information, and the coordinates of the object under observation in different coordinate systems.

EDA 

First, we explore the data and analyze its structure. Plotting the length of the time series for different objects shows that the series lengths are highly non-uniform. This is because the sampling schedule of the telescope is not the same all the time: for example, the telescope might observe a source for about 2 hours on Monday, then return on Tuesday but observe for a different duration, say 2.5 hours. This non-uniformity creates a problem, because when we encode the time series into images later on, we will get images of different sizes, and resizing them will result in some loss of information.
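A short sketch of this check, continuing from the loading snippet above:

# Measure the length of each (object, passband) series to quantify
# how non-uniform the sampling is.
lengths = train.groupby(["object_id", "passband"]).size()
print(lengths.describe())        # spread of the series lengths
print(lengths.unstack().head())  # per-object lengths across the 6 passbands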


Even the number of observations made for an object varies across passbands. This also has to be fixed, further resulting in loss of information due to resizing. For example, if we have 57 observations for some passband of an object, the encoding scheme below will create a 57x57 image, while the same object might have only 50 observations in another passband, resulting in a 50x50 image. So we have to either pad the smaller image to make it larger, or crop the larger image to fit the smaller size.


Approach 

As stated in the introduction, we use an encoding technique that turns the 1-D time series data into a 2-D matrix. This is done to exploit the strong performance of the CNN architecture on image data. We have 6 passbands for each object in the dataset; these act as channels, analogous to the 3 channels of RGB image data.

So far we have image-like data with 6 channels, each representing one passband of an object. This is fed into a CNN with several convolution and pooling layers, whose output is then fed into a fully connected neural network for the classification task.

Time-Series to Image Encoding

A time series can be characterized by distinct recurrent behavior, such as periodicities and irregular cyclicities. The main idea is to reveal at which points a trajectory returns to a previous state, which can be formulated as:

R(i, j) = θ(ε − ||s_i − s_j||),   s(·) ∈ R^m,   i, j = 1, ..., K        (1)

where K is the number of considered states s, ε is a threshold distance, ||·|| a norm, and θ(·) the Heaviside function.

θ(·) could, in principle, be replaced by other activation functions, e.g. ReLU, leaky ReLU, tanh, or sigmoid. Below is the encoded image obtained for the object with id=730, for one passband; we have 6 such images for each object.

In our implementation, we use a modified form of the above encoding function. Since the threshold ε would have to be chosen differently for each image, we keep things simple and replace the Heaviside function with the sigmoid.
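A minimal NumPy sketch of this modified encoding, under one plausible reading (apply the sigmoid directly to the pairwise distances; the exact form used in the project may differ):

import numpy as np

def encode_series(series: np.ndarray) -> np.ndarray:
    """Encode a 1-D series of K values into a K x K image: Eq. (1) with the
    Heaviside step replaced by a sigmoid, so no threshold is needed."""
    # Pairwise distances |s_i - s_j| between all pairs of observations.
    diff = np.abs(series[:, None] - series[None, :])
    # The sigmoid squashes distances into (0, 1) instead of hard-thresholding.
    return 1.0 / (1.0 + np.exp(-diff))

# A 57-observation light curve yields a 57x57 single-channel image; stacking
# the 6 passbands of one object gives a (57, 57, 6) input for the CNN.
image = encode_series(np.random.randn(57))
print(image.shape)  # (57, 57)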


As we can see, this is regular image data. The advantage of this encoding is that all the sequential information is preserved in the representation, so networks such as RNNs are no longer needed: a CNN can learn the underlying sequential structure of the object's light flux by learning its image representation.

After encoding and analyzing the image sizes, we see that the dataset contains far more small images than large ones. This is a problem: CNNs perform worse on small, low-resolution images, because the pooling layers already discard some of the pixels, so the network cannot fully learn the structure of the data from them.


We have two options here: we can pad the smaller images to scale them up to dimension (57, 57, 6), or we can simply use only the larger images.

We try the simpler second approach first: we keep only the images with dimensions larger than 50 and scale them up to 57. After scaling, we get the following size distribution; a sketch of this filtering-and-rescaling step is below.
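This sketch assumes scikit-image for the rescaling and a hypothetical list of encoded images:

import numpy as np
from skimage.transform import resize

def prepare_images(images, min_size=50, target=57):
    # Keep only encoded images of side >= min_size, then rescale to target.
    # `images` is a hypothetical list of (K, K, 6) arrays, one per object.
    kept = [img for img in images if img.shape[0] >= min_size]
    return np.stack([resize(img, (target, target, img.shape[2]))
                     for img in kept])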


Now we have about 1644 instances of training data. We next perform the train-test split, as sketched below.
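A sketch of the split with scikit-learn (X and y are the hypothetical encoded images and class labels):

from sklearn.model_selection import train_test_split

# X: array of shape (1644, 57, 57, 6); y: integer class labels per object.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)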

ARCHITECTURE 

A multi-stage deep CNN model is applied here, with a 6-channel input of size 57 × 57 and an output layer of c neurons, one per class. Each feature-learning stage represents a different feature level and consists of convolution (filter), activation, and pooling operators, respectively. The inputs and outputs of these layers are called feature maps. A filter layer convolves its input with a set of trainable kernels. The convolutional layer is the core building block of a CNN and exploits spatially local correlation by enforcing a local connectivity pattern between neurons of adjacent layers. The connections are local but always extend along the entire depth of the input volume, in order to produce the strongest response to a spatially local input pattern. The activation function (such as sigmoid or tanh) introduces non-linearity into the network and allows it to learn complex models. Here we apply ReLU (Rectified Linear Units), because it trains neural networks several times faster without a significant penalty to generalization accuracy. Pooling (a.k.a. subsampling) reduces the resolution of the input and makes the network robust to small variations in previously learned features; it combines outputs of the (i-1)-th layer into single inputs of the i-th layer over local neighborhoods. At the end of the 2-stage feature extraction, the feature maps are flattened and fed into a fully connected (FC) layer for classification. FC layers connect every neuron in one layer to every neuron in the next, and are in principle the same as a traditional multi-layer perceptron (MLP).
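A minimal Keras sketch of such a 2-stage CNN (the layer sizes and class count are assumptions, not the exact model used here):

from tensorflow.keras import layers, models

NUM_CLASSES = 14  # assumption: one output neuron per target class

model = models.Sequential([
    layers.Input(shape=(57, 57, 6)),
    # Stage 1: convolution + ReLU activation + pooling.
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Stage 2: a second feature-learning stage.
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Flatten the feature maps and classify with fully connected layers.
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])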


We get 72% training set accuracy and around 61% test set accuracy with the above model. We can change a 
few things to increase accuracy.  

So far the encoding has been done using only the flux values. Next, we encode using the ratio of flux to flux_err, which adds information about measurement uncertainty to the encoding (see the sketch below).
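A one-line sketch of the change, reusing encode_series from the earlier snippet (flux and flux_err are hypothetical per-observation arrays for one series):

snr = flux / flux_err        # flux scaled by its measurement uncertainty
image = encode_series(snr)   # encode the ratio instead of the raw flux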

[Figure: encoded images before (left) and after (right) the change in encoding]


In the previous network, training accuracy was 72% and validation accuracy was around 61%, which is not bad, but we can do better. We used the new encoded inputs and ReLU as the activation function of one layer in the network. This increased the training set accuracy drastically, but the model over-fitted the data quite badly. This is a high-variance situation, which we can address with regularization. Regularizing a neural network essentially reduces its complexity, and there are two ways to do it: simplify the network itself, or add a regularization penalty to the hidden layers; a sketch of the latter follows.
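A hedged Keras sketch of the second option, showing both an L2 weight penalty and dropout (the specific values are assumptions):

from tensorflow.keras import layers, regularizers

# An L2 penalty on a hidden layer's weights shrinks them toward zero.
dense = layers.Dense(128, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-3))

# Dropout randomly disables units during training, effectively
# simplifying the network.
drop = layers.Dropout(0.5)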


Attempt 3: Sparse Network

With a sparser network, the training set accuracy goes down, but the model generalizes quite well over the test set data. The model does only moderately well on the training set because of the skewed distribution of time series lengths across objects.

Changing the Metrics 

We now replace the classical accuracy metric with a more informative one: PrecisionAtRecall, which reports the precision achieved at a specified level of recall and is well suited to classification models.
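A sketch of attaching this metric in Keras (the recall target of 0.8 is an assumption, and the labels are assumed one-hot encoded):

import tensorflow as tf

model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",  # assumes one-hot labels
    # Precision achieved once the decision threshold reaches the target recall.
    metrics=[tf.keras.metrics.PrecisionAtRecall(recall=0.8)],
)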


Training Accuracy: 84.60% 

Validation Accuracy: 77.17% 

Reference: 

Academic Paper: "Classification of Time-Series Images Using Deep Convolutional Neural Networks", Nima Hatami, Yann Gavet and Johan Debayle, École Nationale Supérieure des Mines de Saint-Étienne, SPIN/LGF CNRS UMR 5307, 158 cours Fauriel, 42023 Saint-Étienne, France.
Link: https://arxiv.org/pdf/1710.00886.pdf

Data: PLAsTiCC dataset on Kaggle.
Link: https://www.kaggle.com/c/PLAsTiCC-2018/overview
