Plot - Sleep: 1 Sleep Stage Classification From Polysomnography (PSG) Data
December 7, 2021
import numpy as np
import matplotlib.pyplot as plt
import mne
from mne.datasets.sleep_physionet.age import fetch_data
from mne.time_frequency import psd_welch
1.1 Load the data
Here we download the data from two subjects; the end goal is to obtain :term:`epochs` and their
associated ground truth.
MNE-Python provides :func:`mne.datasets.sleep_physionet.age.fetch_data` to
conveniently download data from the Sleep Physionet dataset [1]_ [2]_. Given a list of subjects
and records, the fetcher downloads the data and returns, for each subject, a pair of files:
• -PSG.edf, containing the polysomnography: the :term:`raw` data from the EEG helmet,
• -Hypnogram.edf, containing the :term:`annotations` recorded by an expert.
Combining these two in a :class:`mne.io.Raw` object, we can then extract :term:`events` based on
the descriptions of the annotations to obtain the :term:`epochs`.
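The fetching step described above might be sketched as follows. The subject indices, the recording number, the helper name `fetch_file_pairs`, and the channel names in `mapping` are assumptions made for illustration; none of them appear in the text here, so check the channel names against `raw.ch_names` before relying on them.

```python
# Subject indices and the recording number are illustrative assumptions.
ALICE, BOB = 0, 1

# Channel types for the non-EEG channels. The channel names are assumed;
# compare them with raw.ch_names before use.
mapping = {
    "EOG horizontal": "eog",
    "Resp oro-nasal": "misc",
    "EMG submental": "misc",
    "Temp rectal": "misc",
    "Event marker": "misc",
}


def fetch_file_pairs(subjects=(ALICE, BOB), recording=(1,)):
    """Return one [PSG path, hypnogram path] pair per subject (downloads data)."""
    # Imported lazily so that defining this helper does not require MNE.
    from mne.datasets.sleep_physionet.age import fetch_data

    return fetch_data(subjects=list(subjects), recording=list(recording))
```

Calling `alice_files, bob_files = fetch_file_pairs()` would trigger the download on first use; each returned pair holds the PSG file first and the hypnogram file second.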
1.1.1 Read the PSG data and Hypnograms to create a raw object
raw_train = mne.io.read_raw_edf(alice_files[0])
annot_train = mne.read_annotations(alice_files[1])
raw_train.set_annotations(annot_train, emit_warning=False)
raw_train.set_channel_types(mapping)
[3]:
1.1.2 Extract 30s events from annotations
The Sleep Physionet dataset is annotated using `8 labels <physionet_labels_>`_: Wake (W),
Stage 1, Stage 2, Stage 3, Stage 4 (ranging from light sleep to deep sleep), REM
sleep (R), where REM stands for Rapid Eye Movement, movement (M), and Stage
(?) for any unscored segment.
We will work with only 5 stages: Wake (W), Stage 1, Stage 2, Stage 3/4 (Stages 3 and 4 merged),
and REM sleep (R).
To do so, we use the event_id parameter of :func:`mne.events_from_annotations` to select the
events we are interested in and to associate an event identifier with each of them.
Moreover, the recordings contain long awake (W) regions before and after each night. To limit
the impact of class imbalance, we trim each recording, keeping only 30 minutes of wake time
before the first sleep stage and 30 minutes after the last one.
[4]: annotation_desc_2_event_id = {'Sleep stage W': 1,
                                   'Sleep stage 1': 2,
                                   'Sleep stage 2': 3,
                                   'Sleep stage 3': 4,
                                   'Sleep stage 4': 4,
                                   'Sleep stage R': 5}
# keep the last 30 min of wake before sleep and the first 30 min of wake after
# sleep, then redefine the annotations on the raw data
annot_train.crop(annot_train[1]['onset'] - 30 * 60,
                 annot_train[-2]['onset'] + 30 * 60)
raw_train.set_annotations(annot_train, emit_warning=False)
events_train, _ = mne.events_from_annotations(
    raw_train, event_id=annotation_desc_2_event_id, chunk_duration=30.)
# event_id used for plotting: stages 3 and 4 share a single label
event_id = {'Sleep stage W': 1,
            'Sleep stage 1': 2,
            'Sleep stage 2': 3,
            'Sleep stage 3/4': 4,
            'Sleep stage R': 5}
# plot events
fig = mne.viz.plot_events(events_train, event_id=event_id,
                          sfreq=raw_train.info['sfreq'],
                          first_samp=events_train[0, 0])
Used Annotations descriptions: ['Sleep stage 1', 'Sleep stage 2', 'Sleep stage
3', 'Sleep stage 4', 'Sleep stage R', 'Sleep stage W']
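To make the effect of the label mapping and of `chunk_duration=30.` concrete, here is a minimal pure-Python sketch. It is a simplified illustration, not MNE's implementation: the helper `annotations_to_events` and the toy hypnogram are hypothetical, and onsets are kept in seconds, whereas MNE returns sample indices.

```python
EPOCH_LEN = 30.0  # seconds, matching chunk_duration=30.

# Stages 3 and 4 map to the same identifier, which merges them.
annotation_desc_2_event_id = {
    "Sleep stage W": 1,
    "Sleep stage 1": 2,
    "Sleep stage 2": 3,
    "Sleep stage 3": 4,
    "Sleep stage 4": 4,
    "Sleep stage R": 5,
}


def annotations_to_events(annotations, mapping, chunk=EPOCH_LEN):
    """Split each (onset, duration, description) span into full chunks."""
    events = []
    for onset, duration, description in annotations:
        if description not in mapping:
            continue  # labels outside the mapping (movement, unscored) are dropped
        n_chunks = int(duration // chunk)  # only complete chunks are kept
        for k in range(n_chunks):
            events.append((onset + k * chunk, mapping[description]))
    return events


# Hypothetical hypnogram: 90 s wake, 60 s stage 3, 30 s stage 4, 30 s unscored.
toy_annotations = [
    (0.0, 90.0, "Sleep stage W"),
    (90.0, 60.0, "Sleep stage 3"),
    (150.0, 30.0, "Sleep stage 4"),
    (180.0, 30.0, "Sleep stage ?"),
]
events = annotations_to_events(toy_annotations, annotation_desc_2_event_id)
print(events)
```

The wake span yields three events with identifier 1, the stage 3 and stage 4 spans together yield three events with identifier 4, and the unscored span yields none.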
1.1.3 Create Epochs from the data based on the events found in the annotations
print(epochs_train)
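The epoch-creation call itself is not shown here. As a sketch of the arithmetic involved, 30 s epochs that sit back to back need an end time one sample short of 30 s, because both endpoints of the window are included. The helper `epoch_window` is hypothetical, and the 100 Hz sampling rate is inferred from the "3000 original time points" per 30 s epoch reported in the log output. In MNE such a window would typically be passed as `tmin` and `tmax` to `mne.Epochs` with `baseline=None`; that usage is an assumption here.

```python
def epoch_window(sfreq, epoch_len=30.0):
    """Return (tmin, tmax, n_times) for back-to-back epochs of epoch_len seconds."""
    tmin = 0.0
    # Both endpoints are included, so stop one sample before the next epoch starts.
    tmax = epoch_len - 1.0 / sfreq
    n_times = int(round((tmax - tmin) * sfreq)) + 1
    return tmin, tmax, n_times


# Sampling rate inferred from 3000 time points per 30 s epoch (an assumption).
tmin, tmax, n_times = epoch_window(sfreq=100.0)
print(tmin, tmax, n_times)
```

With these values each epoch holds 3000 samples, which agrees with the log output.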
1.1.4 Applying the same steps to the test data from Bob
print(epochs_test)
for ax, title, epochs in zip([ax1, ax2],
                             ['Alice', 'Bob'],
                             [epochs_train, epochs_test]):
Loading data for 58 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 250 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 220 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 125 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 188 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 109 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 562 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 105 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 170 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 157 events and 3000 original time points ...
0 bad epochs dropped
Using multitaper spectrum estimation with 7 DPSS windows
1.2.1 Design a scikit-learn transformer from a Python function
We will now create a function to extract EEG features based on relative power in specific frequency
bands to be able to predict sleep stages from EEG signals.
[8]: def eeg_power_band(epochs):
    """EEG relative power band feature extraction.

    This function takes an ``mne.Epochs`` object and creates EEG features based
    on relative power in specific frequency bands that are compatible with
    scikit-learn.

    Parameters
    ----------
    epochs : Epochs
        The data.

    Returns
    -------
    X : numpy array of shape [n_samples, 5 * n_channels]
        Transformed data.
    """
    # specific frequency bands
    FREQ_BANDS = {"delta": [0.5, 4.5],
                  "theta": [4.5, 8.5],
                  "alpha": [8.5, 11.5],
                  "sigma": [11.5, 15.5],
                  "beta": [15.5, 30]}
    # Welch PSD of the EEG channels over the range spanned by the bands
    psds, freqs = psd_welch(epochs, picks='eeg', fmin=0.5, fmax=30.)
    # normalize each spectrum so that band values are relative power
    psds /= np.sum(psds, axis=-1, keepdims=True)
    X = []
    for fmin, fmax in FREQ_BANDS.values():
        psds_band = psds[:, :, (freqs >= fmin) & (freqs < fmax)].mean(axis=-1)
        X.append(psds_band.reshape(len(psds), -1))
    # one column per (band, channel) pair
    return np.concatenate(X, axis=1)
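A plain feature function such as `eeg_power_band` can be turned into a scikit-learn transformer with `FunctionTransformer` and chained with a classifier. The sketch below shows one way to do this. The helper name `build_pipeline`, the choice of a random forest, and its hyperparameters are assumptions; the definition of `pipe` is not shown in this document.

```python
def build_pipeline(feature_fn, n_estimators=100, random_state=42):
    """Chain a stateless feature function with a classifier.

    The random forest and its settings are illustrative assumptions.
    """
    # Imported lazily so that defining this helper does not require scikit-learn.
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import FunctionTransformer

    # validate=False passes the input through unchanged, so feature_fn can
    # receive objects that are not 2D arrays (for example mne.Epochs).
    transformer = FunctionTransformer(feature_fn, validate=False)
    classifier = RandomForestClassifier(n_estimators=n_estimators,
                                        random_state=random_state)
    return make_pipeline(transformer, classifier)
```

With this helper, `pipe = build_pipeline(eeg_power_band)` would produce an object that accepts epochs directly in `fit` and `predict`, because the feature extraction runs as the first pipeline step.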
# Train
y_train = epochs_train.events[:, 2]
pipe.fit(epochs_train, y_train)
# Test
y_pred = pipe.predict(epochs_test)
Loading data for 841 events and 3000 original time points ...
0 bad epochs dropped
Effective window size : 2.560 (s)
Loading data for 1103 events and 3000 original time points ...
0 bad epochs dropped
Effective window size : 2.560 (s)
Accuracy score: 0.6699909338168631
In short, yes: we can predict Bob’s sleep stages from Alice’s data, with an accuracy of about 67% across the five classes.
[[156   0   1   0   0]
 [ 67  27   5   3   7]
 [ 53  51 401  33  24]
 [  0   0   5 100   0]
 [ 52  45  18   0  55]]
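The accuracy score can be recovered from the confusion matrix by hand, and the same matrix gives a per-stage view. The sketch below assumes that rows are true labels and columns are predictions, ordered by event identifier (W, 1, 2, 3/4, R). The row sums (157, 109, 562, 105, 170) match Bob's per-stage epoch counts in the log output, which supports that reading.

```python
STAGES = ["W", "1", "2", "3/4", "R"]  # assumed row/column order (event ids 1 to 5)

confusion = [
    [156, 0, 1, 0, 0],
    [67, 27, 5, 3, 7],
    [53, 51, 401, 33, 24],
    [0, 0, 5, 100, 0],
    [52, 45, 18, 0, 55],
]


def accuracy(matrix):
    """Fraction of epochs on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """Diagonal entry divided by its row sum, for each true stage."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


print(f"accuracy = {accuracy(confusion):.4f}")  # 739 correct out of 1103
for stage, recall in zip(STAGES, per_class_recall(confusion)):
    print(f"stage {stage}: recall = {recall:.2f}")
```

Under that assumption, wake and stage 3/4 are recovered well (recall of about 0.99 and 0.95), while stage 1 and REM are mostly missed (about 0.25 and 0.32), so the overall accuracy hides large differences between stages.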
1.4 Exercise
Fetch 50 subjects from the Physionet database and run a 5-fold cross-validation, leaving
10 subjects out in the test set each time.
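As a starting point for the exercise, the split has to be made over subjects rather than over epochs, so that no subject contributes to both the training and the test set. A minimal sketch of such a split follows; the helper `subject_folds` and the subject identifiers 0 to 49 are hypothetical.

```python
import random


def subject_folds(subject_ids, n_folds=5, seed=0):
    """Yield (train_subjects, test_subjects) pairs with disjoint subjects."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for k, test_subjects in enumerate(folds):
        train_subjects = [s for j, fold in enumerate(folds) if j != k for s in fold]
        yield train_subjects, test_subjects


# Hypothetical subject identifiers; the real dataset may skip some indices.
splits = list(subject_folds(range(50)))
for train_subjects, test_subjects in splits:
    print(len(train_subjects), "train /", len(test_subjects), "test")
```

Each fold trains on 40 subjects and tests on the remaining 10. scikit-learn's `GroupKFold`, with the subject identifier of each epoch passed as `groups`, is an alternative that performs this kind of grouped split on epoch-level data.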
1.5 References
.. [1] Kemp B, Zwinderman AH, Tuk B, Kamphuisen HAC, Oberyé JJL. (2000) Analysis of a
sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE-BME
47(9):1185-1194.
.. [2] Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus
JE, Moody GB, Peng C-K, Stanley HE. (2000) PhysioBank, PhysioToolkit, and PhysioNet:
Components of a New Research Resource for Complex Physiologic Signals. Circulation
101(23):e215-e220.
.. [3] Chambon S, Galtier M, Arnal P, Wainrib G, Gramfort A. (2018) A Deep Learning
Architecture for Temporal Sleep Stage Classification Using Multivariate and Multimodal Time
Series. IEEE Trans. on Neural Systems and Rehabilitation Engineering 26:758-769.