
How To Get Data From The MIT-BIH Arrhythmia Database - by Proto Bioengineering - Medium

The document provides a tutorial on how to access and analyze data from the MIT-BIH Arrhythmia Database using Python. It outlines the steps to download the WFDB library, obtain the ECG data, visualize it, and extract it into a CSV format for further analysis. The tutorial is aimed at helping users navigate the complexities of the database and utilize the data effectively.

29.05.2024, 17:36 How to Get Data from the MIT-BIH Arrhythmia Database | by Proto Bioengineering | Medium


How to Get Data from the MIT-BIH Arrhythmia Database
Proto Bioengineering · Follow
5 min read · Jul 11, 2023


Use Python to read the most famous heart rhythm database in the world.

Photo by Alexander Sinn on Unsplash

The MIT-BIH Arrhythmia Database is a set of 48 half-hour heart rhythm recordings (AKA
“electrocardiograms” or “ECGs”) from 47 patients, collected between 1975 and 1979. The recordings
range from healthy to heart-diseased patients and are useful for practicing heart analysis with code.

https://medium.com/@protobioengineering/how-to-get-heart-data-from-the-mit-bih-arrhythmia-database-e452d4bf7215 1/18

However, downloading and using the data is not straightforward. If you go to the download
page, you’ll see a ZIP file and a bunch of .atr, .hea, and other obscure file types.

And if you try to open these in a text editor, you’ll mostly get weird characters or hex data,
because the .dat and .atr files are binary, not plain text.

Attempting to open `100.dat` and `100.atr` in Sublime Text editor.

We will fix this by reading the ECG data with WFDB (Waveform Database), a waveform reading
library available for multiple languages, like C, Python, MATLAB, and more.

This tutorial will cover the steps in Python.

Overall Steps

1. Download the WFDB library ( wfdb )

2. Download the ZIP file from Physionet

3. Open one of the ECG recordings using wfdb

4. (Optional) Visualize the data with Matplotlib

5. Extract the raw data to a CSV

1. Download the WFDB library


WFDB is available via Pip. Download it with one of the following commands:

pip install wfdb

python -m pip install wfdb

python3 -m pip install wfdb

See this link for troubleshooting info.
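If you’re not sure whether the install worked, a quick check like the one below can help. This is a minimal sketch rather than part of the original steps: the helper name is_installed is made up for illustration, and it only confirms that Python can find the package, not that the package works correctly.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if Python can locate a top-level package with this name."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("wfdb"):
        print("wfdb is installed")
    else:
        print("wfdb was not found; try: python -m pip install wfdb")
```

Run it with the same Python interpreter you plan to use for the rest of the tutorial, since each interpreter keeps its own set of installed packages.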

2. Download the ZIP file of heart data from Physionet


The full database is in a ZIP file on Physionet.org. Scroll to the bottom of that page and click
“Download the ZIP file.”

Then unzip the file. You’ll see 4 files per patient (.atr, .dat, .hea, and .xws). The patients are
numbered from 100 to 234, with gaps in the numbering (there are 48 recordings in total).
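If you’d rather unzip with Python, the sketch below extracts the archive and lists which of the four file types it found for each record. The function name extract_and_group and the file and folder names in the usage note are placeholders chosen for illustration, so swap in whatever your download is actually called.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def extract_and_group(zip_path, dest_dir):
    """Unzip the archive and map each record name to its file extensions."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)

    wanted = {".atr", ".dat", ".hea", ".xws"}
    records = defaultdict(set)
    # Search subfolders too, in case the ZIP unpacks into its own folder
    for path in dest.rglob("*"):
        if path.is_file() and path.suffix in wanted:
            records[path.stem].add(path.suffix)
    return {name: sorted(exts) for name, exts in sorted(records.items())}
```

For example, extract_and_group("mitdb.zip", "mitdb") would unzip a file named mitdb.zip into a folder named mitdb (both names are assumptions) and return a dictionary keyed by record name.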


How to unzip files on Windows?

3. Open one of the ECG recordings with WFDB


Use the following code to open one of the ECG recordings for a single patient.

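Below, we open the recording for patient 100. Run the script from the folder that contains the
unzipped files, or pass the path to the record (without a file extension) instead of just "100".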

# Python 3
import wfdb

patient_record = wfdb.rdrecord("100")

Note that we leave off the file extension in our code. It’s just "100", rather than "100.dat" or
"100.hea". This is because WFDB automatically appends .hea to our argument, reads the 100.hea
header file, and uses it to locate and decode the 100.dat signal file. It’s kind of odd, but that’s
how it works.
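Unlike the other files, the .hea header is plain text, so you can see what WFDB is reading. The sketch below parses a header by hand, purely to illustrate the idea. The sample text is modeled on what 100.hea contains (check your own copy, since the exact values here are an assumption), and parse_header is a made-up helper that handles only this simple layout, not the full WFDB header format.

```python
SAMPLE_HEADER = """\
100 2 360 650000
100.dat 212 200 11 1024 995 -22131 0 MLII
100.dat 212 200 11 1024 1011 20052 0 V5
# 69 M 1085 1629 x1
# Aldomet, Inderal
"""


def parse_header(text):
    """Parse the record line, signal lines, and comments of a simple header."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    comments = [line[1:].strip() for line in lines if line.startswith("#")]
    body = [line for line in lines if not line.startswith("#")]

    # Record line: name, number of signals, sampling rate (Hz), sample count
    name, n_sig, fs, sig_len = body[0].split()[:4]

    # Signal lines: the first field is the data file, the last is the lead name
    signals = []
    for line in body[1 : 1 + int(n_sig)]:
        fields = line.split()
        signals.append({"file": fields[0], "format": fields[1], "lead": fields[-1]})

    return {
        "record_name": name,
        "n_sig": int(n_sig),
        "fs": float(fs),
        "sig_len": int(sig_len),
        "signals": signals,
        "comments": comments,
    }


header = parse_header(SAMPLE_HEADER)
print(header["signals"][0]["file"])           # the data file the header points to
print(header["sig_len"] / header["fs"] / 60)  # recording length in minutes (about 30)
```

In practice you never need to do this yourself, since wfdb.rdrecord handles it. The point is only that the header names the .dat file and describes how to decode it, which is why the header is the entry point.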

4. Visualize the data (optional)


WFDB has some built-in functions to make graphs of the ECG data (which use Matplotlib under
the hood).

To plot, we only need one more line at the bottom:

import wfdb

patient_record = wfdb.rdrecord("100")
wfdb.plot_wfdb(patient_record) # plots the ECG


Running this will give us a Matplotlib graph.

To zoom in, click the magnifying glass, then click and drag on the graph to zoom into a small
section.
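You can also zoom in with code by reading only part of the recording. The sketch below uses the sampfrom and sampto arguments of wfdb.rdrecord to load a short window and then plots it. The helper names seconds_to_samples and plot_window are made up for illustration, and the sampling rate is read from the header rather than hard-coded.

```python
def seconds_to_samples(seconds, fs):
    """Convert a time in seconds to a sample index at sampling rate fs (Hz)."""
    return int(round(seconds * fs))


def plot_window(record_name, start_s, end_s):
    """Read and plot only the samples between start_s and end_s (seconds)."""
    import wfdb  # imported here so the helper above works without wfdb

    fs = wfdb.rdheader(record_name).fs  # sampling rate from the .hea file
    window = wfdb.rdrecord(
        record_name,
        sampfrom=seconds_to_samples(start_s, fs),
        sampto=seconds_to_samples(end_s, fs),
    )
    wfdb.plot_wfdb(record=window, title=f"Record {record_name}, {start_s}-{end_s} s")


# Example (needs 100.hea and 100.dat in the current folder):
# plot_window("100", 0, 10)
```

Loading a short window is also faster than loading the whole recording, which helps when you only want a quick look at a few beats.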


Above are a bunch of little spikes. Each spike is a heart beat. The official name for these spikes
in medicine is a “QRS complex.”

5. Extract the data to a CSV


What about saving the data to a human-friendly CSV?

This can be done by grabbing the data from the patient_record variable.

This is the same code as above, minus the graphing part, plus the data printing part:

import wfdb

patient_record = wfdb.rdrecord("100")
print(patient_record.__dict__)

This prints a truncated version of the dictionary that holds all of the ECG data.


The top chunk of data tells us things like:

the patient number ( record_name )

the number of leads recorded ( n_sig )

some info about the patient ( comments such as 69 male and his prescribed medications, like
Inderal )

etc.

The middle chunk is where the actual ECG data is (the voltages of the leads — in this case, leads
MLII and V5).


And the last chunk of data tells us more about the voltage, units, data format, etc.

To get a basic CSV with two columns (one for each lead), we’ll grab the following from
patient_record :

record_name (patient #)

sig_name (which leads were recorded)

p_signal (the ECG data)

This data can also be accessed via dot notation ( patient_record.p_signal , etc.).
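Since p_signal holds one row per sample and one column per lead, sig_name tells you which column is which. The sketch below picks out a single lead by name. It uses a few stand-in numbers instead of real ECG samples, and lead_column is a made-up helper name.

```python
def lead_column(sig_names, rows, lead):
    """Return every sample for one lead, looked up by its name."""
    index = sig_names.index(lead)  # raises ValueError if the lead is absent
    return [row[index] for row in rows]


# Stand-in values for illustration only (not real ECG samples)
sig_names = ["MLII", "V5"]
rows = [[-0.1, -0.2], [-0.1, -0.3], [0.9, 0.1]]

print(lead_column(sig_names, rows, "V5"))  # [-0.2, -0.3, 0.1]
```

With a real record, the same call would be lead_column(patient_record.sig_name, patient_record.p_signal, "MLII"). Because p_signal is a NumPy array, slicing a column directly with patient_record.p_signal[:, index] is the more common shortcut.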

Below, the data is extracted from the patient_record variable:

import wfdb
import csv

# Open ECG file
patient_record = wfdb.rdrecord("100")

# Extract patient info, lead names, and ECG data
patient_number = patient_record.record_name
leads = patient_record.sig_name
ecg_data = patient_record.p_signal

Then it is written to a new CSV with the following:

# Create CSV (newline="" prevents blank rows on Windows)
filename = f"{patient_number}.csv"
with open(filename, "w", newline="") as outfile:
    out_csv = csv.writer(outfile)

    # Write CSV header with lead names
    out_csv.writerow(leads)

    # Write ECG data to CSV
    for row in ecg_data:
        out_csv.writerow(row)

The output will be a CSV with two columns, one for each ECG lead, and 650,000 rows (plus the header row):

This CSV can then be plugged back in to Matplotlib, SciPy, Pandas, and more for further analysis.
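As a starting point for that further analysis, here is a minimal sketch that reads the CSV back in using only Python’s built-in csv module. The function name read_ecg_csv is made up for illustration, and it assumes the layout described above: a header row of lead names followed by one row of numbers per sample.

```python
import csv


def read_ecg_csv(path):
    """Read an ECG CSV into a dictionary of {lead name: list of float samples}."""
    with open(path, newline="") as infile:
        reader = csv.reader(infile)
        leads = next(reader)                   # header row with the lead names
        columns = {lead: [] for lead in leads}
        for row in reader:
            if not row:                        # skip blank rows, if any
                continue
            for lead, value in zip(leads, row):
                columns[lead].append(float(value))
    return columns


# Example (assumes 100.csv exists in the current folder):
# data = read_ecg_csv("100.csv")
# print(max(data["MLII"]))
```

For larger analyses, Pandas or NumPy will load the same file much faster. This version just shows there is nothing special about the file format.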

The Full Code


This code can be run with the number of any patient in the MIT-BIH Arrhythmia database (100,
207, 231, etc.).

import wfdb
import csv

# Open ECG file
patient_record = wfdb.rdrecord("100") # PUT PATIENT NUMBER HERE

# Extract patient info, lead names, and ECG data
patient_number = patient_record.record_name
leads = patient_record.sig_name
ecg_data = patient_record.p_signal

# Create CSV (newline="" prevents blank rows on Windows)
filename = f"{patient_number}.csv"
with open(filename, "w", newline="") as outfile:
    out_csv = csv.writer(outfile)

    # Write CSV header with lead names
    out_csv.writerow(leads)

    # Write ECG data to CSV
    for row in ecg_data:
        out_csv.writerow(row)
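To convert several patients in one go, the same steps can be wrapped in a function. This is a sketch under the same assumptions as the code above (the .hea and .dat files are where wfdb.rdrecord can find them). The function names write_signal_csv and record_to_csv are made up for illustration.

```python
import csv
from pathlib import Path


def write_signal_csv(path, leads, rows):
    """Write one header row of lead names, then one row per sample."""
    with open(path, "w", newline="") as outfile:
        out_csv = csv.writer(outfile)
        out_csv.writerow(leads)
        out_csv.writerows(rows)


def record_to_csv(record_path, out_dir="."):
    """Read one WFDB record and save its signals as <record_name>.csv."""
    import wfdb  # imported here so write_signal_csv can be used on its own

    record = wfdb.rdrecord(str(record_path))
    out_path = Path(out_dir) / f"{record.record_name}.csv"
    write_signal_csv(out_path, record.sig_name, record.p_signal)
    return out_path


# Example (assumes the .hea and .dat files are in the current folder):
# for number in ["100", "207", "231"]:
#     print(record_to_csv(number))
```

Keeping the CSV writing separate from the WFDB reading makes it easy to reuse the writer for other data, such as a filtered or trimmed version of the signal.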

An alternative method that uses mostly WFDB in Python is also available here.

Easy Download from Kaggle


We also made the dataset available on Kaggle for free.

Questions and Feedback


If you have questions or feedback, email us at [email protected] or message us
on Instagram (@protobioengineering).

If you liked this article, consider supporting us by donating a coffee.

More Cool Health Data


Physionet.org Databases

QRS Complex detector example in C code using WFDB

Sources
Moody GB, Mark RG. The impact of the MIT-BIH Arrhythmia Database. IEEE Eng in Med and
Biol 20(3):45–50 (May-June 2001). (PMID: 11446209)

Physionet.org

WFDB (Waveform Database)
