diff --git a/README.md b/README.md index ee91459a..82256093 100644 --- a/README.md +++ b/README.md @@ -20,7 +20,6 @@ or navigate to any of the documents listed below and download it individually. 3. [Tutorial: Determining Moore's Law with real data in NumPy](content/mooreslaw-tutorial.md) 4. [Tutorial: Saving and sharing your NumPy arrays](content/save-load-arrays.md) 5. [Tutorial: NumPy deep learning on MNIST from scratch](content/tutorial-deep-learning-on-mnist.md) -6. [Tutorial: X-ray image processing](content/tutorial-x-ray-image-processing.md) ## Contributing diff --git a/content/tutorial-x-ray-image-processing.ipynb b/content/tutorial-x-ray-image-processing.ipynb new file mode 100644 index 00000000..97f12b6b --- /dev/null +++ b/content/tutorial-x-ray-image-processing.ipynb @@ -0,0 +1,726 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Tutorial: X-ray image processing" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This tutorial demonstrates how to read and process X-ray images with NumPy, imageio, Matplotlib and SciPy. You will learn how to load medical images, focus on certain parts, and visually compare them using mask filters and the [Sobel](https://en.wikipedia.org/wiki/Sobel_operator) and [Canny](https://en.wikipedia.org/wiki/Canny_edge_detector) edge detectors. X-ray image analysis can be part of your data analysis and [machine learning workflow](https://www.sciencedirect.com/science/article/pii/S235291481930214X) when, for example, you're building an algorithm that helps [detect pneumonia](https://www.kaggle.com/c/rsna-pneumonia-detection-challenge) as part of a [Kaggle](https://www.kaggle.com) [competition](https://www.kaggle.com/eduardomineo/u-net-lung-segmentation-montgomery-shenzhen). In the healthcare industry, medical image processing and analysis is particularly important when images are estimated to account for [at least 90%](https://www-03.ibm.com/press/us/en/pressrelease/51146.wss) of all medical data.\n", + "\n", + "You'll be working with radiology images from the [ChestX-ray8](https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community\n", + ") dataset provided by the [National Institutes of Health (NIH)](http://nih.gov). ChestX-ray8 contains over 100,000 de-identified X-ray images in the PNG format from more than 30,000 patients. You can find ChestX-ray8's files on NIH's public Box [repository](https://nihcc.app.box.com/v/ChestXray-NIHCC) in the `/images` folder. (For more details, refer to the research [paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Wang_ChestX-ray8_Hospital-Scale_Chest_CVPR_2017_paper.pdf) published at CVPR (a computer vision conference) in 2017.)\n", + "\n", + "For your convenience, a small number of PNG images have been saved to this tutorial's repository under `/tutorial-x-ray-image-processing`, since ChestX-ray8 contains gigabytes of data and you may find it challenging to download it in batches.\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Prerequisites" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The reader should have some knowledge of Python, NumPy arrays, and Matplotlib. To refresh the memory, you can take the [Python](https://docs.python.org/dev/tutorial/index.html) and Matplotlib [PyPlot](https://matplotlib.org/tutorials/introductory/pyplot.html) tutorials, and the NumPy [quickstart](https://numpy.org/devdocs/user/quickstart.html).\n", + "\n", + "The following packages are used in this tutorial:\n", + "\n", + "- [imageio](https://imageio.github.io) for reading and writing image data. The healthcare industry usually works with the [DICOM](https://en.wikipedia.org/wiki/DICOM) format for medical imaging and [imageio](https://imageio.readthedocs.io/en/stable/format_dicom.html) should be well-suited for reading that format. For simplicity, in this tutorial you'll be working with PNG files.\n", + "- [Matplotlib](https://matplotlib.org/) for data visualization.\n", + "- [SciPy](https://www.scipy.org) for multi-dimensional image processing via [`ndimage`](https://docs.scipy.org/doc/scipy/reference/ndimage.html).\n", + "\n", + "This tutorial can be run locally in an isolated environment, such as [Virtualenv](https://virtualenv.pypa.io/en/stable/) or [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). You can use [Jupyter Notebook or JupyterLab](https://jupyter.org/install) to run each notebook cell." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Table of contents" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Examine an X-ray with `imageio`\n", + "2. Combine images with `np.stack()` to demonstrate progression\n", + "3. Edge detection using the Laplacian-Gaussian, Gaussian gradient, Sobel, and Canny filters\n", + "4. Apply masks to X-rays with `np.where()`\n", + "5. Compare the results\n", + "\n", + "---" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Examine an X-ray with `imageio`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's begin with a simple example using just one X-ray image from the ChestX-ray8 dataset. \n", + "\n", + "The file — `00000011_001.png` — has been downloaded for you and saved in the `/tutorial-x-ray-image-processing` folder.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Load the image with `imageio`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import imageio\n", + "\n", + "DIR = 'tutorial-x-ray-image-processing/'\n", + "\n", + "xray_image = imageio.imread(os.path.join(DIR, '00000011_001.png'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Check that its shape is 1024x1024 pixels and that the array is made up of 8-bit integers:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(xray_image.shape)\n", + "print(xray_image.dtype)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "3. Import `matplotlib` and display the image in a grayscale colormap:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "plt.imshow(xray_image, cmap='gray')\n", + "plt.axis('off')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Combine images with `np.stack()` to demonstrate progression" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With NumPy's `np.stack()` you can combine multiple X-rays to make an n-dimensional array and then show the \"health progress\" in a sequential manner.\n", + "\n", + "In the next example, instead of 1 image you'll use 8 X-ray 1024x1024-pixel images from the ChestX-ray8 dataset that have been downloaded and extracted from one of the dataset files. They are numbered from `...000.png` to `...008.png` and let's assume they belong to the same patient." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Import NumPy, read in each of the X-rays, and stack them together with `np.stack()`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "file1 = imageio.imread(os.path.join(DIR + '00000011_000.png'))\n", + "file2 = imageio.imread(os.path.join(DIR + '00000011_001.png'))\n", + "file3 = imageio.imread(os.path.join(DIR + '00000011_003.png'))\n", + "file4 = imageio.imread(os.path.join(DIR + '00000011_004.png'))\n", + "file5 = imageio.imread(os.path.join(DIR + '00000011_005.png'))\n", + "file6 = imageio.imread(os.path.join(DIR + '00000011_006.png'))\n", + "file7 = imageio.imread(os.path.join(DIR + '00000011_007.png'))\n", + "file8 = imageio.imread(os.path.join(DIR + '00000011_008.png'))\n", + "\n", + "combined_xray_images_1 = np.stack([file1, file2, file3, file4, file5, file6, file7, file8])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Alternatively, you can `append` the image arrays as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "combined_xray_images_2 = []\n", + "\n", + "for i in range(8):\n", + " single_xray_image = imageio.imread(os.path.join(DIR + '00000011_00'+str(i)+'.png'))\n", + " combined_xray_images_2.append(single_xray_image)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "_Note on performance:_\n", + "\n", + "- `append`ing the images may no be faster. If you care about performance, you should probably use `np.stack()`, as evidenced when you try to time the code with Python's `timeit`:\n", + "\n", + " ```python\n", + " %timeit combined_xray_images_1 = np.stack([file1, file2, file3, file4, file5, file6, file7, file8])\n", + " ```\n", + "\n", + " Example output:\n", + "\n", + " ```\n", + " 1.52 ms ± 49.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n", + " ```\n", + "\n", + " ```python\n", + " %timeit C = [combined_xray_images_2.append(imageio.imread(os.path.join(DIR + '00000011_00'+str(i)+'.png'))) for i in range(8)]\n", + " ```\n", + "\n", + " Example output:\n", + "\n", + " ```\n", + " 159 ms ± 2.69 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n", + " ```\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Check the shape of the new X-ray image array containing 8 stacked images:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "combined_xray_images_1.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "3. You can now display the \"health progress\" by plotting each of frames next to each other using Matplotlib:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(nrows=1, ncols=8, figsize=(30, 30))\n", + "\n", + "for i in range(8):\n", + " x = combined_xray_images_1[i]\n", + " axes[i].imshow(x, cmap='gray')\n", + " axes[i].axis('off')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "4. In addition, it can be helpful to show the progress as an animation. Let's create a GIF file with `imageio.mimwrite()` and display the result in the notebook:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from IPython.display import Image\n", + "\n", + "GIF_PATH = os.path.join(DIR, 'xray_image.gif')\n", + "imageio.mimwrite(GIF_PATH, combined_xray_images_1, format= '.gif', fps = 1) \n", + "\n", + "Image(filename=GIF_PATH, width=400, height=400)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Edge detection using the Laplacian-Gaussian, Gaussian gradient, Sobel, and Canny filters" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When processing biomedical data, it can be useful to emphasize the 2D [\"edges\"](https://en.wikipedia.org/wiki/Edge_detection) to focus on particular features in an image. To do that, using [image gradients](https://en.wikipedia.org/wiki/Image_gradient) can be particularly helpful when detecting the change of color pixel intensity." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### The Laplace filter with Gaussian second derivatives\n", + "\n", + "Let's start with an n-dimensional [Laplace](https://en.wikipedia.org/wiki/Laplace_distribution) filter (\"Laplacian-Gaussian\") that uses [Gaussian](https://en.wikipedia.org/wiki/Normal_distribution) second derivatives. This Laplacian method focuses on pixels with rapid intensity change in values and is combined with Gaussian smoothing to [remove noise](https://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm). Let's examine how it can be useful in analyzing 2D X-ray images.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- The implementation of the Laplacian-Gaussian filter is relatively straightforward: 1) import the `ndimage` module from SciPy; and 2) call [`scipy.ndimage.gaussian_laplace()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.gaussian_laplace.html) with a sigma (scalar) parameter, which affects the standard deviations of the Gaussian filter (you'll use `1` in the example below):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from scipy import ndimage\n", + "\n", + "xray_image_laplace_gaussian = ndimage.gaussian_laplace(xray_image, sigma=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Display the original X-ray and the one with the Laplacian-Gaussian filter:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 10))\n", + "\n", + "axes[0].set_title('Original')\n", + "axes[0].imshow(xray_image, cmap='gray')\n", + "axes[1].set_title('Laplacian-Gaussian (edges)')\n", + "axes[1].imshow(xray_image_laplace_gaussian, cmap='gray')\n", + "for i in axes:\n", + " i.axis('off')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### The Gaussian gradient magnitude method\n", + "\n", + "Another method for edge detection that can be useful is the [Gaussian](https://en.wikipedia.org/wiki/Normal_distribution) (gradient) filter. It computes the multidimensional gradient magnitude with Gaussian derivatives and helps by remove [high-frequency](https://www.cs.cornell.edu/courses/cs6670/2011sp/lectures/lec02_filter.pdf) image components.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Call [`scipy.ndimage.gaussian_gradient_magnitude()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.gaussian_gradient_magnitude.html) with a sigma (scalar) parameter (for standard deviations; you'll use `2` in the example below):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "x_ray_image_gaussian_gradient = ndimage.gaussian_gradient_magnitude(xray_image, sigma=2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Display the original X-ray and the one with the Gaussian gradient filter:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 10))\n", + "\n", + "axes[0].set_title('Original')\n", + "axes[0].imshow(xray_image, cmap='gray')\n", + "axes[1].set_title('Gaussian gradient (edges)')\n", + "axes[1].imshow(x_ray_image_gaussian_gradient, cmap='gray')\n", + "for i in axes:\n", + " i.axis('off')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### The Sobel-Feldman operator (the Sobel filter)\n", + "\n", + "To find regions of high spatial frequency (the edges or the edge maps) along the horizontal and vertical axes of a 2D X-ray image, you can use the [Sobel-Feldman operator (Sobel filter)](https://en.wikipedia.org/wiki/Sobel_operator) technique. The Sobel filter applies two 3x3 kernel matrices — one for each axis — onto the X-ray through a [convolution](https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution). Then, these two points (gradients) are combined using the [Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) to produce a gradient magnitude." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Use the Sobel filters — ([`scipy.ndimage.sobel()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.sobel.html)) — on x- and y-axes of the X-ray. Then, calculate the distance between `x` and `y` (with the Sobel filters applied to them) using the [Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) and NumPy's [`np.hypot()`](https://numpy.org/doc/stable/reference/generated/numpy.hypot.html) to obtain the magnitude. Finally, normalize the rescaled image for the pixel values to be between 0 and 255. \n", + "\n", + " [Image normalization](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) follows the `output_channel = 255.0 * (input_channel - min_value) / (max_value - min_value)` [formula](http://dev.ipol.im/~nmonzon/Normalization.pdf). Because you're using a grayscale image, you need to normalize just one channel." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "x_sobel = ndimage.sobel(xray_image, axis=0)\n", + "y_sobel = ndimage.sobel(xray_image, axis=1)\n", + "\n", + "xray_image_sobel = np.hypot(x_sobel, y_sobel)\n", + "\n", + "xray_image_sobel *= 255.0 / np.max(xray_image_sobel)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Change the new image array data type to the 32-bit floating-point format from `float16` to [make it compatible](https://github.com/matplotlib/matplotlib/issues/15432) with Matplotlib:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print('The data type - before: ', xray_image_sobel.dtype)\n", + "\n", + "xray_image_sobel = xray_image_sobel.astype('float32')\n", + "\n", + "print('The data type - after: ', xray_image_sobel.dtype)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "3. Display the original X-ray and the one with the Sobel \"edge\" filter applied. Note that both the grayscale and `CMRmap` colormaps are used to help emphasize the edges:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(15, 15))\n", + "\n", + "axes[0].set_title('Original')\n", + "axes[0].imshow(xray_image, cmap='gray')\n", + "axes[1].set_title('Sobel (edges) - grayscale')\n", + "axes[1].imshow(xray_image_sobel, cmap='gray')\n", + "axes[2].set_title('Sobel (edges) - CMRmap')\n", + "axes[2].imshow(xray_image_sobel, cmap='CMRmap')\n", + "for i in axes:\n", + " i.axis('off')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### The Canny filter\n", + "\n", + "You can also consider using another well-known filter for edge detection called the [Canny filter](https://en.wikipedia.org/wiki/Canny_edge_detector).\n", + "\n", + "First, you apply a [Gaussian](https://en.wikipedia.org/wiki/Gaussian_filter) filter to remove the noise in an image. In this example, you're using using the [Fourier](https://en.wikipedia.org/wiki/Fourier_transform) filter which smoothens the X-ray through a [convolution](https://en.wikipedia.org/wiki/Convolution) process. Next, you apply the [Prewitt filter](https://en.wikipedia.org/wiki/Prewitt_operator) on each of the 2 axes of the image to help detect some of the edges — this will result in 2 gradient values. Similar to the Sobel filter, the Prewitt operator also applies two 3x3 kernel matrices — one for each axis — onto the X-ray through a [convolution](https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution). In the end, you compute the magnitude between the two gradients using the [Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) and [normalize](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) the images, as before." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Use SciPy's Fourier filters — [`scipy.ndimage.fourier_gaussian()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.fourier_gaussian.html) — with a small `sigma` value to remove some of the noise from the X-ray. Then, calculate two gradients using [`scipy.ndimage.prewitt()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.prewitt.html). Next, measure the distance between the gradients using NumPy's `np.hypot()`. Finally, [normalize](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) the rescaled image, as before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fourier_gaussian = ndimage.fourier_gaussian(xray_image, sigma=0.05)\n", + "\n", + "x_prewitt = ndimage.prewitt(fourier_gaussian, axis=0)\n", + "y_prewitt = ndimage.prewitt(fourier_gaussian, axis=1)\n", + "\n", + "xray_image_canny = np.hypot(x_prewitt, y_prewitt)\n", + "\n", + "xray_image_canny *= 255.0 / np.max(xray_image_canny)\n", + "\n", + "print('The data type - ', xray_image_canny.dtype)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Plot the original X-ray image and the ones with the edges detected with the help of the Canny filter technique. The edges can be emphasized using the `prism`, `nipy_spectral`, and `terrain` Matplotlib colormaps." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(20, 15))\n", + "\n", + "axes[0].set_title('Original')\n", + "axes[0].imshow(xray_image, cmap='gray')\n", + "axes[1].set_title('Canny (edges) - prism')\n", + "axes[1].imshow(xray_image_canny, cmap='prism')\n", + "axes[2].set_title('Canny (edges) - nipy_spectral')\n", + "axes[2].imshow(xray_image_canny, cmap='nipy_spectral')\n", + "axes[3].set_title('Canny (edges) - terrain')\n", + "axes[3].imshow(xray_image_canny, cmap='terrain')\n", + "for i in axes:\n", + " i.axis('off')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Apply masks to X-rays with `np.where()`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To screen out only certain pixels in X-ray images to help detect particular features, you can apply masks with NumPy's [`np.where(condition: array_like (bool), x: array_like, y: ndarray)`](https://numpy.org/doc/stable/reference/generated/numpy.where.html) that returns `x` when `True` and `y` when `False`.\n", + "\n", + "Identifying regions of interest — certain sets of pixels in an image — can be useful and masks serve as boolean arrays of the same shape as the original image." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Retrieve some basics statistics about the pixel values in the original X-ray image you've been working with:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print('The data type of the X-ray image is: ', xray_image.dtype)\n", + "print('The minimum pixel value is: ', np.min(xray_image))\n", + "print('The maximum pixel value is: ', np.max(xray_image))\n", + "print('The average pixel value is: ', np.mean(xray_image))\n", + "print('The median pixel value is: ', np.median(xray_image))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. The array data type is `uint8` and the minimum/maximum value results suggest that all 256 colors (from `0` to `255`) are used in the X-ray. Let's visualize the _pixel intensity distribution_ of the original raw X-ray image with `ndimage.histogram()` and Matplotlib:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "pixel_intensity_distribution = ndimage.histogram(xray_image, min=np.min(xray_image), max=np.max(xray_image), bins=256)\n", + "\n", + "plt.plot(pixel_intensity_distribution)\n", + "plt.title('Pixel intensity distribution')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As the pixel intensity distribution suggests, there are many low (between around 0 and 20) and very high (between around 200 and 240) pixel values. \n", + "\n", + "3. You can create different conditional masks with NumPy's `np.where()` — for example, let's have only those values of the image with the pixels exceeding a certain threshold:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# The threshold is \"greater than 150\"\n", + "# Return the original image if true, `0` otherwise\n", + "xray_image_mask_noisy = np.where(xray_image > 150, xray_image, 0)\n", + "\n", + "plt.imshow(xray_image_mask_noisy, cmap='gray')\n", + "plt.axis('off')\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# The threshold is \"greater than 150\"\n", + "# Return `1` if true, `0` otherwise\n", + "xray_image_mask_less_noisy = np.where(xray_image > 150, 1, 0)\n", + "\n", + "plt.imshow(xray_image_mask_less_noisy, cmap='gray')\n", + "plt.axis('off')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Compare the results" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's display some of the results of processed X-ray images you've worked with so far:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(nrows=1, ncols=9, figsize=(30, 30))\n", + "\n", + "axes[0].set_title('Original')\n", + "axes[0].imshow(xray_image, cmap='gray')\n", + "axes[1].set_title('Laplace-Gaussian (edges)')\n", + "axes[1].imshow(xray_image_laplace_gaussian, cmap='gray')\n", + "axes[2].set_title('Gaussian gradient (edges)')\n", + "axes[2].imshow(x_ray_image_gaussian_gradient, cmap='gray')\n", + "axes[3].set_title('Sobel (edges) - grayscale')\n", + "axes[3].imshow(xray_image_sobel, cmap='gray')\n", + "axes[4].set_title('Sobel (edges) - hot')\n", + "axes[4].imshow(xray_image_sobel, cmap='hot')\n", + "axes[5].set_title('Canny (edges) - prism)')\n", + "axes[5].imshow(xray_image_canny, cmap='prism')\n", + "axes[6].set_title('Canny (edges) - nipy_spectral)')\n", + "axes[6].imshow(xray_image_canny, cmap='nipy_spectral')\n", + "axes[7].set_title('Mask (> 150, noisy)')\n", + "axes[7].imshow(xray_image_mask_noisy, cmap='gray')\n", + "axes[8].set_title('Mask (> 150, less noisy)')\n", + "axes[8].imshow(xray_image_mask_less_noisy, cmap='gray')\n", + "for i in axes:\n", + " i.axis('off')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Next steps" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you want to use your own samples, you can use [this image](https://openi.nlm.nih.gov/detailedresult?img=CXR3666_IM-1824-1001&query=chest%20infection&it=xg&req=4&npos=32) or search for various other ones on the [_Openi_](https://openi.nlm.nih.gov) database. Openi contains many biomedical images and it can be especially helpful if you have low bandwidth and/or are restricted by the amount of data you can download.\n", + "\n", + "To learn more about image processing in the context of biomedical image data or simply edge detection, you may find the following material useful:\n", + "\n", + "- [DICOM processing and segmentation in Python](https://www.raddq.com/dicom-processing-segmentation-visualization-in-python/) with Scikit-Image and pydicom (Radiology Data Quest)\n", + "- [Image manipulation and processing using Numpy and Scipy](https://scipy-lectures.org/advanced/image_processing/index.html) (Scipy Lecture Notes)\n", + "- [Intensity values](https://s3.amazonaws.com/assets.datacamp.com/production/course_7032/slides/chapter2.pdf) (presentation, DataCamp)\n", + "- [Object detection with Raspberry Pi and Python](https://makersportal.com/blog/2019/4/23/image-processing-with-raspberry-pi-and-python-part-ii-spatial-statistics-and-correlations) (Maker Portal)\n", + "- [X-ray data preparation and segmentation](https://www.kaggle.com/eduardomineo/u-net-lung-segmentation-montgomery-shenzhen) with deep learning (a Kaggle-hosted Jupyter notebook)\n", + "- [Image filtering](https://www.cs.cornell.edu/courses/cs6670/2011sp/lectures/lec02_filter.pdf) (lecture slides, CS6670: Computer Vision, Cornell University)\n", + "- [Edge detection in Python](https://towardsdatascience.com/edge-detection-in-python-a3c263a13e03) and NumPy (Towards Data Science)\n", + "- [Edge detection](https://datacarpentry.org/image-processing/08-edge-detection/) with Scikit-Image (Data Carpentry)\n", + "- [Image gradients and gradient filtering](www.cs.cmu.edu/~16385/s17/Slides/4.0_Image_Gradients_and_Gradient_Filtering.pdf) (lecture slides, 16-385 Computer Vision, Carnegie Mellon University)\n" + ] + } + ], + "metadata": { + "colab": { + "name": "tutorial-x-ray-image-processing.ipynb", + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/content/tutorial-x-ray-image-processing.md b/content/tutorial-x-ray-image-processing.md deleted file mode 100644 index 71a4669d..00000000 --- a/content/tutorial-x-ray-image-processing.md +++ /dev/null @@ -1,572 +0,0 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst - format_version: 0.12 - jupytext_version: 1.7.1 -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Tutorial: X-ray image processing - -+++ - -This tutorial demonstrates how to read and process X-ray images with NumPy, -imageio, Matplotlib and SciPy. You will learn how to load medical images, focus -on certain parts, and visually compare them using the -[Gaussian](https://en.wikipedia.org/wiki/Gaussian_filter), -[Laplacian-Gaussian](https://en.wikipedia.org/wiki/Laplace_distribution), -[Sobel](https://en.wikipedia.org/wiki/Sobel_operator), and -[Canny](https://en.wikipedia.org/wiki/Canny_edge_detector) filters for edge -detection. - -X-ray image analysis can be part of your data analysis and -[machine learning workflow](https://www.sciencedirect.com/science/article/pii/S235291481930214X) -when, for example, you're building an algorithm that helps -[detect pneumonia](https://www.kaggle.com/c/rsna-pneumonia-detection-challenge) -as part of a [Kaggle](https://www.kaggle.com) -[competition](https://www.kaggle.com/eduardomineo/u-net-lung-segmentation-montgomery-shenzhen). -In the healthcare industry, medical image processing and analysis is -particularly important when images are estimated to account for -[at least 90%](https://www-03.ibm.com/press/us/en/pressrelease/51146.wss) of all -medical data. - -You'll be working with radiology images from the -[ChestX-ray8](https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community) -dataset provided by the [National Institutes of Health (NIH)](http://nih.gov). -ChestX-ray8 contains over 100,000 de-identified X-ray images in the PNG format -from more than 30,000 patients. You can find ChestX-ray8's files on NIH's public -Box [repository](https://nihcc.app.box.com/v/ChestXray-NIHCC) in the `/images` -folder. (For more details, refer to the research -[paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Wang_ChestX-ray8_Hospital-Scale_Chest_CVPR_2017_paper.pdf) -published at CVPR (a computer vision conference) in 2017.) - -For your convenience, a small number of PNG images have been saved to this -tutorial's repository under `tutorial-x-ray-image-processing/`, since -ChestX-ray8 contains gigabytes of data and you may find it challenging to -download it in batches. - -![image.png](tutorial-x-ray-image-processing.png) - -+++ - -## Prerequisites - -+++ - -The reader should have some knowledge of Python, NumPy arrays, and Matplotlib. -To refresh the memory, you can take the -[Python](https://docs.python.org/dev/tutorial/index.html) and Matplotlib -[PyPlot](https://matplotlib.org/tutorials/introductory/pyplot.html) tutorials, -and the NumPy [quickstart](https://numpy.org/devdocs/user/quickstart.html). - -The following packages are used in this tutorial: - -- [imageio](https://imageio.github.io) for reading and writing image data. The -healthcare industry usually works with the -[DICOM](https://en.wikipedia.org/wiki/DICOM) format for medical imaging and -[imageio](https://imageio.readthedocs.io/en/stable/format_dicom.html) should be -well-suited for reading that format. For simplicity, in this tutorial you'll be -working with PNG files. -- [Matplotlib](https://matplotlib.org/) for data visualization. -- [SciPy](https://www.scipy.org) for multi-dimensional image processing via -[`ndimage`](https://docs.scipy.org/doc/scipy/reference/ndimage.html). - -This tutorial can be run locally in an isolated environment, such as -[Virtualenv](https://virtualenv.pypa.io/en/stable/) or -[conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). -You can use [Jupyter Notebook or JupyterLab](https://jupyter.org/install) to run -each notebook cell. - -+++ - -## Table of contents - -+++ - -1. Examine an X-ray with `imageio` -2. Combine images with `np.stack()` to demonstrate progression -3. Edge detection using the Laplacian-Gaussian, Gaussian gradient, Sobel, and - Canny filters -4. Apply masks to X-rays with `np.where()` -5. Compare the results - ---- - -+++ - -## Examine an X-ray with `imageio` - -+++ - -Let's begin with a simple example using just one X-ray image from the -ChestX-ray8 dataset. - -The file — `00000011_001.png` — has been downloaded for you and saved in the -`/tutorial-x-ray-image-processing` folder. - -+++ - -**1.** Load the image with `imageio`: - -```{code-cell} ipython3 -import os -import imageio - -DIR = 'tutorial-x-ray-image-processing' - -xray_image = imageio.imread(os.path.join(DIR, '00000011_001.png')) -``` - -**2.** Check that its shape is 1024x1024 pixels and that the array is made up of -8-bit integers: - -```{code-cell} ipython3 -print(xray_image.shape) -print(xray_image.dtype) -``` - -**3.** Import `matplotlib` and display the image in a grayscale colormap: - -```{code-cell} ipython3 -import matplotlib.pyplot as plt - -plt.imshow(xray_image, cmap='gray') -plt.axis('off') -plt.show() -``` - -## Combine images with `np.stack()` to demonstrate progression - -+++ - -With NumPy's `np.stack()` you can combine multiple X-rays to make an -n-dimensional array and then show the "health progress" in a sequential manner. - -In the next example, instead of 1 image you'll use 8 X-ray 1024x1024-pixel -images from the ChestX-ray8 dataset that have been downloaded and extracted -from one of the dataset files. They are numbered from `...000.png` to -`...008.png` and let's assume they belong to the same patient. - -+++ - -**1.** Import NumPy, read in each of the X-rays, and stack them together with -`np.stack()`: - -```{code-cell} ipython3 -import numpy as np - -file1 = imageio.imread(os.path.join(DIR, '00000011_000.png')) -file2 = imageio.imread(os.path.join(DIR, '00000011_001.png')) -file3 = imageio.imread(os.path.join(DIR, '00000011_003.png')) -file4 = imageio.imread(os.path.join(DIR, '00000011_004.png')) -file5 = imageio.imread(os.path.join(DIR, '00000011_005.png')) -file6 = imageio.imread(os.path.join(DIR, '00000011_006.png')) -file7 = imageio.imread(os.path.join(DIR, '00000011_007.png')) -file8 = imageio.imread(os.path.join(DIR, '00000011_008.png')) - -combined_xray_images_1 = np.stack([file1, file2, file3, file4, file5, file6, file7, file8]) -``` - -Alternatively, you can `append` the image arrays as follows: - -```{code-cell} ipython3 -combined_xray_images_2 = [] - -for i in range(8): - single_xray_image = imageio.imread(os.path.join(DIR, '00000011_00'+str(i)+'.png')) - combined_xray_images_2.append(single_xray_image) -``` - -_Note on performance:_ - -- `append`ing the images may no be faster. If you care about performance, you - should probably use `np.stack()`, as evidenced when you try to time the code - with Python's `timeit`: - - ```python - %timeit combined_xray_images_1 = np.stack([file1, file2, file3, file4, file5, file6, file7, file8]) - ``` - - Example output: - - ``` - 1.52 ms ± 49.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) - ``` - - ```python - %timeit C = [combined_xray_images_2.append(imageio.imread(os.path.join(DIR, '00000011_00'+str(i)+'.png'))) for i in range(8)] - ``` - - Example output: - - ``` - 159 ms ± 2.69 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) - ``` - -+++ - -**2.** Check the shape of the new X-ray image array containing 8 stacked images: - -```{code-cell} ipython3 -combined_xray_images_1.shape -``` - -**3.** You can now display the "health progress" by plotting each of frames next -to each other using Matplotlib: - -```{code-cell} ipython3 -fig, axes = plt.subplots(nrows=1, ncols=8, figsize=(30, 30)) - -for i in range(8): - x = combined_xray_images_1[i] - axes[i].imshow(x, cmap='gray') - axes[i].axis('off') -``` - -**4.** In addition, it can be helpful to show the progress as an animation. -Let's create a GIF file with `imageio.mimwrite()` and display the result in the -notebook: - -```{code-cell} ipython3 -GIF_PATH = os.path.join(DIR, 'xray_image.gif') -imageio.mimwrite(GIF_PATH, combined_xray_images_1, format= '.gif', fps = 1) -``` -Which gives us: -![health_progress](tutorial-x-ray-image-processing/xray_image.gif) -## Edge detection using the Laplacian-Gaussian, Gaussian gradient, Sobel, and Canny filters - -+++ - -When processing biomedical data, it can be useful to emphasize the 2D -["edges"](https://en.wikipedia.org/wiki/Edge_detection) to focus on particular -features in an image. To do that, using -[image gradients](https://en.wikipedia.org/wiki/Image_gradient) can be -particularly helpful when detecting the change of color pixel intensity. - -+++ - -### The Laplace filter with Gaussian second derivatives - -Let's start with an n-dimensional -[Laplace](https://en.wikipedia.org/wiki/Laplace_distribution) filter -("Laplacian-Gaussian") that uses -[Gaussian](https://en.wikipedia.org/wiki/Normal_distribution) second -derivatives. This Laplacian method focuses on pixels with rapid intensity change -in values and is combined with Gaussian smoothing to -[remove noise](https://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm). Let's examine -how it can be useful in analyzing 2D X-ray images. - -+++ - -- The implementation of the Laplacian-Gaussian filter is relatively -straightforward: 1) import the `ndimage` module from SciPy; and 2) call -[`scipy.ndimage.gaussian_laplace()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.gaussian_laplace.html) -with a sigma (scalar) parameter, which affects the standard deviations of the -Gaussian filter (you'll use `1` in the example below): - -```{code-cell} ipython3 -from scipy import ndimage - -xray_image_laplace_gaussian = ndimage.gaussian_laplace(xray_image, sigma=1) -``` - -Display the original X-ray and the one with the Laplacian-Gaussian filter: - -```{code-cell} ipython3 -fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 10)) - -axes[0].set_title('Original') -axes[0].imshow(xray_image, cmap='gray') -axes[1].set_title('Laplacian-Gaussian (edges)') -axes[1].imshow(xray_image_laplace_gaussian, cmap='gray') -for i in axes: - i.axis('off') -plt.show() -``` - -### The Gaussian gradient magnitude method - -Another method for edge detection that can be useful is the -[Gaussian](https://en.wikipedia.org/wiki/Normal_distribution) (gradient) filter. -It computes the multidimensional gradient magnitude with Gaussian derivatives -and helps by remove -[high-frequency](https://www.cs.cornell.edu/courses/cs6670/2011sp/lectures/lec02_filter.pdf) -image components. - -+++ - -**1.** Call [`scipy.ndimage.gaussian_gradient_magnitude()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.gaussian_gradient_magnitude.html) -with a sigma (scalar) parameter (for standard deviations; you'll use `2` in the -example below): - -```{code-cell} ipython3 -x_ray_image_gaussian_gradient = ndimage.gaussian_gradient_magnitude(xray_image, sigma=2) -``` - -**2.** Display the original X-ray and the one with the Gaussian gradient filter: - -```{code-cell} ipython3 -fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 10)) - -axes[0].set_title('Original') -axes[0].imshow(xray_image, cmap='gray') -axes[1].set_title('Gaussian gradient (edges)') -axes[1].imshow(x_ray_image_gaussian_gradient, cmap='gray') -for i in axes: - i.axis('off') -plt.show() -``` - -### The Sobel-Feldman operator (the Sobel filter) - -To find regions of high spatial frequency (the edges or the edge maps) along the -horizontal and vertical axes of a 2D X-ray image, you can use the -[Sobel-Feldman operator (Sobel filter)](https://en.wikipedia.org/wiki/Sobel_operator) -technique. The Sobel filter applies two 3x3 kernel matrices — one for each axis -— onto the X-ray through a [convolution](https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution). -Then, these two points (gradients) are combined using the -[Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) to -produce a gradient magnitude. - -+++ - -**1.** Use the Sobel filters — ([`scipy.ndimage.sobel()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.sobel.html)) -— on x- and y-axes of the X-ray. Then, calculate the distance between `x` and -`y` (with the Sobel filters applied to them) using the -[Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) and -NumPy's [`np.hypot()`](https://numpy.org/doc/stable/reference/generated/numpy.hypot.html) -to obtain the magnitude. Finally, normalize the rescaled image for the pixel -values to be between 0 and 255. - - [Image normalization](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) - follows the `output_channel = 255.0 * (input_channel - min_value) / (max_value - min_value)` - [formula](http://dev.ipol.im/~nmonzon/Normalization.pdf). Because you're - using a grayscale image, you need to normalize just one channel. - -```{code-cell} ipython3 -x_sobel = ndimage.sobel(xray_image, axis=0) -y_sobel = ndimage.sobel(xray_image, axis=1) - -xray_image_sobel = np.hypot(x_sobel, y_sobel) - -xray_image_sobel *= 255.0 / np.max(xray_image_sobel) -``` - -**2.** Change the new image array data type to the 32-bit floating-point format -from `float16` to [make it compatible](https://github.com/matplotlib/matplotlib/issues/15432) -with Matplotlib: - -```{code-cell} ipython3 -print('The data type - before: ', xray_image_sobel.dtype) - -xray_image_sobel = xray_image_sobel.astype('float32') - -print('The data type - after: ', xray_image_sobel.dtype) -``` - -**3.** Display the original X-ray and the one with the Sobel "edge" filter -applied. Note that both the grayscale and `CMRmap` colormaps are used to help -emphasize the edges: - -```{code-cell} ipython3 -fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(15, 15)) - -axes[0].set_title('Original') -axes[0].imshow(xray_image, cmap='gray') -axes[1].set_title('Sobel (edges) - grayscale') -axes[1].imshow(xray_image_sobel, cmap='gray') -axes[2].set_title('Sobel (edges) - CMRmap') -axes[2].imshow(xray_image_sobel, cmap='CMRmap') -for i in axes: - i.axis('off') -plt.show() -``` - -### The Canny filter - -You can also consider using another well-known filter for edge detection called -the [Canny filter](https://en.wikipedia.org/wiki/Canny_edge_detector). - -First, you apply a [Gaussian](https://en.wikipedia.org/wiki/Gaussian_filter) -filter to remove the noise in an image. In this example, you're using using the -[Fourier](https://en.wikipedia.org/wiki/Fourier_transform) filter which -smoothens the X-ray through a [convolution](https://en.wikipedia.org/wiki/Convolution) -process. Next, you apply the [Prewitt filter](https://en.wikipedia.org/wiki/Prewitt_operator) -on each of the 2 axes of the image to help detect some of the edges — this will -result in 2 gradient values. Similar to the Sobel filter, the Prewitt operator -also applies two 3x3 kernel matrices — one for each axis — onto the X-ray -through a [convolution](https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution). -In the end, you compute the magnitude between the two gradients using the -[Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) and -[normalize](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) -the images, as before. - -+++ - -**1.** Use SciPy's Fourier filters — [`scipy.ndimage.fourier_gaussian()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.fourier_gaussian.html) -— with a small `sigma` value to remove some of the noise from the X-ray. Then, -calculate two gradients using [`scipy.ndimage.prewitt()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.prewitt.html). -Next, measure the distance between the gradients using NumPy's `np.hypot()`. -Finally, [normalize](https://en.wikipedia.org/wiki/Normalization_%28image_processing%29) -the rescaled image, as before. - -```{code-cell} ipython3 -fourier_gaussian = ndimage.fourier_gaussian(xray_image, sigma=0.05) - -x_prewitt = ndimage.prewitt(fourier_gaussian, axis=0) -y_prewitt = ndimage.prewitt(fourier_gaussian, axis=1) - -xray_image_canny = np.hypot(x_prewitt, y_prewitt) - -xray_image_canny *= 255.0 / np.max(xray_image_canny) - -print('The data type - ', xray_image_canny.dtype) -``` - -**2.** Plot the original X-ray image and the ones with the edges detected with -the help of the Canny filter technique. The edges can be emphasized using the -`prism`, `nipy_spectral`, and `terrain` Matplotlib colormaps. - -```{code-cell} ipython3 -fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(20, 15)) - -axes[0].set_title('Original') -axes[0].imshow(xray_image, cmap='gray') -axes[1].set_title('Canny (edges) - prism') -axes[1].imshow(xray_image_canny, cmap='prism') -axes[2].set_title('Canny (edges) - nipy_spectral') -axes[2].imshow(xray_image_canny, cmap='nipy_spectral') -axes[3].set_title('Canny (edges) - terrain') -axes[3].imshow(xray_image_canny, cmap='terrain') -for i in axes: - i.axis('off') -plt.show() -``` - -## Apply masks to X-rays with `np.where()` - -+++ - -To screen out only certain pixels in X-ray images to help detect particular -features, you can apply masks with NumPy's -[`np.where(condition: array_like (bool), x: array_like, y: ndarray)`](https://numpy.org/doc/stable/reference/generated/numpy.where.html) -that returns `x` when `True` and `y` when `False`. - -Identifying regions of interest — certain sets of pixels in an image — can be -useful and masks serve as boolean arrays of the same shape as the original -image. - -+++ - -**1.** Retrieve some basics statistics about the pixel values in the original -X-ray image you've been working with: - -```{code-cell} ipython3 -print('The data type of the X-ray image is: ', xray_image.dtype) -print('The minimum pixel value is: ', np.min(xray_image)) -print('The maximum pixel value is: ', np.max(xray_image)) -print('The average pixel value is: ', np.mean(xray_image)) -print('The median pixel value is: ', np.median(xray_image)) -``` - -**2.** The array data type is `uint8` and the minimum/maximum value results -suggest that all 256 colors (from `0` to `255`) are used in the X-ray. Let's -visualize the _pixel intensity distribution_ of the original raw X-ray image -with `ndimage.histogram()` and Matplotlib: - -```{code-cell} ipython3 -pixel_intensity_distribution = ndimage.histogram(xray_image, min=np.min(xray_image), max=np.max(xray_image), bins=256) - -plt.plot(pixel_intensity_distribution) -plt.title('Pixel intensity distribution') -plt.show() -``` - -As the pixel intensity distribution suggests, there are many low (between around -0 and 20) and very high (between around 200 and 240) pixel values. - -**3.** You can create different conditional masks with NumPy's `np.where()` — -for example, let's have only those values of the image with the pixels exceeding -a certain threshold: - -```{code-cell} ipython3 -# The threshold is "greater than 150" -# Return the original image if true, `0` otherwise -xray_image_mask_noisy = np.where(xray_image > 150, xray_image, 0) - -plt.imshow(xray_image_mask_noisy, cmap='gray') -plt.axis('off') -plt.show() -``` - -```{code-cell} ipython3 -# The threshold is "greater than 150" -# Return `1` if true, `0` otherwise -xray_image_mask_less_noisy = np.where(xray_image > 150, 1, 0) - -plt.imshow(xray_image_mask_less_noisy, cmap='gray') -plt.axis('off') -plt.show() -``` - -## Compare the results - -+++ - -Let's display some of the results of processed X-ray images you've worked with -so far: - -```{code-cell} ipython3 -fig, axes = plt.subplots(nrows=1, ncols=9, figsize=(30, 30)) - -axes[0].set_title('Original') -axes[0].imshow(xray_image, cmap='gray') -axes[1].set_title('Laplace-Gaussian (edges)') -axes[1].imshow(xray_image_laplace_gaussian, cmap='gray') -axes[2].set_title('Gaussian gradient (edges)') -axes[2].imshow(x_ray_image_gaussian_gradient, cmap='gray') -axes[3].set_title('Sobel (edges) - grayscale') -axes[3].imshow(xray_image_sobel, cmap='gray') -axes[4].set_title('Sobel (edges) - hot') -axes[4].imshow(xray_image_sobel, cmap='hot') -axes[5].set_title('Canny (edges) - prism)') -axes[5].imshow(xray_image_canny, cmap='prism') -axes[6].set_title('Canny (edges) - nipy_spectral)') -axes[6].imshow(xray_image_canny, cmap='nipy_spectral') -axes[7].set_title('Mask (> 150, noisy)') -axes[7].imshow(xray_image_mask_noisy, cmap='gray') -axes[8].set_title('Mask (> 150, less noisy)') -axes[8].imshow(xray_image_mask_less_noisy, cmap='gray') -for i in axes: - i.axis('off') -plt.show() -``` - -## Next steps - -+++ - -If you want to use your own samples, you can use -[this image](https://openi.nlm.nih.gov/detailedresult?img=CXR3666_IM-1824-1001&query=chest%20infection&it=xg&req=4&npos=32) -or search for various other ones on the [_Openi_](https://openi.nlm.nih.gov) -database. Openi contains many biomedical images and it can be especially helpful -if you have low bandwidth and/or are restricted by the amount of data you can -download. - -To learn more about image processing in the context of biomedical image data or -simply edge detection, you may find the following material useful: - -- [DICOM processing and segmentation in Python](https://www.raddq.com/dicom-processing-segmentation-visualization-in-python/) with Scikit-Image and pydicom (Radiology Data Quest) -- [Image manipulation and processing using Numpy and Scipy](https://scipy-lectures.org/advanced/image_processing/index.html) (Scipy Lecture Notes) -- [Intensity values](https://s3.amazonaws.com/assets.datacamp.com/production/course_7032/slides/chapter2.pdf) (presentation, DataCamp) -- [Object detection with Raspberry Pi and Python](https://makersportal.com/blog/2019/4/23/image-processing-with-raspberry-pi-and-python-part-ii-spatial-statistics-and-correlations) (Maker Portal) -- [X-ray data preparation and segmentation](https://www.kaggle.com/eduardomineo/u-net-lung-segmentation-montgomery-shenzhen) with deep learning (a Kaggle-hosted Jupyter notebook) -- [Image filtering](https://www.cs.cornell.edu/courses/cs6670/2011sp/lectures/lec02_filter.pdf) (lecture slides, CS6670: Computer Vision, Cornell University) -- [Edge detection in Python](https://towardsdatascience.com/edge-detection-in-python-a3c263a13e03) and NumPy (Towards Data Science) -- [Edge detection](https://datacarpentry.org/image-processing/08-edge-detection/) with Scikit-Image (Data Carpentry) -- [Image gradients and gradient filtering](https://www.cs.cmu.edu/~16385/s17/Slides/4.0_Image_Gradients_and_Gradient_Filtering.pdf) (lecture slides, 16-385 Computer Vision, Carnegie Mellon University) diff --git a/environment.yml b/environment.yml index 75479662..a10f3e83 100644 --- a/environment.yml +++ b/environment.yml @@ -9,6 +9,5 @@ dependencies: - nbval - statsmodels - pip - - imageio - pip: - jupyter-book diff --git a/requirements.txt b/requirements.txt index 8b9bde6a..744d4e70 100644 --- a/requirements.txt +++ b/requirements.txt @@ -6,4 +6,3 @@ matplotlib pytest nbval statsmodels -imageio diff --git a/site/index.md b/site/index.md index edd35fa8..14e940b0 100644 --- a/site/index.md +++ b/site/index.md @@ -31,7 +31,6 @@ content/tutorial-svd content/mooreslaw-tutorial content/save-load-arrays content/tutorial-deep-learning-on-mnist -content/tutorial-x-ray-image-processing ``` ### Attribution