Part5 - Image Processing
## load images
img = ski.io.imread(img_name)   # assumes: import skimage as ski
## display image
plt.imshow(img,cmap='gray')
sns.heatmap(kernel, annot=True, cmap='gray')   # visualize a kernel as an annotated grid
## define ROI
roi_start = (500, 100) # (row, col) of the ROI's top-left corner
roi_height, roi_width = 300, 250
roi = retina[roi_start[0]:roi_start[0]+roi_height, roi_start[1]:roi_start[1]+roi_width]
plt.subplot(1,2,2)
plt.imshow(roi)
plt.title('Zoomed in ROI')
plt.show()
Quantization
🏷️ Why are bits interesting?
the human visual system cannot detect more than 256 different gray levels in an image
often this quantization results in a representation of 1 byte (8 bits), since 1 byte
corresponds to the way memory is organized inside a computer (no charge is
quantized to 0 and a high charge to 255)
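The first three quantization steps are missing here; a minimal reconstruction, assuming a grayscale img with values in [0,255] and a hypothetical level count n_colors:
import numpy as np
n_colors = 4   # number of quantization levels (assumed)
# Step 1: normalize intensities [0,255] -> [0,1]
img_norm = img / 255
# Step 2: scale to the level range [0,1] -> [0, n_colors-1]
scaled_values = img_norm * (n_colors - 1)
# Step 3: round each value to the nearest level
rounded_values = np.round(scaled_values)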
# Step 4: rescale the rounded values back to the original intensity range
# [0, n_colors-1] -> [0,1] -> [0,255]
image_quantized = rounded_values * 255 / (n_colors-1)
2 Colors
Red, green, blue are called Primary Colors
R, G, B were chosen due to the structure of the human eye
R, G, B are used in cameras, which have three sensors (one per color)
Typically each color value is represented by an 8-bit value, meaning that 256 different shades of each
color can be measured (in total 256³ ≈ 16.7 million colors can be represented in a pixel)
the actual representation might be three images - one for each color, but it can also be a three-
dimensional vector for each pixel, hence an image of vectors
### separate the red, green and blue channels of an RGB image
red_img = np.zeros_like(img)
red_img[:,:,0] = img[:,:,0]
green_img = np.zeros_like(img)
green_img[:,:,1] = img[:,:,1]
blue_img = np.zeros_like(img)
blue_img[:,:,2] = img[:,:,2]
grayscale conversion: gray = W_R · R + W_G · G + W_B · B
by default, the three colors are equally important, hence W_R = W_G = W_B = 1/3
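A minimal sketch of this weighted sum, assuming an RGB img; note that ski.color.rgb2gray instead uses perception-based weights (about 0.21, 0.72, 0.07):
W_R = W_G = W_B = 1/3
gray = W_R*img[:,:,0] + W_G*img[:,:,1] + W_B*img[:,:,2]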
a method for colorizing grayscale images: Slicing the image into levels
slices the image into N levels using an equal step threshold process
# rescale to [0,255]
img1 = np.uint8(img_norm*255)
plt.imshow(img1, cmap=mycmap)
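One way to build mycmap (the colors here are arbitrary assumptions) is a ListedColormap with N entries; matplotlib then bins the gray range into N equal color slices:
from matplotlib.colors import ListedColormap
mycmap = ListedColormap(['black', 'blue', 'yellow', 'white'])   # N = 4 levels (assumed)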
point processing is now defined as an operation that calculates the new value of a pixel in g(x,y) based
on the value of the pixel in the same position in f(x,y)
it is the simplest type of operation, because the new pixel value is independent of the surrounding pixels
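As a sketch, the simplest point operation is a linear mapping g(x,y) = a·f(x,y) + b, which adjusts contrast (a) and brightness (b); the values below are arbitrary examples:
a, b = 1.5, -20   # example gain (contrast) and bias (brightness)
g = np.clip(a * img.astype(float) + b, 0, 255).astype(np.uint8)   # clip to avoid overflow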
3.1.2 Contrast
contrast describes the level of detail we can see, or the difference between pixel values
img_adapted = ski.exposure.equalize_adapthist(img)*255   # equalize_adapthist returns floats in [0,1]
plt.figure(figsize=(8,5))
plt.hist(img.ravel(), bins=256, range=(0,255), density=True, alpha=0.7, label='gray-level histogram')
plt.legend()
plt.show()
plt.figure(figsize=(12,5))
plt.subplot(1,2,1)
plt.imshow(img, cmap='gray', vmin=0, vmax=255)
plt.title(img_title)
plt.colorbar()
plt.subplot(1,2,2)
plt.hist(img.flatten(), bins=100, range=(0,255), color='blue')
plt.title(hist_title)
plt.xlabel('Pixel Value')
plt.ylabel('Frequency')
plt.show()
humans cannot tell the difference between gray-level values that are too close to each other
so, spread out the gray-level values
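A minimal sketch of a linear (min-max) gray-level stretch, assuming a grayscale img:
def img_linear_stretch(img):
    # map [min, max] linearly onto the full range [0, 255]
    img = img.astype(float)
    new_img = (img - img.min()) / (img.max() - img.min()) * 255
    return np.uint8(new_img)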
def img_cdf_stretch(img):
    # count the number of pixels at each intensity value from 0 to 255
    counts, bin_edges = np.histogram(img.flatten(), bins=256, range=(0,255))
    # convert intensity values into cumulative probabilities in [0,1]
    cdf_normalized = np.cumsum(counts)/sum(counts)
    # the mapping sends each cumulative probability in [0,1] to a gray level in [0,255]
    mapping = np.uint8(cdf_normalized * 255)
    # use the intensity values of img as indices into the mapping
    return mapping[img]
3.2.2 Thresholding
segmentation is about image analysis, not image manipulation
the task: information vs. noise; foreground (object) vs. background
use graylevel mapping and the histogram
when two peaks/modes of a histogram correspond to object and noise, find a threshold value T that
separates the two peaks
# Threshold
img_threshold = img <= Threshold
#OR: img_threshold = img >= Threshold
img_invert = ski.util.invert(img)
1. choose an initial estimate of the threshold T (e.g. the mean gray level)
2. segment the image with T into region G1 (values > T) and region G2 (values <= T)
3. compute the average gray level values m1 and m2 for the pixels in regions G1 and G2
4. set the new threshold to T = (m1 + m2) / 2
5. repeat steps 2-4 until the difference in T is smaller than a predefined parameter T0 | a sketch follows below |
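A minimal sketch of this iterative scheme, assuming a grayscale img and a tolerance T0 (value assumed):
def iterative_threshold(img, T0=0.5):
    T = img.mean()                      # step 1: initial estimate
    while True:
        m1 = img[img > T].mean()        # steps 2-3: mean of region G1
        m2 = img[img <= T].mean()       # steps 2-3: mean of region G2
        T_new = (m1 + m2) / 2           # step 4: new threshold
        if abs(T_new - T) < T0:         # step 5: stop when T converges
            return T_new
        T = T_new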
Threshold_otsu = ski.filters.threshold_otsu(img)
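Applying the returned threshold is a simple comparison (usage sketch; assumes bright objects on a dark background):
img_binary = img > Threshold_otsu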
overflow / underflow
the result of calculation may be smaller than 0 or larger than 255
solution: use an intermediate image whose pixel values are floats (32 bits / 4 bytes), which can store almost
any number
1. write computation result into intermediate image
2. rescale intermediate image to values [0,255] and write results into 8 bit image
normalized image = (image - min) / (max - min)
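A minimal sketch of this two-step workflow, assuming a uint8 img and an example operation that overflows (the +100 is arbitrary):
# 1. write the computation result into a float intermediate image
intermediate = img.astype(np.float32) + 100   # values may exceed 255
# 2. rescale to [0,255] and write into an 8-bit image
norm = (intermediate - intermediate.min()) / (intermediate.max() - intermediate.min())
img_out = np.uint8(norm * 255)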
# use a grayscale image | an RGB image can be converted with ski.color.rgb2gray(img) |
img_filter = sp.signal.convolve2d(img, kernel, mode='same')   # 'same' keeps the output the same size as img
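For instance, a 3×3 mean (box) kernel smooths the image; a minimal sketch, assuming import numpy as np and import scipy as sp as above:
kernel = np.ones((3,3)) / 9   # 3x3 mean filter; the weights sum to 1
img_smooth = sp.signal.convolve2d(img, kernel, mode='same', boundary='symm')   # 'symm' = mirror padding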
border handling strategies | examples below |
copy the border: after processing, copy the outer border of the input image into the output
padding: add a fixed value (0 or 255) around the edge | this process will change the histogram |
truncate the kernel at the edge | complex and not well-defined |
mirror padding: reflect the outer border of the input image beyond the edge
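These strategies map onto boundary modes in common libraries; a small np.pad demonstration (1D for readability):
row = np.array([1, 2, 3])
np.pad(row, 1, mode='constant')   # [0 1 2 3 0]  zero padding
np.pad(row, 1, mode='edge')       # [1 1 2 3 3]  copy the border
np.pad(row, 1, mode='reflect')    # [2 1 2 3 2]  mirror padding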
if we represent an image by height as opposed to intensity, then edges correspond to places where
we have steep hills
for each point in this image landscape, we have two gradients: x-direction and y-direction.
Edge detection steps
1. noise reduction | because edge-detection filters are sensitive to noise (high frequencies) |
2. edge enhancement: calculate candidates for the edges
3. edge localization: decide which candidates are true edges, e.g. by thresholding | a sketch follows below |
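A minimal sketch of the three steps with skimage filters (the threshold 0.1 is an arbitrary assumption):
img_denoised = ski.filters.gaussian(img, sigma=1)   # step 1: noise reduction
gradient = ski.filters.sobel(img_denoised)          # step 2: edge enhancement (gradient magnitude)
edge_mask = gradient > 0.1                          # step 3: localization by thresholding (value assumed)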
img_fill = sp.ndimage.binary_fill_holes(img)   # fill enclosed holes in a binary image
5 Morphological operations
### Generate a structural element (SE)
## generate a disk SE
Radius = 1 # the size of SE will be 2*Radius+1
SE = ski.morphology.disk(Radius)
## display it
sns.heatmap(SE,cmap='gray',cbar=False, annot=True)
closing (dilation followed by erosion): fills holes but keeps the original size and shape (better than dilation alone) | a one-line sketch follows below |
opening and closing are idempotent: applying them twice gives the same result as applying them once
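A one-line sketch, assuming a binary image img_binary and the SE generated above:
img_closed = ski.morphology.binary_closing(img_binary, SE)   # dilation, then erosion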
6.1.2 rotation
usually rotation is defined to rotate around the center point
img_rotated = ski.transform.rotate(img,angle)
# angle: rotation angle in degrees in counter-clockwise direction.
6.3 Interpolation
when doing transformations, it is often not possible to map pixels 1 to 1
hence, spatial transformations usually require some form of interpolation, and possibly anti-aliasing
interpolation methods: since computation increases with the number of pixels that are considered,
there is a trade-off between quality and computational time
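In skimage the order parameter of the transform functions selects the interpolation method; a sketch:
img_nn = ski.transform.rotate(img, 30, order=0)   # nearest neighbor: fastest, blocky
img_bl = ski.transform.rotate(img, 30, order=1)   # bilinear: the default compromise
img_bc = ski.transform.rotate(img, 30, order=3)   # bicubic: slower but smoother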
automatic registration: relies on an optimization technique to maximize the correlation, or another measure of similarity, between the
images
some reduction in sharpness is seen in the realigned image as a result of information lost in the distortion process
manual registration: uses human pattern-recognition skills to aid the alignment process, usually by selecting corresponding
reference points in the images
the number of reference-point pairs required depends on the number of variables needed to define the
transformation
an affine transformation (6 variables) will require a minimum of 3 reference point pairs
a projective transformation (8 variables) requires a minimum of 4 reference point pairs
more reference points generally improve the alignment
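A sketch with skimage's transform estimation, using hypothetical corresponding points src and dst (4 pairs, the minimum for projective):
src = np.array([[0, 0], [0, 100], [100, 100], [100, 0]])   # reference points in image 1 (assumed)
dst = np.array([[5, 8], [2, 109], [107, 103], [103, 1]])   # matching points in image 2 (assumed)
tform = ski.transform.estimate_transform('projective', src, dst)
img_aligned = ski.transform.warp(img, tform.inverse)       # resample img into image 2's frame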
%matplotlib tk
# Deactivate pop up => %matplotlib inline
import matplotlib.animation as animation
fig = plt.figure()
images = []
# Plot each frame directly without creating axes; each frame is a one-artist list for ArtistAnimation
for i in range(36):
    images.append([plt.imshow(image_stack[:,:,i], cmap='gray', animated=True)])
# Create the animation object, where we input the figure and the images
ani = animation.ArtistAnimation(fig, images, interval=50, repeat=False)
plt.axis('off')
plt.show()