
Micro Arrays II - Image Analysis and Data Pre-Processing

The document provides information on analyzing microarray images using SpotFinder software. It discusses reading dual-color microarray images, creating and adjusting a grid to define spot locations, processing the images to extract foreground intensities and background measurements for each spot and dye, and exporting the data to Excel for further analysis. Quality control plots like the MA plot can be generated to inspect normalization and identify differentially expressed genes between samples.

Uploaded by

Beto Cavazos
Copyright
© Attribution Non-Commercial (BY-NC)

GENÓMICA FUNCIONAL (Functional Genomics)
DR. VÍCTOR TREVIÑO
VTREVINO@ITESM.MX
A7-421

Microarrays Image Analysis


Microarray - Pre-Processing Purpose


Microarray Image Analysis


TECHNOLOGIES

Two-dye spotted arrays (figure: x sectors (~3) by y sectors (~3))
- Probes: DNA target (cDNA, PCR products, etc.)
- Copies per gene: usually 3
- Organization: sectors (print-tip), each with i x j spots (18x20)
- Controls: empty spots, landing lights

Oligonucleotide arrays (figure: n probesets (~100) by m probesets (~100))
- Probes: oligos, ~20-40 nt, grouped in a probeset
- Copies per gene: usually 1
- Organization: n x m probesets
- Controls: perfect match probes (pm), mismatch probes (mm)


Microarray - Image Analysis


TECHNOLOGIES: RAW DATA

Two-dye arrays: 10,000 genes * 2 dyes * 3 copies/gene * ~40 pixels/gene = 2,400,000 values

Oligonucleotide arrays: 10,000 genes * 20 oligos * 2 (pm, mm) * ~36 pixels/gene = 14,400,000 values

After Image Analysis and Pre-processing: only 10,000 values
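The raw data counts on this slide can be checked with a few lines of Python. This is only an arithmetic sketch; the variable names are illustrative and the pixel counts are the slide's approximations.

```python
# Figures taken from the slide; pixel counts are approximate.
genes = 10_000

# Two-dye spotted array: 2 dyes, 3 copies per gene, ~40 pixels each.
two_dye_values = genes * 2 * 3 * 40

# Oligonucleotide array: 20 oligos, pm and mm probes, ~36 pixels each.
oligo_values = genes * 20 * 2 * 36

print(f"two-dye raw values: {two_dye_values:,}")
print(f"oligo raw values:   {oligo_values:,}")

# Pre-processing reduces all of this to one value per gene.
print(f"raw values per final value (two-dye): {two_dye_values // genes}")
print(f"raw values per final value (oligo):   {oligo_values // genes}")
```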


Image Analysis
- Addressing: Estimate location of spot centers.
- Segmentation: Classify pixels as foreground or background.
- Extraction: For each spot on the array and each dye:
  - foreground intensities
  - background intensities
  - quality measures

For Affymetrix arrays, these steps are done by the GeneChip software.


Addressing (by grid, GenePix)

Segmentation

Circular feature

Irregular feature shape

Finally compute Average

Background Reduction

Extraction: Determining Background

Image Analysis
Segmentation (Spot detection), then Background Estimation, then Value
Value = Spot Intensity - Spot Background
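A minimal sketch of the three steps for one spot and one dye. The pixel patch and the fixed threshold are invented for illustration; real software (GenePix, SpotFinder) uses its own segmentation methods, so this is an assumption-laden toy, not their algorithm. Mean foreground and median background are used because those are the quantities named later in the exercise (MNA, MedBkgA).

```python
from statistics import mean, median

# Hypothetical 5x5 pixel patch around one spot (one dye channel).
patch = [
    [20, 22, 21, 19, 20],
    [21, 90, 95, 88, 22],
    [20, 93, 99, 91, 21],
    [19, 89, 94, 90, 20],
    [22, 21, 20, 22, 21],
]

def spot_value(patch, threshold):
    """Return (spot intensity, spot background, background-subtracted value)."""
    pixels = [p for row in patch for p in row]
    # Segmentation: a simple intensity threshold (assumption for this sketch).
    foreground = [p for p in pixels if p >= threshold]
    background = [p for p in pixels if p < threshold]
    spot_intensity = mean(foreground)        # mean of foreground pixels
    spot_background = median(background)     # median of background pixels
    return spot_intensity, spot_background, spot_intensity - spot_background

intensity, background, value = spot_value(patch, threshold=50)
print(intensity, background, value)
```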


Gene     Sample1   Sample1
Gene1    100       98
Gene2    209       4209
Gene3    7         2
.        .         .
Genek    9882      9711
.        .         .
GeneN    2298      28


Data Transformation two dyes

[Figure: scatter plot of R=Sample1 vs G=Sample1 on the raw scale, and Log2(R=Sample1) vs Log2(G=Sample1) after the Log2 transformation. Source: Microarray Bioinformatics - D. Stekel (Cambridge, 2003)]
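A short sketch of the Log2 transformation applied to the first three genes of the table. The list names are illustrative. The last lines show one reason the log scale is convenient: a doubling and a halving become symmetric values (+1 and -1).

```python
import math

# Values for Gene1..Gene3 from the two columns of the table.
column_1 = [100, 209, 7]
column_2 = [98, 4209, 2]

log_column_1 = [math.log2(v) for v in column_1]
log_column_2 = [math.log2(v) for v in column_2]

for raw, logged in zip(column_1 + column_2, log_column_1 + log_column_2):
    print(f"{raw:>5} -> {logged:.2f}")

# Ratios of 2 and 1/2 are symmetric around 0 on the log2 scale.
print(math.log2(2.0), math.log2(0.5))
```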


Data Transformation two dyes (log2 scale)

[Figure: scatter plot of R=Sample1 vs G=Sample1 (log2 scale), rotated into the MA-Plot: M (deviation) vs A (intensity)]

1 value?

$M = \log_2(R/G)$

$A = \log_2(R \cdot G) / 2$
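The M and A values can be computed per spot from the two background-subtracted intensities. A minimal sketch follows; the function name is illustrative and the inputs are assumed to be positive.

```python
import math

def ma_values(r, g):
    """Return (M, A) for one spot given positive intensities R and G."""
    m = math.log2(r / g)        # log ratio: the "1 value" per gene
    a = math.log2(r * g) / 2    # average log intensity
    return m, a

# Equivalent forms: M = log2(R) - log2(G), A = (log2(R) + log2(G)) / 2
m, a = ma_values(8.0, 2.0)
print(m, a)
```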

Normalization 2 dyes
"Within"
(2 color technologies)
(assumption: the majority of genes do not change)

[Figure: MA-plots Before and After normalization. Vertical axis: M = log2(R)-log2(G). Horizontal axis: A = (log2(G)+log2(R)) / 2]
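If most genes do not change, the M values of an array should be centered on 0. The sketch below uses a global median shift, which is a deliberately simpler stand-in for the loess fit shown in the slides (loess also corrects intensity-dependent trends). The M values are invented.

```python
from statistics import median

def median_center(m_values):
    """Shift all M values so that their median becomes 0 (simplified within-array normalization)."""
    shift = median(m_values)
    return [m - shift for m in m_values]

# Hypothetical M values with a dye bias of about +1.
m_before = [0.9, 1.1, 1.0, 3.0, -1.0, 1.2, 0.8]
m_after = median_center(m_before)

print(median(m_before), median(m_after))
```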

Normalization 2 dyes
"Within" Spatial
(2 color technologies)

[Figure panels: Before Normalization; After loess Global Normalization; After loess by Sector (print-tip) Normalization]
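Normalizing by sector means estimating the correction separately for the spots printed by each print-tip. As a rough sketch of that idea, the code below centers M per sector with a median instead of fitting a loess curve per sector; the data layout (a list of `(sector, M)` pairs) is an assumption made for this example.

```python
from collections import defaultdict
from statistics import median

def median_center_by_sector(spots):
    """spots: list of (sector, M). Returns (sector, M minus that sector's median)."""
    by_sector = defaultdict(list)
    for sector, m in spots:
        by_sector[sector].append(m)
    shift = {sector: median(values) for sector, values in by_sector.items()}
    return [(sector, m - shift[sector]) for sector, m in spots]

# Hypothetical spots: sector 1 is biased upward, sector 2 downward.
spots = [(1, 0.5), (1, 0.7), (1, 0.6), (2, -0.4), (2, -0.2), (2, -0.3)]
centered = median_center_by_sector(spots)
print(centered)
```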


Data Transformation one dye

Gene     Sample1
Gene1    100
Gene2    209
Gene3    7
.        .
Genek    9882
.        .
GeneN    2298

Log2

Normalization 1 or 2 dyes
Between-slides

[Figure: density(x = log2(t[, 15] + 200), adjust = 0.475); N = 3840, Bandwidth = 0.1051]

[Figure: density vs log intensity, Before normalization and After normalization]

Normalization methods: quantile, scale, qspline, invariantset, loess

MAD (median absolute deviation)
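Of the between-slides methods listed, quantile normalization is easy to sketch: every slide is forced to share the same distribution, taken as the mean of the sorted values across slides. This is a minimal version that assumes equal-length slides without missing values and does not treat ties specially; established packages handle those cases.

```python
def quantile_normalize(arrays):
    """arrays: list of equal-length lists of log intensities, one list per slide."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across slides.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices from smallest to largest
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]                  # replace value by reference quantile
        normalized.append(out)
    return normalized

slides = [[5, 2, 3], [4, 1, 6]]
print(quantile_normalize(slides))
```

After normalization every slide contains the same set of values, only assigned to different genes.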

Summarization Affymetrix
Oligonucleotide dependent technologies

PM MM

Summarization = "Average"(Intensities)
Usual methods: tukey-biweight, av-diff, median-polish

The "summarization" equivalent in two-dye technologies is the average of gene replicates within the slide.
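As an illustration of summarizing a probeset into one value, here is a simplified average difference: the plain mean of PM minus MM over the probe pairs. It omits the outlier handling that production implementations apply, and the intensities are invented.

```python
from statistics import mean

def average_difference(pm, mm):
    """Simplified av-diff: mean of (PM - MM) over the probe pairs of one probeset."""
    return mean([p - m for p, m in zip(pm, mm)])

# Hypothetical probeset with four probe pairs.
pm = [120, 150, 130, 160]
mm = [100, 110, 120, 100]
print(average_difference(pm, mm))

# Two-dye equivalent: average the replicate spots of a gene within the slide.
replicate_m = [1.0, 1.25, 0.75]
print(mean(replicate_m))
```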


Microarrays Filtering / Treating Undefined Values


- Some spots may be defective from the printing process
- Some spots could not be detected
- Some spots may be damaged during the assay
- Artefacts may be present (bubbles, etc.)

- Use replicated spots as averages
- Remove unrecoverable genes
- Remove problematic spots in all arrays
- Infer values using computational methods (warning)
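A small sketch of the first two options: average the usable replicated spots of each gene and remove genes with no usable spot. Representing an undefined spot as `None` is a convention chosen for this example.

```python
from statistics import mean

def summarize_replicates(replicates):
    """Average the defined replicate values; return None if every spot is undefined."""
    usable = [v for v in replicates if v is not None]
    return mean(usable) if usable else None

# Hypothetical M values for three replicated spots per gene.
genes = {
    "Gene1": [1.0, None, 1.2],    # one damaged spot: use the other two
    "Gene2": [None, None, None],  # unrecoverable: remove
    "Gene3": [0.5, 0.7, 0.6],
}

summarized = {gene: summarize_replicates(values) for gene, values in genes.items()}
kept = {gene: value for gene, value in summarized.items() if value is not None}
print(kept)
```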


Microarray Data Filtering


- More than 10,000 genes
- Too much data increases computation time and analysis complexity

Remove:
- Genes that do not change significantly
- Undefined genes
- Low expression

Keeping:
- Large signal to noise ratio
- Large statistical significance
- Large variability
- Large expression
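A possible filter built on two of these criteria, expression level and variability across arrays. The thresholds (`min_mean`, `min_sd`) and the example values are arbitrary assumptions; suitable cut-offs depend on the data set.

```python
from statistics import mean, stdev

def keep_gene(values, min_mean=8.0, min_sd=0.5):
    """Keep a gene if its log2 expression is high enough and varies enough across arrays."""
    return mean(values) >= min_mean and stdev(values) >= min_sd

# Hypothetical log2 intensities of three genes across three arrays.
data = {
    "GeneA": [10.0, 12.0, 11.0],  # expressed and variable: keep
    "GeneB": [5.0, 5.1, 4.9],     # low expression: remove
    "GeneC": [10.0, 10.1, 9.9],   # does not change: remove
}

filtered = {gene: values for gene, values in data.items() if keep_gene(values)}
print(list(filtered))
```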

Microarray Pre-Processing Summary


a) Image Analysis and Background Subtraction: Image Scanning, Spot Detection, Background Detection & Subtraction, Intensity Value
b) Data Processing (Affymetrix; Microarray Two dyes)
c) Transformation: M=log2(R/G), A=log2(R*G)/2
d) Normalization: Within, Between


Image Analysis Exercise

Data processing of Placental Microarrays


Dr. Hugo A. Barrera Saldaña
Paper in Mol. Med. 2007: DNA Microarrays - A Powerful Genomic Tool for Biomedical Research (Trevino - Barrera - Mol Med 2007)

Search PubMed for Trevino V

Experimental Design Goal : Differential Expression


- mRNA Extraction: Placenta 1, Reference Pool, Placenta 2
  (controls)
- Labelling: Green, Red, Green, Red
- Microarray Hybridization (by duplicates)
- Scanning & Data Processing: Image Analysis, Within Normalization (per array), Between Normalization (all arrays)
- Detection of Differentially Expressed Genes: t-test H0: μ=0, p-values correction: False Discovery Rate
- Validation and Analysis: Comparison With Known Tissue Specific Genes

(Dr. Hugo Barrera)
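The detection step can be sketched with the standard library: a one-sample t statistic for H0: mean of M equals 0, and a Benjamini-Hochberg adjustment as one common way to control the False Discovery Rate (the slides do not say which FDR procedure was used, so that choice is an assumption). Converting t to a p-value needs the t distribution, which is left to a statistics package; the p-values below are invented.

```python
import math
from statistics import mean, stdev

def one_sample_t(values):
    """t statistic for H0: the mean of the values (e.g. M per array) is 0."""
    n = len(values)
    return mean(values) / (stdev(values) / math.sqrt(n))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

print(one_sample_t([1.0, 2.0, 3.0]))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```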


Experimental Design - Slides


SLIDES' SCANNINGS

GROUP  SLIDE       CY3 (GREEN)  CY5 (RED)
1a     52 A    V   Sample       Control
1b     52 B    V   Sample       Control
2a     51 A    V   Sample       Control
2b     51 B    V   Sample       Control
3a     56 A    V   Control      Sample
3b     56 B    V   Control      Sample
4a     A 54    V   Control      Sample
4b     B 54    V   Control      Sample
5a     A 55    V   Control      Control
5b     B 55    V   Control      Control
6a     A 53    V   Control      Control
6b     B 53    V   Control      Control

COMMENTS: RIGHT TOP GROUP, RIGHT BOTTOM GROUP, LEFT TOP GROUP, LEFT BOTTOM GROUP

Download Images from http://bioinformatica.mty.itesm.mx/?q=node/68


Read Images
- Read BOTH images together using SpotFinder
- Mark file 1 as "Cy3" = Green
- Mark file 2 as "Cy5" = Red
- Adjust Image Brightness and Contrast


Create Grid
- Metarows = 12, Metacolumns = 4
- Rows = 24, Columns = 24
- Pixels = 450 (of the 24 x 24 spots)
- Spacing = 18 (between metacolumns and metarows)
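As a sanity check on these grid settings, the number of spots they define can be computed directly. The spot pitch at the end is only a derived estimate, assuming that "Pixels = 450" is the width of one 24 x 24 block; it is not a SpotFinder parameter.

```python
# Grid parameters from the slide.
metarows, metacolumns = 12, 4
rows, columns = 24, 24
block_pixels = 450  # assumed to be the width of one 24 x 24 block

blocks = metarows * metacolumns          # number of sub-grids
spots_per_block = rows * columns
total_spots = blocks * spots_per_block

# Derived estimate (assumption): center-to-center distance between spots.
spot_pitch = block_pixels / columns

print(blocks, spots_per_block, total_spots, spot_pitch)
```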


Adjust Grid
Created grids are not aligned to the image.
- Use Visible All (right click in a blank area)
- Use Move All to adjust overall position
- Use Visible All to restore grid

Adjust each of the 12*4 grids to correct positions
- Right mouse button in a grid to move that grid
- Arrow keys also work
- Right mouse button in a blank section to move all grids


Save Grid

Save the grid frequently to avoid losing your work

Image Analysis


Use Gridding and Processing


- Adjust (save grid first; on Mac, Adjust doesn't work well)
- Process
- Export to .mev file
- Open .mev file in Excel
- Remove comment lines
- Compute signal:
  - Signal A = Cy3 Green = MNA - MedBkgA = mean of spot A - median of background A
  - Signal B = Cy5 Red = MNB - MedBkgB = mean of spot B - median of background B
- Plot Signal A vs Signal B

Copy images (in a Word file)
- 1 from the grid adjust
- 1 from the RI plot
- 1 from the data (figure)
- 2 from the QC view (A and B). What do they represent?

DO NOT SAVE THE modified .MEV FILE


Execute Process
- Select Gridding Tab
- Use Histogram Segmentation
- Spot Size = 10
- Process All!


Inspect DATA PROCESSED

- Select Data Tab
- Select a row / spot
- See results and interpret output


Inspect MA-PLOT
- Select RI-PLOT Tab
- Observe the MA-PLOT
- You can switch on/off specific grids
- A tendency can be observed (which has to be corrected to 0; see MIDAS exercise)


Quality Control View

Quality view tab

- View 2 shows whether each spot had M > 1 (yellow; the threshold is 0.5 in this image) or M < -1
- View 1 gives the count of all M values per color (yellow, gray, blue, and green)
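The counts shown in the quality view can be imitated by classifying each spot by its M value. The category names and example values below are made up for illustration and do not correspond to SpotFinder's colors.

```python
from collections import Counter

def classify_m(m, threshold=1.0):
    """Label a spot by its M value relative to +/- threshold."""
    if m > threshold:
        return "M > threshold"
    if m < -threshold:
        return "M < -threshold"
    return "within threshold"

# Hypothetical M values for a handful of spots.
m_values = [0.2, 1.5, -1.2, 0.7, -0.1, 2.3]

counts = Counter(classify_m(m) for m in m_values)
counts_half = Counter(classify_m(m, threshold=0.5) for m in m_values)
print(counts)
print(counts_half)
```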


Export DATA and VIEW in Excel

- Save data to a .mev file
- Open .mev file in Excel
- Remove comment lines (important!)
- Compute signal:
  - Signal A = Cy3 Green = MNA - MedBkgA = mean of spot A - median of background A
  - Signal B = Cy5 Red = MNB - MedBkgB = mean of spot B - median of background B
- Copy image in a Word file
- Plot Signal A vs Signal B

The plot in Excel should be similar to the MA plot (RI-Plot)

DO NOT SAVE THE modified .MEV FILE
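The same signal computation can be done outside Excel. The sketch below assumes the exported file is tab-delimited, that comment lines start with "#", and that the header uses the column names given above (MNA, MNB, MedBkgA, MedBkgB); the `UID` column and all numbers in the embedded excerpt are hypothetical. Because it only reads the data, the .mev file itself is never modified.

```python
import csv
import io

# Hypothetical excerpt of an exported file; real files have many more columns.
mev_text = (
    "# comment line\n"
    "# another comment line\n"
    "UID\tMNA\tMNB\tMedBkgA\tMedBkgB\n"
    "1\t1200\t2500\t200\t300\n"
    "2\t800\t750\t210\t290\n"
)

def read_signals(handle):
    """Return (UID, Signal A, Signal B) per spot, skipping comment lines."""
    data_lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    signals = []
    for row in reader:
        signal_a = float(row["MNA"]) - float(row["MedBkgA"])  # Cy3, green
        signal_b = float(row["MNB"]) - float(row["MedBkgB"])  # Cy5, red
        signals.append((row["UID"], signal_a, signal_b))
    return signals

signals = read_signals(io.StringIO(mev_text))
print(signals)
```

To read a real export, the same function could be called on `open(path, newline="")` in place of the `io.StringIO` object.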


Summary of SpotFinder Use

Image to Data

We read 2 images, Green=Cy3 and Red=Cy5, to generate an intensity value with reduced background noise for each color:
- We generated a grid with the number of spots and the spatial layout specified for the microarray
- We adjusted the positions visually by moving the grids
- We calculated the signal value and the background noise for each color
- We obtained a data file
