Microarray segmentation for DNA arrays


What: DNA microarray technology allows the simultaneous analysis of thousands of genes (or of an entire genome) to determine their expression levels, or changes in those levels, and can also be used to detect mutations such as single nucleotide polymorphisms (SNPs).

Why: There are two main technology platforms for implementing and analysing DNA microarrays: the “Two-Colour Spotted Microarrays” (or “Spotted Microarrays”) platform and the “Probe-Set Arrays” (or “Affymetrix”) platform.

Pre-processing microarray data
The first step in the processing stage is image analysis. The microarray is scanned to obtain a digital image in which every sample is described by a few tens of pixels. Image segmentation follows; it locates the pixels that correspond to each sample. Each sample is then quantified by summarizing the light intensities of its pixels. The background must also be quantified, in order to separate specific effects from unspecific ones.
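The segmentation and quantification steps can be illustrated with a minimal sketch. It assumes that the grid cell containing one sample has already been cut out of the scanned image, and it separates spot pixels from background pixels with a simple midpoint threshold. The function name `quantify_spot`, the threshold rule and the use of the median as the summary are illustrative assumptions, not the method of any particular scanner software.

```python
from statistics import median

def quantify_spot(cell, threshold=None):
    """Split one grid cell into spot and background pixels and summarize each.

    `cell` is a list of rows of pixel intensities (hypothetical input format).
    """
    pixels = [p for row in cell for p in row]
    if threshold is None:
        # Assumption: the midpoint between the darkest and the brightest
        # pixel separates the spot from the background.
        threshold = (min(pixels) + max(pixels)) / 2
    spot = [p for p in pixels if p > threshold]
    background = [p for p in pixels if p <= threshold]
    if not spot or not background:
        raise ValueError("cell has no contrast; cannot separate spot from background")
    # Summarize each pixel group by its median intensity.
    return median(spot), median(background), len(spot)

# Synthetic 6x6 cell: background intensity 100 with a 3x3 spot of intensity 900.
cell = [[100] * 6 for _ in range(6)]
for r in range(2, 5):
    for c in range(2, 5):
        cell[r][c] = 900

signal, background, n_pixels = quantify_spot(cell)
print(signal, background, n_pixels)
```

The median is used here because it is less sensitive than the mean to a few saturated or dusty pixels; a mean would serve equally well for the purpose of the illustration.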
The next step is to eliminate the background effect. The objective is to estimate the abundance of genetic material by measuring the signal intensity of the samples. Two other important aspects are data normalization, which ensures compatibility with other microarray experiments, and evaluation of data quality, which identifies discordant data.
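A minimal sketch of background elimination and normalization follows, assuming a two-colour array with a red and a green channel. Subtracting the local background and centring the log2 ratios on their median is only one simple choice among several; the function names, the `floor` parameter and the intensity values are hypothetical.

```python
import math
from statistics import median

def background_correct(foreground, background, floor=1.0):
    """Subtract the background from each spot intensity.

    Values are clipped at `floor` (an assumed lower bound) so that the
    later logarithm is always defined.
    """
    return [max(f - b, floor) for f, b in zip(foreground, background)]

def normalized_log_ratios(red, green):
    """Return log2(red/green) per spot, centred so that the median is zero."""
    ratios = [math.log2(r / g) for r, g in zip(red, green)]
    center = median(ratios)
    return [m - center for m in ratios]

# Hypothetical intensities for four spots, with a constant background of 100.
red = background_correct([900, 450, 1700, 228], [100, 100, 100, 100])
green = background_correct([500, 450, 300, 164], [100, 100, 100, 100])

log_ratios = normalized_log_ratios(red, green)
print(log_ratios)
```

After centring, a value of zero means that the spot behaves like the typical spot on the array, which is what makes ratios from different arrays comparable.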

Statistical analysis of gene expression
The simplest microarray experiment tests gene expression in the “one gene-one sample” case. The question is whether a specific gene is significantly expressed in a given sample.
The hybridization intensity values obtained from the target samples and from the control samples form dependent pairs. The statistical problem is to test the null hypothesis of equal means for two dependent random variables, that is, to check whether the means of the two groups of dependent values differ significantly. Thus the “one gene-one sample” case becomes a “one gene-two samples” case.
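The dependent-pairs test can be sketched as follows. The paired t statistic is computed from the within-pair differences. To obtain a p-value without distribution tables, the sketch substitutes an exact sign-flip permutation test for the usual look-up in the t distribution with n-1 degrees of freedom; this substitution and the intensity values are assumptions made for illustration.

```python
import math
import statistics
from itertools import product

def paired_t_statistic(target, control):
    """t statistic for the null hypothesis that the mean paired difference is zero."""
    diffs = [t - c for t, c in zip(target, control)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

def sign_flip_p_value(target, control):
    """Exact two-sided p-value obtained by flipping the sign of every difference."""
    diffs = [t - c for t, c in zip(target, control)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for rounding
            extreme += 1
    return extreme / total

# Hypothetical log2 intensities of one gene on six arrays (target vs. control).
target = [9.0, 9.5, 11.5, 10.0, 8.0, 12.0]
control = [8.0, 7.5, 8.5, 8.0, 7.0, 9.0]

t_stat = paired_t_statistic(target, control)
p_value = sign_flip_p_value(target, control)
print(t_stat, p_value)
```

The exhaustive enumeration visits 2^n sign patterns, so it is practical only for a small number of pairs; for larger n a random subset of patterns, or the parametric paired t-test offered by a statistics package, is the usual choice.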
The most natural problem in microarray analysis is to compare the expression levels of a gene in two target samples, one with diseased cells and one with healthy cells. This is again the “one gene-two samples” case described above, but here the variables can be either dependent or independent.
For these problems there are well-known statistical tests, already implemented in software packages such as SPSS, SAS, Statistica and R: the one-sample t-test (testing for a mean), the two-sample t-test, the Wilcoxon two-sample test, the Wilcoxon signed-ranks test, the two-sample permutation test and the ANOVA test.
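Of the tests listed, the two-sample permutation test is the easiest to write out in full. The following minimal sketch handles the independent-samples case. It assumes that the difference of group means is the test statistic and that the groups are small enough to enumerate every relabelling; the function name and the expression values are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:  # tolerance for rounding
            extreme += 1
    return extreme / total

# Hypothetical log2 expression values of one gene in two independent groups.
diseased = [11.0, 12.0, 13.0, 14.0]
healthy = [7.0, 8.0, 9.0, 10.0]

p_value = permutation_p_value(diseased, healthy)
print(p_value)
```

With four values per group there are 70 possible relabellings, so the smallest attainable two-sided p-value is 2/70; this lower bound is worth keeping in mind when many genes are tested on few arrays.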