Abstract
A tiling array yields a series of abundance measurements across the genome using evenly spaced probes. These data can be used to detect sequences that exhibit a particular behavior. Scanning window statistics are often employed to test each probe while accounting for local correlation and smoothing noisy measurements. However, window testing may yield false probe discoveries around the sequences of interest and false non-discoveries within them, resulting in biased predicted intervals. We propose to avoid this problem by stipulating that a sequence of interest can appear at most once within a defined region, such as a gene; thus, only one window statistic is considered per region. This substantially reduces the number of tests and is therefore potentially more powerful. We compare this approach to a genome-wise scan that does not require pre-defined search regions but instead considers clumps of adjacent probe discoveries. Simulations show that the gene-wise search maintains the nominal FDR level, whereas the genome-wise scan yields an FDR that exceeds the nominal level for low interval effects and achieves slightly less power. Using arrays to map introns in yeast, we identified 71% of the previously published introns, detected nine previously undiscovered introns, and observed no false intron discoveries by either method.
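The idea of keeping a single window statistic per region can be illustrated with a minimal Python sketch. It is not the authors' implementation: the window statistic (a plain moving average), the function names `window_means`, `best_window` and `benjamini_hochberg`, and the choice of the Benjamini-Hochberg step-up rule for FDR control are all assumptions made for illustration. How a p-value is obtained for each region's maximal window is not specified here and is left outside the sketch.

```python
import numpy as np

def window_means(x, w):
    """Mean of every length-w window of consecutive probes (assumed statistic)."""
    c = np.concatenate(([0.0], np.cumsum(x, dtype=float)))
    return (c[w:] - c[:-w]) / w

def best_window(x, w):
    """Return (start index, statistic) of the single highest-mean window in one region."""
    m = window_means(x, w)
    i = int(np.argmax(m))
    return i, float(m[i])

def benjamini_hochberg(pvals, q=0.05):
    """Boolean rejection mask under the Benjamini-Hochberg step-up rule (assumed FDR procedure)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Compare the k-th smallest p-value with q * k / n.
    passed = p[order] <= q * np.arange(1, n + 1) / n
    reject = np.zeros(n, dtype=bool)
    if passed.any():
        k = int(np.max(np.nonzero(passed)[0]))
        reject[order[: k + 1]] = True
    return reject

# Hypothetical probe measurements for two regions (for example, two genes).
regions = {
    "geneA": np.array([0.0, 0.0, 5.0, 5.0, 0.0]),
    "geneB": np.array([0.1, -0.2, 0.0, 0.3, -0.1]),
}
# One candidate interval per region, so one test per region.
candidates = {name: best_window(x, w=2) for name, x in regions.items()}
```

Because each region contributes exactly one candidate interval, the number of hypotheses passed to the FDR procedure equals the number of regions rather than the number of probes.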
Original language | English |
---|---|
Pages (from-to) | 173-190 |
Number of pages | 18 |
Journal | Statistical Applications in Genetics and Molecular Biology |
Volume | 13 |
Issue number | 2 |
State | Published - 1 Jan 2014 |
Externally published | Yes |
Keywords
- Gene-wise search
- Introns
- Meiosis
- Saccharomyces cerevisiae
- Scan statistic
- Tiling arrays
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Genetics
- Computational Mathematics