On the total number of genes and their length distribution in complete microbial genomes
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
On the total number of genes and their length distribution in complete microbial genomes. / Skovgaard, M; Jensen, L J; Brunak, S; Ussery, David; Krogh, A.
In: Trends in Genetics, Vol. 17, No. 8, 2001, p. 425-8.
RIS
TY - JOUR
T1 - On the total number of genes and their length distribution in complete microbial genomes
AU - Skovgaard, M
AU - Jensen, L J
AU - Brunak, S
AU - Ussery, David
AU - Krogh, A
PY - 2001
Y1 - 2001
N2 - In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes.
KW - Databases, Factual
KW - Escherichia coli
KW - Genome
KW - Genome, Bacterial
KW - Models, Statistical
KW - Open Reading Frames
KW - Saccharomyces cerevisiae
M3 - Journal article
C2 - 11485798
VL - 17
SP - 425
EP - 428
JO - Trends in Genetics
JF - Trends in Genetics
SN - 0168-9525
IS - 8
ER -
ID: 40749806