Current Rice Genome Pseudomolecules Release
We are pleased to announce release 7 of the Rice Pseudomolecules and Genome Annotation. The official release date for this version was October 31, 2011.
Release 7 is a major update from release 6.1. The rice pseudomolecules have been reconstructed using an optimal BAC tiling path that involved use of a BAC-optical map and error correction of the underlying BAC sequence using next-generation sequencing reads from Nipponbare rice. This effort, in cooperation with researchers at the Agrogenomics Research Center at the National Institute of Agrobiological Sciences, Tsukuba, Japan and the Rice Annotation Project Database (RAP-DB), represents a final and unified set of pseudomolecules (Os-Nipponbare-Reference-IRGSP-1.0). The set comprises the 12 chromosomes, one pseudomolecule representing the unanchored BAC clones, one pseudomolecule representing unmapped Syngenta sequences, and the two organellar genomes. Note that while the Rice Genome Annotation Project (RGAP) and the International Rice Annotation Project Database (RAP-DB) have different annotation efforts, these parallel annotation efforts utilize the same underlying pseudomolecule sequence.
In release 7, there are 373,245,519 bp of non-overlapping rice genome sequence from the 12 rice chromosomes. The genes identified in release 6.1 were remapped and transferred to release 7. This process identified 55,986 genes (loci), of which 6,457 have 10,352 additional alternative splicing isoforms, resulting in a total of 66,338 transcripts (or gene models) in the rice genome. Note that small gene models (<50 amino acids) have been excluded from our annotated gene set.
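The relationship between loci, additional isoforms, and transcripts described above can be illustrated with a minimal sketch. It assumes gene model identifiers of the form `<locus>.<isoform number>`, which this page does not state, and the function name `summarize_models` and the toy records are hypothetical, made up purely for illustration.

```python
from collections import Counter

MIN_PROTEIN_LENGTH = 50  # amino acids; shorter models are excluded (from the text)


def summarize_models(models):
    """Count loci, isoforms, and transcripts.

    `models` is an iterable of (model_id, protein_length) pairs. The
    '<locus>.<isoform number>' ID layout is an assumption of this sketch.
    """
    # Keep only models at or above the minimum protein length.
    kept = [model_id for model_id, length in models if length >= MIN_PROTEIN_LENGTH]
    # Group the surviving models by locus (everything before the last dot).
    per_locus = Counter(model_id.rsplit(".", 1)[0] for model_id in kept)
    return {
        "loci": len(per_locus),
        "loci_with_isoforms": sum(1 for n in per_locus.values() if n > 1),
        "additional_isoforms": sum(n - 1 for n in per_locus.values()),
        "transcripts": len(kept),
    }


# Made-up records for illustration only; not real annotation data.
toy_models = [
    ("LOC_Os00g00001.1", 120),
    ("LOC_Os00g00001.2", 110),
    ("LOC_Os00g00002.1", 45),  # below the 50 aa cut-off, dropped
    ("LOC_Os00g00003.1", 300),
]
summary = summarize_models(toy_models)

# Arithmetic reported for release 7: loci + additional isoforms = transcripts.
release7_transcripts = 55_986 + 10_352
```

By construction, `transcripts` always equals `loci` plus `additional_isoforms`, which is the same identity the release 7 figures follow (55,986 + 10,352 = 66,338).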
Transposable element-related (TE-related) gene models were identified using two approaches: BLASTN searches against the MSU Oryza Repeat Database and identification of gene models containing TE-related Pfam domains. These loci (16,941) and their models (17,272) were annotated based on the Pfam domain or the nomenclature in the MSU Oryza Repeat Database. Pack-MULEs were identified on all 12 chromosomes. They were annotated as described in Hanada et al. 2009. Transduplicate MULEs identified by Juretic et al. 2005 were aligned to the current pseudomolecules. Note that the Jiang Pack-MULEs and the transduplicate MULEs are shown only on the Genome Browser and are not part of our functional annotation. Also note that although loci and gene models on ChrUn and ChrSy are now included in our official gene set, they are not assigned LOC_OsXXgXXXXX identifiers. These two pseudomolecules contain 185 loci and gene models.
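As a rough illustration of how the two lines of TE evidence might be combined, the sketch below flags a gene model as TE-related when either approach supports it. This is an assumption about the combination logic, not the project's actual pipeline; the function name `is_te_related` and the domain labels are hypothetical placeholders, and the cut-offs applied upstream to the search results are not modeled.

```python
def is_te_related(has_repeat_db_hit, pfam_domains, te_pfam_domains):
    """Return True if either evidence type marks the model as TE-related.

    has_repeat_db_hit: whether the model had a passing hit against the
        repeat database (cut-offs are applied upstream, not here).
    pfam_domains: domains found in the model's protein.
    te_pfam_domains: set of domains regarded as TE-related.
    """
    has_te_domain = not te_pfam_domains.isdisjoint(pfam_domains)
    return bool(has_repeat_db_hit) or has_te_domain


# Placeholder domain labels, invented for this example.
TE_DOMAINS = frozenset({"TE_DOMAIN_A", "TE_DOMAIN_B"})

by_repeat_hit = is_te_related(True, [], TE_DOMAINS)
by_domain = is_te_related(False, ["OTHER_DOMAIN", "TE_DOMAIN_B"], TE_DOMAINS)
no_evidence = is_te_related(False, ["OTHER_DOMAIN"], TE_DOMAINS)
```

Treating the two approaches as a logical OR means a model needs support from only one of them; a stricter scheme requiring both would replace `or` with `and`.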
Please note that these pseudomolecules are constructed from finished and unfinished sequence and a majority of the gene models have not been manually curated.
Table of Rice Pseudomolecule, Loci, and Gene Models in Release 7
Chr | BAC/PAC No. | Sequence Length in Pseudomolecule (bp) | Gaps | Genes/Loci (a): TE (b) | Genes/Loci (a): Non-TE (c) | Genes/Loci (a): Total (d) | Gene Models (a): TE (b) | Gene Models (a): Non-TE (c) | Gene Models (a): Total (d) | Download Sequences |
---|---|---|---|---|---|---|---|---|---|---|
1 | 392 | 43,270,923 | 8 | 1,464 | 5,078 | 6,542 | 1,518 | 6,518 | 8,036 | Download |
2 | 359 | 35,937,250 | 5 | 1,244 | 4,143 | 5,387 | 1,274 | 5,392 | 6,666 | Download |
3 | 331 | 36,413,819 | 8 | 1,185 | 4,388 | 5,573 | 1,224 | 5,803 | 7,027 | Download |
4 | 296 | 35,502,694 | 9 | 1,903 | 3,419 | 5,322 | 1,919 | 4,265 | 6,184 | Download |
5 | 286 | 29,958,434 | 5 | 1,461 | 3,118 | 4,579 | 1,483 | 4,009 | 5,492 | Download |
6 | 281 | 31,248,787 | 4 | 1,488 | 3,236 | 4,724 | 1,517 | 3,965 | 5,482 | Download |
7 | 289 | 29,697,621 | 3 | 1,397 | 3,065 | 4,462 | 1,430 | 3,767 | 5,197 | Download |
8 | 278 | 28,443,022 | 3 | 1,432 | 2,762 | 4,194 | 1,446 | 3,426 | 4,872 | Download |
9 | 223 | 23,012,720 | 7 | 1,148 | 2,260 | 3,408 | 1,161 | 2,768 | 3,929 | Download |
10 | 208 | 23,207,287 | 10 | 1,219 | 2,298 | 3,517 | 1,244 | 2,830 | 4,074 | Download |
11 | 261 | 29,021,106 | 6 | 1,459 | 2,707 | 4,166 | 1,493 | 3,208 | 4,701 | Download |
12 | 269 | 27,531,856 | 5 | 1,579 | 2,443 | 4,022 | 1,605 | 2,983 | 4,588 | Download |
Total (e) | 3,184 | 373,245,519 | 73 | 16,979 | 39,102 | 56,081 | 17,314 | 49,119 | 66,433 | Download |
a Excluding small gene models (< 50 amino acids).
b TE: Transposable element-related genes and gene models. The rice proteome was searched against the MSU Oryza Repeat Database with TBLASTN and against the TE-related Pfam domains with hmmpfam. Genes and gene models with matches above cut-offs were annotated as TE-related gene models.
c Non-TE: Non-TE related gene models.
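d There are 89 loci and 89 models on ChrSy. There are 96 loci and 96 models on ChrUn. These 185 loci and models are not included in the per-chromosome rows for chromosomes 1 to 12; the Non-TE and Total figures in the Total row exceed the sums of those rows by exactly 185.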
e Note that these pseudomolecules are now identical to the IRGSP/RAP pseudomolecules.
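The figures in the table can be cross-checked with a short script. The sketch below transcribes the sequence length, gap, loci, and gene model columns from the table (the BAC/PAC column is left out of this check) and compares the column sums for chromosomes 1 to 12 with the Total row. The variable names are hypothetical and chosen only for this example.

```python
COLUMNS = (
    "length_bp", "gaps",
    "loci_te", "loci_non_te", "loci_total",
    "models_te", "models_non_te", "models_total",
)

# Per-chromosome values transcribed from the table above.
ROWS = {
    1: (43_270_923, 8, 1_464, 5_078, 6_542, 1_518, 6_518, 8_036),
    2: (35_937_250, 5, 1_244, 4_143, 5_387, 1_274, 5_392, 6_666),
    3: (36_413_819, 8, 1_185, 4_388, 5_573, 1_224, 5_803, 7_027),
    4: (35_502_694, 9, 1_903, 3_419, 5_322, 1_919, 4_265, 6_184),
    5: (29_958_434, 5, 1_461, 3_118, 4_579, 1_483, 4_009, 5_492),
    6: (31_248_787, 4, 1_488, 3_236, 4_724, 1_517, 3_965, 5_482),
    7: (29_697_621, 3, 1_397, 3_065, 4_462, 1_430, 3_767, 5_197),
    8: (28_443_022, 3, 1_432, 2_762, 4_194, 1_446, 3_426, 4_872),
    9: (23_012_720, 7, 1_148, 2_260, 3_408, 1_161, 2_768, 3_929),
    10: (23_207_287, 10, 1_219, 2_298, 3_517, 1_244, 2_830, 4_074),
    11: (29_021_106, 6, 1_459, 2_707, 4_166, 1_493, 3_208, 4_701),
    12: (27_531_856, 5, 1_579, 2_443, 4_022, 1_605, 2_983, 4_588),
}

# Values from the Total row of the table.
TOTAL_ROW = (373_245_519, 73, 16_979, 39_102, 56_081, 17_314, 49_119, 66_433)

# Sum each column over the 12 chromosomes.
column_sums = tuple(sum(values) for values in zip(*ROWS.values()))

# Difference between the Total row and the 12-chromosome sums, per column.
differences = {
    name: total - summed
    for name, total, summed in zip(COLUMNS, TOTAL_ROW, column_sums)
}

# Within each row, TE + Non-TE should equal Total for loci and for models.
rows_consistent = all(
    r[2] + r[3] == r[4] and r[5] + r[6] == r[7] for r in ROWS.values()
)

for name, diff in differences.items():
    print(f"{name}: Total row minus chromosome sum = {diff}")
```

The sequence length, gap, and TE columns agree exactly, while the Non-TE and Total columns for both loci and gene models differ by 185, matching the 89 ChrSy plus 96 ChrUn loci and models described in footnote d.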
This work is supported by grants (DBI-0321538/DBI-0834043) from the National Science Foundation and funds from the Georgia Research Alliance, Georgia Seed Development, and University of Georgia.