Structural analysis of Arabidopsis thaliana chromosome 5. VIII. Sequence features of the regions of 1,081,958 bp covered by seventeen physically assigned P1 and TAC clones.
A total of 17 P1 and TAC clones, each representing an assigned region of chromosome 5, were isolated from P1 and TAC genomic libraries of Arabidopsis thaliana Columbia, and their nucleotide sequences were determined. The clones sequenced in this study total 1,081,958 bp in length. As we have previously reported 9,072,622 bp of sequence from the analysis of 125 P1 and TAC clones, the total length of chromosome 5 sequence determined so far is now 10,154,580 bp. The sequences were subjected to similarity searches against protein and EST databases and to analysis with computer programs for gene modeling. As a result, a total of 253 potential protein-coding genes with known or predicted functions were identified. The positions of exons that do not show apparent similarity to known genes were also assigned using computer programs for exon prediction. The average density of the genes identified in this study was 1 gene per 4,277 bp. Introns were observed in 74% of the potential protein-coding genes, and the average number of introns per gene and the average intron length were 4.3 and 168 bp, respectively. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
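The cumulative sequence length and the gene density quoted above follow directly from the reported counts. The short sketch below reproduces that arithmetic; the variable names are illustrative only, and it assumes that the density is simply the newly sequenced length divided by the number of identified genes, rounded to the nearest base pair.

```python
# Figures taken from the abstract; variable names are illustrative.
NEW_BP = 1_081_958        # sequence determined in this study
PREVIOUS_BP = 9_072_622   # sequence reported previously (125 clones)
GENES = 253               # potential protein-coding genes identified here

# Cumulative length of chromosome 5 sequence determined so far.
total_bp = NEW_BP + PREVIOUS_BP

# Assumed definition of gene density: new sequence length per gene,
# rounded to the nearest base pair.
bp_per_gene = round(NEW_BP / GENES)

print(f"total sequenced: {total_bp:,} bp")
print(f"density: 1 gene per {bp_per_gene:,} bp")
```

Both values agree with the figures stated in the abstract (10,154,580 bp and 1 gene per 4,277 bp).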