Structural analysis of Arabidopsis thaliana chromosome 5. II. Sequence features of the regions of 1,044,062 bp covered by thirteen physically assigned P1 clones.
A total of 13 P1 clones, each containing one or more markers specifically mapped on chromosome 5, were isolated from a P1 library of the Arabidopsis thaliana Columbia genome. Their nucleotide sequences were determined by a shotgun-based strategy and precisely located on the physical map of chromosome 5. The total length of the sequenced regions was 1,044,062 bp. Since we have previously reported 1,621,245 bp of sequence from the analysis of 20 non-redundant P1 clones, the total length of chromosome 5 sequence determined so far has reached 2,665,307 bp. The regions sequenced in this study were analysed by comparison with sequences in protein and EST databases and with computer programs for gene modeling; a total of 225 potential protein-coding genes and/or gene segments with known or predicted functions were identified. The positions of exons that do not exhibit similarity to known genes were also predicted by computer-aided analysis. The average density of the genes and/or gene segments was 1 gene per 4,640 bp. Introns were identified in approximately 84% of the potential genes, and the average number and length of the introns per gene were 5.3 and 184 bp, respectively. These sequence features are essentially identical to those of the previously sequenced regions. The transcription level of the predicted genes was roughly monitored by counting the number of matching Arabidopsis ESTs. The sequence data and gene information are available through the World Wide Web at http://www.kazusa.or.jp/arabi/.
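The cumulative length and gene density quoted above follow from simple arithmetic on the reported figures. A minimal sketch that reproduces them is given here; the function and constant names are hypothetical, and rounding to the nearest whole base pair is an assumption, since the abstract does not state how the density figure was rounded.

```python
# Figures taken directly from the abstract.
PREVIOUS_BP = 1_621_245  # 20 non-redundant P1 clones reported earlier
CURRENT_BP = 1_044_062   # 13 P1 clones sequenced in this study
GENE_COUNT = 225         # potential protein-coding genes and/or gene segments


def cumulative_length(previous_bp: int, current_bp: int) -> int:
    """Total chromosome 5 sequence determined so far, in bp."""
    return previous_bp + current_bp


def bp_per_gene(region_bp: int, gene_count: int) -> int:
    """Average spacing of genes in bp per gene.

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    return round(region_bp / gene_count)


if __name__ == "__main__":
    print(cumulative_length(PREVIOUS_BP, CURRENT_BP))  # 2665307
    print(bp_per_gene(CURRENT_BP, GENE_COUNT))         # 4640
```

Both results agree with the values stated in the abstract (2,665,307 bp in total and 1 gene per 4,640 bp).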