Sequencing the Maize Genome



Two projects are underway to test sequencing approaches for large plant genomes, using maize, which has about 2 Gb of clonable DNA, as an example. One approach uses traditional methods to obtain maize genome sequence data. The second attempts to focus on the gene-rich portion of the maize genome by avoiding the highly repetitive fraction. Both projects are supported by the National Science Foundation.

The first approach, titled Sequencing the Maize Genome, examines variations of methods proven during other whole-genome shotgun-sequencing projects. This project is being conducted by a team consisting of Jo Messing at Rutgers University, and Rod Wing and Cari Soderlund from the University of Arizona AGI/AGCoL.

The key features of this project are:

High-Resolution Fingerprinting: A map-driven BAC-by-BAC approach to sequencing the maize genome would be improved if it needed fewer extension points, which would reduce the number of sequencing cycles and the sequence redundancy. However, reducing the number of nucleation points requires determining a minimal tiling path before sequencing begins. The resolution of the current fingerprinted contigs (FPCs) is limited by the use of six-base-pair cutters: at that resolution, the overlaps needed between clones would become rather large, making the approach cost- and time-prohibitive. Therefore, we will use a new fingerprinting approach that allows the use of four-base-pair cutters. The problem with such an approach is that many more fragments are generated per BAC clone, which would be hard to analyze by any agarose gel-based technique. We will therefore use capillary sequencers to resolve the fragment sizes for each BAC clone. Eliminating the agarose gel step also accelerates fingerprinting because no manual band calling is required. We will re-fingerprint the clones from the MboI, HindIII and EcoRI BAC libraries that were originally fingerprinted on agarose gels and used to assemble versions 1 and 2 of the maize physical map.
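
The difference in fragment numbers between the two classes of enzyme can be estimated with a rough calculation. The sketch below assumes a random sequence with equal base frequencies, so that an n-base recognition site occurs on average once every 4^n bp, and a hypothetical 150 kb BAC insert. Neither assumption comes from the project description, and real maize DNA will deviate from this model, so the numbers are only indicative.

```python
# Expected restriction-fragment counts under a simplified random-sequence model.
# Assumptions (not from the project description): equal base frequencies and
# a hypothetical 150 kb BAC insert.

def expected_site_spacing(recognition_length: int) -> int:
    """Average distance in bp between sites for an n-bp recognition sequence."""
    return 4 ** recognition_length


def expected_fragments(insert_bp: int, recognition_length: int) -> float:
    """Approximate number of fragments from a complete digest of one insert."""
    return insert_bp / expected_site_spacing(recognition_length)


BAC_INSERT_BP = 150_000  # hypothetical insert size

six_cutter = expected_fragments(BAC_INSERT_BP, 6)   # one site per 4096 bp
four_cutter = expected_fragments(BAC_INSERT_BP, 4)  # one site per 256 bp

print(f"six-base cutter:  ~{six_cutter:.0f} fragments per clone")
print(f"four-base cutter: ~{four_cutter:.0f} fragments per clone")
print(f"ratio: {four_cutter / six_cutter:.0f}x")
```

Under this model a four-base cutter yields about 16 times as many fragments per clone as a six-base cutter, whatever insert size is assumed, which is consistent with the need for capillary rather than agarose gel resolution.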

Sequencing ~20 Mb of the maize genome represented in genomic BAC clones: We will sequence 140 maize BAC clones to phase-2. Sequencing is subcontracted to the Whitehead Institute/MIT Center for Genome Research. Selection of the clones will be as follows: All clones will be selected from the MboI BAC library.  Twenty-five clones picked from fingerprinted contigs will be shotgun sequenced to high redundancy (25X) to determine the success of current sequencing methodologies and assembly protocols.  This process will allow shortcomings of the current technologies with respect to sequencing the maize genome to be identified and addressed.

1.) An additional 25 MboI clones will be randomly picked, but selected from clones that have been end sequenced and do not fall into contigs.  The end sequences will be compared to the rice genome to ensure that the clones selected are from random locations.
2.) Twenty-five additional clones will be randomly picked from the HindIII library using criteria similar to those in #1.
3.) Twenty-five additional clones will be randomly picked from the EcoRI library using criteria similar to those in #1.
4.) Forty additional clones will be used to determine the efficacy of the HICF method in determining a minimum tiling path.
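
As a quick consistency check, the clone sets listed above can be tallied against the stated totals of 140 clones and roughly 20 Mb. The short sketch below does this; the set labels are paraphrases for illustration, and the per-clone average assumes no overlap between clones, which is a simplification.

```python
# Tally of the BAC clone sets named in the selection plan.
# Labels are illustrative paraphrases; counts are taken from the text.
clone_sets = {
    "MboI, from fingerprinted contigs (25X shotgun)": 25,
    "MboI, random, end sequenced, outside contigs": 25,
    "HindIII, random": 25,
    "EcoRI, random": 25,
    "HICF minimum-tiling-path test": 40,
}

total_clones = sum(clone_sets.values())

TARGET_BP = 20_000_000  # the "~20 Mb" given in the section title

# Average sequence contributed per clone, assuming non-overlapping clones.
mean_bp_per_clone = TARGET_BP / total_clones

print(f"total clones: {total_clones}")
print(f"mean sequence per clone: ~{mean_bp_per_clone / 1000:.0f} kb")
```

The five sets add up to the 140 clones stated, and 20 Mb spread over 140 clones corresponds to an average of about 143 kb per clone.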

This collection of sequence will also provide a valuable comparison for the data generated by the gene-filtration approaches.

Annotation of the sequenced BACs will be undertaken at the Munich Information Center for Protein Sequences (MIPS).

Sequencing ~450K BAC-end sequences: The BAC-end sequences, also known as sequence-tagged connectors (STCs), provide a framework by which the reduced number of FPCs, identified either by manual curation of the agarose gel-based maize fingerprinted BAC map or by high-resolution fingerprinting, can be further edited in two respects.

First, co-linearity of STCs with the rice genomic sequence can be used to order FPCs on the genetic map. The mouse-human homolog map used a total of 51,486 STCs, or one match per 54 kb, to align FPCs to the genetic map. We are planning to produce 450,000 STCs, which is comparable to the 453,962 STCs of the mouse genome project. Moreover, the maize fingerprinted BAC clones are derived from BAC libraries with twice the depth of coverage, which have been created with three different restriction enzymes, including a four-base-pair cutter.
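
The figures quoted above can be put on a common scale with a little arithmetic. The sketch below uses only numbers from this page (about 2 Gb of clonable maize DNA, 450,000 planned STCs, 51,486 matches at one per 54 kb) and assumes, for simplicity, that STCs are evenly spaced along the genome, which will not hold exactly in practice.

```python
# Back-of-the-envelope STC densities, assuming evenly spaced STCs.
MAIZE_GENOME_BP = 2_000_000_000  # "about 2 Gb" of clonable maize DNA
PLANNED_STCS = 450_000           # planned maize BAC-end sequences

MOUSE_HUMAN_MATCHES = 51_486     # STC matches in the mouse-human map
KB_PER_MATCH = 54                # one match per 54 kb

# Average spacing between planned maize STCs.
maize_stc_spacing_bp = MAIZE_GENOME_BP / PLANNED_STCS

# Genome span implied by the mouse-human figures (matches x spacing).
implied_span_gb = MOUSE_HUMAN_MATCHES * KB_PER_MATCH * 1_000 / 1e9

print(f"maize: one STC per ~{maize_stc_spacing_bp / 1000:.1f} kb")
print(f"mouse-human map: matches span ~{implied_span_gb:.2f} Gb")
```

On these assumptions the planned maize STCs would fall on average about every 4.4 kb, while the mouse-human figures correspond to a span of roughly 2.78 Gb. How many of the maize STCs will actually match rice sequence is not stated here, so the density of usable anchors will be lower than the raw STC density.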

Second, EST matches with STCs can be used to anchor genes to the genetic map in silico. This also yields a gene inventory and a sequence repeat library.

