
Sequence, sequence, sequence...

In simpler days, before people made up words like genomics and bioinformatics, a single gene was enough to keep a graduate student occupied for all five years of a doctorate. Sequence data trickled into GenBank one gene at a time, even as the methods for obtaining that sequence became simpler and simpler. Then a renegade called J. Craig Venter had an idea: stop thinking about what DNA you should sequence and start sequencing anything you can get your hands on. Reasoned thought was out; brute force was in.

But there was some logic to Venter's approach. Genes are mere islands in a cell's DNA, stranded among seas of nonsensical filler DNA. The Human Genome Project promised to (eventually) sequence everything. (A genome is the collection of all the DNA in a given cell.) Venter wanted to fish out the informative bits first: just enough of each gene to take a guess at its place in the running of the cell. It is proteins that do the work of a cell, but proteins are made only after genes are converted to mRNA, which is then converted to protein. Venter took the mRNA and transformed it back into DNA that was ready to sequence and devoid of non-gene junk. After sequencing at most a few hundred nucleotides of each piece of DNA, he had his expressed sequence tag (EST).
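
The mRNA-to-EST step can be pictured with a toy sketch. It is an illustration only, not Venter's laboratory protocol or software: the function names, the 300-nucleotide cutoff, and the example sequence are all assumptions made up for this example. It copies an mRNA back into complementary DNA and keeps only the first stretch of that DNA as the tag.

```python
# Toy illustration of the EST idea. All names, the default cutoff, and the
# example sequence are hypothetical; real ESTs come from sequencing machines.

# Each mRNA base pairs with a complementary DNA base.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna: str) -> str:
    """Return the complementary DNA strand, written 5'->3', for an mRNA given 5'->3'."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna))


def expressed_sequence_tag(cdna: str, max_length: int = 300) -> str:
    """Keep only the first max_length nucleotides, as a single short read would."""
    return cdna[:max_length]


mrna = "AUGGCCAAGUUC"                    # made-up 12-base mRNA
cdna = reverse_transcribe(mrna)          # "GAACTTGGCCAT"
est = expressed_sequence_tag(cdna, max_length=5)
print(est)                               # prints "GAACT"
```

The tag is deliberately incomplete: it only needs to be long enough to tell one gene from another.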

Venter founded The Institute for Genomic Research (TIGR; Rockville, Md.) in July 1992, with $85 million of funding promised over ten years by Human Genome Sciences, Inc. (HGS; Rockville, Md.). Within a year, TIGR claimed it had identified ESTs for over half the estimated 70,000 human genes. TIGR and HGS parted ways in 1997: TIGR is now a not-for-profit institute with government funding, and HGS has focused on patenting genes (several hundred applications so far, with over fifty patents allowed) and developing the corresponding proteins as drugs.

Incyte began as a traditional pharmaceutical company. But when the failure of its premier drug in clinical trials coincided with Venter's EST splash, Incyte decided to re-invent itself. "We became basically a factory for sequencing DNA," says Klingler.

Four years, six months, and three million human ESTs later, the sequencing machines are still running. The sequencing room is a far cry from the deserted computer room: people scurry everywhere to tend to the rows of sequencing machines. This room generates all the data that makes the company run, but the work is repetitive and the workers, many of them college students, are expendable. "There is a whole new temporary biologist market," says Klingler. "I don't know how long the average technician stays, but it's not too long."

Incyte’s human database is called LifeSeq. The ESTs come from the mRNA of 669 different tissue samples, some of them diseased and some of them not, and represent perhaps 90-95% of all human genes. Genes that are often made into mRNA have been sequenced thousands of times, but some genes that are rarely converted into mRNA have yet to be sequenced even once. Incyte is also using the short ESTs to find the entire length of every human gene, and is working out where each gene lies in the 24 human chromosomes.

Newer databases include PathoSeq, which has most of the genes from 32 bacterial species, and ZooSeq, which includes genes from mice, rats, monkeys, and soon dogs. The sequencing operation that feeds these databases generates ~200,000 pieces of sequence, or over 40 million DNA nucleotides, every single month.
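
As a rough check on those figures, the monthly numbers imply an average piece of sequence about 200 nucleotides long, in line with an EST covering at most a few hundred nucleotides. The sketch below treats the rounded figures quoted above as exact, which is an assumption; the variable names are illustrative.

```python
# Back-of-the-envelope arithmetic using the article's rounded monthly figures.
pieces_per_month = 200_000           # "~200,000 pieces of sequence"
nucleotides_per_month = 40_000_000   # "over 40 million DNA nucleotides"

# Average length of one piece of sequence, in nucleotides.
average_read_length = nucleotides_per_month / pieces_per_month   # 200.0

# The same monthly pace extended over twelve months.
pieces_per_year = pieces_per_month * 12                          # 2,400,000

print(average_read_length, pieces_per_year)
```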

