In recent years, we have seen an explosion in the amount of biological information that is available. Various databases are doubling in size every 15 months and we now have the complete genome sequences of more than 100 organisms. It appears that the ability to generate vast quantities of data has surpassed the ability to use this data meaningfully. The pharmaceutical industry has embraced genomics as a source of drug targets. It also recognises that the field of bioinformatics is crucial for validating these potential drug targets and for determining which ones are the most suitable for entering the drug development pipeline.
Recently, the way that medicines are developed has changed because of our increased understanding of molecular biology. In the past, new synthetic organic molecules were tested in animals or in whole organ preparations. This has been replaced by a molecular target approach in which compounds are screened in vitro, at high throughput, against purified recombinant proteins or genetically modified cell lines. This change is a consequence of our ever-improving knowledge of the molecular basis of disease.
All marketed drugs today target only about 500 gene products. The elucidation of the human genome, which has an estimated 30,000 to 40,000 genes, presents immense new opportunities for drug discovery and simultaneously creates a potential bottleneck regarding the choice of targets to support the drug discovery pipeline. The major advances in genomics and sequencing mean that finding an attractive target is no longer the problem; the challenge is now to find the targets that are most likely to succeed. The focus of bioinformatics in the drug discovery process has therefore shifted from target identification to target validation. Many factors concerning a candidate target need to be taken into account, and the relevant information is spread across a multitude of heterogeneous resources.
The types of information that one needs to gather about potential targets include nucleotide and protein sequence information, homologues, mapping information, function prediction, pathway information, disease associations, variants, structural information, gene and protein expression data and species/taxonomic distribution, among others. Different bioinformatics tools can be used to gather this information. Accumulating this information about potential targets in databases means that pharmaceutical companies can avoid spending much time, effort and expense on bench work for targets that will ultimately fail. The information that is gathered helps to characterise the different targets into families and subfamilies.
This information also classifies the behaviour of the different molecules in a biochemical and cellular context. Decisions about which families provide the best potential targets are guided by a number of criteria. It is important that the potential target has a suitable structure for interacting with drug molecules. Structural genomics helps to prioritise the families in terms of their 3D structures.
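As a rough illustration of how gathered annotation might be grouped into families and then prioritised by structural information, the following Python sketch may help. It is only a toy under stated assumptions: the `TargetRecord` fields, the function names and the "structural coverage" score are all hypothetical choices made for this example, not an established scoring scheme, and real target validation weighs many more criteria.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TargetRecord:
    """Annotation collected for one candidate target (field names are illustrative)."""
    gene_symbol: str
    family: str
    has_3d_structure: bool = False
    disease_associations: List[str] = field(default_factory=list)
    pathways: List[str] = field(default_factory=list)


def group_by_family(records: List[TargetRecord]) -> Dict[str, List[TargetRecord]]:
    """Collect records into families, keyed by family name."""
    families = defaultdict(list)
    for record in records:
        families[record.family].append(record)
    return dict(families)


def structural_coverage(records: List[TargetRecord]) -> float:
    """Fraction of records with a known 3D structure (a toy prioritisation signal)."""
    if not records:
        return 0.0
    return sum(r.has_3d_structure for r in records) / len(records)


def rank_families(records: List[TargetRecord]) -> List[str]:
    """Return family names ordered by structural coverage, highest first."""
    families = group_by_family(records)
    return sorted(
        families,
        key=lambda name: structural_coverage(families[name]),
        reverse=True,
    )
```

A call such as `rank_families(records)` on a list of `TargetRecord` objects returns the family names in priority order. In practice the score would combine further evidence from the list above, such as disease associations and expression data, rather than structure alone.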