ND-Issue-6-2003

Managing the Explosion of Biological Data Using Applied Bioinformatics

The vast quantities of data that genomics and proteomics produce in the course of developing new therapeutics are becoming an ever greater problem. The main question for many researchers and pharmaceutical establishments is therefore how to manage this data explosion successfully.
Recent reports from the global pharmaceutical industry note a steady and significant increase in the number of biopharmaceutical products entering clinical trials and coming to market. According to the FDA and the Biotechnology Industry Organization, the number of approved biological therapeutics and vaccines has grown from an average of 13 per year from 1990 to 1999, to 30 in the years 2000 to 2003 [1]. In addition, the Pharmaceutical Research and Manufacturers of America (PhRMA) reports that the number of biopharmaceutical products in clinical trials has increased from about 150 in 1995 to almost 500 in 2002 [2].
All of this, combined with constantly changing technology and the nature of the now ubiquitous fields of genomics and proteomics, is producing an explosion of data and information.

Bioinformatics vs data management
In practice, the general term bioinformatics covers two distinct areas: the science-based aspect and the data management part. The science-based aspect (bioinformatics) can be defined as a three-step process, where steps one and two include data collection, the processing of scientific calculations and results analysis. Step three addresses the visualisation of experimental results. The data management part supports the workflow and storage of all the information. Under these definitions, bioinformatics refers to the access, handling and analysis of banks of scientific data that are available for scientists to feed into their own research. Data management is more an extension of the traditional laboratory notebook and allows researchers to seamlessly track, search and archive large amounts of experimental data and share it with the other scientists working on the project. The distinction between bioinformatics and data management, however, is becoming less clear and the two functions have now effectively merged under the generic banner of bioinformatics.
The importance of managing data in the drug discovery environment may not at first be evident, as the work moves extremely quickly. However, it is essential to be able to trace back through the data and validate it for quality assurance purposes. Once a target has been discovered, the data must be validated before the potential drug can progress to the next level and on to clinical trials. This is where having a good data management system in place from the start becomes very significant, because it makes this validation so much easier. The fact that the FDA has proposed new regulations to allow the use of genomic testing in clinical trials means that this is becoming more of an issue for all laboratories.
The main aim of bioinformatics is to provide systems to manage not only structured data, such as documents and tables, but also unstructured data (rich data) like mass spectrometric and gene expression data (figure 1). Every researcher and every laboratory is unique and the situation is far more complicated than it may at first seem. Not only are we dealing with enormous amounts of data, but also a variety of types of data from many different sources and different types and ages of technology.

Data management at its best
Bioinformatics is a new science. The fast development of new, complex bioanalytical technologies is responsible for a lack of data format standardisation. As a result, it is often hard to merge existing information held in different formats, software cannot easily be re-used, and a degree of customisation is the only way to create valuable solutions or services. In this respect, the bioinformatics industry can be compared with the computer industry 10 to 15 years ago. For example, although today everyone uses the same desktop office packages, in the early days there were many alternative packages and formats. The opportunity to harmonise different data types for the whole industry in the early days was unfortunately missed.
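The kind of customisation described above can be illustrated with a minimal Python sketch, in which one small adapter per source format maps data into a common record. The two export formats, their column names and the `Measurement` record are hypothetical and chosen only for illustration; they do not correspond to any real instrument or product.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Common record that every format-specific adapter produces."""
    sample_id: str
    analyte: str
    value: float
    source_format: str


def from_vendor_csv(text: str) -> list[Measurement]:
    # Hypothetical export A: CSV with columns sample, analyte, intensity
    rows = csv.DictReader(io.StringIO(text))
    return [
        Measurement(r["sample"], r["analyte"], float(r["intensity"]), "vendor_csv")
        for r in rows
    ]


def from_vendor_json(text: str) -> list[Measurement]:
    # Hypothetical export B: JSON list of objects with keys id, target, signal
    return [
        Measurement(d["id"], d["target"], float(d["signal"]), "vendor_json")
        for d in json.loads(text)
    ]


# One adapter per format; supporting a new format means adding one entry here.
ADAPTERS = {"csv": from_vendor_csv, "json": from_vendor_json}


def load(kind: str, text: str) -> list[Measurement]:
    """Dispatch to the adapter for this format; unknown formats raise an error."""
    if kind not in ADAPTERS:
        raise ValueError(f"no adapter for format {kind!r}")
    return ADAPTERS[kind](text)


# Made-up example inputs in the two hypothetical formats.
csv_text = "sample,analyte,intensity\nS1,GENE_A,12.5\n"
json_text = '[{"id": "S2", "target": "GENE_A", "signal": 7}]'
merged = load("csv", csv_text) + load("json", json_text)
```

Once every source is reduced to the same record type, downstream software can be written once against that record instead of once per format, at the cost of maintaining one adapter for each format.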
As biology and related disciplines begin to complement traditional laboratory research with information-based science, new opportunities exist for technology providers to improve, speed up and lower the cost of life science discovery and development programmes. For example, the ability to combine genomic, proteomic, metabolic and other biological information with validated assays that address the full continuum of the drug-discovery and development processes will help speed the development of safer, more effective and better-targeted treatments for disease.
Bioinformatics is very successful when it comes to combining all the different elements relating to laboratory research. It begins when a scientist embarking on research, for example on a certain disease, consults the Celera Discovery System online platform (CDS) to find information already linked to this disease. The CDS is an excellent resource*, pulling together regularly updated information from proprietary databases as well as more than thirty public and third-party databases into one convenient location. The CDS will deliver workflows to help scientists increase the value of their experiments. Results from functional proteomics and gene expression experiments can be related to biological functions using the on-line Panther protein classification system.
Finally, having found a potential target, the researcher may use the direct link to the Applied Biosystems on-line store to order validated assays using Assays-on-Demand products. The advantage of ready-made assays is that they save laboratories a great deal of time. Rather than having to be created in the laboratory, the assays come in a ready-to-use format, fully validated and checked. There are also a number of direct links to partner companies for consumables, reagents and anything else a researcher might need, all in one central place. It is at this stage that data management comes into play, when the experiments are done and all the data created needs to be stored and properly managed. Results create more knowledge about this particular disease and a researcher can then go back to query CDS again to validate what has been found or start again with the next logical target.
When you want to integrate data, the first step is to make it clear what you want to integrate and why. Perhaps integration isn’t always the answer. An example of how integration can work effectively is demonstrated in figure 2 using the analogy of a car dashboard. Sitting behind the dashboard of a car you have control of lots of different technologies in front of you. If you push the pedal, the combustion engine goes faster. If you press a button, you turn on the windscreen wiper. Another pedal activates the brakes and another button turns on the radio. You might want to integrate your windscreen wiper with a rain sensor and your combustion engine with cruise control, but this integration is very directed and it would be totally impractical to integrate every component with others. An effective data management solution will allow you to literally drive the experiment according to the conditions prevailing at the time. Information and controls are pre-selected and a central controlling influence is present throughout the time you are occupied in the specific research required.
Figure 3 shows how a Laboratory Information Management System (LIMS) can make a difference because at every stage the data is being entered into the LIMS database. Direct links to the CDS may be available and integrated in the overall workflow. Due to the integration, there is no need to manually document all stages, allowing researchers to do more laboratory work and less administrative paperwork. All the data is stored in a centralised repository and is available to anyone with the authorisation to access it, at any time and from any place connected to the network. For projects that are shared between different establishments and even different countries, this kind of system is essential. Consortiums like this are now quite common, where each arm is responsible for a different part of the project but all the information is entered into and is accessible from the same database.
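The idea of a centralised repository with automatic documentation and authorised access can be sketched in a few lines of Python. This is a toy illustration only, not a description of any actual LIMS product: the `MiniLims` class, its table layout, the user names and the stage names are all hypothetical, and an in-memory SQLite database stands in for a networked server.

```python
import sqlite3
from datetime import datetime, timezone


class MiniLims:
    """Toy central repository: every workflow stage is logged as it happens."""

    def __init__(self, authorised_users):
        # In-memory database as a stand-in for a shared, networked server.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE entries ("
            "project TEXT, stage TEXT, payload TEXT, user TEXT, recorded_at TEXT)"
        )
        self.authorised = set(authorised_users)

    def _check(self, user):
        # Only authorised users may read from or write to the repository.
        if user not in self.authorised:
            raise PermissionError(f"{user} may not access this repository")

    def record(self, user, project, stage, payload):
        """Store one workflow stage together with who entered it and when."""
        self._check(user)
        self.db.execute(
            "INSERT INTO entries VALUES (?, ?, ?, ?, ?)",
            (project, stage, payload, user, datetime.now(timezone.utc).isoformat()),
        )
        self.db.commit()

    def history(self, user, project):
        """Return (stage, payload, user) rows for a project in entry order."""
        self._check(user)
        cur = self.db.execute(
            "SELECT stage, payload, user FROM entries "
            "WHERE project = ? ORDER BY rowid",
            (project,),
        )
        return cur.fetchall()


# Two hypothetical sites of a consortium writing to the same repository.
lims = MiniLims(authorised_users={"site_a", "site_b"})
lims.record("site_a", "P1", "sample_prep", "batch 7 prepared")
lims.record("site_b", "P1", "assay", "plate 3 measured")
trail = lims.history("site_a", "P1")
```

Because each stage is written at the moment it happens, the trail for a project can later be read back in order by any authorised participant, which is the property that traceability and shared projects depend on.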

Conclusion
Applied Biosystems has a legacy rooted in developing innovative technologies that become industry standard platforms for life science research. It is uniquely positioned to provide large-scale, integrated solutions to customers interested in performing Integrated Science.

*The CDS is unique in the industry as it includes proteomic information as well as genomic.

References
[1] G. Crocker et al., "Endurance: The European Biotechnology Report 2003," Ernst & Young LLP, London, May 2003.
[2] Pharmaceutical Research and Manufacturers of America, Annual Report 2002-2003.

