Over the past decade, clinical bioinformatics has become one of the fastest growing areas of research and development within the healthcare environment, and the bioinformatician has become an integral member of the research laboratory. In particular, clinical bioinformatics aims to address the challenges in the diagnosis, prognosis, and therapy of patients with cancer and with neurodegenerative (e.g. ALS, Alzheimer’s and Parkinson’s disease), allergic (e.g. asthma), and psychiatric (e.g. depression) disorders, amongst others.
In 1970, Ben Hesper and Paulien Hogeweg coined the term bioinformatics to refer to “the study of information processes in biotic systems”. Bioinformatics was thereby placed as a field parallel to biochemistry and biophysics.1 Since then, the digital world has expanded and the definition of bioinformatics has taken on a whole new meaning. It now combines biology, computer science, engineering, mathematics, and statistics to decipher biological data and make sense of it in translational research.
The advent of high-throughput, or next generation, sequencing (NGS) has accelerated the rate at which genes and co-regulated gene networks are discovered. A vast amount of data is now available, in particular since the completion of the Human Genome Project in 2003. Together, these data are being used to understand, and ultimately to modulate, disease outcome, predisposition, and progression.2 For this reason, clinical bioinformatics has become one of the fastest growing areas of development within the healthcare environment. It is an important component in laboratories that generate and interpret data from molecular genetic testing. Overall, the aim of clinical bioinformatics is to address the challenges in the initial diagnosis, prognosis, and therapy of patients3 with diseases such as cancer, neurodegenerative and psychiatric disorders, amongst others.
In clinical medicine, it has become apparent that new and more advanced bioinformatics methodologies are needed to address the specific problem of cancer.4 For cancer bioinformatics to be effective, its tools must concentrate on the communication, metabolism, proliferation, and signalling of the disease. In particular, cancer bioinformatics is expected to have a significant role in the identification and validation of biomarkers. One strategy, for example, is to evaluate and monitor biomarkers at different stages and time points during cancer development. These markers, known as dynamic network biomarkers, should be compared with clinical informatics data, such as patient complaints, history, symptoms, and therapies. In addition, they should correlate with biochemical analyses, imaging profiles, pathologies, physicians’ examinations, and other measurements.5
For instance, through a genetic screen of hepatocellular carcinoma, Sawey et al.6 discovered that a common alteration in liver cancer (11q13.3 amplification) activates fibroblast growth factor 19 (FGF19), a hormone that regulates bile acid synthesis and affects glucose and lipid metabolism. Subsequent analysis with mouse models and RNAi showed that activation of FGF19 results in selective responsiveness to FGF19 inhibition. Sawey et al. therefore propose that the 11q13.3 amplification be used as a biomarker for patients who are likely to respond to anti-FGF19 therapies. In a somewhat similar approach, Baert-Desurmont et al.7 revealed that a combination of single nucleotide polymorphisms (SNPs at 8q23, 15q13 and 18q21) could explain an increased risk of colorectal cancer.
Genome-wide screening methods have also identified aberrant expression profiles of microRNAs (miRNAs) in human cancers, revealing their potential as diagnostic and prognostic biomarkers.8 Bioinformatics approaches are fundamental to inferring the regulatory processes in which miRNAs take part. For example, Laczny et al.9 developed a comprehensive and integrative tool, called miRTrail, to generate reliable and robust data on deregulated pathogenic processes, which could offer insights into the interactions between genes and miRNAs. The use of miRTrail on melanoma samples demonstrated how this platform could open new avenues for investigating a wide range of diseases, including cancer.
In clinical practice and medical research, medical image processing facilitates the accurate initial detection and diagnosis of cancer. Indeed, medical imaging (imaging in clinical pathology, nuclear magnetic resonance imaging, positron emission tomography, and ultrasonic computed tomography) is one of the most important factors in the application of cancer bioinformatics. Kimori,10 for instance, used a mathematical morphology-based approach to enhance the fine features of a lesion while strongly suppressing the surrounding tissues. The effectiveness of the method was evaluated in terms of the contrast improvement ratio on three kinds of medical images: a chest radiographic image, a mammographic image, and a retinal image.
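To make the idea of morphology-based enhancement concrete, the sketch below implements a generic white top-hat transform (the image minus its morphological opening), a standard operation that keeps small bright structures and suppresses the smoother background. It is an illustration only and not the specific method evaluated by Kimori; the function names, the square structuring element, and the toy image are assumptions introduced here.

```python
def _window_reduce(image, radius, reduce_fn):
    """Apply reduce_fn (min or max) over a square window around each pixel.

    The window is clipped at the image border (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(reduce_fn(values))
        out.append(out_row)
    return out


def erode(image, radius):
    """Greyscale erosion: each pixel becomes the minimum of its neighbourhood."""
    return _window_reduce(image, radius, min)


def dilate(image, radius):
    """Greyscale dilation: each pixel becomes the maximum of its neighbourhood."""
    return _window_reduce(image, radius, max)


def opening(image, radius):
    """Erosion followed by dilation; removes bright details smaller than the window."""
    return dilate(erode(image, radius), radius)


def white_top_hat(image, radius):
    """Image minus its opening: what remains are the small bright features."""
    opened = opening(image, radius)
    return [
        [pixel - background for pixel, background in zip(img_row, open_row)]
        for img_row, open_row in zip(image, opened)
    ]


if __name__ == "__main__":
    # Toy 5x5 image: flat background of 10 with one bright pixel (made-up values).
    toy = [[10] * 5 for _ in range(5)]
    toy[2][2] = 50
    enhanced = white_top_hat(toy, radius=1)
    print(enhanced[2][2])  # 40: the bright pixel stands out, background becomes 0
```

In practice a vetted image-processing library would be used rather than nested Python loops; the pure-Python version is shown only so that each step of the transform is visible.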
Overall, the aim of cancer bioinformatics is to continue developing tools so that the right treatment is provided to the right patient at the right time, based on the characteristics of each patient’s tumour; in other words, tailored bioinformatics.
The economic and societal costs of neurodegenerative diseases are rising rapidly, and there is a pressing demand for new solutions.11 Progress in this area, however, has proved challenging. In part, this is because the causes of diseases such as Alzheimer’s disease (AD) and Parkinson’s disease (PD) are not known, which makes them difficult to understand.12 In addition, while understanding these diseases at the molecular level could lead to better biomarkers and treatments, the enormous amount of data involved makes this an arduous task. For this reason, bioinformatics approaches are used to manage data from high-throughput technologies, pushing forward the frontiers of this field.
Late-onset AD and PD both have an obvious genetic component. Their genetic architecture is complex, however, with only a few consistently associated risk factors. It is therefore possible that undiscovered AD- and PD-related genes exist. Using biomedical text mining, Kim et al.13 were able to pinpoint genes that have a direct relationship with both neurodegenerative diseases. In another approach, Hofmann-Apitius et al.12 developed a bioinformatics and modelling method based on publicly available patient data. This work was driven by AETIONOMY, a public-private partnership between the European Union and the pharmaceutical industry association EFPIA.
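As a rough illustration of how text mining can surface genes linked to more than one disease, the sketch below counts how often gene symbols and disease terms are mentioned in the same document, then keeps the genes that co-occur with every disease of interest. It is a deliberately simplified co-occurrence count and not the pipeline used by Kim et al.; the function names, the placeholder gene symbols (GENE1 to GENE3), and the toy sentences are all invented for this example and carry no biological meaning.

```python
import re
from collections import defaultdict


def cooccurrence_counts(documents, genes, diseases):
    """Count, for each (gene, disease) pair, the documents that mention both."""
    counts = defaultdict(int)
    for text in documents:
        # Crude tokenisation: upper-cased alphanumeric words. Upper-casing means an
        # ordinary word such as "ad" would also match the term "AD"; a real system
        # would use proper named-entity recognition instead.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        for gene in genes:
            if gene.upper() not in tokens:
                continue
            for disease in diseases:
                if disease.upper() in tokens:
                    counts[(gene, disease)] += 1
    return dict(counts)


def genes_linked_to_all(counts, genes, diseases, min_docs=1):
    """Return genes that co-occur with every disease in at least min_docs documents."""
    return [
        gene
        for gene in genes
        if all(counts.get((gene, disease), 0) >= min_docs for disease in diseases)
    ]


if __name__ == "__main__":
    # Invented placeholder sentences, not statements from the literature.
    toy_documents = [
        "GENE1 was discussed in an AD cohort.",
        "GENE1 and GENE2 were discussed in a PD cohort.",
        "GENE3 was discussed in an unrelated context.",
    ]
    toy_genes = ["GENE1", "GENE2", "GENE3"]
    toy_diseases = ["AD", "PD"]

    counts = cooccurrence_counts(toy_documents, toy_genes, toy_diseases)
    print(genes_linked_to_all(counts, toy_genes, toy_diseases))  # ['GENE1']
```

Co-occurrence alone does not establish a biological relationship; it only ranks candidates that then need statistical and experimental follow-up.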
ALS, short for amyotrophic lateral sclerosis, is another neurodegenerative disease, one that affects nerve cells in the brain and the spinal cord. A vast volume of data now exists on this motor neurone disease, with a corresponding need for storage and interpretation. In keeping with this, Abel et al.14 presented an ALS online bioinformatics database (ALSoD) that combines genotype, phenotype, and geographical information with associated analysis tools. Likewise, PRO-MINE (PROtein Mutations In NEurodegeneration)15 is a database describing all TDP-43 disease mutations identified to date; TDP-43 is a multifunctional RNA-binding protein found in AD, ALS, and frontotemporal lobar degeneration.
Allergic and Psychiatric Disorders
In 2008, TIME magazine named 23andMe the invention of the year. 23andMe provides a home-based saliva collection kit that decodes the genomic DNA of adults and interprets their genetic health risks, with results accessible online. In particular, it tests for ten diseases, including AD, PD, and some rare blood diseases. It is important to note that the 23andMe kit indicates whether an individual has a higher risk of developing a disease; it is not intended to diagnose disease. It is meant to provide information that can be used to inform life decisions.
Using the 23andMe gene pool, Hyde et al.16 discovered 15 genetic loci associated with a risk of major depression in people of European descent. In a similar approach, genome-wide analyses of personality traits identified six loci with correlations to psychiatric disorders.17 In addition, through a multi-trait analysis of genome-wide association studies, Turley et al.18 identified loci for depressive symptoms, neuroticism, and subjective well-being. Using the same 23andMe gene pool, scientists have also discovered that asthma, eczema and hay fever share a genetic origin, in part due to shared genetic risk variants that dysregulate the expression of immune-related genes.19
Overall, clinical bioinformatics is a critical step in discovering and developing new diagnostics and therapies for diseases. Here, we described cancer and neurodegenerative, allergic, and psychiatric disorders, but bioinformatics has been used in other disorders as well, such as acute rejection after renal transplantation20 and lung diseases.21 In addition, bioinformatics has been used in studies of model organisms such as Saccharomyces cerevisiae (yeast), Drosophila melanogaster (flies), and Mus musculus (mice), which in turn shed light on non-model organisms such as humans.
It is evident that bioinformatics will continue to push the boundaries of medicine and shape clinical testing for the future. Just like microscopes, computers have become a requirement, and the bioinformatician is now integral both to research laboratories and to the clinical setting. In the future, success will depend on improved analytics, annotations, software to deliver this information, and systems to capture the realised knowledge.22
1. Hogeweg P. The roots of bioinformatics in theoretical biology. PLoS Comput Biol 2011; 7(3):e1002021.
2. Guffanti A, Simchovitz A, Soreq H. Emerging Bioinformatics Approaches for Analysis of NGS-Derived Coding and Non-Coding RNAs in Neurodegenerative Diseases. Front Cell Neurosci 2014; 8:89.
3. Wang X, Liotta L. Bioinformatics: A New Emerging Science. Journal of Clinical Bioinformatics 2011; 1(1):1.
4. Wu D, Rice CM, Wang X. Cancer Bioinformatics: A New Approach to Systems Clinical Medicine. BMC Bioinformatics 2012; 13:71.
5. Wang X. Role of Clinical Bioinformatics in the Development of Network-Based Biomarkers. Journal of Clinical Bioinformatics 2011; 1:28.
6. Sawey ET, Chanrion M, Cai C, et al. Identification of a Therapeutic Strategy Targeting Amplified FGF19 in Liver Cancer by Oncogenomic Screening. Cancer Cell 2011; 19(3):347-58.
7. Baert-Desurmont S, Charbonnier F, Houivet E, et al. Clinical Relevance of 8q23, 15q13 and 18q21 SNP Genotyping to Evaluate Colorectal Cancer Risk. Eur J Hum Genet 2016; 24:99-105.
8. Lan H, Lu H, Wang X, Jin H. MicroRNAs as Potential Biomarkers in Cancer: Opportunities and Challenges. Biomed Res Int 2015; 2015:125094.
9. Laczny C, Leidinger P, Haas J, et al. miRTrail - a Comprehensive Webserver for Analyzing Gene and miRNA Patterns to Enhance the Understanding of Regulatory Mechanisms in Diseases. BMC Bioinformatics 2012; 13:36.
10. Kimori Y. Mathematical Morphology-Based Approach to the Enhancement of Morphological Features in Medical Images. J Clin Bioinforma 2011; 1:33.
11. Paananen J. Bioinformatics in the Identification of Novel Targets and Pathways in Neurodegenerative Diseases. Current Genetic Medicine Reports 2017; 5:15-21.
12. Hofmann-Apitius M, Ball G, Gebel S, et al. Bioinformatics Mining and Modeling Methods for the Identification of Disease Mechanisms in Neurodegenerative Disorders. Int J Mol Sci 2015; 16(12):29179-206.
13. Kim YH, Beak SH, Charidimou A, Song M. Discovering New Genes in the Pathways of Common Sporadic Neurodegenerative Diseases: A Bioinformatics Approach. J Alzheimers Dis 2016; 51(1):293-312.
14. Abel O, Powell JF, Andersen PM, Al-Chalabi A. ALSoD: A User-Friendly Online Bioinformatics Tool for Amyotrophic Lateral Sclerosis Genetics. Hum Mutat 2012; 33(9):1345-51.
15. Pinto S, Vlahovicek K, Buratti E. PRO-MINE: A Bioinformatics Repository and Analytical Tool for TARDBP Mutations. Hum Mutat 2011; 32(1):E1948-58.
16. Hyde CL, Nagle MW, Tian C, et al. Identification of 15 Genetic Loci Associated with Risk of Major Depression in Individuals of European Descent. Nat Genet 2016; 48(9):1031-6.
17. Lo M-T, Hinds DA, Tung JY, et al. Genome-Wide Analyses for Personality Traits Identify Six Genomic Loci and Show Correlations with Psychiatric Disorders. Nat Genet 2017; 49(1):152-156.
18. Turley P, Walters RK, Maghzian O, et al. Multi-Trait Analysis of Genome-Wide Association Summary Statistics Using MTAG. Nat Genet 2018; 50(2):229-237.
19. Ferreira MA, Vonk JM, Baurecht H, et al. Shared Genetic Origin of Asthma, Hay Fever and Eczema Elucidates Allergic Disease Biology. Nat Genet 2017; 49(12):1752-1757.
20. Wu D, Zhu D, Xu M, et al. Analysis of Transcriptional Factors and Regulation Networks in Patients with Acute Renal Allograft Rejection. J Proteome Res 2011; 10(1):175-81.
21. Chen H, Song Z, Qian M, Bai C, Wang X. Selection of Disease-Specific Biomarkers by Integrating Inflammatory Mediators with Clinical Informatics in AECOPD Patients: A Preliminary Study. J Cell Mol Med 2012; 16(6):1286-97.
22. Oliver GR, Hart SN, Klee EW. Bioinformatics for Clinical Next Generation Sequencing. Clin Chem 2015; 61(1):124-35.