Transcriptomics in predictive toxicology


Written by tdarde

March 14, 2022

One of the challenges in toxicology is extrapolating the results of the different phases of risk analysis from experimental systems to human populations. Animal models in particular, although widely used, often differ from humans in substance clearance or enzymatic activity. For these practical reasons, but also for ethical, political, and economic reasons, laboratories are being asked to make major efforts to replace these models with alternatives, to reduce their use to a minimum, and to refine experimental strategies to minimize the stress and pain of the animals (the “3Rs” principle). It is in this context that chemical risk prevention and management organizations have turned to computational toxicology, and more specifically to predictive toxicology. These methods extrapolate known information associated with a molecule to predict the effect of that molecule, or of a similar one, on humans and their environment by determining its toxicological “signature”. This signature can be of various kinds (physiological, molecular, genomic…) and describes the effects on an individual or its descendants after exposure to one or more factors (biological, physical, or chemical).

Transcriptomics data are relevant to several challenges in toxicogenomics. After careful planning of exposure conditions and data preprocessing, toxicogenomics data can be used in predictive toxicology, where more advanced modeling techniques are applied. The volume of molecular profiles produced by omics-based technologies is constantly increasing, together with the number of methods available to facilitate their analysis and interpretation and to generate accurate and stable predictive models.

Benchmark Dose Modelling

One of the main goals of toxicity assessment is the study of exposure-response relationships, which describe the strength of an organism's response as a function of exposure to a stimulus, such as a chemical, after a certain time. These relationships can be described as dose-response curves, with the dose on the x-axis and the response on the y-axis. From these curves, a benchmark dose (BMD) is calculated as the dose (or concentration) that produces a given amount of change in the response rate of an adverse effect; that predefined change is called the benchmark response (BMR). In recent years, dose-response studies have been integrated with microarray technologies, introducing gene expression as an additional important outcome related to the dose. Genes whose expression changes with the dose are of particular interest, since they provide insights into efficacy, toxicity, and many other phenotypes. A specific challenge is to identify genes whose expression level changes with the dose in a non-random manner, which makes them potential biomarkers. A classic BMD modeling pipeline fits the experimental data to a selection of mathematical models, such as linear, second- or third-degree polynomial, exponential, Hill, asymptotic regression, or Michaelis–Menten models. The best model is then selected using a goodness-of-fit criterion, such as the Akaike information criterion (AIC) or the goodness-of-fit p-value. Finally, the selected model is used to predict the dose (BMD) that corresponds to the predefined BMR.
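The pipeline described above (fit several models, keep the one with the lowest AIC, then solve for the dose that reaches the BMR) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular BMD tool: only linear and quadratic models are compared, the BMR is assumed to be a 10% relative change from the predicted control response, and all function names and data are hypothetical.

```python
import math

def fit_polynomial(doses, responses, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*d + ..."""
    n = degree + 1
    # Normal equations A c = b built from power sums of the doses.
    A = [[sum(d ** (i + j) for d in doses) for j in range(n)] for i in range(n)]
    b = [sum(r * d ** i for d, r in zip(doses, responses)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef

def predict(coef, dose):
    """Evaluate the fitted polynomial at a given dose."""
    return sum(c * dose ** i for i, c in enumerate(coef))

def aic(coef, doses, responses):
    """AIC for a Gaussian least-squares fit (up to an additive constant)."""
    n = len(doses)
    rss = sum((r - predict(coef, d)) ** 2 for d, r in zip(doses, responses))
    k = len(coef) + 1  # coefficients plus the residual variance
    return n * math.log(rss / n) + 2 * k

def benchmark_dose(doses, responses, bmr=0.10):
    """Pick the lowest-AIC model and return (model name, coefficients, BMD).

    Assumption: the BMR is a relative change from the predicted control
    response (dose 0). BMD is None if the BMR is not reached in the tested range.
    """
    degrees = {"linear": 1, "quadratic": 2}
    fits = {name: fit_polynomial(doses, responses, deg) for name, deg in degrees.items()}
    best = min(fits, key=lambda name: aic(fits[name], doses, responses))
    coef = fits[best]
    target = predict(coef, 0.0) * (1.0 + bmr)
    lo, hi = 0.0, float(max(doses))
    if (predict(coef, lo) - target) * (predict(coef, hi) - target) > 0:
        return best, coef, None
    # Bisection on predict(dose) - target over the tested dose range.
    for _ in range(60):
        mid = (lo + hi) / 2
        if (predict(coef, lo) - target) * (predict(coef, mid) - target) <= 0:
            hi = mid
        else:
            lo = mid
    return best, coef, (lo + hi) / 2

# Hypothetical expression values of one gene across five doses.
model, coefficients, bmd = benchmark_dose([0, 1, 3, 10, 30], [1.0, 1.1, 1.25, 1.9, 3.1])
print(model, bmd)
```

In a transcriptomic setting this function would be applied gene by gene, and the resulting per-gene BMD values could then be summarized. Nonlinear models such as Hill or exponential would need a numerical optimizer, which is left out here to keep the sketch dependency-free.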

Gene Co-Expression Network Analysis

Gene co-expression network analysis is a systems biology method used to describe the correlation patterns among genes across different experimental samples. It makes it possible to represent, investigate, and understand the complex molecular interactions within the exposed system. The genes and their interactions are represented as a network (or graph) in which the genes are the nodes and the strength of their similarity is represented as weighted edges between the nodes. Understanding the nature of cellular processes requires studying the behavior of genes holistically rather than one at a time. The inference of gene co-expression networks is therefore a powerful tool for better understanding gene functions, biological processes, and complex disease mechanisms. Co-expression network analysis has been widely used to identify which genes are highly co-expressed within certain biological processes or differentially expressed in various conditions. Co-expression networks are also used for candidate disease-related gene prioritization, functional gene annotation, and identification of regulatory genes.
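As a rough illustration of the network representation described above, the sketch below builds a weighted edge list from pairwise Pearson correlations between gene expression vectors. Several choices here are assumptions rather than part of the original text: the edge weight is the absolute correlation (an unsigned network), edges are kept with a hard threshold of 0.8, and the gene names and expression values are invented.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant gene carries no co-expression signal
    return cov / (sd_x * sd_y)

def coexpression_network(expression, threshold=0.8):
    """Return {(gene_a, gene_b): weight} for gene pairs with |r| >= threshold."""
    edges = {}
    for gene_a, gene_b in combinations(sorted(expression), 2):
        weight = abs(pearson(expression[gene_a], expression[gene_b]))
        if weight >= threshold:
            edges[(gene_a, gene_b)] = weight
    return edges

# Hypothetical expression values for three genes across five samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
network = coexpression_network(expression)
print(network)
```

With real data, the resulting edge list would typically be passed to a graph library to detect modules of tightly connected genes. Dedicated methods often use a soft threshold instead of the hard cut-off assumed here.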


Read-Across

The main assumption of read-across studies is that structurally similar compounds are likely to share a similar toxicological profile. These approaches are used to fill toxicological data gaps by relating a compound to similar chemicals for which test data are available. Traditional read-across studies rely only on the similarity between the chemical structures of the compounds. Different measures have been proposed to compute chemical structure similarity, and multiple read-across tools, mainly based on the nearest neighbor algorithm, have been developed. However, these approaches are limited by the fact that chemistry alone cannot explain the complex biological processes that are activated by substance exposure. Toxicogenomics datasets, such as DrugMatrix, Connectivity Map (CMAP), and LINCS L1000, can be used to profile the biological fingerprint of multiple chemicals and allow the measured compound to be compared with a huge number of tested chemicals at the transcriptomic level. The corresponding assumption for biology-based read-across is that if two chemicals have similar biological profiles, they have a similar adverse outcome. Biology-based read-across could therefore complement structure-based read-across.
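The nearest neighbor idea behind read-across can be sketched as follows, here applied to transcriptomic profiles rather than chemical structures. This is an illustrative sketch only: the cosine similarity measure, the majority vote over the k nearest reference compounds, and all compound names, profiles, and labels are assumptions and do not come from DrugMatrix, CMAP, or LINCS.

```python
import math
from collections import Counter

def cosine_similarity(u, v):
    """Cosine similarity between two expression signatures."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def read_across(query_profile, reference, k=2):
    """Predict a label for the query from its k most similar reference compounds.

    `reference` maps compound name -> (profile, known outcome label).
    Returns (predicted label, names of the neighbours used).
    """
    ranked = sorted(
        reference,
        key=lambda name: cosine_similarity(query_profile, reference[name][0]),
        reverse=True,
    )
    neighbours = ranked[:k]
    votes = Counter(reference[name][1] for name in neighbours)
    return votes.most_common(1)[0][0], neighbours

# Hypothetical log fold-change signatures over four genes.
reference = {
    "compound_A": ([2.0, -1.0, 0.5, 1.5], "hepatotoxic"),
    "compound_B": ([1.8, -0.8, 0.4, 1.2], "hepatotoxic"),
    "compound_C": ([-1.0, 2.0, -0.5, -1.5], "non-toxic"),
}
label, neighbours = read_across([1.9, -0.9, 0.6, 1.4], reference)
print(label, neighbours)
```

The same function could be run a second time with structural fingerprints in place of expression profiles, which is one simple way to compare or combine the two kinds of read-across.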

Adverse Outcome Pathways

An adverse outcome pathway (AOP) is a conceptual framework that couples existing knowledge on the links between a molecular initiating event (MIE), such as contact of a nanomaterial with Toll-like receptors on the cell surface, the activation of a chain of causally relevant biological processes or key events (KEs), e.g., the production of inflammatory cytokines, and the resulting adverse outcome (AO) at the level of the organ or the organism (e.g., lung fibrosis). Coupling gene expression profiling with the bioinformatics-driven placement of the results into AOP descriptions has the potential to enable quantitative analysis of adverse effects that combines in vitro-derived mechanistic analyses with causally relevant modes of action and related key events. As AOPs can span different cell types, numerous in vitro assays may need to be associated with a single AOP. The details of the coupling are still being worked out by the community, but mapping the results of pathway analyses to KEs is a simple alternative. For example, if the bioinformatics results cover all of the KEs in the chain leading up to an AO, then the AOP could be considered active. The point-of-departure concentration might be defined as the lowest concentration at which all of the KEs are activated. Naturally, this depends on the model system used and its limitations.
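The simple mapping described above can be expressed directly in code. The sketch below assumes that pathway analysis has already produced, for each tested concentration, the set of key events considered activated; the key event names, concentrations, and function name are hypothetical.

```python
def point_of_departure(aop_key_events, activated_by_concentration):
    """Return the lowest concentration at which every KE of the AOP is activated.

    `activated_by_concentration` maps concentration -> set of activated KEs.
    Returns None if no tested concentration activates the whole chain.
    """
    required = set(aop_key_events)
    for concentration in sorted(activated_by_concentration):
        if required <= activated_by_concentration[concentration]:
            return concentration
    return None

# Hypothetical AOP chain and pathway-analysis results (concentrations in µg/mL).
aop = ["TLR binding", "cytokine production", "fibroblast activation"]
activated = {
    1.0: {"TLR binding"},
    10.0: {"TLR binding", "cytokine production"},
    100.0: {"TLR binding", "cytokine production", "fibroblast activation"},
}
print(point_of_departure(aop, activated))
```

In this invented example only the highest concentration covers all three key events, so it would be the point of departure; an AOP with no fully covered concentration would be reported as not active.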

SciLicium offers flexible, targeted integrated toxicogenomics solutions to meet your needs at various phases of your study.

Adapted from: Transcriptomics in Toxicogenomics, Part III: Data Modelling for Risk Assessment
