COT/COM/COC Annual Report 2024

Horizon scanning

Last updated: 11 December 2025

2.15          At the June 2024 meeting, Alexander Kalian from King’s College London presented findings from his PhD, supported by the UK Food Standards Agency, which aims to develop AI-driven models to improve the assessment of food-related toxicity. Of interest to COM is the use of such technology to predict mutagenicity. At present, food safety hazard assessments are carried out using experimental, analytical, and computational approaches, but all of these have potential limitations, including scientific validity, ethical considerations, and cost effectiveness. Of the currently available computational approaches, QSAR (quantitative structure-activity relationship) models are widely used to predict the activities of molecules for which no data exist, as they are broad and versatile; however, the models are very data intensive.

2.16          The speaker developed an AI-driven QSAR model, utilising SMILES and deep learning (neural networks), that classified mutagenicity as a binary outcome (YES/NO) with 78% accuracy (checked against Ames data). The model was further developed to use a convolutional neural network approach, which analyses features of images. As molecules are graph-structured data and may not fit easily into image analysis, graph convolutional neural networks (GCN) were developed to achieve this. In addition, the speaker evaluated the use of Explainable AI (XAI) with the model to determine the reasoning behind its mutagenicity predictions and to mine structural alerts. The model (incorporating node enrichment) was used to predict the mutagenicity (YES/NO) of 5625 molecules for which Ames data are available, and the output was compared with that obtained using a language-based transformer model. An accuracy of between 74% and 78% was achieved (depending on the node features used). This corresponds to an AUC (area under the curve) of 85%, which is comparable to other available models; the transformer model had an AUC of 84% (now retrained to give 90%).
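The distinction between the accuracy and AUC figures quoted above can be illustrated with a minimal sketch. The labels and scores below are invented for illustration and are not taken from the presentation; the function names are likewise hypothetical. Accuracy depends on the threshold chosen to turn a score into a YES/NO call, whereas AUC measures how well the scores rank mutagens above non-mutagens across all thresholds.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of molecules whose thresholded score matches the known label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive scores higher than a randomly chosen negative
    (ties count as half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = Ames positive, 0 = Ames negative.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical model outputs (predicted probability of mutagenicity).
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

print("accuracy at 0.5:", accuracy(labels, scores))
print("accuracy at 0.7:", accuracy(labels, scores, threshold=0.7))
print("AUC:", roc_auc(labels, scores))
```

In this toy example the accuracy changes when the threshold is moved while the AUC does not, which is one reason the two measures are usually reported together.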

2.17          When XAI was used to mine structural alerts from the GCN model, an accuracy of 85% was achieved (using a threshold of 0.7). Very similar fragments (mutagenic and non-mutagenic) were identified using the language-based transformer model, but not using the QSARpy model; this requires further investigation. In addition, some of the structural alerts identified did not make complete sense, and this also needs investigation. Before the model can be released for public use, OECD guidelines require its applicability domain to be formally defined, and the model will also need to be applied to different toxicological endpoints.

2.18          During discussions, a member queried the model’s ability to account for positional (steric) hindrance, which may explain why some fragments initially identified as DNA reactive are not in fact so. The speaker replied that many of the fragments identified are very similar and that, while it is theoretically possible to look at positional hindrance, the false positives may also be due to other fragments being present, so the reasons are likely to be multifaceted. A member also asked whether the 3D structure of the molecule was important in determining whether it is DNA reactive. The speaker replied that the model developed here utilised fragments rather than 3D structure. However, there are examples where steric chemistry is important to DNA reactivity, and its influence is often neglected because it is difficult to study and would need a more advanced model. A member of COM made suggestions to the speaker for addressing some of the potential issues with the model.

2.19          The Chair thanked the speaker on behalf of the Committee and concluded that these approaches are not currently used at a regulatory level. However, these tools show how current approaches may be replaced in the near future, and it is important that COM is prepared and understands them.

Presentation from Paul Rees, Swansea University, on Artificial Intelligence and mutagenicity data

2.20          At the June 2024 meeting, Paul Rees from Swansea University gave a presentation on Artificial Intelligence and mutagenicity data. The speaker outlined a case study to show how traditional machine learning is used to evaluate the cell cycle using a set of label-free flow cytometry images. CellProfiler is used to extract the cell features following training of the model (supervised machine learning) with features from annotated images obtained using biomarkers for different stages of the cell cycle. AI models can classify a cell without the addition of cell stains (label free) with an accuracy of around 90%; for some applications it is important that cell biomarkers are not used. In addition, regression analysis has been used to predict DNA content from label-free cell images.

2.21          Paul Rees noted that deep learning (neural networks) is a key concept in AI, but this does not have the same knowledge base as traditional learning. AI and deep learning are built around artificial neurons that form a neural network, with weightings assigned to determine how strongly they are connected. Although these have been around for 60 years, it is only now that computers are fast enough to develop deep learning. A commonly used network is the convolutional neural network, and the speaker outlined how this is used to synthesise an array (matrix) of numbers from the input image to allow matching with matrices from training images. Deep neural networks have been used to score micronucleus images for nine different phenotypes (from mononucleate to tetranucleate) with an accuracy of 96% (compared to human scoring). Label-free detection has also been applied to leukaemia cells, which reduces analysis (diagnostic) time considerably, to look at the change in morphology of red cells on storage, and to classify pollen grains in Arctic ice.
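The step in which a convolutional neural network turns an input image into an array of numbers can be sketched as follows. This is a minimal illustration only, not the speaker's implementation: the 4 x 4 image and the 2 x 2 kernel are invented, and the function name `convolve2d` is hypothetical. In a real network the kernel values are learned during training rather than chosen by hand, and many kernels are applied in successive layers.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return the
    resulting feature map. As in most deep learning libraries, the kernel is
    not flipped, so strictly this is a cross-correlation."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4 x 4 greyscale image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-chosen kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is largest in the column where the brightness changes, showing how a kernel converts a local image pattern into a number that later layers can use.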

2.22          Another type of neural network performs object detection, and this has been used to identify binucleated cells with micronuclei with 100% accuracy, following minimal training (175 images of binucleated cells with micronuclei). Without retraining, the system detected tetranucleated and trinucleated cells, with and without micronuclei, with an accuracy of 90%. Other developments include the evaluation of cell painting, an unbiased cell profiling method, to detect genotoxic events in cells. The greatest use of the technique has been in drug discovery, but it has now been applied to look for genotoxic changes. Detection of micronuclei, γH2AX foci, fragmented nuclei etc. was achieved using CellProfiler (previously trained) in the same Cell Painting pipeline. This has important advantages, as very large, freely available datasets for chemical structure, imaging and gene expression have been developed using cell painting, and these will now be available as a resource to support future work.

2.23          During discussions, a COM member asked how independent the variables in the model are and whether additional ones can be added easily. The speaker replied that it is not necessary to start from scratch: as the variables are independent, new ones can simply be introduced. A member also asked how to ensure that the available classifiers have been validated. The speaker replied that expert scientists need to produce annotated datasets, so that known valid sets are available to use. It is also possible that, in the future, the datasets will need to be regulated to support regulatory submissions in which these data are used. A comment was made that there is likely to be an OECD guideline for using AI for genotoxicity assessment in the future. A point of clarification was also given that, at present, Cell Painting data are only being used at the early stage of drug discovery and are not being seen by regulators. A member also asked what level of accuracy has been obtained with the deep learning approach, and the speaker replied that it has not been taken past the 90% obtained with machine learning, as the availability of images to develop a classification set is limited at the moment.

OECD Test Guideline Programme

2.24       At the COM February 2024 meeting, members were informed that the United Kingdom, along with several other countries, would submit a Standard Project Submission Form (SPSF) to the OECD to propose adaptations to the test guideline for the in vitro micronucleus assay (TG487). These adaptations would update the considerations necessary for evaluating nanomaterials. An independent interlaboratory trial is required to achieve this and is the primary focus of the SPSF project. The trial would aim to provide the data and evidence needed to facilitate the adaptation of TG487 for nanomaterials. The proposal was scheduled to be presented at the April OECD meeting, primarily for sign-off, and the project was anticipated to commence in early summer.

2.25       In June 2024, members were informed that data generation for this project was in progress and that the Committee could expect to hear more about it at upcoming meetings.