Hello. I'm Vladimir from Cortex Discovery. We are a startup based in Munich with a very narrow area of focus: providing extremely accurate machine learning models for the hit discovery and lead optimisation phases of drug discovery.
So why would we even focus on this task? The issue is that, traditionally, structure-activity models were always trained to do a single task, which means predicting activities for one specific assay, one specific modality, and so on.
We think that, using modern deep learning, we can break a lot of accuracy barriers by training models on all of the publicly available data and adding proprietary data to that. This is what allows us to go beyond the accuracy thresholds of traditional quantitative structure-activity relationship (QSAR) models and reach accuracies that match experimental reproducibility.
So a foundation model is just a deep learning model that is trained to predict activities on all the available assays at the same time. This includes binary activity assays, regression assays, and dose-response assays, and it covers molecular assays, cellular assays, and, in some cases, even whole-organism measurements.
So rather than relying on a set of features defined by hand, which means by humans, by chemists, we think the best representation of an arbitrary molecule can be derived by a deep learning system, encoding all the information that makes the molecule active or inactive across a wide range of tests.
So the platform is a very big neural network, which ingests millions of separate molecules over thousands of separate assays to derive the most efficient compressed representation: a multi-dimensional vector that encodes all the information related to a molecule's activity, which automatically and implicitly involves the [unclear] of any given molecule, its structure, its dissociation parameters, its physicochemical parameters, and so on.
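To make that idea concrete, here is a minimal sketch of this kind of multi-task setup, written in PyTorch. The fingerprint input, the layer sizes, the assay embedding, and the per-modality heads are illustrative assumptions on my part, not the actual Cortex Discovery platform.

```python
# Minimal sketch of a multi-task "foundation" model for assay activity.
# Assumptions (not from the talk): molecules enter as fixed-length
# fingerprints, and the assay identity is supplied as a learned embedding.
import torch
import torch.nn as nn

class AssayFoundationModel(nn.Module):
    def __init__(self, fp_dim=2048, n_assays=5000, latent_dim=256):
        super().__init__()
        # Shared molecule encoder: compresses the input into the
        # multi-dimensional latent vector described above.
        self.encoder = nn.Sequential(
            nn.Linear(fp_dim, 1024), nn.ReLU(),
            nn.Linear(1024, latent_dim), nn.ReLU(),
        )
        # One embedding per assay, so the same latent vector can be
        # interpreted differently depending on the assay in question.
        self.assay_emb = nn.Embedding(n_assays, latent_dim)
        # Heads for different measurement modalities.
        self.binary_head = nn.Linear(2 * latent_dim, 1)      # active / inactive
        self.regression_head = nn.Linear(2 * latent_dim, 1)  # e.g. a pIC50 value

    def forward(self, fingerprints, assay_ids):
        z_mol = self.encoder(fingerprints)
        z_assay = self.assay_emb(assay_ids)
        joint = torch.cat([z_mol, z_assay], dim=-1)
        return {
            "binary_logit": self.binary_head(joint).squeeze(-1),
            "regression": self.regression_head(joint).squeeze(-1),
        }

model = AssayFoundationModel()
fps = torch.rand(8, 2048)              # batch of 8 molecules
assays = torch.randint(0, 5000, (8,))  # which assay each row belongs to
out = model(fps, assays)
print(out["binary_logit"].shape, out["regression"].shape)
```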
The learning system is based on a large database assembled from both public and proprietary data, which currently has more than 5,000 assays, more than three and a half million molecules, and 750 million data points.
Why would this even be accurate? First of all, the model is trained on very large amounts of data. It learns patterns that relate activities on certain assays to activities on completely different assays. It goes beyond the static fingerprints that are traditionally used to build quantitative structure-activity relationship models, and for every molecule, the interpretation of its latent feature space depends on the assay in question.
So the benefits of this approach are, first, the context-aware representation, where the nature of the assay being investigated determines how the molecular features will be interpreted. It's arguable that this applies to standard models as well, but we think it applies much more strongly in our case. And second, the multi-modality of the models, in the sense that they learn how to best learn from binary data, continuous regression data, dose-response curves, and so on.
And it learns robustness to noise: because physical HTS assays are run under different conditions, the model can learn how to best reconcile the differences between similar assays, or between assays done on the same target, and so on.
So the model doesn't just learn what binds to what in a specific assay context. It learns how biological systems respond to certain chemicals.
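As a sketch of what training on the mixed modalities mentioned above can look like: each data point carries a label and a flag saying which kind of measurement it is, and only the matching loss term is applied. This is a generic illustration with made-up field names, not our production training code.

```python
# Sketch of a mixed-modality loss: binary activity calls use a
# cross-entropy term, continuous readouts use a regression term.
import torch
import torch.nn.functional as F

def multitask_loss(outputs, labels, modality):
    """Combine losses from different assay modalities in one batch.
    modality: 0 = binary activity call, 1 = continuous readout (e.g. pIC50)."""
    is_binary = modality == 0
    loss = outputs["binary_logit"].new_zeros(())
    if is_binary.any():
        loss = loss + F.binary_cross_entropy_with_logits(
            outputs["binary_logit"][is_binary], labels[is_binary])
    if (~is_binary).any():
        loss = loss + F.mse_loss(
            outputs["regression"][~is_binary], labels[~is_binary])
    return loss

# Toy batch: two binary activity calls and two continuous measurements.
outputs = {"binary_logit": torch.randn(4), "regression": torch.randn(4)}
labels = torch.tensor([1.0, 0.0, 5.6, 7.2])
modality = torch.tensor([0, 0, 1, 1])
print(multitask_loss(outputs, labels, modality))
```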
So, to give an example of what we mean by experimental-reproducibility-level accuracy: there is a measure called AUC. Very roughly, it asks: if you took two molecules in the library, what is the probability that the model would rank them correctly in terms of activity, one above the other? Typical numbers that we see with our models are in the 95% range, which is pretty much what the experimental reproducibility is.
If you repeat the same experiment on the same assay, then, due to experimental noise, you will get results that differ by approximately this much. So, for example, for cancer cell line toxicity assays, the average AUC scores are in the vicinity of 94 to 95%, which is exactly the same as the experimental reproducibility. One might naturally ask: okay, these are just some numbers in a machine. How does that generalize to real life?
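To illustrate that pairwise definition of AUC, here is a small Python example with made-up scores and labels; it is just the arithmetic behind the metric, not data from our models.

```python
# AUC as described above: the probability that a randomly chosen active
# molecule is scored above a randomly chosen inactive one. Toy numbers only.
import itertools

def pairwise_auc(scores, labels):
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    pairs = list(itertools.product(actives, inactives))
    correct = sum(1.0 if a > i else 0.5 if a == i else 0.0 for a, i in pairs)
    return correct / len(pairs)

scores = [0.9, 0.8, 0.75, 0.4, 0.3, 0.2]   # model scores for six molecules
labels = [1,   1,   0,    1,   0,   0]     # 1 = experimentally active
print(pairwise_auc(scores, labels))        # ~0.889: most pairs ranked correctly
```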
Well, the answer is that it does. We have completed multiple tests with independent partners and multiple projects that demonstrate that the predictive capability of the technology holds up in real life: we make predictions on libraries, rank the hits, and then validate the hits in a wet lab.
Some examples: out of six predicted hits from a library for a COVID-19 antiviral assay, we got five hits with very good potency. For a set of molecules in an oncological context, we achieved a prediction AUC very similar to the assay's reproducibility, based on a cell viability threshold. And a number of new gp41 inhibitors were discovered in the same way.
How does that translate into a service? The value proposition is this: take one small-scale HTS assay, which is around 25,000 to 50,000 molecules.
We have the ability to retain the same accuracy and extrapolate that assay to any purchasable library, or any virtual library, and still get the same quality of results that you would have by actually running the wet-lab high-throughput screening assay on libraries of millions. Another benefit of this approach is that all the assay predictions come at the same time. This means that you don't need to do separate assay measurements for off-target activity, for ADME-tox, or for counter-screens such as fluorescence; this is already included, and after the virtual measurement you can rank the molecules in any way defined by your own criteria for success.
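As a sketch of what that post-hoc ranking can look like on the user's side, here is a small pandas example. The endpoint names (target potency, hERG risk, fluorescence artifact) and the thresholds are hypothetical placeholders for whatever criteria a given project cares about.

```python
# Sketch of ranking a table of virtual-screen predictions by custom criteria.
import pandas as pd

preds = pd.DataFrame({
    "molecule_id": ["M1", "M2", "M3", "M4"],
    "target_potency": [7.8, 8.2, 6.1, 7.9],           # predicted pIC50 on target
    "herg_risk": [0.1, 0.7, 0.2, 0.15],               # predicted off-target liability
    "fluorescence_artifact": [0.05, 0.1, 0.6, 0.08],  # counter-screen prediction
})

# Project-specific success criteria: potent, low hERG risk, no assay artifact.
hits = preds[(preds.target_potency >= 7.0)
             & (preds.herg_risk < 0.3)
             & (preds.fluorescence_artifact < 0.2)]
print(hits.sort_values("target_potency", ascending=False))
```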
Another thing about this approach is that we don't only have the model predict the specific values for a certain assay; we also have the model predict its own measure of how certain it is about the outcome.
So measurements in which the model is very confident are shown as such, with high certainty in the output, and this is well aligned with the actual results you would get from an experimental measurement. And if the model doesn't know what a compound will do on a specific assay, it will [unclear].
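One common way to get such a self-estimated confidence is a head that predicts both a value and its own noise level; the talk does not specify which uncertainty method is actually used, so the sketch below is only a generic illustration of the idea.

```python
# Generic sketch of a prediction head that outputs both a value and its
# own uncertainty (a heteroscedastic Gaussian head). One of several
# possible approaches; not necessarily the one used in the platform.
import torch
import torch.nn as nn

class UncertainHead(nn.Module):
    def __init__(self, latent_dim=256):
        super().__init__()
        self.mean = nn.Linear(latent_dim, 1)
        self.log_var = nn.Linear(latent_dim, 1)  # predicted noise level

    def forward(self, z):
        return self.mean(z).squeeze(-1), self.log_var(z).squeeze(-1)

def gaussian_nll(mean, log_var, target):
    # Penalizes wrong predictions more when the model claims to be certain.
    return (0.5 * torch.exp(-log_var) * (target - mean) ** 2
            + 0.5 * log_var).mean()

head = UncertainHead()
z = torch.randn(8, 256)   # latent vectors from the molecule encoder
target = torch.randn(8)
mean, log_var = head(z)
print(gaussian_nll(mean, log_var, target))
```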
Now, on the services side, we have libraries from the best-known providers available for prediction, organized by type of compound and also by potential approval speed, which means that we can very quickly do predictions, for example, for repurposing-type molecules and nutraceuticals. And this goes all the way up to virtual libraries, where a generative model, guided by the scoring, can explore the chemical space as it sees fit to produce the best synthesizable molecules for a specific project.
Our in-house pipeline is mostly focused on age-related disorders and longevity. Using this approach, we have already found 3 novel NRF2 activators, 4 novel mTOR inhibitors, and a number of DNA repair enhancers. We can also now claim that a compound discovered by this pipeline has been experimentally measured, in vivo, to increase the lifespan of C. elegans by 33%. This is quite a recent result, done in collaboration with Ora Biomedical in the US, and we have a number of collaborations and partners which are already using the technology to save quite a lot of time and expense on HTS.
There is also something new in the pipeline, which is a little bit beyond the scope of this talk, but which I'm very willing to talk about later: what do we do if we don't have available data, such as an existing HTS measurement? We have an approach aimed at achieving free energy perturbation accuracy for scoring binding affinity in a very small fraction of the time. It is based on training an energy function and a reinforcement learning system that proposes docking poses and scores them, while another neural network simultaneously learns how to best modify the force fields used, both to save computational time and to give a better representation of the energy landscape. So, if anybody would like to talk about this, find me after the presentation.
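Since the talk only gestures at this system, the following is a very loose caricature of the idea as described: some proposer generates candidate docking poses and a learned energy model scores them. Everything here (the 6-degree-of-freedom pose vector, the random proposer, the tiny scorer) is an illustrative placeholder, not the actual reinforcement learning or force-field machinery.

```python
# Caricature of "propose poses, score them with a learned energy model".
import torch
import torch.nn as nn

# Placeholder learned scorer over a rigid-body pose (3 translations, 3 rotations).
energy_model = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 1))

def propose_poses(n=128):
    # Placeholder proposer: random poses instead of an RL policy.
    return torch.rand(n, 6)

poses = propose_poses()
energies = energy_model(poses).squeeze(-1)  # lower = better predicted binding
best = poses[energies.argmin()]
print("best scored pose:", best)
```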
But just to reiterate: we would like to present a system that enables extrapolating high-throughput screening, with additional benefits, to massive libraries of millions of molecules, potentially billions if we include virtual libraries, at very low cost and at an accuracy that matches the experiment.
Thank you.