0:00
First of all, we commonly encounter rapid metabolism leading to high clearance and low oral bioavailability that we need to improve.


0:10
Of course, we can also observe the formation of reactive or potentially toxic metabolites.


0:16
We have to take into account the effect of genetic polymorphisms on exposure in certain patient populations and, of course, the potential for drug interactions with co-administered compounds, which can lead to variability in exposure.


0:31
So these lead to a number of common questions relating to drug metabolism.


0:36
So first of all, in a project we want to understand which enzymes are responsible for the metabolism of my compound.


0:44
We want to reduce the risk of unexpected metabolism issues downstream in our discovery and development projects.


0:51
We want to think about reducing that risk of drug-drug interactions as early as possible in our projects.


0:56
So identifying compounds ideally with multiple routes of clearance such as multiple enzyme families or isoforms, so that inhibition of one of those will not necessarily dramatically affect the exposure of our compound.


1:11
Of course, we also need to identify what metabolites are formed from a compound.


1:15
We want to help with that interpretation of those metabolite ID experiments to make them more efficient and more effective.


1:22
And of course, having identified the metabolites, we want to identify if any of them are likely to be active, reactive, or heaven forbid, toxic.


1:30
Again, to eliminate that risk as early as possible.


1:34
When we're moving downstream and beginning to look at preclinical animal studies, we want to confirm which of those species are most likely to give appropriate coverage of the metabolites that will be observed in humans.


1:50
And of course, if a compound that we're interested in or a series that we're interested in is metabolically unstable, we want to guide that design of compounds to improve that metabolic stability, reduce the clearance and improve oral bioavailability.


2:05
So what I thought I would do, since there's not time to go through all of these in detail today, is just pick a few of these challenges and illustrate how in silico models can really help us to address those efficiently and early on in our project.


2:18
So first of all, the first question there, what enzymes are responsible for metabolism of my compounds of interest.


2:27
So I think one of the challenges here is rightly we focus early on in a project on microsomal stability.


2:34
But if we focus purely on that, it can lead to unexpected metabolism by other enzymes when we move forward to late stage studies.


2:43
One example of that, or a couple of examples of that are falnidamol and carbazeran.


2:48
Both of these actually failed in the clinic due to poor oral bioavailability and exposure, despite the fact that they had passed preclinical PK studies in multiple species.


2:59
Ultimately it turned out that this was due to unexpected metabolism by aldehyde oxidase and resulted in wasting millions of dollars in clinical trials.


3:10
One way we can address that is our model we call WhichEnzyme, which identifies the enzyme families most likely to metabolise a compound.


3:18
So if we look at both falnidamol and carbazeran, this sort of pie chart shows us the relative likelihood of metabolism by each of the enzyme families we model.


3:27
And clearly you can see from this sort of maroon section that these are very likely to be extensively metabolised by aldehyde oxidase.


3:37
And of course, if we had identified that early on, we could have prioritised the appropriate experiments, whether we're looking in hepatocytes or at PK studies in appropriate species such as guinea pigs or rhesus monkeys.


3:49
Because in practice these compounds had been studied in rats and dogs, both of which have little or no aldehyde oxidase activity.


3:57
So prioritising the right experiment early on in our projects can save millions of dollars in late stage failures.


4:07
If we look at the risk of drug interactions, one potential source of that risk is if a compound is predominantly metabolised by a single isoform of a single enzyme family, then of course inhibition or induction of that isoform by a co-administered drug can radically change the exposure of that compound.


4:27
A classic example of that is simvastatin, which is contraindicated for co-administration with antifungals such as fluconazole and ketoconazole.


4:36
That's because simvastatin is primarily metabolised by CYP3A4, and fluconazole and ketoconazole are both very potent inhibitors of 3A4, so inhibition of that enzyme dramatically changes the exposure of simvastatin.
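
To put rough numbers on why a single dominant clearance route is a liability, here is a minimal sketch using the standard static model for reversible inhibition, AUC ratio = 1 / (fm / (1 + [I]/Ki) + (1 - fm)). This is textbook pharmacokinetics rather than anything shown in the talk, and the fm values and inhibitor concentration below are hypothetical, not measured values for simvastatin or any azole.

```python
def auc_ratio(fm: float, inhibitor_conc: float, ki: float) -> float:
    """Fold change in victim-drug AUC when one clearance pathway is reversibly inhibited.

    fm             -- fraction of total clearance through the inhibited enzyme (0 to 1)
    inhibitor_conc -- inhibitor concentration at the enzyme (same units as ki)
    ki             -- inhibition constant of the inhibitor for that enzyme
    """
    if not 0.0 <= fm <= 1.0:
        raise ValueError("fm must be between 0 and 1")
    if ki <= 0 or inhibitor_conc < 0:
        raise ValueError("ki must be positive and inhibitor_conc non-negative")
    # Fraction of the pathway's activity that remains in the presence of inhibitor.
    remaining_activity = 1.0 / (1.0 + inhibitor_conc / ki)
    return 1.0 / (fm * remaining_activity + (1.0 - fm))


# Hypothetical scenario: inhibitor present at 10x its Ki.
for fm in (0.95, 0.50):
    print(f"fm = {fm:.2f}: AUC ratio = {auc_ratio(fm, 10.0, 1.0):.2f}")
# fm = 0.95: AUC ratio = 7.33
# fm = 0.50: AUC ratio = 1.83
```

In this toy comparison, a compound cleared 95% by the inhibited isoform sees roughly a sevenfold rise in exposure, while one with half of its clearance through other routes sees less than a twofold rise.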


4:51
And again, if we look at the predictions in our models, you can see first of all it's predicted to be predominantly metabolised by P450.


4:58
And then we have a second model for the P450 isoforms that identifies those isoforms most likely to be responsible for metabolism.


5:06
So here again you can see not only is it predominantly metabolised by P450s, but also specifically by CYP3A4.


5:14
If we see a signal like this early on in our project, what this would suggest is we should prioritise our metabolism phenotyping experiments to validate that and derisk the project early on, and maybe consider alternative routes for optimisation of those compounds.


5:34
And then as we move forward, of course, we want to identify the metabolites of our specific compound of interest.


5:39
And these experiments are of course very time consuming and expensive.


5:43
So in silico methods can also help to address this.


5:47
So if we consider a molecule, here I'm illustrating dextromethorphan and think of all of the possible metabolites that might form in two generations.


5:55
You can see the sort of spider's web where dextromethorphan is in the middle.


5:58
We've got the first generation and second generation of metabolites there.


6:02
There is an enormous number of those, and only a small number are observed in practice.


6:08
You see these 4 green little cards here.


6:12
And so the question is if I'm doing my metabolite ID experiment, many of those will have, you know, the same exact masses.


6:19
How do I actually go fishing in this great big pool of potential metabolites to identify those we are actually observing in practice in our met ID experiments?


6:28
And hence which metabolites will a patient ultimately be exposed to in practice?


6:34
Just in terms of the statistics, looking at all the potential metabolites of course identifies all those that are observed in practice.


6:40
So the sensitivity is 100%, but the precision is 2%, i.e., it's a 50-fold over-prediction of potential metabolites relative to those observed in practice.
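
As a small worked illustration of those two statistics, the sketch below computes sensitivity and precision from sets of metabolite labels. The count of 200 enumerated candidates is an assumption chosen only so that four observed metabolites reproduce the quoted 2% precision; the talk does not give the actual number, and the labels are placeholders.

```python
def sensitivity_and_precision(predicted: set[str], observed: set[str]) -> tuple[float, float]:
    """Return (sensitivity, precision) for a set of predicted metabolites.

    sensitivity = observed metabolites that were predicted / all observed
    precision   = observed metabolites that were predicted / all predicted
    """
    if not predicted or not observed:
        raise ValueError("both sets must be non-empty")
    true_positives = len(predicted & observed)
    return true_positives / len(observed), true_positives / len(predicted)


# Placeholder labels: 4 observed metabolites, all contained in an exhaustive
# enumeration of 200 candidates (the 200 is an assumed, illustrative figure).
observed = {f"M{i}" for i in range(1, 5)}
enumerated = {f"M{i}" for i in range(1, 201)}

sens, prec = sensitivity_and_precision(enumerated, observed)
print(f"sensitivity = {sens:.0%}, precision = {prec:.0%}, "
      f"over-prediction = {1 / prec:.0f}-fold")
# sensitivity = 100%, precision = 2%, over-prediction = 50-fold
```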


6:52
But if we can accurately predict which metabolites are likely to form and our mechanistic approach gives us this greater precision, we can see here a much simpler network.


7:02
There's dextromethorphan, here are the predicted potential metabolites that are most likely.


7:07
And of course of those we see all four of the experimentally observed metabolites.


7:12
So this really helps us to reduce that pool and prioritise our experimental efforts on identifying the correct metabolites in this case.


7:21
So we can save a huge amount of time and effort.


7:23
But also having identified these most likely metabolites, we can in turn make predictions about their properties and their activities in order to identify potential liabilities.


7:37
And then of course, inevitably in our medchem projects, we find this tension very commonly between activity and metabolic stability.


7:44
So how can we actually guide the design of new compounds with better metabolic stability?


7:53
And what you see very frequently is high turnover or intrinsic clearance, or a short half-life, in human liver microsomes, which typically indicates rapid metabolism by P450s.


8:04
And what we can do is identify these hotspots for metabolism.


8:08
And here you can see, for the isoforms that are predicted to metabolise this compound, what the predominant sites of metabolism are for those different isoforms.


8:16
And there are slightly different relationships and ratios there.


8:20
But having identified that, we can think about what sites we need to block.


8:25
But the mechanistic approach that we use that I'll talk about in more detail gives us actually more information than that.


8:31
It allows us to look at the metabolic lability or vulnerability of those sites in absolute terms.


8:36
So how reactive are they to the different pathways catalysed by cytochrome P450s?


8:42
And this is shown by this sort of metabolic landscape.


8:45
So you can see here, although there are two sites that are predominantly observed in these compounds, there are actually three highly labile sites.


8:54
What this suggests is that actually we've got more work to do.


8:57
If we block just the first two sites here, what we're likely to see is metabolic switching and extensive metabolism at this third site.


9:04
So only a small effect on the overall intrinsic clearance, for example.


9:09
So we'd actually have to block all three of those to get down to much more stable sites and have a significant effect on the metabolic stability.


9:17
So how does this actually look in practice?


9:19
Well, this is a case study we published looking at buspirone analogues.


9:24
So buspirone is an anxiolytic.


9:27
It's a 5-HT1A ligand.


9:29
It's got excellent receptor affinity but poor oral bioavailability due to metabolism by CYP3A4.


9:35
And if we look in vitro, the in vitro half-life in this particular experiment was just under 5 minutes.


9:42
Now it's metabolised actually at 3 positions.


9:44
These are the positions that are observed experimentally.


9:47
And so this project, the goal was to maintain the receptor affinity at least within an order of magnitude but substantially improve the half-life.


9:56
In this case, we wanted a greater than threefold improvement in the in vitro half-life with respect to CYP3A4. Cutting a lot of work down very quickly to just a couple of examples:


10:08
Here you can see that the models correctly predict those sites of metabolism, and the metabolic landscape indicates 2 sites particularly that are highly labile.


10:18
We can combine this into a single value, the composite site lability, which is a number between zero and one where a high value indicates a likelihood of rapid metabolism.


10:28
So 0.95 is very high.


10:31
What we were able to do, using this detailed information about the metabolic lability, is think about strategies for modifying and blocking that metabolism.


10:41
Here are a couple of examples, Analogue A and B, where you can see a dramatic improvement in this metabolic landscape eliminating those highly labile sites, significantly improving this composite site lability.


10:53
And indeed what you can see experimentally is that analogue A retained that receptor affinity but achieved a half-life of 78 minutes.


11:01
So it's really overachieving our goal of a threefold improvement, while analogue B achieved a less than 100 nanomolar activity and again very high metabolic stability.
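
For a sense of scale, the arithmetic behind that half-life goal can be sketched as follows. The baseline is taken as 5.0 minutes, which is an upper bound on the "just under 5 minutes" quoted earlier, so the fold improvement shown is a slight underestimate. The first-order rate constant uses the usual relationship k = ln 2 / t½; the function names are illustrative only.

```python
import math

BASELINE_HALF_LIFE_MIN = 5.0  # upper bound for buspirone ("just under 5 minutes")
TARGET_FOLD = 3.0             # project goal: greater than threefold improvement


def fold_improvement(half_life_min: float, baseline_min: float = BASELINE_HALF_LIFE_MIN) -> float:
    """Ratio of an analogue's in vitro half-life to the baseline half-life."""
    return half_life_min / baseline_min


def meets_goal(half_life_min: float) -> bool:
    """True if the analogue exceeds the target fold improvement."""
    return fold_improvement(half_life_min) > TARGET_FOLD


def rate_constant_per_min(half_life_min: float) -> float:
    """First-order depletion rate constant, k = ln(2) / half-life."""
    return math.log(2) / half_life_min


analogue_a_half_life = 78.0  # minutes, as reported in the talk
print(f"fold improvement: {fold_improvement(analogue_a_half_life):.1f}")  # 15.6
print(f"meets >3x goal:   {meets_goal(analogue_a_half_life)}")            # True
print(f"k (baseline):     {rate_constant_per_min(BASELINE_HALF_LIFE_MIN):.3f} per min")
print(f"k (analogue A):   {rate_constant_per_min(analogue_a_half_life):.4f} per min")
```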


11:13
So we were able to find strategies to balance activity with metabolic stability.


11:22
So those are just a few examples of the applications and the value that it can add in addressing some of these big questions and challenges that we face.


11:29
What I thought I would also do is say, well, how does it actually work in practice?


11:32
It's all very well to talk about the applications, but what makes these models different and special?


11:39
Well, fundamentally the approach that we take is highly mechanistic.


11:44
We spend a lot of time understanding at a quantum mechanical level the reaction mechanisms leading to metabolism by these different enzyme families and isoforms.


11:53
So actually for each site of metabolism, we perform a quantum mechanical simulation that estimates the reactivity to the appropriate metabolic reaction.


12:01
Here you can see this little video is running.


12:03
It's showing the mechanism of FMO oxidation.


12:10
The advantage of taking this approach is it's very transferable.


12:13
It's based on the fundamental physical and chemical principles.


12:16
It's not just a machine learning model fitting to patterns of data.


12:20
This also makes it much more quantitative.


12:23
And of course, these reaction mechanisms are independent of the specific isoform of the enzyme.


12:29
The mechanisms, the chemical mechanisms are all the same within a family.


12:34
However, we do know that those different isoforms exhibit different patterns of metabolism, and this is down to the different accessibility of sites of metabolism due to orientation within the different binding sites and steric hindrance caused by the structure of the molecule itself.


12:50
And so we also combine this quantum mechanical approach with descriptors that capture relationships between sites of metabolism and key features within the molecule.


13:01
So here, for example, if we're looking at metabolism at this para position on the phenol ring, there's a descriptor that captures this distance to this basic nitrogen, which many of you will know is very important, particularly for CYP2D6.


13:12
Because this basic nitrogen will bind to aspartic acid residue 301, it will orient this side of the molecule away from the active oxyheme, making this much more accessible and increasing the likelihood of metabolism at that position.


13:26
So we combine these quantum mechanical simulations and the reactivity with these descriptors capturing steric and orientation effects using machine learning models.


13:37
These are trained on very high quality data.


13:39
We spend a lot of time curating and very carefully comparing the different experiments.


13:45
We actually discard a huge amount of data because the experimental conditions are just not physiologically relevant, for example very high concentrations in the assay.


13:54
And we rigorously test all of our predictions with independent test sets.


13:58
And so on the next few slides, I'll just give you a whistle stop tour of those statistics just to see the performance of these models.


14:04
But you'll see at the end, we have many publications.


14:06
We actually publish all the training and validation data sets.


14:10
So you can actually dive into more detail.


14:13
But if you look at P450 metabolism, we've been doing this for over 20 years.


14:18
In fact, this goes all the way back to my PhD and postdoc many years ago.


14:22
But we identify the sites of metabolism and metabolites for seven major human P450 isoforms.


14:28
This offers greater than 90% top three and greater than 80% top two accuracy on the independent test sets.
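
For readers unfamiliar with the metric, here is a minimal sketch of how a top-k accuracy of this kind is commonly computed for site-of-metabolism predictions: a compound counts as a hit if at least one experimentally observed site appears among the k highest-ranked predicted sites. That definition is an assumption about the metric used in the talk, and the four compounds and atom labels below are invented purely for illustration.

```python
def top_k_accuracy(cases: list[tuple[list[str], set[str]]], k: int) -> float:
    """Fraction of compounds with an observed site among the top-k ranked predictions.

    Each case is (ranked_sites, observed_sites), where ranked_sites is ordered
    from most to least likely site of metabolism.
    """
    if not cases or k < 1:
        raise ValueError("need at least one case and k >= 1")
    hits = sum(1 for ranked, observed in cases if set(ranked[:k]) & observed)
    return hits / len(cases)


# Invented example data: atom labels are placeholders, not real compounds.
cases = [
    (["C7", "N2", "C12"], {"C7"}),   # observed site ranked 1st
    (["C3", "O9", "C15"], {"O9"}),   # observed site ranked 2nd
    (["N1", "C4", "C8"], {"C8"}),    # observed site ranked 3rd
    (["C5", "C6", "N10"], {"C11"}),  # observed site not in the top three
]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(cases, k):.0%}")
# top-1 accuracy: 25%
# top-2 accuracy: 50%
# top-3 accuracy: 75%
```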


14:34
So predicting the correct sites of metabolism, the differences between the different isoforms as well.


14:40
And I mentioned this sort of metabolic landscape as well.


14:43
And of course, having predicted the site of metabolism, we can easily work out what the resulting metabolite will be and present that for investigation.


14:52
The big development over the past few years has been taking this mechanistic approach that we pioneered in P450 metabolism and generalising that across a broad range of phase one and phase two enzymes.


15:03
So for phase one, in addition to P450, we also now model aldehyde oxidase and flavin-containing monooxygenases.


15:09
And we also model conjugation by UDP-glucuronosyltransferases and sulfotransferases, again offering greater than 80% accuracy across all enzymes and isoforms on the independent test sets.


15:24
When we look at the WhichEnzyme and WhichP450 models, these are more conventional QSAR models.


15:30
They're trained again on the same high quality data sets we use to train the site of metabolism models and rigorously test those with independent test sets.


15:37
Again, you can see here the AUC is about 0.9 and the top-two accuracy is in excess of 90% for identifying the correct enzyme and the correct P450 isoform.


15:52
And then when we combine all of these, we can look at the accuracy with which we're predicting these metabolite profiles as well.


16:00
So we use heuristics that we've trained on a training set that combine the WhichEnzyme predictions with the WhichP450 predictions and the site of metabolism models.


16:09
We automate this over 2 generations.


16:10
This is the picture I showed you before.


16:12
We'd start with dextromethorphan, and we can look at the two generations of metabolism and again we see excellent sensitivity of about 80% and much greater precision than other knowledge based approaches.


16:24
So if you look at something like BioTransformer, which generates all the possible metabolites, the precision is only 5%.


16:31
So in other words, it has the same sensitivity but generates or predicts twice as many metabolites on average to achieve that same level of sensitivity.
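
The link between precision and the number of candidates an experimentalist has to sift through can be made explicit with a short sketch. Since precision is true positives divided by predictions, the number of predictions equals observed × sensitivity / precision. The four observed metabolites and the 10% figure are assumptions: the 10% is what "twice as many at the same sensitivity" would imply relative to 5%, and it is not a number quoted in the talk.

```python
def predicted_count(n_observed: int, sensitivity: float, precision: float) -> float:
    """Number of predicted metabolites implied by a given sensitivity and precision.

    true positives = n_observed * sensitivity
    precision      = true positives / predicted  =>  predicted = true positives / precision
    """
    if n_observed < 1 or not 0.0 < sensitivity <= 1.0 or not 0.0 < precision <= 1.0:
        raise ValueError("n_observed must be >= 1; sensitivity and precision in (0, 1]")
    return n_observed * sensitivity / precision


N_OBSERVED = 4       # assumed: four observed metabolites for one compound
SENSITIVITY = 0.80   # "about 80%", taken as equal for both approaches

at_5_percent = predicted_count(N_OBSERVED, SENSITIVITY, 0.05)   # precision quoted for exhaustive generation
at_10_percent = predicted_count(N_OBSERVED, SENSITIVITY, 0.10)  # implied, not quoted, precision

print(f"candidates at 5% precision:  {at_5_percent:.0f}")   # 64
print(f"candidates at 10% precision: {at_10_percent:.0f}")  # 32
print(f"ratio: {at_5_percent / at_10_percent:.1f}x")         # 2.0x
```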


16:40
So this mechanistic approach gives us this much greater precision in our predictions.


16:47
So just a quick word, if you're interested in using these, we actually offer this in two different platforms, Semeta and StarDrop.


16:54
Semeta is really tailored for DMPK scientists, whereas StarDrop combines these metabolism models with a very broad range of chemistry capabilities for small molecule design, optimisation and data analysis.


17:10
But both provide the same models and the same predictions that I've just illustrated to you.


17:16
And so if we go through those questions, we can answer questions like which enzymes are responsible for my compound's metabolism using the WhichEnzyme and WhichP450 models.


17:24
We can also look at the sites of metabolism for each of those isoforms and enzymes.


17:29
We can reduce the risk of drug interactions with WhichEnzyme and WhichP450 by identifying compounds that are predominantly metabolised by one enzyme and one isoform.


17:39
We can combine those models to predict all of the metabolites formed by those enzymes, help to improve those met ID experiments and then analyse the structures of those metabolites.


17:52
One thing I didn't mention is we have extended our P450 models to also cover mouse, rat and dog.


17:58
So we can look at predictions in those species and see if they are similar in terms of the coverage of metabolites to the human metabolism.


18:07
And then also looking at the sites of metabolism for all of these different isoforms and enzymes, we can actually help to guide the design of compounds with improved metabolic stability.


18:19
So if you'd like to find out more, you can visit us on stand 41.


18:22
I'll be happy to give you a live demo.


18:25
I'll show you our list of peer reviewed publications and you're very welcome to contact us.


18:29
We also have a webinar that goes into more detail on the topics that I've just covered today as well.


18:34
You're very welcome to download and/or watch that online.


18:38
And I mentioned there are many publications.


18:40
Everything we do is published.


18:41
It's rigorously peer reviewed.


18:43
All the data sets, all the validation sets and so on are there for you to explore if you want to really get into the gory details of what underlies these models.


18:51
So thank you very much.


18:53
I'm happy to answer questions if there's time.