0:25 

I'm Susana Tomasio. 

 
0:26 
I'm a senior application scientist at Collaborative Drug Discovery. 

 
0:30 
I'm actually a computational chemist by training. 

 
0:33 
I've worked in the field of drug discovery for a few years. 

 
0:37 
And then for the past six years I've been working in the technical team of CDD, Collaborative Drug Discovery.

 
0:43 
So mainly providing technical support for users and providing training.

 
0:50 
So today in this presentation, I will talk about how CDD Vault is crucial in implementing FAIR data principles within research environments, and I will explore the main features within CDD Vault that support FAIR data practices.

 
1:08 
Specifically, I'll be talking about the assay registration, the Pistoia Alliance assay template and the common assay template, which help to standardise your assays in a more general way.

 
1:23 
OK, so this slide is an attempt to illustrate the main data challenges within research environments.

 
1:32 
So challenges ranging from data siloes, right, caused by the departments working in compartments, developing their own unique systems and data and also the limited collaboration between those departments. 

 
1:53 
Then also the incompatibility between software and instruments. 

 
1:58 
So scientists working with different instruments, different software, different proprietary data formats and then kind of disrupting the smooth data analysis and integration efforts. 

 
2:14 
Of course, an additional challenge is working with multiple entities. 

 
2:19 
So you may be working with small molecules or biological entities, and then also collaborating with your research group or with groups within your organisation.

 
2:34 
And then a consequence of these limitations is that we also need to keep the data FAIR.

 
2:43 
So the data needs to be findable, accessible, interoperable and reusable. 

 
2:50 
And of course, FAIR data doesn't necessarily need to be open, but the actual metadata needs to be open so that the work can be reproduced outside of your organisation.

 
3:04 
So it needs to be reproducible.

 
3:07 
So the question is how can we change this landscape?

 
3:10 
How can we overcome these data challenges? 

 
3:17 
So CDD Vault follows the FAIR data principles of data management, and it is a highly adaptable, user-configurable, cloud-based solution which is easy to set up, which makes it an accessible and efficient solution for organisations of all sizes.

 
3:43 
OK, so central to CDD Vault is the activity and registration system, which allows users to register their assets.

 
3:53 
So when I say assets, I mean your chemical structures, your molecules, your protein sequences, your cell lines, antibodies, you name it. 

 
4:04 
So your actual entities, which can be small molecules or biological entities. But just as important as your asset registration is the assay registration, right?

 
4:21 
So you want to have a database which is well populated with well defined assays, right? 

 
4:32 
And within CDD Vault, you can use what we call protocol fields and run fields in combination with the ontologies from the Pistoia Alliance assay template and the common assay template, which allow you to annotate your assays and the runs of your assays.

 
4:56 
OK. 

 
4:57 
So I'm going to talk first about the asset registration, how you can register those in CDD Vault and then we'll follow up with the assay registration. 

 
5:09 
So for the asset registration, of course we use the activity and registration system.

 
5:15 
And as mentioned before, you can register the structure of your small molecules. 

 
5:21 
You can register all of the molecule and batch information associated with your small molecules. 

 
5:28 
And those fields, which we call attributes or metadata fields, can be configured by the vault administrator.

 
5:38 
So what is a vault administrator? 

 
5:40 
It is a super user within your account who will have the ability to create these fields or these attributes depending on what information the end user needs to store.

 
5:53 
OK, so this is all configurable, all of these fields that we see here and you can register the molecules, the batches and then different samples. 

 
6:04 
When I say samples, I mean either vials or tubes or aliquots in our sample level inventory. 
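
As a rough illustration of the three levels described here (entity, then batch, then sample), the following is a minimal sketch. It is not CDD Vault's actual data model; the class names, the attribute names and the example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """A physical vial, tube or aliquot tracked in the inventory."""
    name: str
    location: str
    amount_mg: float


@dataclass
class Batch:
    """One lot of an entity; attribute names would be configured per vault."""
    name: str
    attributes: dict = field(default_factory=dict)
    samples: list = field(default_factory=list)


@dataclass
class Entity:
    """Top-level registered asset: a molecule, cell line, sequence, etc."""
    name: str
    entity_type: str
    attributes: dict = field(default_factory=dict)
    batches: list = field(default_factory=list)

    def all_samples(self):
        """Flatten the entity -> batch -> sample hierarchy."""
        return [s for b in self.batches for s in b.samples]


# Hypothetical example data.
mol = Entity("CMPD-0001", "small molecule", {"project": "Kinase X"})
batch = Batch("CMPD-0001-001", {"purity_pct": 98.5})
batch.samples.append(Sample("vial-A", "Freezer 2 / Box 4", 5.0))
batch.samples.append(Sample("vial-B", "Freezer 2 / Box 4", 2.5))
mol.batches.append(batch)

total_mg = sum(s.amount_mg for s in mol.all_samples())
```

The same three classes would cover a cell line with its lots and vials by changing only `entity_type` and the attribute dictionaries.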

 
6:12 
But you can also use the registration system to register the cell lines, for example. 

 
6:17 
And it's the same idea. 

 
6:21 
So you can register the cell line at the top level, different lots or batches and then different samples, and store the information of those samples for your cell lines as well.

 
6:32 
And then for your protein sequences. 

 
6:33 
And I won't show all of the possibilities here because you can register pretty much anything. 

 
6:39 
You can even register, I don't know, we were talking about this yesterday, your wines or your dog breeds, whatever.

 
6:49 
You can register any entity in there. 

 
6:52 
But of course within research you can register protein sequences, oligonucleotides, antibody drug conjugates using the mixture entity type and so on. 

 
7:06 
And when you do that, you gain the ability to query, search and report on your assets.

 
7:13 
So what we see here, I don't know if this is working. 

 
7:17 
Yeah, but what you see on the left-hand side is the structure of a small molecule. 

 
7:22 
And I used the substructure search here, and then we see the attributes there and then some assay data, which we'll talk about in a moment. When we register the assay data, we register the assay in what we call a protocol, which is basically the container in CDD Vault that will store whatever data you get from your experiments.

 
7:49 
In this example here, it's a dose response, right? 

 
7:53 
And maybe it's not really clear, but what we have here is an overlay.

 
7:58 
So I have two molecules and then I can compare the dose response curves of two different molecules. 

 
8:05 
Bear in mind that in there on the left hand side, you could have a different entity. 

 
8:10 
If you're working with your ADCs, you could have your ADCs in there. 

 
8:14 
OK, so this is for the asset registration and reporting your assets, right? 

 
8:22 
The second part of the talk is about the assay registration. 

 
8:27 
So of course the first step in the workflow is you register your compounds or other entities, your assets and then you want to register the assay data. 

 
8:38 
So the assay data is registered in what we call a protocol. 

 
8:44 
And the protocol is kind of a set of instructions so that CDD Vault will know what to do with the data.

 
8:51 
Is it going to calculate an IC50 or is it going to calculate an average of the inhibition? 

 
8:57 
So kind of a set of instructions for what to do with the data that is stored in CDD Vault.
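
To make the idea of a protocol as "a set of instructions" concrete, here is a small sketch in which the choice of calculation is looked up by name. It assumes nothing about how CDD Vault itself computes results: in particular, the IC50 here is estimated by log-linear interpolation between the two points that bracket 50%, which is a deliberate simplification rather than a fitted dose-response curve, and all names and numbers are made up.

```python
import math
from statistics import mean


def mean_inhibition(values):
    """Average percent inhibition across replicates."""
    return mean(values)


def ic50_by_interpolation(concentrations, inhibitions):
    """Estimate the concentration giving 50% inhibition.

    Simplification: log-linear interpolation between the two points that
    bracket 50%, instead of fitting a full dose-response curve.
    Assumes concentrations are positive and sorted ascending.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50 <= y_hi and y_hi != y_lo:
            fraction = (50 - y_lo) / (y_hi - y_lo)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None  # 50% inhibition is not bracketed by the data


# The "protocol" decides which calculation applies to the uploaded readouts.
CALCULATIONS = {"mean_inhibition": mean_inhibition, "ic50": ic50_by_interpolation}

single_point = CALCULATIONS["mean_inhibition"]([42.0, 46.0, 44.0])
dose_response = CALCULATIONS["ic50"]([1.0, 10.0, 100.0], [20.0, 50.0, 80.0])
```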

 
9:03 
But of course, what is very important in order to keep the data FAIR is to actually store or define the metadata about the assay.

 
9:16 
And what we see here are what we call protocol fields. 

 
9:21 
Again, these protocol fields are attributes defined by your vault administrator so that when the end user creates a protocol, they will be able to add all of the details of the assay.

 
9:34 
So in this example, here we have a generic enzyme screening assay. 

 
9:39 
So you may want to add the standard operating procedure, the SOP; you may want to attach it in there.

 
9:45 
You may want to know what the readout is for your actual assay, and then you may want to know, you know, the category.

 
9:53 
Of course, you know, all the kinds of information that will keep your assay well defined.

 
10:01 
But then the same applies for what we call the run fields. 

 
10:07 
So whenever you are uploading data into a protocol, you are loading the data into what we call a run, with a run date, which is the date of the experiment.

 
10:17 
The metadata, what we call the run fields, may change between runs.

 
10:23 
So maybe it was run by a different scientist, maybe the study number is different or the CRO is different. 

 
10:31 
So those can also be defined by the vault administrator. 

 
10:36 
And then the end user, when uploading the data, can fill in the actual information.
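
The split between protocol fields (which describe the assay) and run fields (which can change with every upload) can be sketched as below. This is an illustration only, not CDD Vault's schema; the field names, the SOP file name and the people and CRO names are invented.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Run:
    """One upload of data; run fields may differ from run to run."""
    run_date: date
    run_fields: dict = field(default_factory=dict)


@dataclass
class Protocol:
    """An assay definition; protocol fields describe the assay itself."""
    name: str
    protocol_fields: dict = field(default_factory=dict)
    runs: list = field(default_factory=list)

    def add_run(self, run_date, **run_fields):
        """Record a new run together with its run-level metadata."""
        run = Run(run_date, dict(run_fields))
        self.runs.append(run)
        return run


assay = Protocol(
    "Generic enzyme screening assay",
    {"readout": "fluorescence", "category": "biochemical", "sop": "SOP-012.pdf"},
)
assay.add_run(date(2024, 3, 1), scientist="A. Smith", cro="In-house")
assay.add_run(date(2024, 3, 8), scientist="B. Jones", cro="Example CRO")

scientists = [run.run_fields["scientist"] for run in assay.runs]
```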

 
10:43 
So going a step further, we have protocol forms.

 
10:48 
So now what we want to do is make life easier for the end user, and in an effort to standardise the creation of the protocols, the vault administrator can create templates for the protocols.

 
11:07 
So what we have here is a protocol form for the cell based screening that I showed you before. 

 
11:14 
So the vault administrator has the power to add the protocol fields that are relevant for all the assays of this form and can also use the ontologies either from the common assay template or the Pistoia Alliance assay template. 

 
11:33 
So these two sets of ontologies standardise the terms: they agree on the specific terms that can be used to describe an assay.

 
11:49 
So basically the vault administrator can create these templates so that the end user whenever they need to create a protocol, they select the form and automatically they will get the fields and ontologies that are relevant to that specific type of assay. 
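
One way to picture a protocol form is as a template that lists required fields and, for some of them, an agreed vocabulary. The sketch below assumes that picture; it is not how CDD Vault implements forms, and the field names and the two-term vocabulary are placeholders rather than real ontology terms.

```python
from dataclasses import dataclass, field


@dataclass
class ProtocolForm:
    """A template: which fields a protocol of this type must fill in."""
    name: str
    required_fields: list
    # Controlled vocabularies stand in for ontology terms (made-up values).
    allowed_terms: dict = field(default_factory=dict)

    def validate(self, protocol_fields):
        """Return a list of problems; an empty list means the form is satisfied."""
        problems = []
        for name in self.required_fields:
            if not protocol_fields.get(name):
                problems.append(f"missing required field: {name}")
        for name, allowed in self.allowed_terms.items():
            value = protocol_fields.get(name)
            if value and value not in allowed:
                problems.append(f"{name}={value!r} is not an agreed term")
        return problems


cell_form = ProtocolForm(
    "Cell-based screening",
    required_fields=["assay_format", "readout", "cell_line"],
    allowed_terms={"assay_format": {"cell-based", "biochemical"}},
)

ok = cell_form.validate(
    {"assay_format": "cell-based", "readout": "luminescence", "cell_line": "HEK293"}
)
bad = cell_form.validate({"assay_format": "cellular", "readout": "luminescence"})
```

Here `ok` is an empty list, while `bad` reports both the missing cell line and the free-text term that is not in the agreed vocabulary.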

 
12:08 
So this is an effort to make life easier for the end user, the scientist who wants to upload the data, and to describe the assays in a more accurate way.

 
12:26 
And then of course, the same for the runs. 

 
12:29 
Again, it's all created in a template by the vault administrator. 

 
12:35 
The end result is that we can query and report the actual assays that you have registered in your vault. 

 
12:46 
So what we see here at the top is a list of all the protocols that I have registered in my vault. 

 
12:54 
And on the left hand side we have all of the metadata fields. 

 
12:59 
So the protocol fields and the ontologies. 

 
13:03 
What this shows is the ability to compare your assays so that you will know whether you need to create and register a new assay or not.

 
13:14 
So if you have a new set of information, this will give you, you know, the full set of information for each assay that you have registered in your vault.

 
13:26 
Also, the other benefit is to avoid the duplication of protocols. 

 
13:30 
This is something that we saw a lot in the past with our end users before having this ability to create the protocol forms. 
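
Comparing assays on their metadata to catch duplicates can be sketched as a simple grouping step. This is only an illustration of the idea, under the assumption that two protocols with the same normalised values for a chosen set of fields are candidates for being the same assay; the protocol names and field values are invented.

```python
from collections import defaultdict


def metadata_key(protocol_fields, compare_on):
    """Normalise the chosen fields so trivial differences do not hide a match."""
    return tuple(
        str(protocol_fields.get(name, "")).strip().lower() for name in compare_on
    )


def find_possible_duplicates(protocols, compare_on):
    """Group protocol names that share the same values for `compare_on`."""
    groups = defaultdict(list)
    for name, fields in protocols.items():
        groups[metadata_key(fields, compare_on)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


protocols = {
    "Kinase X IC50": {"target": "Kinase X", "readout": "Fluorescence"},
    "KinX dose response": {"target": "kinase x ", "readout": "fluorescence"},
    "Kinase X cell viability": {"target": "Kinase X", "readout": "luminescence"},
}

duplicates = find_possible_duplicates(protocols, ["target", "readout"])
```

Standardised terms make this kind of comparison far more reliable, because free-text variants of the same value no longer need to be normalised by hand.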

 
13:42 
OK, now basically CDD Vault has a RESTful API, and that means that you can connect other instruments with CDD Vault and you can integrate these assay forms and definitions with other systems.
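
As a hedged sketch of what talking to such a REST API could look like from Python, the snippet below only prepares a request and does not send it. The base URL, the `/vaults/<id>/protocols` path and the `X-CDD-Token` header name are assumptions that should be checked against the official CDD Vault API documentation; the vault ID and token are placeholders.

```python
import json
import urllib.request

# Assumptions to verify against the official CDD Vault API documentation:
# the base URL, the /protocols path and the token header name.
BASE_URL = "https://app.collaborativedrug.com/api/v1"
TOKEN_HEADER = "X-CDD-Token"


def build_protocols_request(vault_id, api_token):
    """Prepare (but do not send) a GET request listing a vault's protocols."""
    url = f"{BASE_URL}/vaults/{vault_id}/protocols"
    return urllib.request.Request(url, headers={TOKEN_HEADER: api_token}, method="GET")


def fetch_json(request):
    """Send a prepared request and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Placeholder credentials; nothing is sent until fetch_json(request) is called.
request = build_protocols_request(vault_id=1234, api_token="YOUR-API-TOKEN")
```

Keeping request construction separate from sending makes it easy to inspect or log exactly what an instrument integration would transmit before it touches the vault.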

 
14:02 
So kind of going a step forward. 

 
14:07 
OK. 

 
14:07 
So what I showed you so far was the activity and registration system. 

 
14:14 
So we looked at, OK, you can use CDD Vault to register your assets, compounds, entities and the assays, your protocol data. 

 
14:23 
But we do have other functionality available in CDD Vault. 

 
14:27 
So we do have of course the sample inventory system. 

 
14:32 
So this allows users to register their samples, their vials, their stock solutions. It can be used either for chemistry or for your biological samples, and it also keeps track of the location and the quantities, of course.

 
14:48 
Then we have the visualisation tool. 

 
14:51 
So basically once the compounds and data are registered in the vault, you can then plot, filter and mine the data. 

 
15:01 
Yeah, do substructure searches, cluster analysis, R-group decomposition and so on.

 
15:09 
We do have an electronic lab notebook both for chemists and biologists, yes. 

 
15:16 
So if you are a chemist, you can use the ELN to capture the chemical reactions and the stoichiometry table. 

 
15:23 
And then of course for biologists as well to capture all the details of your experiments. 

 
15:28 
Kind of a new feature.

 
15:30 
We have a deep learning module which is based on chemically rich vectors.

 
15:36 
So basically you can perform a similarity search within [unclear] and [unclear] and then get the 100 most similar compounds to your hit from CDD Vault.

 
15:48 
So without having to leave the secure instance of CDD Vault, right. 
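
The talk does not say how similarity between vectors is scored, so the following sketch simply assumes cosine similarity and shows the general shape of a "top N most similar to the hit" search. The three-dimensional vectors and compound names are toy values; real learned vectors would be much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(hit_vector, library, top_n=100):
    """Rank library compounds by similarity to the hit and keep the top N."""
    scored = [(name, cosine_similarity(hit_vector, vec)) for name, vec in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Toy 3-dimensional vectors standing in for learned compound representations.
library = {
    "CMPD-A": [1.0, 0.0, 0.0],
    "CMPD-B": [0.9, 0.1, 0.0],
    "CMPD-C": [0.0, 1.0, 0.0],
}
top_two = most_similar([1.0, 0.0, 0.0], library, top_n=2)
names = [name for name, _ in top_two]
```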

 
15:54 
So that's kind of the main functionality in CDD Vault. 

 
15:59 
This is just to show you the types of plots that you can create with the visualisation tool in CDD Vault, and also our electronic lab notebook, which can be used by chemists and by biologists.

 
16:15 
The final slides are really just to show that CDD Vault has a RESTful API.

 
16:20 
As I mentioned before, that means that you can either use CDD Vault as your main source of truth, your main ecosystem to which you add all of your data, or you can use it, you know, to connect with other systems as well.

 
16:38 
And these are all of the systems that you can connect with. 

 
16:42 
But of course, if any of your software or instruments have an open API, they can be integrated with CDD Vault.

 
16:50 
OK, final slides, just a few examples of the customers that currently use CDD Vault. 

 
16:57 
And yeah, that's it. 

 
16:58 
Thank you for your attention. 

 
16:59 
Happy to answer any questions.