At Open Targets, we are improving how targets are selected for drug discovery.

Target selection

Drugs against genetically validated targets are more likely to progress through clinical trials. The influx of new data from modern genetics and genomics allows targets to be assessed in a comprehensive genome-wide context. In addition, new technologies in gene editing, single-cell sequencing and cell models allow new types of data to be generated to inform target selection. We bring these approaches together in an integrated pre-competitive research programme.

We consider target selection to encompass two key steps: target identification and target prioritisation. Target identification defines targets with significant associations to the disease biology. Target prioritisation refines the list of targets based on the strength of the association evidence and on additional target and disease parameters, such as target tractability by drug modality and likely target-based safety risks. We work together with scientists across our partner organisations to identify areas where our research programme will enhance target identification and prioritisation, either by integrating and analysing data or by generating new data in our therapeutic areas of interest.

Generating new data through our Experimental Programme

We have developed a portfolio of experimental projects designed to provide new information in our key therapy areas to enable target identification and prioritisation. We generate target-centred data in human, physiologically relevant systems to strengthen the causal links between targets and diseases in our focus therapeutic areas of Oncology, Immunity and Inflammation, and Neurodegeneration. We combine whole-genome approaches and high-throughput methods to address the full range of relevant targets in the most disease-relevant cellular systems. We use the expertise of all our partners in emerging and established technologies, including:

  • Gene editing (CRISPR)
  • Induced pluripotent stem cells
  • Single cell genomics
  • Organoid and tissue culture
  • Large-scale genetics, genomics and epigenomics
  • Genome-wide sequencing


Oncology

Our projects in Oncology use resources and expertise within the Wellcome Sanger Institute's Cancer, Ageing and Somatic Mutations Program, which has played an important role in understanding the genetic basis of cancer. A shared theme is the use of accessible cancer resources to curate and analyse clinical genomic datasets to identify driver genes (mutations, amplifications, deletions and gene fusions) across multiple cancer sub-types. A key resource is the unique collection of more than 1,000 human cancer cell lines at the Sanger Institute, along with their drug sensitivities. Genomic information, including RNA-seq and synthetic lethality from genome editing in model systems that best reflect the biology of tumours, can identify many new target opportunities, as exemplified in Behan et al. 2019. We are expanding our gene editing approaches to both tumour-based cancer targets, for instance by identifying potential combination targets, and immuno-oncology targets, by identifying targets that modify the tumour microenvironment or alter cell killing in co-culture.

Immunology and Inflammation

We focus on explaining the genetic basis of immune-mediated diseases, including inflammatory bowel disease (IBD) and systemic lupus erythematosus (SLE), where there is strong interest from our partners and expertise on the Genome Campus. We developed a state-of-the-art meta-analysis for the existing IBD cohorts, and we are moving candidate targets into CRISPR-Cas9 knockouts in gut epithelium organoids for validation. In addition, we are generating eQTL datasets, including at the single-cell level, that can be used to resolve the causal genes at loci implicated by genetics for IBD and SLE. We are probing the roles of targets either in well-defined immune cells, through gene editing, single-cell transcriptomics and epigenetic profiling (for instance in microglia, macrophages, dendritic cells and T cells in response to various stimulations), or in diseases such as asthma, using single-cell genomics. We also have projects that intersect with oncology, including identification of receptor-ligand pairs in NK cells with application to immuno-oncology, as well as projects on the role of the immune system in neurodegenerative disease.


Neurodegeneration

In Neurodegeneration, we draw on the collective expertise of our partners. Through collaboration with UCL and the Cambridge Dementia Research Institute, we have built a set of projects based on iPSC-derived neurons, particularly in Alzheimer's and Parkinson's disease. These projects use CRISPR-Cas9 gene editing to identify modifiers of the response to oxidative stress, mechanisms of Tau uptake and the effects of Alzheimer's disease-specific mutations in neurons. We also characterise these neurons using single-cell transcriptomics. We have established protocols for similar analysis of enteric and sensory neurons, including the effect of rare monogenic mutations for pain. We are using fine mapping of GWAS in Alzheimer's disease (AD) and Parkinson's disease (PD) to identify and test potential targets in the same neuron systems, and using functional approaches to model the effect of common variants on relevant phenotypes.

The recent discovery of the relevance of neuroinflammation in neurodegeneration has led us to develop projects at the intersection with immunity and inflammation. We have profiled microglia from trauma patients for eQTLs, and are using iPSC-derived microglia for genetic and gene editing screens.

Finally, we have recently started to profile the effect of chromatin-modifying mutations on neural progenitor cells and neurons, focussing on targets identified in the Deciphering Developmental Disorders project.

Core informatics and data generation pipelines

Our core bioinformatics work focusses on bringing together the various relevant data on targets to allow simple and seamless exploration by drug discovery scientists. We cover many data types relevant to human disease biology and target identification, using the expertise of Open Targets partners and beyond.

Open Targets Platform

Our overall approach to target identification and prioritisation is brought to life in the Open Targets Platform, which integrates public data relevant to the association between targets and diseases, and provides additional data and tools for prioritisation. The Platform was developed together with scientists from our partners through a user experience (UX) design process. We integrate data from genetics, somatic mutations, expression analysis, drugs, animal models and the literature through robust pipelines, and summarise the association of a target with a disease in a single score. User-friendly interfaces guide the user through target identification and give access to the underlying evidence. In addition, prioritisation information is provided by target tractability assessments, safety data, gene expression information and other target properties. The Platform is updated every two months, and data is accessible through an intuitive web interface, a robust API, a Google BigQuery instance and a comprehensive list of dataset downloads. The Platform is an open source project and will continue to evolve as we bring new features and data to bear.
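To make the idea of a single association score concrete, here is a minimal sketch of how evidence from several data types might be combined. It is illustrative only: the text above does not specify the scoring method, so the harmonic-sum aggregation, the cap at 1.0, the function names and the example numbers are all assumptions rather than a description of the Platform's actual pipeline.

```python
from typing import Dict, List


def harmonic_sum(scores: List[float]) -> float:
    """Sum scores in descending order, dividing the i-th largest by i squared.

    Assumed aggregation for illustration: strong evidence dominates, and many
    weak pieces of evidence add progressively less.
    """
    ranked = sorted(scores, reverse=True)
    return sum(s / (i ** 2) for i, s in enumerate(ranked, start=1))


def association_score(evidence: Dict[str, List[float]], cap: float = 1.0) -> float:
    """Combine per-data-type evidence scores into one target-disease score.

    `evidence` maps a data type (hypothetical keys such as "genetics") to a
    list of evidence scores in [0, 1]. Each data type is aggregated first,
    then the data-type scores are aggregated again and capped.
    """
    per_type = [min(harmonic_sum(v), cap) for v in evidence.values() if v]
    return min(harmonic_sum(per_type), cap)


# Hypothetical evidence for one target-disease pair.
example = {
    "genetics": [0.6, 0.2],
    "literature": [0.4],
    "animal_models": [],
}
print(round(association_score(example), 3))
```

Aggregating within each data type before combining across types keeps one data type with many weak records from outweighing a single strong line of evidence elsewhere; a real pipeline would likely also weight data types differently.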

Open Targets Genetics

Because drugs against genetically validated targets are more successful in clinical trials, we have strengthened the use of human genetics information in target selection. We developed Open Targets Genetics to assign the likely causal targets underlying each association from genome-wide association studies (GWAS). The portal was initially released at the American Society of Human Genetics conference in October 2018. It supports searches that start with a gene, a single variant, a single study (trait) or multiple studies, and users can also download the data generated for each release. Collaborative work with the GWAS Catalog allows us to include data from genome-wide summary statistics, including UK Biobank. Open Targets Genetics now also includes workflows to identify candidate causal variants and to analyse colocalisation of GWAS and eQTL signals from the new eQTL Catalogue to assist in identifying causal genes. Finally, we have developed a machine learning model to assign each association at a locus to the most likely causal gene based on integrated genetics and functional genomics data. The Open Targets Platform receives its common disease genetics information from Open Targets Genetics, while exporting links for gene and drug information to the Genetics portal.

Other bioinformatics services and tools

Ancillary bioinformatics projects include development of the eQTL Catalogue, network analysis for drug target list expansion, and enhanced data for the Open Targets Platform, including additional data from clinical trial records and data on the effect of mutations on protein function. We also release and support standalone informatics tools associated with our various projects.