SLAS2018 Innovation Award Finalist: Pharos – A Torch to Use in Your Journey In the Dark Genome

It is well known that a relatively small set of protein targets receives the bulk of research attention, and thus funding. However, there are potential (druggable) opportunities among the remaining under-studied and unstudied proteins. To address this, the NIH initiated the "Illuminating the Druggable Genome" program to characterize the dark regions of the druggable genome. As part of this program, a Knowledge Management Center (KMC) was created to aggregate and integrate heterogeneous data sources and data types, creating a centralized location for information about all protein targets identified as part of the druggable genome. Since then, the KMC has expanded to consider the entire human proteome. In this presentation, we describe Pharos, the user interface for the KMC knowledgebase. We provide an overview of the data sources and types made available via Pharos, and then describe the architecture of the system and its integration with the KMC and external resources. In particular, we highlight the rich search facilities that enable a user to drill down to relevant subsets of data but also support the notion of "serendipitous search". Given the heterogeneous set of data types available for individual targets, it is useful to quantify how much data, and what types of data, are available for a target. We describe the development of knowledge profiles and a Knowledge Availability Score (KAS), both derived from the Harmonizome, a resource that characterizes data availability across different data sources and types in a uniform manner. We then show that the KAS is concordant with knowledge trends characterized by traditional metrics such as publications and grants. Finally, we discuss the use of the KAS in the Pharos interface and present an example of prioritizing understudied targets by computing the similarity of their knowledge availability profiles with those of well-studied targets.
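
To make the profile-similarity idea concrete, the following is a minimal sketch in Python. The abstract does not state how knowledge availability profiles are encoded or which similarity measure is used, so everything here is an assumption for illustration: each profile is treated as a vector of per-data-type availability values in [0, 1], cosine similarity stands in for the unspecified measure, and the target names and numbers are hypothetical placeholders rather than Pharos or Harmonizome data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors.

    Returns 0.0 if either vector is all zeros (no data available).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_understudied(understudied, well_studied):
    """Rank understudied targets by their best match among well-studied ones.

    Both arguments map a target name to its profile vector. The result is a
    list of (understudied_name, closest_well_studied_name, similarity),
    sorted from most to least similar.
    """
    ranked = []
    for name, profile in understudied.items():
        best_ref, best_sim = max(
            ((ref, cosine_similarity(profile, ref_profile))
             for ref, ref_profile in well_studied.items()),
            key=lambda pair: pair[1],
        )
        ranked.append((name, best_ref, best_sim))
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked


# Hypothetical profiles: one value per (assumed) data-type category.
well_studied = {
    "WELL_1": [0.9, 0.8, 0.1, 0.7],
    "WELL_2": [0.2, 0.1, 0.9, 0.8],
}
understudied = {
    "DARK_A": [0.3, 0.25, 0.05, 0.2],
    "DARK_B": [0.0, 0.1, 0.1, 0.0],
}

ranked = rank_understudied(understudied, well_studied)
for name, ref, sim in ranked:
    print(f"{name}: closest well-studied profile is {ref} (similarity {sim:.3f})")
```

Cosine similarity compares the shape of two profiles rather than their magnitude, so a sparsely annotated target can still match a well-studied one whose data is distributed across the same types. Whether that property is desirable depends on the actual profile definition, which this sketch does not attempt to reproduce.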

