Data Analysis and Informatics
Podium
Dana Vanderwall, PhD
Director, Biology & Preclinical IT
Bristol-Myers Squibb
Data is the product of any science-based R&D organization. And since data is created in the lab, the lab itself is the very beginning of the data management and stewardship lifecycle. In this talk we will discuss how embedding data standards at a foundational level allows for the creation of a 100% “digital lab” that removes numerous error modes from laboratory processes. The digital lab expands the scope of current automation capabilities and self-documents the relevant context of the process in real time, capturing enriched metadata during the execution of an experiment, data reduction and analysis, and decision capture, including details of the materials and instruments used. Entropy is removed from the process by using normative terms provided by the Allotrope Foundation Ontologies in the laboratory software, enabling correct, consistent and complete metadata capture. Conformance to SOPs and analytical methods is also enforced digitally using standard input (instruction sets) described in a semantic graph. The completed data set connects equipment, materials, processes, results and decisions and is stored in a semantic graph (RDF triples) along with “raw data” (W3C data cube) in an open, portable, vendor-neutral format. The resulting standardized, highly annotated body of data is more consistent, more searchable and more easily integrated across domains, and also enables the automation of reports and other structured documents with documented provenance to the original source. When completed, the digital lab approach will greatly enhance data integrity (ALCOA-CCEA rules) and adherence to the FAIR Guiding Principles for scientific data management and stewardship.
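As a rough illustration of the semantic-graph idea described above, the following minimal sketch represents one experiment as subject-predicate-object triples and traces a decision back to its sources. It uses plain Python tuples rather than an RDF library, and every identifier in it (the `ex:` prefix, `ex:run-001`, `ex:usedInstrument`, and so on) is a hypothetical placeholder, not an actual Allotrope Foundation Ontology term or anything presented in the talk.

```python
# Minimal sketch of a semantic graph as (subject, predicate, object) triples.
# All "ex:" names are hypothetical placeholders, NOT real Allotrope
# Foundation Ontology terms.
TRIPLES = [
    ("ex:run-001", "rdf:type", "ex:Measurement"),
    ("ex:run-001", "ex:usedInstrument", "ex:hplc-07"),
    ("ex:run-001", "ex:usedMaterial", "ex:sample-42"),
    ("ex:run-001", "ex:followedMethod", "ex:method-A"),
    ("ex:run-001", "ex:produced", "ex:result-001"),
    ("ex:result-001", "ex:supportedDecision", "ex:decision-001"),
]


def objects(triples, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def provenance(triples, node):
    """Walk backwards from `node` to every subject that leads to it."""
    seen = []
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for s, _p, o in triples:
            if o == current and s not in seen:
                seen.append(s)
                frontier.append(s)
    return seen


# Which instrument was used in the run?
instrument = objects(TRIPLES, "ex:run-001", "ex:usedInstrument")

# Which records does the decision trace back to?
sources = provenance(TRIPLES, "ex:decision-001")
```

Because equipment, materials, methods, results and decisions all live in one graph, a question such as "which run and result led to this decision" becomes a simple traversal instead of a lookup across separate systems.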