A framework for community-based development of standards - for harmonization of High-throughput Sequencing (HTS) computations and data formats to promote interoperability and bioinformatics verification protocols.  

Scientific Organizing Community
Workshop Agenda and Speaker Slides

Thursday March 16, 2017

Biocompute project goal: facilitate HTS (NGS) computational analysis information communication with FDA

Day 1 - Workshop Goal: discuss / create / refine concrete biocompute object (BCO) examples

Moderator(s): Vahan Simonyan (FDA), Raja Mazumder (GW)
07:45 - 8:30 Collect name tags
8:30 - 8:35 Welcome Remarks
Raja Mazumder (GW) (slides) and Vahan Simonyan (FDA)
8:35 - 9:05 Introductory/Keynote Address
Phil Bourne (NIH) (slides) and Carolyn Wilson (FDA) (slides)
Session 1A - Framing the need for HTS (NGS) computational analysis standards from the regulatory, academic and industry perspectives: examples related to infectious agent detection will be used to illustrate the need for such standards.
09:05 - 10:45 Harmonization Needs and Biocompute Objects (Infectious Agent / Contamination Detection)
  • Vahan Simonyan (CBER / FDA): Biocompute objects: integrated regulatory view of bioinformatics harmonization frameworks. (slides)
  • Eric Donaldson (CDER / FDA): High-throughput sequencing data challenges in FDA regulatory review. (slides)
  • Jeremy Goecks (Galaxy): A survey of technologies for reproducing and communicating biomedical analyses. (slides)
  • Paul Duncan (Merck): HTS (NGS) regulatory standards, Industry Application. (slides)
  • Seth Sims (CDC): Who ya gonna call? Global health outbreak, and surveillance technology. (MP4 download)
  • Heike Sichtig (FDA - CDRH): A biocompute object for FDA-ARGOS reference genomes. (slides)
Coffee Break 10:45 - 11:00
Session 1B - Panel discussion of approaches and concrete examples.
11:00 - 12:00 Panel Discussion
Panelists: Veronica Miller (HIV Forum), Marco Schito (C-Path), Arifa Khan (FDA / CBER), Scott Jackson (NIST) and Session 1A Speakers

Moderator(s): Eric Donaldson (FDA), Vahan Simonyan (FDA)
Lunch Break
12:00 - 13:00 Lunch
Group picture @ 12:45
Session 2A - Use of BCOs including creation, versioning and methods to promote interoperability and harmonization frameworks: examples related to cancer / genetic disease detection will be used to illustrate the need for such standards.
13:00 - 15:00 Harmonization Needs and Biocompute Objects (Cancer / Genetic Diseases)
Introductory / Keynote Address: Mike Huerta (Associate Director of NLM for Program Development & Coordinator of Data and Open Science) (slides)
  • Raja Mazumder (GW): Creating and using BCOs. (slides)
  • Adrian Myers (FDA): BCO harmonization's pioneering role in clinical trial data analysis. (slides)
  • Dennis Dean (Seven Bridges): Individualized cancer treatment implications for CWL and the FDA biocompute object: A neoantigen workflow case study.
  • Durga Addepalli (Attain / NCI): What does it take to build and run a pipeline on Cancer Genomic Cloud: Exome example. (slides)
  • Toby Bloom (NYGC): A pipeline for diagnosing sick kids: an example of composable biocompute objects. (slides)
  • Ben Busby (NCBI): In-memory analysis of expressed variants; leveraging thousands of breast cancer data sets for clustering. (slides)

Moderator(s): Warren Kibbe (NCI), Adrian Myers (FDA)
Coffee Break 15:00 - 15:15
Session 2B - Panel discussion of application of BCO in clinical, regulatory and R&D settings.
15:15 - 16:30 Panel Discussion
Panelists: Tanja Davidsen (NCI), Zhining Wang (NCI), Hsinyi (Steve) Tsang (NCI) and Session 2A Speakers.

Moderator(s): Zivana Tezak (FDA), Vahan Simonyan (FDA)
16:30 - 16:45 Day 2 plans
Vahan Simonyan (FDA), Raja Mazumder (GW)

Friday March 17, 2017

Day 2 - Workshop Goal: collect input and update biocompute object specification document
8:30 - 9:00 Introductory / Keynote Address:
Carole Goble (Univ. of Manchester) (slides) and John Quackenbush (Harvard) (slides)
Session 3A - Technical aspects of BCO development and application.
9:00 - 10:30 Workflow Languages
  • Michael R. Crusoe (CWL): Common Workflow Language project as an example of community based Open Standards development and maintenance. (slides)
  • Stian Soiland-Reyes (Workflow Research Objects): Workflow languages, annotating the workflow and emphasis on interoperability. (slides)
  • Gil Alterovitz (FHIR Genomics): Developing an Ecosystem for Precision Medicine via FHIR Genomics and BCO Integration. (slides)
  • David Steinberg (GA4GH): Leveraging GA4GH for BCO creation. (slides)

Moderator(s): Michael R. Crusoe (CWL), Gil Alterovitz (FHIR Genomics), Vahan Simonyan (FDA)
Coffee Break/Posters 10:30 - 11:00
Session 3B (parallel session) - BCOs and standards. Leveraging and providing input to existing standards/ontology initiatives; BCOs and platforms. Producing BCOs (short term) and running BCOs (long term).
11:00 - 12:00 Innovations and standards driving state of the art platform - Genomic Platforms
  • Introduce innovations employed by state of the art genomics platforms.
  • Identify key developments required to drive regulatory science.

    Panel Chair: Vahan Simonyan (FDA)

    Panelists: Vahan Simonyan (HIVE), Jeremy Goecks (Galaxy), Theresa Wohlever (CLCbio), Eugene Yaschenko (NCBI), Geet Duggal (DNAnexus), Dennis Dean (Seven Bridges), Elaine Johanson (precisionFDA) and Session Moderators.

    30 Minute Panel Discussion and 25 Minute audience Q&A
Session 3C (parallel session) - Panel and audience discussion of different BCOs described/collected in Day 1 and identification of additional BCOs that need to be developed to provide a comprehensive list of BCO examples.
    11:00 - 12:00 BCO examples
    Open to all participants. Provide input to BCO specification document. Additional talks describing pipelines (these do not need to be in BCO format). Speakers (pipeline descriptions/user needs from Session 1 and 2 panelists, moderators, speakers):

    Supportive technologies and resources
    • Philippe Rocca-Serra (Oxford): Leveraging BD2K bioCADDIE, ISA, StatO and BioSharing. (YouTube Link)
    • Konstantinos Krampis (CUNY): Leveraging Bio-Docklets for BCO creation: Virtualization containers for single-step execution of NGS pipelines and the possibility of automating biocompute object creation. (slides)
    • Jonas Almeida (Stony Brook): Portable computing using JS prototypal code inheritance. (slides)
    • Wendy Rubinstein (NCBI): Potential BCO resources at NCBI: ClinVar for variations, GTR for tests and methods and MedGen for phenotypes.
    • Tony Burdett (EBI): Facilitating semantic alignment of EBI resources. (slides)
    • Alexander (Sasha) Wait Zaranek (Curoverse): Open-consent a legal and ethical "technology" that can help drive biomedical standards. (slides)

    • James Hirmas (GenomeNext): GenomeNext platform pipelines and potential BCOs 1) newborn hearing screening 2) generating gene and variant signatures for bladder cancer populations. (slides)
    • Marilyn Matz and Alex Poliakov (Paradigm4): Reproducibility and compliance with a pipeline-in-a-database: a biocompute approach for group-based somatic mutation calling. (slides)
    • Vineeta Agarwala (Flatiron Health): Studying phenotype; clinical informatics and clinico-genomic databases at Flatiron Health.
    • Paul Giresi/Anupama Joshi (Epinomics): Defining standards for the analysis and interpretation of functional epigenomic data with ATAC-seq. (slides)
    • Hsinyi (Steve) Tsang (NCI/Attain): BLAST-based pathogen detection pipeline. (slides)
    • Errol Strain (FDA): FDA's GenomeTrakr Program: Advancing food safety through whole-genome sequencing of foodborne bacteria. (slides)

    This session will continue after lunch.

    Moderator(s): Hadley King (GW), Paul Duncan (Merck)
    Lunch Break
    12:00 - 13:00 Lunch
    Group picture @ 12:45
    13:00 - 15:00 Parallel Sessions (cont.) - Topics from 3B, 3C.
    Coffee Break 15:00 - 15:15
    Session 4 (parallel session) - This session will allow for additional community feedback and comments on the current framework and on the next steps.
    15:15 - 17:00 Workshop summary, conclusions, and recommendations
    • Moderators from previous sessions will provide brief summaries of their sessions.
    • Vahan Simonyan (FDA): "Next Steps" or "Moving Forward".
    • Closing remarks (Raja Mazumder/Vahan Simonyan).

    Workshop Participant Bios

    Philip E. Bourne - FACMI

    Former Associate Director of Data Science at the NIH

    As of May 1, 2017, Dr. Bourne will become the Stephenson Chair of Data Science, Director of the Data Science Institute and a Professor in the Department of Biomedical Engineering at the University of Virginia. Prior to that he was the Associate Director for Data Science (ADDS; aka Chief Data Scientist) for the National Institutes of Health (NIH) and a Senior Investigator at the National Center for Biotechnology Information (NCBI). In his role as ADDS he led the trans-NIH US $110M per year Big Data to Knowledge (BD2K) research initiative and contributed to data policies and infrastructure aimed at accelerating biomedical discovery. Examples include establishing the NIH Commons, support for data and software citation and establishing preprints as a supported form of research. Prior to joining NIH, Dr. Bourne was Associate Vice Chancellor for Innovation and Industry Alliances in the Office of Research Affairs and a Professor in the School of Pharmacy and Pharmaceutical Sciences at the University of California San Diego (UCSD). Dr. Bourne is a Past President of the International Society for Computational Biology, an elected fellow of the American Association for the Advancement of Science (AAAS), the International Society for Computational Biology (ISCB) and the American Medical Informatics Association (AMIA). He has published over 300 papers and 5 books and co-founded 4 companies. Awards include the Jim Gray eScience Award and the Benjamin Franklin Award.

    Carolyn Wilson - FDA

    Associate Director for Research at Center for Biologics Evaluation and Research (CBER)

    Currently, Dr. Wilson serves as the Associate Director for Research at CBER, FDA. As ADR, Dr. Wilson ensures that CBER's research is relevant, high quality and provides CBER with the appropriate scientific expertise, tools, and data to support regulatory decision-making and policy development. Dr. Wilson's responsibilities include leading FDA's Genomics Working Group and CBER's Medical Counter-Measure Regulatory Science Initiative. Dr. Wilson still maintains her laboratory program studying retroviruses which are either used as vectors for gene therapy clinical trials or are of concern in the xenotransplantation setting. Dr. Wilson joined the Division of Cellular and Gene Therapies at the Center for Biologics Evaluation and Research of the FDA in 1993. As a researcher-reviewer in DCGT, she reviewed INDs and developed policy and guidance documents in two novel product areas: gene therapy and xenotransplantation. Dr. Wilson holds a Ph.D. in Genetics from The George Washington University.

    Carole Goble - University of Manchester

    Professor in the School of Computer Science; Fellow of the Royal Academy of Engineering; Fellow of the British Computer Society; Head of ELIXIR-UK

    Professor Carole Goble is based in the School of Computer Science, at the University of Manchester in the UK. For over 20 years she has led a team of researchers and developers working in e-Science. She applies technical advances in knowledge technologies, distributed computing, workflows and social computing to solve information management problems for Life Scientists, especially Systems Biology, and other scientific disciplines, including Biodiversity, Chemistry, Health informatics and Astronomy. Her current research interests are in reproducible research, computational workflows, asset curation and preservation, semantic interoperability, knowledge representation, and knowledge exchange between scientists and new models of scholarly communication. She leads the ResearchObject.org project which aims to richly describe packages of research components to support recomputability and reproducibility. Research Objects arose from her team's and others' work on the EU Workflow4Ever project which aimed to develop methods for long-term scientific workflow preservation and portability and laid the foundations of the Common Workflow Language. She has produced many widely used software platforms and resources including: the Apache Taverna Workflow Management System, the myExperiment public repository for computational workflows, and the FAIRDOM-SEEK asset management system for Systems Biology projects. She is the co-lead of the Interoperability work stream of the EU's ELIXIR Research Infrastructure for Life Science Data management in Europe, which has 22 national members, and serves as the Head of Node for ELIXIR-UK. She co-founded the UK's Software Sustainability Institute. She received the Microsoft Jim Gray award for service to e-Science in 2008.

    John Quackenbush - Harvard

    Professor of Computational Biology and Bioinformatics

    Michael F. Huerta - NLM

    Associate Director of National Library of Medicine for Program Development & NLM Coordinator of Data Science and Open Science

    Dr. Huerta has led major trans-NIH programs advancing scientific technology research, interdisciplinary research, and team science, as well as informatics, data science and open science initiatives. The latter include the NIH Human Connectome Project (http://www.humanconnectome.org/), which produced the first open, standardized, comprehensive, multimodal image datasets of human brain connectivity, the NIH National Database for Autism Research (http://ndar.nih.gov/), a platform for collaboration with research data from over 100,000 subjects, and the Human Brain Project, which was instrumental in creating and establishing the field of neuroinformatics. Today, Dr. Huerta is focused on making the biomedical research enterprise more data-centric and open by making data broadly accessible, usable, discoverable, citable, and linked to other research objects. He co-chairs the NIH Data Sharing Task Force which is working to increase access to biomedical research data, chairs the NIH Clinical Common Data Elements Task Force which coordinates and harmonizes those efforts across the agency, and leads efforts to pivot data science and open science programs and initiatives at NIH to the future. Dr. Huerta's research background is in systems neuroscience; his undergraduate and doctoral work was completed at the University of Wisconsin at Madison, he was an NIH postdoctoral fellow at Vanderbilt University and on the faculty of the University of Connecticut Health Center before joining NIH's National Institute of Mental Health in 1991 and moving to the National Library of Medicine in 2011.

    Vahan Simonyan - FDA

    Lead Scientist, HIVE Project Director, Center for Biologics Evaluation and Research (CBER)

    Dr. Simonyan is a Lead Scientist, Director of Bioinformatics and Principal Investigator of HIVE at the FDA. Prior to the FDA, Dr. Simonyan led software development projects at the National Institutes of Health (NIH), Georgetown University and Penn State. During his consulting days he participated in the project planning and development of PubChem. His current projects are with HIVE, which is a multicomponent cloud infrastructure that provides distributed storage and a massively parallel compute environment to handle next generation sequencing (NGS) data and to analyze the outcomes using web-interface visual environments. The visual environments are built in collaboration with research and regulatory scientists and other users. Dr. Simonyan's background is in mathematics, physics and bioinformatics. Dr. Simonyan received his PhD from Moscow State University.

    Raja Mazumder - GW

    Associate Professor of Biochemistry and Molecular Medicine and Co-Director of The McCormick Genomic Proteomic Center GW

    Dr. Mazumder is an Associate Professor of Biochemistry and Molecular Medicine and Co-Director of The McCormick Genomic Proteomic Center at The George Washington University (GW). While working at the National Center for Biotechnology Information (NCBI) at NIH, UniProt and the Protein Information Resource (PIR), Dr. Mazumder worked closely with colleagues in developing international molecular biology resources and using these resources to identify therapeutic, diagnostic and vaccine targets. Through current NIH, FDA and industry funding he is involved in genomic and bioinformatics research associated with cancer biology, glycobiology, metagenomics and standards development.

    Eric Donaldson - FDA

    FDA, Virology Reviewer

    Dr. Donaldson has served as a Clinical Virology Reviewer in the Division of Antiviral Products at the FDA since 2012 and is a co-chair of the FDA-wide Genomics Working Group. Prior to joining the FDA, he was a Research Assistant Professor at the University of North Carolina at Chapel Hill where he used next generation sequencing to study viral evolution and cross species transmission. He is a coauthor of a science fiction book entitled "The Virus Chronicles: The Culling", which was published in 2015, and of a second book, "Seventh Extinction: Genesis Project", which was released in late 2016. As a Clinical Virology Reviewer, Dr. Donaldson has been involved in developing a next generation sequencing analysis pipeline for the regulatory review of antiviral resistance data and he brings this expertise to HTS-CSRS.

    Jeremy Goecks - Galaxy

    Galaxy Project, Leads the development of Collaboration, Publication Framework and Visual Analytics Components

    Dr. Goecks is an Assistant Professor of biomedical engineering in the Computational Biology Program at Oregon Health and Science University and of computational biology at George Washington University. His research centers on developing computational methods and infrastructure for analyzing large biomedical datasets. He is a lead investigator for Galaxy (http://galaxyproject.org), a Web-based platform for doing accessible, reproducible, and collaborative analysis that is used by tens of thousands of scientists throughout the world and has been cited extensively in the biomedical research literature. He has experience that spans across diverse facets of biomedical computing, including strategic planning for multi-stakeholder computing systems, architecting scalable infrastructure, developing new software components, and implementing best practice analysis pipelines for common needs such as RNA-seq analysis and somatic variant detection. He has also developed novel approaches to analyze genomic, transcriptomic, and proteomic datasets for a variety of biomedical applications, including multiple cancers and innate immunity. Dr. Goecks received his Ph.D. in Computer Science from Georgia Tech and did postdoctoral research in computational biology at Emory University.

    Marco Schito - C-Path

    Scientific Director, Critical Path to TB Drug Regimens

    Toby Bloom - NYGC

    Deputy Scientific Director of Informatics

    Toby Bloom brings over 20 years of experience to the New York Genome Center where she is responsible for strategy and execution for bioinformatics services, the bioinformatics data hub, analysis pipelines, and software engineering infrastructure. Prior to joining NYGC, Toby served as Director of Informatics at the Broad Institute's genome sequencing center for ten years, where she led informatics infrastructure development for the massive scale-up in next-generation sequencing data. In addition, Toby was Chief Technology Officer at Clinsoft Corporation, and an Executive Director at Phase Forward, designing Clinical Data Management, Electronic Data Collection, and Adverse Event software for major pharmaceutical companies. She received her PhD in Computer Science from the Massachusetts Institute of Technology, and a BS in Computer Science and Applied Mathematics from SUNY Stony Brook.

    Marilyn Matz - Paradigm4

    CEO and co-founder of Paradigm4

    Marilyn Matz is CEO and co-founder of Paradigm4, along with Paradigm4's CTO, Turing Laureate and MIT Professor Mike Stonebraker. Paradigm4's SciDB, a scientific computational database, powers the NCBI's public 1000 Genomes browser and underpins work at leading companies and institutions like Foundation Medicine, Harvard Medical School Bioinformatics, NASA, MIT Lincoln Laboratories, and others. Prior to Paradigm4, she co-founded Cognex Corporation, an industrial machine vision company. Marilyn has an M.S. in Computer Science from MIT.

    Alex Poliakov

    Manager, Customer Solutions Group, Paradigm4

    Alex has over a decade of experience developing commercial distributed database products. As the Director of the Solutions Group at Paradigm4, he helps scientists and bioinformatics researchers build MegaVariant warehouses and solutions for biomarker discovery. Alex has a B.S. in Computer Science and Engineering from MIT.

    Paul Duncan - Merck

    Merck, Senior Scientist, Bioanalytical Scientist in Bioprocess R&D

    Dr. Duncan is a Senior Principal Investigator at Merck in Bioprocess R&D, focusing on analytical development. His research expertise is in the characterization of cell substrates (bacterial, yeast, animal-derived), and viruses or vectors used for the production of vaccines and other biological medicinal products targeted for human use. Dr. Duncan has extensive experience with classical and new technologies to support the safety and integrity of biological manufacturing processes in this highly regulated environment. In this capacity, he also spearheads Merck's efforts in the application of advanced virus detection technologies, such as next generation sequencing, to strengthen viral risk assessments. Dr. Duncan received his PhD from the University of Illinois at Urbana-Champaign in 1993.

    Ben Busby - NCBI

    Lead, Bioinformatics Training

    Seth Sims - CDC / Northrop Grumman Corp

    HPC Administrator

    Seth Sims is a computer geek in the Division of Viral Hepatitis at CDC. He is the High Performance Computing administrator, lead developer, and architect for the Global Health Outbreak and Surveillance Technology (GHOST) project, CDC's first cloud-based surveillance platform. Seth holds degrees in Biochemistry and Computer Science, has been a professional Software Engineer for 10 years, and is pursuing a PhD in Computer Science with a Bioinformatics concentration at Georgia State University.

    Robel Kahsay - GW

    Assistant Professor

    Dr. Kahsay is an assistant professor at the George Washington University School of Medicine and Health Sciences. Dr. Kahsay is the creator of the biocompute and data typing portals for the biocompute objects in the HTS-CSRS. Prior to GW, Dr. Kahsay worked as a research investigator and computational biologist for over nine years.

    Dennis Dean - Seven Bridges

    Research and Development Scientist

    Dr. Dennis A. Dean, II is a Senior Scientist at Seven Bridges where he leads collaboration outreach with the US Food and Drug Administration and the US Department of Veterans Affairs. He is responsible for collaborations investigating the application of workflow standards and graph genome technology for decreasing regulatory submission review time. Dr. Dean leads research for the company’s Cooperative Research and Development Agreement with the US Department of Veterans Affairs Million Veteran Program, the largest genomics research project in the world. In this capacity, he is helping to develop tools which will be used to analyze upwards of one million patients’ genotype and phenotype data. To do so, he is integrating whole genome sequencing data, single nucleotide polymorphism chip data, and complex phenotypes from electronic medical records. Dr. Dean trained as a research fellow in medicine at the Harvard Medical School and Brigham and Women’s Hospital in the Program for Sleep Epidemiology and the Program for Sleep and Cardiovascular Medicine. He earned his PhD in biomedical engineering and biotechnology and M.S. in computer science from the University of Massachusetts. He earned his B.S. in computer science from SUNY, Empire State College.

    Warren Kibbe - NCI

    Director of the Center for Biomedical Informatics and Information Technology (CBIIT) at the NCI

    Warren A. Kibbe is the director of the Center for Biomedical Informatics and Information Technology (CBIIT) at the NCI. NCI created CBIIT to lead the coordination, development, and deployment of enterprise-wide digital capabilities (including biomedical informatics, scientific-management information systems, and computing resources) in support of the Institute's initiatives. Through CBIIT, NCI is helping to speed scientific discovery and facilitate translational research by using IT, informatics and Data Science to address complex research challenges. The ability to manage and analyze diverse, and often extremely large, collections of data - whether high-throughput genomics data, clinical data, or annotated medical images - in an integrated fashion is increasingly indispensable. Prior to joining the NCI, Dr. Kibbe had been at Northwestern University for more than 20 years, and was most recently a professor of Health and Biomedical Informatics in the Feinberg School of Medicine and the Director of Cancer Informatics and CIO for the Robert H. Lurie Comprehensive Cancer Center. Dr. Kibbe received his Ph.D. in Chemistry from Caltech, and was a visiting scientist at the Max Planck Institute for Biophysical Chemistry in Gottingen, Germany before joining the faculty at Northwestern. Dr. Kibbe is an active member of the open biomedical ontologies community, part of the Gene Ontology Consortium, was a member of the CTSA Ontology Working Group, and was a founder of the open source, open access Human Disease Ontology. Dr. Kibbe was involved in the harmonization of the study calendar portions of BRIDG with the CDISC Study Data Tabulation Model standard. Dr. Kibbe was the co-PI of the NIH-funded Dictyostelium Model Organism Database dictyBase (http://dictybase.org) with Dr. Rex Chisholm serving as PI. He is a proponent of open science initiatives, and in particular open source development and open data access activities. Dr. Kibbe was also the PI of the National Children's Study NCS Navigator Information Management Hub.

    Vineeta Agarwala - Flatiron Health

    Director of Product Management

    Vineeta works with a team at Flatiron Health, a healthcare technology company in NYC focused on accelerating research in oncology. Flatiron is building national-scale, continuously refreshing, research-grade cancer databases by aggregating and processing structured and unstructured clinical data from real-world electronic health records (including EHR systems in place at academic cancer centers). These clinical data are being linked to genomic data to create real-world clinico-genomic databases which can power outcomes and translational research. Previously, Vineeta pursued human genomics research at the Broad Institute in Cambridge, MA. She holds MD-PhD degrees from the Harvard-MIT HST program, and a BS from Stanford.

    Durga Addepalli - ATTAIN

    Sr. Biomedical Informaticist

    I am a Sr. Biomedical/Bioinformatics Specialist at Attain LLC working for CBIIT, NCI. I have worked for NCI since 2005, in the areas of next-gen sequencing, genomics, bioinformatics and high-throughput data analysis. I have worked with a number of NCI PIs analyzing their cancer data and have also done business and requirement analysis for clinical and HPC projects. I have experience in statistical programming and development for various bioinformatics projects.

    Stian Soiland-Reyes - University of Manchester

    Technical Architect at eScience Lab, School of Computer Science

    Michael R. Crusoe - CWL

    Common Workflow Language co-founder & Community Engineer

    David Steinberg - GA4GH

    Full stack developer

    Gil Alterovitz - FHIR Genomics

    Faculty at Harvard Medical School in biomedical informatics. Precision medicine/standards

    Gil Alterovitz is Director of the Biomedical Cybernetics Laboratory and a Harvard professor with the Computational Health Informatics Program at Boston Children's Hospital and the Harvard/MIT Division of Health Sciences and Technology. His work on integrative informatics methods, including applications in drug discovery, has been published or presented in approximately 50 peer-reviewed publications and three books (including “Systems Bioinformatics: An Engineering Case-based Approach,” ranked #1 in new Amazon bioinformatics category). A large component of Dr. Alterovitz's work involves international collaborations that bring together researchers and work on heterogeneous clinico-genomic data. His work involves leading clinical genomics standards and development. He serves as co-chair of the HL7 Clinical Genomics workgroup (where he is group leader of the FHIR Genomics effort), is on the executive team of the Clinical Workgroup of the Global Alliance for Genomics and Health (GA4GH), and serves as a member of the Institute of Medicine DIGITizE project. Within the SMART consortium, he leads the SMART/FHIR Genomics effort. He was recently also appointed to the national Precision Medicine Task Force and to lead the Sync for Genes effort for enabling a national standard for sharing of clinical genomic information.

    Veronica Miller - HIV Forum

    Executive director of the Forum for Collaborative HIV Research (The Forum) and a Senior Researcher and Lecturer at the UC Berkeley School of Public Health

    Veronica Miller is the executive director of the Forum for Collaborative HIV Research (The Forum) and a Senior Researcher and Lecturer at the UC Berkeley School of Public Health. Dr. Miller obtained a Bachelor of Science in Microbiology and a Doctor of Philosophy in Immunology from the University of Manitoba. A leading expert in resolving significant health policy and public health issues, Dr. Miller has extensive experience working with major global and US organizations and agencies involved in HIV, HCV and fatty liver disease research and regulatory policy. Some of her efforts to advance public health policy include the National Summit program, focusing on the implementation of the National HIV/AIDS Strategy, the Viral Hepatitis Action Plan and the Bay Area Health Disparities Program. Dr. Miller was co-founder and chair of the EuroGuidelines Group on HIV Drug Resistance established with the purpose of assuring a common standard of care for patients in all European states. She has also been an active member of several collaborative projects, including the EuroSIDA study. She has published over 90 peer-reviewed publications on HIV treatment outcomes and regulatory strategies for HIV and HCV.

    Arifa S. Khan - FDA

    Supervisory Microbiologist and Principal Investigator

    Zivana Tezak - FDA

    Associate Director for Science and Technology, Personalized Medicine Staff, Office of In Vitro Diagnostics and Radiological Health at CDRH

    Prior to being an Associate Director for Science and Technology, Personalized Medicine Staff, in the Office of In Vitro Diagnostics and Radiological Health at CDRH / FDA, Dr. Tezak worked in the biotechnology industry holding positions in research and development at a bioinformatics and array development company. At the beginning of her career Dr. Tezak was a research fellow at the University of Pittsburgh Medical Center and Children's National Medical Center, Research Center for Genetic Medicine, working on neuromuscular disorders, human genetics, gene therapy and high-throughput sequencing technologies. Dr. Tezak is a leading pioneer in the development of flexible regulatory policies for novel technology based clinical diagnostic tests, such as next generation sequencing, to ensure their smoother translation into the clinic. Dr. Tezak received her PhD from Florida State University.

    Heike Sichtig - FDA CDRH

    Subject Matter Expert/ Principal Investigator, Division of Microbiology in the Center for Devices and Radiological Health, Office of In Vitro Diagnostics - U.S. Food and Drug Administration

    Dr. Heike Sichtig is a principal investigator (PI) and subject matter expert (SME) in FDA's Office of In Vitro Diagnostics and Radiological Health in the Division of Microbiology Devices. She directs, as sole PI, the highly collaborative effort on developing FDA-ARGOS: FDA dAtabase for Regulatory Grade micrObial Sequences. For her exceptional leadership on this project, Dr. Sichtig was awarded the Commissioners' Special Citation award in 2016. Dr. Sichtig joined the Division of Microbiology Devices in 2012 and is primarily focused on enabling next generation sequencing (NGS) based technologies for clinical diagnostics. Dr. Sichtig leads a multidisciplinary team developing and implementing concepts for validation and evaluation of NGS-based infectious disease diagnostic devices. She obtained a B.S. / M.S. in Computer Science/Statistics from Kean University in 2002 and 2003, respectively, and a Ph.D. in Biomedical Engineering from Binghamton University in 2009. Subsequently, Dr. Sichtig completed postdoctoral training at the University of Florida/Genetics Institute in Gainesville, FL in pathogen signatures, transcriptional regulation and epigenetics.

    Mark Walderhaug - FDA

    Associate Office Director, Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research (CBER)

    Mark Walderhaug is an interdisciplinary scientist in FDA's Center for Biologics Evaluation and Research (CBER). He works in the Office of Biostatistics and Epidemiology, where he is the Associate Office Director for Risk Assessment. He is currently working on incorporating the computational resource, High-Performance Integrated Virtual Environment (HIVE), into the regulatory structures of CBER and supporting the HIVE in the development of high-performance computing solutions that protect and promote health. In the past, he developed quantitative risk assessments on babesiosis, avian influenza/pandemic flu, malaria, and the impact of emerging infectious diseases on biologics. The quantitative risk assessments have incorporated health data from CMS Standard Analytical administrative files as well as other health data sources. He assists in managing text mining and health surveillance modeling for CBER. He is a member of CBER's Computational Science Review Committee, and represents CBER on FDA's Scientific Computing Board. Before joining CBER, he worked at FDA's Center for Food Safety and Applied Nutrition, where he was a member of the Food Safety Initiative's Microbiological Risk Assessment Team and worked on FDA's Vibrio parahaemolyticus and Listeria monocytogenes risk assessments and USDA's E. coli O157:H7 risk assessment for ground beef. He later served as a temporary advisor for the Joint FAO/WHO Expert consultation on Microbial Risk Assessment of Vibrio spp. in seafood. He earned his Ph.D. at Vanderbilt University and held a postdoctoral appointment at the University of Chicago in the department of Molecular Genetics and Cell Biology before coming to FDA.

    Konstantinos Krampis - CUNY

    Associate Professor, Biological Sciences, City University of New York (CUNY)
    Director of Bioinformatics, Center for Translational and Basic Research (CTBR-CUNY)
    Adjunct Faculty, Institute of Computational Biomedicine, Weill Cornell Medical College

    Konstantinos Krampis has an MSc in Molecular Biology from the University of Athens, Greece (2002) and a PhD in Computational Biology and Bioinformatics from Virginia Tech (2008). He has established and currently runs the Bioinformatics Core Infrastructures (BCIL) group at the City University of New York (CUNY, 2014-present) and previously at the Craig Venter Institute (2009-2014), where he has developed a bioinformatics ecosystem that facilitates standardized and computationally scalable genomic data analysis. Currently, the BCIL group's research focuses on making complex bioinformatics pipelines as easily accessible as any other laboratory tool, by deploying pre-configured data analysis pipelines within virtualization containers, which allows for easy pipeline composition, execution and management by non-experts. Furthermore, the group has implemented the Visual Omics Explorer (VOE; http://bcil.github.io/VOE/), leveraging the latest web and mobile development technologies for interactive genomic data mining. Despite achieving a large degree of bioinformatics standardization and accessibility, the majority of open-source software used for building NGS data analysis pipelines still generates complex file outputs, which require trained specialists for processing and interpretation; the goal of Dr. Krampis' research is to build the next generation of genomic data analysis infrastructures that will result in democratizing access to bioinformatics for basic biomedical research, diagnostics, and personal genomics.

    Jonas Almeida - Stony Brook University

    Chief Technology Officer and Professor of Biomedical Informatics

    Jonas S. Almeida was first trained as a Plant Biologist (BS) in his native country Portugal and went on to pursue a graduate program in Biological Engineering (PhD 1995), which then led to faculty appointments in Chemical Engineering (1996-2000, Assistant Professor, University of Lisbon), Biostatistics (2001-2005, Associate Professor, Medical University of South Carolina), and Bioinformatics and Computational Biology (2006-2010, Professor, University of Texas MD Anderson Cancer Center). This trajectory, with a trail of 130 peer-reviewed manuscripts, reflects an interest in quantitative Biology in all of its components - experimentation, engineering, mathematical modelling, computational statistics and more recently software engineering. In 2011, he moved to the University of Alabama at Birmingham to start a new Division of Informatics in the Department of Pathology in the School of Medicine. Dr. Almeida is currently a Professor of Biomedical Informatics and Chief Technology Officer at Stony Brook University. His research is focused on the third generation of web technologies and the way they enable WebApp ecosystems that cater to the analysis of biomolecular data. The Semantic Web, Cloud Computing and Map-Reduce abstractions for distributed processing play a central role in the rewiring of bioinformatics workflows developed by his research group. Dr. Almeida received his PhD from Universidade Nova de Lisboa.

    Zhining Wang - NCI

    Project Officer, The NCI Genomic Data Commons (GDC)

    Dr. Wang joined the NCI in 2012. He is currently the Project Officer for GDC and a Biomedical Informatics Specialist at the NCI Center for Cancer Genomics (CCG). Before joining the NCI, Dr. Wang spent two years as a senior bioinformatics scientist with The Cancer Genome Atlas (TCGA) Data Coordinating Center. Prior to that, Dr. Wang worked for the Henry M. Jackson Foundation for eight years. He specializes in microarray data analysis, with extensive experience in identifying biomarkers associated with vaccine efficacy, disease progression, and pathogenesis. He also has expertise in computer programming and the development of bioinformatics tools. Dr. Wang received his bioinformatics postdoc training at the NCI Center for Cancer Research (CCR), where he obtained extensive training in human-mouse comparative genomic analysis. Dr. Wang has an interest in epigenetics, and part of his postdoc training was focused on cancer-associated alternative splicing and genetic imprinting. Dr. Wang is a member of the National Cancer Data Ecosystem for Sharing and Analysis Implementation Team.

    Wenming Xiao - FDA

    Principal Investigator, Division of Bioinformatics and Biostatistics

    Dr. Xiao is a Principal Investigator at the FDA's Division of Bioinformatics and Biostatistics. His expertise is in the development and application of integrated bioinformatics methodologies for "omics" data generated from next generation sequencing and microarray platforms, to identify biomarkers for early tumor detection and treatment and to establish quality metrics and bioinformatics solutions for personal genome assembly for clinical applications. The two main areas of focus in his research are "personal genome assembly and quality metrics" and "circulating cell-free DNA detection". The molecular data and knowledge gathered from this research will aid in developing guidance and protocols for biomarker improvement that could advance the research of personalized medicine. Prior to joining the FDA, Dr. Xiao was a staff scientist at the National Institutes of Health (NIH) and Associated Scientist at the National Cancer Institute (NCI).

    Sean Davis - NCI

    Staff Scientist

    Charles Hadley King - GW

    Senior Research Associate at George Washington University

    Prior to working on biocompute objects and managing the HTS-CSRS project, Hadley was a classroom science teacher in the Philadelphia public school system and an adjunct laboratory instructor at Temple University. While at Temple, Hadley also worked in the Kulathinal Laboratory for Evolutionary Genetics and Bioinformatics using molecular techniques to investigate phylogenetic relationships.

    Lynn Schriml - Disease Ontology

    Assistant Professor, Department of Epidemiology and Public Health, Institute for Genome Sciences - UMD

    Wendy Rubinstein - NCBI-NLM-NIH

    Chief, Medical Genetics and Human Variation, NCBI & Senior Scientist, NIH

    Dr. Rubinstein leads NCBI's medical genetics and variation resources including ClinVar, GTR, dbGaP, dbSNP, dbVar, MedGen, GeneReviews, and OSIRIS. She has 15 years of experience directing adult genetics programs and is an authority on hereditary cancer susceptibility, genetic testing, and bioinformatics. Dr. Rubinstein oversaw the development, launch, and expansion of GTR, now the most comprehensive open-access genetic testing database in the world. She leads strategy and outreach for ClinVar, NCBI's noted resource of clinically relevant variation, and developed MedGen, NCBI's phenotype resource that harmonizes genetic ontologies and nomenclatures. Dr. Rubinstein works closely with the NIH Office of the Director and serves as a liaison to FDA, CMS, professional medical and laboratory associations, test code providers, genetic information analysts and vendors, and others. In 2012, Dr. Rubinstein received the NIH Director's Award for work on the Genetic Testing Registry. Dr. Rubinstein holds dual certification in clinical genetics and clinical molecular genetics (ABMG) and is a Fellow of ACMG and ACP.

    Eugene Yaschenko - NCBI

    Chief, Molecular Software Section, Information Engineering Branch at National Center for Biotechnology Information

    Mr. Yaschenko serves as Chief, Molecular Software Section, Information Engineering Branch (IEB) at National Center for Biotechnology Information (NCBI). His experience at NCBI includes coordinating public access to sequence, genetics, structural, and bibliographic information, establishing collaborative informatics research projects with NIH intramural laboratories, and consulting/advising governmental agencies on methods of software and database design. Mr. Yaschenko has worked on several large-scale bioinformatics resources, including GenBank, ClinVar, NIH Manuscript Submission (NIHMS), Database of Genotypes and Phenotypes (dbGaP) and Sequence Read Archive. These efforts have played an important role in the scientific research community. Mr. Yaschenko received his MS degrees from Lomonosov Moscow State University and The Catholic University of America.

    Elaine Johanson - FDA

    Health Informatics Manager at DHHS/Food & Drug Administration

    Errol Strain - FDA

    FDA, Director of Biostatistics and Bioinformatics

    Alexander (Sasha) Wait Zaranek - Curoverse

    Curoverse, Co-Founder & Chief Scientist
    Harvard Personal Genome Project, Co-Founder

    Alexander (Sasha) Wait Zaranek, PhD is co-founder and Chief Scientist at Curoverse, a venture-backed company focused on building a free and open-source platform for storing, analyzing and sharing biomedical data. Sasha works on open technologies that are part of the revolution that has reduced human DNA sequencing costs a million-fold since the completion of the Human Genome Project. A current research focus is the development of clinical-quality applications for processing massive data sets spanning millions of individuals across collaborating organizations, eventually encompassing exabytes of data. His contributions have led to highly cited publications in Science, Nature, the Lancet and other leading scientific journals. Sasha is also a co-founder and Director of Informatics at the Harvard Personal Genome Project.

    Hsinyi (Steve) Tsang - NCI/Attain

    Senior Biomedical Informaticist at Attain

    Dr. Steve Tsang is a Senior Biomedical Informaticist with Attain, LLC, working at the National Cancer Institute Center for Biomedical Informatics and Information Technology (NCI-CBIIT) on the evaluation of the NCI Cancer Genomics Cloud pilots. He received his Ph.D. from the Johns Hopkins University - National Institutes of Health joint program, and completed his dissertation on protein structural bioinformatics at the National Center for Biotechnology Information (NCBI). After conducting postdoctoral research on HIV structural modeling and docking at Japan's National Institute of Infectious Diseases, Dr. Tsang became a staff scientist and bioinformatics group lead at the Naval Medical Research Center (NMRC), where his research focused on computational bacterial and viral pathogen identification and characterization. Following the NMRC, Dr. Tsang joined Medical Sciences & Computing, where he served as a subject-matter expert consultant at the National Institute of Allergy and Infectious Diseases (NIAID). Prior to joining Attain, Dr. Tsang served as the Chief Technology Officer for the Hong Kong-based Union Honest Investments, where he evaluated and managed the company's scientific programs. Dr. Tsang is currently an Adjunct Associate Professor in the Biology and Biotechnology Departments at the University of Maryland University College, and an adjunct faculty member with the Biology Department at Montgomery College.