Google Genomics: Running Picard with GA4GH APIs

Video Link: https://www.youtube.com/watch?v=d8EvXtz2uiA



Duration: 8:14


Picard/GATK tools are command-line utilities for processing genomic sequencing data. They typically take BAM and other files as input and produce modified BAM files.

These tools are frequently chained together into pipelines that process sequencing data step by step, all the way from unaligned sequencer output to variant calls (see, e.g., the Broad best practices).
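
As an illustration of such chaining, here is a minimal Python sketch that builds the command lines for a two-step pipeline: sort the reads with SortSam, then flag duplicates with MarkDuplicates. It only constructs the commands and does not run them. The jar path `picard.jar`, the file names, and the helper names `picard_cmd` and `build_pipeline` are assumptions made for this example and are not taken from the video.

```python
from typing import List


def picard_cmd(tool: str, **options: str) -> List[str]:
    """Build a Picard command line using KEY=VALUE style arguments."""
    # "picard.jar" is an assumed path to a local Picard jar.
    cmd = ["java", "-jar", "picard.jar", tool]
    cmd += [f"{key}={value}" for key, value in options.items()]
    return cmd


def build_pipeline(input_bam: str) -> List[List[str]]:
    """Chain two steps: coordinate-sort the reads, then mark duplicates."""
    sorted_bam = "sorted.bam"  # hypothetical intermediate file
    dedup_bam = "dedup.bam"    # hypothetical final output
    return [
        picard_cmd("SortSam", INPUT=input_bam, OUTPUT=sorted_bam,
                   SORT_ORDER="coordinate"),
        # The output of the first step becomes the input of the second.
        picard_cmd("MarkDuplicates", INPUT=sorted_bam, OUTPUT=dedup_bam,
                   METRICS_FILE="dup_metrics.txt"),
    ]


if __name__ == "__main__":
    for step in build_pipeline("sample.bam"):
        print(" ".join(step))
```

Each inner list could be passed to `subprocess.run(step, check=True)` so that the pipeline stops at the first failing step.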

We are teaching these tools to accept cloud-based datasets as input. The foundation for cloud data access is now in the HTSJDK library, and we have converted a number of Picard tools.

If your dataset is loaded into a cloud provider that supports the GA4GH API (e.g. Google Genomics), or you use one of the available datasets from Discover Published Data, you will be able to run a Picard tool against it, reading data directly from the cloud.

In this video, we walk through running a Picard tool that processes data from the cloud via the GA4GH API as implemented by Google Genomics.
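
The idea of pointing a tool at a cloud dataset rather than a local file can be sketched as follows. This is a hedged illustration, not the exact invocation from the video: the URL is a placeholder on `example.org`, the helpers `is_cloud_input` and `view_sam_cmd` are hypothetical, and any authentication or reader configuration needed by a real GA4GH endpoint is left out (the documentation linked below covers the real setup).

```python
from typing import List
from urllib.parse import urlparse


def is_cloud_input(source: str) -> bool:
    """Return True when the input looks like an HTTP(S) URL, not a local path."""
    return urlparse(source).scheme in ("http", "https")


def view_sam_cmd(source: str) -> List[str]:
    """Build a ViewSam command; INPUT may be a local file or a URL."""
    # "picard.jar" is an assumed path to a local Picard jar.
    return ["java", "-jar", "picard.jar", "ViewSam", f"INPUT={source}"]


if __name__ == "__main__":
    # Placeholder URL: a real run would use the provider's GA4GH endpoint
    # and the ID of an actual read group set.
    cloud = "https://example.org/ga4gh/readgroupsets/READ_GROUP_SET_ID"
    for source in ("sample.bam", cloud):
        kind = "cloud" if is_cloud_input(source) else "local"
        print(kind, " ".join(view_sam_cmd(source)))
```

The tool name and the INPUT argument are the same in both cases; only the value of INPUT changes.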

See http://googlegenomics.readthedocs.org/en/latest/use_cases/run_picard_and_gatk/







Tags:
Google
developers
genomics
data
Picard
GATK tools
genomic sequencing
cloud
api
product: other
fullname: other
Location: MTV
Team: Scalable Advocacy
Type: DevByte
GDS: Post Production