Computing and Analysis in Research

I currently support multiple projects for investigators at Fred Hutch, providing subject matter expertise on genomics data management, advice and technical support for various cloud resources, and guidance and training on software sharing. I often advise and train staff in the use of GitHub, Docker, WDL-based workflows, and reproducible code and workflow sharing, tailored to the expertise level and needs of each group.

Workflows, Reproducibility and Sharing

Once you wrangle your data, the logical next step is to develop reproducible, sharable, documented methods for analyzing it. To this end, I lead the testing and custom configuration of the Cromwell workflow manager for use by investigators at the Hutch. This includes supporting the additional development required for Cromwell to use AWS Batch as a backend, through scope definition and testing in collaboration with AWS and Fred Hutch IT.

Workflows and Computing Support

Recently, I’ve worked with folks to tune up code for sharing, Dockerize it if need be, and wrap it in modern workflow definitions so that others can run it, on-prem or in the cloud. Often this work happens behind the scenes of more visible publications, but open access and sharing (when appropriate) have been a focus of mine throughout.

I have been writing and customizing sharable WDL workflows as part of collaborations, so that tasks and entire workflows can be shared across groups and institutions using Docker containers and reproducible workflow design.

Some public, and unfortunately some still private, resources I support include:

A custom configuration for the Cromwell workflow manager at the Hutch:

GitHub WDL workflows, most of which are still private while publications are in preparation:

R package

Shiny app

Training and Documentation

Together with the Fred Hutch Data Science Lab, I’ve helped produce a guide for using Cromwell to run WDL workflows at Fred Hutch, and an emerging guide to designing and testing WDL workflows.

As a member of the OpenWDL governance team, I’ve also contributed to community documentation about WDL.