Miles Benton* & Donia Macartney-Coxson**
* Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia
** Institute of Environmental Science and Research, Porirua, New Zealand
Biodiscovery Symposium 2016 (VUW)
5th July 2016
Image credit: www.biocomicals.blogspot.com
To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.
R. A. Fisher. The Indian Journal of Statistics (1933-1960), Vol. 4, No. 1 (1938), pp. 14-17.
You can't fix by analysis what was screwed up by design!
Bobby L. Jones. Department of Human Genetics, University of Pittsburgh
Researchers often want Bioinformaticians to be Biomagicians, people who can make significant results out of non-significant data, or Biomorticians, people who can bury data that disagree with the researcher's prior hypothesis.
Dan Masys. Associate Clinical Professor of Medicine, UCSD
Bioinformatics is not something you are taught, it’s a way of life
Mick Watson PhD
The best bioinformaticians I know are problem solvers – they start the day not knowing something, and they enjoy finding out (themselves) how to do it. It’s a great skill to have, but for most, it’s not even a skill – it’s a passion, it’s a way of life, it’s a thrill. It’s what these people would do at the weekend (if their families let them).
The best thing about being a statistician is that you get to play in everybody's backyard.
John Wilder Tukey.
This holds true for bioinformatics: I've been lucky enough to work with so many different collaborators.
The Bioconductor repository currently features:
... and real time analytics: shinydashboard example
Most important tool for #ReproducibleResearch is the *mindset*, when starting, that the end product will be reproducible.
Keith Baggerly
© 2015 DataScience.LA - Big Ram is Eating Our Big Data
© 2015 KDnuggets - Where is Big Data?
What makes data big and what complicates storage (opening up possibilities for the future)
/*insert a document as key:value pairs (i.e. associative array, hash map, dictionary)*/
db.collection.insert({
name: "Miles Benton",
affiliation: "QUT",
tags: [ /*includes arrays*/
"Bioinformatics",
"Human genetics"],
comments: { /*includes embedded objects*/
notes: "Simple comment",
date: Date()}
})
/*-----------Find records---------------*/
db.collection.find(
{ name: {$regex: "Miles"}} , /*add regular expressions*/
{"comments":1,"_id":0}).pretty() /*pretty adds formatting*/
/*---------Formatted output-------------*/
{
"comments" : {
"notes" : "Simple comment",
"date" : "Fri Nov 13 2015 11:54:02 GMT+1100 (AEDT)"
}
}
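To make the query above concrete, here is a minimal sketch in plain JavaScript (no database needed) of what the regular-expression filter and the field projection do. The helper `findAndProject` and the sample records are hypothetical and written for illustration only; they are not part of the MongoDB API and only mimic its behaviour in spirit.

```javascript
// Hypothetical in-memory stand-in for a collection; made-up sample data.
const sampleDocs = [
  {
    _id: 1,
    name: "Miles Benton",
    affiliation: "QUT",
    tags: ["Bioinformatics", "Human genetics"],
    comments: { notes: "Simple comment", date: "Fri Nov 13 2015" },
  },
  {
    _id: 2,
    name: "Example Person",
    affiliation: "Example Institute",
    tags: [],
    comments: { notes: "Another comment" },
  },
];

// Keep documents whose name matches the pattern, then keep only the listed
// fields (a rough analogue of a {field: 1, _id: 0} projection).
function findAndProject(docs, namePattern, fields) {
  return docs
    .filter((doc) => namePattern.test(doc.name))
    .map((doc) => {
      const projected = {};
      for (const field of fields) {
        if (field in doc) projected[field] = doc[field];
      }
      return projected;
    });
}

console.log(JSON.stringify(findAndProject(sampleDocs, /Miles/, ["comments"]), null, 2));
```

Because the records are schema-free objects, the second record can omit a date without breaking the filter or the projection.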
NoSQL encompasses a wide variety of different database technologies that were developed in response to a rise in the volume of data stored about users, objects and products, the frequency in which this data is accessed, and performance and processing needs.
www.mongodb.com
© 2016 Illumina, Inc.
108 core-pedigree members
Illumina HiSeq X10 at the Garvan Institute
Many many TB of data (David Eccles)
QUT HPC facility
*Functional = predicted to be damaging in 5 in-silico tests (SIFT, POLYPHEN2, MUTATIONTASTER, PROVEAN, MUTATION ASSESSOR)
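The "functional" definition above amounts to requiring agreement across all five predictors. A minimal sketch of that filter follows, assuming each variant record carries one boolean per tool (true when the tool predicts damage). The field names, record layout and example variants are hypothetical, not the actual pipeline's format.

```javascript
// Hypothetical keys, one per in-silico tool named on the slide.
const TOOLS = ["sift", "polyphen2", "mutationTaster", "provean", "mutationAssessor"];

// A variant is "functional" only if every one of the five tools calls it damaging.
// A missing prediction is treated as "not damaging" (an assumption of this sketch).
function isFunctional(variant) {
  return TOOLS.every((tool) => variant.predictions[tool] === true);
}

function filterFunctional(variants) {
  return variants.filter(isFunctional);
}

// Made-up example records, for illustration only.
const exampleVariants = [
  {
    id: "var1",
    predictions: { sift: true, polyphen2: true, mutationTaster: true, provean: true, mutationAssessor: true },
  },
  {
    id: "var2",
    predictions: { sift: true, polyphen2: false, mutationTaster: true, provean: true, mutationAssessor: true },
  },
  {
    id: "var3",
    predictions: { sift: true, polyphen2: true },
  },
];

console.log(filterFunctional(exampleVariants).map((v) => v.id));
```

Requiring all five tools to agree is deliberately strict: it trades sensitivity for a shorter, higher-confidence candidate list.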
There is a higher-than-usual prevalence of glaucoma on NI, so we explored the WGS data for associated variants
Predicted as deleterious/damaging
Gene has been previously associated with Retinoblastoma and Wilson's disease (copper deposits around the cornea)
Our lab just received NATA (National Association of Testing Authorities) accreditation for WES
First and only such facility in Australia offering accredited diagnosis
© 2016 Thermo Fisher Scientific Inc.
After QC, alignment and variant calling, we get 28-42K variants per exome
Diagnostics used to take ~4 weeks from generating the data to filtering down the variants and writing a report for the clinician
We needed to make this process faster...