Miles Benton
Senior Scientist Bioinformatics
Human Genomics, Institute of Environmental Science and Research (ESR)
.large[
eResearch, Dunedin, 12<sup>th</sup> - 14<sup>th</sup> February 2020]
.center[

]
---
class: middle
# .center[Advocate for reproducible research...]
## .center[Where possible my presentations and code are available online]
<br />
<p>
.center[
<img src="images/github_logo.png" style="width: 520px; margin-right: 1%; margin-top: 1.5em;"/>
<img src="images/sirselim_qrcode.png" style="width: 202px; margin-right: 1%; margin-top: 1.5em;"/>
]
</p>
.center[[sirselim.github.io/presentations](http://sirselim.github.io/presentations)]
.center[twitter: [@miles_benton](https://twitter.com/miles_benton?lang=en)]
???
These are my presenter notes. :)
The theme of eResearch 2020 is "**United in Data**".
The links throughout are a mix of GitHub and HackMD pages; all should be publicly available.
---
class: middle
<img src="https://nanopore-dunedin.github.io/docs/assets/GA-Wide-Colour-1200px.jpg">
.center[.huge[
[www.genomics-aotearoa.org.nz](https://www.genomics-aotearoa.org.nz)
[github.com/GenomicsAotearoa](https://github.com/GenomicsAotearoa)
]]
???
Part of the Bioinformatics Leadership Team, for health.
---
class: middle
# GPU basecalling (live demo part 1)
<br>
data: mixed bacterial sample, ~0.5 Mb (or ~5.5 Gb actual data)
```bash
# fast basecalling mode
#   --disable_pings : don't call home
#   -c : model file, -i : input dir (fast5 files), -s : output dir
#   -x : GPU configuration
# (comments are kept up here: a comment after a trailing "\" breaks the line continuation)
guppy_basecaller \
  --disable_pings \
  --compress_fastq \
  -c dna_r9.4.1_450bps_fast.cfg \
  -i fast5/ \
  -s flongle_test \
  -x 'auto' \
  --recursive \
  --num_callers 4 \
  --gpu_runners_per_device 8 \
  --chunks_per_runner 256
```
.pull-right[<span style="color:#3498DB">... **cue to start ~~trial by fire~~ the first demo** ...</span>]
???
Notes
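Once the run finishes, a quick sanity check is to count the reads and bases that were called. This is a minimal sketch, not part of the demo itself: `fastq_stats` is a hypothetical helper name, it assumes plain 4-line FASTQ records, and the output path in the final comment is an assumption about how guppy lays out its save directory.

```bash
#!/usr/bin/env bash
# Sketch: count reads and bases in FASTQ text read from stdin.
# Assumes standard 4-line FASTQ records (sequence lines are not wrapped).
fastq_stats() {
  awk 'NR % 4 == 2 { reads++; bases += length($0) }
       END { printf "reads=%d bases=%d\n", reads, bases }'
}

# Tiny inline example with two made-up reads:
printf '@r1\nACGT\n+\n!!!!\n@r2\nACGTAC\n+\n!!!!!!\n' | fastq_stats

# Against real basecalled output (path layout is an assumption, adjust to your run):
#   zcat flongle_test/*.fastq.gz | fastq_stats
```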
---
class: middle
# while that's running...
???
Refer to Peter's opening address, reiterating that genomics is important and increasingly so.
Check to see audience knowledge of genetics/genomics.
A little genomics 101 if needed...
---
layout: false
class: middle inverse
background-image: url("https://cdn.vox-cdn.com/thumbor/llQlREwACaitewdPcLm5HzWT_g0=/0x1:1100x734/920x613/filters:focal(0x1:1100x734):format(webp)/cdn.vox-cdn.com/imported_assets/1507663/DNA-sequence.jpg")
background-size: cover
.massive[**Genomic sequencing**]
???
DNA - A, T, C, G
Part of all 'living' things
---
class: top
# what are we doing?
.large[
Portable ‘real-time’ sequencing for the masses?
]
--
* <span style="color:#3498DB">**the idea**</span>
  * low cost
  * accessible
  * portable
  * fun!
--
* <span style="color:#3498DB">**example use cases**</span>
  * field sequencing (real-time monitoring, forensics, agriculture, waterways, ...)
  * **clinical settings**
  * community outreach / teaching
--
.center[
Community Science | "United in Data"
]
---
layout: false
class: middle
<p>
.center[
<img src="https://www.illumina.com/content/dam/illumina-marketing/images/decade-in-sequencing/decade-in-sequencing-320-web-graphic.jpg" style="width: 820px; margin-right: 1%; margin-top: 1.5em;"/>
]
</p>
.small[.center[
(image source: [illumina.com](https://www.illumina.com/techniques/sequencing/dna-sequencing.html))
]]
???
I used the word 'portable' on my previous slide; it is not a word one would attribute to machines from Illumina / PacBio.
These things are huge! ... and expensive.
---
layout: false
class: middle
<p>
.center[
<img src="https://cdn-a.william-reed.com/var/wrbm_gb_food_pharma/storage/images/publications/food-beverage-nutrition/foodnavigator.com/article/2018/03/21/oxford-nanopore-raises-funds-to-support-commercial-expansion/7992741-1-eng-GB/Oxford-Nanopore-raises-funds-to-support-commercial-expansion_wrbm_large.jpg" style="width: 620px; margin-right: 1%; margin-top: 1.5em;"/>
]
</p>
.small[.center[
(image source: [nanoporetech](https://nanoporetech.com/about-us/news/oxford-nanopore-announces-ps100-million-140m-fundraising-global-investors))
]]
???
flongle: ~$500 USD for 5
1 might be enough for a metagenome
minION: $1000
smidgION: still just a concept
gridION: ~$50K
promethION: ~$230K
---
layout: false
background-image: url("images/david_slide.png")
background-size: contain
<a href="https://f1000research.com/slides/8-1947" style="position:absolute; top:625px; left:190px">slide kindly supplied by David Eccles (f1000 presentation)</a>
<a href="https://www.youtube.com/watch?v=CHCAb-PAqUI" style="position:absolute; top:652px; left:190px">"Sequencing DNA with Linux Cores and Nanopores"</a>
???
David's talk and a live demo are linked.
---
layout: false
class: middle
# Example squiggle plot

.small[.center[
(image source: [tombo manual](https://nanoporetech.github.io/tombo/plotting.html))
]]
???
What we 'see' when DNA passes through the pores.
---
class: middle
# Why GPUs?
.pull-left[
A new type of sequencing data requires a new type of 'analysis'

* squiggle data lends itself nicely to neural nets
* GPUs are very capable in this space
  - CUDA cores

ESR GPU basecalling benchmarks [(link)](https://esr-nz.github.io/gpu_basecalling_testing/gpu_benchmarking.html)

* Titan RTX & 2x Tesla V100
]
.pull-right[

]
<br>
.small[.center[
[UPDATE:] guppy is now able to scale across multiple GPUs!
]]
---
class: middle
# Our experiences with the Xavier [(link)](https://hackmd.io/@Miles/HkumH7sBH)
.pull-left[.right[
<img src="images/IMG_20191021_165639.jpg" style="width: 200px;"/>
<img src="images/IMG_20191025_094614.jpg" style="width: 200px;"/>
]]
.pull-right[.left[
<img src="images/battery_charge.jpg" style="width: 200px;"/>
<img src="images/xavier_in_use.jpg" style="width: 200px;"/>
]]
.center[
<img src="images/jetson_xavier_jtop_screenshot.png" style="width: 650px;"/>
]
???
Talk about the Xavier specs:
* 8-core ARM CPU
* 16 GB of RAM
* 512 CUDA cores and 64 tensor cores
* 512 GB NVMe SSD
---
class: top
# Benchmarking [(link)<sup>*</sup>](https://gist.github.com/sirselim/2ebe2807112fae93809aa18f096dbb94)
.small[
\* spoilers for those following along live
]
--
<br>
Most Jetson devices can be switched between different power modes
<br>
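On Jetson boards the power mode is usually inspected and changed with NVIDIA's `nvpmodel` tool. The sketch below only queries the active mode and does nothing on a machine without `nvpmodel`; `show_power_mode` is a hypothetical helper name, and the mode ID in the final comment is an example only, since IDs and their meanings differ between Jetson models.

```bash
#!/usr/bin/env bash
# Sketch: report the active Jetson power mode, if nvpmodel is available.
show_power_mode() {
  if ! command -v nvpmodel >/dev/null 2>&1; then
    echo "nvpmodel not found (not a Jetson, or not on PATH)"
    return 0
  fi
  sudo nvpmodel -q   # print the currently active power mode
}

show_power_mode

# To switch modes, pass a mode ID that your device reports, for example:
#   sudo nvpmodel -m 0
```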