Interactive 2D tSNE plotting of cell-specific methylation and gene expression markers

This page is an interactive companion to the data described in our recent publication [DOI: 10.21203/rs.2.13274/v1].
Code and data for all plots on this page can be found here. Data, figures and additional files supporting our publication can be found here.

Note: The plots below are fully interactive and can be dragged and zoomed using the mouse. To reset a plot, double click anywhere on it. The legends are also interactive: click on a group to select it. To select multiple groups, hold down 'shift' and click each group in turn. Again, double clicking anywhere on the plot will reset it to the default view.

Figure: tSNE plot of all 1173 methylation CpG sites that were selected as cell-type markers.

Citation: Donia Macartney-Coxson, Alanna Cameron, Jane Clapham, and Miles C Benton. DNA methylation in blood - potential to provide new insights in immune cell biology, 20 August 2019, PREPRINT (Version 1) available at Research Square [DOI: 10.21203/rs.2.13274/v1]




Exploring cell-specific markers in other public methylation array data sets

We set out to determine whether the markers identified above would perform in independent, public cell-sorted data sets. We identified the below studies as appropriate candidates for this validation:

Two of these three data sets were generated on the Illumina EPIC platform, whereas our marker panel is derived from Illumina 450K array data. We identified a total of 1025 CpG sites from the original panel of 1173 markers that were present in all three data sets. This drop is due to a combination of some probes not being present on the EPIC platform and some samples having missing data (the affected probes were dropped).
Figure: tSNE plot of sorted cells from 211 samples based on 1025 methylation CpG sites (those overlapping with the three selected data sets).
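
The reduced set of 1025 sites comes from keeping only those marker probes that are present, with no missing values, in every validation data set. The following is a minimal Python sketch of that kind of filtering step, not the code used for this page. The probe IDs, data set names, beta values and the function name `overlapping_markers` are all hypothetical and chosen purely for illustration.

```python
def overlapping_markers(marker_panel, datasets):
    """Return marker probes that are present and complete in every data set.

    marker_panel: iterable of probe IDs (the original marker panel).
    datasets: dict mapping a data set name to a dict of
              probe ID -> list of beta values (None marks a missing value).
    """
    kept = set(marker_panel)
    for betas in datasets.values():
        # A probe survives only if it exists in this data set
        # and has a measurement for every sample.
        complete = {
            probe
            for probe, values in betas.items()
            if all(v is not None for v in values)
        }
        kept &= complete
    return sorted(kept)


# Hypothetical toy data (assumption): four marker probes, two data sets.
panel = ["cg0001", "cg0002", "cg0003", "cg0004"]
datasets = {
    "dataset_A": {
        "cg0001": [0.1, 0.2],
        "cg0002": [0.8, 0.9],
        "cg0003": [0.5, 0.4],
        "cg0004": [0.3, 0.3],
    },
    "dataset_B": {
        "cg0001": [0.2, 0.1],
        "cg0002": [0.7, None],  # missing value, so the probe is dropped
        "cg0004": [0.4, 0.2],   # cg0003 is absent from this platform
    },
}

shared = overlapping_markers(panel, datasets)
print(shared)
```

In this toy example one probe is lost because it is absent from one data set and another because of a missing value, mirroring the two causes of the drop described above.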


The figure below re-creates the one above with the addition of the cell-sorted data from Reinius 2012 (GSE35069), the data set that was used to identify the 1173 cell-specific CpG markers.
Figure: tSNE plot of sorted cells from 271 samples based on 1025 methylation CpG sites (the three selected data sets plus the original Reinius data set used to identify the markers).


Exploring cell-specific markers in a publicly available expression data set

The figure below depicts expression (RNA-Seq) data from cell-sorted populations (GSE107011). To validate our qRT-PCR results, we extracted data for the same five cell populations (CD4, CD8, CD19, CD56 and CD14) and then extracted expression data (TPM) for each of the 11 genes used in our study. The resulting data were analysed using t-SNE; the results are shown below:
Figure: tSNE plot of sorted cells from the RNA-Seq study GSE107011. This analysis was performed on expression data (TPM) from 11 genes, the same 11 genes whose expression we validated by qRT-PCR in our experiment.
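
As a rough illustration of this kind of analysis, the sketch below runs t-SNE on a small samples-by-genes expression matrix using scikit-learn. It is not the code behind the figure. The TPM values are synthetic, the number of samples per cell type is invented, and both the log2(TPM + 1) transform and the t-SNE settings (perplexity, PCA initialisation, random seed) are assumptions made for the example.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

cell_types = ["CD4", "CD8", "CD19", "CD56", "CD14"]
n_per_type = 4   # assumption: samples per cell type in this toy example
n_genes = 11     # matches the 11 genes described above

# Synthetic TPM-like values (assumption): one expression profile per
# cell type, plus a little noise for each sample.
labels = np.repeat(cell_types, n_per_type)
centres = rng.uniform(1, 100, size=(len(cell_types), n_genes))
tpm = np.vstack(
    [c + rng.normal(0, 1, size=(n_per_type, n_genes)) for c in centres]
)
tpm = np.clip(tpm, 0, None)  # TPM values cannot be negative

# Assumption: log-transform TPM before embedding to dampen large values.
log_tpm = np.log2(tpm + 1)

# Perplexity must be smaller than the number of samples (20 here).
embedding = TSNE(
    n_components=2, perplexity=5, init="pca", random_state=0
).fit_transform(log_tpm)

print(embedding.shape)
```

Each row of `embedding` is a 2D coordinate for one sample, and `labels` can be used to colour the points by cell type in whichever plotting library is preferred.
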
There are some interesting observations in the above plot:

Testing - large visualisations

This section explores using t-SNE on a large number of methylation CpG sites and then tries to visualise the results interactively.
[TESTING] - this link explores a very large data set using Bokeh. Be warned, there be dragons!
