Interactive 2D tSNE plotting of cell-specific methylation and gene expression markers
This page provides an interactive companion to the data that is detailed in our recent publication [DOI: 10.21203/rs.2.13274/v1].
Code and data for all plots on this page can be found here. Data, figures and additional files supporting our publication can be found here.
Citation: Donia Macartney-Coxson, Alanna Cameron, Jane Clapham, and Miles C Benton. DNA methylation in blood - potential to provide new insights in immune cell biology, 20 August 2019, PREPRINT (Version 1) available at Research Square [DOI: 10.21203/rs.2.13274/v1]
Exploring cell-specific markers in other public methylation array data sets
We set out to determine whether the markers identified above would perform in independent, public cell-sorted data sets. We identified the studies below as appropriate candidates for this validation:
- GSE103541 - [Illumina EPIC] 145 samples (140 used in this validation); DNA methylation profiles for CD4 T cells, CD8 T cells, B cells, Monocytes and Granulocytes purified from 28 individuals.
- GSE110554 - [Illumina EPIC] 49 samples (49 used in this validation); Salas LA, Koestler DC, Butler RA, Hansen HM et al. An optimized library for reference-based deconvolution of whole-blood biospecimens assayed using the Illumina Human Methylation EPIC BeadArray. Genome Biol 2018 May 29;19(1):64. PMID: 29843789
- GSE82084 - [Illumina 450K] 36 samples (22 used in this validation); de Goede OM, Lavoie PM, Robinson WP. Cord blood hematopoietic cells from preterm infants display altered DNA methylation patterns. Clin Epigenetics 2017;9:39. PMID: 28428831
The figure below re-creates the one above with the addition of the cell-sorted data from Reinius 2012 (GSE35069), the data set that was used to identify the 1173 cell-specific CpG markers.
- GSE35069 - [Illumina 450K] 60 samples (60 used in this experiment); Reinius LE, Acevedo N, Joerink M, Pershagen G, Dahlén SE, et al. (2012) Differential DNA Methylation in Purified Human Blood Cells: Implications for Cell Lineage and Studies on Disease Susceptibility. PLOS ONE 7(7): e41361. PMID: 22848472
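The figures on this page place each sample in two dimensions based on its methylation values at the marker CpGs. As a rough illustration of that kind of embedding (not the code used for this page, which is linked at the top), the sketch below runs scikit-learn's `TSNE` on a synthetic samples-by-CpG matrix of beta values. The toy matrix, the number of samples per cell type, the noise level, the perplexity and the random seed are all assumptions made for the example.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Assumed toy design: 5 cell types x 8 samples each, 1173 marker CpGs.
cell_types = ["CD4 T", "CD8 T", "B", "Monocyte", "Granulocyte"]
samples_per_type = 8
n_markers = 1173

# Each cell type gets its own random pattern of low (0.1) or high (0.9) methylation.
profiles = rng.choice([0.1, 0.9], size=(len(cell_types), n_markers))

# Repeat each profile for its samples, add noise, and keep beta values in [0, 1].
betas = np.repeat(profiles, samples_per_type, axis=0)
betas = np.clip(betas + rng.normal(0.0, 0.05, size=betas.shape), 0.0, 1.0)
labels = np.repeat(cell_types, samples_per_type)

# Perplexity must be smaller than the number of samples (40 here).
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(betas)  # one (x, y) pair per sample

# Report where each synthetic cell type lands in the 2D embedding.
for cell_type in cell_types:
    centre = embedding[labels == cell_type].mean(axis=0)
    print(f"{cell_type:12s} centre at ({centre[0]:.1f}, {centre[1]:.1f})")
```

With real data, `betas` would be replaced by the beta values of the marker CpGs for the samples of interest, and the perplexity would likely need tuning to the number of samples.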
Exploring cell-specific markers in a publicly available expression data set
The figure below depicts expression (RNA-Seq) data from cell-sorted populations: GSE107011 - Xu W, Monaco G, Wong EH, Tan WLW et al. (2019) Mapping of γ/δ T cells reveals Vδ2+ T cells resistance to senescence. EBioMedicine PMID: 30528453
- the data set used also profiles sub-types of specific populations:
- TE = Terminal Effector
- EM = Effector Memory
- CM = Central Memory
- NC_mono = Non Classical monocytes
- I_mono = Intermediate monocytes
- C_mono = Classical monocytes
- B_SM = Switched memory B cells
- B_Ex = Exhausted B cells
- B_NSM = Non-switched memory B cells
- in the expression data plotted above, the cell populations (i.e. B, T, Monocyte and Natural Killer) cluster clearly, and there is also a degree of clustering within each population.
- for CD8 cytotoxic T cells there is a clear separation between CM/naive and TE/EM sub-types.
- a similar pattern can be observed for the monocyte cluster: when zoomed in, three distinct groups can be seen, one for each sub-type.
- this also holds true for the CD4 T cells, with the Th17 sub-type clearly clustering apart from Th1/Th2 cells.
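When labelling or colouring points by sub-type, the abbreviations listed above can be kept in a small lookup table. The sketch below is one possible way to do this. The names `SUBTYPE_NAMES`, `PARENT_POPULATION` and `parent_population` are hypothetical, and the grouping of sub-types into parent populations is an assumption inferred from the abbreviations themselves (the TE, EM and CM labels are left ungrouped because the list does not tie them to a single population).

```python
# Sub-type abbreviations as listed above, mapped to their full names.
SUBTYPE_NAMES = {
    "TE": "Terminal Effector",
    "EM": "Effector Memory",
    "CM": "Central Memory",
    "NC_mono": "Non Classical monocytes",
    "I_mono": "Intermediate monocytes",
    "C_mono": "Classical monocytes",
    "B_SM": "Switched memory B cells",
    "B_Ex": "Exhausted B cells",
    "B_NSM": "Non-switched memory B cells",
}

# Assumed grouping into parent populations, inferred from the abbreviations.
PARENT_POPULATION = {
    "NC_mono": "Monocyte",
    "I_mono": "Monocyte",
    "C_mono": "Monocyte",
    "B_SM": "B",
    "B_Ex": "B",
    "B_NSM": "B",
}


def parent_population(abbreviation):
    """Return the assumed parent population, or None if it is not grouped here."""
    return PARENT_POPULATION.get(abbreviation)


for abbreviation, full_name in SUBTYPE_NAMES.items():
    parent = parent_population(abbreviation) or "not grouped"
    print(f"{abbreviation:8s} {full_name:30s} {parent}")
```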
Testing - large visualisations
This section explores running t-SNE on a large number of methylation CpG sites and then visualising the results interactively.
[TESTING] - this link explores a very large data set using bokeh; be warned, there be dragons!