Mapping to multimodal reference datasets
Compiled: October 12, 2020
Intro: Seurat v4 Reference Mapping
This vignette introduces the process of mapping query datasets to annotated references in Seurat. In this example, we map one of the first scRNA-seq datasets released by 10X Genomics of 2,700 PBMC to our recently described CITE-seq reference of 162,000 PBMC measured with 228 antibodies. We chose this example to demonstrate how supervised analysis guided by a reference dataset can help to enumerate cell states that would be challenging to find with unsupervised analysis. In a second example, we demonstrate how to serially map Human Cell Atlas datasets of human BMNC profiled from different individuals onto a consistent reference.
We have previously demonstrated how to use a reference-mapping approach to annotate cell labels in a query dataset. In Seurat v4, we have substantially improved the speed and memory requirements for integrative tasks including reference mapping, and have also added new functionality to project query cells onto a previously computed UMAP visualization.
In this vignette, we demonstrate how to use a previously established reference to interpret an scRNA-seq query:
- Annotate each query cell based on a set of reference-defined cell states
- Project each query cell onto a previously computed UMAP visualization
- Impute the predicted levels of surface proteins that were measured in the CITE-seq reference
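The three tasks above can be sketched as one small wrapper function. This is a minimal sketch, not the vignette's own code: it assumes the `FindTransferAnchors()` and `MapQuery()` functions as they exist in released Seurat v4, an SCT-normalized reference, a reference reduction named `spca`, and an assay named `ADT`. The wrapper name `map_to_reference` is hypothetical, and only `wnn.umap` and `celltype.l2` appear in this text.

```r
# Sketch only: the wrapper name and most argument values are assumptions.
map_to_reference <- function(query, reference) {
  # Normalize the query with SCTransform (assumes the reference was SCT-normalized too).
  query <- Seurat::SCTransform(query, verbose = FALSE)

  # Find anchors between the reference and the query.
  anchors <- Seurat::FindTransferAnchors(
    reference = reference,
    query = query,
    normalization.method = "SCT",
    reference.reduction = "spca",  # assumed name of the reference reduction
    dims = 1:50                    # assumed number of dimensions
  )

  # Transfer labels, impute protein levels, and project onto the reference UMAP.
  Seurat::MapQuery(
    anchorset = anchors,
    query = query,
    reference = reference,
    refdata = list(
      celltype.l2 = "celltype.l2",  # annotate query cells
      predicted_ADT = "ADT"         # impute surface proteins (assumed assay name)
    ),
    reference.reduction = "spca",
    reduction.model = "wnn.umap"    # project onto the pre-computed UMAP
  )
}
```

Defining the function does not require Seurat to be loaded; the package is only needed once `map_to_reference(query, reference)` is actually called on two Seurat objects.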
To run this vignette, please install Seurat v4, available as a beta release on our GitHub page. Additionally, you will need to install the latest version of the uwot package and the SeuratDisk package.
remotes::install_github("satijalab/seurat", ref = "release/4.0.0")
remotes::install_github("jlmelville/uwot")
remotes::install_github("mojaveazure/seurat-disk")
library(Seurat)
library(SeuratDisk)
library(ggplot2)
library(patchwork)
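Before continuing, it can be useful to confirm that the required packages are installed and recent enough. The helper below is a hypothetical sketch in base R, and the minimum versions are placeholder assumptions to adjust as needed. In particular, beta builds may carry a development version number below 4.0.0, which is why the Seurat threshold is set conservatively.

```r
# Hypothetical helper: report whether a package is installed and meets a minimum version.
check_pkg <- function(pkg, min_version = "0.0.0") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return("missing")
  }
  if (utils::packageVersion(pkg) < min_version) "outdated" else "ok"
}

# Minimum versions here are assumptions, not requirements stated by the vignette.
required <- c(Seurat = "3.9.9", SeuratDisk = "0.0.0", uwot = "0.0.0")
status <- vapply(
  names(required),
  function(p) check_pkg(p, required[[p]]),
  character(1)
)
print(status)
```

Any package reported as `missing` or `outdated` can be reinstalled with the `remotes::install_github()` calls shown above.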
A Multimodal PBMC Reference Dataset
We load the reference (download here) from our recent preprint and visualize the pre-computed UMAP. This reference is stored as an h5Seurat file, a format that enables on-disk storage of multimodal Seurat objects (more details on h5Seurat and SeuratDisk can be found here).
reference <- LoadH5Seurat("../data/pbmc_multimodal.h5seurat")
DimPlot(object = reference, reduction = "wnn.umap", group.by = "celltype.l2", label = TRUE, label.size = 3, repel = TRUE) + NoLegend()
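After loading, a quick look at what the reference contains can help confirm that the expected assays, reductions, and annotations are present. The helper below is a hypothetical sketch that assumes the `Assays()` and `Reductions()` accessors exported by Seurat and takes the `celltype.l2` metadata column from the plot call above.

```r
# Hypothetical helper: summarize the contents of a Seurat reference object.
summarize_reference <- function(obj, label_col = "celltype.l2") {
  list(
    n_cells = ncol(obj),                                    # cells are stored as columns
    assays = Seurat::Assays(obj),                           # names of the stored assays
    reductions = Seurat::Reductions(obj),                   # names of the stored reductions
    cells_per_label = table(obj[[label_col, drop = TRUE]])  # cell count per annotation
  )
}

# Usage, once `reference` has been loaded as above:
# str(summarize_reference(reference))
```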