About Install Vignettes Extensions FAQs Contact Search
Mapping to multimodal reference datasets

Intro: Seurat v4 Reference Mapping

This vignette introduces the process of mapping query datasets to annotated references in Seurat. In this example, we map one of the first scRNA-seq datasets released by 10X Genomics of 2,700 PBMC to our recently described CITE-seq reference of 162,000 PBMC measured with 228 antibodies. We chose this example to demonstrate how supervised analysis guided by a reference dataset can help to enumerate cell states that would be challenging to find with unsupervised analysis. In a second example, we demonstrate how to serially map Human Cell Atlas datasets of human BMNC profiled from different individuals onto a consistent reference.

We have previously demonstrated how to use reference-mapping approach to annotate cell labels in a query dataset . In Seurat v4, we have substantially improved the speed and memory requirements for integrative tasks including reference mapping, and also include new functionality to project query cells onto a previously computed UMAP visualization.

In this vignette, we demonstrate how to use a previously established reference to interpret an scRNA-seq query:

  • Annotate each query cell based on a set of reference-defined cell states
  • Project each query cell onto a previously computed UMAP visualization
  • Impute the predicted levels of surface proteins that were measured in the CITE-seq reference

To run this vignette please install Seurat v4, available as a beta release on our github page. Additionally, you will need to install the latest version of the uwot package and the SeuratDisk package.

remotes::install_github("satijalab/seurat", ref = "release/4.0.0")
remotes::install_github("jlmelville/uwot")
remotes::install_github("mojaveazure/seurat-disk")
library(Seurat)
library(SeuratDisk)
library(ggplot2)
library(patchwork)

PBMC

A Multimodal PBMC Reference Dataset

We load the reference (download here) from our recent preprint, and visualize the pre-computed UMAP. This reference is stored as an h5Seurat file, a format that enables on-disk storage of multimodal Seurat objects (more details on h5Seurat and SeuratDisk can be found here).

reference <- LoadH5Seurat("../data/pbmc_multimodal.h5seurat")
DimPlot(object = reference, reduction = "wnn.umap", group.by = "celltype.l2", label = TRUE, label.size = 3, repel = TRUE) + NoLegend()