Developed in collaboration with the Technology Innovation Group at NYGC, Cell Hashing uses oligo-tagged antibodies against ubiquitously expressed surface proteins to place a "sample barcode" on each single cell, enabling different samples to be multiplexed together and run in a single experiment. For more information, please refer to this paper.
This vignette gives a brief demonstration of how to work with data produced with Cell Hashing in Seurat. Applied to two datasets, we can successfully demultiplex cells to their sample of origin and identify cross-sample doublets.
Read in data
# Load in the UMI matrix
pbmc.umis <- readRDS("../data/pbmc_umi_mtx.rds")

# For generating a hashtag count matrix from FASTQ files, please refer to
# https://github.com/Hoohm/CITE-seq-Count.
# Load in the HTO count matrix
pbmc.htos <- readRDS("../data/pbmc_hto_mtx.rds")

# Select cell barcodes detected by both RNA and HTO. In the example datasets we have
# already filtered the cells for you, but we perform this step for clarity.
joint.bcs <- intersect(colnames(pbmc.umis), colnames(pbmc.htos))

# Subset RNA and HTO counts by joint cell barcodes
pbmc.umis <- pbmc.umis[, joint.bcs]
pbmc.htos <- as.matrix(pbmc.htos[, joint.bcs])

# Confirm that the HTOs have the correct names
rownames(pbmc.htos)
## [1] "HTO_A" "HTO_B" "HTO_C" "HTO_D" "HTO_E" "HTO_F" "HTO_G" "HTO_H"
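The barcode-matching step can be tried on toy data before running it on real matrices. The sketch below uses two small made-up matrices (the gene names, hashtag names, and barcodes are hypothetical) to show that intersect() keeps only barcodes present in both modalities and that subsetting by the shared vector puts the columns in the same order.

```r
# Toy RNA and HTO matrices; all names and values are made up for illustration
rna <- matrix(1:6, nrow = 2,
              dimnames = list(c("geneA", "geneB"), c("AAAC", "AAAG", "AACT")))
hto <- matrix(1:6, nrow = 2,
              dimnames = list(c("HTO_A", "HTO_B"), c("AACT", "AAAC", "TTTG")))

# Barcodes seen in both modalities, in the order they appear in the RNA matrix
joint <- intersect(colnames(rna), colnames(hto))

# drop = FALSE keeps the result a matrix even if only one barcode is shared
rna <- rna[, joint, drop = FALSE]
hto <- hto[, joint, drop = FALSE]

# After subsetting, both matrices describe the same cells in the same order
stopifnot(identical(colnames(rna), colnames(hto)))
```

Checking that the column names are identical, not just that the dimensions agree, guards against silently pairing RNA counts with the wrong cell's hashtag counts.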
Setup Seurat object and add in the HTO data
# Load Seurat, which provides the functions used below
library(Seurat)

# Set up the Seurat object
pbmc.hashtag <- CreateSeuratObject(counts = pbmc.umis)

# Normalize RNA data with log normalization
pbmc.hashtag <- NormalizeData(pbmc.hashtag)

# Find and scale variable features
pbmc.hashtag <- FindVariableFeatures(pbmc.hashtag, selection.method = "mean.var.plot")
pbmc.hashtag <- ScaleData(pbmc.hashtag, features = VariableFeatures(pbmc.hashtag))
You can read more about working with multi-modal data here
# Add HTO data as a new assay independent from RNA
pbmc.hashtag[["HTO"]] <- CreateAssayObject(counts = pbmc.htos)

# Normalize HTO data; here we use the centered log-ratio (CLR) transformation
pbmc.hashtag <- NormalizeData(pbmc.hashtag, assay = "HTO", normalization.method = "CLR")
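To give a feel for what a centered log-ratio transformation does to hashtag counts, here is a minimal base R sketch on made-up data. It uses a pseudocount-style variant (log1p in place of log, so zero counts are allowed) and applies it per hashtag across cells. The helper clr() and the toy matrix are illustrative assumptions, not Seurat's internal code; the exact formula and the margin Seurat normalizes over (see the margin argument of NormalizeData()) should be checked in the documentation for your installed version.

```r
# Hypothetical helper: a pseudocount-style centered log-ratio for one vector.
# Each value is divided by a geometric-mean-like term before taking log1p.
clr <- function(x) {
  log1p(x / exp(sum(log1p(x[x > 0])) / length(x)))
}

# Toy HTO counts: rows are hashtags, columns are cells (values are made up)
hto <- matrix(
  c(120,   3,   0,  95,
      4, 150,   1, 110),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("HTO_A", "HTO_B"), paste0("cell", 1:4))
)

# Apply the transform to each hashtag (row) across cells; apply() returns the
# result transposed, so t() restores the hashtags-by-cells layout
hto.clr <- t(apply(hto, 1, clr))
hto.clr
```

After the transform, values for different hashtags sit on a comparable scale, which is what makes a per-hashtag positive/negative cutoff meaningful.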
Here we use the Seurat function HTODemux() to assign single cells back to their sample origins.
# If you have a very large dataset we suggest using kfunc = "clara". This is a k-medoid
# clustering function for large applications. You can also play with additional parameters
# (see the documentation for HTODemux()) to adjust the threshold for classification.
# Here we are using the default settings.
pbmc.hashtag <- HTODemux(pbmc.hashtag, assay = "HTO", positive.quantile = 0.99)
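The global classification can be understood as counting, for each cell, how many hashtags exceed their cutoff. The sketch below illustrates only that counting logic on made-up counts with hypothetical, hand-picked cutoffs. HTODemux() itself derives each cutoff from the data, as the positive.quantile of an inferred "negative" distribution for that hashtag, so this is a simplification and not the function's implementation.

```r
# Toy HTO counts: rows are hashtags, columns are cells (all values are made up)
hto <- matrix(
  c(120,   3,   2,  95,
      4, 150,   1, 110),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("HTO_A", "HTO_B"), paste0("cell", 1:4))
)

# Hypothetical per-hashtag cutoffs; HTODemux() estimates its own from the data
cutoff <- c(HTO_A = 20, HTO_B = 20)

# TRUE where a cell's count exceeds the cutoff of the corresponding hashtag (row)
positive <- sweep(hto, 1, cutoff, ">")

# Number of hashtags each cell is positive for
n.positive <- colSums(positive)

# 0 positive hashtags: Negative; exactly 1: Singlet; more than 1: Doublet
global <- ifelse(n.positive == 0, "Negative",
                 ifelse(n.positive == 1, "Singlet", "Doublet"))
global
```

Raising the cutoffs (in HTODemux(), raising positive.quantile) makes it harder for a cell to be called positive, which tends to move cells from Doublet toward Singlet and from Singlet toward Negative.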
Output from running HTODemux() is saved in the object metadata. We can visualize how many cells are classified as singlets, doublets and negative/ambiguous cells.
# Global classification results
table(pbmc.hashtag$HTO_classification.global)
## 
##  Doublet Negative  Singlet 
##     2598      346    13972
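Raw counts are easier to compare across experiments as percentages. The short sketch below recomputes them from the three counts reported above; the vector is typed in by hand here, whereas on the real object the same result could be obtained by wrapping the table() call in prop.table().

```r
# Counts copied from the classification table above
counts <- c(Doublet = 2598, Negative = 346, Singlet = 13972)

# Total number of classified cells
total <- sum(counts)

# Percentage of cells in each class, rounded to one decimal place
pct <- round(100 * counts / total, 1)
pct
```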
Visualize enrichment for selected HTOs with ridge plots
# Group cells based on the max HTO signal
Idents(pbmc.hashtag) <- "HTO_maxID"
RidgePlot(pbmc.hashtag, assay = "HTO", features = rownames(pbmc.hashtag[["HTO"]])[1:2], ncol = 2)