Getting Started with Seurat
The input to Seurat is a gene expression matrix in which the rows are genes and the columns are single cells. To get started, install the software and load the package.
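To make the matrix orientation concrete, here is a minimal sketch in plain Python (not Seurat code). The gene names, cell labels, and counts are invented for illustration, and the two helper functions are hypothetical names, not part of any library.

```python
# Toy gene-by-cell count matrix: each row is a gene, each column is a single cell.
# All names and values below are made up for illustration.
genes = ["CD3D", "MS4A1", "LYZ"]
cells = ["cell_1", "cell_2", "cell_3", "cell_4"]
counts = [
    [5, 0, 3, 0],  # CD3D
    [0, 7, 0, 0],  # MS4A1
    [1, 0, 0, 9],  # LYZ
]


def counts_per_cell(matrix):
    """Sum each column: total counts observed in each cell."""
    return [sum(column) for column in zip(*matrix)]


def genes_detected_per_cell(matrix):
    """Count nonzero entries in each column: genes detected in each cell."""
    return [sum(1 for value in column if value > 0) for column in zip(*matrix)]


totals = dict(zip(cells, counts_per_cell(counts)))
detected = dict(zip(cells, genes_detected_per_cell(counts)))
print(totals)
print(detected)
```

Per-cell summaries like these come from column operations, while per-gene summaries come from row operations. That is why the orientation of the input matrix matters.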
The first tutorial walks through using Seurat to analyze a dataset of 2,700 Peripheral Blood Mononuclear Cells (PBMCs) made publicly available by 10X Genomics. The raw data can be found here. The tutorial is also available as an R markdown file here. This is the most detailed tutorial, and new users should start with it.
The first command list below analyzes a dataset of 33,000 PBMCs, also made publicly available by 10X Genomics. This command list can be helpful for users who want to analyze larger single-cell datasets. The raw data can be found here and the final object here.
The second command list analyzes a dataset of 8,500 single cells from human pancreas made available by the Yanai lab. The raw data can be found here and the final object here.
Seurat combines dimensionality reduction and graph-based partitioning algorithms for unsupervised clustering of single cells. In brief, the approach consists of the following steps:
- Identification of highly variable genes
- Linear dimensionality reduction (PCA) on variable genes
- Determination of significant principal components
- Graph-based clustering to classify distinct groups of cells
- Non-linear dimensionality reduction (t-SNE) for cluster visualization
- Marker discovery, visualization, and downstream analysis
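The first step in the list, identifying highly variable genes, can be illustrated on a toy matrix. The sketch below is plain Python and assumes a simple variance-to-mean ratio as the dispersion score. It is not Seurat's implementation, and the function names, gene names, and values are hypothetical.

```python
from statistics import mean, pvariance


def dispersion(values):
    """Variance-to-mean ratio for one gene; 0.0 if the gene has no counts."""
    m = mean(values)
    return pvariance(values) / m if m > 0 else 0.0


def top_variable_genes(matrix, gene_names, n_top):
    """Rank genes (rows) by dispersion and return the names of the top n_top."""
    scored = [(dispersion(row), name) for row, name in zip(matrix, gene_names)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:n_top]]


# Rows are genes, columns are cells (invented values).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
expr = [
    [2, 2, 2, 2],    # constant across cells: no variability
    [0, 0, 8, 8],    # on in half of the cells
    [1, 2, 1, 2],    # small fluctuations
    [0, 10, 0, 10],  # strongly on in half of the cells
]

variable = top_variable_genes(expr, genes, n_top=2)
print(variable)
```

Restricting the later steps (PCA, clustering, visualization) to genes selected in this way focuses them on genes whose expression differs between cells, rather than on genes that are roughly constant across the dataset.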