The Signac package is an extension of Seurat designed for the analysis of genomic single-cell assays. This includes any assay that generates signal mapped to genomic coordinates, such as scATAC-seq, scCUT&Tag, and scACT-seq.

Because the analysis of single-cell chromatin datasets presents challenges not encountered with scRNA-seq data, we have created an extended Assay class to store the additional information needed, including:

  • Genomic ranges associated with the features (e.g., peaks or genomic bins)
  • Gene annotations
  • Genome information
  • TF motifs
  • Genome-wide signal in a disk-based format (fragment files)
  • TF footprinting data
  • Tn5 insertion bias data
  • Linked genomic regions

A major advantage of the Signac design is its interoperability with existing functions in the Seurat package, and other packages that are able to use the Seurat object. This enables straightforward analysis of multimodal single-cell data through the addition of different assays to the Seurat object.

Here we outline the design of each class defined in the Signac package, and demonstrate methods that can be run on each class.

The ChromatinAssay Class

The ChromatinAssay class extends the standard Seurat Assay class with slots for data used in the analysis of single-cell chromatin datasets. It includes all the slots present in a standard Seurat Assay, plus the following:

  • ranges: A GRanges object containing the genomic coordinates of each feature in the data matrix
  • motifs: A Motif object containing TF motif information
  • fragments: A list of Fragment objects, each referring to a fragment file on disk
  • seqinfo: A Seqinfo object containing information about the genome that the data was mapped to
  • annotation: A GRanges object containing gene annotations
  • bias: A vector containing Tn5 integration bias information (the frequency of Tn5 integration at different hexamers)
  • positionEnrichment: A named list of matrices containing positional enrichment scores for Tn5 integration (for example, enrichment at the TSS or at different TF motifs)
  • links: A GRanges object describing linked genomic positions, such as co-accessible sites or enhancer-gene regulatory relationships
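
As a quick check, the slot names can be read from the class definition itself. This is a minimal sketch that assumes Signac is installed; the variable names are illustrative.

```r
library(Signac)

# slots defined on the ChromatinAssay class (standard Assay slots included)
chromatin_slots <- slotNames("ChromatinAssay")

# the additional slots listed above
extra_slots <- c("ranges", "motifs", "fragments", "seqinfo", "annotation",
                 "bias", "positionEnrichment", "links")

# TRUE for each name that is a slot of the class
extra_slots %in% chromatin_slots
```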

Constructing the ChromatinAssay

A ChromatinAssay object can be constructed using the CreateChromatinAssay() function.

# load the packages used in the following examples
library(Seurat)
library(Signac)

# get some data to use in the following examples
counts <- GetAssayData(atac_small, slot = "counts")
# create a standalone ChromatinAssay object
chromatinassay <- CreateChromatinAssay(counts = counts, genome = "hg19")

The genome parameter sets the seqinfo slot. We can pass the name of a genome available from UCSC (e.g., “hg19” or “mm10”), or we can pass a Seqinfo-class object.
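
As a sketch of the second option, a Seqinfo object can be built by hand with GenomeInfoDb and passed as genome. The example assumes the atac_small dataset shipped with Signac, whose features all lie on chr1. The names chr1_info and assay_from_seqinfo are illustrative, and the chr1 length is the hg19 value reported by seqlengths() later in this document.

```r
library(Seurat)
library(Signac)
library(GenomeInfoDb)

counts <- GetAssayData(atac_small, slot = "counts")

# describe the genome by hand: one sequence, chr1 of hg19
chr1_info <- Seqinfo(
  seqnames = "chr1",
  seqlengths = 249250621,
  isCircular = FALSE,
  genome = "hg19"
)

# pass the Seqinfo object instead of a genome name
assay_from_seqinfo <- CreateChromatinAssay(counts = counts, genome = chr1_info)

# inspect the stored genome information
seqinfo(assay_from_seqinfo)
```

Passing a Seqinfo object avoids looking the genome up by name, which may be useful for genomes that are not available from UCSC.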

To create a Seurat object that contains a ChromatinAssay rather than a standard Assay, we can initialize the object using the ChromatinAssay rather than a count matrix. Note that this feature was added in Seurat 3.2.

# create a Seurat object containing a ChromatinAssay
object <- CreateSeuratObject(counts = chromatinassay)

Adding a ChromatinAssay to a Seurat object

To add a new ChromatinAssay object to an existing Seurat object, we can use the same double-bracket assignment that is used to add standard Assay objects and other data types to the Seurat object.

# create a chromatin assay and add it to an existing Seurat object
object[["peaks"]] <- CreateChromatinAssay(counts = counts, genome = "hg19")
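
A short sketch of checking the result, assuming the atac_small dataset shipped with Signac. The assay name "peaks2" is arbitrary, and the genome argument is omitted here to keep the example small.

```r
library(Seurat)
library(Signac)

counts <- GetAssayData(atac_small, slot = "counts")

# start from a Seurat object that holds one ChromatinAssay
object <- CreateSeuratObject(
  counts = CreateChromatinAssay(counts = counts),
  assay = "peaks"
)

# add a second ChromatinAssay under a new name
object[["peaks2"]] <- CreateChromatinAssay(counts = counts)

# list the assays and confirm the class of the new one
Assays(object)
is(object[["peaks2"]], "ChromatinAssay")

# make the new assay the default
DefaultAssay(object) <- "peaks2"
```

Setting the default assay matters for the Bioconductor-style methods described later, which operate on the default assay when called on a Seurat object.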

Getting and setting ChromatinAssay data

We can get/set data for the ChromatinAssay in much the same way we do for a standard Assay object: using the GetAssayData and SetAssayData functions defined in Seurat. For example:

# Getting data

# access the data slot, found in standard Assays and ChromatinAssays
data <- GetAssayData(atac_small, slot = "data")

# access the bias slot, unique to the ChromatinAssay
bias <- GetAssayData(atac_small, slot = "bias")

# Setting data

# set the data slot
atac_small <- SetAssayData(atac_small, slot = "data", new.data = data)

# set the bias slot
bias <- rep(1, 100)  # create a dummy bias vector
atac_small <- SetAssayData(atac_small, slot = "bias", new.data = bias)

We also have a variety of convenience functions defined for getting/setting data in specific slots. These include the Fragments(), Motifs(), Links(), and Annotation() functions. For example, to get or set gene annotation data we can use the Annotation() getter and Annotation<- setter functions:

# first get some gene annotations for hg19
library(EnsDb.Hsapiens.v75)

# convert EnsDb to GRanges
gene.ranges <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v75)

# convert to UCSC style
seqlevelsStyle(gene.ranges) <- "UCSC"
genome(gene.ranges) <- "hg19"

# set gene annotations
Annotation(atac_small) <- gene.ranges

# get gene annotation information
Annotation(atac_small)
## GRanges object with 3072120 ranges and 5 metadata columns:
##                   seqnames        ranges strand |           tx_id   gene_name
##                      <Rle>     <IRanges>  <Rle> |     <character> <character>
##   ENSE00001489430     chrX 192989-193061      + | ENST00000399012      PLCXD1
##   ENSE00001536003     chrX 192991-193061      + | ENST00000484611      PLCXD1
##   ENSE00002160563     chrX 193020-193061      + | ENST00000430923      PLCXD1
##   ENSE00001750899     chrX 197722-197788      + | ENST00000445062      PLCXD1
##   ENSE00001489388     chrX 197859-198351      + | ENST00000381657      PLCXD1
##               ...      ...           ...    ... .             ...         ...
##   ENST00000361739    chrMT     7586-8269      + | ENST00000361739      MT-CO2
##   ENST00000361789    chrMT   14747-15887      + | ENST00000361789      MT-CYB
##   ENST00000361851    chrMT     8366-8572      + | ENST00000361851     MT-ATP8
##   ENST00000361899    chrMT     8527-9207      + | ENST00000361899     MT-ATP6
##   ENST00000362079    chrMT     9207-9990      + | ENST00000362079      MT-CO3
##                           gene_id   gene_biotype     type
##                       <character>    <character> <factor>
##   ENSE00001489430 ENSG00000182378 protein_coding     exon
##   ENSE00001536003 ENSG00000182378 protein_coding     exon
##   ENSE00002160563 ENSG00000182378 protein_coding     exon
##   ENSE00001750899 ENSG00000182378 protein_coding     exon
##   ENSE00001489388 ENSG00000182378 protein_coding     exon
##               ...             ...            ...      ...
##   ENST00000361739 ENSG00000198712 protein_coding      cds
##   ENST00000361789 ENSG00000198727 protein_coding      cds
##   ENST00000361851 ENSG00000228253 protein_coding      cds
##   ENST00000361899 ENSG00000198899 protein_coding      cds
##   ENST00000362079 ENSG00000198938 protein_coding      cds
##   -------
##   seqinfo: 25 sequences from hg19 genome

The Fragments(), Motifs(), and Links() functions are demonstrated in other sections below.

Other ChromatinAssay methods

As the ChromatinAssay object uses Bioconductor objects like GRanges and Seqinfo, we can also call standard Bioconductor functions defined in the IRanges, GenomicRanges, and GenomeInfoDb packages on the ChromatinAssay object (or a Seurat object with a ChromatinAssay as the default assay).

The following methods use the genomic ranges stored in a ChromatinAssay object.

# extract the genomic ranges associated with each feature in the data matrix
granges(atac_small)
## GRanges object with 323 ranges and 0 metadata columns:
##         seqnames          ranges strand
##            <Rle>       <IRanges>  <Rle>
##     [1]     chr1   713460-714823      *
##     [2]     chr1   752422-753038      *
##     [3]     chr1   762106-763359      *
##     [4]     chr1   779589-780271      *
##     [5]     chr1   804872-805761      *
##     ...      ...             ...    ...
##   [319]     chr1 9299648-9300348      *
##   [320]     chr1 9327071-9327557      *
##   [321]     chr1 9335457-9336176      *
##   [322]     chr1 9349019-9350779      *
##   [323]     chr1 9352328-9354391      *
##   -------
##   seqinfo: 1 sequence from an unspecified genome; no seqlengths

# find the nearest range
nearest(atac_small, subject = Annotation(atac_small))
##   [1] 353132 313545 353180 353181 316856 352756 352756 158070 443435 429683
##  [11] 384995 416914 424674 158278 158279 433593 158289 158292 416882 416846
##  [21] 330101 416851 158359 158360 368091 367827 332919 370483 158535 308473
##  [31] 158541 158542 363998 355291 433871 416767 323350 416760 158690 372319
##  [41] 364664 431403 439123 416719 432709 427719 355667 416689 443324 386709
##  [51] 434670 303555 432660 416683 381736 159871 416657 423129 392261 365826
##  [61] 422594 159971 436548 343096 436566 355859 416566 332077 332077 363936
##  [71] 370347 351939 160801 416517 416520 338322 363920 326578 326578 327323
##  [81] 327323 327323 327323 327323 327323 327323 327323 327323 327323 367095
##  [91] 160980 160980 434820 434837 443759 324844 161135 431471 340026 340026
## [101] 340026 340026 369496 340577 340577 340577 341621 435222 435223 432912
## [111] 443427 426548 416453 335191 442153 161420 416439 342162 342162 342162
## [121] 342162 303306 363897 352608 416438 436339 428684 317981 428699 428703
## [131] 342827 435474 416332 161829 428988 339920 161934 161934 440061 306261
## [141] 306263 351669 162155 162156 363861 342747 363853 434544 380558 334073
## [151] 334074 433697 162510 162510 416213 363843 422863 352374 352374 352399
## [161] 352402 162725 375321 442859 416141 303012 416142 416109 434615 430550
## [171] 391633 391635 315910 162940 162949 162963 162981 162981 307713 371219
## [181] 338832 338832 330394 330394 375499 353291 352159 352159 352159 375501
## [191] 416050 434442 435105 376297 415948 415966 434227 415947 363703 352789
## [201] 352789 415912 352790 444057 163718 363689 326449 415881 302688 421541
## [211] 432843 330143 430627 436030 442123 439413 363667 440058 415745 434148
## [221] 363657 435144 415713 434010 349513 423368 164491 309751 164496 442737
## [231] 314928 314928 314934 314934 314936 314936 314936 339456 376041 363644
## [241] 415671 363635 363635 328360 363637 164717 371613 371633 338776 338777
## [251] 433987 371077 316670 316670 316670 164797 164797 164798 164798 164798
## [261] 164802 310712 164803 164804 415606 355641 338270 338270 338270 338270
## [271] 338270 338270 431114 322988 322988 322988 431117 342023 342023 415605
## [281] 342786 331593 331593 331594 331594 369220 331594 331594 364635 353318
## [291] 435250 340755 371936 165021 165021 165023 165023 350877 415587 363619
## [301] 363619 418872 305795 434018 335496 335496 165154 363612 323389 323542
## [311] 420824 165164 165164 165167 165167 314147 314148 363609 443585 375482
## [321] 363611 165185 355274

# distance to the nearest range
distanceToNearest(atac_small, subject = Annotation(atac_small))
## Hits object with 323 hits and 1 metadata column:
##         queryHits subjectHits |  distance
##         <integer>   <integer> | <integer>
##     [1]         1      353132 |         0
##     [2]         2      313545 |         0
##     [3]         3      353180 |         0
##     [4]         4      353181 |         0
##     [5]         5      316856 |         0
##     ...       ...         ... .       ...
##   [319]       319      443585 |         0
##   [320]       320      375482 |         0
##   [321]       321      363611 |      4060
##   [322]       322      165185 |      2159
##   [323]       323      355274 |         0
##   -------
##   queryLength: 323 / subjectLength: 3072120

# find overlaps with another set of genomic ranges
findOverlaps(atac_small, subject = Annotation(atac_small))
## Hits object with 4615 hits and 0 metadata columns:
##          queryHits subjectHits
##          <integer>   <integer>
##      [1]         1      157933
##      [2]         1      157934
##      [3]         1      157935
##      [4]         1      157936
##      [5]         1      157937
##      ...       ...         ...
##   [4611]       320      363611
##   [4612]       320      375482
##   [4613]       323      165185
##   [4614]       323      270557
##   [4615]       323      355274
##   -------
##   queryLength: 323 / subjectLength: 3072120

Many other methods are defined; see the documentation for nearest-methods, findOverlaps-methods, inter-range-methods, and coverage in Signac for a full list.
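
For example, the indices returned by nearest() can be used to look up the closest annotated gene for each feature. This is a sketch that assumes gene annotations are already stored in the object, as set with Annotation<- earlier; annot, nearest_idx, and nearest_gene are illustrative names.

```r
library(Signac)
library(GenomicRanges)

annot <- Annotation(atac_small)

# index into the annotation of the closest range for each feature
nearest_idx <- nearest(atac_small, subject = annot)

# gene name of the closest annotation entry (NA if none was found)
nearest_gene <- annot$gene_name[nearest_idx]

# pair each feature with the gene name of its closest annotation entry
head(data.frame(feature = rownames(atac_small), gene = nearest_gene))
```

Signac also provides the ClosestFeature() function for this kind of lookup.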

The following methods use the seqinfo data stored in a ChromatinAssay object.

# get the full seqinfo information
seqinfo(atac_small)
## Seqinfo object with 298 sequences (2 circular) from hg19 genome:
##   seqnames           seqlengths isCircular genome
##   chr1                249250621      FALSE   hg19
##   chr2                243199373      FALSE   hg19
##   chr3                198022430      FALSE   hg19
##   chr4                191154276      FALSE   hg19
##   chr5                180915260      FALSE   hg19
##   ...                       ...        ...    ...
##   chr21_gl383580_alt      74652      FALSE   hg19
##   chr21_gl383581_alt     116690      FALSE   hg19
##   chr22_gl383582_alt     162811      FALSE   hg19
##   chr22_gl383583_alt      96924      FALSE   hg19
##   chr22_kb663609_alt      74013      FALSE   hg19

# get the genome information
genome(atac_small)
##                  chr1                  chr2                  chr3 
##                "hg19"                "hg19"                "hg19" 
##                  chr4                  chr5                  chr6 
##                "hg19"                "hg19"                "hg19" 
##                  chr7                  chr8                  chr9 
##                "hg19"                "hg19"                "hg19" 
##                 chr10                 chr11                 chr12 
##                "hg19"                "hg19"                "hg19" 
##                 chr13                 chr14                 chr15 
##                "hg19"                "hg19"                "hg19" 
##                 chr16                 chr17                 chr18 
##                "hg19"                "hg19"                "hg19" 
##                 chr19                 chr20                 chr21 
##                "hg19"                "hg19"                "hg19" 
##                 chr22                  chrX                  chrY 
##                "hg19"                "hg19"                "hg19" 
##                  chrM                 chrMT        chr4_ctg9_hap1 
##                "hg19"                "hg19"                "hg19" 
##         chr6_apd_hap1         chr6_cox_hap2         chr6_dbb_hap3 
##                "hg19"                "hg19"                "hg19" 
##        chr6_mann_hap4         chr6_mcf_hap5         chr6_qbl_hap6 
##                "hg19"                "hg19"                "hg19" 
##        chr6_ssto_hap7       chr17_ctg5_hap1  chr1_gl000191_random 
##                "hg19"                "hg19"                "hg19" 
##  chr1_gl000192_random  chr4_gl000193_random  chr4_gl000194_random 
##                "hg19"                "hg19"                "hg19" 
##  chr7_gl000195_random  chr8_gl000196_random  chr8_gl000197_random 
##                "hg19"                "hg19"                "hg19" 
##  chr9_gl000198_random  chr9_gl000199_random  chr9_gl000200_random 
##                "hg19"                "hg19"                "hg19" 
##  chr9_gl000201_random chr11_gl000202_random chr17_gl000203_random 
##                "hg19"                "hg19"                "hg19" 
## chr17_gl000204_random chr17_gl000205_random chr17_gl000206_random 
##                "hg19"                "hg19"                "hg19" 
## chr18_gl000207_random chr19_gl000208_random chr19_gl000209_random 
##                "hg19"                "hg19"                "hg19" 
## chr21_gl000210_random        chrUn_gl000211        chrUn_gl000212 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000213        chrUn_gl000214        chrUn_gl000215 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000216        chrUn_gl000217        chrUn_gl000218 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000219        chrUn_gl000220        chrUn_gl000221 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000222        chrUn_gl000223        chrUn_gl000224 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000225        chrUn_gl000226        chrUn_gl000227 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000228        chrUn_gl000229        chrUn_gl000230 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000231        chrUn_gl000232        chrUn_gl000233 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000234        chrUn_gl000235        chrUn_gl000236 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000237        chrUn_gl000238        chrUn_gl000239 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000240        chrUn_gl000241        chrUn_gl000242 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000243        chrUn_gl000244        chrUn_gl000245 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000246        chrUn_gl000247        chrUn_gl000248 
##                "hg19"                "hg19"                "hg19" 
##        chrUn_gl000249     chr1_gl383516_fix     chr1_gl383517_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr1_gl949741_fix     chr1_jh636052_fix     chr1_jh636053_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr1_jh636054_fix     chr1_jh806573_fix     chr1_jh806574_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr1_jh806575_fix     chr2_gl877870_fix     chr2_gl877871_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr2_kb663603_fix     chr3_gl383523_fix     chr3_gl383524_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr3_gl383525_fix     chr3_jh159131_fix     chr3_jh159132_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr3_ke332495_fix     chr4_gl582967_fix     chr4_gl877872_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr4_ke332496_fix     chr5_jh159133_fix     chr5_ke332497_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr6_jh636056_fix     chr6_jh636057_fix     chr6_jh806576_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr6_kb663604_fix     chr6_ke332498_fix     chr7_gl582968_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr7_gl582969_fix     chr7_gl582970_fix     chr7_gl582971_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr7_gl582972_fix     chr7_jh159134_fix     chr7_jh636058_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr7_ke332499_fix     chr8_gl383535_fix     chr8_gl383536_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr8_gl949743_fix     chr8_jh159135_fix     chr8_ke332500_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr9_gl339450_fix     chr9_gl383537_fix     chr9_gl383538_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr9_jh636059_fix     chr9_jh806577_fix     chr9_jh806578_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr9_jh806579_fix     chr9_kb663605_fix    chr10_gl383543_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr10_gl383544_fix    chr10_gl877873_fix    chr10_jh591181_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr10_jh591182_fix    chr10_jh591183_fix    chr10_jh636060_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr10_jh806580_fix    chr10_kb663606_fix    chr10_ke332501_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr11_gl582973_fix    chr11_gl949744_fix    chr11_jh159138_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr11_jh159139_fix    chr11_jh159140_fix    chr11_jh159141_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr11_jh159142_fix    chr11_jh159143_fix    chr11_jh591184_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr11_jh591185_fix    chr11_jh720443_fix    chr11_jh806581_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr12_gl383548_fix    chr12_gl582974_fix    chr12_jh720444_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr12_kb663607_fix    chr13_gl582975_fix    chr14_kb021645_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr15_jh720445_fix    chr16_jh720446_fix    chr17_gl383558_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr17_gl383559_fix    chr17_gl383560_fix    chr17_gl383561_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr17_gl383562_fix    chr17_gl582976_fix    chr17_jh159144_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr17_jh159145_fix    chr17_jh591186_fix    chr17_jh636061_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr17_jh720447_fix    chr17_jh806582_fix    chr17_kb021646_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr17_ke332502_fix    chr19_gl582977_fix    chr19_jh159149_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr19_kb021647_fix    chr19_ke332505_fix    chr20_gl582979_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr20_jh720448_fix    chr20_kb663608_fix    chr21_ke332506_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr22_jh720449_fix    chr22_jh806583_fix    chr22_jh806584_fix 
##                "hg19"                "hg19"                "hg19" 
##    chr22_jh806585_fix    chr22_jh806586_fix     chrX_gl877877_fix 
##                "hg19"                "hg19"                "hg19" 
##     chrX_jh159150_fix     chrX_jh720451_fix     chrX_jh720452_fix 
##                "hg19"                "hg19"                "hg19" 
##     chrX_jh720453_fix     chrX_jh720454_fix     chrX_jh720455_fix 
##                "hg19"                "hg19"                "hg19" 
##     chrX_jh806587_fix     chrX_jh806588_fix     chrX_jh806589_fix 
##                "hg19"                "hg19"                "hg19" 
##     chrX_jh806590_fix     chrX_jh806591_fix     chrX_jh806592_fix 
##                "hg19"                "hg19"                "hg19" 
##     chrX_jh806593_fix     chrX_jh806594_fix     chrX_jh806595_fix 
##                "hg19"                "hg19"                "hg19" 
##     chrX_jh806596_fix     chrX_jh806597_fix     chrX_jh806598_fix 
##                "hg19"                "hg19"                "hg19" 
##     chrX_jh806599_fix     chrX_jh806600_fix     chrX_jh806601_fix 
##                "hg19"                "hg19"                "hg19" 
##     chrX_jh806602_fix     chrX_jh806603_fix     chrX_kb021648_fix 
##                "hg19"                "hg19"                "hg19" 
##     chr1_gl383518_alt     chr1_gl383519_alt     chr1_gl383520_alt 
##                "hg19"                "hg19"                "hg19" 
##     chr2_gl383521_alt     chr2_gl383522_alt     chr2_gl582966_alt 
##                "hg19"                "hg19"                "hg19" 
##     chr3_gl383526_alt     chr3_jh636055_alt     chr4_gl383527_alt 
##                "hg19"                "hg19"                "hg19" 
##     chr4_gl383528_alt     chr4_gl383529_alt     chr5_gl339449_alt 
##                "hg19"                "hg19"                "hg19" 
##     chr5_gl383530_alt     chr5_gl383531_alt     chr5_gl383532_alt 
##                "hg19"                "hg19"                "hg19" 
##     chr5_gl949742_alt     chr6_gl383533_alt     chr6_kb021644_alt 
##                "hg19"                "hg19"                "hg19" 
##     chr7_gl383534_alt     chr9_gl383539_alt     chr9_gl383540_alt 
##                "hg19"                "hg19"                "hg19" 
##     chr9_gl383541_alt     chr9_gl383542_alt    chr10_gl383545_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr10_gl383546_alt    chr11_gl383547_alt    chr11_jh159136_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr11_jh159137_alt    chr12_gl383549_alt    chr12_gl383550_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr12_gl383551_alt    chr12_gl383552_alt    chr12_gl383553_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr12_gl877875_alt    chr12_gl877876_alt    chr12_gl949745_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr15_gl383554_alt    chr15_gl383555_alt    chr16_gl383556_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr16_gl383557_alt    chr17_gl383563_alt    chr17_gl383564_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr17_gl383565_alt    chr17_gl383566_alt    chr17_jh159146_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr17_jh159147_alt    chr17_jh159148_alt    chr18_gl383567_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr18_gl383568_alt    chr18_gl383569_alt    chr18_gl383570_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr18_gl383571_alt    chr18_gl383572_alt    chr19_gl383573_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr19_gl383574_alt    chr19_gl383575_alt    chr19_gl383576_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr19_gl949746_alt    chr19_gl949747_alt    chr19_gl949748_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr19_gl949749_alt    chr19_gl949750_alt    chr19_gl949751_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr19_gl949752_alt    chr19_gl949753_alt    chr20_gl383577_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr21_gl383578_alt    chr21_gl383579_alt    chr21_gl383580_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr21_gl383581_alt    chr22_gl383582_alt    chr22_gl383583_alt 
##                "hg19"                "hg19"                "hg19" 
##    chr22_kb663609_alt 
##                "hg19"

# find length of each chromosome
seqlengths(atac_small)
##                  chr1                  chr2                  chr3 
##             249250621             243199373             198022430 
##                  chr4                  chr5                  chr6 
##             191154276             180915260             171115067 
##                  chr7                  chr8                  chr9 
##             159138663             146364022             141213431 
##                 chr10                 chr11                 chr12 
##             135534747             135006516             133851895 
##                 chr13                 chr14                 chr15 
##             115169878             107349540             102531392 
##                 chr16                 chr17                 chr18 
##              90354753              81195210              78077248 
##                 chr19                 chr20                 chr21 
##              59128983              63025520              48129895 
##                 chr22                  chrX                  chrY 
##              51304566             155270560              59373566 
##                  chrM                 chrMT        chr4_ctg9_hap1 
##                 16571                 16569                590426 
##         chr6_apd_hap1         chr6_cox_hap2         chr6_dbb_hap3 
##               4622290               4795371               4610396 
##        chr6_mann_hap4         chr6_mcf_hap5         chr6_qbl_hap6 
##               4683263               4833398               4611984 
##        chr6_ssto_hap7       chr17_ctg5_hap1  chr1_gl000191_random 
##               4928567               1680828                106433 
##  chr1_gl000192_random  chr4_gl000193_random  chr4_gl000194_random 
##                547496                189789                191469 
##  chr7_gl000195_random  chr8_gl000196_random  chr8_gl000197_random 
##                182896                 38914                 37175 
##  chr9_gl000198_random  chr9_gl000199_random  chr9_gl000200_random 
##                 90085                169874                187035 
##  chr9_gl000201_random chr11_gl000202_random chr17_gl000203_random 
##                 36148                 40103                 37498 
## chr17_gl000204_random chr17_gl000205_random chr17_gl000206_random 
##                 81310                174588                 41001 
## chr18_gl000207_random chr19_gl000208_random chr19_gl000209_random 
##                  4262                 92689                159169 
## chr21_gl000210_random        chrUn_gl000211        chrUn_gl000212 
##                 27682                166566                186858 
##        chrUn_gl000213        chrUn_gl000214        chrUn_gl000215 
##                164239                137718                172545 
##        chrUn_gl000216        chrUn_gl000217        chrUn_gl000218 
##                172294                172149                161147 
##        chrUn_gl000219        chrUn_gl000220        chrUn_gl000221 
##                179198                161802                155397 
##        chrUn_gl000222        chrUn_gl000223        chrUn_gl000224 
##                186861                180455                179693 
##        chrUn_gl000225        chrUn_gl000226        chrUn_gl000227 
##                211173                 15008                128374 
##        chrUn_gl000228        chrUn_gl000229        chrUn_gl000230 
##                129120                 19913                 43691 
##        chrUn_gl000231        chrUn_gl000232        chrUn_gl000233 
##                 27386                 40652                 45941 
##        chrUn_gl000234        chrUn_gl000235        chrUn_gl000236 
##                 40531                 34474                 41934 
##        chrUn_gl000237        chrUn_gl000238        chrUn_gl000239 
##                 45867                 39939                 33824 
##        chrUn_gl000240        chrUn_gl000241        chrUn_gl000242 
##                 41933                 42152                 43523 
##        chrUn_gl000243        chrUn_gl000244        chrUn_gl000245 
##                 43341                 39929                 36651 
##        chrUn_gl000246        chrUn_gl000247        chrUn_gl000248 
##                 38154                 36422                 39786 
##        chrUn_gl000249     chr1_gl383516_fix     chr1_gl383517_fix 
##                 38502                 49316                 49352 
##     chr1_gl949741_fix     chr1_jh636052_fix     chr1_jh636053_fix 
##                151551               7283150               1676126 
##     chr1_jh636054_fix     chr1_jh806573_fix     chr1_jh806574_fix 
##                758378                 24680                 22982 
##     chr1_jh806575_fix     chr2_gl877870_fix     chr2_gl877871_fix 
##                 47409                 66021                389939 
##     chr2_kb663603_fix     chr3_gl383523_fix     chr3_gl383524_fix 
##                599580                171362                 78793 
##     chr3_gl383525_fix     chr3_jh159131_fix     chr3_jh159132_fix 
##                 65063                393769                100694 
##     chr3_ke332495_fix     chr4_gl582967_fix     chr4_gl877872_fix 
##                263861                248177                297485 
##     chr4_ke332496_fix     chr5_jh159133_fix     chr5_ke332497_fix 
##                503215                266316                543325 
##     chr6_jh636056_fix     chr6_jh636057_fix     chr6_jh806576_fix 
##                262912                200195                273386 
##     chr6_kb663604_fix     chr6_ke332498_fix     chr7_gl582968_fix 
##                478993                149443                356330 
##     chr7_gl582969_fix     chr7_gl582970_fix     chr7_gl582971_fix 
##                251823                354970               1284284 
##     chr7_gl582972_fix     chr7_jh159134_fix     chr7_jh636058_fix 
##                327774               3821770                716227 
##     chr7_ke332499_fix     chr8_gl383535_fix     chr8_gl383536_fix 
##                274521                429806                203777 
##     chr8_gl949743_fix     chr8_jh159135_fix     chr8_ke332500_fix 
##                608579                102251                228602 
##     chr9_gl339450_fix     chr9_gl383537_fix     chr9_gl383538_fix 
##                330164                 62435                 49281 
##     chr9_jh636059_fix     chr9_jh806577_fix     chr9_jh806578_fix 
##                295379                 22394                169437 
##     chr9_jh806579_fix     chr9_kb663605_fix    chr10_gl383543_fix 
##                211307                155926                392792 
##    chr10_gl383544_fix    chr10_gl877873_fix    chr10_jh591181_fix 
##                128378                168465               2281126 
##    chr10_jh591182_fix    chr10_jh591183_fix    chr10_jh636060_fix 
##                196262                177920                437946 
##    chr10_jh806580_fix    chr10_kb663606_fix    chr10_ke332501_fix 
##                 93149                305900               1020827 
##    chr11_gl582973_fix    chr11_gl949744_fix    chr11_jh159138_fix 
##                321004                276448                108875 
##    chr11_jh159139_fix    chr11_jh159140_fix    chr11_jh159141_fix 
##                120441                546435                240775 
##    chr11_jh159142_fix    chr11_jh159143_fix    chr11_jh591184_fix 
##                326647                191402                462282 
##    chr11_jh591185_fix    chr11_jh720443_fix    chr11_jh806581_fix 
##                167437                408430                872115 
##    chr12_gl383548_fix    chr12_gl582974_fix    chr12_jh720444_fix 
##                165247                163298                273128 
##    chr12_kb663607_fix    chr13_gl582975_fix    chr14_kb021645_fix 
##                334922                 34662               1523386 
##    chr15_jh720445_fix    chr16_jh720446_fix    chr17_gl383558_fix 
##                170033                 97345                457041 
##    chr17_gl383559_fix    chr17_gl383560_fix    chr17_gl383561_fix 
##                338640                534288                644425 
##    chr17_gl383562_fix    chr17_gl582976_fix    chr17_jh159144_fix 
##                 45551                412535                388340 
##    chr17_jh159145_fix    chr17_jh591186_fix    chr17_jh636061_fix 
##                194862                376223                186059 
##    chr17_jh720447_fix    chr17_jh806582_fix    chr17_kb021646_fix 
##                454385                342635                211416 
##    chr17_ke332502_fix    chr19_gl582977_fix    chr19_jh159149_fix 
##                341712                580393                245473 
##    chr19_kb021647_fix    chr19_ke332505_fix    chr20_gl582979_fix 
##               1058686                579598                179899 
##    chr20_jh720448_fix    chr20_kb663608_fix    chr21_ke332506_fix 
##                 70483                283551                307252 
##    chr22_jh720449_fix    chr22_jh806583_fix    chr22_jh806584_fix 
##                212298                167183                 70876 
##    chr22_jh806585_fix    chr22_jh806586_fix     chrX_gl877877_fix 
##                 73505                 43543                284527 
##     chrX_jh159150_fix     chrX_jh720451_fix     chrX_jh720452_fix 
##               3110903                898979                522319 
##     chrX_jh720453_fix     chrX_jh720454_fix     chrX_jh720455_fix 
##               1461188                752267                 65034 
##     chrX_jh806587_fix     chrX_jh806588_fix     chrX_jh806589_fix 
##               4110759                862483                270630 
##     chrX_jh806590_fix     chrX_jh806591_fix     chrX_jh806592_fix 
##               2418393                882083                835911 
##     chrX_jh806593_fix     chrX_jh806594_fix     chrX_jh806595_fix 
##                389631                390496                444074 
##     chrX_jh806596_fix     chrX_jh806597_fix     chrX_jh806598_fix 
##                413927               1045622                899320 
##     chrX_jh806599_fix     chrX_jh806600_fix     chrX_jh806601_fix 
##               1214327               6530008               1389764 
##     chrX_jh806602_fix     chrX_jh806603_fix     chrX_kb021648_fix 
##                713266                182949                469972 
##     chr1_gl383518_alt     chr1_gl383519_alt     chr1_gl383520_alt 
##                182439                110268                366579 
##     chr2_gl383521_alt     chr2_gl383522_alt     chr2_gl582966_alt 
##                143390                123821                 96131 
##     chr3_gl383526_alt     chr3_jh636055_alt     chr4_gl383527_alt 
##                180671                173151                164536 
##     chr4_gl383528_alt     chr4_gl383529_alt     chr5_gl339449_alt 
##                376187                121345               1612928 
##     chr5_gl383530_alt     chr5_gl383531_alt     chr5_gl383532_alt 
##                101241                173459                 82728 
##     chr5_gl949742_alt     chr6_gl383533_alt     chr6_kb021644_alt 
##                226852                124736                187824 
##     chr7_gl383534_alt     chr9_gl383539_alt     chr9_gl383540_alt 
##                119183                162988                 71551 
##     chr9_gl383541_alt     chr9_gl383542_alt    chr10_gl383545_alt 
##                171286                 60032                179254 
##    chr10_gl383546_alt    chr11_gl383547_alt    chr11_jh159136_alt 
##                309802                154407                200998 
##    chr11_jh159137_alt    chr12_gl383549_alt    chr12_gl383550_alt 
##                191409                120804                169178 
##    chr12_gl383551_alt    chr12_gl383552_alt    chr12_gl383553_alt 
##                184319                138655                152874 
##    chr12_gl877875_alt    chr12_gl877876_alt    chr12_gl949745_alt 
##                167313                408271                372609 
##    chr15_gl383554_alt    chr15_gl383555_alt    chr16_gl383556_alt 
##                296527                388773                192462 
##    chr16_gl383557_alt    chr17_gl383563_alt    chr17_gl383564_alt 
##                 89672                270261                133151 
##    chr17_gl383565_alt    chr17_gl383566_alt    chr17_jh159146_alt 
##                223995                 90219                278131 
##    chr17_jh159147_alt    chr17_jh159148_alt    chr18_gl383567_alt 
##                 70345                 88070                289831 
##    chr18_gl383568_alt    chr18_gl383569_alt    chr18_gl383570_alt 
##                104552                167950                164789 
##    chr18_gl383571_alt    chr18_gl383572_alt    chr19_gl383573_alt 
##                198278                159547                385657 
##    chr19_gl383574_alt    chr19_gl383575_alt    chr19_gl383576_alt 
##                155864                170222                188024 
##    chr19_gl949746_alt    chr19_gl949747_alt    chr19_gl949748_alt 
##                987716                729519               1064303 
##    chr19_gl949749_alt    chr19_gl949750_alt    chr19_gl949751_alt 
##               1091840               1066389               1002682 
##    chr19_gl949752_alt    chr19_gl949753_alt    chr20_gl383577_alt 
##                987100                796478                128385 
##    chr21_gl383578_alt    chr21_gl383579_alt    chr21_gl383580_alt 
##                 63917                201198                 74652 
##    chr21_gl383581_alt    chr22_gl383582_alt    chr22_gl383583_alt 
##                116690                162811                 96924 
##    chr22_kb663609_alt 
##                 74013
# find name of each chromosome
seqnames(atac_small)
##   [1] "chr1"                  "chr2"                  "chr3"                 
##   [4] "chr4"                  "chr5"                  "chr6"                 
##   [7] "chr7"                  "chr8"                  "chr9"                 
##  [10] "chr10"                 "chr11"                 "chr12"                
##  [13] "chr13"                 "chr14"                 "chr15"                
##  [16] "chr16"                 "chr17"                 "chr18"                
##  [19] "chr19"                 "chr20"                 "chr21"                
##  [22] "chr22"                 "chrX"                  "chrY"                 
##  [25] "chrM"                  "chrMT"                 "chr4_ctg9_hap1"       
##  [28] "chr6_apd_hap1"         "chr6_cox_hap2"         "chr6_dbb_hap3"        
##  [31] "chr6_mann_hap4"        "chr6_mcf_hap5"         "chr6_qbl_hap6"        
##  [34] "chr6_ssto_hap7"        "chr17_ctg5_hap1"       "chr1_gl000191_random" 
##  [37] "chr1_gl000192_random"  "chr4_gl000193_random"  "chr4_gl000194_random" 
##  [40] "chr7_gl000195_random"  "chr8_gl000196_random"  "chr8_gl000197_random" 
##  [43] "chr9_gl000198_random"  "chr9_gl000199_random"  "chr9_gl000200_random" 
##  [46] "chr9_gl000201_random"  "chr11_gl000202_random" "chr17_gl000203_random"
##  [49] "chr17_gl000204_random" "chr17_gl000205_random" "chr17_gl000206_random"
##  [52] "chr18_gl000207_random" "chr19_gl000208_random" "chr19_gl000209_random"
##  [55] "chr21_gl000210_random" "chrUn_gl000211"        "chrUn_gl000212"       
##  [58] "chrUn_gl000213"        "chrUn_gl000214"        "chrUn_gl000215"       
##  [61] "chrUn_gl000216"        "chrUn_gl000217"        "chrUn_gl000218"       
##  [64] "chrUn_gl000219"        "chrUn_gl000220"        "chrUn_gl000221"       
##  [67] "chrUn_gl000222"        "chrUn_gl000223"        "chrUn_gl000224"       
##  [70] "chrUn_gl000225"        "chrUn_gl000226"        "chrUn_gl000227"       
##  [73] "chrUn_gl000228"        "chrUn_gl000229"        "chrUn_gl000230"       
##  [76] "chrUn_gl000231"        "chrUn_gl000232"        "chrUn_gl000233"       
##  [79] "chrUn_gl000234"        "chrUn_gl000235"        "chrUn_gl000236"       
##  [82] "chrUn_gl000237"        "chrUn_gl000238"        "chrUn_gl000239"       
##  [85] "chrUn_gl000240"        "chrUn_gl000241"        "chrUn_gl000242"       
##  [88] "chrUn_gl000243"        "chrUn_gl000244"        "chrUn_gl000245"       
##  [91] "chrUn_gl000246"        "chrUn_gl000247"        "chrUn_gl000248"       
##  [94] "chrUn_gl000249"        "chr1_gl383516_fix"     "chr1_gl383517_fix"    
##  [97] "chr1_gl949741_fix"     "chr1_jh636052_fix"     "chr1_jh636053_fix"    
## [100] "chr1_jh636054_fix"     "chr1_jh806573_fix"     "chr1_jh806574_fix"    
## [103] "chr1_jh806575_fix"     "chr2_gl877870_fix"     "chr2_gl877871_fix"    
## [106] "chr2_kb663603_fix"     "chr3_gl383523_fix"     "chr3_gl383524_fix"    
## [109] "chr3_gl383525_fix"     "chr3_jh159131_fix"     "chr3_jh159132_fix"    
## [112] "chr3_ke332495_fix"     "chr4_gl582967_fix"     "chr4_gl877872_fix"    
## [115] "chr4_ke332496_fix"     "chr5_jh159133_fix"     "chr5_ke332497_fix"    
## [118] "chr6_jh636056_fix"     "chr6_jh636057_fix"     "chr6_jh806576_fix"    
## [121] "chr6_kb663604_fix"     "chr6_ke332498_fix"     "chr7_gl582968_fix"    
## [124] "chr7_gl582969_fix"     "chr7_gl582970_fix"     "chr7_gl582971_fix"    
## [127] "chr7_gl582972_fix"     "chr7_jh159134_fix"     "chr7_jh636058_fix"    
## [130] "chr7_ke332499_fix"     "chr8_gl383535_fix"     "chr8_gl383536_fix"    
## [133] "chr8_gl949743_fix"     "chr8_jh159135_fix"     "chr8_ke332500_fix"    
## [136] "chr9_gl339450_fix"     "chr9_gl383537_fix"     "chr9_gl383538_fix"    
## [139] "chr9_jh636059_fix"     "chr9_jh806577_fix"     "chr9_jh806578_fix"    
## [142] "chr9_jh806579_fix"     "chr9_kb663605_fix"     "chr10_gl383543_fix"   
## [145] "chr10_gl383544_fix"    "chr10_gl877873_fix"    "chr10_jh591181_fix"   
## [148] "chr10_jh591182_fix"    "chr10_jh591183_fix"    "chr10_jh636060_fix"   
## [151] "chr10_jh806580_fix"    "chr10_kb663606_fix"    "chr10_ke332501_fix"   
## [154] "chr11_gl582973_fix"    "chr11_gl949744_fix"    "chr11_jh159138_fix"   
## [157] "chr11_jh159139_fix"    "chr11_jh159140_fix"    "chr11_jh159141_fix"   
## [160] "chr11_jh159142_fix"    "chr11_jh159143_fix"    "chr11_jh591184_fix"   
## [163] "chr11_jh591185_fix"    "chr11_jh720443_fix"    "chr11_jh806581_fix"   
## [166] "chr12_gl383548_fix"    "chr12_gl582974_fix"    "chr12_jh720444_fix"   
## [169] "chr12_kb663607_fix"    "chr13_gl582975_fix"    "chr14_kb021645_fix"   
## [172] "chr15_jh720445_fix"    "chr16_jh720446_fix"    "chr17_gl383558_fix"   
## [175] "chr17_gl383559_fix"    "chr17_gl383560_fix"    "chr17_gl383561_fix"   
## [178] "chr17_gl383562_fix"    "chr17_gl582976_fix"    "chr17_jh159144_fix"   
## [181] "chr17_jh159145_fix"    "chr17_jh591186_fix"    "chr17_jh636061_fix"   
## [184] "chr17_jh720447_fix"    "chr17_jh806582_fix"    "chr17_kb021646_fix"   
## [187] "chr17_ke332502_fix"    "chr19_gl582977_fix"    "chr19_jh159149_fix"   
## [190] "chr19_kb021647_fix"    "chr19_ke332505_fix"    "chr20_gl582979_fix"   
## [193] "chr20_jh720448_fix"    "chr20_kb663608_fix"    "chr21_ke332506_fix"   
## [196] "chr22_jh720449_fix"    "chr22_jh806583_fix"    "chr22_jh806584_fix"   
## [199] "chr22_jh806585_fix"    "chr22_jh806586_fix"    "chrX_gl877877_fix"    
## [202] "chrX_jh159150_fix"     "chrX_jh720451_fix"     "chrX_jh720452_fix"    
## [205] "chrX_jh720453_fix"     "chrX_jh720454_fix"     "chrX_jh720455_fix"    
## [208] "chrX_jh806587_fix"     "chrX_jh806588_fix"     "chrX_jh806589_fix"    
## [211] "chrX_jh806590_fix"     "chrX_jh806591_fix"     "chrX_jh806592_fix"    
## [214] "chrX_jh806593_fix"     "chrX_jh806594_fix"     "chrX_jh806595_fix"    
## [217] "chrX_jh806596_fix"     "chrX_jh806597_fix"     "chrX_jh806598_fix"    
## [220] "chrX_jh806599_fix"     "chrX_jh806600_fix"     "chrX_jh806601_fix"    
## [223] "chrX_jh806602_fix"     "chrX_jh806603_fix"     "chrX_kb021648_fix"    
## [226] "chr1_gl383518_alt"     "chr1_gl383519_alt"     "chr1_gl383520_alt"    
## [229] "chr2_gl383521_alt"     "chr2_gl383522_alt"     "chr2_gl582966_alt"    
## [232] "chr3_gl383526_alt"     "chr3_jh636055_alt"     "chr4_gl383527_alt"    
## [235] "chr4_gl383528_alt"     "chr4_gl383529_alt"     "chr5_gl339449_alt"    
## [238] "chr5_gl383530_alt"     "chr5_gl383531_alt"     "chr5_gl383532_alt"    
## [241] "chr5_gl949742_alt"     "chr6_gl383533_alt"     "chr6_kb021644_alt"    
## [244] "chr7_gl383534_alt"     "chr9_gl383539_alt"     "chr9_gl383540_alt"    
## [247] "chr9_gl383541_alt"     "chr9_gl383542_alt"     "chr10_gl383545_alt"   
## [250] "chr10_gl383546_alt"    "chr11_gl383547_alt"    "chr11_jh159136_alt"   
## [253] "chr11_jh159137_alt"    "chr12_gl383549_alt"    "chr12_gl383550_alt"   
## [256] "chr12_gl383551_alt"    "chr12_gl383552_alt"    "chr12_gl383553_alt"   
## [259] "chr12_gl877875_alt"    "chr12_gl877876_alt"    "chr12_gl949745_alt"   
## [262] "chr15_gl383554_alt"    "chr15_gl383555_alt"    "chr16_gl383556_alt"   
## [265] "chr16_gl383557_alt"    "chr17_gl383563_alt"    "chr17_gl383564_alt"   
## [268] "chr17_gl383565_alt"    "chr17_gl383566_alt"    "chr17_jh159146_alt"   
## [271] "chr17_jh159147_alt"    "chr17_jh159148_alt"    "chr18_gl383567_alt"   
## [274] "chr18_gl383568_alt"    "chr18_gl383569_alt"    "chr18_gl383570_alt"   
## [277] "chr18_gl383571_alt"    "chr18_gl383572_alt"    "chr19_gl383573_alt"   
## [280] "chr19_gl383574_alt"    "chr19_gl383575_alt"    "chr19_gl383576_alt"   
## [283] "chr19_gl949746_alt"    "chr19_gl949747_alt"    "chr19_gl949748_alt"   
## [286] "chr19_gl949749_alt"    "chr19_gl949750_alt"    "chr19_gl949751_alt"   
## [289] "chr19_gl949752_alt"    "chr19_gl949753_alt"    "chr20_gl383577_alt"   
## [292] "chr21_gl383578_alt"    "chr21_gl383579_alt"    "chr21_gl383580_alt"   
## [295] "chr21_gl383581_alt"    "chr22_gl383582_alt"    "chr22_gl383583_alt"   
## [298] "chr22_kb663609_alt"
# assign a new genome
genome(atac_small) <- "hg19"

Again, several other methods are available that are not listed here. See the documentation for seqinfo-methods in Signac for a full list.

For a full list of methods for the ChromatinAssay class, run:

methods(class = 'ChromatinAssay')
##  [1] [[<-              AddMotifs         AggregateTiles    Annotation       
##  [5] Annotation<-      CallPeaks         coerce            colMeans         
##  [9] colSums           ConvertMotifID    countOverlaps     coverage         
## [13] disjoin           disjointBins      distance          distanceToNearest
## [17] findOverlaps      FoldChange        follow            Footprint        
## [21] Fragments         Fragments<-       gaps              genome           
## [25] genome<-          GetAssayData      GetMotifData      granges          
## [29] InsertionBias     isCircular        isCircular<-      isDisjoint       
## [33] Links             Links<-           merge             Motifs           
## [37] Motifs<-          nearest           precede           range            
## [41] reduce            RegionStats       RenameCells       rowMeans         
## [45] rowSums           RunChromVAR       seqinfo           seqinfo<-        
## [49] seqlengths        seqlengths<-      seqlevels         seqlevels<-      
## [53] seqnames          seqnames<-        SetAssayData      SetMotifData     
## [57] show              subset           
## see '?methods' for accessing help and source code

Subsetting a ChromatinAssay

We can use the standard subset() function or the [ operator to subset a Seurat object containing ChromatinAssays. This works the same way as for standard Assay objects.

# subset using the subset() function
# this is meant for interactive use
subset.obj <- subset(atac_small, subset = nCount_peaks > 100)

# subset using the [ extract operator
# this can be used programmatically
subset.obj <- atac_small[, atac_small$nCount_peaks > 100]

Converting between Assay and ChromatinAssay

To convert from a ChromatinAssay to a standard Assay, use the as() function:

# convert a ChromatinAssay to an Assay
assay <- as(object = atac_small[["peaks"]], Class = "Assay")
assay
## Assay data with 323 features for 100 cells
## Top 10 variable features:
##  chr1-2157847-2188813, chr1-2471903-2481288, chr1-6843960-6846894,
## chr1-3815928-3820356, chr1-8935313-8940649, chr1-2515241-2519350,
## chr1-6051145-6055407, chr1-1708510-1715065, chr1-6659264-6664388,
## chr1-2227715-2234197

To convert from a standard Assay to a ChromatinAssay we use the as.ChromatinAssay() function. This takes a standard assay object, as well as information to fill the additional slots in the ChromatinAssay class.

# convert an Assay to a ChromatinAssay
chromatinassay <- as.ChromatinAssay(assay, seqinfo = "hg19")
chromatinassay
## ChromatinAssay data with 323 features for 100 cells
## Variable features: 323 
## Genome: hg19 
## Annotation present: FALSE 
## Motifs present: FALSE 
## Fragment files: 0

The Fragment Class

The Fragment class is designed for storing and interacting with a fragment file commonly used for single-cell chromatin data. It contains the path to an indexed fragment file on disk, an MD5 hash for the fragment file and the fragment file index, and a vector of cell names contained in the fragment file. The path can also point to a remote file accessible by http or ftp. Importantly, the cell names are stored as a named vector where the elements of the vector are the cell names as they appear in the fragment file, and the name of each element is the cell name as it appears in the ChromatinAssay object storing the Fragment object. This allows a mapping of cell names on disk to cell names in R, and avoids the need to alter fragment files on disk.
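
The cell name mapping can be illustrated with a plain named vector in base R. This is only a sketch of the idea, not the internal Signac code; the "sample1_" prefix and the variable names are assumptions made for illustration.

```r
# Barcodes as they are written in the fragment file on disk (illustrative)
file.barcodes <- c("AAACGAAAGAGCGAAA-1", "AAACGAAAGAGTTTGA-1")

# Hypothetical names for the same cells in the ChromatinAssay,
# here with a sample prefix added
assay.names <- paste0("sample1_", file.barcodes)

# Named vector: elements are the names on disk, names are the names in R
cells <- setNames(file.barcodes, assay.names)

# Find the on-disk barcode for a cell in the assay
cells[["sample1_AAACGAAAGAGTTTGA-1"]]

# Translate a barcode read from the file back to its assay cell name
names(cells)[match("AAACGAAAGAGCGAAA-1", cells)]
```

Because the translation lives in the vector, renaming cells in R only requires changing the names of the vector; the fragment file itself is left untouched.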

Constructing the Fragment class

A Fragment object can be constructed using the CreateFragmentObject() function.

frag.path <- system.file("extdata", "fragments.tsv.gz", package="Signac")
fragments <- CreateFragmentObject(
  path = frag.path,
  cells = colnames(atac_small), 
  validate.fragments = TRUE
)
## Computing hash

The validate.fragments parameter controls whether the file is inspected to check whether the expected cell names are present. This can help avoid assigning the wrong fragment file to the object. If you’re sure that the file is correct, you can set this value to FALSE to skip this step and save some time. This check is typically only run once when the Fragment object is created, and is not normally run on existing Fragment files.

Inspecting the fragment file

To extract the first few lines of a fragment file on disk, we can use the head() method defined for Fragment objects. This is useful for quickly checking the chromosome naming style in our fragment file, or checking how the cell barcodes are named:

head(fragments)
##   chrom  start    end            barcode readCount
## 1  chr1  10245  10302 AAAGATGAGGCTAAAT-1         1
## 2  chr1  55313  55699 AAACTCGTCTGGCACG-1         2
## 3  chr1  56455  56658 AAACTCGTCTGGCACG-1         1
## 4  chr1  60687  60726 AAACTGCAGTCTGTGT-1         1
## 5  chr1 235723 235936 AAACTGCTCCTATCCG-1         1
## 6  chr1 237741 237772 AAAGGATTCCTTACGC-1         1
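
For orientation, a similar check can be made without a Fragment object by reading the first records of a gzipped, tab-separated fragment file with base R. The sketch below writes a small temporary file, using three records copied from the output above, so that it is self-contained. The column names are taken from the head() output, and the object names are assumptions.

```r
# Write a tiny fragment file (three records from the output above)
frag.lines <- c(
  "chr1\t10245\t10302\tAAAGATGAGGCTAAAT-1\t1",
  "chr1\t55313\t55699\tAAACTCGTCTGGCACG-1\t2",
  "chr1\t56455\t56658\tAAACTCGTCTGGCACG-1\t1"
)
tmp <- tempfile(fileext = ".tsv.gz")
con <- gzfile(tmp, open = "wt")
writeLines(frag.lines, con)
close(con)

# Read the first few records of the compressed file
frags.df <- read.table(
  file = gzfile(tmp),
  nrows = 3,
  sep = "\t",
  col.names = c("chrom", "start", "end", "barcode", "readCount"),
  stringsAsFactors = FALSE
)

# Check the chromosome naming style and the barcode format
unique(frags.df$chrom)
unique(frags.df$barcode)
```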

Adding a Fragment object to the ChromatinAssay

A ChromatinAssay object can contain a list of Fragment objects. This avoids the need to merge fragment files on disk and simplifies processes of merging or integrating different Seurat objects containing ChromatinAssays. To add a new Fragment object to a ChromatinAssay, or a Seurat object containing a ChromatinAssay, we can use the Fragments<- assignment function. This will do a few things:

  1. Re-compute the MD5 hash for the fragment file and index and verify that it matches the hash computed when the Fragment object was created.
  2. Check that none of the cells contained in the Fragment object being added are already contained in another Fragment object stored in the ChromatinAssay. All fragments from a cell must be present in only one fragment file.
  3. Append the Fragment object to the list of Fragment objects stored in the ChromatinAssay.

Fragments(atac_small) <- fragments
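
The logic of these three steps can be sketched in base R. This is a simplified illustration under stated assumptions, not the Signac implementation: a temporary file stands in for the fragment file, character vectors of cell names stand in for Fragment objects, and all object names are hypothetical.

```r
# 1. Hash check: the MD5 sum of a file changes if its contents change.
#    tools::md5sum() ships with R; the temporary file is a stand-in
#    for a real fragment file.
tmp <- tempfile(fileext = ".tsv")
writeLines("chr1\t10245\t10302\tAAAGATGAGGCTAAAT-1\t1", tmp)
hash.at.creation <- unname(tools::md5sum(tmp))
hash.now <- unname(tools::md5sum(tmp))
hash.ok <- identical(hash.at.creation, hash.now)

# 2. Overlap check: a cell may appear in only one fragment file
cells.existing <- c("AAACGAAAGAGCGAAA-1", "AAACGAAAGAGTTTGA-1")
cells.new <- c("AAACGAAAGCGAGCTA-1", "AAACGAAAGGCTTCGC-1")
no.overlap <- length(intersect(cells.existing, cells.new)) == 0

# 3. Append only if both checks pass (character vectors stand in
#    for Fragment objects here)
fragment.list <- list(cells.existing)
if (hash.ok && no.overlap) {
  fragment.list <- c(fragment.list, list(cells.new))
}
length(fragment.list)
```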

The show() method for Fragment-class objects prints the number of cells that the Fragment object contains data for.

fragments
## A Fragment object for 100 cells

Alternatively, we can initialize the ChromatinAssay with a Fragment object in a couple of ways. We can either pass a list of Fragment objects to the fragments parameter in CreateChromatinAssay(), or pass the path to a single fragment file. If we pass the path to a fragment file, we assume that the file contains fragments for all cells in the ChromatinAssay and that the cell names are the same in the fragment file on disk and in the ChromatinAssay. For example:

chrom_assay <- CreateChromatinAssay(
  counts = counts,
  genome = "hg19",
  fragments = frag.path
)
## Computing hash
object <- CreateSeuratObject(
  counts = chrom_assay,
  assay = "peaks"
)

This will create a Seurat object containing a ChromatinAssay, with a single Fragment object.

Removing a Fragment object from the ChromatinAssay

All the Fragment objects associated with a ChromatinAssay can be removed by assigning NULL using the Fragments<- assignment function. For example:

Fragments(chrom_assay) <- NULL
Fragments(chrom_assay)
## list()

To remove a subset of the Fragment objects stored in the ChromatinAssay, you will need to extract the list of Fragment objects using the Fragments() function, subset the list, and then assign the subsetted list back to the assay using the Seurat::SetAssayData() function. For example:

chrom_assay <- SetAssayData(chrom_assay, slot = "fragments", new.data = fragments)
Fragments(chrom_assay)
## [[1]]
## A Fragment object for 100 cells

Changing the fragment file path in an existing Fragment object

The path to the fragment file can be updated using the UpdatePath() function. This can be useful if you move the fragment file to a new directory, or if you copy a stored Seurat object containing a ChromatinAssay to a different server.

fragments <- UpdatePath(fragments, new.path = frag.path)

To change the path to fragment files in an object, you will need to remove the fragment objects, update the paths, and then add the fragment objects back to the object. For example:

frags <- Fragments(object)  # get list of fragment objects
Fragments(object) <- NULL  # remove fragment information from assay

# create a list with all the new paths, in the same order as your list of fragment objects
# In this case we only have 1
new.paths <- list(frag.path)
for (i in seq_along(frags)) {
  frags[[i]] <- UpdatePath(frags[[i]], new.path = new.paths[[i]]) # update path
}

Fragments(object) <- frags # assign updated list back to the object
Fragments(object)
## [[1]]
## A Fragment object for 100 cells

Using remote fragment files

Fragment files hosted on remote servers accessible via http or ftp can also be added to the ChromatinAssay in the same way as for locally-hosted fragment files. This can enable the exploration of large single-cell datasets without the need for downloading large files. For example, we can create a Fragment object using a file hosted on the 10x Genomics website:

fragments <- CreateFragmentObject(
  path = "http://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_v1_pbmc_10k/atac_v1_pbmc_10k_fragments.tsv.gz"
)
## Warning in readLines(con = con, n = 10000): incomplete final line found on
## 'gzcon(http://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_v1_pbmc_10k/
## atac_v1_pbmc_10k_fragments.tsv.gz)'
## Computing hash
fragments
## A Fragment object for 0 cells

When files are hosted remotely, the checks described in the section above (MD5 hash and expected cells) are not performed.

Getting and setting Fragment data

To access the cell names stored in a Fragment object, we can use the Cells() function. Importantly, this returns the cell names as they appear in the ChromatinAssay, rather than as they appear in the fragment file itself.

fragments <- CreateFragmentObject(
  path = frag.path,
  cells = colnames(atac_small), 
  validate.fragments = TRUE
)
## Computing hash
cells <- Cells(fragments)
head(cells)
## [1] "AAACGAAAGAGCGAAA-1" "AAACGAAAGAGTTTGA-1" "AAACGAAAGCGAGCTA-1"
## [4] "AAACGAAAGGCTTCGC-1" "AAACGAAAGTGCTGAG-1" "AAACGAACAAGGGTAC-1"

Similarly, we can set the cell name information in a Fragment object using the Cells<- assignment function. This will set the named vector of cells stored in the Fragment object. Here we must supply a named vector.

names(cells) <- cells
Cells(fragments) <- cells

To extract any of the data stored in a Fragment object we can also use the GetFragmentData() function. For example, we can find the path to the fragment file on disk:

GetFragmentData(object = fragments, slot = "path")
## [1] "/tmp/RtmpQVBH4y/temp_libpath333f01102e3cb2/Signac/extdata/fragments.tsv.gz"

For a full list of methods for the Fragment class, run:

methods(class = 'Fragment')
## [1] CallPeaks   Cells       Cells<-     head        RenameCells show       
## see '?methods' for accessing help and source code

The Motif Class

The Motif class stores information needed for DNA sequence motif analysis, and has the following slots:

  • data: A sparse feature by motif matrix, where entries are 1 if the feature contains the motif, and 0 otherwise
  • pwm: A named list of position weight or position frequency matrices
  • motif.names: A list of motif IDs and their common names
  • positions: A GRangesList object containing the exact positions of each motif
  • meta.data: Additional information about the motifs

Many of these slots are optional and only need to be filled when running certain functions. For example, the positions slot will be needed if running TF footprinting.
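
To make the layout of the data slot concrete, the sketch below builds a small sparse feature by motif matrix with the Matrix package. The feature names and motif IDs appear elsewhere in this document, but the motif hits themselves are made up for illustration, and the object names are assumptions.

```r
library(Matrix)

features <- c("chr1-713460-714823", "chr1-752422-753038", "chr1-762106-763359")
motifs <- c("MA0030.1", "MA0031.1")

# Hypothetical motif hits: (row, column) pairs where the peak contains the motif
motif.hits <- sparseMatrix(
  i = c(1, 3, 3),
  j = c(1, 1, 2),
  x = 1,
  dims = c(length(features), length(motifs)),
  dimnames = list(features, motifs)
)

# Number of features containing each motif
colSums(motif.hits)

# Features containing MA0030.1
rownames(motif.hits)[motif.hits[, "MA0030.1"] == 1]
```

Storing only the nonzero entries keeps the matrix small, since most peaks contain only a small fraction of all motifs.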

Constructing the Motif class

A Motif object can be constructed using the CreateMotifObject() function. Much of the data needed for constructing a Motif object can be generated using functions from the TFBSTools and motifmatchr packages. Position frequency matrices for motifs can be loaded using the JASPAR packages on Bioconductor or the chromVARmotifs package. For example:

library(JASPAR2020)
library(TFBSTools)
library(motifmatchr)

# Get a list of motif position frequency matrices from the JASPAR database
pfm <- getMatrixSet(
  x = JASPAR2020,
  opts = list(species = 9606) # 9606 is the species code for human
)

# Scan the DNA sequence of each peak for the presence of each motif
motif.matrix <- CreateMotifMatrix(
  features = granges(atac_small),
  pwm = pfm,
  genome = 'hg19'
)

# Create a new Motif object to store the results
motif <- CreateMotifObject(
  data = motif.matrix,
  pwm = pfm
)

The show() method for the Motif class prints the total number of motifs and regions included in the object:

motif
## A Motif object containing 633 motifs in 323 regions

Adding a Motif object to the ChromatinAssay

We can add a Motif object to the ChromatinAssay, or to a Seurat object containing a ChromatinAssay, using the Motifs<- assignment operator.

Motifs(atac_small) <- motif

Getting and setting Motif data

Data stored in a Motif object can be accessed using the GetMotifData() and SetMotifData() functions.

# extract data from the Motif object
pfm <- GetMotifData(object = motif, slot = "pwm")

# set data in the Motif object
motif <- SetMotifData(object = motif, slot = "pwm", new.data = pfm)

We can access the set of motifs and set of features used in the Motif object using the colnames() and rownames() functions:

# look at the motifs included in the Motif object
head(colnames(motif))
## [1] "MA0030.1" "MA0031.1" "MA0051.1" "MA0057.1" "MA0059.1" "MA0066.1"
# look at the features included in the Motif object
head(rownames(motif))
## [1] "chr1-713460-714823" "chr1-752422-753038" "chr1-762106-763359"
## [4] "chr1-779589-780271" "chr1-804872-805761" "chr1-839520-841123"
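
The feature names encode genomic coordinates as chromosome-start-end. As a minimal base R sketch of how such names can be split back into coordinates, the code below parses three of the names shown above. It assumes that chromosome names contain no hyphens and that start and end are both inclusive; the variable names are hypothetical. Within Signac, the StringToGRanges() function is the more direct route to a GRanges object.

```r
# Region names in the style shown above: chromosome-start-end
regions <- c("chr1-713460-714823", "chr1-752422-753038", "chr1-762106-763359")

# Split each name on "-" and bind the pieces into a three-column matrix
parts <- do.call(rbind, strsplit(regions, split = "-", fixed = TRUE))

coords <- data.frame(
  chrom = parts[, 1],
  start = as.integer(parts[, 2]),
  end = as.integer(parts[, 3]),
  stringsAsFactors = FALSE
)

# Width of each region in base pairs (assumes inclusive start and end)
coords$width <- coords$end - coords$start + 1L
```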

To quickly convert between motif IDs (like MA0497.1) and motif common names (like MEF2C), we can use the ConvertMotifID() function. For example:

# convert ID to common name
ids <- c("MA0025.1","MA0030.1","MA0031.1","MA0051.1","MA0056.1","MA0057.1")
names <- ConvertMotifID(object = motif, id = ids)
names
## [1] NA            "FOXF2"       "FOXD1"       "IRF2"        NA           
## [6] "MZF1(var.2)"
# convert names to IDs
ConvertMotifID(object = motif, name = names)
## [1] NA         "MA0030.1" "MA0031.1" "MA0051.1" NA         "MA0057.1"
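
The NA values in the output above correspond to IDs that are not present in the Motif object. The base R sketch below illustrates the same lookup behaviour with a named character vector. It is not Signac's implementation, and the lookup table here is a hypothetical three-entry stand-in for the ID-to-name mapping stored in the object.

```r
# Hypothetical lookup table: names are motif IDs, values are common names
motif.names <- c(MA0030.1 = "FOXF2", MA0031.1 = "FOXD1", MA0051.1 = "IRF2")

# IDs to convert; MA0025.1 is deliberately absent from the table
ids <- c("MA0025.1", "MA0030.1", "MA0051.1")

# ID -> name: index by name, so unknown IDs yield NA
common <- unname(motif.names[ids])

# name -> ID: match names back to positions in the table, NA stays NA
back <- names(motif.names)[match(common, motif.names)]
```

Because missing entries propagate as NA in both directions, it may be worth checking the result with anyNA() before using converted names downstream.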

For a full list of methods for the Motif class run:

methods(class = 'Motif')
## [1] [              ConvertMotifID dim            dimnames       GetMotifData  
## [6] SetMotifData   show           subset        
## see '?methods' for accessing help and source code

Session Info

## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] BSgenome.Hsapiens.UCSC.hg19_1.4.3 BSgenome_1.60.0                  
##  [3] rtracklayer_1.52.1                Biostrings_2.60.2                
##  [5] XVector_0.32.0                    motifmatchr_1.14.0               
##  [7] TFBSTools_1.30.0                  JASPAR2020_0.99.10               
##  [9] EnsDb.Hsapiens.v75_2.99.0         ensembldb_2.16.4                 
## [11] AnnotationFilter_1.16.0           GenomicFeatures_1.44.2           
## [13] AnnotationDbi_1.54.1              Biobase_2.52.0                   
## [15] GenomicRanges_1.44.0              GenomeInfoDb_1.28.4              
## [17] IRanges_2.26.0                    S4Vectors_0.30.0                 
## [19] BiocGenerics_0.38.0               Signac_1.4.0                     
## [21] SeuratObject_4.0.2                Seurat_4.0.4                     
## 
## loaded via a namespace (and not attached):
##   [1] rappdirs_0.3.3              SnowballC_0.7.0            
##   [3] scattermore_0.7             R.methodsS3_1.8.1          
##   [5] ragg_1.1.3                  tidyr_1.1.3                
##   [7] ggplot2_3.3.5               bit64_4.0.5                
##   [9] knitr_1.34                  R.utils_2.10.1             
##  [11] irlba_2.3.3                 DelayedArray_0.18.0        
##  [13] data.table_1.14.0           rpart_4.1-15               
##  [15] KEGGREST_1.32.0             RCurl_1.98-1.5             
##  [17] generics_0.1.0              cowplot_1.1.1              
##  [19] RSQLite_2.2.8               RANN_2.6.1                 
##  [21] future_1.22.1               tzdb_0.1.2                 
##  [23] bit_4.0.4                   spatstat.data_2.1-0        
##  [25] xml2_1.3.2                  httpuv_1.6.3               
##  [27] SummarizedExperiment_1.22.0 assertthat_0.2.1           
##  [29] DirichletMultinomial_1.34.0 xfun_0.26                  
##  [31] hms_1.1.0                   jquerylib_0.1.4            
##  [33] evaluate_0.14               promises_1.2.0.1           
##  [35] fansi_0.5.0                 restfulr_0.0.13            
##  [37] progress_1.2.2              caTools_1.18.2             
##  [39] dbplyr_2.1.1                igraph_1.2.6               
##  [41] DBI_1.1.1                   htmlwidgets_1.5.4          
##  [43] sparsesvd_0.2               spatstat.geom_2.2-2        
##  [45] purrr_0.3.4                 ellipsis_0.3.2             
##  [47] dplyr_1.0.7                 backports_1.2.1            
##  [49] annotate_1.70.0             biomaRt_2.48.3             
##  [51] deldir_0.2-10               MatrixGenerics_1.4.3       
##  [53] vctrs_0.3.8                 ROCR_1.0-11                
##  [55] abind_1.4-5                 cachem_1.0.6               
##  [57] ggforce_0.3.3               checkmate_2.0.0            
##  [59] sctransform_0.3.2           GenomicAlignments_1.28.0   
##  [61] prettyunits_1.1.1           goftest_1.2-2              
##  [63] cluster_2.1.2               lazyeval_0.2.2             
##  [65] seqLogo_1.58.0              crayon_1.4.1               
##  [67] pkgconfig_2.0.3             slam_0.1-48                
##  [69] tweenr_1.0.2                nlme_3.1-152               
##  [71] ProtGenerics_1.24.0         nnet_7.3-16                
##  [73] rlang_0.4.11                globals_0.14.0             
##  [75] lifecycle_1.0.0             miniUI_0.1.1.1             
##  [77] filelock_1.0.2              BiocFileCache_2.0.0        
##  [79] dichromat_2.0-0             rprojroot_2.0.2            
##  [81] polyclip_1.10-0             matrixStats_0.61.0         
##  [83] lmtest_0.9-38               Matrix_1.3-4               
##  [85] ggseqlogo_0.1               zoo_1.8-9                  
##  [87] base64enc_0.1-3             ggridges_0.5.3             
##  [89] png_0.1-7                   viridisLite_0.4.0          
##  [91] rjson_0.2.20                bitops_1.0-7               
##  [93] R.oo_1.24.0                 KernSmooth_2.23-20         
##  [95] blob_1.2.2                  stringr_1.4.0              
##  [97] parallelly_1.28.1           readr_2.0.1                
##  [99] jpeg_0.1-9                  CNEr_1.28.0                
## [101] scales_1.1.1                memoise_2.0.0              
## [103] magrittr_2.0.1              plyr_1.8.6                 
## [105] ica_1.0-2                   zlibbioc_1.38.0            
## [107] compiler_4.1.0              BiocIO_1.2.0               
## [109] RColorBrewer_1.1-2          fitdistrplus_1.1-5         
## [111] Rsamtools_2.8.0             listenv_0.8.0              
## [113] patchwork_1.1.1             pbapply_1.5-0              
## [115] htmlTable_2.2.1             Formula_1.2-4              
## [117] MASS_7.3-54                 mgcv_1.8-36                
## [119] tidyselect_1.1.1            stringi_1.7.4              
## [121] textshaping_0.3.5           yaml_2.2.1                 
## [123] latticeExtra_0.6-29         ggrepel_0.9.1              
## [125] grid_4.1.0                  sass_0.4.0                 
## [127] VariantAnnotation_1.38.0    fastmatch_1.1-3            
## [129] tools_4.1.0                 future.apply_1.8.1         
## [131] rstudioapi_0.13             TFMPvalue_0.0.8            
## [133] foreign_0.8-81              lsa_0.73.2                 
## [135] gridExtra_2.3               farver_2.1.0               
## [137] Rtsne_0.15                  digest_0.6.27              
## [139] pracma_2.3.3                shiny_1.6.0                
## [141] qlcMatrix_0.9.7             Rcpp_1.0.7                 
## [143] later_1.3.0                 RcppAnnoy_0.0.19           
## [145] httr_1.4.2                  biovizBase_1.40.0          
## [147] colorspace_2.0-2            XML_3.99-0.8               
## [149] fs_1.5.0                    tensor_1.5                 
## [151] reticulate_1.22             splines_4.1.0              
## [153] uwot_0.1.10                 RcppRoll_0.3.0             
## [155] spatstat.utils_2.2-0        pkgdown_1.6.1.9001         
## [157] plotly_4.9.4.1              systemfonts_1.0.2          
## [159] xtable_1.8-4                poweRlaw_0.70.6            
## [161] jsonlite_1.7.2              R6_2.5.1                   
## [163] Hmisc_4.5-0                 pillar_1.6.2               
## [165] htmltools_0.5.2             mime_0.11                  
## [167] glue_1.4.2                  fastmap_1.1.0              
## [169] BiocParallel_1.26.2         codetools_0.2-18           
## [171] utf8_1.2.2                  lattice_0.20-44            
## [173] bslib_0.3.0                 spatstat.sparse_2.0-0      
## [175] tibble_3.1.4                curl_4.3.2                 
## [177] leiden_0.3.9                gtools_3.9.2               
## [179] GO.db_3.13.0                survival_3.2-11            
## [181] rmarkdown_2.11              docopt_0.7.1               
## [183] desc_1.3.0                  munsell_0.5.0              
## [185] GenomeInfoDbData_1.2.6      reshape2_1.4.4             
## [187] gtable_0.3.0                spatstat.core_2.3-0