In this vignette we demonstrate how to find cis-co-accessible networks with Cicero using single-cell ATAC-seq data. See the Cicero website for more information about the method.

To facilitate conversion between the Seurat (used by Signac) and CellDataSet (used by Cicero) formats, we will use a conversion function in the SeuratWrappers package available on GitHub.

Data loading

We will use a single-cell ATAC-seq dataset containing human CD34+ hematopoietic stem and progenitor cells published by Satpathy and Granja et al. (2019, Nature Biotechnology). The processed datasets are available on NCBI GEO here: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE129785

This is the same dataset we used in the trajectory vignette, and we’ll start by loading the dataset that was created in that vignette. See the trajectory vignette for the code used to create the object from raw data.

The object was already preprocessed with Signac in that vignette, so here we only need to load it.

library(Signac)
library(Seurat)
# load the object created in the trajectory (Monocle 3) vignette
bone <- readRDS("../vignette_data/cd34.rds")

Create the Cicero object

We can find cis-co-accessible networks (CCANs) using Cicero.

The Cicero developers maintain a separate branch of the package that works with Monocle 3 CellDataSet objects. We will first make sure this branch is installed, then convert our Seurat object for the whole bone marrow dataset to CellDataSet format.

# Install Cicero
if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")
remotes::install_github("cole-trapnell-lab/cicero-release", ref = "monocle3")
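library(cicero)
library(SeuratWrappers)  # provides as.cell_data_set()
# convert to CellDataSet format and make the cicero object
bone.cds <- as.cell_data_set(x = bone)
# the UMAP coordinates are used to group similar cells before estimating co-accessibility
bone.cicero <- make_cicero_cds(bone.cds, reduced_coordinates = reducedDims(bone.cds)$UMAP)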
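
make_cicero_cds() uses the reduced coordinates to pool similar cells before co-accessibility is estimated, because counts from individual cells are very sparse. The base R sketch below illustrates that pooling idea on simulated data. It is not Cicero's implementation: the matrix sizes, the toy two-dimensional embedding, and the neighborhood size k are all assumptions chosen for illustration.

```r
# Toy illustration of neighbor-based aggregation (NOT Cicero's actual code)
set.seed(1)
n_cells <- 20
n_peaks <- 5
k <- 4  # hypothetical neighborhood size for this toy example

# simulated binary accessibility matrix: peaks x cells
counts <- matrix(
  rbinom(n_peaks * n_cells, size = 1, prob = 0.2),
  nrow = n_peaks,
  dimnames = list(paste0("peak", seq_len(n_peaks)), paste0("cell", seq_len(n_cells)))
)

# simulated 2D coordinates standing in for a UMAP embedding
coords <- matrix(
  rnorm(n_cells * 2),
  ncol = 2,
  dimnames = list(colnames(counts), c("UMAP_1", "UMAP_2"))
)

# pairwise Euclidean distances between cells
d <- as.matrix(dist(coords))

# for each cell, the indices of its k nearest cells (the cell itself comes first)
neighbors <- apply(d, 1, function(x) order(x)[seq_len(k)])

# sum the counts within each neighborhood: peaks x neighborhoods
aggregated <- apply(neighbors, 2, function(idx) rowSums(counts[, idx, drop = FALSE]))

dim(aggregated)
```

Each column of `aggregated` now holds counts summed over a small group of nearby cells, which is denser than any single-cell column in `counts`.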

Find Cicero connections

We’ll demonstrate running Cicero here using just one chromosome to save some time, but the same workflow can be applied to find CCANs for the whole genome.

This is the most basic workflow for running Cicero. It can be broken down into several steps, each with parameters that can be changed from their defaults to fine-tune the Cicero algorithm depending on your data. We highly recommend that you explore the Cicero website, paper, and documentation for more information.

# get the chromosome sizes from the Seurat object
genome <- seqlengths(bone)

# use chromosome 1 to save some time
# omit this step to run on the whole genome
genome <- genome[1]

# convert chromosome sizes to a dataframe
genome.df <- data.frame("chr" = names(genome), "length" = genome)

# run cicero
conns <- run_cicero(bone.cicero, genomic_coords = genome.df, sample_num = 100)
## [1] "Starting Cicero"
## [1] "Calculating distance_parameter value"
## [1] "Running models"
## [1] "Assembling connections"
## [1] "Successful cicero models:  755"
## [1] "Other models: "
## 
##   Too many elements in range Zero or one element in range 
##                          157                           86 
## [1] "Models with errors:  0"
## [1] "Done"
head(conns)
##                      Peak1                  Peak2 coaccess
## 1 chr1-100003337-100003837 chr1-99791719-99792219        0
## 2 chr1-100003337-100003837 chr1-99828699-99829199        0
## 3 chr1-100003337-100003837 chr1-99835542-99836042        0
## 4 chr1-100003337-100003837 chr1-99836217-99836717        0
## 5 chr1-100003337-100003837 chr1-99839576-99840076        0
## 6 chr1-100003337-100003837 chr1-99840640-99841140        0
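
Each row of conns describes one pair of peaks and its co-accessibility score, and pairs can score zero, as in the rows above. A common first step when exploring the result is to keep only the stronger positive links. The sketch below does this on a small hand-made data frame with the same three columns. The peak names, the scores, and the 0.25 threshold are invented for illustration and are not results from this dataset.

```r
# hand-made stand-in for the data frame returned by run_cicero()
conns_toy <- data.frame(
  Peak1 = c("chr1-100-600", "chr1-100-600", "chr1-100-600", "chr1-5000-5500"),
  Peak2 = c("chr1-5000-5500", "chr1-9000-9500", "chr1-20000-20500", "chr1-9000-9500"),
  coaccess = c(0.45, 0, -0.10, 0.30),
  stringsAsFactors = FALSE
)

# hypothetical threshold; choose one appropriate for your own data
threshold <- 0.25

# keep peak pairs scoring above the threshold, strongest first
strong <- conns_toy[conns_toy$coaccess > threshold, ]
strong <- strong[order(strong$coaccess, decreasing = TRUE), ]

print(strong)
```

The same two lines of filtering and sorting can be applied to the real conns object once run_cicero() has finished.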

Find cis-co-accessible networks (CCANs)

Now that we have pairwise co-accessibility scores for each pair of peaks, we can group these pairwise connections into larger co-accessible networks using the generate_ccans() function from Cicero.

ccans <- generate_ccans(conns)
## [1] "Coaccessibility cutoff used: 0.21"
head(ccans)
##                                              Peak CCAN
## chr1-10009702-10010202     chr1-10009702-10010202   20
## chr1-100151188-100151688 chr1-100151188-100151688    1
## chr1-100165566-100166066 chr1-100165566-100166066    1
## chr1-100247892-100248392 chr1-100247892-100248392    1
## chr1-100252210-100252710 chr1-100252210-100252710    2
## chr1-100259383-100259883 chr1-100259383-100259883    2
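
To build intuition for what a CCAN assignment means, the sketch below thresholds a toy connection table and groups peaks that are linked, directly or indirectly, by connections above the cutoff. This is a deliberate simplification: generate_ccans() selects its cutoff automatically and, according to the Cicero documentation, applies community detection rather than the plain connected-components rule used here. The peak names and scores are made up, and the 0.21 cutoff is borrowed from the output above purely for illustration.

```r
# toy connections; peak names and scores are invented
conns_toy <- data.frame(
  Peak1 = c("peakA", "peakB", "peakD", "peakA", "peakF"),
  Peak2 = c("peakB", "peakC", "peakE", "peakD", "peakG"),
  coaccess = c(0.60, 0.40, 0.35, 0.05, 0.50),
  stringsAsFactors = FALSE
)
cutoff <- 0.21  # illustrative value only

# keep connections above the cutoff
edges <- conns_toy[conns_toy$coaccess > cutoff, ]

# start with every peak in its own group
peaks <- sort(unique(c(edges$Peak1, edges$Peak2)))
group <- setNames(seq_along(peaks), peaks)

# merge the groups of the two peaks joined by each retained connection
for (i in seq_len(nrow(edges))) {
  a <- group[[edges$Peak1[i]]]
  b <- group[[edges$Peak2[i]]]
  if (a != b) group[group == max(a, b)] <- min(a, b)
}

# renumber groups consecutively and tabulate their sizes
toy_ccans <- data.frame(Peak = peaks, CCAN = match(group, unique(group)))
print(toy_ccans)
table(toy_ccans$CCAN)
```

Calling table() on the CCAN column of the real ccans data frame gives the number of peaks in each network in the same way.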

Acknowledgements

Thanks to the developers of Cicero, especially Cole Trapnell, Hannah Pliner, and members of the Trapnell lab. If you use Cicero, please cite the Cicero paper.

Session Info

## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
##  [1] grid      stats4    parallel  stats     graphics  grDevices utils    
##  [8] datasets  methods   base     
## 
## other attached packages:
##  [1] cicero_1.3.4.11             Gviz_1.36.2                
##  [3] monocle3_1.0.0              SingleCellExperiment_1.14.1
##  [5] SummarizedExperiment_1.22.0 GenomicRanges_1.44.0       
##  [7] GenomeInfoDb_1.28.4         IRanges_2.26.0             
##  [9] S4Vectors_0.30.0            MatrixGenerics_1.4.3       
## [11] matrixStats_0.61.0          Biobase_2.52.0             
## [13] BiocGenerics_0.38.0         patchwork_1.1.1            
## [15] ggplot2_3.3.5               SeuratWrappers_0.3.0       
## [17] SeuratObject_4.0.2          Seurat_4.0.4               
## [19] Signac_1.4.0               
## 
## loaded via a namespace (and not attached):
##   [1] rappdirs_0.3.3           SnowballC_0.7.0          rtracklayer_1.52.1      
##   [4] scattermore_0.7          R.methodsS3_1.8.1        ragg_1.1.3              
##   [7] tidyr_1.1.3              bit64_4.0.5              knitr_1.34              
##  [10] irlba_2.3.3              DelayedArray_0.18.0      R.utils_2.10.1          
##  [13] data.table_1.14.0        rpart_4.1-15             KEGGREST_1.32.0         
##  [16] RCurl_1.98-1.5           AnnotationFilter_1.16.0  generics_0.1.0          
##  [19] GenomicFeatures_1.44.2   leidenbase_0.1.3         cowplot_1.1.1           
##  [22] RSQLite_2.2.8            RANN_2.6.1               VGAM_1.1-5              
##  [25] future_1.22.1            bit_4.0.4                spatstat.data_2.1-0     
##  [28] xml2_1.3.2               httpuv_1.6.3             assertthat_0.2.1        
##  [31] viridis_0.6.1            xfun_0.26                hms_1.1.0               
##  [34] jquerylib_0.1.4          evaluate_0.14            promises_1.2.0.1        
##  [37] fansi_0.5.0              restfulr_0.0.13          progress_1.2.2          
##  [40] dbplyr_2.1.1             igraph_1.2.6             DBI_1.1.1               
##  [43] htmlwidgets_1.5.4        sparsesvd_0.2            spatstat.geom_2.2-2     
##  [46] purrr_0.3.4              ellipsis_0.3.2           dplyr_1.0.7             
##  [49] backports_1.2.1          biomaRt_2.48.3           deldir_0.2-10           
##  [52] vctrs_0.3.8              remotes_2.4.0            ensembldb_2.16.4        
##  [55] ROCR_1.0-11              abind_1.4-5              withr_2.4.2             
##  [58] cachem_1.0.6             ggforce_0.3.3            BSgenome_1.60.0         
##  [61] checkmate_2.0.0          sctransform_0.3.2        GenomicAlignments_1.28.0
##  [64] prettyunits_1.1.1        goftest_1.2-2            cluster_2.1.2           
##  [67] lazyeval_0.2.2           crayon_1.4.1             labeling_0.4.2          
##  [70] pkgconfig_2.0.3          slam_0.1-48              tweenr_1.0.2            
##  [73] nlme_3.1-152             ProtGenerics_1.24.0      nnet_7.3-16             
##  [76] rlang_0.4.11             globals_0.14.0           lifecycle_1.0.0         
##  [79] miniUI_0.1.1.1           filelock_1.0.2           BiocFileCache_2.0.0     
##  [82] rsvd_1.0.5               dichromat_2.0-0          rprojroot_2.0.2         
##  [85] polyclip_1.10-0          lmtest_0.9-38            Matrix_1.3-4            
##  [88] ggseqlogo_0.1            zoo_1.8-9                base64enc_0.1-3         
##  [91] ggridges_0.5.3           png_0.1-7                viridisLite_0.4.0       
##  [94] rjson_0.2.20             bitops_1.0-7             R.oo_1.24.0             
##  [97] KernSmooth_2.23-20       Biostrings_2.60.2        blob_1.2.2              
## [100] stringr_1.4.0            parallelly_1.28.1        jpeg_0.1-9              
## [103] scales_1.1.1             memoise_2.0.0            magrittr_2.0.1          
## [106] plyr_1.8.6               ica_1.0-2                zlibbioc_1.38.0         
## [109] compiler_4.1.0           BiocIO_1.2.0             RColorBrewer_1.1-2      
## [112] fitdistrplus_1.1-5       Rsamtools_2.8.0          XVector_0.32.0          
## [115] listenv_0.8.0            pbapply_1.5-0            htmlTable_2.2.1         
## [118] Formula_1.2-4            MASS_7.3-54              mgcv_1.8-36             
## [121] tidyselect_1.1.1         stringi_1.7.4            textshaping_0.3.5       
## [124] highr_0.9                yaml_2.2.1               latticeExtra_0.6-29     
## [127] ggrepel_0.9.1            sass_0.4.0               VariantAnnotation_1.38.0
## [130] fastmatch_1.1-3          tools_4.1.0              future.apply_1.8.1      
## [133] rstudioapi_0.13          foreign_0.8-81           lsa_0.73.2              
## [136] gridExtra_2.3            farver_2.1.0             Rtsne_0.15              
## [139] digest_0.6.27            BiocManager_1.30.16      FNN_1.1.3               
## [142] shiny_1.6.0              qlcMatrix_0.9.7          Rcpp_1.0.7              
## [145] later_1.3.0              RcppAnnoy_0.0.19         httr_1.4.2              
## [148] AnnotationDbi_1.54.1     biovizBase_1.40.0        colorspace_2.0-2        
## [151] XML_3.99-0.8             fs_1.5.0                 tensor_1.5              
## [154] reticulate_1.22          splines_4.1.0            uwot_0.1.10             
## [157] RcppRoll_0.3.0           spatstat.utils_2.2-0     pkgdown_1.6.1.9001      
## [160] plotly_4.9.4.1           systemfonts_1.0.2        xtable_1.8-4            
## [163] jsonlite_1.7.2           R6_2.5.1                 Hmisc_4.5-0             
## [166] pillar_1.6.2             htmltools_0.5.2          mime_0.11               
## [169] glue_1.4.2               fastmap_1.1.0            BiocParallel_1.26.2     
## [172] codetools_0.2-18         utf8_1.2.2               lattice_0.20-44         
## [175] bslib_0.3.0              spatstat.sparse_2.0-0    tibble_3.1.4            
## [178] curl_4.3.2               leiden_0.3.9             survival_3.2-11         
## [181] docopt_0.7.1             rmarkdown_2.11           desc_1.3.0              
## [184] munsell_0.5.0            GenomeInfoDbData_1.2.6   reshape2_1.4.4          
## [187] gtable_0.3.0             spatstat.core_2.3-0