This function calculates an adaptive inflection point ("knee") of the barcode distribution for each sample group. This is useful for determining a threshold for removing low-quality samples.

CalculateBarcodeInflections(
  object,
  barcode.column = "nCount_RNA",
  group.column = "orig.ident",
  threshold.low = NULL,
  threshold.high = NULL
)

## Arguments

object

Seurat object

barcode.column

Column to use as proxy for barcodes ("nCount_RNA" by default)

group.column

Column to group by ("orig.ident" by default)

threshold.low

Ignore barcodes of rank below this threshold in inflection calculation

threshold.high

Ignore barcodes of rank above this threshold in inflection calculation

## Value

Returns the Seurat object with a new list, CalculateBarcodeInflections, stored in the tools slot. The list contains the following values:

* barcode_distribution - contains the full barcode distribution across the entire dataset
* inflection_points - the calculated inflection points within the thresholds
* threshold_values - the provided (or default) threshold values to search within for inflections
* cells_pass - the cells that pass the inflection point calculation

## Details

The function operates by calculating the slope of the barcode number vs. rank distribution and then finding the point at which the distribution changes most steeply (the "knee"). In practice, this calculation often has to be restricted to a range of ranks, so the threshold.low and threshold.high parameters are provided to limit the calculation based on the rank of the barcodes. [BarcodeInflectionsPlot()] is provided as a convenience function to visualize and test different thresholds and thus arrive at more sensible end results.
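The idea of a rank-restricted knee search can be sketched in base R. This is only an illustration under stated assumptions, not the implementation used by this function: the helper `find_knee()`, its arguments `rank_low` and `rank_high`, the use of log10 axes, and the toy counts are all hypothetical.

```r
# Hypothetical helper, not part of Seurat: find the steepest drop in a
# barcode rank curve, searching only within a window of ranks.
find_knee <- function(counts, rank_low = 1, rank_high = length(counts)) {
  # Order barcodes from largest to smallest count; rank 1 is the largest
  counts <- sort(counts[counts > 0], decreasing = TRUE)
  ranks <- seq_along(counts)

  # Slope between consecutive points, assuming log10 on both axes;
  # slope[i] describes the step from rank i to rank i + 1
  slope <- diff(log10(counts)) / diff(log10(ranks))

  # Restrict the search to the requested window of ranks
  rank_high <- min(rank_high, length(slope))
  window <- rank_low:rank_high

  # The most negative slope marks the steepest drop (the "knee")
  knee_rank <- window[which.min(slope[window])]
  list(rank = knee_rank, count = counts[knee_rank])
}

# Toy data: 200 high-count barcodes followed by 800 low-count barcodes
toy_counts <- c(
  round(seq(6000, 4000, length.out = 200)),
  round(seq(40, 5, length.out = 800))
)

knee <- find_knee(toy_counts, rank_low = 50, rank_high = 500)
print(knee)
```

In this toy data the drop between the two populations lies inside the search window. Narrowing or shifting the window changes which part of the curve is considered, which is the role the threshold parameters play here.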

See [BarcodeInflectionsPlot()] to visualize the calculated inflection points and [SubsetByBarcodeInflections()] to subsequently subset the Seurat object.


## Author

Robert A. Amezquita, robert.amezquita@fredhutch.org

## Examples

data("pbmc_small")
CalculateBarcodeInflections(pbmc_small, group.column = 'groups')
#> An object of class Seurat
#> 230 features across 80 samples within 1 assay
#> Active assay: RNA (230 features, 20 variable features)
#>  2 dimensional reductions calculated: pca, tsne
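A fuller workflow might look like the following sketch. It assumes the `Tool()` accessor can be used to read the stored list, and the threshold values of 20 and 30 are arbitrary choices for illustration on this small dataset, not recommendations.

```r
library(Seurat)
data("pbmc_small")

# Keep the returned object, since the results are stored in its tools slot.
# The thresholds of 20 and 30 are arbitrary values chosen for illustration.
pbmc_small <- CalculateBarcodeInflections(
  pbmc_small,
  group.column = "groups",
  threshold.low = 20,
  threshold.high = 30
)

# Retrieve the stored list (assumes the Tool() accessor is available)
inflections <- Tool(object = pbmc_small, slot = "CalculateBarcodeInflections")
names(inflections)

# Inspect the inflection points calculated for each group
inflections$inflection_points

# Visualize the result, then subset the object accordingly
BarcodeInflectionsPlot(pbmc_small)
pbmc_subset <- SubsetByBarcodeInflections(object = pbmc_small)
```

Assigning the result back to the object matters here: calling the function without assignment, as in the example above, only prints the returned object and leaves `pbmc_small` unchanged.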