utils

This module provides utility functions for the SNP-IC and cpSNP-IC calculation pipelines. It exposes both a Click CLI group and Python-callable wrappers.

CLI Commands

oddSNP utils

Usage

oddSNP utils [OPTIONS] COMMAND [ARGS]...

count-lines

Count the number of lines in a BAM file using samtools

Arguments:

BAM: Name of the BAM file

Usage

oddSNP utils count-lines [OPTIONS] BAM

Options

--nproc <nproc>

Number of parallel processes to use

Arguments

BAM

Required argument

cslite-pileup

Run cellsnp-lite to generate pileup files

Arguments:

BAM: Input BAM file

REFERENCE: List of reference variants

BARCODES: List of cell barcodes

OUPATH: Path to directory to store results.

Usage

oddSNP utils cslite-pileup [OPTIONS] BAM REFERENCE BARCODES OUPATH

Options

--celltag <celltag>

Tag used to reference cell barcodes in the BAM.

--umitag <umitag>

Tag used to reference UMI codes in the BAM.

--mincount <mincount>

Minimum count threshold for pileup

--nproc <nproc>

Number of processes to use

--force

Overwrite previous results

Arguments

BAM

Required argument

REFERENCE

Required argument

BARCODES

Required argument

OUPATH

Required argument
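
The invocation above can also be assembled from Python, for instance when driving the CLI through subprocess. The following is a minimal sketch and not part of oddSNP: the helper names cslite_pileup_argv and run_cslite_pileup are hypothetical, and options left as None are omitted so the command's own defaults apply.

```python
import subprocess


def cslite_pileup_argv(bam, reference, barcodes, oupath, celltag=None,
                       umitag=None, mincount=None, nproc=None, force=False):
    """Build the argument list for `oddSNP utils cslite-pileup` (hypothetical helper)."""
    argv = ["oddSNP", "utils", "cslite-pileup"]
    # Options set to None are skipped so the CLI falls back to its own defaults.
    for flag, value in (("--celltag", celltag), ("--umitag", umitag),
                        ("--mincount", mincount), ("--nproc", nproc)):
        if value is not None:
            argv += [flag, str(value)]
    if force:
        argv.append("--force")
    # Positional arguments follow in the documented order.
    argv += [bam, reference, barcodes, oupath]
    return argv


def run_cslite_pileup(*args, **kwargs):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(cslite_pileup_argv(*args, **kwargs), check=True)
```

With placeholder file names, cslite_pileup_argv("sample.bam", "variants.vcf.gz", "barcodes.tsv", "pileup", nproc=8) corresponds to the command line oddSNP utils cslite-pileup --nproc 8 sample.bam variants.vcf.gz barcodes.tsv pileup.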

vireo

Run vireo to demultiplex single-cell data based on genotype information.

Arguments:

INPATH: Path to the input file containing the single-cell data to be demultiplexed.

OUPATH: Path to the output directory to store results.

Usage

oddSNP utils vireo [OPTIONS] INPATH OUPATH

Options

--genotype <genotype>

Path to the donors' genotype VCF file.

--genotag <genotag>

Genotype tag to use from the VCF file (GT, GP, or PL). Default is PL.

--ndonor <ndonor>

Number of donors in the sample. If not provided, vireo will try to estimate it automatically.

--nproc <nproc>

Number of parallel processes to use

--force

Overwrite target files.

Arguments

INPATH

Required argument

OUPATH

Required argument
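
The options above cover two situations: demultiplexing with known donor genotypes (--genotype, optionally with --genotag) and demultiplexing without them, where --ndonor may be given or left out. A minimal sketch of building either invocation from Python follows; vireo_argv is a hypothetical helper, and the only validation it performs is checking the genotype tag against the three values named in the help text.

```python
VALID_GENOTAGS = ("GT", "GP", "PL")  # values listed in the --genotag help text


def vireo_argv(inpath, oupath, genotype=None, genotag=None, ndonor=None):
    """Build the argument list for `oddSNP utils vireo` (hypothetical helper)."""
    if genotag is not None and genotag not in VALID_GENOTAGS:
        raise ValueError(f"genotag must be one of {VALID_GENOTAGS}, got {genotag!r}")
    argv = ["oddSNP", "utils", "vireo"]
    if genotype is not None:
        argv += ["--genotype", genotype]
    if genotag is not None:
        argv += ["--genotag", genotag]
    # Leaving --ndonor out lets vireo try to estimate the number of donors.
    if ndonor is not None:
        argv += ["--ndonor", str(ndonor)]
    return argv + [inpath, oupath]
```

The returned list can be passed to subprocess.run(argv, check=True) to execute the command.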

Python API

oddSNP.utils.assert_cellsnplite()[source]

Asserts that cellsnp-lite is installed and executable.

oddSNP.utils.assert_vireo()[source]

Asserts that vireo is installed and executable.
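
How these checks are implemented is not shown here. A common way to test that a command-line tool is installed and executable is shutil.which from the standard library, sketched below. The helper name require_executable and the choice of RuntimeError are assumptions for illustration, not oddSNP's code.

```python
import shutil


def require_executable(name):
    """Return the full path of `name` on PATH, or raise if it cannot be found."""
    # shutil.which returns None when no executable with this name is on PATH.
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"{name} was not found on PATH; install it first")
    return path
```

Under this sketch, the two checks would amount to require_executable("cellsnp-lite") and require_executable("vireo").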

oddSNP.utils.call_count_lines(bam, nproc)[source]

Python wrapper for count_lines().

Parameters:
  • bam – The BAM file to process

  • nproc – Number of parallel processes to use

Returns:

The number of lines in the BAM file as an integer
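
Because the wrapper returns a plain integer, counts for several files can be collected directly. In the minimal sketch below, count_lines_per_bam is a hypothetical helper, and the counting function is passed in as an argument (in practice oddSNP.utils.call_count_lines) so the snippet does not depend on the package being importable.

```python
def count_lines_per_bam(bams, counter, nproc=1):
    """Map each BAM path to its line count.

    `counter` is any callable with the documented (bam, nproc) signature,
    for example oddSNP.utils.call_count_lines.
    """
    return {bam: counter(bam, nproc) for bam in bams}
```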

oddSNP.utils.call_cslite_pileup(bam, reference, barcodes, oupath, celltag, umitag, mincount, nproc, force)[source]

Python wrapper for cslite_pileup().

Parameters:
  • bam – The BAM file to process

  • reference – List of reference variants

  • barcodes – List of cell barcodes

  • oupath – Path to directory to store results

  • celltag – Tag used to reference cell barcodes in the BAM

  • umitag – Tag used to reference UMI codes in the BAM

  • mincount – Minimum count threshold for pileup

  • nproc – Number of processes to use

  • force – If True, overwrites existing results

Returns:

None
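
With nine parameters, passing arguments by keyword keeps a call readable. The sketch below collects them in a dictionary that mirrors the documented signature. The helper names are hypothetical, and the values for celltag, umitag and mincount are illustrative choices (CB and UB are the tags used in 10x Genomics BAM files), not documented defaults of oddSNP.

```python
def pileup_kwargs(bam, reference, barcodes, oupath, nproc=1, force=False):
    """Collect keyword arguments matching the documented call_cslite_pileup signature."""
    return {
        "bam": bam,
        "reference": reference,
        "barcodes": barcodes,
        "oupath": oupath,
        "celltag": "CB",  # illustrative: corrected cell barcode tag in 10x BAMs
        "umitag": "UB",   # illustrative: corrected UMI tag in 10x BAMs
        "mincount": 20,   # illustrative threshold, not a documented default
        "nproc": nproc,
        "force": force,
    }


def run_pileup(**kwargs):
    """Call the wrapper; oddSNP is imported lazily so the sketch loads without it."""
    from oddSNP.utils import call_cslite_pileup
    return call_cslite_pileup(**kwargs)
```

A call would then look like run_pileup(**pileup_kwargs("sample.bam", "variants.vcf.gz", "barcodes.tsv", "pileup", nproc=8)), with placeholder file names.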

oddSNP.utils.call_vireo(inpath, oupath, genotype, genotag, ndonor, nproc, force)[source]

Python wrapper for vireo().

Parameters:
  • inpath – Path to the input file containing the single-cell data to be demultiplexed.

  • oupath – Path to the output directory to store results.

  • genotype – Path to the donors' genotype VCF file.

  • genotag – Genotype tag to use from the VCF file (GT, GP, or PL). Default is PL.

  • ndonor – Number of donors in the sample. If not provided, vireo will try to estimate it automatically.

  • nproc – Number of parallel processes to use

  • force – If True, overwrites existing results

oddSNP.utils.generate_bam_index(bam, nproc)[source]

Generate an index for the input BAM file using pysam.samtools

Parameters:
  • bam – The BAM file to index

  • nproc – Number of parallel processes to use for indexing

oddSNP.utils.generate_vcf_index(vcf, nproc)[source]

Generate an index for the input VCF file using pysam.bcftools

Parameters:
  • vcf – The VCF file to index

  • nproc – Number of parallel processes to use for indexing
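
The exact arguments oddSNP passes to pysam are not documented here. The sketch below shows one way such indexing could be done through pysam's samtools and bcftools dispatchers, which accept the same arguments as the corresponding command-line tools. The helper names and the choice of a tabix (.tbi) index for the VCF are assumptions.

```python
def bam_index_args(bam, nproc=1):
    """Arguments for `samtools index`; -@ sets the number of extra threads."""
    return ["-@", str(nproc), bam]


def vcf_index_args(vcf, nproc=1):
    """Arguments for `bcftools index`; --tbi requests a tabix index."""
    return ["--tbi", "--threads", str(nproc), vcf]


def index_bam(bam, nproc=1):
    # Imported lazily so the sketch can be loaded without pysam installed.
    import pysam.samtools
    pysam.samtools.index(*bam_index_args(bam, nproc))


def index_vcf(vcf, nproc=1):
    # bcftools can only index bgzip-compressed VCF (or BCF) files.
    import pysam.bcftools
    pysam.bcftools.index(*vcf_index_args(vcf, nproc))
```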