Omnition is a pipeline designed to process data generated with Bio-Rad's ddSEQ™ Single-Cell 3’ RNA-Seq Kit or ddSEQ™ SureCell ATAC-Seq Library Prep Kit and the dsciATAC protocol. It performs debarcoding, alignment, bead merging, cell calling, and feature counting. Output is an HTML report and files for downstream biological analysis.
This page is intended to be a quick start. For the complete Omnition user guides, please see:
Omnition Single-cell 3’ RNA-Seq
Omnition Single-cell ATAC-Seq
Introduction to Omnition
Installation
System requirements
Hardware requirements
Software requirements
Download Omnition
Verify installation
Getting started
YAML configuration
Run ATAC pipeline
Run RNA pipeline
Human genome
Mouse genome
References
Running Omnition
Outputs
Omnition Analysis Software utilizes the Nextflow framework to connect individual processes, and runs the processes in virtual environments called containers using the Docker or Singularity container programs. Once Omnition is installed on a system meeting minimum requirements, Nextflow integration with GitHub and Docker will prepare workflows without any additional configuration.
Omnition is designed to run on a local Linux server, high performance computing (HPC) cluster, or cloud virtual machine, and has been tested on the 64-bit CentOS 7 and 8, Amazon Linux 2, and Ubuntu 18.04.6, 20.04 LTS, 21.04, and 21.10 Linux operating systems.
NOTE: Although they might be functional, Bio-Rad does not support additional Linux variants or other versions of the specified operating systems.
Internet connection
NOTE: For installation without direct internet access, please see the user manual.
Requirement | ATAC seq analysis | Combinatorial ATAC seq analysis | RNA seq analysis | Recommended for >=12x samples |
---|---|---|---|---|
CPU | 16 | 16 | 16 | 64 |
RAM | 64 GB | 128 GB | 64 GB | 512 GB |
IOPS* | 3,000 | 3,000 | 3,000 | 16,000 |
I/O throughput* | 125 MiB/s | 125 MiB/s | 125 MiB/s | 1,000 MiB/s |
EBS volume type* | gp3 | gp3 | gp3 | gp3 |
*AWS specific cloud computing specifications
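As a rough pre-flight check, the CPU and RAM rows of the table can be compared against the local machine. The sketch below is not part of Omnition: the helper names `meets_minimum` and `report` are hypothetical, and it assumes a Linux host with `nproc` and `/proc/meminfo`. The thresholds shown are the ATAC-seq/RNA-seq minimums from the table.

```shell
#!/usr/bin/env bash
# Sketch only: compare local CPU count and RAM with minimums from the table.
# Helper names are illustrative and are not provided by Omnition.

# meets_minimum ACTUAL REQUIRED -> exit status 0 when ACTUAL >= REQUIRED
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# report LABEL ACTUAL REQUIRED -> prints one OK/LOW line for a resource
report() {
    if meets_minimum "$2" "$3"; then
        echo "OK   $1: $2 (minimum $3)"
    else
        echo "LOW  $1: $2 (minimum $3)"
    fi
}

cpus=$(nproc)
# MemTotal is given in kB; convert to whole GiB, rounded down.
# The kernel reserves some memory, so a 64 GB machine can read slightly lower.
ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)

report "CPU cores" "$cpus" 16
report "RAM (GB)" "$ram_gb" 64
```

For combinatorial ATAC-seq or runs of 12 or more samples, substitute the corresponding values from the table.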
Nextflow (v22.04.0 to v23.10.1) or Nextflow with Conda
NOTE: Specify the Nextflow version in the Conda installation by using the following command:
conda install -c bioconda nextflow=<version>
Only one of the following container programs is needed: Docker (>=20.10.7) or Singularity (>=3.6.4)
NOTE: If using Docker, your `USER` must be added to the docker root user group before executing the pipeline. On shared systems, such as HPC clusters, this may not be possible due to security risks; in that case, execute the pipeline using the Singularity profile (default) instead. Verify with your system administrator that Docker or Singularity is available before using Omnition.
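The software requirements above can be checked with a few shell helpers. This is a minimal sketch, not an Omnition tool: `version_ge`, `version_in_range`, and `in_docker_group` are hypothetical names, the comparison relies on GNU `sort -V`, and the group name `docker` is assumed to be the one your Docker installation uses.

```shell
#!/usr/bin/env bash
# Sketch only: check the Nextflow version range and Docker group membership.

# version_ge A B -> status 0 when version A >= version B (GNU sort -V ordering)
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# version_in_range V MIN MAX -> status 0 when MIN <= V <= MAX
version_in_range() {
    version_ge "$1" "$2" && version_ge "$3" "$1"
}

# in_docker_group -> status 0 when the current user is in the "docker" group
in_docker_group() {
    id -nG | tr ' ' '\n' | grep -qx docker
}

if command -v nextflow >/dev/null 2>&1; then
    # Take the first x.y.z string printed by "nextflow -v"
    nf_version=$(nextflow -v | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
    if version_in_range "$nf_version" 22.04.0 23.10.1; then
        echo "Nextflow $nf_version is within the supported range"
    else
        echo "Nextflow $nf_version is outside 22.04.0 to 23.10.1"
    fi
else
    echo "nextflow not found on PATH"
fi

if command -v docker >/dev/null 2>&1 && ! in_docker_group; then
    echo "docker is installed, but $(id -un) is not in the docker group"
fi
```

The same `version_ge` helper can be applied to the Docker (>=20.10.7) or Singularity (>=3.6.4) minimums once the version string has been extracted from the tool's output.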
To download and run Omnition, use Nextflow to retrieve the latest Omnition version from GitHub.
nextflow pull BioRadOpenSource/omnition
Omnition includes small demonstration datasets to verify that the environment has been properly built and all software dependencies are in place. To verify that each analysis workflow is installed correctly, run the following commands for the container system installed on your computer (Docker or Singularity).
NOTE: Working files (and output files, unless otherwise specified) will be generated in the same directory the pipeline was run from on the command line.
mkdir /home/ubuntu/demo_data
cd /home/ubuntu/demo_data
NOTE: Specified file paths are for example purposes.
Docker:
# Verify single and mixed species 3’ RNA workflows
nextflow run BioRadOpenSource/omnition -profile demo_rna_single,docker --rna.output /home/ubuntu/demo_data/rna
nextflow run BioRadOpenSource/omnition -profile demo_rna_mixed_options,docker --rna.output /home/ubuntu/demo_data/rna_mixed
# Verify ATAC-seq and combinatorial ATAC-seq workflows
nextflow run BioRadOpenSource/omnition -profile demo_atac,docker --atac.output /home/ubuntu/demo_data/atac
nextflow run BioRadOpenSource/omnition -profile demo_catac,docker --catac.output /home/ubuntu/demo_data/catac
Singularity:
# Verify single and mixed species 3’ RNA workflows
nextflow run BioRadOpenSource/omnition -profile demo_rna_single,standard --rna.output /home/ubuntu/demo_data/rna
nextflow run BioRadOpenSource/omnition -profile demo_rna_mixed_options,standard --rna.output /home/ubuntu/demo_data/rna_mixed
# Verify ATAC-seq and combinatorial ATAC-seq workflows
nextflow run BioRadOpenSource/omnition -profile demo_atac,standard --atac.output /home/ubuntu/demo_data/atac
nextflow run BioRadOpenSource/omnition -profile demo_catac,standard --catac.output /home/ubuntu/demo_data/catac
NOTE: Please note the use of `-` and `--` in the execution commands. Arguments with a single `-` in front are Nextflow arguments, and those with `--` are user-defined parameters. You may also use the Nextflow `-r` flag to specify a git tag (i.e., a release), branch, or hash to execute the pipeline at that point in the git history.
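To make the split between Nextflow arguments and pipeline parameters concrete, the sketch below assembles one of the demo commands and optionally pins a revision with `-r`. The function `demo_command` is hypothetical, and `some-tag` in the usage lines is a placeholder, not a real Omnition release name. The function only prints the command; it does not run Nextflow.

```shell
#!/usr/bin/env bash
# Sketch only: build a demo command string, optionally pinned to a revision.

# demo_command WORKFLOW CONTAINER_PROFILE OUTPUT_DIR [REVISION]
#   WORKFLOW          rna_single, rna_mixed_options, atac, or catac
#   CONTAINER_PROFILE docker or standard (Singularity)
#   REVISION          optional git tag, branch, or hash for -r
demo_command() {
    workflow=$1
    container=$2
    outdir=$3
    revision=${4:-}
    # The assay prefix (rna, atac, catac) names the --<assay>.output parameter
    assay=${workflow%%_*}

    cmd="nextflow run BioRadOpenSource/omnition"
    if [ -n "$revision" ]; then
        cmd="$cmd -r $revision"            # single dash: Nextflow argument
    fi
    cmd="$cmd -profile demo_${workflow},${container}"   # single dash: Nextflow argument
    cmd="$cmd --${assay}.output $outdir"   # double dash: pipeline parameter
    echo "$cmd"
}

demo_command rna_single docker /home/ubuntu/demo_data/rna
demo_command atac standard /home/ubuntu/demo_data/atac some-tag
```

Printing the command first makes it easy to confirm which flags go to Nextflow and which go to the pipeline before launching a run.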
NOTE: When launching a Nextflow run, a randomly generated run name of the form [adjective_name] appears in the terminal. These names come from a list prepared by Nextflow. This is an inherent feature of Nextflow, not something added by Bio-Rad.
When the run finishes, you should see a message that it has completed without any failed tasks.
Omnition requires the following files before a workflow can be run:
- Genome FASTA and GTF files from ENSEMBL.
  - Genome reference sequences must be formatted as FASTA files.
  - Annotations must be formatted as GTF files.
  - Sequence names in the FASTA and GTF files must match.
- Directory with input FASTQs
NOTE: Input reference files can be compressed as gzip (.gz) files
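The requirement that sequence names match between the FASTA and GTF files can be checked before launching a run. The sketch below lists GTF sequence names that have no FASTA record. It is not an Omnition utility: the function names are hypothetical, "match" is interpreted here as "every name in GTF column 1 appears as a FASTA header", and `gzip -cdf` is used so that both plain and gzip-compressed inputs are read.

```shell
#!/usr/bin/env bash
# Sketch only: report GTF sequence names that are absent from the FASTA.

# fasta_names FILE -> header names (text after ">" up to first whitespace)
fasta_names() {
    gzip -cdf "$1" | awk '/^>/ {sub(/^>/, "", $1); print $1}' | sort -u
}

# gtf_names FILE -> column 1 of non-comment, tab-delimited lines
gtf_names() {
    gzip -cdf "$1" | awk -F '\t' '!/^#/ && NF > 1 {print $1}' | sort -u
}

# names_missing_from_fasta FASTA GTF -> GTF names with no FASTA record
names_missing_from_fasta() {
    fa_tmp=$(mktemp)
    gtf_tmp=$(mktemp)
    fasta_names "$1" > "$fa_tmp"
    gtf_names "$2" > "$gtf_tmp"
    # comm -13 keeps lines that occur only in the second (GTF) list
    comm -13 "$fa_tmp" "$gtf_tmp"
    rm -f "$fa_tmp" "$gtf_tmp"
}

# Example call (paths are illustrative):
# names_missing_from_fasta genome.fa.gz annotation.gtf.gz
```

Empty output means every sequence name used in the GTF was found in the FASTA.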
Omnition is compatible with references from ENSEMBL. The ENSEMBL Human/GRCh38 and Mouse/GRCm39 references are supported by Omnition. Other species from ENSEMBL are not supported and references produced by other sources (NCBI, GENCODE, etc.) are not compatible.
Human and mouse references can be obtained with the following commands.
mkdir -p ~/references/human
cd ~/references/human
NOTE: Specified file paths are for example purposes.
curl -o Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz http://ftp.ensembl.org/pub/release-106/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
curl -o Homo_sapiens.GRCh38.106.gtf.gz http://ftp.ensembl.org/pub/release-106/gtf/homo_sapiens/Homo_sapiens.GRCh38.106.gtf.gz
mkdir -p ~/references/mouse
cd ~/references/mouse
curl -o Mus_musculus.GRCm39.dna.primary_assembly.fa.gz http://ftp.ensembl.org/pub/release-106/fasta/mus_musculus/dna/Mus_musculus.GRCm39.dna.primary_assembly.fa.gz
curl -o Mus_musculus.GRCm39.106.gtf.gz http://ftp.ensembl.org/pub/release-106/gtf/mus_musculus/Mus_musculus.GRCm39.106.gtf.gz
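Large downloads can be interrupted, so it may be worth confirming that each reference archive is a complete gzip file before using it. The sketch below uses `gzip -t`, which tests archive integrity without extracting. The function name `check_gzip` is hypothetical; this checks only that the file is intact, not that it matches what ENSEMBL published.

```shell
#!/usr/bin/env bash
# Sketch only: test downloaded .gz reference files for truncation/corruption.

# check_gzip FILE... -> prints one status line per file;
# returns non-zero if any file is missing, empty, or fails "gzip -t"
check_gzip() {
    rc=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "MISSING  $f"
            rc=1
        elif gzip -t "$f" 2>/dev/null; then
            echo "OK       $f"
        else
            echo "CORRUPT  $f"
            rc=1
        fi
    done
    return "$rc"
}

# Example call from the download directory:
# check_gzip Mus_musculus.GRCm39.dna.primary_assembly.fa.gz Mus_musculus.GRCm39.106.gtf.gz
```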
The pipeline requires a YAML-formatted file with input/output paths and assay-specific parameters when not running the test data. A detailed description of the file contents, available parameters, and how to format them can be found in the user manual. Example YAML configs can be found under the example-yamls folder of this repo.
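The demo commands pass the output location as `--rna.output`, `--atac.output`, or `--catac.output`. Nextflow treats a dotted parameter name as a nested entry, so the same setting in a params file would plausibly be an `output` key nested under the assay name. The sketch below writes such a stub. It is deliberately incomplete: only the `output` key is taken from this page, the nested layout is an assumption, `write_params_stub` is a hypothetical helper, and the input, reference, and assay-specific keys must come from the user manual or the example-yamls folder.

```shell
#!/usr/bin/env bash
# Sketch only: write a starting-point params file with the one key shown on
# this page. It is NOT a complete Omnition configuration.

# write_params_stub ASSAY OUTPUT_DIR FILE
#   ASSAY is rna, atac, or catac (matching the --<assay>.output parameters)
write_params_stub() {
    cat > "$3" <<EOF
# Incomplete stub: only the output key is known from the demo commands.
# Add input, reference, and assay-specific keys per the user manual.
$1:
  output: $2
EOF
}

stub_file="${TMPDIR:-/tmp}/omnition_params_stub.yaml"
write_params_stub rna /home/ubuntu/demo_data/rna "$stub_file"
cat "$stub_file"
```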
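To run the ATAC pipeline:
With Singularity:
nextflow run BioRadOpenSource/omnition -params-file <path to ATAC YAML config>
With Docker:
nextflow run BioRadOpenSource/omnition -params-file <path to ATAC YAML config> -profile docker
To run the RNA pipeline:
With Singularity:
nextflow run BioRadOpenSource/omnition -params-file <path to RNA YAML config>
With Docker:
nextflow run BioRadOpenSource/omnition -params-file <path to RNA YAML config> -profile docker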
After completion, reports can be found in the `results/report/` subdirectory and intermediate files in the `results/Sample_Files/` subdirectory. Additionally, a Nextflow cache directory (`.nextflow/`) and working directory (`work/`) will be created in the directory from which Nextflow was executed. This allows interrupted or failed analyses to resume from their point of failure. To continue a run, add `-resume` to the execution commands above. You can delete the cache and working directories to save space after completion, although this disables the `-resume` feature.
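Once a run has finished and no resume is needed, the cache and working directories can be removed. The sketch below is one way to do that: `clean_run_dir` is a hypothetical helper, it deletes only the `work/` and `.nextflow/` directories named above, and it leaves `results/` untouched. It is destructive, so the call is shown as a comment.

```shell
#!/usr/bin/env bash
# Sketch only: remove the Nextflow work and cache directories after a run.
# After this, -resume can no longer reuse completed tasks from that run.

# clean_run_dir [DIR] -> prints the size of, then deletes, DIR/work and
# DIR/.nextflow; DIR defaults to the current directory
clean_run_dir() {
    dir=${1:-.}
    for sub in work .nextflow; do
        if [ -d "$dir/$sub" ]; then
            du -sh "$dir/$sub"
            rm -rf -- "$dir/$sub"
        fi
    done
}

# To resume instead, rerun the original command with -resume appended.
# To clean up (illustrative path):
# clean_run_dir /home/ubuntu/demo_data
```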