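Binnacle accurately computes the coverage of graph scaffolds and integrates seamlessly with leading binning methods such as MetaBAT2, MaxBin 2.0, and CONCOCT. Binning graph scaffolds, rather than contigs (the most common approach), improves the contiguity and quality of metagenomic bins and captures a broader set of the accessory elements of the reconstructed genomes.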
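To run Binnacle, you will need Python 3.7.x, BEDTools, Samtools, Biopython, NetworkX, NumPy, and Pandas.
These dependencies can be installed in a dedicated environment; detailed documentation on how to install the packages is provided on the Wiki. Binnacle uses the graph scaffolds output by the MetaCarvel scaffolding tool, so you will also need to download and install MetaCarvel. A step-by-step installation guide is available for MetaCarvel.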
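In general, when you have one or more metagenomic samples, you need to assemble, scaffold, and bin the contigs/scaffolds of each sample to generate metagenomic bins. We recommend MEGAHIT for assembly and MetaCarvel for scaffolding. We provide a helper guide that covers the assembly, scaffolding, and per-base coverage estimation steps.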
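Before moving on, it can help to sanity-check the per-base coverage file. The following is a minimal sketch, assuming the file is the tab-separated output of bedtools `genomecov -d` (contig name, 1-based position, depth). The function name `mean_depth_per_contig` is hypothetical and is not part of Binnacle.

```python
import csv
from collections import defaultdict


def mean_depth_per_contig(path):
    """Average the per-base depth of every contig in a `genomecov -d` style file.

    Assumed columns (tab-separated): contig name, 1-based position, depth.
    """
    totals = defaultdict(int)   # summed depth per contig
    lengths = defaultdict(int)  # number of positions seen per contig
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 3:
                continue  # skip blank or malformed lines
            contig, depth = row[0], int(row[2])
            totals[contig] += depth
            lengths[contig] += 1
    return {contig: totals[contig] / lengths[contig] for contig in totals}
```

Calling `mean_depth_per_contig("COVERAGE_SORTED.txt")` returns a dictionary mapping each contig to its mean depth, which is a quick way to spot empty or truncated coverage files.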
Follow the steps below to generate the files needed to run binning methods on graph scaffolds:
python Estimate_Abundances.py -g [ORIENTED.gml] -a [COVERAGE_SORTED.txt] -c [CONTIGS.fa] -d [OUTPUT_DIRECTORY]
usage: Estimate_Abundances.py [-h] [-g ASSEMBLY] [-a COVERAGE] [-bam BAMFILE]
[-bed BEDFILE] [-c CONTIGS] -d DIR [-o COORDS]
[-w WINDOW_SIZE] [-t THRESHOLD]
[-n NEIGHBOR_CUTOFF] [-p POSCUTOFF]
[-pre PREFIX]
binnacle: A tool for binning metagenomic datasets using assembly graphs and
scaffolds generated by metacarvel. Estimate_Abundances.py estimates abundance
for scaffolds generated by MetaCarvel. If the coordinates computed by binnacle
are specified then the abundance for each scaffold is estimated based on the
contig abundances and the coordinates. If the coordinates are not specified
then binnacle estimates the abundance from scratch. While calculating all vs
all abundances please specify the coordinates (Coordinates_After_Delinking.txt)
through the "coords" parameter. The abundances can be provided as a bed file,
bam file or a text file describing the per base coverage obtained by running
the genomeCoverageBed program of the bedtools suite.
optional arguments:
-h, --help show this help message and exit
-g ASSEMBLY, --assembly ASSEMBLY
Assembly Graph generated by Metacarvel
-a COVERAGE, --coverage COVERAGE
Output generated by running genomecov -d on the bed
file generated by MetaCarvel.
-bam BAMFILE, --bamfile BAMFILE
Bam file from aligning reads to contigs
-bed BEDFILE, --bedfile BEDFILE
Bed file from aligning reads to contigs. If bed file
is provided please provide a fasta file of the contigs
-c CONTIGS, --contigs CONTIGS
Contigs generated by the assembler, contigs.fasta
-d DIR, --dir DIR output directory for results
-o COORDS, --coords COORDS
Coordinate file generated by Binnacle
-w WINDOW_SIZE, --window_size WINDOW_SIZE
Size of the sliding window for computing test
statistic to identify changepoints in coverages
(Default=1500)
-t THRESHOLD, --threshold THRESHOLD
Threshold to identify outliers (Default=99)
-n NEIGHBOR_CUTOFF, --neighbor_cutoff NEIGHBOR_CUTOFF
Filter size to identify outliers within (Default=100)
-p POSCUTOFF, --poscutoff POSCUTOFF
Position cutoff to consider delinking (Default=100)
-pre PREFIX, --prefix PREFIX
Prefix to be attached to all outputs
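The -w and -t options refer to a sliding-window test statistic and an outlier threshold for finding changepoints in coverage. The sketch below illustrates that general idea only; it is not Binnacle's implementation. It assumes the statistic is the absolute difference between the mean depth of the windows before and after a position, and that the threshold is a percentile (nearest-rank). The function name `changepoint_candidates` is hypothetical.

```python
import math
from itertools import accumulate


def changepoint_candidates(depth, window_size=1500, threshold=99.0):
    """Return positions where mean depth shifts sharply between adjacent windows.

    Illustrative only: the statistic and the percentile interpretation of
    `threshold` are assumptions, not Binnacle's actual method.
    """
    n = len(depth)
    if n < 2 * window_size:
        return []  # not enough positions for two full windows

    # Prefix sums make every window mean an O(1) lookup.
    prefix = [0] + list(accumulate(depth))
    positions = range(window_size, n - window_size + 1)

    stats = []
    for p in positions:
        left = (prefix[p] - prefix[p - window_size]) / window_size
        right = (prefix[p + window_size] - prefix[p]) / window_size
        stats.append(abs(right - left))

    # Nearest-rank percentile of the statistic serves as the outlier cutoff.
    ranked = sorted(stats)
    rank = max(1, math.ceil(threshold / 100 * len(ranked)))
    cutoff = ranked[rank - 1]

    return [p for p, s in zip(positions, stats) if s > cutoff]
```

On a scaffold whose depth is flat, the function returns an empty list; on a scaffold with one abrupt step in depth, it returns the positions clustered around the step.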
Arguments for estimating abundances within a single sample:
-g Path to oriented.gml from running MetaCarvel on the sample
-c Path to the contigs obtained by assembling the reads of the sample
-a Coverage of the contigs in this sample, obtained by mapping the sample's own reads to them -- see the Wiki for how to calculate coverage information
-d Output directory
Arguments for estimating the abundance of Sample 1 scaffolds from the reads of another sample (Sample 2):
-a Coverage of the contigs in Sample 1, obtained by mapping the reads of Sample 2 to them -- see the Wiki for how to calculate coverage information
-o Coordinates of the scaffolds from Sample 1, generated in the previous step
-d Same output directory as Sample 1
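The two kinds of invocation described above can be assembled programmatically when many samples are involved. This is a minimal sketch that only builds the command lines, using the -g, -a, -c, -o, and -d flags from the help text. The helper name `estimate_cmd`, all file names, and the location of Coordinates_After_Delinking.txt inside the output directory are assumptions for illustration.

```python
import shlex


def estimate_cmd(out_dir, coverage, graph=None, contigs=None, coords=None,
                 script="Estimate_Abundances.py"):
    """Build an Estimate_Abundances.py command line as an argument list."""
    cmd = ["python", script]
    if graph:
        cmd += ["-g", graph]      # oriented.gml from MetaCarvel
    cmd += ["-a", coverage]       # per-base coverage file
    if contigs:
        cmd += ["-c", contigs]    # assembled contigs
    if coords:
        cmd += ["-o", coords]     # coordinates from an earlier run
    cmd += ["-d", out_dir]
    return cmd


# Single sample: scaffolds of Sample 1, coverage from Sample 1 reads.
first = estimate_cmd("sample1_out", "s1_by_s1.txt",
                     graph="oriented.gml", contigs="contigs.fa")

# Cross sample: scaffolds of Sample 1, coverage from Sample 2 reads.
# The coordinates path below is an assumed location, not a documented one.
second = estimate_cmd("sample1_out", "s1_by_s2.txt",
                      coords="sample1_out/Coordinates_After_Delinking.txt")

for cmd in (first, second):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.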
python Collate.py -h
usage: Collate.py [-h] -d DIR [-m METHOD] [-k KEEP]
binnacle: A tool for binning metagenomic datasets using assembly graphs and
scaffolds generated by metacarvel. Estimate_Abundances.py estimates abundance
for scaffolds generated by MetaCarvel. The program Collate.py collects the
summary files generated by Estimate_Abundances.py
optional arguments:
-h, --help show this help message and exit
-d DIR, --dir DIR Output directory that contains the summary files
generated by running Estimate_Abundances.py
-m METHOD, --method METHOD
Binning method to format the output to. Presently we
support 1. Metabat 2. Maxbin 3. Concoct 4. Binnacle
(Default)
-k KEEP, --keep KEEP Retain the summary files generated by
Estimate_Abundances.py. Defaults to True
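Collating can be scripted in the same way as the abundance step. The sketch below passes only the required -d flag and defaults to a dry run that prints the command. The helper names `collate_cmd` and `run` are hypothetical. The -m and -k options are left out on purpose, because the help text does not show the exact values they accept.

```python
import subprocess


def collate_cmd(out_dir, script="Collate.py"):
    """Build a Collate.py command line for one output directory."""
    return ["python", script, "-d", out_dir]


def run(cmd, dry_run=True):
    """Print the command (dry run) or execute it and return its exit code."""
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=True).returncode


# Dry run by default, so nothing is executed until dry_run=False is passed.
run(collate_cmd("sample1_out"))
```

Keeping the dry run as the default makes it easy to review the commands for every sample before launching them.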
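Please see the Wiki for detailed instructions on setting up the Python environment, ways to calculate coverage, and a typical workflow for running Binnacle.
To visualize graph scaffolds, we recommend MetagenomeScope, a web-based viewer. The input to MetagenomeScope is assembly_graph_filtered.gml. MetagenomeScope has detailed documentation on installation and usage.
Please cite: Muralidharan HS, Shah N, Meisel JS and Pop M (2021) Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins. Front. Microbiol. 12:638561. doi: 10.3389/fmicb.2021.638561.
This tool is still under development. If you have any questions, please open an issue on GitHub or contact us.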
Harihara Muralidharan: [email protected]
Nidhi Shah: [email protected]