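JCVI: a versatile toolkit for comparative genomics analysis

A collection of Python libraries for parsing bioinformatics files and performing computations related to assembly, annotation, and comparative genomics.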


Authors	Haibao Tang (tanghaibao)
	Vivek Krishnakumar (vivekkrish)
	Xingtan Zhang (tangerzhang)
	Won Cheol Yim (wyim-pgl)
Email	[email protected]
License	BSD

How to cite

Tip

JCVI is now published in iMeta!

Tang et al. (2024) JCVI: A versatile toolkit for comparative genomics analysis. iMeta

Contents

The following modules are available as generic bioinformatics handling methods.

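algorithms
- Linear programming solver with SCIP and GLPK.
- Supermap: find a set of non-overlapping anchors in BLAST or NUCMER output.
- Longest or heaviest increasing subsequence.
- Matrix operations.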
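
To make the "longest increasing subsequence" entry concrete, here is a minimal, self-contained sketch of the classic patience-sorting approach. It is not the JCVI implementation; the function name `longest_increasing_subsequence` is hypothetical and chosen only for illustration. In a synteny setting, one would typically sort anchors by position in one genome and apply such a routine to their positions in the other genome to obtain a collinear chain.

```python
from bisect import bisect_left


def longest_increasing_subsequence(values):
    """Return one longest strictly increasing subsequence of `values`.

    Hypothetical helper for illustration; runs in O(n log n).
    """
    tails = []        # tails[k]: index of the smallest tail of a run of length k + 1
    tail_values = []  # values at those indices, kept sorted so bisect works
    prev = [None] * len(values)  # back-pointers used to rebuild the subsequence

    for i, v in enumerate(values):
        k = bisect_left(tail_values, v)
        if k == len(tails):
            tails.append(i)
            tail_values.append(v)
        else:
            tails[k] = i
            tail_values[k] = v
        prev[i] = tails[k - 1] if k > 0 else None

    # Walk the back-pointers from the tail of the longest run.
    result = []
    i = tails[-1] if tails else None
    while i is not None:
        result.append(values[i])
        i = prev[i]
    return result[::-1]


if __name__ == "__main__":
    print(longest_increasing_subsequence([3, 1, 4, 1, 5, 9, 2, 6]))
```

A "heaviest" variant would maximize the sum of per-element weights instead of the count, which needs a different data structure than the plain bisect used here.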
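apps
- GenBank entrez accession, Phytozome, Ensembl and SRA downloader.
- Calculate (non)synonymous substitution rate between gene pairs.
- Basic phylogenetic tree construction using PHYLIP, PhyML, or RAxML, and visualization.
- Wrapper for BLAST+, LASTZ, LAST, BWA, BOWTIE2, CLC, CDHIT, CAP3, etc.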
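formats
Currently supports .ace format (phrap, cap3, etc.), .agp (goldenpath), .bed format, .blast output, .btab format, .coords format (nucmer output), .fasta format, .fastq format, .fpc format, .gff format, obo format (ontology), .psl format (UCSC blat, GMAP, etc.), .posmap format (Celera assembler output), .sam format (read mapping), .contig format (TIGR assembly format), etc.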
graphics
- BLAST or synteny dot plot.
- Histogram using R and ASCII art.
- Paint regions on a set of chromosomes.
- Macro-synteny and micro-synteny plots.
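utils
- Grouper can be used as a disjoint-set data structure.
- range contains common range operations, like overlap and chaining.
- Miscellaneous cookbook recipes, iterator decorators, table utilities.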
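
For readers unfamiliar with the two ideas mentioned above, the following sketch shows what a disjoint-set structure and a range-overlap test look like in plain Python. The names `DisjointSet` and `ranges_overlap` are hypothetical and do not reproduce the JCVI `Grouper` or `range` APIs; intervals are assumed to be closed `(start, end)` pairs on the same sequence.

```python
class DisjointSet:
    """Minimal union-find with path compression (illustrative only)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the representative of the set containing x."""
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets containing a and b."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def groups(self):
        """Return all sets as a sorted list of sorted member lists."""
        members = {}
        for item in list(self.parent):
            members.setdefault(self.find(item), []).append(item)
        return sorted(sorted(group) for group in members.values())


def ranges_overlap(a, b):
    """True if closed intervals a=(start, end) and b=(start, end) share a position."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    ds = DisjointSet()
    ds.union("geneA", "geneB")
    ds.union("geneB", "geneC")
    ds.union("geneD", "geneE")
    print(ds.groups())
    print(ranges_overlap((1, 10), (10, 20)))
```

A typical use in comparative genomics is to merge pairwise relationships (for example, homologous gene pairs) into families by calling `union` on each pair and then reading back `groups()`.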

Then there are modules that contain domain-specific methods.

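assembly
- K-mer histogram analysis.
- Preparation and validation of tiling path for clone-based assemblies.
- Scaffolding through ALLMAPS, optical map and genetic map.
- Pre-assembly and post-assembly QC procedures.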
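
As a rough illustration of what a k-mer histogram is, the sketch below counts k-mers in a few sequences and then tabulates how many distinct k-mers occur once, twice, and so on. It is a toy under stated assumptions: the function name `kmer_histogram` is hypothetical, k-mers are not canonicalized (a k-mer and its reverse complement are counted separately), and k-mers containing `N` are skipped. Real analyses rely on dedicated k-mer counters for large read sets.

```python
from collections import Counter


def kmer_histogram(sequences, k):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip k-mers spanning ambiguous bases
                counts[kmer] += 1
    # Histogram of the counts themselves.
    return Counter(counts.values())


if __name__ == "__main__":
    hist = kmer_histogram(["ACGTACGT"], k=4)
    for multiplicity in sorted(hist):
        print(multiplicity, hist[multiplicity])
```

In sequencing data, the main peak of such a histogram is commonly used to estimate coverage, and from it genome size.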
annotation
- Training of ab initio gene predictors.
- Calculate gene, exon and intron statistics.
- Wrapper for PASA and EVM.
- Launch multiple MAKER processes.
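compara
- C-score based BLAST filter.
- Synteny scan (de novo) and lift over (find nearby anchors).
- Ancestral genome reconstruction using Sankoff's and PAR method.
- Ortholog and tandem gene duplicates finder.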
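
The C-score filter can be sketched as follows, assuming the commonly used definition in which the C-score of a hit is its score divided by the larger of the best scores observed for its query and for its subject. This definition and the function name `cscore_filter` are assumptions for illustration, not taken from this document or from the JCVI source; hits are assumed to be `(query, subject, score)` tuples with positive scores.

```python
def cscore_filter(hits, cutoff=0.7):
    """Keep hits whose C-score is at least `cutoff`.

    Assumed definition: cscore = score / max(best score of query,
                                             best score of subject).
    """
    hits = list(hits)  # iterated twice below
    best_query, best_subject = {}, {}
    for query, subject, score in hits:
        best_query[query] = max(best_query.get(query, 0.0), score)
        best_subject[subject] = max(best_subject.get(subject, 0.0), score)

    kept = []
    for query, subject, score in hits:
        cscore = score / max(best_query[query], best_subject[subject])
        if cscore >= cutoff:
            kept.append((query, subject, score, round(cscore, 3)))
    return kept


if __name__ == "__main__":
    example = [
        ("a1", "b1", 100.0),
        ("a1", "b2", 60.0),
        ("a2", "b2", 80.0),
    ]
    for row in cscore_filter(example, cutoff=0.7):
        print(row)
```

With a cutoff of 0.7, the weaker `a1`/`b2` hit (C-score 0.6) is dropped while the two reciprocal best hits are kept.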

Applications

Please visit the wiki for full-fledged applications.

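Dependencies

The following third-party Python packages are used by some routines in the library. These dependencies are not mandatory, since only a few modules use them.

Biopython
numpy
matplotlib

Other Python modules are imported here and there in various scripts. The simplest approach is to install them with pip install when you see an ImportError.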
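
If you write your own scripts on top of optional packages, one way to make a missing dependency easier to diagnose is to wrap the import and report the matching `pip install` command. This is a generic pattern, not something the library is documented to do; the helper name `optional_import` is hypothetical.

```python
import importlib


def optional_import(module_name, pip_name=None):
    """Import a module, or raise ImportError with an install hint.

    `pip_name` covers packages whose PyPI name differs from the module
    name, for example module "Bio" is provided by the "biopython" package.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        package = pip_name or module_name
        raise ImportError(
            f"'{module_name}' is required for this step; "
            f"try: pip install {package}"
        ) from exc


if __name__ == "__main__":
    json = optional_import("json")  # standard library, always available
    print(json.dumps({"ok": True}))
```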

Installation

The easiest way is to install it via PyPI:

 pip install jcvi

To install the development version:

 pip install git+git://github.com/tanghaibao/jcvi.git

Alternatively, if you want to install manually:

 cd ~/code  # or any directory of your choice
 git clone git://github.com/tanghaibao/jcvi.git
 cd jcvi  # enter the cloned repository before installing
 pip install -e .

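In addition, a few modules might ask for the locations of external programs if the executables cannot be found in your PATH. The external programs that are often used are:

Kent tools
BEDTOOLS
EMBOSS

Most of the scripts in this package contain multiple actions. Using the fasta script as an example: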

 Usage:
    python -m jcvi.formats.fasta ACTION


Available ACTIONs:
          clean | Remove irregular chars in FASTA seqs
           diff | Check if two fasta records contain same information
        extract | Given fasta file and seq id, retrieve the sequence in fasta format
          fastq | Combine fasta and qual to create fastq file
         filter | Filter the records by size
         format | Trim accession id to the first space or switch id based on 2-column mapping file
        fromtab | Convert 2-column sequence file to FASTA format
           gaps | Print out a list of gap sizes within sequences
             gc | Plot G+C content distribution
      identical | Given 2 fasta files, find all exactly identical records
            ids | Generate a list of headers
           info | Run `sequence_info` on fasta files
          ispcr | Reformat paired primers into isPcr query format
           join | Concatenate a list of seqs and add gaps in between
     longestorf | Find longest orf for CDS fasta
           pair | Sort paired reads to .pairs, rest to .fragments
    pairinplace | Starting from fragment.fasta, find if adjacent records can form pairs
           pool | Pool a bunch of fastafiles together and add prefix
           qual | Generate dummy .qual file based on FASTA file
         random | Randomly take some records
         sequin | Generate a gapped fasta file for sequin submission
       simulate | Simulate random fasta file for testing
           some | Include or exclude a list of records (also performs on .qual file if available)
           sort | Sort the records by IDs, sizes, etc.
        summary | Report the real no of bases and N's in fasta files
           tidy | Normalize gap sizes and remove small components in fasta
      translate | Translate CDS to proteins
           trim | Given a cross_match screened fasta, trim the sequence
      trimsplit | Split sequences at lower-cased letters
           uniq | Remove records that are the same
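
The "one script, many actions" layout shown above can be imitated with a small dispatch table. The sketch below is a toy, not the JCVI code: the file name `toy_fasta.py` and the helpers `read_fasta`, `ids`, `summary`, and `main` are all hypothetical, and only two of the listed actions are mimicked, using their one-line descriptions from the listing as docstrings.

```python
import sys


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def ids(args):
    """Generate a list of headers"""
    return [header for header, _ in read_fasta(args[0])]


def summary(args):
    """Report the real no of bases and N's in fasta files"""
    real = n_count = 0
    for _, seq in read_fasta(args[0]):
        n_here = seq.upper().count("N")
        n_count += n_here
        real += len(seq) - n_here
    return {"real": real, "N": n_count}


# Dispatch table: action name -> function taking the remaining arguments.
ACTIONS = {"ids": ids, "summary": summary}


def main(argv):
    """Run the requested action, or return a usage message."""
    if not argv or argv[0] not in ACTIONS:
        lines = ["Usage:", "    python toy_fasta.py ACTION", "", "Available ACTIONs:"]
        lines += [f"{name:>15} | {func.__doc__}" for name, func in sorted(ACTIONS.items())]
        return "\n".join(lines)
    return ACTIONS[argv[0]](argv[1:])


if __name__ == "__main__":
    print(main(sys.argv[1:]))
```

Running the toy script without arguments prints a usage listing in the same spirit as the one above; running it as `python toy_fasta.py ids some.fasta` would list the headers of a FASTA file of your choice.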