JCVI: A Versatile Toolkit for Comparative Genomics Analysis

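A collection of Python libraries to parse bioinformatics files or perform computations related to assembly, annotation, and comparative genomics.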


Authors	Haibao Tang (tanghaibao)
	Vivek Krishnakumar (vivekkrish)
	Xingtan Zhang (tangerzhang)
	Won Cheol Yim (wyim-pgl)
Email	[email protected]
License	BSD

How to cite

Tip

JCVI is now published in iMeta!

Tang et al. (2024) JCVI: A versatile toolkit for comparative genomics analysis. iMeta

Contents

The following modules are available as generic bioinformatics handling methods.

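algorithms
- Linear programming solver with SCIP and GLPK.
- Supermap: find a set of non-overlapping anchors in BLAST or NUCMER output.
- Longest or heaviest increasing subsequence.
- Matrix operations.
apps
- GenBank entrez accession, Phytozome, Ensembl and SRA downloader.
- Calculate (non)synonymous substitution rates between gene pairs.
- Basic phylogenetic tree construction using PHYLIP, PhyML, or RAxML, and visualization.
- Wrappers for BLAST+, LASTZ, LAST, BWA, BOWTIE2, CLC, CDHIT, CAP3, etc.
formats
- Currently supports .ace format (phrap, cap3, etc.), .agp (goldenpath), .bed format, .blast output, .btab format, .coords format (nucmer output), .fasta format, .fastq format, .fpc format, .gff format, obo format (ontology), .psl format (UCSC blat, GMAP, etc.), .posmap format (Celera assembler output), .sam format (read mapping), .contig format (TIGR assembly format), etc.
graphics
- BLAST or synteny dot plot.
- Histogram using R and ASCII art.
- Paint regions on a set of chromosomes.
- Macro-synteny and micro-synteny plots.
utils
- Grouper can be used as a disjoint set data structure.
- range contains common range operations, such as overlap and chaining.
- Miscellaneous cookbook recipes, iterators, decorators, table utilities.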
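
Two of the generic building blocks listed above, the longest increasing subsequence and the disjoint set, are easy to illustrate in isolation. The following is a minimal standalone sketch in plain Python. It does not use or reproduce the library's own code, and the names `longest_increasing_subsequence`, `DisjointSet`, `join`, `joined` and `groups` are chosen here for illustration only; they are not confirmed to match the library's API.

```python
from bisect import bisect_left


def longest_increasing_subsequence(values):
    """Return one longest strictly increasing subsequence of `values`."""
    tails = []        # tails[k]: index of the smallest tail of a run of length k + 1
    tail_values = []  # the values at those indices, kept sorted for bisect
    prev = [-1] * len(values)  # back-pointers used to rebuild the answer
    for i, v in enumerate(values):
        k = bisect_left(tail_values, v)
        if k == len(tails):
            tails.append(i)
            tail_values.append(v)
        else:
            tails[k] = i
            tail_values[k] = v
        prev[i] = tails[k - 1] if k > 0 else -1
    # Walk the back-pointers from the end of the longest run.
    result = []
    i = tails[-1] if tails else -1
    while i != -1:
        result.append(values[i])
        i = prev[i]
    return result[::-1]


class DisjointSet:
    """Illustrative union-find structure (hypothetical, not the library's Grouper)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the representative of the set containing x, adding x if new."""
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def join(self, *items):
        """Merge the sets that contain the given items."""
        if not items:
            return
        first = self.find(items[0])
        for item in items[1:]:
            self.parent[self.find(item)] = first

    def joined(self, a, b):
        """Report whether a and b are in the same set."""
        return self.find(a) == self.find(b)

    def groups(self):
        """Return the current sets as a list of lists."""
        out = {}
        for x in list(self.parent):
            out.setdefault(self.find(x), []).append(x)
        return list(out.values())
```

In a synteny setting, the increasing subsequence would typically run over gene ranks of anchor pairs, and the disjoint set would collect genes into families; both uses are stated here as an illustration rather than as a description of the library's internals.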

Then there are modules that contain domain-specific methods.

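assembly
- K-mer histogram analysis.
- Preparation and validation of tiling paths for clone-based assemblies.
- Scaffolding through ALLMAPS, optical maps and genetic maps.
- Pre-assembly and post-assembly QC procedures.
annotation
- Training of ab initio gene predictors.
- Calculate gene, exon and intron statistics.
- Wrappers for PASA and EVM.
- Launch multiple MAKER processes.
compara
- C-score based BLAST filter.
- Synteny scan (de novo) and lift over (find nearby anchors).
- Ancestral genome reconstruction using Sankoff's and PAR methods.
- Ortholog and tandem gene duplicate finder.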
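
As an illustration of what a K-mer histogram is, the sketch below counts every k-mer in a set of sequences and then tabulates how many distinct k-mers occur once, twice, and so on. It is a simplified standalone example, not the library's implementation: the function name `kmer_histogram` is hypothetical, and reverse complements are not merged, whereas dedicated k-mer counters usually count canonical k-mers.

```python
from collections import Counter


def kmer_histogram(sequences, k):
    """Map each multiplicity to the number of distinct k-mers seen that many times."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip k-mers that span ambiguous bases
                counts[kmer] += 1
    # Histogram over the multiplicities, not over the k-mers themselves.
    return Counter(counts.values())


if __name__ == "__main__":
    for multiplicity, n_kmers in sorted(kmer_histogram(["ACGTACGT"], 4).items()):
        print(multiplicity, n_kmers)
```

For real sequencing reads, the peak of such a histogram is commonly used to estimate coverage and genome size.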

Applications

Please visit the wiki for full-fledged applications.

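Dependencies

The following third-party Python packages are used by some routines in the library. These dependencies are not mandatory, since they are only used by a few modules.

- biopython
- numpy
- matplotlib

Other Python modules are used here and there in various scripts. The simplest approach is to install them with pip install when you see an ImportError.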
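
Instead of waiting for an ImportError, the optional packages can be probed up front. The sketch below is one way to do that with the standard library; the helper name `missing_packages` and the mapping of PyPI names to import names (`biopython` is imported as `Bio`) are supplied here for illustration and are not part of the package.

```python
from importlib.util import find_spec

# PyPI name -> top-level import name for the optional packages listed above.
OPTIONAL_PACKAGES = {
    "biopython": "Bio",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
}


def missing_packages(packages=OPTIONAL_PACKAGES):
    """Return the PyPI names whose import name cannot be found."""
    return sorted(
        name for name, module in packages.items() if find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("pip install " + " ".join(missing))
    else:
        print("All optional packages are importable.")
```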

Installation

The easiest way is to install it via PyPI:

 pip install jcvi

To install the development version:

 pip install git+https://github.com/tanghaibao/jcvi.git

Alternatively, if you want to install manually:

 cd ~/code  # or any directory of your choice
 git clone https://github.com/tanghaibao/jcvi.git
 cd jcvi
 pip install -e .

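In addition, a few modules may ask for the locations of external programs if the executables cannot be found in your PATH. The external programs that are often used are: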

- Kent tools
- BEDTOOLS
- EMBOSS
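
Whether these programs are visible on your PATH can be checked before running a module. The following is a minimal sketch using the standard library; the helper name `missing_programs` is hypothetical, and the executable names in the example (`faToTwoBit` for Kent tools, `bedtools`, and `needle` for EMBOSS) are illustrative picks, since each suite ships many executables and a given module may need different ones.

```python
import shutil


def missing_programs(programs):
    """Return the executables from `programs` that are not found on PATH."""
    return [name for name in programs if shutil.which(name) is None]


if __name__ == "__main__":
    # Example executables only; adjust to the tools your workflow calls.
    for name in missing_programs(["faToTwoBit", "bedtools", "needle"]):
        print(f"{name} not found on PATH")
```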

Most of the scripts in this package contain multiple actions. Using the fasta module as an example:

 Usage:
    python -m jcvi.formats.fasta ACTION


Available ACTIONs:
          clean | Remove irregular chars in FASTA seqs
           diff | Check if two fasta records contain same information
        extract | Given fasta file and seq id, retrieve the sequence in fasta format
          fastq | Combine fasta and qual to create fastq file
         filter | Filter the records by size
         format | Trim accession id to the first space or switch id based on 2-column mapping file
        fromtab | Convert 2-column sequence file to FASTA format
           gaps | Print out a list of gap sizes within sequences
             gc | Plot G+C content distribution
      identical | Given 2 fasta files, find all exactly identical records
            ids | Generate a list of headers
           info | Run `sequence_info` on fasta files
          ispcr | Reformat paired primers into isPcr query format
           join | Concatenate a list of seqs and add gaps in between
     longestorf | Find longest orf for CDS fasta
           pair | Sort paired reads to .pairs, rest to .fragments
    pairinplace | Starting from fragment.fasta, find if adjacent records can form pairs
           pool | Pool a bunch of fastafiles together and add prefix
           qual | Generate dummy .qual file based on FASTA file
         random | Randomly take some records
         sequin | Generate a gapped fasta file for sequin submission
       simulate | Simulate random fasta file for testing
           some | Include or exclude a list of records (also performs on .qual file if available)
           sort | Sort the records by IDs, sizes, etc.
        summary | Report the real no of bases and N's in fasta files
           tidy | Normalize gap sizes and remove small components in fasta
      translate | Translate CDS to proteins
           trim | Given a cross_match screened fasta, trim the sequence
      trimsplit | Split sequences at lower-cased letters
           uniq | Remove records that are the same
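
The same `python -m MODULE ACTION` pattern can be driven from another Python script. The sketch below only assembles and launches that command line; the helper names `action_command` and `run_action` are hypothetical, and passing the input file as a positional argument after the action is an assumption based on the action descriptions above, so check each action's own help text first.

```python
import subprocess
import sys


def action_command(module, action, *args):
    """Build the argument list for `python -m MODULE ACTION [ARGS...]`."""
    return [sys.executable, "-m", module, action, *map(str, args)]


def run_action(module, action, *args):
    """Run the action in a subprocess and return the CompletedProcess."""
    return subprocess.run(
        action_command(module, action, *args),
        capture_output=True,
        text=True,
        check=False,  # inspect returncode and stderr instead of raising
    )
```

For example, `run_action("jcvi.formats.fasta", "summary", "genome.fasta")` would invoke the summary action on a file named `genome.fasta`, which is a placeholder file name.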