%global _empty_manifest_terminate_build 0
Name: python-jcvi
Version: 1.3.5
Release: 1
Summary: Python utility libraries on genome assembly, annotation and comparative genomics
License: BSD
URL: http://github.com/tanghaibao/jcvi
Source0: https://mirrors.aliyun.com/pypi/web/packages/54/41/853aa89aac24c68a4576289e3964a0b24be91075c76928323e7bba42d4e7/jcvi-1.3.5.tar.gz
BuildArch: noarch

Requires: python3-CrossMap
Requires: python3-PyPDF2
Requires: python3-biopython
Requires: python3-boto3
Requires: python3-brewer2mpl
Requires: python3-deap
Requires: python3-ete3
Requires: python3-ftpretty
Requires: python3-gffutils
Requires: python3-goatools
Requires: python3-graphviz
Requires: python3-jinja2
Requires: python3-matplotlib
Requires: python3-more-itertools
Requires: python3-natsort
Requires: python3-networkx
Requires: python3-numpy
Requires: python3-ortools
Requires: python3-pybedtools
Requires: python3-rich
Requires: python3-scikit-image
Requires: python3-scipy
Requires: python3-seaborn
Requires: python3-webcolors

%description
# JCVI utility libraries

[![DOI](https://zenodo.org/badge/doi/10.5281/zenodo.31631.svg)](https://doi.org/10.5281/zenodo.594205)
[![Latest PyPI version](https://img.shields.io/pypi/v/jcvi.svg)](https://pypi.python.org/pypi/jcvi)
[![bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/jcvi/README.html?highlight=jcvi)
[![Github Actions](https://github.com/tanghaibao/jcvi/workflows/build/badge.svg)](https://github.com/tanghaibao/jcvi/actions)
[![Downloads](https://pepy.tech/badge/jcvi)](https://pepy.tech/project/jcvi)

Collection of Python libraries to parse bioinformatics files, or perform
computation related to assembly, annotation, and comparative genomics.

|         |                                                                  |
| ------- | ---------------------------------------------------------------- |
| Authors | Haibao Tang ([tanghaibao](http://github.com/tanghaibao))         |
|         | Vivek Krishnakumar ([vivekkrish](https://github.com/vivekkrish)) |
|         | Jingping Li ([Jingping](https://github.com/Jingping))            |
|         | Xingtan Zhang ([tangerzhang](https://github.com/tangerzhang))    |
| Email   |                                                                  |
| License | [BSD](http://creativecommons.org/licenses/BSD/)                  |

## Citations

- If you use the MCscan pipeline for synteny inference, please cite:

  _Tang et al. (2008) Synteny and Collinearity in Plant Genomes. [Science](https://science.sciencemag.org/content/320/5875/486)_

  ![MCSCAN example](https://www.dropbox.com/s/9vl3ys3ndvimg4c/grape-peach-cacao.png?raw=1)

- If you use the ALLMAPS pipeline for genome scaffolding, please cite:

  _Tang et al. (2015) ALLMAPS: robust scaffold ordering based on multiple maps. [Genome Biology](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0573-1)_

  ![ALLMAPS animation](https://www.dropbox.com/s/jfs8xavcxix37se/ALLMAPS.gif?raw=1)

- For other uses, please cite the package directly:

  _Tang et al. (2015). jcvi: JCVI utility libraries. Zenodo. [10.5281/zenodo.31631](http://dx.doi.org/10.5281/zenodo.31631)_

  ![GRABSEEDS example](https://www.dropbox.com/s/yu9ehsi6sqifuaa/bluredges.png?raw=1)

## Contents

The following modules provide generic bioinformatics handling methods.

- algorithms

  - Linear programming solver with SCIP and GLPK.
  - Supermap: find a set of non-overlapping anchors in BLAST or NUCMER output.
  - Longest or heaviest increasing subsequence.
  - Matrix operations.

- apps

  - GenBank Entrez accession, Phytozome, Ensembl and SRA downloader.
  - Calculate (non)synonymous substitution rate between gene pairs.
  - Basic phylogenetic tree construction using PHYLIP, PhyML, or RAxML, and visualization.
  - Wrapper for BLAST+, LASTZ, LAST, BWA, BOWTIE2, CLC, CDHIT, CAP3, etc.

- formats

  Currently supports `.ace` format (phrap, cap3, etc.), `.agp`
  (goldenpath), `.bed` format, `.blast` output, `.btab` format,
  `.coords` format (`nucmer` output), `.fasta` format, `.fastq`
  format, `.fpc` format, `.gff` format, `.obo` format (ontology),
  `.psl` format (UCSC blat, GMAP, etc.), `.posmap` format (Celera
  assembler output), `.sam` format (read mapping), `.contig`
  format (TIGR assembly format), etc.

- graphics

  - BLAST or synteny dot plot.
  - Histogram using R and ASCII art.
  - Paint regions on a set of chromosomes.
  - Macro-synteny and micro-synteny plots.

- utils

  - Grouper can be used as a disjoint-set data structure.
  - range contains common range operations, like overlap and chaining.
  - Miscellaneous cookbook recipes, iterators, decorators, table utilities.

Then there are modules that contain domain-specific methods.

- assembly

  - K-mer histogram analysis.
  - Preparation and validation of tiling path for clone-based assemblies.
  - Scaffolding through ALLMAPS, optical map and genetic map.
  - Pre-assembly and post-assembly QC procedures.

- annotation

  - Training of _ab initio_ gene predictors.
  - Calculate gene, exon and intron statistics.
  - Wrapper for PASA and EVM.
  - Launch multiple MAKER processes.

- compara

  - C-score based BLAST filter.
  - Synteny scan (de-novo) and lift over (find nearby anchors).
  - Ancestral genome reconstruction using Sankoff's and PAR methods.
  - Ortholog and tandem gene duplicates finder.

## Applications

Please visit the [wiki](https://github.com/tanghaibao/jcvi/wiki) for full-fledged applications.

## Dependencies

The following third-party Python packages are used by some routines in the
library. These dependencies are _not_ mandatory since they are only used by a
few modules.

- [Biopython](http://www.biopython.org)
- [numpy](http://numpy.scipy.org)
- [matplotlib](http://matplotlib.org/)

Various scripts import other Python modules as well. The simplest approach is
to install them via `pip install` when you see an `ImportError`.

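The advice above (install a package once an `ImportError` appears) can also be applied up front. The snippet below is a minimal sketch, not part of jcvi: the helper name `missing_packages` is hypothetical, and it assumes the three packages listed above are imported as `Bio` (Biopython), `numpy` and `matplotlib`.

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level import names that cannot be found on this system."""
    # find_spec() returns None when a top-level module is not installed.
    return [name for name in import_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Biopython is imported as `Bio`; the other two use their package names.
    missing = missing_packages(["Bio", "numpy", "matplotlib"])
    if missing:
        print("Missing optional packages (install with pip):", ", ".join(missing))
    else:
        print("All optional packages are available.")
```

Note that an import name can differ from the name passed to `pip install` (for example, `Bio` is provided by the `biopython` distribution).
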
## Installation

The easiest way is to install it via PyPI:

```console
pip install jcvi
```

To install the development version:

```console
pip install git+https://github.com/tanghaibao/jcvi.git
```

Alternatively, if you want to install manually:

```console
cd ~/code  # or any directory of your choice
git clone https://github.com/tanghaibao/jcvi.git
cd jcvi
pip install -e .
```

In addition, a few modules may ask for the locations of external programs
if the executables cannot be found in your `PATH`. The external programs
that are often used are:

- [Kent tools](http://hgdownload.cse.ucsc.edu/admin/jksrc.zip)
- [BEDTOOLS](http://code.google.com/p/bedtools/)
- [EMBOSS](http://emboss.sourceforge.net/)

Most of the scripts in this package contain multiple actions. Taking
`fasta` as an example:

```console
Usage:
    python -m jcvi.formats.fasta ACTION


Available ACTIONs:
    clean       | Remove irregular chars in FASTA seqs
    diff        | Check if two fasta records contain same information
    extract     | Given fasta file and seq id, retrieve the sequence in fasta format
    fastq       | Combine fasta and qual to create fastq file
    filter      | Filter the records by size
    format      | Trim accession id to the first space or switch id based on 2-column mapping file
    fromtab     | Convert 2-column sequence file to FASTA format
    gaps        | Print out a list of gap sizes within sequences
    gc          | Plot G+C content distribution
    identical   | Given 2 fasta files, find all exactly identical records
    ids         | Generate a list of headers
    info        | Run `sequence_info` on fasta files
    ispcr       | Reformat paired primers into isPcr query format
    join        | Concatenate a list of seqs and add gaps in between
    longestorf  | Find longest orf for CDS fasta
    pair        | Sort paired reads to .pairs, rest to .fragments
    pairinplace | Starting from fragment.fasta, find if adjacent records can form pairs
    pool        | Pool a bunch of fastafiles together and add prefix
    qual        | Generate dummy .qual file based on FASTA file
    random      | Randomly take some records
    sequin      | Generate a gapped fasta file for sequin submission
    simulate    | Simulate random fasta file for testing
    some        | Include or exclude a list of records (also performs on .qual file if available)
    sort        | Sort the records by IDs, sizes, etc.
    summary     | Report the real no of bases and N's in fasta files
    tidy        | Normalize gap sizes and remove small components in fasta
    translate   | Translate CDS to proteins
    trim        | Given a cross_match screened fasta, trim the sequence
    trimsplit   | Split sequences at lower-cased letters
    uniq        | Remove records that are the same
```

When you need to use one action, run the script with that action's name:

```console
python -m jcvi.formats.fasta extract
```

This will tell you the options and arguments it expects.

**Feel free to check out the other scripts in the package; it is not just for FASTA.**

%package -n python3-jcvi
Summary: Python utility libraries on genome assembly, annotation and comparative genomics
Provides: python-jcvi
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip

%description -n python3-jcvi
# JCVI utility libraries

[![DOI](https://zenodo.org/badge/doi/10.5281/zenodo.31631.svg)](https://doi.org/10.5281/zenodo.594205)
[![Latest PyPI version](https://img.shields.io/pypi/v/jcvi.svg)](https://pypi.python.org/pypi/jcvi)
[![bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/jcvi/README.html?highlight=jcvi)
[![Github Actions](https://github.com/tanghaibao/jcvi/workflows/build/badge.svg)](https://github.com/tanghaibao/jcvi/actions)
[![Downloads](https://pepy.tech/badge/jcvi)](https://pepy.tech/project/jcvi)

Collection of Python libraries to parse bioinformatics files, or perform
computation related to assembly, annotation, and comparative genomics.

%package help
Summary: Development documents and examples for jcvi
Provides: python3-jcvi-doc

%description help
# JCVI utility libraries

[![DOI](https://zenodo.org/badge/doi/10.5281/zenodo.31631.svg)](https://doi.org/10.5281/zenodo.594205)
[![Latest PyPI version](https://img.shields.io/pypi/v/jcvi.svg)](https://pypi.python.org/pypi/jcvi)
[![bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/jcvi/README.html?highlight=jcvi)
[![Github Actions](https://github.com/tanghaibao/jcvi/workflows/build/badge.svg)](https://github.com/tanghaibao/jcvi/actions)
[![Downloads](https://pepy.tech/badge/jcvi)](https://pepy.tech/project/jcvi)

Collection of Python libraries to parse bioinformatics files, or perform
computation related to assembly, annotation, and comparative genomics.

%prep
%autosetup -n jcvi-1.3.5

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-jcvi -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Fri Jun 09 2023 Python_Bot - 1.3.5-1
- Package Spec generated