author | CoprDistGit <infra@openeuler.org> | 2023-06-20 04:27:19 +0000 |
---|---|---|
committer | CoprDistGit <infra@openeuler.org> | 2023-06-20 04:27:19 +0000 |
commit | 9075c9ff3909d371b32699a423592040a9712428 (patch) | |
tree | 43184bc3ea903da3625d322d7fe51803b2d33673 | |
parent | d09143c2122ccdd3fa44786796a2e18f4a40478c (diff) |
automatic import of python-GeneGrouper (openeuler20.03)
-rw-r--r-- | .gitignore | 1 |
-rw-r--r-- | python-genegrouper.spec | 585 |
-rw-r--r-- | sources | 1 |
3 files changed, 587 insertions, 0 deletions
@@ -0,0 +1 @@
+/GeneGrouper-1.0.3.tar.gz
diff --git a/python-genegrouper.spec b/python-genegrouper.spec
new file mode 100644
index 0000000..8303f54
--- /dev/null
+++ b/python-genegrouper.spec
@@ -0,0 +1,585 @@
+%global _empty_manifest_terminate_build 0
+Name: python-GeneGrouper
+Version: 1.0.3
+Release: 1
+Summary: Find and cluster genomic regions containing a seed gene
+License: MIT License
+URL: https://github.com/agmcfarland/GeneGrouper
+Source0: https://mirrors.aliyun.com/pypi/web/packages/26/9b/a432e0124b851931e00c00871b667a06f318bc23c46edab6fb7eb24a6c64/GeneGrouper-1.0.3.tar.gz
+BuildArch: noarch
+
+
+%description
+<img src="docs/overview_figure.png" alt="GeneGrouper overview figure" width=1000>
+[Why use GeneGrouper?](https://github.com/agmcfarland/GeneGrouper/wiki#what-is-genegrouper)
+[See GeneGrouper tutorial](https://github.com/agmcfarland/GeneGrouper/wiki/GeneGrouper-tutorial-with-data)
+[See GeneGrouper tutorial](https://github.com/agmcfarland/GeneGrouper/wiki/GeneGrouper-tutorial-with-data)
+[See GeneGrouper outputs](https://github.com/agmcfarland/GeneGrouper/wiki/Output-file-descriptions)
+[See FAQs](https://github.com/agmcfarland/GeneGrouper/wiki/Frequently-Asked-Questions)
+# Installation
+GeneGrouper can be installed using pip
+```pip install GeneGrouper```
+[GeneGrouper has multiple dependences.]((https://github.com/agmcfarland/GeneGrouper/wiki/Installation-and-dependencies#requirements-and-dependencies))
+Follow the code below to create a self-contained conda environment for GeneGrouper. **Recommended**
+**Installing Python and bioinformatic dependencies for grouping**
+```
+conda create -n GeneGrouper_env python=3.9
+source activate GeneGrouper_env #or try: conda activate GeneGrouper_env
+conda config --add channels defaults
+conda config --add channels bioconda
+conda config --add channels conda-forge
+pip install biopython scipy scikit-learn pandas matplotlib GeneGrouper
+conda install -c bioconda mcl blast mmseqs2 fasttree mafft
+```
+**Installing R and required packages for visualizations**
+```
+conda install -c conda-forge r-base=4.1.1 r-svglite r-reshape r-ggplot2 r-cowplot r-dplyr r-gggenes r-ape r-phytools r-BiocManager r-codetools
+# enter R environment
+R
+# install additional packages from CRAN
+install.packages('groupdata2',repos='https://cloud.r-project.org/', quiet=TRUE)
+# install additional packages from
+BiocManager::install("ggtree")
+# quit
+q(save="no")
+```
+[For more information, see the installation wiki page](https://github.com/agmcfarland/GeneGrouper/wiki/Installation-and-dependencies)
+# Inputs
+### GeneGrouper has two required inputs:
+1. A translated gene sequence in fasta format (with file extension .fasta/.txt)
+2. A folder containing RefSeq GenBank-format genomes (with the file extension .gbff). [See instructions to download many RefSeq genomes at a time.](https://github.com/agmcfarland/GeneGrouper/wiki/Frequently-Asked-Questions#1-where-can-i-download-genbank-format-refseq-genomes-with-file-extension-gbff)
+# Basic usage
+#### Use `build_database` to make a GeneGrouper database of your RefSeq .gbff genomes
+```
+GeneGrouper -g /path/to/gbff -d /path/to/main_directory \
+build_database
+```
+#### Use `find_regions` to search for regions containing a gene of interest and output to a search-specific directory
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+find_regions \
+-f /path/to/query_gene.fasta
+```
+#### Use `visualize --visual_type main` to output visualizations of group gene architectures and their distribution within genomes and taxa
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+visualize \
+--visual_type main
+```
+#### Use `visualize --visual_type group` to inspect a GeneGrouper group more closely. Replace <> with a group ID number.
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+visualize \
+--visual_type group <>
+```
+#### Use `visualize --visual_type tree` to make a phylogenetic tree of each group's seed gene
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+visualize \
+--visual_type tree
+```
+[See advanced usage examples](https://github.com/agmcfarland/GeneGrouper/wiki/Advanced-usage)
+[See tutorial with provided example data](https://github.com/agmcfarland/GeneGrouper/wiki/GeneGrouper-tutorial-with-data)
+# Outputs
+ 1. **For each search ```find_regions``` outputs:**
+* **Four** tabular files with quantitative and qualitative descriptions of grouping results.
+* **One** fasta file containing all genes used in the analysis.
+2. **For each search, ```visualize --visual_type main``` outputs:**
+* **Three** main visualizations provided.
+3. **For each search, ```visualize --visual_type group \--group_label <n>``` outputs:**
+* **One** additional visualization per group, where ```--group_label <n>``` has `<n>` replaced with the group number.
+* **Two** tabular files containing subgroup information for each ```--group_label <n>``` supplied.
+4. **For each search, ```visualize --visual_type tree``` outputs:**
+* **One** phylogenetic tree of each seed gene in each group.
+[See complete output file descriptions](https://github.com/agmcfarland/GeneGrouper/wiki/Output-file-descriptions)
+Each search and visualization will have the following file structure. Files under `visualizations` may differ.
+```
+├── main_directory
+│ ├── search_results
+│ │ ├── group_statistics_summmary.csv
+│ │ ├── representative_group_member_summary.csv
+│ │ ├── group_taxa_summary.csv
+│ │ ├── group_regions.csv
+│ │ ├── group_region_seqs.faa
+│ │ ├── visualizations
+│ │ │ ├── group_summary.png
+│ │ │ ├── groups_by_taxa.png
+│ │ │ ├── taxa_searched.png
+│ │ │ ├── inspect_group_-1.png
+│ │ │ ├── representative_seed_phylogeny.png
+│ │ ├── internal_data
+│ │ ├── subgroups
+│ │ ├── seed_results.db
+```
+# Usage options
+### Global flags
+```
+usage: GeneGrouper [-h] [-d] [-n] [-g] [-t]
+ {build_database,find_regions,visualize} ...
+ -h, --help show this help message and exit
+ -d , --project_directory
+ Main directory to contain the base files used for
+ region searching and clustering. Default=current
+ directory.
+ -n , --search_name Name of the directory to contain search-specific
+ results. Default=region_search
+ -g , --genomes_directory
+ Directory containing genbank-file format genomes with
+ the suffix .gbff. Default=./genomes.
+ -t , --threads Number of threads to use. Default=all threads.
+```
+### Subcommands
+```
+ build_database Convert a set of genomes into a useable format for
+ GeneGrouper
+ find_regions Find regions given a translated gene and a set of
+ genomes
+ visualize Visualize GeneGrouper outputs. Three visualization options are provided.
+ Check the --visual_type help description.
+```
+### Subcommand flags
+```build_database```
+```
+usage: GeneGrouper build_database [-h]
+ -h, --help show this help message and exit
+```
+```find_regions```
+```
+usage: GeneGrouper find_regions [-h] -f [-us] [-ds] [-i] [-c] [-hk] [--min_group_size] [-re] [--force]
+ -h, --help show this help message and exit
+ -f , --query_file Provide the absolute path to a fasta file containing a translated gene sequence.
+ -us , --upstream_search
+ Upstream search length in basepairs. Default=10000
+ -ds , --downstream_search
+ Downstream search length in basepairs. Default=10000
+ -i , --seed_identity
+ Identity cutoff for initial blast search. Default=60
+ -c , --seed_coverage
+ Coverage cutoff for initial blast search. Default=90
+ -hk , --seed_hits_kept
+ Number of blast hits to keep. Default=None
+ --min_group_size
+ The minimum number of gene regions to constitute a group. Default=ln(jaccard distance length)
+ -re , --recluster_iterations
+ Number of region re-clustering attempts after the initial clustering. Default=0
+ --force Flag to overwrite search name directory.
+```
+```visualize```
+```
+usage: GeneGrouper visualize [-h] [--visual_type] [--group_label]
+ --visual_type Choices: [main, group, tree]. Use main for main visualizations. Use group to
+ inspect specific group. Use tree for a phylogenetic tree of representative
+ seed sequencess. Default=main
+ --group_label The integer identifier of the group you wish to inspect. Default=-1
+ --image_format Choices: [png, svg]. Output image format. Use svg if you want to edit the
+ images. Default=png.
+ --tip_label_type Choices: [full, group]. Use full to include the sequence ID followed by group
+ ID. Use group to only have the group ID. Default=full
+ --tip_label_size Specify the tip label size in the output image. Default=2
+```
+# Citation
+Alexander G McFarland, Nolan W Kennedy, Carolyn E Mills, Danielle Tullman-Ercek, Curtis Huttenhower, Erica M Hartmann, **Density-based binning of gene clusters to infer function or evolutionary history using GeneGrouper**, Bioinformatics, 2021;, btab752, https://doi.org/10.1093/bioinformatics/btab752
+# Contact
+Please message me at alexandermcfarland2022@u.northwestern.edu
+Follow me on twitter [@alexmcfarland_](https://twitter.com/alexmcfarland_)!
+
+%package -n python3-GeneGrouper
+Summary: Find and cluster genomic regions containing a seed gene
+Provides: python-GeneGrouper
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-GeneGrouper
+<img src="docs/overview_figure.png" alt="GeneGrouper overview figure" width=1000>
+[Why use GeneGrouper?](https://github.com/agmcfarland/GeneGrouper/wiki#what-is-genegrouper)
+[See GeneGrouper tutorial](https://github.com/agmcfarland/GeneGrouper/wiki/GeneGrouper-tutorial-with-data)
+[See GeneGrouper tutorial](https://github.com/agmcfarland/GeneGrouper/wiki/GeneGrouper-tutorial-with-data)
+[See GeneGrouper outputs](https://github.com/agmcfarland/GeneGrouper/wiki/Output-file-descriptions)
+[See FAQs](https://github.com/agmcfarland/GeneGrouper/wiki/Frequently-Asked-Questions)
+# Installation
+GeneGrouper can be installed using pip
+```pip install GeneGrouper```
+[GeneGrouper has multiple dependences.]((https://github.com/agmcfarland/GeneGrouper/wiki/Installation-and-dependencies#requirements-and-dependencies))
+Follow the code below to create a self-contained conda environment for GeneGrouper. **Recommended**
+**Installing Python and bioinformatic dependencies for grouping**
+```
+conda create -n GeneGrouper_env python=3.9
+source activate GeneGrouper_env #or try: conda activate GeneGrouper_env
+conda config --add channels defaults
+conda config --add channels bioconda
+conda config --add channels conda-forge
+pip install biopython scipy scikit-learn pandas matplotlib GeneGrouper
+conda install -c bioconda mcl blast mmseqs2 fasttree mafft
+```
+**Installing R and required packages for visualizations**
+```
+conda install -c conda-forge r-base=4.1.1 r-svglite r-reshape r-ggplot2 r-cowplot r-dplyr r-gggenes r-ape r-phytools r-BiocManager r-codetools
+# enter R environment
+R
+# install additional packages from CRAN
+install.packages('groupdata2',repos='https://cloud.r-project.org/', quiet=TRUE)
+# install additional packages from
+BiocManager::install("ggtree")
+# quit
+q(save="no")
+```
+[For more information, see the installation wiki page](https://github.com/agmcfarland/GeneGrouper/wiki/Installation-and-dependencies)
+# Inputs
+### GeneGrouper has two required inputs:
+1. A translated gene sequence in fasta format (with file extension .fasta/.txt)
+2. A folder containing RefSeq GenBank-format genomes (with the file extension .gbff). [See instructions to download many RefSeq genomes at a time.](https://github.com/agmcfarland/GeneGrouper/wiki/Frequently-Asked-Questions#1-where-can-i-download-genbank-format-refseq-genomes-with-file-extension-gbff)
+# Basic usage
+#### Use `build_database` to make a GeneGrouper database of your RefSeq .gbff genomes
+```
+GeneGrouper -g /path/to/gbff -d /path/to/main_directory \
+build_database
+```
+#### Use `find_regions` to search for regions containing a gene of interest and output to a search-specific directory
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+find_regions \
+-f /path/to/query_gene.fasta
+```
+#### Use `visualize --visual_type main` to output visualizations of group gene architectures and their distribution within genomes and taxa
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+visualize \
+--visual_type main
+```
+#### Use `visualize --visual_type group` to inspect a GeneGrouper group more closely. Replace <> with a group ID number.
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+visualize \
+--visual_type group <>
+```
+#### Use `visualize --visual_type tree` to make a phylogenetic tree of each group's seed gene
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+visualize \
+--visual_type tree
+```
+[See advanced usage examples](https://github.com/agmcfarland/GeneGrouper/wiki/Advanced-usage)
+[See tutorial with provided example data](https://github.com/agmcfarland/GeneGrouper/wiki/GeneGrouper-tutorial-with-data)
+# Outputs
+ 1. **For each search ```find_regions``` outputs:**
+* **Four** tabular files with quantitative and qualitative descriptions of grouping results.
+* **One** fasta file containing all genes used in the analysis.
+2. **For each search, ```visualize --visual_type main``` outputs:**
+* **Three** main visualizations provided.
+3. **For each search, ```visualize --visual_type group \--group_label <n>``` outputs:**
+* **One** additional visualization per group, where ```--group_label <n>``` has `<n>` replaced with the group number.
+* **Two** tabular files containing subgroup information for each ```--group_label <n>``` supplied.
+4. **For each search, ```visualize --visual_type tree``` outputs:**
+* **One** phylogenetic tree of each seed gene in each group.
+[See complete output file descriptions](https://github.com/agmcfarland/GeneGrouper/wiki/Output-file-descriptions)
+Each search and visualization will have the following file structure. Files under `visualizations` may differ.
+```
+├── main_directory
+│ ├── search_results
+│ │ ├── group_statistics_summmary.csv
+│ │ ├── representative_group_member_summary.csv
+│ │ ├── group_taxa_summary.csv
+│ │ ├── group_regions.csv
+│ │ ├── group_region_seqs.faa
+│ │ ├── visualizations
+│ │ │ ├── group_summary.png
+│ │ │ ├── groups_by_taxa.png
+│ │ │ ├── taxa_searched.png
+│ │ │ ├── inspect_group_-1.png
+│ │ │ ├── representative_seed_phylogeny.png
+│ │ ├── internal_data
+│ │ ├── subgroups
+│ │ ├── seed_results.db
+```
+# Usage options
+### Global flags
+```
+usage: GeneGrouper [-h] [-d] [-n] [-g] [-t]
+ {build_database,find_regions,visualize} ...
+ -h, --help show this help message and exit
+ -d , --project_directory
+ Main directory to contain the base files used for
+ region searching and clustering. Default=current
+ directory.
+ -n , --search_name Name of the directory to contain search-specific
+ results. Default=region_search
+ -g , --genomes_directory
+ Directory containing genbank-file format genomes with
+ the suffix .gbff. Default=./genomes.
+ -t , --threads Number of threads to use. Default=all threads.
+```
+### Subcommands
+```
+ build_database Convert a set of genomes into a useable format for
+ GeneGrouper
+ find_regions Find regions given a translated gene and a set of
+ genomes
+ visualize Visualize GeneGrouper outputs. Three visualization options are provided.
+ Check the --visual_type help description.
+```
+### Subcommand flags
+```build_database```
+```
+usage: GeneGrouper build_database [-h]
+ -h, --help show this help message and exit
+```
+```find_regions```
+```
+usage: GeneGrouper find_regions [-h] -f [-us] [-ds] [-i] [-c] [-hk] [--min_group_size] [-re] [--force]
+ -h, --help show this help message and exit
+ -f , --query_file Provide the absolute path to a fasta file containing a translated gene sequence.
+ -us , --upstream_search
+ Upstream search length in basepairs. Default=10000
+ -ds , --downstream_search
+ Downstream search length in basepairs. Default=10000
+ -i , --seed_identity
+ Identity cutoff for initial blast search. Default=60
+ -c , --seed_coverage
+ Coverage cutoff for initial blast search. Default=90
+ -hk , --seed_hits_kept
+ Number of blast hits to keep. Default=None
+ --min_group_size
+ The minimum number of gene regions to constitute a group. Default=ln(jaccard distance length)
+ -re , --recluster_iterations
+ Number of region re-clustering attempts after the initial clustering. Default=0
+ --force Flag to overwrite search name directory.
+```
+```visualize```
+```
+usage: GeneGrouper visualize [-h] [--visual_type] [--group_label]
+ --visual_type Choices: [main, group, tree]. Use main for main visualizations. Use group to
+ inspect specific group. Use tree for a phylogenetic tree of representative
+ seed sequencess. Default=main
+ --group_label The integer identifier of the group you wish to inspect. Default=-1
+ --image_format Choices: [png, svg]. Output image format. Use svg if you want to edit the
+ images. Default=png.
+ --tip_label_type Choices: [full, group]. Use full to include the sequence ID followed by group
+ ID. Use group to only have the group ID. Default=full
+ --tip_label_size Specify the tip label size in the output image. Default=2
+```
+# Citation
+Alexander G McFarland, Nolan W Kennedy, Carolyn E Mills, Danielle Tullman-Ercek, Curtis Huttenhower, Erica M Hartmann, **Density-based binning of gene clusters to infer function or evolutionary history using GeneGrouper**, Bioinformatics, 2021;, btab752, https://doi.org/10.1093/bioinformatics/btab752
+# Contact
+Please message me at alexandermcfarland2022@u.northwestern.edu
+Follow me on twitter [@alexmcfarland_](https://twitter.com/alexmcfarland_)!
+
+%package help
+Summary: Development documents and examples for GeneGrouper
+Provides: python3-GeneGrouper-doc
+%description help
+<img src="docs/overview_figure.png" alt="GeneGrouper overview figure" width=1000>
+[Why use GeneGrouper?](https://github.com/agmcfarland/GeneGrouper/wiki#what-is-genegrouper)
+[See GeneGrouper tutorial](https://github.com/agmcfarland/GeneGrouper/wiki/GeneGrouper-tutorial-with-data)
+[See GeneGrouper tutorial](https://github.com/agmcfarland/GeneGrouper/wiki/GeneGrouper-tutorial-with-data)
+[See GeneGrouper outputs](https://github.com/agmcfarland/GeneGrouper/wiki/Output-file-descriptions)
+[See FAQs](https://github.com/agmcfarland/GeneGrouper/wiki/Frequently-Asked-Questions)
+# Installation
+GeneGrouper can be installed using pip
+```pip install GeneGrouper```
+[GeneGrouper has multiple dependences.]((https://github.com/agmcfarland/GeneGrouper/wiki/Installation-and-dependencies#requirements-and-dependencies))
+Follow the code below to create a self-contained conda environment for GeneGrouper. **Recommended**
+**Installing Python and bioinformatic dependencies for grouping**
+```
+conda create -n GeneGrouper_env python=3.9
+source activate GeneGrouper_env #or try: conda activate GeneGrouper_env
+conda config --add channels defaults
+conda config --add channels bioconda
+conda config --add channels conda-forge
+pip install biopython scipy scikit-learn pandas matplotlib GeneGrouper
+conda install -c bioconda mcl blast mmseqs2 fasttree mafft
+```
+**Installing R and required packages for visualizations**
+```
+conda install -c conda-forge r-base=4.1.1 r-svglite r-reshape r-ggplot2 r-cowplot r-dplyr r-gggenes r-ape r-phytools r-BiocManager r-codetools
+# enter R environment
+R
+# install additional packages from CRAN
+install.packages('groupdata2',repos='https://cloud.r-project.org/', quiet=TRUE)
+# install additional packages from
+BiocManager::install("ggtree")
+# quit
+q(save="no")
+```
+[For more information, see the installation wiki page](https://github.com/agmcfarland/GeneGrouper/wiki/Installation-and-dependencies)
+# Inputs
+### GeneGrouper has two required inputs:
+1. A translated gene sequence in fasta format (with file extension .fasta/.txt)
+2. A folder containing RefSeq GenBank-format genomes (with the file extension .gbff). [See instructions to download many RefSeq genomes at a time.](https://github.com/agmcfarland/GeneGrouper/wiki/Frequently-Asked-Questions#1-where-can-i-download-genbank-format-refseq-genomes-with-file-extension-gbff)
+# Basic usage
+#### Use `build_database` to make a GeneGrouper database of your RefSeq .gbff genomes
+```
+GeneGrouper -g /path/to/gbff -d /path/to/main_directory \
+build_database
+```
+#### Use `find_regions` to search for regions containing a gene of interest and output to a search-specific directory
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+find_regions \
+-f /path/to/query_gene.fasta
+```
+#### Use `visualize --visual_type main` to output visualizations of group gene architectures and their distribution within genomes and taxa
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+visualize \
+--visual_type main
+```
+#### Use `visualize --visual_type group` to inspect a GeneGrouper group more closely. Replace <> with a group ID number.
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+visualize \
+--visual_type group <>
+```
+#### Use `visualize --visual_type tree` to make a phylogenetic tree of each group's seed gene
+```
+GeneGrouper -d /path/to/main_directory -n gene_search \
+visualize \
+--visual_type tree
+```
+[See advanced usage examples](https://github.com/agmcfarland/GeneGrouper/wiki/Advanced-usage)
+[See tutorial with provided example data](https://github.com/agmcfarland/GeneGrouper/wiki/GeneGrouper-tutorial-with-data)
+# Outputs
+ 1. **For each search ```find_regions``` outputs:**
+* **Four** tabular files with quantitative and qualitative descriptions of grouping results.
+* **One** fasta file containing all genes used in the analysis.
+2. **For each search, ```visualize --visual_type main``` outputs:**
+* **Three** main visualizations provided.
+3. **For each search, ```visualize --visual_type group \--group_label <n>``` outputs:**
+* **One** additional visualization per group, where ```--group_label <n>``` has `<n>` replaced with the group number.
+* **Two** tabular files containing subgroup information for each ```--group_label <n>``` supplied.
+4. **For each search, ```visualize --visual_type tree``` outputs:**
+* **One** phylogenetic tree of each seed gene in each group.
+[See complete output file descriptions](https://github.com/agmcfarland/GeneGrouper/wiki/Output-file-descriptions)
+Each search and visualization will have the following file structure. Files under `visualizations` may differ.
+```
+├── main_directory
+│ ├── search_results
+│ │ ├── group_statistics_summmary.csv
+│ │ ├── representative_group_member_summary.csv
+│ │ ├── group_taxa_summary.csv
+│ │ ├── group_regions.csv
+│ │ ├── group_region_seqs.faa
+│ │ ├── visualizations
+│ │ │ ├── group_summary.png
+│ │ │ ├── groups_by_taxa.png
+│ │ │ ├── taxa_searched.png
+│ │ │ ├── inspect_group_-1.png
+│ │ │ ├── representative_seed_phylogeny.png
+│ │ ├── internal_data
+│ │ ├── subgroups
+│ │ ├── seed_results.db
+```
+# Usage options
+### Global flags
+```
+usage: GeneGrouper [-h] [-d] [-n] [-g] [-t]
+ {build_database,find_regions,visualize} ...
+ -h, --help show this help message and exit
+ -d , --project_directory
+ Main directory to contain the base files used for
+ region searching and clustering. Default=current
+ directory.
+ -n , --search_name Name of the directory to contain search-specific
+ results. Default=region_search
+ -g , --genomes_directory
+ Directory containing genbank-file format genomes with
+ the suffix .gbff. Default=./genomes.
+ -t , --threads Number of threads to use. Default=all threads.
+```
+### Subcommands
+```
+ build_database Convert a set of genomes into a useable format for
+ GeneGrouper
+ find_regions Find regions given a translated gene and a set of
+ genomes
+ visualize Visualize GeneGrouper outputs. Three visualization options are provided.
+ Check the --visual_type help description.
+```
+### Subcommand flags
+```build_database```
+```
+usage: GeneGrouper build_database [-h]
+ -h, --help show this help message and exit
+```
+```find_regions```
+```
+usage: GeneGrouper find_regions [-h] -f [-us] [-ds] [-i] [-c] [-hk] [--min_group_size] [-re] [--force]
+ -h, --help show this help message and exit
+ -f , --query_file Provide the absolute path to a fasta file containing a translated gene sequence.
+ -us , --upstream_search
+ Upstream search length in basepairs. Default=10000
+ -ds , --downstream_search
+ Downstream search length in basepairs. Default=10000
+ -i , --seed_identity
+ Identity cutoff for initial blast search. Default=60
+ -c , --seed_coverage
+ Coverage cutoff for initial blast search. Default=90
+ -hk , --seed_hits_kept
+ Number of blast hits to keep. Default=None
+ --min_group_size
+ The minimum number of gene regions to constitute a group. Default=ln(jaccard distance length)
+ -re , --recluster_iterations
+ Number of region re-clustering attempts after the initial clustering. Default=0
+ --force Flag to overwrite search name directory.
+```
+```visualize```
+```
+usage: GeneGrouper visualize [-h] [--visual_type] [--group_label]
+ --visual_type Choices: [main, group, tree]. Use main for main visualizations. Use group to
+ inspect specific group. Use tree for a phylogenetic tree of representative
+ seed sequencess. Default=main
+ --group_label The integer identifier of the group you wish to inspect. Default=-1
+ --image_format Choices: [png, svg]. Output image format. Use svg if you want to edit the
+ images. Default=png.
+ --tip_label_type Choices: [full, group]. Use full to include the sequence ID followed by group
+ ID. Use group to only have the group ID. Default=full
+ --tip_label_size Specify the tip label size in the output image. Default=2
+```
+# Citation
+Alexander G McFarland, Nolan W Kennedy, Carolyn E Mills, Danielle Tullman-Ercek, Curtis Huttenhower, Erica M Hartmann, **Density-based binning of gene clusters to infer function or evolutionary history using GeneGrouper**, Bioinformatics, 2021;, btab752, https://doi.org/10.1093/bioinformatics/btab752
+# Contact
+Please message me at alexandermcfarland2022@u.northwestern.edu
+Follow me on twitter [@alexmcfarland_](https://twitter.com/alexmcfarland_)!
+
+%prep
+%autosetup -n GeneGrouper-1.0.3
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-GeneGrouper -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 1.0.3-1
+- Package Spec generated
@@ -0,0 +1 @@
+335a17ebf09267c83ce8ca658b222657 GeneGrouper-1.0.3.tar.gz
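
The final hunk of this diff adds a `sources` entry that pairs a 32-character hex digest with the tarball name. A minimal sketch of how such an entry could be checked against a downloaded tarball follows, assuming the digest is MD5 and the layout is `<digest> <filename>` separated by whitespace. The function names `parse_sources_line`, `md5_of_file`, and `verify_sources_entry` are hypothetical and are not part of any openEuler or RPM tooling.

```python
import hashlib
from pathlib import Path


def parse_sources_line(line: str) -> tuple[str, str]:
    """Split one `sources` entry into (hex digest, file name).

    Assumes the two fields are separated by whitespace, as in the hunk above.
    """
    digest, filename = line.split(maxsplit=1)
    return digest.lower(), filename.strip()


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks."""
    hasher = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_sources_entry(line: str, directory: Path) -> bool:
    """Check that the file named in a `sources` entry matches its digest."""
    digest, filename = parse_sources_line(line)
    return md5_of_file(directory / filename) == digest
```

With the tarball in the current directory, `verify_sources_entry("335a17ebf09267c83ce8ca658b222657 GeneGrouper-1.0.3.tar.gz", Path("."))` would return `True` only if the local file hashes to the recorded digest.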