Diffstat (limited to 'python-bfg-y2h.spec')
-rw-r--r-- | python-bfg-y2h.spec | 360 |
1 files changed, 360 insertions, 0 deletions
diff --git a/python-bfg-y2h.spec b/python-bfg-y2h.spec
new file mode 100644
index 0000000..048ef29
--- /dev/null
+++ b/python-bfg-y2h.spec
@@ -0,0 +1,360 @@
+%global _empty_manifest_terminate_build 0
+Name: python-BFG-Y2H
+Version: 0.1.2
+Release: 1
+Summary: Analysis scripts for BFG-Y2H data
+License: MIT License
+URL: https://github.com/RyogaLi/BFG_Y2H
+Source0: https://mirrors.aliyun.com/pypi/web/packages/5c/7c/d61e8e2a4fa3a4bfb097b044609a1f3980860368e398d8c55fe9ede6bcea/BFG-Y2H-0.1.2.tar.gz
+BuildArch: noarch
+
+
+%description
+### BFG Y2H Analysis Pipeline ###
+
+**Requirements**
+
+* Python 3.7
+* Bowtie 2 and Bowtie2 build
+
+### Files required ###
+
+The pipeline requires reference files before running. They can be found on GALEN:
+```
+all reference files contain all the barcodes in fasta format
+path: /home/rothlab/rli/02_dev/08_bfg_y2h/bfg_data/reference/
+```
+Before running the pipeline, you need to copy everything in these two folders to your designated directory.
+
+
+#### Build new reference ###
+
+If you need to build a new reference for your analysis, please follow:
+
+1. You can refer to the create_fasta.py script to build the new fasta file
+2. Make sure the name for the sequences follows the format: `>*;ORF-BC-ID;*;up/dn`. In other words, the ORF-ID should always
+   be the second item, and the up/dn identifier should always be the last item. (see examples below)
+3. Example sequences in output fasta file:
+```
+>G1;YDL169C_BC-1;7;up
+CCCTTAGAACCGAGAGTGTGGGTTAAATGGGTGAATTCAGGGATTCACTCCGTTCGTCACTCAATAA
+
+>G1;YMR206W_BC-1;1.0;DB;up
+CCATACGAGCACATTACGGGGCTTGAGTTATATAGTCGATCCGGGCTAACTCGCATACCTCTGATAAC
+
+>G09;56346_BC-1;24126.0;DB;dn
+TCGATAGGTGCGTGTGAAGGATGTTCCCCCGGTCACCGGGCCAGTCCTCAGTCGCTCAGTCAAG
+```
+4. After making the fasta file, build index with bowtie2-build
+`bowtie2-build filename.fasta filename`
+5. Update main.py to use the summary files you generated
+    * Edit parse_input_files() to add a case
+
+### Running the pipeline ###
+
+* Install from pypi (recommended): `python -m pip install BFG-Y2H`
+
+* Install and build from github, the update.sh might need to be modified before you install
+```
+1. download the package from github
+2. inside the root folder, run ./update.sh
+```
+
+1. Input arguments:
+```
+usage: bfg [-h] [--fastq FASTQ] [--output OUTPUT] --mode MODE [--alignment]
+           [--ref REF] [--cutOff CUTOFF]
+
+BFG-Y2H
+
+optional arguments:
+  -h, --help       show this help message and exit
+  --fastq FASTQ    Path to all fastq files you want to analyze
+  --output OUTPUT  Output path for sam files
+  --mode MODE      pick yeast or human or virus or hedgy or LAgag
+  --alignment      turn on alignment
+  --ref REF        path to all reference files
+  --cutOff CUTOFF  assign cut off
+
+```
+
+2. All the input fastq files should have names following the format: y|hAD*DB*_GFP_(pre|med|high) (for human and yeast)
+
+3. Run the pipeline on GALEN
+```
+# this will run the pipeline using slurm
+# all the fastq files in the given folder will be processed
+# run with alignment
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --alignment --ref path/to/reference
+
+# if alignment was finished, you want to only do read counts
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --ref path/to/reference
+```
+
+### Output files ###
+
+* After running the pipeline, one folder will be generated for each group pair (yAD*DB*)
+
+* The folder called `GALEN_jobs` saves all the bash scripts submitted to GALEN
+
+* In the output folder for each group pair, we aligned R1 and R2 separately to the reference sequences for GFP_pre, GFP_med and GFP_high.
+
+* `*_sorted.sam`: Raw sam files generated from bowtie2
+
+* `*_noh.csv`: shrunken sam files, used for scoring
+
+* `*_counts.csv`: barcode counts for uptags, dntags, and combined (up+dn)
+
+
+
+
+%package -n python3-BFG-Y2H
+Summary: Analysis scripts for BFG-Y2H data
+Provides: python-BFG-Y2H
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-BFG-Y2H
+### BFG Y2H Analysis Pipeline ###
+
+**Requirements**
+
+* Python 3.7
+* Bowtie 2 and Bowtie2 build
+
+### Files required ###
+
+The pipeline requires reference files before running. They can be found on GALEN:
+```
+all reference files contain all the barcodes in fasta format
+path: /home/rothlab/rli/02_dev/08_bfg_y2h/bfg_data/reference/
+```
+Before running the pipeline, you need to copy everything in these two folders to your designated directory.
+
+
+#### Build new reference ###
+
+If you need to build a new reference for your analysis, please follow:
+
+1. You can refer to the create_fasta.py script to build the new fasta file
+2. Make sure the name for the sequences follows the format: `>*;ORF-BC-ID;*;up/dn`. In other words, the ORF-ID should always
+   be the second item, and the up/dn identifier should always be the last item. (see examples below)
+3. Example sequences in output fasta file:
+```
+>G1;YDL169C_BC-1;7;up
+CCCTTAGAACCGAGAGTGTGGGTTAAATGGGTGAATTCAGGGATTCACTCCGTTCGTCACTCAATAA
+
+>G1;YMR206W_BC-1;1.0;DB;up
+CCATACGAGCACATTACGGGGCTTGAGTTATATAGTCGATCCGGGCTAACTCGCATACCTCTGATAAC
+
+>G09;56346_BC-1;24126.0;DB;dn
+TCGATAGGTGCGTGTGAAGGATGTTCCCCCGGTCACCGGGCCAGTCCTCAGTCGCTCAGTCAAG
+```
+4. After making the fasta file, build index with bowtie2-build
+`bowtie2-build filename.fasta filename`
+5. Update main.py to use the summary files you generated
+    * Edit parse_input_files() to add a case
+
+### Running the pipeline ###
+
+* Install from pypi (recommended): `python -m pip install BFG-Y2H`
+
+* Install and build from github, the update.sh might need to be modified before you install
+```
+1. download the package from github
+2. inside the root folder, run ./update.sh
+```
+
+1. Input arguments:
+```
+usage: bfg [-h] [--fastq FASTQ] [--output OUTPUT] --mode MODE [--alignment]
+           [--ref REF] [--cutOff CUTOFF]
+
+BFG-Y2H
+
+optional arguments:
+  -h, --help       show this help message and exit
+  --fastq FASTQ    Path to all fastq files you want to analyze
+  --output OUTPUT  Output path for sam files
+  --mode MODE      pick yeast or human or virus or hedgy or LAgag
+  --alignment      turn on alignment
+  --ref REF        path to all reference files
+  --cutOff CUTOFF  assign cut off
+
+```
+
+2. All the input fastq files should have names following the format: y|hAD*DB*_GFP_(pre|med|high) (for human and yeast)
+
+3. Run the pipeline on GALEN
+```
+# this will run the pipeline using slurm
+# all the fastq files in the given folder will be processed
+# run with alignment
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --alignment --ref path/to/reference
+
+# if alignment was finished, you want to only do read counts
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --ref path/to/reference
+```
+
+### Output files ###
+
+* After running the pipeline, one folder will be generated for each group pair (yAD*DB*)
+
+* The folder called `GALEN_jobs` saves all the bash scripts submitted to GALEN
+
+* In the output folder for each group pair, we aligned R1 and R2 separately to the reference sequences for GFP_pre, GFP_med and GFP_high.
+
+* `*_sorted.sam`: Raw sam files generated from bowtie2
+
+* `*_noh.csv`: shrunken sam files, used for scoring
+
+* `*_counts.csv`: barcode counts for uptags, dntags, and combined (up+dn)
+
+
+
+
+%package help
+Summary: Development documents and examples for BFG-Y2H
+Provides: python3-BFG-Y2H-doc
+%description help
+### BFG Y2H Analysis Pipeline ###
+
+**Requirements**
+
+* Python 3.7
+* Bowtie 2 and Bowtie2 build
+
+### Files required ###
+
+The pipeline requires reference files before running. They can be found on GALEN:
+```
+all reference files contain all the barcodes in fasta format
+path: /home/rothlab/rli/02_dev/08_bfg_y2h/bfg_data/reference/
+```
+Before running the pipeline, you need to copy everything in these two folders to your designated directory.
+
+
+#### Build new reference ###
+
+If you need to build a new reference for your analysis, please follow:
+
+1. You can refer to the create_fasta.py script to build the new fasta file
+2. Make sure the name for the sequences follows the format: `>*;ORF-BC-ID;*;up/dn`. In other words, the ORF-ID should always
+   be the second item, and the up/dn identifier should always be the last item. (see examples below)
+3. Example sequences in output fasta file:
+```
+>G1;YDL169C_BC-1;7;up
+CCCTTAGAACCGAGAGTGTGGGTTAAATGGGTGAATTCAGGGATTCACTCCGTTCGTCACTCAATAA
+
+>G1;YMR206W_BC-1;1.0;DB;up
+CCATACGAGCACATTACGGGGCTTGAGTTATATAGTCGATCCGGGCTAACTCGCATACCTCTGATAAC
+
+>G09;56346_BC-1;24126.0;DB;dn
+TCGATAGGTGCGTGTGAAGGATGTTCCCCCGGTCACCGGGCCAGTCCTCAGTCGCTCAGTCAAG
+```
+4. After making the fasta file, build index with bowtie2-build
+`bowtie2-build filename.fasta filename`
+5. Update main.py to use the summary files you generated
+    * Edit parse_input_files() to add a case
+
+### Running the pipeline ###
+
+* Install from pypi (recommended): `python -m pip install BFG-Y2H`
+
+* Install and build from github, the update.sh might need to be modified before you install
+```
+1. download the package from github
+2. inside the root folder, run ./update.sh
+```
+
+1. Input arguments:
+```
+usage: bfg [-h] [--fastq FASTQ] [--output OUTPUT] --mode MODE [--alignment]
+           [--ref REF] [--cutOff CUTOFF]
+
+BFG-Y2H
+
+optional arguments:
+  -h, --help       show this help message and exit
+  --fastq FASTQ    Path to all fastq files you want to analyze
+  --output OUTPUT  Output path for sam files
+  --mode MODE      pick yeast or human or virus or hedgy or LAgag
+  --alignment      turn on alignment
+  --ref REF        path to all reference files
+  --cutOff CUTOFF  assign cut off
+
+```
+
+2. All the input fastq files should have names following the format: y|hAD*DB*_GFP_(pre|med|high) (for human and yeast)
+
+3. Run the pipeline on GALEN
+```
+# this will run the pipeline using slurm
+# all the fastq files in the given folder will be processed
+# run with alignment
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --alignment --ref path/to/reference
+
+# if alignment was finished, you want to only do read counts
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --ref path/to/reference
+```
+
+### Output files ###
+
+* After running the pipeline, one folder will be generated for each group pair (yAD*DB*)
+
+* The folder called `GALEN_jobs` saves all the bash scripts submitted to GALEN
+
+* In the output folder for each group pair, we aligned R1 and R2 separately to the reference sequences for GFP_pre, GFP_med and GFP_high.
+
+* `*_sorted.sam`: Raw sam files generated from bowtie2
+
+* `*_noh.csv`: shrunken sam files, used for scoring
+
+* `*_counts.csv`: barcode counts for uptags, dntags, and combined (up+dn)
+
+
+
+
+%prep
+%autosetup -n BFG-Y2H-0.1.2
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-BFG-Y2H -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.2-1
+- Package Spec generated
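
Note on the reference naming rule quoted in the `%description` text: sequence names must follow `>*;ORF-BC-ID;*;up/dn`, with the ORF barcode ID as the second item and the up/dn tag as the last item. A minimal sketch of a check for that rule is given here for anyone building a new reference. The function name `parse_barcode_header` is hypothetical and is not part of BFG-Y2H, and the requirement of at least three `;`-separated fields is an assumption drawn from the wording of the rule, not from the pipeline's code.

```python
from typing import Tuple


def parse_barcode_header(header: str) -> Tuple[str, str]:
    """Split a FASTA header such as '>G1;YDL169C_BC-1;7;up'.

    Returns (orf_bc_id, tag), where orf_bc_id is the second
    ';'-separated item and tag is the last item ('up' or 'dn').
    Hypothetical helper: not taken from the BFG-Y2H sources.
    """
    text = header.strip()
    if not text.startswith(">"):
        raise ValueError("not a FASTA header: %r" % header)

    fields = text[1:].split(";")
    # Assumption: the second item and the last item must be distinct
    # fields, so a valid name has at least three items.
    if len(fields) < 3:
        raise ValueError("too few ';'-separated items: %r" % header)

    orf_bc_id, tag = fields[1], fields[-1]
    if tag not in ("up", "dn"):
        raise ValueError("last item must be 'up' or 'dn': %r" % header)
    return orf_bc_id, tag


if __name__ == "__main__":
    # Headers copied from the example FASTA records in the description.
    for name in (
        ">G1;YDL169C_BC-1;7;up",
        ">G1;YMR206W_BC-1;1.0;DB;up",
        ">G09;56346_BC-1;24126.0;DB;dn",
    ):
        print(name, "->", parse_barcode_header(name))
```

Running such a check over every header line of a new fasta file before calling `bowtie2-build` would catch names whose tag is not in the last position. Whether the pipeline itself rejects such names is not stated in the description.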