Diffstat (limited to 'python-bfg-y2h.spec')
-rw-r--r-- python-bfg-y2h.spec 360
1 files changed, 360 insertions, 0 deletions
diff --git a/python-bfg-y2h.spec b/python-bfg-y2h.spec
new file mode 100644
index 0000000..048ef29
--- /dev/null
+++ b/python-bfg-y2h.spec
@@ -0,0 +1,360 @@
+%global _empty_manifest_terminate_build 0
+Name: python-BFG-Y2H
+Version: 0.1.2
+Release: 1
+Summary: Analysis scripts for BFG-Y2H data
+License: MIT
+URL: https://github.com/RyogaLi/BFG_Y2H
+Source0: https://mirrors.aliyun.com/pypi/web/packages/5c/7c/d61e8e2a4fa3a4bfb097b044609a1f3980860368e398d8c55fe9ede6bcea/BFG-Y2H-0.1.2.tar.gz
+BuildArch: noarch
+
+
+%description
+### BFG Y2H Analysis Pipeline ###
+
+**Requirements**
+
+* Python 3.7
+* Bowtie 2 and bowtie2-build
+
+### Files required ###
+
+The pipeline requires reference files before running. They can be found on GALEN:
+```
+all reference files contain all the barcodes in fasta format
+path: /home/rothlab/rli/02_dev/08_bfg_y2h/bfg_data/reference/
+```
+Before running the pipeline, you need to copy everything in these two folders to your designated directory.
+
+
+#### Build new reference ####
+
+If you need to build a new reference for your analysis, follow these steps:
+
+1. You can refer to the create_fasta.py script to build the new fasta file
+2. Make sure the name for the sequences follows the format: `>*;ORF-BC-ID;*;up/dn`. In other words, the ORF-BC-ID should always
+ be the second item, and the up/dn identifier should always be the last item. (see examples below)
+3. Example sequences in output fasta file:
+```
+>G1;YDL169C_BC-1;7;up
+CCCTTAGAACCGAGAGTGTGGGTTAAATGGGTGAATTCAGGGATTCACTCCGTTCGTCACTCAATAA
+
+>G1;YMR206W_BC-1;1.0;DB;up
+CCATACGAGCACATTACGGGGCTTGAGTTATATAGTCGATCCGGGCTAACTCGCATACCTCTGATAAC
+
+>G09;56346_BC-1;24126.0;DB;dn
+TCGATAGGTGCGTGTGAAGGATGTTCCCCCGGTCACCGGGCCAGTCCTCAGTCGCTCAGTCAAG
+```
+4. After making the fasta file, build index with bowtie2-build
+`bowtie2-build filename.fasta filename`
+5. Update main.py to use the summary files you generated
+ * Edit parse_input_files() to add a case
+
+### Running the pipeline ###
+
+* Install from PyPI (recommended): `python -m pip install BFG-Y2H`
+
+* Install and build from GitHub (you may need to modify update.sh before installing):
+```
+1. download the package from GitHub
+2. inside the root folder, run ./update.sh
+```
+
+1. Input arguments:
+```
+usage: bfg [-h] [--fastq FASTQ] [--output OUTPUT] --mode MODE [--alignment]
+ [--ref REF] [--cutOff CUTOFF]
+
+BFG-Y2H
+
+optional arguments:
+ -h, --help show this help message and exit
+ --fastq FASTQ Path to all fastq files you want to analyze
+ --output OUTPUT Output path for sam files
+ --mode MODE pick yeast or human or virus or hedgy or LAgag
+ --alignment turn on alignment
+ --ref REF path to all reference files
+ --cutOff CUTOFF assign cut off
+
+```
+
+2. All the input fastq files should have names following the format: y|hAD*DB*_GFP_(pre|med|high) (for human and yeast)
+
+3. Run the pipeline on GALEN
+```
+# this will run the pipeline using slurm
+# all the fastq files in the given folder will be processed
+# run with alignment
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --alignment --ref path/to/reference
+
+# if alignment has already finished and you only want read counts, omit --alignment
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --ref path/to/reference
+```
+
+### Output files ###
+
+* After running the pipeline, one folder will be generated for each group pair (yAD*DB*)
+
+* The folder called `GALEN_jobs` saves all the bash scripts submitted to GALEN
+
+* In the output folder for each group pair, R1 and R2 are aligned separately to the reference sequences for GFP_pre, GFP_med and GFP_high.
+
+* `*_sorted.sam`: Raw sam files generated from bowtie2
+
+* `*_noh.csv`: shrunk sam files, used for scoring
+
+* `*_counts.csv`: barcode counts for uptags, dntags, and combined (up+dn)
+
+
+
+
+%package -n python3-BFG-Y2H
+Summary: Analysis scripts for BFG-Y2H data
+Provides: python-BFG-Y2H
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-BFG-Y2H
+### BFG Y2H Analysis Pipeline ###
+
+**Requirements**
+
+* Python 3.7
+* Bowtie 2 and bowtie2-build
+
+### Files required ###
+
+The pipeline requires reference files before running. They can be found on GALEN:
+```
+all reference files contain all the barcodes in fasta format
+path: /home/rothlab/rli/02_dev/08_bfg_y2h/bfg_data/reference/
+```
+Before running the pipeline, you need to copy everything in these two folders to your designated directory.
+
+
+#### Build new reference ####
+
+If you need to build a new reference for your analysis, follow these steps:
+
+1. You can refer to the create_fasta.py script to build the new fasta file
+2. Make sure the name for the sequences follows the format: `>*;ORF-BC-ID;*;up/dn`. In other words, the ORF-BC-ID should always
+ be the second item, and the up/dn identifier should always be the last item. (see examples below)
+3. Example sequences in output fasta file:
+```
+>G1;YDL169C_BC-1;7;up
+CCCTTAGAACCGAGAGTGTGGGTTAAATGGGTGAATTCAGGGATTCACTCCGTTCGTCACTCAATAA
+
+>G1;YMR206W_BC-1;1.0;DB;up
+CCATACGAGCACATTACGGGGCTTGAGTTATATAGTCGATCCGGGCTAACTCGCATACCTCTGATAAC
+
+>G09;56346_BC-1;24126.0;DB;dn
+TCGATAGGTGCGTGTGAAGGATGTTCCCCCGGTCACCGGGCCAGTCCTCAGTCGCTCAGTCAAG
+```
+4. After making the fasta file, build index with bowtie2-build
+`bowtie2-build filename.fasta filename`
+5. Update main.py to use the summary files you generated
+ * Edit parse_input_files() to add a case
+
+### Running the pipeline ###
+
+* Install from PyPI (recommended): `python -m pip install BFG-Y2H`
+
+* Install and build from GitHub (you may need to modify update.sh before installing):
+```
+1. download the package from GitHub
+2. inside the root folder, run ./update.sh
+```
+
+1. Input arguments:
+```
+usage: bfg [-h] [--fastq FASTQ] [--output OUTPUT] --mode MODE [--alignment]
+ [--ref REF] [--cutOff CUTOFF]
+
+BFG-Y2H
+
+optional arguments:
+ -h, --help show this help message and exit
+ --fastq FASTQ Path to all fastq files you want to analyze
+ --output OUTPUT Output path for sam files
+ --mode MODE pick yeast or human or virus or hedgy or LAgag
+ --alignment turn on alignment
+ --ref REF path to all reference files
+ --cutOff CUTOFF assign cut off
+
+```
+
+2. All the input fastq files should have names following the format: y|hAD*DB*_GFP_(pre|med|high) (for human and yeast)
+
+3. Run the pipeline on GALEN
+```
+# this will run the pipeline using slurm
+# all the fastq files in the given folder will be processed
+# run with alignment
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --alignment --ref path/to/reference
+
+# if alignment has already finished and you only want read counts, omit --alignment
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --ref path/to/reference
+```
+
+### Output files ###
+
+* After running the pipeline, one folder will be generated for each group pair (yAD*DB*)
+
+* The folder called `GALEN_jobs` saves all the bash scripts submitted to GALEN
+
+* In the output folder for each group pair, R1 and R2 are aligned separately to the reference sequences for GFP_pre, GFP_med and GFP_high.
+
+* `*_sorted.sam`: Raw sam files generated from bowtie2
+
+* `*_noh.csv`: shrunk sam files, used for scoring
+
+* `*_counts.csv`: barcode counts for uptags, dntags, and combined (up+dn)
+
+
+
+
+%package help
+Summary: Development documents and examples for BFG-Y2H
+Provides: python3-BFG-Y2H-doc
+%description help
+### BFG Y2H Analysis Pipeline ###
+
+**Requirements**
+
+* Python 3.7
+* Bowtie 2 and bowtie2-build
+
+### Files required ###
+
+The pipeline requires reference files before running. They can be found on GALEN:
+```
+all reference files contain all the barcodes in fasta format
+path: /home/rothlab/rli/02_dev/08_bfg_y2h/bfg_data/reference/
+```
+Before running the pipeline, you need to copy everything in these two folders to your designated directory.
+
+
+#### Build new reference ####
+
+If you need to build a new reference for your analysis, follow these steps:
+
+1. You can refer to the create_fasta.py script to build the new fasta file
+2. Make sure the name for the sequences follows the format: `>*;ORF-BC-ID;*;up/dn`. In other words, the ORF-BC-ID should always
+ be the second item, and the up/dn identifier should always be the last item. (see examples below)
+3. Example sequences in output fasta file:
+```
+>G1;YDL169C_BC-1;7;up
+CCCTTAGAACCGAGAGTGTGGGTTAAATGGGTGAATTCAGGGATTCACTCCGTTCGTCACTCAATAA
+
+>G1;YMR206W_BC-1;1.0;DB;up
+CCATACGAGCACATTACGGGGCTTGAGTTATATAGTCGATCCGGGCTAACTCGCATACCTCTGATAAC
+
+>G09;56346_BC-1;24126.0;DB;dn
+TCGATAGGTGCGTGTGAAGGATGTTCCCCCGGTCACCGGGCCAGTCCTCAGTCGCTCAGTCAAG
+```
+4. After making the fasta file, build index with bowtie2-build
+`bowtie2-build filename.fasta filename`
+5. Update main.py to use the summary files you generated
+ * Edit parse_input_files() to add a case
+
+### Running the pipeline ###
+
+* Install from PyPI (recommended): `python -m pip install BFG-Y2H`
+
+* Install and build from GitHub (you may need to modify update.sh before installing):
+```
+1. download the package from GitHub
+2. inside the root folder, run ./update.sh
+```
+
+1. Input arguments:
+```
+usage: bfg [-h] [--fastq FASTQ] [--output OUTPUT] --mode MODE [--alignment]
+ [--ref REF] [--cutOff CUTOFF]
+
+BFG-Y2H
+
+optional arguments:
+ -h, --help show this help message and exit
+ --fastq FASTQ Path to all fastq files you want to analyze
+ --output OUTPUT Output path for sam files
+ --mode MODE pick yeast or human or virus or hedgy or LAgag
+ --alignment turn on alignment
+ --ref REF path to all reference files
+ --cutOff CUTOFF assign cut off
+
+```
+
+2. All the input fastq files should have names following the format: y|hAD*DB*_GFP_(pre|med|high) (for human and yeast)
+
+3. Run the pipeline on GALEN
+```
+# this will run the pipeline using slurm
+# all the fastq files in the given folder will be processed
+# run with alignment
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --alignment --ref path/to/reference
+
+# if alignment has already finished and you only want read counts, omit --alignment
+bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --ref path/to/reference
+```
+
+### Output files ###
+
+* After running the pipeline, one folder will be generated for each group pair (yAD*DB*)
+
+* The folder called `GALEN_jobs` saves all the bash scripts submitted to GALEN
+
+* In the output folder for each group pair, R1 and R2 are aligned separately to the reference sequences for GFP_pre, GFP_med and GFP_high.
+
+* `*_sorted.sam`: Raw sam files generated from bowtie2
+
+* `*_noh.csv`: shrunk sam files, used for scoring
+
+* `*_counts.csv`: barcode counts for uptags, dntags, and combined (up+dn)
+
+
+
+
+%prep
+%autosetup -n BFG-Y2H-0.1.2
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-BFG-Y2H -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.2-1
+- Package Spec generated
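
The reference-naming rule stated in the package description (`>*;ORF-BC-ID;*;up/dn`, with the ORF-BC-ID as the second item and the up/dn identifier as the last) might be checked with a small Python sketch like the one below. This is only an illustration: `parse_reference_header` is a hypothetical helper, not a function from BFG-Y2H, and it assumes that fields are separated by `;` and that a header has at least three fields.

```python
def parse_reference_header(header):
    """Return (orf_bc_id, tag) from a FASTA header line.

    Hypothetical helper, not part of BFG-Y2H. Assumes the format
    '>*;ORF-BC-ID;*;up/dn' described in the package description:
    the second ';'-separated field is the ORF-BC-ID and the last
    field is either 'up' or 'dn'.
    """
    if not header.startswith(">"):
        raise ValueError("FASTA header must start with '>': %r" % header)

    # Drop the leading '>' and any trailing newline, then split on ';'.
    fields = header[1:].strip().split(";")
    if len(fields) < 3:
        raise ValueError("expected at least 3 fields: %r" % header)

    orf_bc_id = fields[1]   # second item
    tag = fields[-1]        # last item
    if tag not in ("up", "dn"):
        raise ValueError("last field must be 'up' or 'dn': %r" % header)
    return orf_bc_id, tag


if __name__ == "__main__":
    # Header lines copied from the example sequences in the description.
    examples = [
        ">G1;YDL169C_BC-1;7;up",
        ">G1;YMR206W_BC-1;1.0;DB;up",
        ">G09;56346_BC-1;24126.0;DB;dn",
    ]
    for line in examples:
        print(line, "->", parse_reference_header(line))
```

Running a check of this kind over a newly built fasta file before calling `bowtie2-build` could catch misordered fields early; how the pipeline itself parses these names is not shown in this spec.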