%global _empty_manifest_terminate_build 0
Name:		python-BFG-Y2H
Version:	0.1.2
Release:	1
Summary:	Analysis scripts for BFG-Y2H data
License:	MIT License
URL:		https://github.com/RyogaLi/BFG_Y2H
Source0:	https://mirrors.aliyun.com/pypi/web/packages/5c/7c/d61e8e2a4fa3a4bfb097b044609a1f3980860368e398d8c55fe9ede6bcea/BFG-Y2H-0.1.2.tar.gz
BuildArch:	noarch


%description
### BFG Y2H Analysis Pipeline ###

**Requirements**

* Python 3.7
* Bowtie 2 and Bowtie2 build

### Files required ###

The pipeline requires reference files before running. They can be found on GALEN: 
```
all reference files contain all the barcodes in fasta format
path: /home/rothlab/rli/02_dev/08_bfg_y2h/bfg_data/reference/
```
Before running the pipeline, you need to copy everything in these two folders to your designated directory.


#### Build new reference ###

If you need to build a new reference for your analysis, please follow:
    
1. You can refer to the create_fasta.py script to build the new fasta file 
2. Make sure the name for the sequences follows the format: `>*;ORF-BC-ID;*;up/dn`. In other words, the ORF-ID should always 
   be the second item, and the up/dn identifier should always be the last item. (see examples below)
3. Example sequences in output fasta file:
```
>G1;YDL169C_BC-1;7;up
CCCTTAGAACCGAGAGTGTGGGTTAAATGGGTGAATTCAGGGATTCACTCCGTTCGTCACTCAATAA

>G1;YMR206W_BC-1;1.0;DB;up
CCATACGAGCACATTACGGGGCTTGAGTTATATAGTCGATCCGGGCTAACTCGCATACCTCTGATAAC

>G09;56346_BC-1;24126.0;DB;dn
TCGATAGGTGCGTGTGAAGGATGTTCCCCCGGTCACCGGGCCAGTCCTCAGTCGCTCAGTCAAG
```
4. After making the fasta file, build index with bowtie2-build
`bowtie2-build filename.fasta filename`
5. Update main.py to use the summary files you generated
   * Edit parse_input_files() to add a case

### Running the pipeline  ###

* Install from pypi (recommend): `python -m pip install BFG-Y2H`

* Install and build from github, the update.sh might need to be modified before you install
```
1. download the package from github
2. inside the root folder, run ./update.sh
```

1. Input arguments: 
```
usage: bfg [-h] [--fastq FASTQ] [--output OUTPUT] --mode MODE [--alignment]
           [--ref REF] [--cutOff CUTOFF]

BFG-Y2H

optional arguments:
  -h, --help       show this help message and exit
  --fastq FASTQ    Path to all fastq files you want to analyze
  --output OUTPUT  Output path for sam files
  --mode MODE      pick yeast or human or virus or hedgy or LAgag
  --alignment      turn on alignment
  --ref REF        path to all reference files
  --cutOff CUTOFF  assign cut off

```

2. All the input fastq files should have names following the format: y|hAD*DB*_GFP_(pre|med|high) (for human and yeast) 

3. Run the pipeline on GALEN
```
# this will run the pipeline using slurm         
# all the fastq files in the given folder will be processed
# run with alignment 
bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --alignment --ref path/to/reference

# if alignment was finished, you want to only do read counts
bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --ref path/to/reference
```

### Output files  ###

* After running the pipeline, one folder will be generated for each group pair (yAD*DB*)

* The folder called `GALEN_jobs` saves all the bash scripts submited to GALEN
  
* In the output folder for each group pair, we aligned R1 and R2 separately to the reference sequences for GFP_pre, GFP_med and GFP_high.

* `*_sorted.sam`: Raw sam files generated from bowtie2

* `*_noh.csv`: shrinked sam files, used for scoring

* `*_counts.csv`: barcode counts for uptags, dntags, and combined (up+dn)


%package -n python3-BFG-Y2H
Summary:	Analysis scripts for BFG-Y2H data
Provides:	python-BFG-Y2H
BuildRequires:	python3-devel
BuildRequires:	python3-setuptools
BuildRequires:	python3-pip
%description -n python3-BFG-Y2H
### BFG Y2H Analysis Pipeline ###

**Requirements**

* Python 3.7
* Bowtie 2 and Bowtie2 build

### Files required ###

The pipeline requires reference files before running. They can be found on GALEN: 
```
all reference files contain all the barcodes in fasta format
path: /home/rothlab/rli/02_dev/08_bfg_y2h/bfg_data/reference/
```
Before running the pipeline, you need to copy everything in these two folders to your designated directory.


#### Build new reference ###

If you need to build a new reference for your analysis, please follow:
    
1. You can refer to the create_fasta.py script to build the new fasta file 
2. Make sure the name for the sequences follows the format: `>*;ORF-BC-ID;*;up/dn`. In other words, the ORF-ID should always 
   be the second item, and the up/dn identifier should always be the last item. (see examples below)
3. Example sequences in output fasta file:
```
>G1;YDL169C_BC-1;7;up
CCCTTAGAACCGAGAGTGTGGGTTAAATGGGTGAATTCAGGGATTCACTCCGTTCGTCACTCAATAA

>G1;YMR206W_BC-1;1.0;DB;up
CCATACGAGCACATTACGGGGCTTGAGTTATATAGTCGATCCGGGCTAACTCGCATACCTCTGATAAC

>G09;56346_BC-1;24126.0;DB;dn
TCGATAGGTGCGTGTGAAGGATGTTCCCCCGGTCACCGGGCCAGTCCTCAGTCGCTCAGTCAAG
```
4. After making the fasta file, build index with bowtie2-build
`bowtie2-build filename.fasta filename`
5. Update main.py to use the summary files you generated
   * Edit parse_input_files() to add a case

### Running the pipeline  ###

* Install from pypi (recommend): `python -m pip install BFG-Y2H`

* Install and build from github, the update.sh might need to be modified before you install
```
1. download the package from github
2. inside the root folder, run ./update.sh
```

1. Input arguments: 
```
usage: bfg [-h] [--fastq FASTQ] [--output OUTPUT] --mode MODE [--alignment]
           [--ref REF] [--cutOff CUTOFF]

BFG-Y2H

optional arguments:
  -h, --help       show this help message and exit
  --fastq FASTQ    Path to all fastq files you want to analyze
  --output OUTPUT  Output path for sam files
  --mode MODE      pick yeast or human or virus or hedgy or LAgag
  --alignment      turn on alignment
  --ref REF        path to all reference files
  --cutOff CUTOFF  assign cut off

```

2. All the input fastq files should have names following the format: y|hAD*DB*_GFP_(pre|med|high) (for human and yeast) 

3. Run the pipeline on GALEN
```
# this will run the pipeline using slurm         
# all the fastq files in the given folder will be processed
# run with alignment 
bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --alignment --ref path/to/reference

# if alignment was finished, you want to only do read counts
bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --ref path/to/reference
```

### Output files  ###

* After running the pipeline, one folder will be generated for each group pair (yAD*DB*)

* The folder called `GALEN_jobs` saves all the bash scripts submited to GALEN
  
* In the output folder for each group pair, we aligned R1 and R2 separately to the reference sequences for GFP_pre, GFP_med and GFP_high.

* `*_sorted.sam`: Raw sam files generated from bowtie2

* `*_noh.csv`: shrinked sam files, used for scoring

* `*_counts.csv`: barcode counts for uptags, dntags, and combined (up+dn)


%package help
Summary:	Development documents and examples for BFG-Y2H
Provides:	python3-BFG-Y2H-doc
%description help
### BFG Y2H Analysis Pipeline ###

**Requirements**

* Python 3.7
* Bowtie 2 and Bowtie2 build

### Files required ###

The pipeline requires reference files before running. They can be found on GALEN: 
```
all reference files contain all the barcodes in fasta format
path: /home/rothlab/rli/02_dev/08_bfg_y2h/bfg_data/reference/
```
Before running the pipeline, you need to copy everything in these two folders to your designated directory.


#### Build new reference ###

If you need to build a new reference for your analysis, please follow:
    
1. You can refer to the create_fasta.py script to build the new fasta file 
2. Make sure the name for the sequences follows the format: `>*;ORF-BC-ID;*;up/dn`. In other words, the ORF-ID should always 
   be the second item, and the up/dn identifier should always be the last item. (see examples below)
3. Example sequences in output fasta file:
```
>G1;YDL169C_BC-1;7;up
CCCTTAGAACCGAGAGTGTGGGTTAAATGGGTGAATTCAGGGATTCACTCCGTTCGTCACTCAATAA

>G1;YMR206W_BC-1;1.0;DB;up
CCATACGAGCACATTACGGGGCTTGAGTTATATAGTCGATCCGGGCTAACTCGCATACCTCTGATAAC

>G09;56346_BC-1;24126.0;DB;dn
TCGATAGGTGCGTGTGAAGGATGTTCCCCCGGTCACCGGGCCAGTCCTCAGTCGCTCAGTCAAG
```
4. After making the fasta file, build index with bowtie2-build
`bowtie2-build filename.fasta filename`
5. Update main.py to use the summary files you generated
   * Edit parse_input_files() to add a case

### Running the pipeline  ###

* Install from pypi (recommend): `python -m pip install BFG-Y2H`

* Install and build from github, the update.sh might need to be modified before you install
```
1. download the package from github
2. inside the root folder, run ./update.sh
```

1. Input arguments: 
```
usage: bfg [-h] [--fastq FASTQ] [--output OUTPUT] --mode MODE [--alignment]
           [--ref REF] [--cutOff CUTOFF]

BFG-Y2H

optional arguments:
  -h, --help       show this help message and exit
  --fastq FASTQ    Path to all fastq files you want to analyze
  --output OUTPUT  Output path for sam files
  --mode MODE      pick yeast or human or virus or hedgy or LAgag
  --alignment      turn on alignment
  --ref REF        path to all reference files
  --cutOff CUTOFF  assign cut off

```

2. All the input fastq files should have names following the format: y|hAD*DB*_GFP_(pre|med|high) (for human and yeast) 

3. Run the pipeline on GALEN
```
# this will run the pipeline using slurm         
# all the fastq files in the given folder will be processed
# run with alignment 
bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --alignment --ref path/to/reference

# if alignment was finished, you want to only do read counts
bfg --fastq /path/to/fastq_files/ --output /path/to/output_dir/ --mode yeast/human/virus/hedgy --ref path/to/reference
```

### Output files  ###

* After running the pipeline, one folder will be generated for each group pair (yAD*DB*)

* The folder called `GALEN_jobs` saves all the bash scripts submited to GALEN
  
* In the output folder for each group pair, we aligned R1 and R2 separately to the reference sequences for GFP_pre, GFP_med and GFP_high.

* `*_sorted.sam`: Raw sam files generated from bowtie2

* `*_noh.csv`: shrinked sam files, used for scoring

* `*_counts.csv`: barcode counts for uptags, dntags, and combined (up+dn)


%prep
%autosetup -n BFG-Y2H-0.1.2

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-BFG-Y2H -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.2-1
- Package Spec generated