author | CoprDistGit <infra@openeuler.org> | 2023-06-20 08:12:58 +0000 |
---|---|---|
committer | CoprDistGit <infra@openeuler.org> | 2023-06-20 08:12:58 +0000 |
commit | 183277ccd73990f805cfd78600756d22a3859ea8 (patch) | |
tree | a7da8715a20b900d77cfe52d42819f14a87a0cb2 | |
parent | 138f7eaeeb3a91de83895ef73131d503d9e92268 (diff) |
automatic import of python-ScaleHD (openeuler20.03)
-rw-r--r-- | .gitignore | 1 |
-rw-r--r-- | python-scalehd.spec | 120 |
-rw-r--r-- | sources | 1 |
3 files changed, 122 insertions, 0 deletions
@@ -0,0 +1 @@
+/ScaleHD-1.1.1.tar.gz
diff --git a/python-scalehd.spec b/python-scalehd.spec
new file mode 100644
index 0000000..9b70710
--- /dev/null
+++ b/python-scalehd.spec
@@ -0,0 +1,120 @@
+%global _empty_manifest_terminate_build 0
+Name: python-ScaleHD
+Version: 1.1.1
+Release: 1
+Summary: Automated DNA micro-satellite genotyping.
+License: GPLv3
+URL: https://github.com/helloabunai/ScaleHD
+Source0: https://mirrors.aliyun.com/pypi/web/packages/e7/8a/61a646e73b2bee543cb4040ddd30d10270e2021f2fdfb0966fb5aedb158c/ScaleHD-1.1.1.tar.gz
+BuildArch: noarch
+
+
+%description
+ScaleHD is a package for automating the process of genotyping microsatellite repeats in Huntington Disease data.
+We utilise machine learning approaches to take into account natural data 'artefacts', such as PCR slippage and somatic
+mosaicism, when processing data. This provides the end-user with a simple to use platform which can robustly predict genotypes from input data.
+By default, input is a pair of unaligned .fastq sequence data -- both forward and reverse reads, per sample. We utilise both forward and reverse
+reads in order to reduce the complex dimensionality issue posed by Huntington Disease's multiple repeat tract genetic structure. Reverse reads allow
+us to determine the current sample's CCG state -- this provides us with a mechanism by which to more easily call the entire genotype. Forward reads
+are utilised in a similar approach, to determine the CAG and intervening structure.
+The general overview of the application is as follows:
+1) Input FastQ files are subsampled, if an overwhelming number of reads are present. This can be overruled with the -b flag.
+2) Sequence quality control is carried out per the user's instructions. We reccomend trimming of any 5-prime spacer+primer combinations, for optimal alignment.
+3) Alignment of these files, to a typical HD structure (CAG_1_1_CCG_2) reference, is carried out.
+4) Assemblies are scanned with Digital Signal Processing to detect any possible atypical structures (e.g. CAG_2_1_CCG_3).
+4.1) If no atypical alleles are detected, proceed as normal.
+4.2) If atypical alleles are detected, a custom reference is generated, and re-alignment to this is carried out.
+5) With the appropriate allele information and sequence assembly(ies) present, sampled are genotyped.
+6) Output is written for the current sample; the procedure is repeated for the next sample in the queue (if present).
+Check the full documentation at http://scalehd.rtfd.io
+
+%package -n python3-ScaleHD
+Summary: Automated DNA micro-satellite genotyping.
+Provides: python-ScaleHD
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-ScaleHD
+ScaleHD is a package for automating the process of genotyping microsatellite repeats in Huntington Disease data.
+We utilise machine learning approaches to take into account natural data 'artefacts', such as PCR slippage and somatic
+mosaicism, when processing data. This provides the end-user with a simple to use platform which can robustly predict genotypes from input data.
+By default, input is a pair of unaligned .fastq sequence data -- both forward and reverse reads, per sample. We utilise both forward and reverse
+reads in order to reduce the complex dimensionality issue posed by Huntington Disease's multiple repeat tract genetic structure. Reverse reads allow
+us to determine the current sample's CCG state -- this provides us with a mechanism by which to more easily call the entire genotype. Forward reads
+are utilised in a similar approach, to determine the CAG and intervening structure.
+The general overview of the application is as follows:
+1) Input FastQ files are subsampled, if an overwhelming number of reads are present. This can be overruled with the -b flag.
+2) Sequence quality control is carried out per the user's instructions. We reccomend trimming of any 5-prime spacer+primer combinations, for optimal alignment.
+3) Alignment of these files, to a typical HD structure (CAG_1_1_CCG_2) reference, is carried out.
+4) Assemblies are scanned with Digital Signal Processing to detect any possible atypical structures (e.g. CAG_2_1_CCG_3).
+4.1) If no atypical alleles are detected, proceed as normal.
+4.2) If atypical alleles are detected, a custom reference is generated, and re-alignment to this is carried out.
+5) With the appropriate allele information and sequence assembly(ies) present, sampled are genotyped.
+6) Output is written for the current sample; the procedure is repeated for the next sample in the queue (if present).
+Check the full documentation at http://scalehd.rtfd.io
+
+%package help
+Summary: Development documents and examples for ScaleHD
+Provides: python3-ScaleHD-doc
+%description help
+ScaleHD is a package for automating the process of genotyping microsatellite repeats in Huntington Disease data.
+We utilise machine learning approaches to take into account natural data 'artefacts', such as PCR slippage and somatic
+mosaicism, when processing data. This provides the end-user with a simple to use platform which can robustly predict genotypes from input data.
+By default, input is a pair of unaligned .fastq sequence data -- both forward and reverse reads, per sample. We utilise both forward and reverse
+reads in order to reduce the complex dimensionality issue posed by Huntington Disease's multiple repeat tract genetic structure. Reverse reads allow
+us to determine the current sample's CCG state -- this provides us with a mechanism by which to more easily call the entire genotype. Forward reads
+are utilised in a similar approach, to determine the CAG and intervening structure.
+The general overview of the application is as follows:
+1) Input FastQ files are subsampled, if an overwhelming number of reads are present. This can be overruled with the -b flag.
+2) Sequence quality control is carried out per the user's instructions. We reccomend trimming of any 5-prime spacer+primer combinations, for optimal alignment.
+3) Alignment of these files, to a typical HD structure (CAG_1_1_CCG_2) reference, is carried out.
+4) Assemblies are scanned with Digital Signal Processing to detect any possible atypical structures (e.g. CAG_2_1_CCG_3).
+4.1) If no atypical alleles are detected, proceed as normal.
+4.2) If atypical alleles are detected, a custom reference is generated, and re-alignment to this is carried out.
+5) With the appropriate allele information and sequence assembly(ies) present, sampled are genotyped.
+6) Output is written for the current sample; the procedure is repeated for the next sample in the queue (if present).
+Check the full documentation at http://scalehd.rtfd.io
+
+%prep
+%autosetup -n ScaleHD-1.1.1
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-ScaleHD -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 1.1.1-1
+- Package Spec generated
@@ -0,0 +1 @@
+4ae926a24a81e08eb499b221b86a69b4 ScaleHD-1.1.1.tar.gz
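The `%install` scriptlet in the spec builds `filelist.lst` and `doclist.lst` with `find -printf`, and the `%files -f` sections then read those lists. The sketch below reproduces that step outside `rpmbuild` to show what the lists would contain. It is an illustration only: the temporary directory stands in for `%{buildroot}`, the site-packages path and the `scalehd.1` man page are hypothetical placeholders, and GNU `find` is assumed because `-printf` is a GNU extension.

```shell
#!/bin/sh
# Sketch of the spec's file-list generation, run against a fake install tree.
# Assumptions: GNU find; "$buildroot" is a stand-in for %{buildroot};
# the files created below are hypothetical, not taken from the real package.
set -eu

buildroot=$(mktemp -d)

# Fake the result of %py3_install: one module file and one man page.
mkdir -p "$buildroot/usr/lib/python3/site-packages/ScaleHD"
mkdir -p "$buildroot/usr/share/man/man1"
: > "$buildroot/usr/lib/python3/site-packages/ScaleHD/__init__.py"
: > "$buildroot/usr/share/man/man1/scalehd.1"

cd "$buildroot"

# Same commands as in the spec: %h is the directory part, %f the file name.
# Each path is written as a quoted absolute path, one per line.
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi

# Man pages are listed with a .gz suffix, presumably because rpmbuild
# compresses them after %install finishes.
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi

cat filelist.lst doclist.lst
```

With the placeholder tree above, the script prints `"/usr/lib/python3/site-packages/ScaleHD/__init__.py"` followed by `"/usr/share/man/man1/scalehd.1.gz"`. The quoting keeps paths with spaces intact when `%files -f` parses the lists. The temporary directory is left in place for inspection.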