Diffstat (limited to 'python-scalehd.spec')
-rw-r--r-- python-scalehd.spec 120
1 files changed, 120 insertions, 0 deletions
diff --git a/python-scalehd.spec b/python-scalehd.spec
new file mode 100644
index 0000000..9b70710
--- /dev/null
+++ b/python-scalehd.spec
@@ -0,0 +1,120 @@
+%global _empty_manifest_terminate_build 0
+Name: python-ScaleHD
+Version: 1.1.1
+Release: 1
+Summary: Automated DNA micro-satellite genotyping
+License: GPLv3
+URL: https://github.com/helloabunai/ScaleHD
+Source0: https://mirrors.aliyun.com/pypi/web/packages/e7/8a/61a646e73b2bee543cb4040ddd30d10270e2021f2fdfb0966fb5aedb158c/ScaleHD-1.1.1.tar.gz
+BuildArch: noarch
+
+
+%description
+ScaleHD is a package for automating the process of genotyping microsatellite repeats in Huntington Disease data.
+We utilise machine learning approaches to take into account natural data 'artefacts', such as PCR slippage and somatic
+mosaicism, when processing data. This provides the end-user with a simple to use platform which can robustly predict genotypes from input data.
+By default, input is a pair of unaligned .fastq sequence data -- both forward and reverse reads, per sample. We utilise both forward and reverse
+reads in order to reduce the complex dimensionality issue posed by Huntington Disease's multiple repeat tract genetic structure. Reverse reads allow
+us to determine the current sample's CCG state -- this provides us with a mechanism by which to more easily call the entire genotype. Forward reads
+are utilised in a similar approach, to determine the CAG and intervening structure.
+The general overview of the application is as follows:
+1) Input FastQ files are subsampled, if an overwhelming number of reads are present. This can be overruled with the -b flag.
+2) Sequence quality control is carried out per the user's instructions. We recommend trimming of any 5-prime spacer+primer combinations, for optimal alignment.
+3) Alignment of these files, to a typical HD structure (CAG_1_1_CCG_2) reference, is carried out.
+4) Assemblies are scanned with Digital Signal Processing to detect any possible atypical structures (e.g. CAG_2_1_CCG_3).
+4.1) If no atypical alleles are detected, proceed as normal.
+4.2) If atypical alleles are detected, a custom reference is generated, and re-alignment to this is carried out.
+5) With the appropriate allele information and sequence assembly(ies) present, samples are genotyped.
+6) Output is written for the current sample; the procedure is repeated for the next sample in the queue (if present).
+Check the full documentation at http://scalehd.rtfd.io
+
+%package -n python3-ScaleHD
+Summary: Automated DNA micro-satellite genotyping
+Provides: python-ScaleHD
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-ScaleHD
+ScaleHD is a package for automating the process of genotyping microsatellite repeats in Huntington Disease data.
+We utilise machine learning approaches to take into account natural data 'artefacts', such as PCR slippage and somatic
+mosaicism, when processing data. This provides the end-user with a simple to use platform which can robustly predict genotypes from input data.
+By default, input is a pair of unaligned .fastq sequence data -- both forward and reverse reads, per sample. We utilise both forward and reverse
+reads in order to reduce the complex dimensionality issue posed by Huntington Disease's multiple repeat tract genetic structure. Reverse reads allow
+us to determine the current sample's CCG state -- this provides us with a mechanism by which to more easily call the entire genotype. Forward reads
+are utilised in a similar approach, to determine the CAG and intervening structure.
+The general overview of the application is as follows:
+1) Input FastQ files are subsampled, if an overwhelming number of reads are present. This can be overruled with the -b flag.
+2) Sequence quality control is carried out per the user's instructions. We recommend trimming of any 5-prime spacer+primer combinations, for optimal alignment.
+3) Alignment of these files, to a typical HD structure (CAG_1_1_CCG_2) reference, is carried out.
+4) Assemblies are scanned with Digital Signal Processing to detect any possible atypical structures (e.g. CAG_2_1_CCG_3).
+4.1) If no atypical alleles are detected, proceed as normal.
+4.2) If atypical alleles are detected, a custom reference is generated, and re-alignment to this is carried out.
+5) With the appropriate allele information and sequence assembly(ies) present, samples are genotyped.
+6) Output is written for the current sample; the procedure is repeated for the next sample in the queue (if present).
+Check the full documentation at http://scalehd.rtfd.io
+
+%package help
+Summary: Development documents and examples for ScaleHD
+Provides: python3-ScaleHD-doc
+%description help
+ScaleHD is a package for automating the process of genotyping microsatellite repeats in Huntington Disease data.
+We utilise machine learning approaches to take into account natural data 'artefacts', such as PCR slippage and somatic
+mosaicism, when processing data. This provides the end-user with a simple to use platform which can robustly predict genotypes from input data.
+By default, input is a pair of unaligned .fastq sequence data -- both forward and reverse reads, per sample. We utilise both forward and reverse
+reads in order to reduce the complex dimensionality issue posed by Huntington Disease's multiple repeat tract genetic structure. Reverse reads allow
+us to determine the current sample's CCG state -- this provides us with a mechanism by which to more easily call the entire genotype. Forward reads
+are utilised in a similar approach, to determine the CAG and intervening structure.
+The general overview of the application is as follows:
+1) Input FastQ files are subsampled, if an overwhelming number of reads are present. This can be overruled with the -b flag.
+2) Sequence quality control is carried out per the user's instructions. We recommend trimming of any 5-prime spacer+primer combinations, for optimal alignment.
+3) Alignment of these files, to a typical HD structure (CAG_1_1_CCG_2) reference, is carried out.
+4) Assemblies are scanned with Digital Signal Processing to detect any possible atypical structures (e.g. CAG_2_1_CCG_3).
+4.1) If no atypical alleles are detected, proceed as normal.
+4.2) If atypical alleles are detected, a custom reference is generated, and re-alignment to this is carried out.
+5) With the appropriate allele information and sequence assembly(ies) present, samples are genotyped.
+6) Output is written for the current sample; the procedure is repeated for the next sample in the queue (if present).
+Check the full documentation at http://scalehd.rtfd.io
+
+%prep
+%autosetup -n ScaleHD-%{version}
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-ScaleHD -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 1.1.1-1
+- Package Spec generated