author CoprDistGit <infra@openeuler.org> 2023-05-29 12:57:58 +0000
committer CoprDistGit <infra@openeuler.org> 2023-05-29 12:57:58 +0000
commit 3676d19ae721ebd77dd7fa02324dfc2d47b50db4
tree 153c9167f95a59b458b21f17055ddeb0ee8c6e2e
parent 90d8b4974554606b6c2fc6adb2908a294f9ffbc2
automatic import of python-epitoolkit
-rw-r--r-- .gitignore 1
-rw-r--r-- python-epitoolkit.spec 433
-rw-r--r-- sources 1
3 files changed, 435 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..d281fd9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/epitoolkit-0.2.6.tar.gz
diff --git a/python-epitoolkit.spec b/python-epitoolkit.spec
new file mode 100644
index 0000000..2900d34
--- /dev/null
+++ b/python-epitoolkit.spec
@@ -0,0 +1,433 @@
+%global _empty_manifest_terminate_build 0
+Name: python-epitoolkit
+Version: 0.2.6
+Release: 1
+Summary: EpiToolkit is a set of tools useful in the analysis of data from EPIC / 450K microarrays.
+License: MIT
+URL: https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/a0/9e/d65d7b009fe2454a0db42cd416c3d199511450af9c58745d01c3c679507b/epitoolkit-0.2.6.tar.gz
+BuildArch: noarch
+
+Requires: python3-pandas
+Requires: python3-numpy
+Requires: python3-seaborn
+Requires: python3-matplotlib
+Requires: python3-plotly
+Requires: python3-scipy
+Requires: python3-tqdm
+Requires: python3-pathlib
+Requires: python3-autopep8
+Requires: python3-Sphinx
+
+%description
+# EpiGenToolKit
+EpiGenToolKit is a small library for working with data from `EPIC / 450K` microarrays. It allows you to:
+
+a) Visualize methylation levels of a specific CpG or genomic region.
+
+b) Perform enrichment analysis of a selected subset of CpGs against the whole array. In this analysis, the expected frequency [%] of genomic regions (based on mynorm) is compared with the observed frequency (based on the provided set of CpGs) using a chi-square test.
+
+# How to start?
+
+a) using env
+
+
+```
+python -m venv env
+source env/bin/activate # Windows: env\Scripts\activate
+pip install epitoolkit
+```
+
+b) using poetry
+
+```
+poetry new .
+poetry add epitoolkit
+```
+
+c) or just clone the repository:
+
+
+```
+git clone https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit.git
+cd EpiGenToolKit && poetry install
+```
+
+# How to use?
+
+## Visualization
+
+To visualize a single **CpG** site or a specific genomic region, initialize a **Visualize** object:
+
+```
+from epitoolkit.tools import Visualize
+
+viz = Visualize(manifest=<path_to_array_manifest>,  # path to manifest file
+                mynorm=<path_to_mynorm_file>,  # path to mynorm file
+                poi=<path_to_poi_file>,  # path to poi file
+                poi_col=<column_name>,  # name of column containing sample phenotype
+                skiprows=0)  # many manifests contain header rows; set skiprows to skip them
+```
+All files must have a `.csv` extension. The mynorm file must contain sample names as `columns` and CpGs as `rows`; the proper
+EPIC manifest may be downloaded from [here](https://emea.support.illumina.com/downloads/infinium-methylationepic-v1-0-product-files.html).
+The poi file must contain sample names as `rows` (only samples present in both poi and mynorm are used)
+and a POI (phenotype of interest) column containing phenotype names, e.g. Control and Case.
+
+To visualize a single CpG:
+```
+viz.plot_CpG("cg07881041",  # cpg ID
+             static=False,  # plot type static / interactive [default]
+             height=400,  # plot size [default]
+             width=700,  # plot size [default]
+             title="",  # plot title [default]
+             legend_title="",  # legend title [default]
+             font_size=22,  # font size [default]
+             show_legend=True,  # False to hide legend [default]
+             x_axis_label="CpG",  # x axis label [default]
+             category_order=["Cohort 1", "Cohort 2"],  # box order [default]
+             y_axis_label="beta-values")  # y axis label [default]
+```
+> NOTE: most of these arguments have default values, so you rarely need to specify them.
+
+![CpGPlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot1.png?raw=true)
+
+
+To visualize a specific genomic region:
+```
+viz.plot_Range(chr=17, start=5999, end=7000)
+```
+
+> NOTE: all arguments available in `viz.plot_CpG` are also available in `viz.plot_Range`.
+
+![CpGPlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot2.png?raw=true)
+
+
+To visualize specific CpGs in genomic order instead of a whole region, pass a collection of CpGs:
+```
+viz.plot_Range(cpgs=["cg04594855", "cg19812938", "cg05451842"])
+```
+
+![CpGPlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot3.png?raw=true)
+
+
+To save a plot, use the *export* argument, for instance:
+```
+viz.plot_Range(chr=17, start=5999, end=6770, export="plot.html")  # with static=False only the html format is supported; with static=True, use a png extension
+```
+
+## Enrichment analysis
+
+To perform enrichment analysis against any type of genomic region specified in the manifest file, initialize an **EnrichmentAnalysis** object:
+```
+from epitoolkit.tools import EnrichmentAnalysis
+
+ea = EnrichmentAnalysis(manifest=<path_to_array_manifest>,
+ mynorm=<path_to_mynorm_file>)
+```
+
+or, if a `Visualize` object already exists, use the `load` method (this avoids loading the data again):
+```
+ea = EnrichmentAnalysis.load(<Visualize_object_name>)
+```
+To start the analysis:
+
+```
+ea.enrichmentAnalysis(categories_to_analyse=["UCSC_RefGene_Group", "Relation_to_UCSC_CpG_Island"], # list of categories to analyse
+ cpgs=cpgs) # list of cpgs to analyse against background
+```
+
+![examplePlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot4.png?raw=true)
+
+
+%package -n python3-epitoolkit
+Summary: EpiToolkit is a set of tools useful in the analysis of data from EPIC / 450K microarrays.
+Provides: python-epitoolkit
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-epitoolkit
+# EpiGenToolKit
+EpiGenToolKit is a small library for working with data from `EPIC / 450K` microarrays. It allows you to:
+
+a) Visualize methylation levels of a specific CpG or genomic region.
+
+b) Perform enrichment analysis of a selected subset of CpGs against the whole array. In this analysis, the expected frequency [%] of genomic regions (based on mynorm) is compared with the observed frequency (based on the provided set of CpGs) using a chi-square test.
+
+# How to start?
+
+a) using env
+
+
+```
+python -m venv env
+source env/bin/activate # Windows: env\Scripts\activate
+pip install epitoolkit
+```
+
+b) using poetry
+
+```
+poetry new .
+poetry add epitoolkit
+```
+
+c) or just clone the repository:
+
+
+```
+git clone https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit.git
+cd EpiGenToolKit && poetry install
+```
+
+# How to use?
+
+## Visualization
+
+To visualize a single **CpG** site or a specific genomic region, initialize a **Visualize** object:
+
+```
+from epitoolkit.tools import Visualize
+
+viz = Visualize(manifest=<path_to_array_manifest>,  # path to manifest file
+                mynorm=<path_to_mynorm_file>,  # path to mynorm file
+                poi=<path_to_poi_file>,  # path to poi file
+                poi_col=<column_name>,  # name of column containing sample phenotype
+                skiprows=0)  # many manifests contain header rows; set skiprows to skip them
+```
+All files must have a `.csv` extension. The mynorm file must contain sample names as `columns` and CpGs as `rows`; the proper
+EPIC manifest may be downloaded from [here](https://emea.support.illumina.com/downloads/infinium-methylationepic-v1-0-product-files.html).
+The poi file must contain sample names as `rows` (only samples present in both poi and mynorm are used)
+and a POI (phenotype of interest) column containing phenotype names, e.g. Control and Case.
+
+To visualize a single CpG:
+```
+viz.plot_CpG("cg07881041",  # cpg ID
+             static=False,  # plot type static / interactive [default]
+             height=400,  # plot size [default]
+             width=700,  # plot size [default]
+             title="",  # plot title [default]
+             legend_title="",  # legend title [default]
+             font_size=22,  # font size [default]
+             show_legend=True,  # False to hide legend [default]
+             x_axis_label="CpG",  # x axis label [default]
+             category_order=["Cohort 1", "Cohort 2"],  # box order [default]
+             y_axis_label="beta-values")  # y axis label [default]
+```
+> NOTE: most of these arguments have default values, so you rarely need to specify them.
+
+![CpGPlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot1.png?raw=true)
+
+
+To visualize a specific genomic region:
+```
+viz.plot_Range(chr=17, start=5999, end=7000)
+```
+
+> NOTE: all arguments available in `viz.plot_CpG` are also available in `viz.plot_Range`.
+
+![CpGPlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot2.png?raw=true)
+
+
+To visualize specific CpGs in genomic order instead of a whole region, pass a collection of CpGs:
+```
+viz.plot_Range(cpgs=["cg04594855", "cg19812938", "cg05451842"])
+```
+
+![CpGPlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot3.png?raw=true)
+
+
+To save a plot, use the *export* argument, for instance:
+```
+viz.plot_Range(chr=17, start=5999, end=6770, export="plot.html")  # with static=False only the html format is supported; with static=True, use a png extension
+```
+
+## Enrichment analysis
+
+To perform enrichment analysis against any type of genomic region specified in the manifest file, initialize an **EnrichmentAnalysis** object:
+```
+from epitoolkit.tools import EnrichmentAnalysis
+
+ea = EnrichmentAnalysis(manifest=<path_to_array_manifest>,
+ mynorm=<path_to_mynorm_file>)
+```
+
+or, if a `Visualize` object already exists, use the `load` method (this avoids loading the data again):
+```
+ea = EnrichmentAnalysis.load(<Visualize_object_name>)
+```
+To start the analysis:
+
+```
+ea.enrichmentAnalysis(categories_to_analyse=["UCSC_RefGene_Group", "Relation_to_UCSC_CpG_Island"], # list of categories to analyse
+ cpgs=cpgs) # list of cpgs to analyse against background
+```
+
+![examplePlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot4.png?raw=true)
+
+
+%package help
+Summary: Development documents and examples for epitoolkit
+Provides: python3-epitoolkit-doc
+%description help
+# EpiGenToolKit
+EpiGenToolKit is a small library for working with data from `EPIC / 450K` microarrays. It allows you to:
+
+a) Visualize methylation levels of a specific CpG or genomic region.
+
+b) Perform enrichment analysis of a selected subset of CpGs against the whole array. In this analysis, the expected frequency [%] of genomic regions (based on mynorm) is compared with the observed frequency (based on the provided set of CpGs) using a chi-square test.
+
+# How to start?
+
+a) using env
+
+
+```
+python -m venv env
+source env/bin/activate # Windows: env\Scripts\activate
+pip install epitoolkit
+```
+
+b) using poetry
+
+```
+poetry new .
+poetry add epitoolkit
+```
+
+c) or just clone the repository:
+
+
+```
+git clone https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit.git
+cd EpiGenToolKit && poetry install
+```
+
+# How to use?
+
+## Visualization
+
+To visualize a single **CpG** site or a specific genomic region, initialize a **Visualize** object:
+
+```
+from epitoolkit.tools import Visualize
+
+viz = Visualize(manifest=<path_to_array_manifest>,  # path to manifest file
+                mynorm=<path_to_mynorm_file>,  # path to mynorm file
+                poi=<path_to_poi_file>,  # path to poi file
+                poi_col=<column_name>,  # name of column containing sample phenotype
+                skiprows=0)  # many manifests contain header rows; set skiprows to skip them
+```
+All files must have a `.csv` extension. The mynorm file must contain sample names as `columns` and CpGs as `rows`; the proper
+EPIC manifest may be downloaded from [here](https://emea.support.illumina.com/downloads/infinium-methylationepic-v1-0-product-files.html).
+The poi file must contain sample names as `rows` (only samples present in both poi and mynorm are used)
+and a POI (phenotype of interest) column containing phenotype names, e.g. Control and Case.
+
+To visualize a single CpG:
+```
+viz.plot_CpG("cg07881041",  # cpg ID
+             static=False,  # plot type static / interactive [default]
+             height=400,  # plot size [default]
+             width=700,  # plot size [default]
+             title="",  # plot title [default]
+             legend_title="",  # legend title [default]
+             font_size=22,  # font size [default]
+             show_legend=True,  # False to hide legend [default]
+             x_axis_label="CpG",  # x axis label [default]
+             category_order=["Cohort 1", "Cohort 2"],  # box order [default]
+             y_axis_label="beta-values")  # y axis label [default]
+```
+> NOTE: most of these arguments have default values, so you rarely need to specify them.
+
+![CpGPlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot1.png?raw=true)
+
+
+To visualize a specific genomic region:
+```
+viz.plot_Range(chr=17, start=5999, end=7000)
+```
+
+> NOTE: all arguments available in `viz.plot_CpG` are also available in `viz.plot_Range`.
+
+![CpGPlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot2.png?raw=true)
+
+
+To visualize specific CpGs in genomic order instead of a whole region, pass a collection of CpGs:
+```
+viz.plot_Range(cpgs=["cg04594855", "cg19812938", "cg05451842"])
+```
+
+![CpGPlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot3.png?raw=true)
+
+
+To save a plot, use the *export* argument, for instance:
+```
+viz.plot_Range(chr=17, start=5999, end=6770, export="plot.html")  # with static=False only the html format is supported; with static=True, use a png extension
+```
+
+## Enrichment analysis
+
+To perform enrichment analysis against any type of genomic region specified in the manifest file, initialize an **EnrichmentAnalysis** object:
+```
+from epitoolkit.tools import EnrichmentAnalysis
+
+ea = EnrichmentAnalysis(manifest=<path_to_array_manifest>,
+ mynorm=<path_to_mynorm_file>)
+```
+
+or, if a `Visualize` object already exists, use the `load` method (this avoids loading the data again):
+```
+ea = EnrichmentAnalysis.load(<Visualize_object_name>)
+```
+To start the analysis:
+
+```
+ea.enrichmentAnalysis(categories_to_analyse=["UCSC_RefGene_Group", "Relation_to_UCSC_CpG_Island"], # list of categories to analyse
+ cpgs=cpgs) # list of cpgs to analyse against background
+```
+
+![examplePlot](https://github.com/ClinicalEpigeneticsLaboratory/EpiGenToolKit/blob/main/Plots/Plot4.png?raw=true)
+
+
+%prep
+%autosetup -n epitoolkit-0.2.6
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-epitoolkit -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Mon May 29 2023 Python_Bot <Python_Bot@openeuler.org> - 0.2.6-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..6844cc0
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+5c09ab9210117771b15cc722d46f9661 epitoolkit-0.2.6.tar.gz
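
Note on the `sources` entry above: it pairs a 32-character hex digest with the tarball name. Below is a minimal sketch of how such an entry could be checked against a downloaded file, assuming the digest is MD5 (inferred from its length) and the layout is `<digest> <filename>`. The helper names `parse_sources_line` and `md5_matches` are hypothetical and are not part of this commit or of any packaging tool.

```python
import hashlib
from pathlib import Path


def parse_sources_line(line: str) -> tuple[str, str]:
    """Split one `<digest> <filename>` entry into (digest, filename).

    Assumes the two fields are separated by whitespace, as in the
    `sources` file added by this commit.
    """
    digest, filename = line.split(maxsplit=1)
    return digest.lower(), filename.strip()


def md5_matches(path: Path, expected: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the MD5 hex digest of the file at `path` equals `expected`."""
    md5 = hashlib.md5()
    with path.open("rb") as fh:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest() == expected.lower()
```

For example, `parse_sources_line` applied to the entry above yields the digest and `epitoolkit-0.2.6.tar.gz`, and `md5_matches(Path(filename), digest)` can then be run on the downloaded tarball.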