Diffstat (limited to 'python-data-warehouse-client.spec')
-rw-r--r-- | python-data-warehouse-client.spec | 256 |
1 files changed, 256 insertions, 0 deletions
diff --git a/python-data-warehouse-client.spec b/python-data-warehouse-client.spec
new file mode 100644
index 0000000..50e194a
--- /dev/null
+++ b/python-data-warehouse-client.spec
@@ -0,0 +1,256 @@
+%global _empty_manifest_terminate_build 0
+Name: python-data-warehouse-client
+Version: 3.0.2
+Release: 1
+Summary: This package provides access to the e-Science Central data warehouse that can be used to store, access and analyse data collected in scientific studies, including for healthcare applications
+License: Apache Software License
+URL: https://github.com/e-science-central/data-warehouse-client
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/6f/a5/c5ec93e3a9a098d245cd20088be16e509bafe9be764b0091dda6fe3fbd2b/data-warehouse-client-3.0.2.tar.gz
+BuildArch: noarch
+
+Requires: python3-more-itertools
+Requires: python3-matplotlib
+Requires: python3-psycopg2
+Requires: python3-tabulate
+
+%description
+# Data Warehouse Client
+
+This package provides access to the e-Science Central data warehouse that can be used to store, access and analyse
+data collected in scientific studies, including for healthcare applications. The primary aim of the warehouse
+was to create a general system that enables users to explore data collected in a variety of forms. This might include
+data collected through questionnaires, data collected from sensors,
+and features extracted from the analysis of sensor data (e.g. activity levels derived from raw accelerometer data).
+Researchers might wish to slice, dice, visualise, analyse and explore this data in different ways,
+e.g. all results for one participant,
+all results for one type of measure in a study,
+changes in measurements over time. Others may wish to build models that can then be used in applications
+that make predictions about future values.
+
+Traditionally, data collected in studies has been stored in a collection of files,
+often with metadata encoded in the filenames.
+This makes it difficult, and time consuming, for researchers to explore, interpret and analyse the data.
+The data warehouse exploits modern database technology to vastly simplify this effort.
+In doing this we have drawn heavily on the best practice for data warehouse design.
+However, there is more variety in the types of healthcare data to be stored than there is in a typical warehouse,
+and so we have been forced to deviate from a conventional data warehouse in some aspects of the design.
+There are four guiding principles behind the design:
+1. The data warehouse must be able to store any type of data collected in a study without modifying the schema.
+This means that when new types of data are collected in studies (e.g. from a new questionnaire,
+a new data analysis program, or a new sensor) they can be stored in the warehouse without any changes to its design.
+This has 3 main advantages:
+firstly, it enables us to fix and optimise the schema for the tables in which the data is stored;
+secondly, it means that applications and tools (e.g. for analysis and visualisation)
+built on the warehouse do not have to be updated when new types of data are added;
+thirdly, a single, multi-tenant database server can support many studies.
+This reduces the overall costs, the start-up time for a new study, and the overheads of managing the warehouse.
+2. Descriptive information about the types of measurement is stored in the warehouse so that tools or humans
+can interpret the data stored there.
+3. The design is optimised for query performance. In several cases, this has led to denormalization
+ (duplication of data) to reduce the need for expensive joins.
+4. It must support a security regime to restrict each user’s access
+to the data collected in studies.
+
+
+For more information see:
+P. Watson and H. Hiden, "The e-Science Central Study Data Platform"
+2022 IEEE 18th International Conference on e-Science (e-Science),
+Salt Lake City, UT, USA, 2022, pp. 55-64, doi: 10.1109/eScience55777.2022.00020.
+https://scholar.google.co.uk/citations?view_op=view_citation&hl=en&user=KQJg3lwAAAAJ&sortby=pubdate&citation_for_view=KQJg3lwAAAAJ:z0_F5_TITjQC
+
+For more documentation see [A Data Warehouse for Storing and Analysing Study Data](docs/data_warehouse_guide.pdf).
+
+# Running Instructions
+
+To install from PyPI, run:
+
+pip install data-warehouse-client
+
+In the directory in which your executable is run, create a `db-credentials.json` file containing database
+credentials (substituting all `<VARS>`):
+ ```
+ {"user": "<USER>", "pass": "<PASSWORD>", "IP": "<IP>", "port": <PORT>}
+ ```
+
+
+
+
+
+%package -n python3-data-warehouse-client
+Summary: This package provides access to the e-Science Central data warehouse that can be used to store, access and analyse data collected in scientific studies, including for healthcare applications
+Provides: python-data-warehouse-client
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-data-warehouse-client
+# Data Warehouse Client
+
+This package provides access to the e-Science Central data warehouse that can be used to store, access and analyse
+data collected in scientific studies, including for healthcare applications. The primary aim of the warehouse
+was to create a general system that enables users to explore data collected in a variety of forms. This might include
+data collected through questionnaires, data collected from sensors,
+and features extracted from the analysis of sensor data (e.g. activity levels derived from raw accelerometer data).
+Researchers might wish to slice, dice, visualise, analyse and explore this data in different ways,
+e.g. all results for one participant,
+all results for one type of measure in a study,
+changes in measurements over time. Others may wish to build models that can then be used in applications
+that make predictions about future values.
+
+Traditionally, data collected in studies has been stored in a collection of files,
+often with metadata encoded in the filenames.
+This makes it difficult, and time consuming, for researchers to explore, interpret and analyse the data.
+The data warehouse exploits modern database technology to vastly simplify this effort.
+In doing this we have drawn heavily on the best practice for data warehouse design.
+However, there is more variety in the types of healthcare data to be stored than there is in a typical warehouse,
+and so we have been forced to deviate from a conventional data warehouse in some aspects of the design.
+There are four guiding principles behind the design:
+1. The data warehouse must be able to store any type of data collected in a study without modifying the schema.
+This means that when new types of data are collected in studies (e.g. from a new questionnaire,
+a new data analysis program, or a new sensor) they can be stored in the warehouse without any changes to its design.
+This has 3 main advantages:
+firstly, it enables us to fix and optimise the schema for the tables in which the data is stored;
+secondly, it means that applications and tools (e.g. for analysis and visualisation)
+built on the warehouse do not have to be updated when new types of data are added;
+thirdly, a single, multi-tenant database server can support many studies.
+This reduces the overall costs, the start-up time for a new study, and the overheads of managing the warehouse.
+2. Descriptive information about the types of measurement is stored in the warehouse so that tools or humans
+can interpret the data stored there.
+3. The design is optimised for query performance. In several cases, this has led to denormalization
+ (duplication of data) to reduce the need for expensive joins.
+4. It must support a security regime to restrict each user’s access
+to the data collected in studies.
+
+
+For more information see:
+P. Watson and H. Hiden, "The e-Science Central Study Data Platform"
+2022 IEEE 18th International Conference on e-Science (e-Science),
+Salt Lake City, UT, USA, 2022, pp. 55-64, doi: 10.1109/eScience55777.2022.00020.
+https://scholar.google.co.uk/citations?view_op=view_citation&hl=en&user=KQJg3lwAAAAJ&sortby=pubdate&citation_for_view=KQJg3lwAAAAJ:z0_F5_TITjQC
+
+For more documentation see [A Data Warehouse for Storing and Analysing Study Data](docs/data_warehouse_guide.pdf).
+
+# Running Instructions
+
+To install from PyPI, run:
+
+pip install data-warehouse-client
+
+In the directory in which your executable is run, create a `db-credentials.json` file containing database
+credentials (substituting all `<VARS>`):
+ ```
+ {"user": "<USER>", "pass": "<PASSWORD>", "IP": "<IP>", "port": <PORT>}
+ ```
+
+
+
+
+
+%package help
+Summary: Development documents and examples for data-warehouse-client
+Provides: python3-data-warehouse-client-doc
+%description help
+# Data Warehouse Client
+
+This package provides access to the e-Science Central data warehouse that can be used to store, access and analyse
+data collected in scientific studies, including for healthcare applications. The primary aim of the warehouse
+was to create a general system that enables users to explore data collected in a variety of forms. This might include
+data collected through questionnaires, data collected from sensors,
+and features extracted from the analysis of sensor data (e.g. activity levels derived from raw accelerometer data).
+Researchers might wish to slice, dice, visualise, analyse and explore this data in different ways,
+e.g. all results for one participant,
+all results for one type of measure in a study,
+changes in measurements over time. Others may wish to build models that can then be used in applications
+that make predictions about future values.
+
+Traditionally, data collected in studies has been stored in a collection of files,
+often with metadata encoded in the filenames.
+This makes it difficult, and time consuming, for researchers to explore, interpret and analyse the data.
+The data warehouse exploits modern database technology to vastly simplify this effort.
+In doing this we have drawn heavily on the best practice for data warehouse design.
+However, there is more variety in the types of healthcare data to be stored than there is in a typical warehouse,
+and so we have been forced to deviate from a conventional data warehouse in some aspects of the design.
+There are four guiding principles behind the design:
+1. The data warehouse must be able to store any type of data collected in a study without modifying the schema.
+This means that when new types of data are collected in studies (e.g. from a new questionnaire,
+a new data analysis program, or a new sensor) they can be stored in the warehouse without any changes to its design.
+This has 3 main advantages:
+firstly, it enables us to fix and optimise the schema for the tables in which the data is stored;
+secondly, it means that applications and tools (e.g. for analysis and visualisation)
+built on the warehouse do not have to be updated when new types of data are added;
+thirdly, a single, multi-tenant database server can support many studies.
+This reduces the overall costs, the start-up time for a new study, and the overheads of managing the warehouse.
+2. Descriptive information about the types of measurement is stored in the warehouse so that tools or humans
+can interpret the data stored there.
+3. The design is optimised for query performance. In several cases, this has led to denormalization
+ (duplication of data) to reduce the need for expensive joins.
+4. It must support a security regime to restrict each user’s access
+to the data collected in studies.
+
+
+For more information see:
+P. Watson and H. Hiden, "The e-Science Central Study Data Platform"
+2022 IEEE 18th International Conference on e-Science (e-Science),
+Salt Lake City, UT, USA, 2022, pp. 55-64, doi: 10.1109/eScience55777.2022.00020.
+https://scholar.google.co.uk/citations?view_op=view_citation&hl=en&user=KQJg3lwAAAAJ&sortby=pubdate&citation_for_view=KQJg3lwAAAAJ:z0_F5_TITjQC
+
+For more documentation see [A Data Warehouse for Storing and Analysing Study Data](docs/data_warehouse_guide.pdf).
+
+# Running Instructions
+
+To install from PyPI, run:
+
+pip install data-warehouse-client
+
+In the directory in which your executable is run, create a `db-credentials.json` file containing database
+credentials (substituting all `<VARS>`):
+ ```
+ {"user": "<USER>", "pass": "<PASSWORD>", "IP": "<IP>", "port": <PORT>}
+ ```
+
+
+
+
+
+%prep
+%autosetup -n data-warehouse-client-3.0.2
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-data-warehouse-client -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Fri May 05 2023 Python_Bot <Python_Bot@openeuler.org> - 3.0.2-1
+- Package Spec generated
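
Note on the `db-credentials.json` file mentioned in the package description: the spec only shows the template, with the keys `user`, `pass`, `IP` and `port`, and does not show how the client reads it. The following is a minimal sketch of checking such a file before use. The helper name `load_credentials` and the validation rules are assumptions made for illustration and are not part of data-warehouse-client.

```python
import json
from pathlib import Path

# Keys taken from the JSON template in the package description.
REQUIRED_KEYS = ("user", "pass", "IP", "port")


def load_credentials(path="db-credentials.json"):
    """Hypothetical helper: read the credentials file and check its keys."""
    creds = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in creds]
    if missing:
        raise KeyError(f"missing credential keys: {', '.join(missing)}")
    # The template leaves <PORT> unquoted, so a JSON integer is assumed here.
    port = creds["port"]
    if isinstance(port, bool) or not isinstance(port, int):
        raise TypeError("port must be a JSON integer")
    return creds
```

Running a check like this from the same working directory as the executable would surface a missing key or a quoted port before any database connection is attempted.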