author | CoprDistGit <infra@openeuler.org> | 2023-05-18 06:23:10 +0000 |
---|---|---|
committer | CoprDistGit <infra@openeuler.org> | 2023-05-18 06:23:10 +0000 |
commit | a2f0bfb4cf733d99f612ac14c456ab82d7b81dd1 (patch) | |
tree | b940e6a5ff8731ee5b1550a7a8b135038afa4256 | |
parent | 6382711e01911318f20ca7a04039ad45cdd2aed2 (diff) |
automatic import of python-bioframe
-rw-r--r-- | .gitignore | 1 |
-rw-r--r-- | python-bioframe.spec | 349 |
-rw-r--r-- | sources | 1 |
3 files changed, 351 insertions, 0 deletions
@@ -0,0 +1 @@
+/bioframe-0.4.1.tar.gz
diff --git a/python-bioframe.spec b/python-bioframe.spec
new file mode 100644
index 0000000..3823419
--- /dev/null
+++ b/python-bioframe.spec
@@ -0,0 +1,349 @@
+%global _empty_manifest_terminate_build 0
+Name: python-bioframe
+Version: 0.4.1
+Release: 1
+Summary: Pandas utilities for tab-delimited and other genomic files
+License: MIT
+URL: https://github.com/open2c/bioframe
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/28/30/aa3177c68a55c5e14e1dfa7a57c959fab8b41c7521e4d9f492a86b904534/bioframe-0.4.1.tar.gz
+BuildArch: noarch
+
+Requires: python3-numpy
+Requires: python3-matplotlib
+Requires: python3-pandas
+Requires: python3-requests
+
+%description
+# Bioframe: Operations on Genomic Interval Dataframes
+
+
+[](https://zenodo.org/badge/latestdoi/69901992)
+[](https://bioframe.readthedocs.io/en/latest/)
+<img src="./docs/figs/bioframe-logo.png" width=75%>
+
+Bioframe is a library to enable flexible and scalable operations on genomic interval dataframes in python. Building bioframe directly on top of [pandas](https://pandas.pydata.org/) enables immediate access to a rich set of dataframe operations. Working in python enables rapid visualization (e.g. matplotlib, seaborn) and iteration of genomic analyses.
+
+The philosophy underlying bioframe is to enable flexible operations: instead of creating a function for every possible use-case, we instead encourage users to compose functions to achieve their goals.
+
+Bioframe implements a variety of genomic interval operations directly on dataframes. Bioframe also includes functions for loading diverse genomic data formats, and performing operations on special classes of genomic intervals, including chromosome arms and fixed size bins.
+
+Read the [docs](https://bioframe.readthedocs.io/en/latest/), including the [guide](https://bioframe.readthedocs.io/en/latest/guide-intervalops.html), as well as the [bioframe preprint](https://doi.org/10.1101/2022.02.16.480748) for more information.
+
+If you use ***bioframe*** in your work, please cite:
+*Bioframe: Operations on Genomic Intervals in Pandas Dataframes*. Open2C, Nezar Abdennur, Geoffrey Fudenberg, Ilya Flyamer, Aleksandra A. Galitsyna, Anton Goloborodko, Maxim Imakaev, Sergey V. Venev.
+bioRxiv 2022.02.16.480748; doi: https://doi.org/10.1101/2022.02.16.480748
+
+
+## Installation
+The following are required before installing bioframe:
+* Python 3.7+
+* `numpy`
+* `pandas>=1.3`
+
+```sh
+pip install bioframe
+```
+
+## Interval operations
+
+Key genomic interval operations in bioframe include:
+- `closest`: For every interval in a dataframe, find the closest intervals in a second dataframe.
+- `cluster`: Group overlapping intervals in a dataframe into clusters.
+- `complement`: Find genomic intervals that are not covered by any interval from a dataframe.
+- `overlap`: Find pairs of overlapping genomic intervals between two dataframes.
+
+Bioframe additionally has functions that are frequently used for genomic interval operations and can be expressed as combinations of these core operations and dataframe operations, including: `coverage`, `expand`, `merge`, `select`, and `subtract`.
+
+To `overlap` two dataframes, call:
+```python
+import bioframe as bf
+
+bf.overlap(df1, df2)
+```
+
+For these two input dataframes, with intervals all on the same chromosome:
+
+<img src="./docs/figs/df1.png" width=60%>
+<img src="./docs/figs/df2.png" width=60%>
+
+`overlap` will return the following interval pairs as overlaps:
+
+<img src="./docs/figs/overlap_inner_0.png" width=60%>
+<img src="./docs/figs/overlap_inner_1.png" width=60%>
+
+
+To `merge` all overlapping intervals in a dataframe, call:
+```python
+import bioframe as bf
+
+bf.merge(df1)
+```
+
+For this input dataframe, with intervals all on the same chromosome:
+
+<img src="./docs/figs/df1.png" width=60%>
+
+`merge` will return a new dataframe with these merged intervals:
+
+<img src="./docs/figs/merge_df1.png" width=60%>
+
+See the [guide](https://bioframe.readthedocs.io/en/latest/guide-intervalops.html) for visualizations of other interval operations in bioframe.
+
+## File I/O
+
+Bioframe includes utilities for reading genomic file formats into dataframes and vice versa. One handy function is `read_table` which mirrors pandas’s read_csv/read_table but provides a [`schema`](https://github.com/open2c/bioframe/blob/main/bioframe/io/schemas.py) argument to populate column names for common tabular file formats.
+
+```python
+jaspar_url = 'http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/2018/hg38/tsv/MA0139.1.tsv.gz'
+ctcf_motif_calls = bioframe.read_table(jaspar_url, schema='jaspar', skiprows=1)
+```
+
+## Tutorials
+See this [jupyter notebook](https://github.com/open2c/bioframe/tree/master/docs/tutorials/tutorial_assign_motifs_to_peaks.ipynb) for an example of how to assign TF motifs to ChIP-seq peaks using bioframe.
+
+## Projects currently using bioframe:
+* [cooler](https://github.com/open2c/cooler)
+* [cooltools](https://github.com/open2c/cooltools)
+* yours? :)
+
+
+%package -n python3-bioframe
+Summary: Pandas utilities for tab-delimited and other genomic files
+Provides: python-bioframe
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-bioframe
+# Bioframe: Operations on Genomic Interval Dataframes
+
+
+[](https://zenodo.org/badge/latestdoi/69901992)
+[](https://bioframe.readthedocs.io/en/latest/)
+<img src="./docs/figs/bioframe-logo.png" width=75%>
+
+Bioframe is a library to enable flexible and scalable operations on genomic interval dataframes in python. Building bioframe directly on top of [pandas](https://pandas.pydata.org/) enables immediate access to a rich set of dataframe operations. Working in python enables rapid visualization (e.g. matplotlib, seaborn) and iteration of genomic analyses.
+
+The philosophy underlying bioframe is to enable flexible operations: instead of creating a function for every possible use-case, we instead encourage users to compose functions to achieve their goals.
+
+Bioframe implements a variety of genomic interval operations directly on dataframes. Bioframe also includes functions for loading diverse genomic data formats, and performing operations on special classes of genomic intervals, including chromosome arms and fixed size bins.
+
+Read the [docs](https://bioframe.readthedocs.io/en/latest/), including the [guide](https://bioframe.readthedocs.io/en/latest/guide-intervalops.html), as well as the [bioframe preprint](https://doi.org/10.1101/2022.02.16.480748) for more information.
+
+If you use ***bioframe*** in your work, please cite:
+*Bioframe: Operations on Genomic Intervals in Pandas Dataframes*. Open2C, Nezar Abdennur, Geoffrey Fudenberg, Ilya Flyamer, Aleksandra A. Galitsyna, Anton Goloborodko, Maxim Imakaev, Sergey V. Venev.
+bioRxiv 2022.02.16.480748; doi: https://doi.org/10.1101/2022.02.16.480748
+
+
+## Installation
+The following are required before installing bioframe:
+* Python 3.7+
+* `numpy`
+* `pandas>=1.3`
+
+```sh
+pip install bioframe
+```
+
+## Interval operations
+
+Key genomic interval operations in bioframe include:
+- `closest`: For every interval in a dataframe, find the closest intervals in a second dataframe.
+- `cluster`: Group overlapping intervals in a dataframe into clusters.
+- `complement`: Find genomic intervals that are not covered by any interval from a dataframe.
+- `overlap`: Find pairs of overlapping genomic intervals between two dataframes.
+
+Bioframe additionally has functions that are frequently used for genomic interval operations and can be expressed as combinations of these core operations and dataframe operations, including: `coverage`, `expand`, `merge`, `select`, and `subtract`.
+
+To `overlap` two dataframes, call:
+```python
+import bioframe as bf
+
+bf.overlap(df1, df2)
+```
+
+For these two input dataframes, with intervals all on the same chromosome:
+
+<img src="./docs/figs/df1.png" width=60%>
+<img src="./docs/figs/df2.png" width=60%>
+
+`overlap` will return the following interval pairs as overlaps:
+
+<img src="./docs/figs/overlap_inner_0.png" width=60%>
+<img src="./docs/figs/overlap_inner_1.png" width=60%>
+
+
+To `merge` all overlapping intervals in a dataframe, call:
+```python
+import bioframe as bf
+
+bf.merge(df1)
+```
+
+For this input dataframe, with intervals all on the same chromosome:
+
+<img src="./docs/figs/df1.png" width=60%>
+
+`merge` will return a new dataframe with these merged intervals:
+
+<img src="./docs/figs/merge_df1.png" width=60%>
+
+See the [guide](https://bioframe.readthedocs.io/en/latest/guide-intervalops.html) for visualizations of other interval operations in bioframe.
+
+## File I/O
+
+Bioframe includes utilities for reading genomic file formats into dataframes and vice versa. One handy function is `read_table` which mirrors pandas’s read_csv/read_table but provides a [`schema`](https://github.com/open2c/bioframe/blob/main/bioframe/io/schemas.py) argument to populate column names for common tabular file formats.
+
+```python
+jaspar_url = 'http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/2018/hg38/tsv/MA0139.1.tsv.gz'
+ctcf_motif_calls = bioframe.read_table(jaspar_url, schema='jaspar', skiprows=1)
+```
+
+## Tutorials
+See this [jupyter notebook](https://github.com/open2c/bioframe/tree/master/docs/tutorials/tutorial_assign_motifs_to_peaks.ipynb) for an example of how to assign TF motifs to ChIP-seq peaks using bioframe.
+
+## Projects currently using bioframe:
+* [cooler](https://github.com/open2c/cooler)
+* [cooltools](https://github.com/open2c/cooltools)
+* yours? :)
+
+
+%package help
+Summary: Development documents and examples for bioframe
+Provides: python3-bioframe-doc
+%description help
+# Bioframe: Operations on Genomic Interval Dataframes
+
+
+[](https://zenodo.org/badge/latestdoi/69901992)
+[](https://bioframe.readthedocs.io/en/latest/)
+<img src="./docs/figs/bioframe-logo.png" width=75%>
+
+Bioframe is a library to enable flexible and scalable operations on genomic interval dataframes in python. Building bioframe directly on top of [pandas](https://pandas.pydata.org/) enables immediate access to a rich set of dataframe operations. Working in python enables rapid visualization (e.g. matplotlib, seaborn) and iteration of genomic analyses.
+
+The philosophy underlying bioframe is to enable flexible operations: instead of creating a function for every possible use-case, we instead encourage users to compose functions to achieve their goals.
+
+Bioframe implements a variety of genomic interval operations directly on dataframes. Bioframe also includes functions for loading diverse genomic data formats, and performing operations on special classes of genomic intervals, including chromosome arms and fixed size bins.
+
+Read the [docs](https://bioframe.readthedocs.io/en/latest/), including the [guide](https://bioframe.readthedocs.io/en/latest/guide-intervalops.html), as well as the [bioframe preprint](https://doi.org/10.1101/2022.02.16.480748) for more information.
+
+If you use ***bioframe*** in your work, please cite:
+*Bioframe: Operations on Genomic Intervals in Pandas Dataframes*. Open2C, Nezar Abdennur, Geoffrey Fudenberg, Ilya Flyamer, Aleksandra A. Galitsyna, Anton Goloborodko, Maxim Imakaev, Sergey V. Venev.
+bioRxiv 2022.02.16.480748; doi: https://doi.org/10.1101/2022.02.16.480748
+
+
+## Installation
+The following are required before installing bioframe:
+* Python 3.7+
+* `numpy`
+* `pandas>=1.3`
+
+```sh
+pip install bioframe
+```
+
+## Interval operations
+
+Key genomic interval operations in bioframe include:
+- `closest`: For every interval in a dataframe, find the closest intervals in a second dataframe.
+- `cluster`: Group overlapping intervals in a dataframe into clusters.
+- `complement`: Find genomic intervals that are not covered by any interval from a dataframe.
+- `overlap`: Find pairs of overlapping genomic intervals between two dataframes.
+
+Bioframe additionally has functions that are frequently used for genomic interval operations and can be expressed as combinations of these core operations and dataframe operations, including: `coverage`, `expand`, `merge`, `select`, and `subtract`.
+
+To `overlap` two dataframes, call:
+```python
+import bioframe as bf
+
+bf.overlap(df1, df2)
+```
+
+For these two input dataframes, with intervals all on the same chromosome:
+
+<img src="./docs/figs/df1.png" width=60%>
+<img src="./docs/figs/df2.png" width=60%>
+
+`overlap` will return the following interval pairs as overlaps:
+
+<img src="./docs/figs/overlap_inner_0.png" width=60%>
+<img src="./docs/figs/overlap_inner_1.png" width=60%>
+
+
+To `merge` all overlapping intervals in a dataframe, call:
+```python
+import bioframe as bf
+
+bf.merge(df1)
+```
+
+For this input dataframe, with intervals all on the same chromosome:
+
+<img src="./docs/figs/df1.png" width=60%>
+
+`merge` will return a new dataframe with these merged intervals:
+
+<img src="./docs/figs/merge_df1.png" width=60%>
+
+See the [guide](https://bioframe.readthedocs.io/en/latest/guide-intervalops.html) for visualizations of other interval operations in bioframe.
+
+## File I/O
+
+Bioframe includes utilities for reading genomic file formats into dataframes and vice versa. One handy function is `read_table` which mirrors pandas’s read_csv/read_table but provides a [`schema`](https://github.com/open2c/bioframe/blob/main/bioframe/io/schemas.py) argument to populate column names for common tabular file formats.
+
+```python
+jaspar_url = 'http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/2018/hg38/tsv/MA0139.1.tsv.gz'
+ctcf_motif_calls = bioframe.read_table(jaspar_url, schema='jaspar', skiprows=1)
+```
+
+## Tutorials
+See this [jupyter notebook](https://github.com/open2c/bioframe/tree/master/docs/tutorials/tutorial_assign_motifs_to_peaks.ipynb) for an example of how to assign TF motifs to ChIP-seq peaks using bioframe.
+
+## Projects currently using bioframe:
+* [cooler](https://github.com/open2c/cooler)
+* [cooltools](https://github.com/open2c/cooltools)
+* yours? :)
+
+
+%prep
+%autosetup -n bioframe-0.4.1
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-bioframe -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Thu May 18 2023 Python_Bot <Python_Bot@openeuler.org> - 0.4.1-1
+- Package Spec generated
@@ -0,0 +1 @@
+bba0b7db03199b21896639afa1a00073 bioframe-0.4.1.tar.gz
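
Reviewer note: the package description in the spec above describes `merge` and `overlap` only through images that are not available here. The following is a minimal pure-Python sketch of the semantics those two operations are described as having. It is not bioframe's implementation and does not call bioframe. The tuple layout `(chrom, start, end)`, the half-open interval convention, and the function names `merge_intervals` and `overlap_pairs` are assumptions made for illustration. In particular, this sketch merges only strictly overlapping intervals; how bioframe treats adjacent intervals is not stated in the description, so check its documentation.

```python
from typing import List, Tuple

# Hypothetical representation: (chrom, start, end), half-open [start, end).
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge strictly overlapping intervals on the same chromosome."""
    merged: List[Interval] = []
    # Sorting groups intervals by chromosome, then by start coordinate.
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            # Overlaps the previous merged interval: extend its end if needed.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def overlap_pairs(
    left: List[Interval], right: List[Interval]
) -> List[Tuple[Interval, Interval]]:
    """Return every (left, right) pair that overlaps on the same chromosome."""
    return [
        (a, b)
        for a in left
        for b in right
        # Two half-open intervals overlap when each starts before the other ends.
        if a[0] == b[0] and a[1] < b[2] and b[1] < a[2]
    ]


if __name__ == "__main__":
    # Made-up coordinates, not the data shown in the README figures.
    df1_like = [("chr1", 1, 5), ("chr1", 3, 8), ("chr1", 8, 10), ("chr1", 12, 14)]
    df2_like = [("chr1", 4, 8), ("chr1", 10, 11)]
    print(merge_intervals(df1_like))
    print(overlap_pairs(df1_like, df2_like))
```

The nested loop in `overlap_pairs` is quadratic and is only meant to make the overlap condition explicit; a dataframe-based library would be expected to use a faster strategy.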