-rw-r--r-- .gitignore 1
-rw-r--r-- python-identify.spec 367
-rw-r--r-- sources 1
3 files changed, 369 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..510aa9a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/identify-2.5.19.tar.gz
diff --git a/python-identify.spec b/python-identify.spec
new file mode 100644
index 0000000..babfd1f
--- /dev/null
+++ b/python-identify.spec
@@ -0,0 +1,367 @@
+%global _empty_manifest_terminate_build 0
+Name: python-identify
+Version: 2.5.19
+Release: 1
+Summary: File identification library for Python
+License: MIT
+URL: https://github.com/pre-commit/identify
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/2d/39/743c442eecc9405d98f90a3b4308740b5c09765052e526392c307e6054a3/identify-2.5.19.tar.gz
+BuildArch: noarch
+
+Requires: python3-ukkonen
+
+%description
+File identification library for Python.
+Given a file (or some information about a file), return a set of standardized
+tags identifying what the file is.
+## Installation
+```bash
+pip install identify
+```
+## Usage
+### With a file on disk
+If you have an actual file on disk, you can get the most information possible
+(a superset of all other methods):
+```python
+>>> from identify import identify
+>>> identify.tags_from_path('/path/to/file.py')
+{'file', 'text', 'python', 'non-executable'}
+>>> identify.tags_from_path('/path/to/file-with-shebang')
+{'file', 'text', 'shell', 'bash', 'executable'}
+>>> identify.tags_from_path('/bin/bash')
+{'file', 'binary', 'executable'}
+>>> identify.tags_from_path('/path/to/directory')
+{'directory'}
+>>> identify.tags_from_path('/path/to/symlink')
+{'symlink'}
+```
+When using a file on disk, the checks performed are:
+* File type (file, symlink, directory, socket)
+* Mode (is it executable?)
+* File name (mostly based on extension)
+* If executable, the shebang is read and the interpreter interpreted
+### If you only have the filename
+```python
+>>> identify.tags_from_filename('file.py')
+{'text', 'python'}
+```
+### If you only have the interpreter
+```python
+>>> identify.tags_from_interpreter('python3.5')
+{'python', 'python3'}
+>>> identify.tags_from_interpreter('bash')
+{'shell', 'bash'}
+>>> identify.tags_from_interpreter('some-unrecognized-thing')
+set()
+```
+### As a cli
+```
+$ identify-cli --help
+usage: identify-cli [-h] [--filename-only] path
+positional arguments:
+ path
+optional arguments:
+ -h, --help show this help message and exit
+ --filename-only
+```
+```console
+$ identify-cli setup.py; echo $?
+["file", "non-executable", "python", "text"]
+0
+$ identify-cli setup.py --filename-only; echo $?
+["python", "text"]
+0
+$ identify-cli wat.wat; echo $?
+wat.wat does not exist.
+1
+$ identify-cli wat.wat --filename-only; echo $?
+1
+```
+### Identifying LICENSE files
+`identify` also has an api for determining what type of license is contained
+in a file. This routine is roughly based on the approaches used by
+[licensee] (the ruby gem that github uses to figure out the license for a
+repo).
+The approach that `identify` uses is as follows:
+1. Strip the copyright line
+2. Normalize all whitespace
+3. Return any exact matches
+4. Return the closest by edit distance (where edit distance < 5%)
+To use the api, install via `pip install identify[license]`
+```pycon
+>>> from identify import identify
+>>> identify.license_id('LICENSE')
+'MIT'
+```
+The return value of the `license_id` function is an [SPDX] id. Currently
+licenses are sourced from [choosealicense.com].
+[licensee]: https://github.com/benbalter/licensee
+[SPDX]: https://spdx.org/licenses/
+[choosealicense.com]: https://github.com/github/choosealicense.com
+## How it works
+A call to `tags_from_path` does this:
+1. What is the type: file, symlink, directory? If it's not file, stop here.
+2. Is it executable? Add the appropriate tag.
+3. Do we recognize the file extension? If so, add the appropriate tags, stop
+ here. These tags would include binary/text.
+4. Peek at the first X bytes of the file. Use these to determine whether it is
+ binary or text, add the appropriate tag.
+5. If identified as text above, try to read and interpret the shebang, and add
+ appropriate tags.
+By design, this means we don't need to partially read files where we recognize
+the file extension.
+
+%package -n python3-identify
+Summary: File identification library for Python
+Provides: python-identify
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-identify
+File identification library for Python.
+Given a file (or some information about a file), return a set of standardized
+tags identifying what the file is.
+## Installation
+```bash
+pip install identify
+```
+## Usage
+### With a file on disk
+If you have an actual file on disk, you can get the most information possible
+(a superset of all other methods):
+```python
+>>> from identify import identify
+>>> identify.tags_from_path('/path/to/file.py')
+{'file', 'text', 'python', 'non-executable'}
+>>> identify.tags_from_path('/path/to/file-with-shebang')
+{'file', 'text', 'shell', 'bash', 'executable'}
+>>> identify.tags_from_path('/bin/bash')
+{'file', 'binary', 'executable'}
+>>> identify.tags_from_path('/path/to/directory')
+{'directory'}
+>>> identify.tags_from_path('/path/to/symlink')
+{'symlink'}
+```
+When using a file on disk, the checks performed are:
+* File type (file, symlink, directory, socket)
+* Mode (is it executable?)
+* File name (mostly based on extension)
+* If executable, the shebang is read and the interpreter interpreted
+### If you only have the filename
+```python
+>>> identify.tags_from_filename('file.py')
+{'text', 'python'}
+```
+### If you only have the interpreter
+```python
+>>> identify.tags_from_interpreter('python3.5')
+{'python', 'python3'}
+>>> identify.tags_from_interpreter('bash')
+{'shell', 'bash'}
+>>> identify.tags_from_interpreter('some-unrecognized-thing')
+set()
+```
+### As a cli
+```
+$ identify-cli --help
+usage: identify-cli [-h] [--filename-only] path
+positional arguments:
+ path
+optional arguments:
+ -h, --help show this help message and exit
+ --filename-only
+```
+```console
+$ identify-cli setup.py; echo $?
+["file", "non-executable", "python", "text"]
+0
+$ identify-cli setup.py --filename-only; echo $?
+["python", "text"]
+0
+$ identify-cli wat.wat; echo $?
+wat.wat does not exist.
+1
+$ identify-cli wat.wat --filename-only; echo $?
+1
+```
+### Identifying LICENSE files
+`identify` also has an api for determining what type of license is contained
+in a file. This routine is roughly based on the approaches used by
+[licensee] (the ruby gem that github uses to figure out the license for a
+repo).
+The approach that `identify` uses is as follows:
+1. Strip the copyright line
+2. Normalize all whitespace
+3. Return any exact matches
+4. Return the closest by edit distance (where edit distance < 5%)
+To use the api, install via `pip install identify[license]`
+```pycon
+>>> from identify import identify
+>>> identify.license_id('LICENSE')
+'MIT'
+```
+The return value of the `license_id` function is an [SPDX] id. Currently
+licenses are sourced from [choosealicense.com].
+[licensee]: https://github.com/benbalter/licensee
+[SPDX]: https://spdx.org/licenses/
+[choosealicense.com]: https://github.com/github/choosealicense.com
+## How it works
+A call to `tags_from_path` does this:
+1. What is the type: file, symlink, directory? If it's not file, stop here.
+2. Is it executable? Add the appropriate tag.
+3. Do we recognize the file extension? If so, add the appropriate tags, stop
+ here. These tags would include binary/text.
+4. Peek at the first X bytes of the file. Use these to determine whether it is
+ binary or text, add the appropriate tag.
+5. If identified as text above, try to read and interpret the shebang, and add
+ appropriate tags.
+By design, this means we don't need to partially read files where we recognize
+the file extension.
+
+%package help
+Summary: Development documents and examples for identify
+Provides: python3-identify-doc
+%description help
+File identification library for Python.
+Given a file (or some information about a file), return a set of standardized
+tags identifying what the file is.
+## Installation
+```bash
+pip install identify
+```
+## Usage
+### With a file on disk
+If you have an actual file on disk, you can get the most information possible
+(a superset of all other methods):
+```python
+>>> from identify import identify
+>>> identify.tags_from_path('/path/to/file.py')
+{'file', 'text', 'python', 'non-executable'}
+>>> identify.tags_from_path('/path/to/file-with-shebang')
+{'file', 'text', 'shell', 'bash', 'executable'}
+>>> identify.tags_from_path('/bin/bash')
+{'file', 'binary', 'executable'}
+>>> identify.tags_from_path('/path/to/directory')
+{'directory'}
+>>> identify.tags_from_path('/path/to/symlink')
+{'symlink'}
+```
+When using a file on disk, the checks performed are:
+* File type (file, symlink, directory, socket)
+* Mode (is it executable?)
+* File name (mostly based on extension)
+* If executable, the shebang is read and the interpreter interpreted
+### If you only have the filename
+```python
+>>> identify.tags_from_filename('file.py')
+{'text', 'python'}
+```
+### If you only have the interpreter
+```python
+>>> identify.tags_from_interpreter('python3.5')
+{'python', 'python3'}
+>>> identify.tags_from_interpreter('bash')
+{'shell', 'bash'}
+>>> identify.tags_from_interpreter('some-unrecognized-thing')
+set()
+```
+### As a cli
+```
+$ identify-cli --help
+usage: identify-cli [-h] [--filename-only] path
+positional arguments:
+ path
+optional arguments:
+ -h, --help show this help message and exit
+ --filename-only
+```
+```console
+$ identify-cli setup.py; echo $?
+["file", "non-executable", "python", "text"]
+0
+$ identify-cli setup.py --filename-only; echo $?
+["python", "text"]
+0
+$ identify-cli wat.wat; echo $?
+wat.wat does not exist.
+1
+$ identify-cli wat.wat --filename-only; echo $?
+1
+```
+### Identifying LICENSE files
+`identify` also has an api for determining what type of license is contained
+in a file. This routine is roughly based on the approaches used by
+[licensee] (the ruby gem that github uses to figure out the license for a
+repo).
+The approach that `identify` uses is as follows:
+1. Strip the copyright line
+2. Normalize all whitespace
+3. Return any exact matches
+4. Return the closest by edit distance (where edit distance < 5%)
+To use the api, install via `pip install identify[license]`
+```pycon
+>>> from identify import identify
+>>> identify.license_id('LICENSE')
+'MIT'
+```
+The return value of the `license_id` function is an [SPDX] id. Currently
+licenses are sourced from [choosealicense.com].
+[licensee]: https://github.com/benbalter/licensee
+[SPDX]: https://spdx.org/licenses/
+[choosealicense.com]: https://github.com/github/choosealicense.com
+## How it works
+A call to `tags_from_path` does this:
+1. What is the type: file, symlink, directory? If it's not file, stop here.
+2. Is it executable? Add the appropriate tag.
+3. Do we recognize the file extension? If so, add the appropriate tags, stop
+ here. These tags would include binary/text.
+4. Peek at the first X bytes of the file. Use these to determine whether it is
+ binary or text, add the appropriate tag.
+5. If identified as text above, try to read and interpret the shebang, and add
+ appropriate tags.
+By design, this means we don't need to partially read files where we recognize
+the file extension.
+
+%prep
+%autosetup -n identify-2.5.19
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-identify -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Thu Mar 09 2023 Python_Bot <Python_Bot@openeuler.org> - 2.5.19-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..89cb5f5
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+81360900b0db030efa49e46210620d04 identify-2.5.19.tar.gz