author    CoprDistGit <infra@openeuler.org>    2023-05-10 07:45:17 +0000
committer CoprDistGit <infra@openeuler.org>    2023-05-10 07:45:17 +0000
commit    a4db565020a76f8dad25891372b382f4dd67892f (patch)
tree      8a0565e82980e72e2d4d0bb27536147568fce6be /python-targetran.spec
parent    9f92d34cb58c80f5b0e19837b07f020aa18192e0 (diff)
automatic import of python-targetran
Diffstat (limited to 'python-targetran.spec')
-rw-r--r--    python-targetran.spec    1166
1 file changed, 1166 insertions(+), 0 deletions(-)
diff --git a/python-targetran.spec b/python-targetran.spec
new file mode 100644
index 0000000..203c13f
--- /dev/null
+++ b/python-targetran.spec
@@ -0,0 +1,1166 @@
+%global _empty_manifest_terminate_build 0
+Name: python-targetran
+Version: 0.12.0
+Release: 1
+Summary:	Target transformation for data augmentation in object detection
+License: MIT License
+URL: https://github.com/bhky/targetran
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/62/8c/14caab4eb936e8a1901a801ca95cd42c8858e37cf925ac2e65411bfe56c4/targetran-0.12.0.tar.gz
+BuildArch: noarch
+
+Requires: python3-opencv-python
+Requires: python3-numpy
+
+%description
+![logo](logo/targetran_logo.png)
+
+[![ci](https://github.com/bhky/targetran/actions/workflows/ci.yml/badge.svg)](https://github.com/bhky/targetran/actions)
+[![License MIT 1.0](https://img.shields.io/badge/license-MIT%201.0-blue.svg)](LICENSE)
+
+# Motivation
+
+[Data augmentation](https://en.wikipedia.org/wiki/Data_augmentation)
+is a technique commonly used for training machine learning models in the
+computer vision field, where one can increase the amount of image data by
+creating transformed copies of the original images.
+
+In the object detection sub-field, the transformations must also be applied
+to the target rectangular bounding-boxes. However, such functionality is not
+readily available in frameworks such as TensorFlow and PyTorch.
+
+While other powerful augmentation tools are available, many of them
+do not work well with the
+[TPU](https://cloud.google.com/tpu)
+when accessed from [Google Colab](https://colab.research.google.com/) or
+[Kaggle Notebooks](https://www.kaggle.com/code),
+which are popular options for people who do not have their
+own hardware resources.
+
+Here comes Targetran to fill the gap.
+
+# What is Targetran?
+
+- A lightweight data augmentation library to assist object detection or
+  image classification model training.
+- Provides a simple Python API to transform both the images and the target
+  rectangular bounding-boxes.
+- Uses a dataset-idiomatic approach for both TensorFlow and PyTorch.
+- Can be used with the TPU for acceleration (TensorFlow Dataset only).
+
+![example](docs/example.png)
+
+(Figure produced by the example code [here](examples/local/run_tf_dataset_local_example.py).)
+
+# Table of contents
+
+- [Installation](#installation)
+- [Usage](#usage)
+ - [Notations](#notations)
+ - [Data format](#data-format)
+ - [Design principles](#design-principles)
+ - [TensorFlow Dataset](#tensorflow-dataset)
+ - [PyTorch Dataset](#pytorch-dataset)
+ - [Image classification](#image-classification)
+ - [Examples](#examples)
+- [API](#api)
+
+# Installation
+
+Tested with Python 3.8, 3.9, and 3.10.
+
+The best way to install Targetran with its dependencies is from PyPI:
+```shell
+python3 -m pip install --upgrade targetran
+```
+Alternatively, to obtain the latest version from this repository:
+```shell
+git clone https://github.com/bhky/targetran.git
+cd targetran
+python3 -m pip install .
+```
+
+# Usage
+
+## Notations
+
+- `NDFloatArray`: NumPy float array type, an alias for `np.typing.NDArray[np.float_]`.
+ The values are converted to `np.float32` internally.
+- `tf.Tensor`: General TensorFlow Tensor type. The values are converted to `tf.float32` internally.
+
+## Data format
+
+For object detection model training, which is the primary usage here, the following data are needed.
+- `image_seq` (Sequence of `NDFloatArray` or `tf.Tensor` of shape `(height, width, num_channels)`):
+ - images in channel-last format;
+ - image sizes can be different.
+- `bboxes_seq` (Sequence of `NDFloatArray` or `tf.Tensor` of shape `(num_bboxes_per_image, 4)`):
+ - each `bboxes` array/tensor provides the bounding-boxes associated with an image;
+ - each single bounding-box is given as `[top_left_x, top_left_y, bbox_width, bbox_height]`;
+ - empty array/tensor means no bounding-boxes (and labels) for that image.
+- `labels_seq` (Sequence of `NDFloatArray` or `tf.Tensor` of shape `(num_bboxes_per_image,)`):
+ - each `labels` array/tensor provides the bounding-box labels associated with an image;
+ - empty array/tensor means no labels (and bounding-boxes) for that image.
+
+Some dummy data are created below for illustration. Please note the required format.
+```python
+import numpy as np
+
+# Each image could have different sizes, but they must follow the channel-last format,
+# i.e., (height, width, num_channels).
+image_seq = [np.random.rand(480, 512, 3) for _ in range(3)]
+
+# The bounding-boxes (bboxes) are given as a sequence of NumPy arrays (or TF tensors).
+# Each array represents the bboxes for one corresponding image.
+#
+# Each bbox is given as [top_left_x, top_left_y, bbox_width, bbox_height].
+#
+# In case an image has no bboxes, an empty array should be provided.
+bboxes_seq = [
+ np.array([ # Image with 2 bboxes.
+ [214, 223, 10, 11],
+ [345, 230, 21, 9],
+ ]),
+ np.array([]), # Empty array for image with no bboxes.
+ np.array([ # Image with 3 bboxes.
+ [104, 151, 22, 10],
+ [99, 132, 20, 15],
+ [340, 220, 31, 12],
+ ]),
+]
+
+# Labels for the bboxes are also given as a sequence of NumPy arrays (or TF tensors).
+# The number of bboxes and labels should match. An empty array indicates no bboxes/labels.
+labels_seq = [
+ np.array([0, 1]), # 2 labels.
+ np.array([]), # No labels.
+ np.array([2, 3, 0]), # 3 labels.
+]
+
+# During operation, all the data values will be converted to float32.
+```
+
+## Design principles
+
+- Bounding-boxes will always be rectangular with sides parallel to the image frame.
+- After transformation, each resulting bounding-box is determined by the smallest
+ rectangle (with sides parallel to the image frame) enclosing the original transformed bounding-box.
+- After transformation, resulting bounding-boxes with their centroids outside the
+ image frame will be removed, together with the corresponding labels.
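+
+To make the last two rules concrete, below is a minimal NumPy sketch
+(an illustration only, not Targetran's internal implementation; the helper
+names are hypothetical):
+```python
+import numpy as np
+
+def enclosing_bbox(corners):
+    """Smallest axis-aligned rectangle enclosing the four transformed corner
+    points, as [top_left_x, top_left_y, bbox_width, bbox_height].
+    `corners` has shape (4, 2), one (x, y) row per corner."""
+    x_min, y_min = corners.min(axis=0)
+    x_max, y_max = corners.max(axis=0)
+    return np.array([x_min, y_min, x_max - x_min, y_max - y_min])
+
+def centroid_inside(bbox, image_width, image_height):
+    """True if the bbox centroid lies inside the image frame; boxes failing
+    this test are removed together with their labels."""
+    centroid_x = bbox[0] + bbox[2] / 2.0
+    centroid_y = bbox[1] + bbox[3] / 2.0
+    return 0.0 <= centroid_x < image_width and 0.0 <= centroid_y < image_height
+```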
+
+## TensorFlow Dataset
+
+```python
+import tensorflow as tf
+
+from targetran.tf import (
+ seqs_to_tf_dataset,
+ TFCombineAffine,
+ TFRandomFlipLeftRight,
+ TFRandomFlipUpDown,
+ TFRandomRotate,
+ TFRandomShear,
+ TFRandomTranslate,
+ TFRandomCrop,
+ TFResize,
+)
+
+# Convert the above data sequences into a TensorFlow Dataset.
+# Users can have their own way to create the Dataset, as long as for each iteration
+# it returns a tuple of tensors for a single sample: (image, bboxes, labels).
+ds = seqs_to_tf_dataset(image_seq, bboxes_seq, labels_seq)
+
+# The affine transformations can be combined into one operation for better performance.
+# Note that cropping and resizing are not affine and cannot be combined.
+# Option (1):
+affine_transform = TFCombineAffine(
+ [TFRandomRotate(probability=0.8), # Probability to include each affine transformation step
+ TFRandomShear(probability=0.6), # can be specified, otherwise the default value is used.
+ TFRandomTranslate(), # Thus, the number of selected steps could vary.
+ TFRandomFlipLeftRight(),
+ TFRandomFlipUpDown()],
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Option (2):
+# Alternatively, one can decide the exact number of randomly selected transformations,
+# e.g., use only any two of them. This could be a better option because too many
+# transformation steps may deform the images too much.
+affine_transform = TFCombineAffine(
+ [TFRandomRotate(), # Individual `probability` has no effect in this approach.
+ TFRandomShear(),
+ TFRandomTranslate(),
+ TFRandomFlipLeftRight(),
+ TFRandomFlipUpDown()],
+ num_selected_transforms=2, # Only two steps from the list will be selected.
+ selected_probabilities=[0.5, 0.0, 0.3, 0.2, 0.0], # Must sum up to 1.0, if given.
+ keep_order=True, # If True, the selected steps must be performed in the given order.
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Please refer to the API manual for more parameter options.
+
+# Apply transformations.
+auto_tune = tf.data.AUTOTUNE
+ds = ds \
+ .map(TFRandomCrop(probability=0.5), num_parallel_calls=auto_tune) \
+ .map(affine_transform, num_parallel_calls=auto_tune) \
+ .map(TFResize((256, 256)), num_parallel_calls=auto_tune)
+
+# In the Dataset `map` call, the parameter `num_parallel_calls` can be set to,
+# e.g., tf.data.AUTOTUNE, for better performance. See docs for TensorFlow Dataset.
+```
+```python
+# Batching:
+# Since the array/tensor shape of each sample could be different, the
+# conventional way of batching may not work. Users will have to consider
+# their own use cases. One possibly useful approach is a padded batch.
+ds = ds.padded_batch(batch_size=2, padding_values=-1.0)
+```
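+With the padding value known (`-1.0` above), the padded rows can be masked out
+again downstream. A minimal sketch of one possible approach (this is not part
+of the Targetran API):
+```python
+# Keep only bbox rows that contain no padding values (padded rows are all -1.0).
+for images, padded_bboxes, padded_labels in ds:
+    for bboxes in padded_bboxes:
+        mask = tf.reduce_all(bboxes != -1.0, axis=-1)
+        real_bboxes = tf.boolean_mask(bboxes, mask)
+```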
+
+## PyTorch Dataset
+
+```python
+from typing import Optional, Sequence, Tuple
+
+import numpy.typing
+from torch.utils.data import Dataset
+
+from targetran.np import (
+ CombineAffine,
+ RandomFlipLeftRight,
+ RandomFlipUpDown,
+ RandomRotate,
+ RandomShear,
+ RandomTranslate,
+ RandomCrop,
+ Resize,
+)
+from targetran.utils import Compose
+
+NDFloatArray = numpy.typing.NDArray[numpy.float_]
+
+
+class PTDataset(Dataset):
+ """
+ A very simple PyTorch Dataset.
+ As per common practice, transforms are done on NumPy arrays.
+ """
+
+ def __init__(
+ self,
+ image_seq: Sequence[NDFloatArray],
+ bboxes_seq: Sequence[NDFloatArray],
+ labels_seq: Sequence[NDFloatArray],
+ transforms: Optional[Compose]
+ ) -> None:
+ self.image_seq = image_seq
+ self.bboxes_seq = bboxes_seq
+ self.labels_seq = labels_seq
+ self.transforms = transforms
+
+ def __len__(self) -> int:
+ return len(self.image_seq)
+
+ def __getitem__(
+ self,
+ idx: int
+ ) -> Tuple[NDFloatArray, NDFloatArray, NDFloatArray]:
+ if self.transforms:
+ return self.transforms(
+ self.image_seq[idx],
+ self.bboxes_seq[idx],
+ self.labels_seq[idx]
+ )
+ return (
+ self.image_seq[idx],
+ self.bboxes_seq[idx],
+ self.labels_seq[idx]
+ )
+
+
+# The affine transformations can be combined into one operation for better performance.
+# Note that cropping and resizing are not affine and cannot be combined.
+# Option (1):
+affine_transform = CombineAffine(
+ [RandomRotate(probability=0.8), # Probability to include each affine transformation step
+ RandomShear(probability=0.6), # can be specified, otherwise the default value is used.
+ RandomTranslate(), # Thus, the number of selected steps could vary.
+ RandomFlipLeftRight(),
+ RandomFlipUpDown()],
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Option (2):
+# Alternatively, one can decide the exact number of randomly selected transformations,
+# e.g., use only any two of them. This could be a better option because too many
+# transformation steps may deform the images too much.
+affine_transform = CombineAffine(
+ [RandomRotate(), # Individual `probability` has no effect in this approach.
+ RandomShear(),
+ RandomTranslate(),
+ RandomFlipLeftRight(),
+ RandomFlipUpDown()],
+ num_selected_transforms=2, # Only two steps from the list will be selected.
+ selected_probabilities=[0.5, 0.0, 0.3, 0.2, 0.0], # Must sum up to 1.0, if given.
+ keep_order=True, # If True, the selected steps must be performed in the given order.
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Please refer to the API manual for more parameter options.
+
+# The `Compose` here is similar to that from the torchvision package, except
+# that here it also supports callables with multiple inputs and outputs needed
+# for object detection tasks, i.e., (image, bboxes, labels).
+transforms = Compose([
+ RandomCrop(probability=0.5),
+ affine_transform,
+ Resize((256, 256)),
+])
+
+# Convert the above data sequences into a PyTorch Dataset.
+# Users can have their own way to create the Dataset, as long as for each iteration
+# it returns a tuple of arrays for a single sample: (image, bboxes, labels).
+ds = PTDataset(image_seq, bboxes_seq, labels_seq, transforms=transforms)
+```
+```python
+# Batching:
+# In PyTorch, it is common to use a Dataset with a DataLoader, which provides
+# batching functionality. However, since the array/tensor shape of each sample
+# could be different, the default batching may not work. Targetran provides
+# a `collate_fn` that helps produce batches of (image_seq, bboxes_seq, labels_seq).
+from torch.utils.data import DataLoader
+from targetran.utils import collate_fn
+
+data_loader = DataLoader(ds, batch_size=2, collate_fn=collate_fn)
+```
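+Iterating the loader then yields one tuple per batch. A short sketch of
+consuming it (assuming the batch structure described in the comment above):
+```python
+# Each *_batch holds one entry per sample; entries may have different shapes,
+# e.g., a different number of bboxes per image (sketch only).
+for image_batch, bboxes_batch, labels_batch in data_loader:
+    assert len(image_batch) == len(bboxes_batch) == len(labels_batch)
+```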
+
+## Image classification
+
+While the tools here are primarily designed for object detection tasks, they can
+also be used for image classification, where only the images are transformed,
+e.g., given a dataset that returns `(image, label)` samples, or even only `image` samples.
+The `image_only` function can be used to wrap a transformation class for this purpose.
+
+If the dataset returns a tuple `(image, ...)` in each iteration, only the `image`
+will be transformed; any elements that follow, such as `(..., label, weight)`,
+will be returned untouched.
+
+If the dataset returns `image` only (not a tuple), then only the transformed `image` will be returned.
+```python
+from targetran.utils import image_only
+```
+```python
+# TensorFlow.
+ds = ds \
+ .map(image_only(TFRandomCrop())) \
+ .map(image_only(affine_transform)) \
+ .map(image_only(TFResize((256, 256)))) \
+ .batch(32) # Conventional batching can be used for classification setup.
+```
+```python
+# PyTorch.
+transforms = Compose([
+ image_only(RandomCrop()),
+ image_only(affine_transform),
+ image_only(Resize((256, 256))),
+])
+ds = PTDataset(..., transforms=transforms)
+data_loader = DataLoader(ds, batch_size=32)
+```
+
+## Examples
+
+- [Code examples in this repository](examples)
+- [Construct a TensorFlow Dataset with Targetran
+ and object detection data](https://www.kaggle.com/boscoyung/targetran-example-with-tensorflow-dataset)
+ (Kaggle Notebook)
+- [Image classification with TensorFlow and Targetran on TPU](https://www.kaggle.com/boscoyung/targetran-tpu-for-image-classification-example)
+ (Kaggle Notebook)
+
+# API
+
+See [here](docs/API.md) for API details.
+
+
+%package -n python3-targetran
+Summary:	Target transformation for data augmentation in object detection
+Provides: python-targetran
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-targetran
+![logo](logo/targetran_logo.png)
+
+[![ci](https://github.com/bhky/targetran/actions/workflows/ci.yml/badge.svg)](https://github.com/bhky/targetran/actions)
+[![License MIT 1.0](https://img.shields.io/badge/license-MIT%201.0-blue.svg)](LICENSE)
+
+# Motivation
+
+[Data augmentation](https://en.wikipedia.org/wiki/Data_augmentation)
+is a technique commonly used for training machine learning models in the
+computer vision field, where one can increase the amount of image data by
+creating transformed copies of the original images.
+
+In the object detection sub-field, the transformations must also be applied
+to the target rectangular bounding-boxes. However, such functionality is not
+readily available in frameworks such as TensorFlow and PyTorch.
+
+While other powerful augmentation tools are available, many of them
+do not work well with the
+[TPU](https://cloud.google.com/tpu)
+when accessed from [Google Colab](https://colab.research.google.com/) or
+[Kaggle Notebooks](https://www.kaggle.com/code),
+which are popular options for people who do not have their
+own hardware resources.
+
+Here comes Targetran to fill the gap.
+
+# What is Targetran?
+
+- A lightweight data augmentation library to assist object detection or
+  image classification model training.
+- Provides a simple Python API to transform both the images and the target
+  rectangular bounding-boxes.
+- Uses a dataset-idiomatic approach for both TensorFlow and PyTorch.
+- Can be used with the TPU for acceleration (TensorFlow Dataset only).
+
+![example](docs/example.png)
+
+(Figure produced by the example code [here](examples/local/run_tf_dataset_local_example.py).)
+
+# Table of contents
+
+- [Installation](#installation)
+- [Usage](#usage)
+ - [Notations](#notations)
+ - [Data format](#data-format)
+ - [Design principles](#design-principles)
+ - [TensorFlow Dataset](#tensorflow-dataset)
+ - [PyTorch Dataset](#pytorch-dataset)
+ - [Image classification](#image-classification)
+ - [Examples](#examples)
+- [API](#api)
+
+# Installation
+
+Tested with Python 3.8, 3.9, and 3.10.
+
+The best way to install Targetran with its dependencies is from PyPI:
+```shell
+python3 -m pip install --upgrade targetran
+```
+Alternatively, to obtain the latest version from this repository:
+```shell
+git clone https://github.com/bhky/targetran.git
+cd targetran
+python3 -m pip install .
+```
+
+# Usage
+
+## Notations
+
+- `NDFloatArray`: NumPy float array type, an alias for `np.typing.NDArray[np.float_]`.
+ The values are converted to `np.float32` internally.
+- `tf.Tensor`: General TensorFlow Tensor type. The values are converted to `tf.float32` internally.
+
+## Data format
+
+For object detection model training, which is the primary usage here, the following data are needed.
+- `image_seq` (Sequence of `NDFloatArray` or `tf.Tensor` of shape `(height, width, num_channels)`):
+ - images in channel-last format;
+ - image sizes can be different.
+- `bboxes_seq` (Sequence of `NDFloatArray` or `tf.Tensor` of shape `(num_bboxes_per_image, 4)`):
+ - each `bboxes` array/tensor provides the bounding-boxes associated with an image;
+ - each single bounding-box is given as `[top_left_x, top_left_y, bbox_width, bbox_height]`;
+ - empty array/tensor means no bounding-boxes (and labels) for that image.
+- `labels_seq` (Sequence of `NDFloatArray` or `tf.Tensor` of shape `(num_bboxes_per_image,)`):
+ - each `labels` array/tensor provides the bounding-box labels associated with an image;
+ - empty array/tensor means no labels (and bounding-boxes) for that image.
+
+Some dummy data are created below for illustration. Please note the required format.
+```python
+import numpy as np
+
+# Each image could have different sizes, but they must follow the channel-last format,
+# i.e., (height, width, num_channels).
+image_seq = [np.random.rand(480, 512, 3) for _ in range(3)]
+
+# The bounding-boxes (bboxes) are given as a sequence of NumPy arrays (or TF tensors).
+# Each array represents the bboxes for one corresponding image.
+#
+# Each bbox is given as [top_left_x, top_left_y, bbox_width, bbox_height].
+#
+# In case an image has no bboxes, an empty array should be provided.
+bboxes_seq = [
+ np.array([ # Image with 2 bboxes.
+ [214, 223, 10, 11],
+ [345, 230, 21, 9],
+ ]),
+ np.array([]), # Empty array for image with no bboxes.
+ np.array([ # Image with 3 bboxes.
+ [104, 151, 22, 10],
+ [99, 132, 20, 15],
+ [340, 220, 31, 12],
+ ]),
+]
+
+# Labels for the bboxes are also given as a sequence of NumPy arrays (or TF tensors).
+# The number of bboxes and labels should match. An empty array indicates no bboxes/labels.
+labels_seq = [
+ np.array([0, 1]), # 2 labels.
+ np.array([]), # No labels.
+ np.array([2, 3, 0]), # 3 labels.
+]
+
+# During operation, all the data values will be converted to float32.
+```
+
+## Design principles
+
+- Bounding-boxes will always be rectangular with sides parallel to the image frame.
+- After transformation, each resulting bounding-box is determined by the smallest
+ rectangle (with sides parallel to the image frame) enclosing the original transformed bounding-box.
+- After transformation, resulting bounding-boxes with their centroids outside the
+ image frame will be removed, together with the corresponding labels.
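+
+To make the last two rules concrete, below is a minimal NumPy sketch
+(an illustration only, not Targetran's internal implementation; the helper
+names are hypothetical):
+```python
+import numpy as np
+
+def enclosing_bbox(corners):
+    """Smallest axis-aligned rectangle enclosing the four transformed corner
+    points, as [top_left_x, top_left_y, bbox_width, bbox_height].
+    `corners` has shape (4, 2), one (x, y) row per corner."""
+    x_min, y_min = corners.min(axis=0)
+    x_max, y_max = corners.max(axis=0)
+    return np.array([x_min, y_min, x_max - x_min, y_max - y_min])
+
+def centroid_inside(bbox, image_width, image_height):
+    """True if the bbox centroid lies inside the image frame; boxes failing
+    this test are removed together with their labels."""
+    centroid_x = bbox[0] + bbox[2] / 2.0
+    centroid_y = bbox[1] + bbox[3] / 2.0
+    return 0.0 <= centroid_x < image_width and 0.0 <= centroid_y < image_height
+```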
+
+## TensorFlow Dataset
+
+```python
+import tensorflow as tf
+
+from targetran.tf import (
+ seqs_to_tf_dataset,
+ TFCombineAffine,
+ TFRandomFlipLeftRight,
+ TFRandomFlipUpDown,
+ TFRandomRotate,
+ TFRandomShear,
+ TFRandomTranslate,
+ TFRandomCrop,
+ TFResize,
+)
+
+# Convert the above data sequences into a TensorFlow Dataset.
+# Users can have their own way to create the Dataset, as long as for each iteration
+# it returns a tuple of tensors for a single sample: (image, bboxes, labels).
+ds = seqs_to_tf_dataset(image_seq, bboxes_seq, labels_seq)
+
+# The affine transformations can be combined into one operation for better performance.
+# Note that cropping and resizing are not affine and cannot be combined.
+# Option (1):
+affine_transform = TFCombineAffine(
+ [TFRandomRotate(probability=0.8), # Probability to include each affine transformation step
+ TFRandomShear(probability=0.6), # can be specified, otherwise the default value is used.
+ TFRandomTranslate(), # Thus, the number of selected steps could vary.
+ TFRandomFlipLeftRight(),
+ TFRandomFlipUpDown()],
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Option (2):
+# Alternatively, one can decide the exact number of randomly selected transformations,
+# e.g., use only any two of them. This could be a better option because too many
+# transformation steps may deform the images too much.
+affine_transform = TFCombineAffine(
+ [TFRandomRotate(), # Individual `probability` has no effect in this approach.
+ TFRandomShear(),
+ TFRandomTranslate(),
+ TFRandomFlipLeftRight(),
+ TFRandomFlipUpDown()],
+ num_selected_transforms=2, # Only two steps from the list will be selected.
+ selected_probabilities=[0.5, 0.0, 0.3, 0.2, 0.0], # Must sum up to 1.0, if given.
+ keep_order=True, # If True, the selected steps must be performed in the given order.
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Please refer to the API manual for more parameter options.
+
+# Apply transformations.
+auto_tune = tf.data.AUTOTUNE
+ds = ds \
+ .map(TFRandomCrop(probability=0.5), num_parallel_calls=auto_tune) \
+ .map(affine_transform, num_parallel_calls=auto_tune) \
+ .map(TFResize((256, 256)), num_parallel_calls=auto_tune)
+
+# In the Dataset `map` call, the parameter `num_parallel_calls` can be set to,
+# e.g., tf.data.AUTOTUNE, for better performance. See docs for TensorFlow Dataset.
+```
+```python
+# Batching:
+# Since the array/tensor shape of each sample could be different, the
+# conventional way of batching may not work. Users will have to consider
+# their own use cases. One possibly useful approach is a padded batch.
+ds = ds.padded_batch(batch_size=2, padding_values=-1.0)
+```
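+With the padding value known (`-1.0` above), the padded rows can be masked out
+again downstream. A minimal sketch of one possible approach (this is not part
+of the Targetran API):
+```python
+# Keep only bbox rows that contain no padding values (padded rows are all -1.0).
+for images, padded_bboxes, padded_labels in ds:
+    for bboxes in padded_bboxes:
+        mask = tf.reduce_all(bboxes != -1.0, axis=-1)
+        real_bboxes = tf.boolean_mask(bboxes, mask)
+```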
+
+## PyTorch Dataset
+
+```python
+from typing import Optional, Sequence, Tuple
+
+import numpy.typing
+from torch.utils.data import Dataset
+
+from targetran.np import (
+ CombineAffine,
+ RandomFlipLeftRight,
+ RandomFlipUpDown,
+ RandomRotate,
+ RandomShear,
+ RandomTranslate,
+ RandomCrop,
+ Resize,
+)
+from targetran.utils import Compose
+
+NDFloatArray = numpy.typing.NDArray[numpy.float_]
+
+
+class PTDataset(Dataset):
+ """
+ A very simple PyTorch Dataset.
+ As per common practice, transforms are done on NumPy arrays.
+ """
+
+ def __init__(
+ self,
+ image_seq: Sequence[NDFloatArray],
+ bboxes_seq: Sequence[NDFloatArray],
+ labels_seq: Sequence[NDFloatArray],
+ transforms: Optional[Compose]
+ ) -> None:
+ self.image_seq = image_seq
+ self.bboxes_seq = bboxes_seq
+ self.labels_seq = labels_seq
+ self.transforms = transforms
+
+ def __len__(self) -> int:
+ return len(self.image_seq)
+
+ def __getitem__(
+ self,
+ idx: int
+ ) -> Tuple[NDFloatArray, NDFloatArray, NDFloatArray]:
+ if self.transforms:
+ return self.transforms(
+ self.image_seq[idx],
+ self.bboxes_seq[idx],
+ self.labels_seq[idx]
+ )
+ return (
+ self.image_seq[idx],
+ self.bboxes_seq[idx],
+ self.labels_seq[idx]
+ )
+
+
+# The affine transformations can be combined into one operation for better performance.
+# Note that cropping and resizing are not affine and cannot be combined.
+# Option (1):
+affine_transform = CombineAffine(
+ [RandomRotate(probability=0.8), # Probability to include each affine transformation step
+ RandomShear(probability=0.6), # can be specified, otherwise the default value is used.
+ RandomTranslate(), # Thus, the number of selected steps could vary.
+ RandomFlipLeftRight(),
+ RandomFlipUpDown()],
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Option (2):
+# Alternatively, one can decide the exact number of randomly selected transformations,
+# e.g., use only any two of them. This could be a better option because too many
+# transformation steps may deform the images too much.
+affine_transform = CombineAffine(
+ [RandomRotate(), # Individual `probability` has no effect in this approach.
+ RandomShear(),
+ RandomTranslate(),
+ RandomFlipLeftRight(),
+ RandomFlipUpDown()],
+ num_selected_transforms=2, # Only two steps from the list will be selected.
+ selected_probabilities=[0.5, 0.0, 0.3, 0.2, 0.0], # Must sum up to 1.0, if given.
+ keep_order=True, # If True, the selected steps must be performed in the given order.
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Please refer to the API manual for more parameter options.
+
+# The `Compose` here is similar to that from the torchvision package, except
+# that here it also supports callables with multiple inputs and outputs needed
+# for object detection tasks, i.e., (image, bboxes, labels).
+transforms = Compose([
+ RandomCrop(probability=0.5),
+ affine_transform,
+ Resize((256, 256)),
+])
+
+# Convert the above data sequences into a PyTorch Dataset.
+# Users can have their own way to create the Dataset, as long as for each iteration
+# it returns a tuple of arrays for a single sample: (image, bboxes, labels).
+ds = PTDataset(image_seq, bboxes_seq, labels_seq, transforms=transforms)
+```
+```python
+# Batching:
+# In PyTorch, it is common to use a Dataset with a DataLoader, which provides
+# batching functionality. However, since the array/tensor shape of each sample
+# could be different, the default batching may not work. Targetran provides
+# a `collate_fn` that helps produce batches of (image_seq, bboxes_seq, labels_seq).
+from torch.utils.data import DataLoader
+from targetran.utils import collate_fn
+
+data_loader = DataLoader(ds, batch_size=2, collate_fn=collate_fn)
+```
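+Iterating the loader then yields one tuple per batch. A short sketch of
+consuming it (assuming the batch structure described in the comment above):
+```python
+# Each *_batch holds one entry per sample; entries may have different shapes,
+# e.g., a different number of bboxes per image (sketch only).
+for image_batch, bboxes_batch, labels_batch in data_loader:
+    assert len(image_batch) == len(bboxes_batch) == len(labels_batch)
+```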
+
+## Image classification
+
+While the tools here are primarily designed for object detection tasks, they can
+also be used for image classification, where only the images are transformed,
+e.g., given a dataset that returns `(image, label)` samples, or even only `image` samples.
+The `image_only` function can be used to wrap a transformation class for this purpose.
+
+If the dataset returns a tuple `(image, ...)` in each iteration, only the `image`
+will be transformed; any elements that follow, such as `(..., label, weight)`,
+will be returned untouched.
+
+If the dataset returns `image` only (not a tuple), then only the transformed `image` will be returned.
+```python
+from targetran.utils import image_only
+```
+```python
+# TensorFlow.
+ds = ds \
+ .map(image_only(TFRandomCrop())) \
+ .map(image_only(affine_transform)) \
+ .map(image_only(TFResize((256, 256)))) \
+ .batch(32) # Conventional batching can be used for classification setup.
+```
+```python
+# PyTorch.
+transforms = Compose([
+ image_only(RandomCrop()),
+ image_only(affine_transform),
+ image_only(Resize((256, 256))),
+])
+ds = PTDataset(..., transforms=transforms)
+data_loader = DataLoader(ds, batch_size=32)
+```
+
+## Examples
+
+- [Code examples in this repository](examples)
+- [Construct a TensorFlow Dataset with Targetran
+ and object detection data](https://www.kaggle.com/boscoyung/targetran-example-with-tensorflow-dataset)
+ (Kaggle Notebook)
+- [Image classification with TensorFlow and Targetran on TPU](https://www.kaggle.com/boscoyung/targetran-tpu-for-image-classification-example)
+ (Kaggle Notebook)
+
+# API
+
+See [here](docs/API.md) for API details.
+
+
+%package help
+Summary: Development documents and examples for targetran
+Provides: python3-targetran-doc
+%description help
+![logo](logo/targetran_logo.png)
+
+[![ci](https://github.com/bhky/targetran/actions/workflows/ci.yml/badge.svg)](https://github.com/bhky/targetran/actions)
+[![License MIT 1.0](https://img.shields.io/badge/license-MIT%201.0-blue.svg)](LICENSE)
+
+# Motivation
+
+[Data augmentation](https://en.wikipedia.org/wiki/Data_augmentation)
+is a technique commonly used for training machine learning models in the
+computer vision field, where one can increase the amount of image data by
+creating transformed copies of the original images.
+
+In the object detection sub-field, the transformations must also be applied
+to the target rectangular bounding-boxes. However, such functionality is not
+readily available in frameworks such as TensorFlow and PyTorch.
+
+While other powerful augmentation tools are available, many of them
+do not work well with the
+[TPU](https://cloud.google.com/tpu)
+when accessed from [Google Colab](https://colab.research.google.com/) or
+[Kaggle Notebooks](https://www.kaggle.com/code),
+which are popular options for people who do not have their
+own hardware resources.
+
+Here comes Targetran to fill the gap.
+
+# What is Targetran?
+
+- A lightweight data augmentation library to assist object detection or
+  image classification model training.
+- Provides a simple Python API to transform both the images and the target
+  rectangular bounding-boxes.
+- Uses a dataset-idiomatic approach for both TensorFlow and PyTorch.
+- Can be used with the TPU for acceleration (TensorFlow Dataset only).
+
+![example](docs/example.png)
+
+(Figure produced by the example code [here](examples/local/run_tf_dataset_local_example.py).)
+
+# Table of contents
+
+- [Installation](#installation)
+- [Usage](#usage)
+ - [Notations](#notations)
+ - [Data format](#data-format)
+ - [Design principles](#design-principles)
+ - [TensorFlow Dataset](#tensorflow-dataset)
+ - [PyTorch Dataset](#pytorch-dataset)
+ - [Image classification](#image-classification)
+ - [Examples](#examples)
+- [API](#api)
+
+# Installation
+
+Tested with Python 3.8, 3.9, and 3.10.
+
+The best way to install Targetran with its dependencies is from PyPI:
+```shell
+python3 -m pip install --upgrade targetran
+```
+Alternatively, to obtain the latest version from this repository:
+```shell
+git clone https://github.com/bhky/targetran.git
+cd targetran
+python3 -m pip install .
+```
+
+# Usage
+
+## Notations
+
+- `NDFloatArray`: NumPy float array type, an alias for `np.typing.NDArray[np.float_]`.
+ The values are converted to `np.float32` internally.
+- `tf.Tensor`: General TensorFlow Tensor type. The values are converted to `tf.float32` internally.
+
+## Data format
+
+For object detection model training, which is the primary usage here, the following data are needed.
+- `image_seq` (Sequence of `NDFloatArray` or `tf.Tensor` of shape `(height, width, num_channels)`):
+ - images in channel-last format;
+ - image sizes can be different.
+- `bboxes_seq` (Sequence of `NDFloatArray` or `tf.Tensor` of shape `(num_bboxes_per_image, 4)`):
+ - each `bboxes` array/tensor provides the bounding-boxes associated with an image;
+ - each single bounding-box is given as `[top_left_x, top_left_y, bbox_width, bbox_height]`;
+ - empty array/tensor means no bounding-boxes (and labels) for that image.
+- `labels_seq` (Sequence of `NDFloatArray` or `tf.Tensor` of shape `(num_bboxes_per_image,)`):
+ - each `labels` array/tensor provides the bounding-box labels associated with an image;
+ - empty array/tensor means no labels (and bounding-boxes) for that image.
+
+Some dummy data are created below for illustration. Please note the required format.
+```python
+import numpy as np
+
+# Each image could have different sizes, but they must follow the channel-last format,
+# i.e., (height, width, num_channels).
+image_seq = [np.random.rand(480, 512, 3) for _ in range(3)]
+
+# The bounding-boxes (bboxes) are given as a sequence of NumPy arrays (or TF tensors).
+# Each array represents the bboxes for one corresponding image.
+#
+# Each bbox is given as [top_left_x, top_left_y, bbox_width, bbox_height].
+#
+# In case an image has no bboxes, an empty array should be provided.
+bboxes_seq = [
+ np.array([ # Image with 2 bboxes.
+ [214, 223, 10, 11],
+ [345, 230, 21, 9],
+ ]),
+ np.array([]), # Empty array for image with no bboxes.
+ np.array([ # Image with 3 bboxes.
+ [104, 151, 22, 10],
+ [99, 132, 20, 15],
+ [340, 220, 31, 12],
+ ]),
+]
+
+# Labels for the bboxes are also given as a sequence of NumPy arrays (or TF tensors).
+# The number of bboxes and labels should match. An empty array indicates no bboxes/labels.
+labels_seq = [
+ np.array([0, 1]), # 2 labels.
+ np.array([]), # No labels.
+ np.array([2, 3, 0]), # 3 labels.
+]
+
+# During operation, all the data values will be converted to float32.
+```
+
+## Design principles
+
+- Bounding-boxes will always be rectangular with sides parallel to the image frame.
+- After transformation, each resulting bounding-box is determined by the smallest
+ rectangle (with sides parallel to the image frame) enclosing the original transformed bounding-box.
+- After transformation, resulting bounding-boxes with their centroids outside the
+ image frame will be removed, together with the corresponding labels.
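+
+To make the last two rules concrete, below is a minimal NumPy sketch
+(an illustration only, not Targetran's internal implementation; the helper
+names are hypothetical):
+```python
+import numpy as np
+
+def enclosing_bbox(corners):
+    """Smallest axis-aligned rectangle enclosing the four transformed corner
+    points, as [top_left_x, top_left_y, bbox_width, bbox_height].
+    `corners` has shape (4, 2), one (x, y) row per corner."""
+    x_min, y_min = corners.min(axis=0)
+    x_max, y_max = corners.max(axis=0)
+    return np.array([x_min, y_min, x_max - x_min, y_max - y_min])
+
+def centroid_inside(bbox, image_width, image_height):
+    """True if the bbox centroid lies inside the image frame; boxes failing
+    this test are removed together with their labels."""
+    centroid_x = bbox[0] + bbox[2] / 2.0
+    centroid_y = bbox[1] + bbox[3] / 2.0
+    return 0.0 <= centroid_x < image_width and 0.0 <= centroid_y < image_height
+```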
+
+## TensorFlow Dataset
+
+```python
+import tensorflow as tf
+
+from targetran.tf import (
+ seqs_to_tf_dataset,
+ TFCombineAffine,
+ TFRandomFlipLeftRight,
+ TFRandomFlipUpDown,
+ TFRandomRotate,
+ TFRandomShear,
+ TFRandomTranslate,
+ TFRandomCrop,
+ TFResize,
+)
+
+# Convert the above data sequences into a TensorFlow Dataset.
+# Users can have their own way to create the Dataset, as long as for each iteration
+# it returns a tuple of tensors for a single sample: (image, bboxes, labels).
+ds = seqs_to_tf_dataset(image_seq, bboxes_seq, labels_seq)
+
+# The affine transformations can be combined into one operation for better performance.
+# Note that cropping and resizing are not affine and cannot be combined.
+# Option (1):
+affine_transform = TFCombineAffine(
+ [TFRandomRotate(probability=0.8), # Probability to include each affine transformation step
+ TFRandomShear(probability=0.6), # can be specified, otherwise the default value is used.
+ TFRandomTranslate(), # Thus, the number of selected steps could vary.
+ TFRandomFlipLeftRight(),
+ TFRandomFlipUpDown()],
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Option (2):
+# Alternatively, one can decide the exact number of randomly selected transformations,
+# e.g., use only any two of them. This could be a better option because too many
+# transformation steps may deform the images too much.
+affine_transform = TFCombineAffine(
+ [TFRandomRotate(), # Individual `probability` has no effect in this approach.
+ TFRandomShear(),
+ TFRandomTranslate(),
+ TFRandomFlipLeftRight(),
+ TFRandomFlipUpDown()],
+ num_selected_transforms=2, # Only two steps from the list will be selected.
+ selected_probabilities=[0.5, 0.0, 0.3, 0.2, 0.0], # Must sum up to 1.0, if given.
+ keep_order=True, # If True, the selected steps must be performed in the given order.
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Please refer to the API manual for more parameter options.
+
+# Apply transformations.
+auto_tune = tf.data.AUTOTUNE
+ds = ds \
+ .map(TFRandomCrop(probability=0.5), num_parallel_calls=auto_tune) \
+ .map(affine_transform, num_parallel_calls=auto_tune) \
+ .map(TFResize((256, 256)), num_parallel_calls=auto_tune)
+
+# In the Dataset `map` call, the parameter `num_parallel_calls` can be set to,
+# e.g., tf.data.AUTOTUNE, for better performance. See docs for TensorFlow Dataset.
+```
+```python
+# Batching:
+# Since the array/tensor shape of each sample could be different, the
+# conventional way of batching may not work. Users will have to consider
+# their own use cases. One possibly useful approach is a padded batch.
+ds = ds.padded_batch(batch_size=2, padding_values=-1.0)
+```
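+With the padding value known (`-1.0` above), the padded rows can be masked out
+again downstream. A minimal sketch of one possible approach (this is not part
+of the Targetran API):
+```python
+# Keep only bbox rows that contain no padding values (padded rows are all -1.0).
+for images, padded_bboxes, padded_labels in ds:
+    for bboxes in padded_bboxes:
+        mask = tf.reduce_all(bboxes != -1.0, axis=-1)
+        real_bboxes = tf.boolean_mask(bboxes, mask)
+```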
+
+## PyTorch Dataset
+
+```python
+from typing import Optional, Sequence, Tuple
+
+import numpy.typing
+from torch.utils.data import Dataset
+
+from targetran.np import (
+ CombineAffine,
+ RandomFlipLeftRight,
+ RandomFlipUpDown,
+ RandomRotate,
+ RandomShear,
+ RandomTranslate,
+ RandomCrop,
+ Resize,
+)
+from targetran.utils import Compose
+
+NDFloatArray = numpy.typing.NDArray[numpy.float_]
+
+
+class PTDataset(Dataset):
+ """
+ A very simple PyTorch Dataset.
+ As per common practice, transforms are done on NumPy arrays.
+ """
+
+ def __init__(
+ self,
+ image_seq: Sequence[NDFloatArray],
+ bboxes_seq: Sequence[NDFloatArray],
+ labels_seq: Sequence[NDFloatArray],
+ transforms: Optional[Compose]
+ ) -> None:
+ self.image_seq = image_seq
+ self.bboxes_seq = bboxes_seq
+ self.labels_seq = labels_seq
+ self.transforms = transforms
+
+ def __len__(self) -> int:
+ return len(self.image_seq)
+
+ def __getitem__(
+ self,
+ idx: int
+ ) -> Tuple[NDFloatArray, NDFloatArray, NDFloatArray]:
+ if self.transforms:
+ return self.transforms(
+ self.image_seq[idx],
+ self.bboxes_seq[idx],
+ self.labels_seq[idx]
+ )
+ return (
+ self.image_seq[idx],
+ self.bboxes_seq[idx],
+ self.labels_seq[idx]
+ )
+
+
+# The affine transformations can be combined into one operation for better performance.
+# Note that cropping and resizing are not affine and cannot be combined.
+# Option (1):
+affine_transform = CombineAffine(
+ [RandomRotate(probability=0.8), # Probability to include each affine transformation step
+ RandomShear(probability=0.6), # can be specified, otherwise the default value is used.
+ RandomTranslate(), # Thus, the number of selected steps could vary.
+ RandomFlipLeftRight(),
+ RandomFlipUpDown()],
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Option (2):
+# Alternatively, one can decide the exact number of randomly selected transformations,
+# e.g., use only any two of them. This could be a better option because too many
+# transformation steps may deform the images too much.
+affine_transform = CombineAffine(
+ [RandomRotate(), # Individual `probability` has no effect in this approach.
+ RandomShear(),
+ RandomTranslate(),
+ RandomFlipLeftRight(),
+ RandomFlipUpDown()],
+ num_selected_transforms=2, # Only two steps from the list will be selected.
+ selected_probabilities=[0.5, 0.0, 0.3, 0.2, 0.0], # Must sum up to 1.0, if given.
+ keep_order=True, # If True, the selected steps must be performed in the given order.
+ probability=1.0 # Probability to apply this single combined transformation.
+)
+# Please refer to the API manual for more parameter options.
+
+# The `Compose` here is similar to that from the torchvision package, except
+# that here it also supports callables with multiple inputs and outputs needed
+# for object detection tasks, i.e., (image, bboxes, labels).
+transforms = Compose([
+ RandomCrop(probability=0.5),
+ affine_transform,
+ Resize((256, 256)),
+])
+
+# Convert the above data sequences into a PyTorch Dataset.
+# Users can have their own way to create the Dataset, as long as for each iteration
+# it returns a tuple of arrays for a single sample: (image, bboxes, labels).
+ds = PTDataset(image_seq, bboxes_seq, labels_seq, transforms=transforms)
+```
+```python
+# Batching:
+# In PyTorch, it is common to use a Dataset with a DataLoader, which provides
+# batching functionality. However, since the array/tensor shape of each sample
+# could be different, the default batching may not work. Targetran provides
+# a `collate_fn` that helps produce batches of (image_seq, bboxes_seq, labels_seq).
+from torch.utils.data import DataLoader
+from targetran.utils import collate_fn
+
+data_loader = DataLoader(ds, batch_size=2, collate_fn=collate_fn)
+```
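+Iterating the loader then yields one tuple per batch. A short sketch of
+consuming it (assuming the batch structure described in the comment above):
+```python
+# Each *_batch holds one entry per sample; entries may have different shapes,
+# e.g., a different number of bboxes per image (sketch only).
+for image_batch, bboxes_batch, labels_batch in data_loader:
+    assert len(image_batch) == len(bboxes_batch) == len(labels_batch)
+```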
+
+## Image classification
+
+While the tools here are primarily designed for object detection tasks, they can
+also be used for image classification, where only the images are transformed,
+e.g., given a dataset that returns `(image, label)` samples, or even only `image` samples.
+The `image_only` function can be used to wrap a transformation class for this purpose.
+
+If the dataset returns a tuple `(image, ...)` in each iteration, only the `image`
+will be transformed; any elements that follow, such as `(..., label, weight)`,
+will be returned untouched.
+
+If the dataset returns `image` only (not a tuple), then only the transformed `image` will be returned.
+```python
+from targetran.utils import image_only
+```
+```python
+# TensorFlow.
+ds = ds \
+ .map(image_only(TFRandomCrop())) \
+ .map(image_only(affine_transform)) \
+ .map(image_only(TFResize((256, 256)))) \
+ .batch(32) # Conventional batching can be used for classification setup.
+```
+```python
+# PyTorch.
+transforms = Compose([
+ image_only(RandomCrop()),
+ image_only(affine_transform),
+ image_only(Resize((256, 256))),
+])
+ds = PTDataset(..., transforms=transforms)
+data_loader = DataLoader(ds, batch_size=32)
+```
+
+## Examples
+
+- [Code examples in this repository](examples)
+- [Construct a TensorFlow Dataset with Targetran
+ and object detection data](https://www.kaggle.com/boscoyung/targetran-example-with-tensorflow-dataset)
+ (Kaggle Notebook)
+- [Image classification with TensorFlow and Targetran on TPU](https://www.kaggle.com/boscoyung/targetran-tpu-for-image-classification-example)
+ (Kaggle Notebook)
+
+# API
+
+See [here](docs/API.md) for API details.
+
+
+%prep
+%autosetup -n targetran-0.12.0
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-targetran -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Wed May 10 2023 Python_Bot <Python_Bot@openeuler.org> - 0.12.0-1
+- Package Spec generated