%global _empty_manifest_terminate_build 0
Name:           python-deep-utils
Version:        1.0.2
Release:        1
Summary:        Deep Utils
License:        MIT License
URL:            https://github.com/pooya-mohammadi/deep_utils
Source0:        https://mirrors.nju.edu.cn/pypi/web/packages/d8/a2/6e84a6fe1f1a1db4f046e1b4d0f8fbf8b4023b7de49f51df90d4a166a5bd/deep_utils-1.0.2.tar.gz
BuildArch:      noarch

Requires:       python3-numpy
Requires:       python3-requests
Requires:       python3-tqdm
Requires:       python3-opencv-python
Requires:       python3-tensorflow
Requires:       python3-torch
Requires:       python3-torchvision
Requires:       python3-torchaudio
Requires:       python3-transformers

%description
[](https://pepy.tech/project/deep_utils) [](https://pypi.python.org/pypi/deep_utils) [](https://github.com/pooya-mohammadi/deep_utils/actions/workflows/ci-tests.yml)
## Object Detection
### YoloV5
YoloV5 is by far one of the most widely used object detection models. The training process is straightforward and the results are impressive. However, using a trained model in production can be challenging because of the several auxiliary files a yolov5 model needs. To tackle this issue, we have wrapped yolov5's models in a simple module whose usage is illustrated in the following section.
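Steps 1 and 2 of this walkthrough (loading the model and reading the image) are not reproduced in this excerpt. A minimal sketch of that setup, assuming the `YOLOV5TorchObjectDetector` wrapper from deep_utils and a hypothetical image path:
```python
import cv2
from deep_utils import YOLOV5TorchObjectDetector

# 1. Load the wrapped yolov5 model (relying on the library's default weights is an assumption).
yolov5 = YOLOV5TorchObjectDetector()
# 2. Read a test image; cv2.imread returns it in BGR channel order.
base_image = cv2.imread("path/to/image.jpg")  # hypothetical path
```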
3. Detect and visualize Objects
```python
from PIL import Image
from deep_utils import Box, show_destroy_cv2

# Detect the objects.
# The image was opened by cv2, which results in a BGR image; therefore `is_rgb` is set to `False`.
result = yolov5.detect_objects(base_image, is_rgb=False, confidence=0.5)
# Draw the detected boxes and labels on the image.
img = Box.put_box_text(base_image,
                       box=result.boxes,
                       label=[f"{c_n} {c}" for c_n, c in zip(result.class_names, result.confidences)])
# PIL.Image is used for visualization; reversing the last axis converts BGR to RGB.
Image.fromarray(img[..., ::-1])
# Visualize using OpenCV instead:
# show_destroy_cv2(img)
```
## NLP
This section provides models and utilities for NLP projects.
### NER
Named Entity Recognition
#### multi-label-stratify
For splitting multi-label NER datasets, see the [Multi-Label-Stratify](#multi-label-stratify) section under Utils below.
## Augmentation
### CutMix
*(figure: CutMix output combining a section of a triangle with a section of a circle)*

As illustrated in the image above, a section of the triangle and a section of the circle are combined. By changing `seg_cutmix` to `seg_cutmix_batch`, one can apply CutMix augmentation to a batch of images.
```python
from deep_utils import CutMixTF

# Apply CutMix across a whole batch; the masks are passed as 2D arrays, hence batch_mask[..., 0].
cutmix_img, cutmix_mask = CutMixTF.seg_cutmix_batch(a_images=batch_img, a_masks=batch_mask[..., 0], beta=1)
```
**Input:**
*(figure: the input images and masks)*
## Utils
This section provides various utility functions.
### DictNamedTuple
This custom data type adds the methods of the dict type to the NamedTuple type. You have access to `.get()`, `.keys()`, `.values()`, and `.items()` alongside all of the functionality of a NamedTuple. In addition, all of the outputs of our models are DictNamedTuple objects, so you can modify and manipulate them easily. Let's see how to use it:
```python
from deep_utils import dictnamedtuple

# create a new namedtuple type with dict-like methods
dict_object = dictnamedtuple(typename='letters', field_names=['firstname', 'lastname'])
# instantiate it with values
instance_dict = dict_object(firstname='pooya', lastname='mohammadi')
# access items, keys, values, and fields
print("items: ", instance_dict.items())
print("keys: ", instance_dict.keys())
print("values: ", instance_dict.values())
print("firstname: ", instance_dict.firstname)
print("firstname: ", instance_dict['firstname'])
print("lastname: ", instance_dict.lastname)
print("lastname: ", instance_dict['lastname'])
```
```
# results
items: [('firstname', 'pooya'), ('lastname', 'mohammadi')]
keys: ['firstname', 'lastname']
values: ['pooya', 'mohammadi']
firstname: pooya
firstname: pooya
lastname: mohammadi
lastname: mohammadi
```
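Because `.get()` is also available, dict-style safe access works as well. A small sketch; the default-value behavior is an assumption, mirroring `dict.get`:
```python
print(instance_dict.get('firstname'))          # pooya
print(instance_dict.get('middlename', 'N/A'))  # 'N/A' (assumed dict.get-style default)
```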
### Multi-Label-Stratify
While splitting a dataset for NER or object detection tasks, you might have noticed that the stratify functionality of `train_test_split` in the `scikit-learn` library cannot be used: not only may each sample in these tasks have more than one tag/object, but each class may also appear more than once in a single sample. For example, an image may contain two dogs and three cats, which means the label/y of that sample would be `[2, 3]`, where index zero corresponds to the dog class and index one corresponds to the cat class.
To split these types of datasets, the `deep_utils` library provides the following function, which is very easy to use. It requires two arrays. The first is an array or list containing the input samples; these samples could be anything: a list of sentences, a list of paths to input images, or even structured data like the one in the following example. The second must be a 2D ndarray whose first dimension equals the number of samples and whose second dimension equals the number of classes. Each column corresponds to a class, and each element counts how many items of that class appear in the sample. For example, the element at index `[0, 0]` of the array `[[1, 0], [3, 3]]`, which equals 1, shows that sample 0 contains one item of the first class, i.e., the class that corresponds to index zero. Now, let's see an example:
```python
>>> import numpy as np
>>> from deep_utils import stratify_train_test_split_multi_label
>>> x = np.array([[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([[0, 0], [0, 0], [0, 1], [0, 1], [1, 1], [1, 1], [1, 0], [1, 0]])
>>> x_train, x_test, y_train, y_test = stratify_train_test_split_multi_label(x, y, test_size=0.5, closest_ratio=False)
>>> x_train
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])
>>> x_test
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])
>>> y_train
array([[0, 1],
       [0, 1],
       [1, 0],
       [1, 0]])
>>> y_test
array([[1, 1],
       [1, 1],
       [0, 0],
       [0, 0]])
>>> print("class ratio:", tuple(y_test.sum(0) / y.sum(0)))
class ratio: (0.5, 0.5)
>>> print("sample ratio:", y_test.shape[0] / y.shape[0])
sample ratio: 0.5
```
As the results clearly show, both the sample ratio and the per-class ratios are preserved. In some datasets it is impossible to achieve the exact requested ratio, so the function splits the input dataset so that the resulting ratio is as close as possible to the requested one. Link to code:
https://github.com/pooya-mohammadi/deep_utils/blob/master/deep_utils/utils/multi_label_utils/stratify/stratify_train_test_split.py
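To tie this back to the dog/cat example above, the label matrix may also contain counts rather than binary indicators. A minimal sketch, assuming count labels are handled as described; the data and paths are made up for illustration:
```python
import numpy as np
from deep_utils import stratify_train_test_split_multi_label

# column 0 = number of dogs, column 1 = number of cats in each sample
y = np.array([[2, 3], [1, 0], [0, 2], [2, 3], [1, 0], [0, 2]])
x = np.array([f"img_{i}.jpg" for i in range(len(y))])  # hypothetical image paths
x_train, x_test, y_train, y_test = stratify_train_test_split_multi_label(x, y, test_size=0.5)
```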
## Tests
Tests are run on Python 3.8 and 3.9. Deep-Utils will probably run without any errors on earlier versions as well.
**Note**: Model tests are run on the CPU devices provided by GitHub Actions. GPU-based models are tested manually by the authors.
## Contributing
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.
If you have a suggestion that would improve this toolkit, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
Don't forget to give the project a ⭐️! Thanks again!
1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## 🌟 Spread the word!
If you want to say thank you and/or support active development of the repo:
- Add a GitHub star to the project!
- Join our Discord server: [Deep Utils](https://discord.gg/pWe3yChw).
- Follow my profile: [pooya-mohammadi](https://github.com/pooya-mohammadi)

Thanks so much for your interest in growing the reach of the repo!
## ⚠️ License
Distributed under the MIT License. See `LICENSE` for more information.
The LICENSE of each model is located inside its corresponding directory.
## 🤝 Collaborators
- Pooya Mohammadi Kazaj
- Vargha Khallokhi
- Zahra Zamanshoar
- Dorna Sabet
- Menua Bedrosian
- Alireza Kazemipour