-rw-r--r-- .gitignore 1
-rw-r--r-- python-fastdist.spec 777
-rw-r--r-- sources 1
3 files changed, 779 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..321a85d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/fastdist-1.1.5.tar.gz
diff --git a/python-fastdist.spec b/python-fastdist.spec
new file mode 100644
index 0000000..4cababc
--- /dev/null
+++ b/python-fastdist.spec
@@ -0,0 +1,777 @@
+%global _empty_manifest_terminate_build 0
+Name: python-fastdist
+Version: 1.1.5
+Release: 1
+Summary: Faster distance calculations in python using numba
+License: MIT License
+URL: https://github.com/talboger/fastdist
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/e7/a1/75ceda5255db764e9f37d501afe108ccd0952dc2a2d20697dd88e63ae980/fastdist-1.1.5.tar.gz
+BuildArch: noarch
+
+
+%description
+# fastdist: Faster distance calculations in python using numba
+
+fastdist is a replacement for scipy.spatial.distance that shows significant speed improvements by using numba and some optimization.
+
+Newer versions of fastdist (> 1.0.0) also add partial implementations of sklearn.metrics, which likewise show significant speed improvements.
+
+What's new in each version:
+
+- 1.1.0: adds implementation of several sklearn.metrics functions, fixes an error in the Chebyshev distance calculation and adds slight speed optimizations.
+- 1.1.1: large speed optimizations for confusion matrix-based metrics (see more about this in the "1.1.1 speed improvements" section), fix precision and recall scores
+- 1.1.2: speed improvement and bug fix for `cosine_pairwise_distance`
+- 1.1.3: bug fix for `f1_score`, which resulted from v1.1.1 speed improvements
+- 1.1.4: bug fix for `float32`, speed improvements for accuracy score by allowing confusion matrix
+- 1.1.5: make cosine function calculate cosine distance rather than cosine similarity (as in earlier versions) for consistency with scipy, fix in-place matrix modification for cosine matrix functions
+
+## Installation
+
+Use the package manager [pip](https://pip.pypa.io/en/stable/) to install fastdist.
+
+```bash
+pip install fastdist
+```
+
+## Usage
+
+For calculating the distance between 2 vectors, fastdist uses the same function calls
+as scipy.spatial.distance. So, for example, to calculate the Euclidean distance between
+2 vectors, run:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+u = np.random.rand(100)
+v = np.random.rand(100)
+
+fastdist.euclidean(u, v)
+```
+
+The same is true for most sklearn.metrics functions, though not all functions in sklearn.metrics are implemented in fastdist.
+Notably, most of the ROC-based functions are not (yet) available in fastdist. However, the other functions are the same as sklearn.metrics.
+So, for example, to create a confusion matrix from two discrete vectors, run:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+y_true = np.random.randint(10, size=10000)
+y_pred = np.random.randint(10, size=10000)
+
+fastdist.confusion_matrix(y_true, y_pred)
+```
+
+For calculating distances involving matrices, fastdist has a few different functions instead of scipy's cdist and pdist.
+
+To calculate the distance between a vector and each row of a matrix, use `vector_to_matrix_distance`:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+u = np.random.rand(100)
+m = np.random.rand(50, 100)
+
+fastdist.vector_to_matrix_distance(u, m, fastdist.euclidean, "euclidean")
+# returns an array of shape (50,)
+```
+
+To calculate the distance between the rows of 2 matrices, use `matrix_to_matrix_distance`:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+a = np.random.rand(25, 100)
+b = np.random.rand(50, 100)
+
+fastdist.matrix_to_matrix_distance(a, b, fastdist.euclidean, "euclidean")
+# returns an array of shape (25, 50)
+```
+
+Finally, to calculate the pairwise distances between the rows of a matrix, use `matrix_pairwise_distance`:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+a = np.random.rand(10, 100)
+fastdist.matrix_pairwise_distance(a, fastdist.euclidean, "euclidean", return_matrix=False)
+# returns an array of shape (10 choose 2, 1)
+# to return a matrix with entry (i, j) as the distance between row i and j
+# set return_matrix=True, in which case this will return a (10, 10) array
+```
+
+## Speed
+
+fastdist is significantly faster than scipy.spatial.distance in most cases.
+
+Though almost all functions will show a speed improvement in fastdist, certain functions will have
+an especially large improvement. Notably, cosine similarity is much faster, as are the vector/matrix,
+matrix/matrix, and pairwise matrix calculations.
+
+Note that numba - the primary package fastdist uses - compiles the function to machine code the first
+time it is called. So, the first time you call a function will be slower than the following times, as
+the first runtime includes the compile time.
+
+Here are some examples comparing the speed of fastdist to scipy.spatial.distance:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from scipy.spatial import distance
+
+a, b = np.random.rand(200, 100), np.random.rand(2500, 100)
+%timeit -n 100 fastdist.matrix_to_matrix_distance(a, b, fastdist.cosine, "cosine")
+# 8.97 ms ± 11.2 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
+# note this high stdev is because of the first run taking longer to compile
+
+%timeit -n 100 distance.cdist(a, b, "cosine")
+# 57.9 ms ± 4.43 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+In this example, fastdist is about 7x faster than scipy.spatial.distance. This difference only gets larger
+as the matrices get bigger and when we compile the fastdist function once before running it. For example:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from scipy.spatial import distance
+
+a, b = np.random.rand(200, 1000), np.random.rand(2500, 1000)
+# I compiled the matrix_to_matrix_distance function once before this, so it's already in machine code
+%timeit fastdist.matrix_to_matrix_distance(a, b, fastdist.cosine, "cosine")
+# 25.4 ms ± 1.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
+
+%timeit distance.cdist(a, b, "cosine")
+# 689 ms ± 10.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
+```
+
+Here, fastdist is about 27x faster than scipy.spatial.distance. Though cosine similarity is particularly
+optimized, other functions are still faster with fastdist. For example:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from scipy.spatial import distance
+
+a = np.random.rand(200, 1000)
+
+%timeit fastdist.matrix_pairwise_distance(a, fastdist.euclidean, "euclidean")
+# 14 ms ± 458 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit distance.pdist(a, "euclidean")
+# 26.9 ms ± 1.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
+```
+
+fastdist's implementations of the functions in sklearn.metrics are also significantly faster. For example:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=100000)
+y_pred = np.random.randint(2, size=100000)
+
+%timeit fastdist.accuracy_score(y_true, y_pred)
+# 74 µs ± 5.81 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.accuracy_score(y_true, y_pred)
+# 7.23 ms ± 157 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+Here, fastdist is about 97x faster than sklearn's implementation.
+
+#### 1.1.1 speed improvements
+
+fastdist v1.1.1 adds significant speed improvements to confusion matrix-based metrics functions (balanced accuracy score, precision, and recall).
+These speed improvements are possible by not recalculating the confusion matrix each time, as sklearn.metrics does.
+
+In older versions of fastdist (<v1.1.1), we also recalculate the confusion matrix each time, giving us the following speed:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=10000)
+y_pred = np.random.randint(2, size=10000)
+
+%timeit fastdist.balanced_accuracy_score(y_true, y_pred)
+# 1.39 ms ± 66.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.balanced_accuracy_score(y_true, y_pred)
+# 11.3 ms ± 185 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+Here, fastdist is about 8x faster than sklearn.metrics.
+
+However, now let's say that we need to compute confusion matrices and then also want to compute balanced accuracy:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=10000)
+y_pred = np.random.randint(2, size=10000)
+
+%timeit fastdist.confusion_matrix(y_true, y_pred)
+# 1.45 ms ± 55.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.confusion_matrix(y_true, y_pred)
+# 11.8 ms ± 499 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+The confusion matrix computation by itself is about 8x faster with fastdist. But the larger speed improvement will come now that we don't need to
+recompute the confusion matrix to calculate balanced accuracy:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=10000)
+y_pred = np.random.randint(2, size=10000)
+cm = fastdist.confusion_matrix(y_true, y_pred)  # computed once, then passed in below
+%timeit fastdist.balanced_accuracy_score(y_true, y_pred, cm)
+# 11.7 µs ± 2.12 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.balanced_accuracy_score(y_true, y_pred)
+# 9.81 ms ± 1.08 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+Saving the confusion matrix computation here makes fastdist's balanced accuracy score 838x faster than sklearn's.
+
+
+%package -n python3-fastdist
+Summary: Faster distance calculations in python using numba
+Provides: python-fastdist
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-fastdist
+# fastdist: Faster distance calculations in python using numba
+
+fastdist is a replacement for scipy.spatial.distance that shows significant speed improvements by using numba and some optimization.
+
+Newer versions of fastdist (> 1.0.0) also add partial implementations of sklearn.metrics, which likewise show significant speed improvements.
+
+What's new in each version:
+
+- 1.1.0: adds implementation of several sklearn.metrics functions, fixes an error in the Chebyshev distance calculation and adds slight speed optimizations.
+- 1.1.1: large speed optimizations for confusion matrix-based metrics (see more about this in the "1.1.1 speed improvements" section), fix precision and recall scores
+- 1.1.2: speed improvement and bug fix for `cosine_pairwise_distance`
+- 1.1.3: bug fix for `f1_score`, which resulted from v1.1.1 speed improvements
+- 1.1.4: bug fix for `float32`, speed improvements for accuracy score by allowing confusion matrix
+- 1.1.5: make cosine function calculate cosine distance rather than cosine similarity (as in earlier versions) for consistency with scipy, fix in-place matrix modification for cosine matrix functions
+
+## Installation
+
+Use the package manager [pip](https://pip.pypa.io/en/stable/) to install fastdist.
+
+```bash
+pip install fastdist
+```
+
+## Usage
+
+For calculating the distance between 2 vectors, fastdist uses the same function calls
+as scipy.spatial.distance. So, for example, to calculate the Euclidean distance between
+2 vectors, run:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+u = np.random.rand(100)
+v = np.random.rand(100)
+
+fastdist.euclidean(u, v)
+```
+
+The same is true for most sklearn.metrics functions, though not all functions in sklearn.metrics are implemented in fastdist.
+Notably, most of the ROC-based functions are not (yet) available in fastdist. However, the other functions are the same as sklearn.metrics.
+So, for example, to create a confusion matrix from two discrete vectors, run:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+y_true = np.random.randint(10, size=10000)
+y_pred = np.random.randint(10, size=10000)
+
+fastdist.confusion_matrix(y_true, y_pred)
+```
+
+For calculating distances involving matrices, fastdist has a few different functions instead of scipy's cdist and pdist.
+
+To calculate the distance between a vector and each row of a matrix, use `vector_to_matrix_distance`:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+u = np.random.rand(100)
+m = np.random.rand(50, 100)
+
+fastdist.vector_to_matrix_distance(u, m, fastdist.euclidean, "euclidean")
+# returns an array of shape (50,)
+```
+
+To calculate the distance between the rows of 2 matrices, use `matrix_to_matrix_distance`:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+a = np.random.rand(25, 100)
+b = np.random.rand(50, 100)
+
+fastdist.matrix_to_matrix_distance(a, b, fastdist.euclidean, "euclidean")
+# returns an array of shape (25, 50)
+```
+
+Finally, to calculate the pairwise distances between the rows of a matrix, use `matrix_pairwise_distance`:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+a = np.random.rand(10, 100)
+fastdist.matrix_pairwise_distance(a, fastdist.euclidean, "euclidean", return_matrix=False)
+# returns an array of shape (10 choose 2, 1)
+# to return a matrix with entry (i, j) as the distance between row i and j
+# set return_matrix=True, in which case this will return a (10, 10) array
+```
+
+## Speed
+
+fastdist is significantly faster than scipy.spatial.distance in most cases.
+
+Though almost all functions will show a speed improvement in fastdist, certain functions will have
+an especially large improvement. Notably, cosine similarity is much faster, as are the vector/matrix,
+matrix/matrix, and pairwise matrix calculations.
+
+Note that numba - the primary package fastdist uses - compiles the function to machine code the first
+time it is called. So, the first time you call a function will be slower than the following times, as
+the first runtime includes the compile time.
+
+Here are some examples comparing the speed of fastdist to scipy.spatial.distance:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from scipy.spatial import distance
+
+a, b = np.random.rand(200, 100), np.random.rand(2500, 100)
+%timeit -n 100 fastdist.matrix_to_matrix_distance(a, b, fastdist.cosine, "cosine")
+# 8.97 ms ± 11.2 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
+# note this high stdev is because of the first run taking longer to compile
+
+%timeit -n 100 distance.cdist(a, b, "cosine")
+# 57.9 ms ± 4.43 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+In this example, fastdist is about 7x faster than scipy.spatial.distance. This difference only gets larger
+as the matrices get bigger and when we compile the fastdist function once before running it. For example:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from scipy.spatial import distance
+
+a, b = np.random.rand(200, 1000), np.random.rand(2500, 1000)
+# I compiled the matrix_to_matrix_distance function once before this, so it's already in machine code
+%timeit fastdist.matrix_to_matrix_distance(a, b, fastdist.cosine, "cosine")
+# 25.4 ms ± 1.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
+
+%timeit distance.cdist(a, b, "cosine")
+# 689 ms ± 10.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
+```
+
+Here, fastdist is about 27x faster than scipy.spatial.distance. Though cosine similarity is particularly
+optimized, other functions are still faster with fastdist. For example:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from scipy.spatial import distance
+
+a = np.random.rand(200, 1000)
+
+%timeit fastdist.matrix_pairwise_distance(a, fastdist.euclidean, "euclidean")
+# 14 ms ± 458 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit distance.pdist(a, "euclidean")
+# 26.9 ms ± 1.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
+```
+
+fastdist's implementations of the functions in sklearn.metrics are also significantly faster. For example:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=100000)
+y_pred = np.random.randint(2, size=100000)
+
+%timeit fastdist.accuracy_score(y_true, y_pred)
+# 74 µs ± 5.81 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.accuracy_score(y_true, y_pred)
+# 7.23 ms ± 157 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+Here, fastdist is about 97x faster than sklearn's implementation.
+
+#### 1.1.1 speed improvements
+
+fastdist v1.1.1 adds significant speed improvements to confusion matrix-based metrics functions (balanced accuracy score, precision, and recall).
+These speed improvements are possible by not recalculating the confusion matrix each time, as sklearn.metrics does.
+
+In older versions of fastdist (<v1.1.1), we also recalculate the confusion matrix each time, giving us the following speed:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=10000)
+y_pred = np.random.randint(2, size=10000)
+
+%timeit fastdist.balanced_accuracy_score(y_true, y_pred)
+# 1.39 ms ± 66.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.balanced_accuracy_score(y_true, y_pred)
+# 11.3 ms ± 185 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+Here, fastdist is about 8x faster than sklearn.metrics.
+
+However, now let's say that we need to compute confusion matrices and then also want to compute balanced accuracy:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=10000)
+y_pred = np.random.randint(2, size=10000)
+
+%timeit fastdist.confusion_matrix(y_true, y_pred)
+# 1.45 ms ± 55.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.confusion_matrix(y_true, y_pred)
+# 11.8 ms ± 499 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+The confusion matrix computation by itself is about 8x faster with fastdist. But the larger speed improvement will come now that we don't need to
+recompute the confusion matrix to calculate balanced accuracy:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=10000)
+y_pred = np.random.randint(2, size=10000)
+cm = fastdist.confusion_matrix(y_true, y_pred)  # computed once, then passed in below
+%timeit fastdist.balanced_accuracy_score(y_true, y_pred, cm)
+# 11.7 µs ± 2.12 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.balanced_accuracy_score(y_true, y_pred)
+# 9.81 ms ± 1.08 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+Saving the confusion matrix computation here makes fastdist's balanced accuracy score 838x faster than sklearn's.
+
+
+%package help
+Summary: Development documents and examples for fastdist
+Provides: python3-fastdist-doc
+%description help
+# fastdist: Faster distance calculations in python using numba
+
+fastdist is a replacement for scipy.spatial.distance that shows significant speed improvements by using numba and some optimization.
+
+Newer versions of fastdist (> 1.0.0) also add partial implementations of sklearn.metrics, which likewise show significant speed improvements.
+
+What's new in each version:
+
+- 1.1.0: adds implementation of several sklearn.metrics functions, fixes an error in the Chebyshev distance calculation and adds slight speed optimizations.
+- 1.1.1: large speed optimizations for confusion matrix-based metrics (see more about this in the "1.1.1 speed improvements" section), fix precision and recall scores
+- 1.1.2: speed improvement and bug fix for `cosine_pairwise_distance`
+- 1.1.3: bug fix for `f1_score`, which resulted from v1.1.1 speed improvements
+- 1.1.4: bug fix for `float32`, speed improvements for accuracy score by allowing confusion matrix
+- 1.1.5: make cosine function calculate cosine distance rather than cosine similarity (as in earlier versions) for consistency with scipy, fix in-place matrix modification for cosine matrix functions
+
+## Installation
+
+Use the package manager [pip](https://pip.pypa.io/en/stable/) to install fastdist.
+
+```bash
+pip install fastdist
+```
+
+## Usage
+
+For calculating the distance between 2 vectors, fastdist uses the same function calls
+as scipy.spatial.distance. So, for example, to calculate the Euclidean distance between
+2 vectors, run:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+u = np.random.rand(100)
+v = np.random.rand(100)
+
+fastdist.euclidean(u, v)
+```
+
+The same is true for most sklearn.metrics functions, though not all functions in sklearn.metrics are implemented in fastdist.
+Notably, most of the ROC-based functions are not (yet) available in fastdist. However, the other functions are the same as sklearn.metrics.
+So, for example, to create a confusion matrix from two discrete vectors, run:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+y_true = np.random.randint(10, size=10000)
+y_pred = np.random.randint(10, size=10000)
+
+fastdist.confusion_matrix(y_true, y_pred)
+```
+
+For calculating distances involving matrices, fastdist has a few different functions instead of scipy's cdist and pdist.
+
+To calculate the distance between a vector and each row of a matrix, use `vector_to_matrix_distance`:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+u = np.random.rand(100)
+m = np.random.rand(50, 100)
+
+fastdist.vector_to_matrix_distance(u, m, fastdist.euclidean, "euclidean")
+# returns an array of shape (50,)
+```
+
+To calculate the distance between the rows of 2 matrices, use `matrix_to_matrix_distance`:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+a = np.random.rand(25, 100)
+b = np.random.rand(50, 100)
+
+fastdist.matrix_to_matrix_distance(a, b, fastdist.euclidean, "euclidean")
+# returns an array of shape (25, 50)
+```
+
+Finally, to calculate the pairwise distances between the rows of a matrix, use `matrix_pairwise_distance`:
+
+```python
+from fastdist import fastdist
+import numpy as np
+
+a = np.random.rand(10, 100)
+fastdist.matrix_pairwise_distance(a, fastdist.euclidean, "euclidean", return_matrix=False)
+# returns an array of shape (10 choose 2, 1)
+# to return a matrix with entry (i, j) as the distance between row i and j
+# set return_matrix=True, in which case this will return a (10, 10) array
+```
+
+## Speed
+
+fastdist is significantly faster than scipy.spatial.distance in most cases.
+
+Though almost all functions will show a speed improvement in fastdist, certain functions will have
+an especially large improvement. Notably, cosine similarity is much faster, as are the vector/matrix,
+matrix/matrix, and pairwise matrix calculations.
+
+Note that numba - the primary package fastdist uses - compiles the function to machine code the first
+time it is called. So, the first time you call a function will be slower than the following times, as
+the first runtime includes the compile time.
+
+Here are some examples comparing the speed of fastdist to scipy.spatial.distance:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from scipy.spatial import distance
+
+a, b = np.random.rand(200, 100), np.random.rand(2500, 100)
+%timeit -n 100 fastdist.matrix_to_matrix_distance(a, b, fastdist.cosine, "cosine")
+# 8.97 ms ± 11.2 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
+# note this high stdev is because of the first run taking longer to compile
+
+%timeit -n 100 distance.cdist(a, b, "cosine")
+# 57.9 ms ± 4.43 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+In this example, fastdist is about 7x faster than scipy.spatial.distance. This difference only gets larger
+as the matrices get bigger and when we compile the fastdist function once before running it. For example:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from scipy.spatial import distance
+
+a, b = np.random.rand(200, 1000), np.random.rand(2500, 1000)
+# I compiled the matrix_to_matrix_distance function once before this, so it's already in machine code
+%timeit fastdist.matrix_to_matrix_distance(a, b, fastdist.cosine, "cosine")
+# 25.4 ms ± 1.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
+
+%timeit distance.cdist(a, b, "cosine")
+# 689 ms ± 10.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
+```
+
+Here, fastdist is about 27x faster than scipy.spatial.distance. Though cosine similarity is particularly
+optimized, other functions are still faster with fastdist. For example:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from scipy.spatial import distance
+
+a = np.random.rand(200, 1000)
+
+%timeit fastdist.matrix_pairwise_distance(a, fastdist.euclidean, "euclidean")
+# 14 ms ± 458 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit distance.pdist(a, "euclidean")
+# 26.9 ms ± 1.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
+```
+
+fastdist's implementations of the functions in sklearn.metrics are also significantly faster. For example:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=100000)
+y_pred = np.random.randint(2, size=100000)
+
+%timeit fastdist.accuracy_score(y_true, y_pred)
+# 74 µs ± 5.81 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.accuracy_score(y_true, y_pred)
+# 7.23 ms ± 157 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+Here, fastdist is about 97x faster than sklearn's implementation.
+
+#### 1.1.1 speed improvements
+
+fastdist v1.1.1 adds significant speed improvements to confusion matrix-based metrics functions (balanced accuracy score, precision, and recall).
+These speed improvements are possible by not recalculating the confusion matrix each time, as sklearn.metrics does.
+
+In older versions of fastdist (<v1.1.1), we also recalculate the confusion matrix each time, giving us the following speed:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=10000)
+y_pred = np.random.randint(2, size=10000)
+
+%timeit fastdist.balanced_accuracy_score(y_true, y_pred)
+# 1.39 ms ± 66.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.balanced_accuracy_score(y_true, y_pred)
+# 11.3 ms ± 185 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+Here, fastdist is about 8x faster than sklearn.metrics.
+
+However, now let's say that we need to compute confusion matrices and then also want to compute balanced accuracy:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=10000)
+y_pred = np.random.randint(2, size=10000)
+
+%timeit fastdist.confusion_matrix(y_true, y_pred)
+# 1.45 ms ± 55.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.confusion_matrix(y_true, y_pred)
+# 11.8 ms ± 499 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+The confusion matrix computation by itself is about 8x faster with fastdist. But the larger speed improvement will come now that we don't need to
+recompute the confusion matrix to calculate balanced accuracy:
+
+```python
+from fastdist import fastdist
+import numpy as np
+from sklearn import metrics
+
+y_true = np.random.randint(2, size=10000)
+y_pred = np.random.randint(2, size=10000)
+cm = fastdist.confusion_matrix(y_true, y_pred)  # computed once, then passed in below
+%timeit fastdist.balanced_accuracy_score(y_true, y_pred, cm)
+# 11.7 µs ± 2.12 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
+
+%timeit metrics.balanced_accuracy_score(y_true, y_pred)
+# 9.81 ms ± 1.08 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
+```
+
+Saving the confusion matrix computation here makes fastdist's balanced accuracy score 838x faster than sklearn's.
+
+
+%prep
+%autosetup -n fastdist-1.1.5
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-fastdist -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Mon May 29 2023 Python_Bot <Python_Bot@openeuler.org> - 1.1.5-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..8a878c6
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+6e86ed5b62ec633e17f1f52bb58189a7 fastdist-1.1.5.tar.gz