| field | value | date |
|---|---|---|
| author | CoprDistGit <infra@openeuler.org> | 2023-06-08 14:30:10 +0000 |
| committer | CoprDistGit <infra@openeuler.org> | 2023-06-08 14:30:10 +0000 |
| commit | c981d2b41739959a981afdb5be83d287362035f2 (patch) | |
| tree | b8049d93687e5b60a96e3f252ed4a61ce0b09bec | |
| parent | 01dba0dde715e950b933eaf295cc196e0dfca77a (diff) | |
automatic import of python-deepsparse (openeuler20.03)
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | .gitignore | 1 |
| -rw-r--r-- | python-deepsparse.spec | 522 |
| -rw-r--r-- | sources | 2 |

3 files changed, 254 insertions(+), 271 deletions(-)
@@ -1 +1,2 @@ /deepsparse-1.4.2.tar.gz +/deepsparse-1.5.0.tar.gz diff --git a/python-deepsparse.spec b/python-deepsparse.spec index 60eb037..df9b7a8 100644 --- a/python-deepsparse.spec +++ b/python-deepsparse.spec @@ -1,11 +1,11 @@ %global _empty_manifest_terminate_build 0 Name: python-deepsparse -Version: 1.4.2 +Version: 1.5.0 Release: 1 Summary: An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application License: Neural Magic DeepSparse Community License, Apache URL: https://github.com/neuralmagic/deepsparse -Source0: https://mirrors.nju.edu.cn/pypi/web/packages/35/fc/e2b35362e0d2077dadc7a2ca671ad016eb3e829e4c273c601b9097148e09/deepsparse-1.4.2.tar.gz +Source0: https://mirrors.aliyun.com/pypi/web/packages/af/9a/4e35b6aa1f14544536f9f14b6fa2888a450ee94fdb32d2e8eba0274733a8/deepsparse-1.5.0.tar.gz BuildArch: noarch @@ -32,77 +32,115 @@ limitations under the License. <img alt="tool icon" src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/icon-deepsparse.png" /> DeepSparse </h1> - <h3> An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application</h3> - <div style="display: flex; align-items: center; justify-content: center; flex-wrap: wrap"> + <h4> An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application</h4> + <div align="center"> <a href="https://docs.neuralmagic.com/deepsparse/"> - <img alt="Documentation" src="https://img.shields.io/badge/documentation-darkred?&style=for-the-badge&logo=read-the-docs" height="25" /> + <img alt="Documentation" src="https://img.shields.io/badge/documentation-darkred?&style=for-the-badge&logo=read-the-docs" height="20" /> </a> <a href="https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ/"> - <img alt="Slack" src="https://img.shields.io/badge/slack-purple?style=for-the-badge&logo=slack" height="25" /> + <img alt="Slack" src="https://img.shields.io/badge/slack-purple?style=for-the-badge&logo=slack" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/issues/"> - <img alt="Support" src="https://img.shields.io/badge/support%20forums-navy?style=for-the-badge&logo=github" height="25" /> + <img alt="Support" src="https://img.shields.io/badge/support%20forums-navy?style=for-the-badge&logo=github" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/actions/workflows/quality-check.yaml"> - <img alt="Main" src="https://img.shields.io/github/workflow/status/neuralmagic/deepsparse/Quality%20Checks/main?label=build&style=for-the-badge" height="25" /> + <img alt="Main" src="https://img.shields.io/github/workflow/status/neuralmagic/deepsparse/Quality%20Checks/main?label=build&style=for-the-badge" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/releases"> - <img alt="GitHub release" src="https://img.shields.io/github/release/neuralmagic/deepsparse.svg?style=for-the-badge" height="25" /> + <img alt="GitHub release" src="https://img.shields.io/github/release/neuralmagic/deepsparse.svg?style=for-the-badge" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/blob/main/CODE_OF_CONDUCT.md"> - <img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg?color=yellow&style=for-the-badge" height="25" /> + <img alt="Contributor Covenant" 
src="https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg?color=yellow&style=for-the-badge" height="20" /> </a> <a href="https://www.youtube.com/channel/UCo8dO_WMGYbWCRnj_Dxr4EA"> - <img alt="YouTube" src="https://img.shields.io/badge/-YouTube-red?&style=for-the-badge&logo=youtube&logoColor=white" height="25" /> + <img alt="YouTube" src="https://img.shields.io/badge/-YouTube-red?&style=for-the-badge&logo=youtube&logoColor=white" height="20" /> </a> <a href="https://medium.com/limitlessai"> - <img alt="Medium" src="https://img.shields.io/badge/medium-%2312100E.svg?&style=for-the-badge&logo=medium&logoColor=white" height="25" /> + <img alt="Medium" src="https://img.shields.io/badge/medium-%2312100E.svg?&style=for-the-badge&logo=medium&logoColor=white" height="20" /> </a> <a href="https://twitter.com/neuralmagic"> - <img alt="Twitter" src="https://img.shields.io/twitter/follow/neuralmagic?color=darkgreen&label=Follow&style=social" height="25" /> + <img alt="Twitter" src="https://img.shields.io/twitter/follow/neuralmagic?color=darkgreen&label=Follow&style=social" height="20" /> </a> </div> </div> -A CPU runtime that takes advantage of sparsity within neural networks to reduce compute. Read [more about sparsification](https://docs.neuralmagic.com/user-guide/sparsification). +A CPU runtime that takes advantage of sparsity within neural networks to reduce compute. Read [more about sparsification](https://docs.neuralmagic.com/user-guides/sparsification). Neural Magic's DeepSparse is able to integrate into popular deep learning libraries (e.g., Hugging Face, Ultralytics) allowing you to leverage DeepSparse for loading and deploying sparse models with ONNX. ONNX gives the flexibility to serve your model in a framework-agnostic environment. Support includes [PyTorch,](https://pytorch.org/docs/stable/onnx.html) [TensorFlow,](https://github.com/onnx/tensorflow-onnx) [Keras,](https://github.com/onnx/keras-onnx) and [many other frameworks](https://github.com/onnx/onnxmltools). +## Installation + +Install DeepSparse Community as follows: + +```bash +pip install deepsparse +``` + DeepSparse is available in two editions: 1. [**DeepSparse Community**](#installation) is open-source and free for evaluation, research, and non-production use with our [DeepSparse Community License](https://neuralmagic.com/legal/engine-license-agreement/). 2. [**DeepSparse Enterprise**](https://docs.neuralmagic.com/products/deepsparse-ent) requires a Trial License or [can be fully licensed](https://neuralmagic.com/legal/master-software-license-and-service-agreement/) for production, commercial applications. +## 🧰 Hardware Support and System Requirements + +To ensure that your CPU is compatible with DeepSparse, it is recommended to review the [Supported Hardware for DeepSparse](https://docs.neuralmagic.com/user-guides/deepsparse-engine/hardware-support) documentation. + +To ensure that you get the best performance from DeepSparse, it has been thoroughly tested on Python versions 3.7-3.10, ONNX versions 1.5.0-1.12.0, ONNX opset version 11 or higher, and manylinux compliant systems. It is highly recommended to use a [virtual environment](https://docs.python.org/3/library/venv.html) when running DeepSparse. Please note that DeepSparse is only supported natively on Linux. For those using Mac or Windows, running Linux in a Docker or virtual machine is necessary to use DeepSparse. 
+ ## Features +- 👩💻 Pipelines for [NLP](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/transformers), [CV Classification](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/image_classification), [CV Detection](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/yolo), [CV Segmentation](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/yolact) and more! - 🔌 [DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) - 📜 [DeepSparse Benchmark](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark) -- 👩💻 [NLP and Computer Vision Tasks Supported](https://github.com/neuralmagic/deepsparse/tree/main/examples) +- ☁️ [Cloud Deployments and Demos](https://github.com/neuralmagic/deepsparse/tree/main/examples) -## 🧰 Hardware Support and System Requirements +### 👩💻 Pipelines -Review [Supported Hardware for DeepSparse](https://docs.neuralmagic.com/user-guide/deepsparse-engine/hardware-support) to understand system requirements. -DeepSparse works natively on Linux; Mac and Windows require running Linux in a Docker or virtual machine; it will not run natively on those operating systems. +Pipelines are a high-level Python interface for running inference with DeepSparse across select tasks in NLP and CV: -DeepSparse is tested on Python 3.7-3.10, ONNX 1.5.0-1.12.0, ONNX opset version 11+, and manylinux compliant. -Using a [virtual environment](https://docs.python.org/3/library/venv.html) is highly recommended. +| NLP | CV | +|-----------------------|---------------------------| +| Text Classification `"text_classification"` | Image Classification `"image_classification"` | +| Token Classification `"token_classification"` | Object Detection `"yolo"` | +| Sentiment Analysis `"sentiment_analysis"` | Instance Segmentation `"yolact"` | +| Question Answering `"question_answering"` | Keypoint Detection `"open_pif_paf"` | +| MultiLabel Text Classification `"text_classification"` | | +| Document Classification `"text_classification"` | | +| Zero-Shot Text Classification `"zero_shot_text_classification"` | | -## Installation -Install DeepSparse Community as follows: +**NLP Example** | Question Answering +```python +from deepsparse import Pipeline -```bash -pip install deepsparse +qa_pipeline = Pipeline.create( + task="question-answering", + model_path="zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni", +) + +inference = qa_pipeline(question="What's my name?", context="My name is Snorlax") ``` +**CV Example** | Image Classification -To install the DeepSparse Enterprise, trial or inquire about licensing for DeepSparse Enterprise, see the [DeepSparse Enterprise documentation](https://docs.neuralmagic.com/products/deepsparse-ent). +```python +from deepsparse import Pipeline + +cv_pipeline = Pipeline.create( + task='image_classification', + model_path='zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95-none', +) + +input_image = "my_image.png" +inference = cv_pipeline(images=input_image) +``` -## Features ### 🔌 DeepSparse Server -DeepSparse Server allows you to serve models and pipelines from the terminal. The server runs on top of the popular FastAPI web framework and Uvicorn web server. Install the server using the following command: +DeepSparse Server is a tool that enables you to serve your models and pipelines directly from your terminal. 
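As a preview of what serving looks like from the client side, a running question-answering endpoint can be queried over plain HTTP. A minimal sketch, assuming the server's default port of 5543 and a `/predict` route (installation and startup commands follow below; the question and context values are illustrative):

```bash
# Assumes a question-answering endpoint is already running on localhost:5543.
curl -X POST http://localhost:5543/predict \
  -H "Content-Type: application/json" \
  -d '{"question": "What is my name?", "context": "My name is Snorlax"}'
```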
+ +The server is built on top of two powerful libraries: the FastAPI web framework and the Uvicorn web server. This combination ensures that DeepSparse Server delivers excellent performance and reliability. Install with this command: ```bash pip install deepsparse[server] @@ -121,10 +159,9 @@ deepsparse.server \ To look up arguments run: `deepsparse.server --help`. #### Multiple Models -To serve multiple models in your deployment you can easily build a `config.yaml`. In the example below, we define two BERT models in our configuration for the question answering task: +To deploy multiple models in your setup, a `config.yaml` file should be created. In the example provided, two BERT models are configured for the question-answering task: ```yaml -num_cores: 1 num_workers: 1 endpoints: - task: question_answering @@ -137,66 +174,36 @@ endpoints: batch_size: 1 ``` -Finally, after your `config.yaml` file is built, run the server with the config file path as an argument: +After the `config.yaml` file has been created, the server can be started by passing the file path as an argument: ```bash deepsparse.server config config.yaml ``` -[Getting Started with DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) for more info. +Read the [DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) README for further details. ### 📜 DeepSparse Benchmark -The benchmark tool is available on your CLI to run expressive model benchmarks on DeepSparse with minimal parameters. +DeepSparse Benchmark, a command-line (CLI) tool, is used to evaluate the DeepSparse Engine's performance with ONNX models. This tool processes arguments, downloads and compiles the network into the engine, creates input tensors, and runs the model based on the selected scenario. Run `deepsparse.benchmark -h` to look up arguments: ```shell -deepsparse.benchmark [-h] [-b BATCH_SIZE] [-shapes INPUT_SHAPES] - [-ncores NUM_CORES] [-s {async,sync}] [-t TIME] - [-nstreams NUM_STREAMS] [-pin {none,core,numa}] - [-q] [-x EXPORT_PATH] - model_path +deepsparse.benchmark [-h] [-b BATCH_SIZE] [-i INPUT_SHAPES] [-ncores NUM_CORES] [-s {async,sync,elastic}] [-t TIME] + [-w WARMUP_TIME] [-nstreams NUM_STREAMS] [-pin {none,core,numa}] [-e ENGINE] [-q] [-x EXPORT_PATH] + model_path ``` -[Getting Started with CLI Benchmarking](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark) includes examples of select inference scenarios: -- Synchronous (Single-stream) Scenario -- Asynchronous (Multi-stream) Scenario - -### 👩💻 NLP Inference Example - -```python -from deepsparse import Pipeline - -# SparseZoo model stub or path to ONNX file -model_path = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni" - -qa_pipeline = Pipeline.create( - task="question-answering", - model_path=model_path, -) - -my_name = qa_pipeline(question="What's my name?", context="My name is Snorlax") -``` +Refer to the [Benchmark](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark) README for examples of specific inference scenarios. 
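For example, a throughput-oriented benchmark of the question-answering model used in the Pipeline example above might look like the following sketch, using only flags from the synopsis (the batch size, stream count, and duration are illustrative):

```bash
deepsparse.benchmark \
  zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni \
  -b 64 -s async -nstreams 4 -t 30
```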
-NLP Tutorials: -- [Getting Started with Hugging Face Transformers 🤗](https://github.com/neuralmagic/deepsparse/tree/main/examples/huggingface-transformers) +### 🦉 Custom ONNX Model Support -Tasks Supported: -- [Token Classification: Named Entity Recognition](https://neuralmagic.com/use-cases/sparse-named-entity-recognition/) -- [Text Classification: Multi-Class](https://neuralmagic.com/use-cases/sparse-multi-class-text-classification/) -- [Text Classification: Binary](https://neuralmagic.com/use-cases/sparse-binary-text-classification/) -- [Text Classification: Sentiment Analysis](https://neuralmagic.com/use-cases/sparse-sentiment-analysis/) -- [Question Answering](https://neuralmagic.com/use-cases/sparse-question-answering/) +DeepSparse is capable of accepting ONNX models from two sources: -### 🦉 SparseZoo ONNX vs. Custom ONNX Models +**SparseZoo ONNX**: This is an open-source repository of sparse models available for download. [SparseZoo](https://github.com/neuralmagic/sparsezoo) offers inference-optimized models, which are trained using repeatable sparsification recipes and state-of-the-art techniques from [SparseML](https://github.com/neuralmagic/sparseml). -DeepSparse can accept ONNX models from two sources: - -- **SparseZoo ONNX**: our open-source collection of sparse models available for download. [SparseZoo](https://github.com/neuralmagic/sparsezoo) hosts inference-optimized models, trained on repeatable sparsification recipes using state-of-the-art techniques from [SparseML](https://github.com/neuralmagic/sparseml). - -- **Custom ONNX**: your own ONNX model, can be dense or sparse. Plug in your model to compare performance with other solutions. +**Custom ONNX**: Users can provide their own ONNX models, whether dense or sparse. By plugging in a custom model, users can compare its performance with other solutions. ```bash > wget https://github.com/onnx/models/raw/main/vision/classification/mobilenet/model/mobilenetv2-7.onnx @@ -218,53 +225,40 @@ engine = compile_model(onnx_filepath, batch_size) outputs = engine.run(inputs) ``` -The [GitHub repository](https://github.com/neuralmagic/deepsparse) includes package APIs along with examples to quickly get started benchmarking and inferencing sparse models. +The [GitHub repository](https://github.com/neuralmagic/deepsparse) repository contains package APIs and examples that help users swiftly begin benchmarking and performing inference on sparse models. ### Scheduling Single-Stream, Multi-Stream, and Elastic Inference -DeepSparse offers up to three types of inferences based on your use case. Read more details here: [Inference Types](https://github.com/neuralmagic/deepsparse/blob/main/docs/source/scheduler.md). +DeepSparse offers different inference scenarios based on your use case. Read more details here: [Inference Types](https://github.com/neuralmagic/deepsparse/blob/main/docs/source/scheduler.md). -1 ⚡ Single-stream scheduling: the latency/synchronous scenario, requests execute serially. [`default`] +⚡ **Single-stream** scheduling: the latency/synchronous scenario, requests execute serially. [`default`] <img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/single-stream.png" alt="single stream diagram" /> -Use Case: It's highly optimized for minimum per-request latency, using all of the system's resources provided to it on every request it gets. +It's highly optimized for minimum per-request latency, using all of the system's resources provided to it on every request it gets. 
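The scenario is chosen when the model is compiled. A minimal sketch building on the `compile_model` call shown earlier; note that the `scheduler` argument and its accepted values ("single_stream", "multi_stream", "elastic") are assumptions based on the Inference Types documentation linked above:

```python
from deepsparse import compile_model

onnx_filepath = "model.onnx"  # hypothetical path; a SparseZoo stub also works

# Assumption: compile_model accepts a `scheduler` argument selecting one of the
# scenarios described in this section; "single_stream" is the default.
engine = compile_model(onnx_filepath, batch_size=1, scheduler="multi_stream")
```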
-2 ⚡ Multi-stream scheduling: the throughput/asynchronous scenario, requests execute in parallel. +⚡ **Multi-stream** scheduling: the throughput/asynchronous scenario, requests execute in parallel. <img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/multi-stream.png" alt="multi stream diagram" /> -PRO TIP: The most common use cases for the multi-stream scheduler are where parallelism is low with respect to core count, and where requests need to be made asynchronously without time to batch them. - -3 ⚡ Elastic scheduling: requests execute in parallel, but not multiplexed on individual NUMA nodes. - -Use Case: A workload that might benefit from the elastic scheduler is one in which multiple requests need to be handled simultaneously, but where performance is hindered when those requests have to share an L3 cache. +The most common use cases for the multi-stream scheduler are where parallelism is low with respect to core count, and where requests need to be made asynchronously without time to batch them. ## Resources #### Libraries - [DeepSparse](https://docs.neuralmagic.com/deepsparse/) - - [SparseML](https://docs.neuralmagic.com/sparseml/) - - [SparseZoo](https://docs.neuralmagic.com/sparsezoo/) - - [Sparsify](https://docs.neuralmagic.com/sparsify/) - #### Versions - [DeepSparse](https://pypi.org/project/deepsparse) | stable - - [DeepSparse-Nightly](https://pypi.org/project/deepsparse-nightly/) | nightly (dev) - - [GitHub](https://github.com/neuralmagic/deepsparse/releases) | releases #### Info - - [Blog](https://www.neuralmagic.com/blog/) - - [Resources](https://www.neuralmagic.com/resources/) - ## Community ### Be Part of the Future... And the Future is Sparse! @@ -353,77 +347,115 @@ limitations under the License. <img alt="tool icon" src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/icon-deepsparse.png" /> DeepSparse </h1> - <h3> An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application</h3> - <div style="display: flex; align-items: center; justify-content: center; flex-wrap: wrap"> + <h4> An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application</h4> + <div align="center"> <a href="https://docs.neuralmagic.com/deepsparse/"> - <img alt="Documentation" src="https://img.shields.io/badge/documentation-darkred?&style=for-the-badge&logo=read-the-docs" height="25" /> + <img alt="Documentation" src="https://img.shields.io/badge/documentation-darkred?&style=for-the-badge&logo=read-the-docs" height="20" /> </a> <a href="https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ/"> - <img alt="Slack" src="https://img.shields.io/badge/slack-purple?style=for-the-badge&logo=slack" height="25" /> + <img alt="Slack" src="https://img.shields.io/badge/slack-purple?style=for-the-badge&logo=slack" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/issues/"> - <img alt="Support" src="https://img.shields.io/badge/support%20forums-navy?style=for-the-badge&logo=github" height="25" /> + <img alt="Support" src="https://img.shields.io/badge/support%20forums-navy?style=for-the-badge&logo=github" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/actions/workflows/quality-check.yaml"> - <img alt="Main" src="https://img.shields.io/github/workflow/status/neuralmagic/deepsparse/Quality%20Checks/main?label=build&style=for-the-badge" height="25" /> + <img alt="Main" 
src="https://img.shields.io/github/workflow/status/neuralmagic/deepsparse/Quality%20Checks/main?label=build&style=for-the-badge" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/releases"> - <img alt="GitHub release" src="https://img.shields.io/github/release/neuralmagic/deepsparse.svg?style=for-the-badge" height="25" /> + <img alt="GitHub release" src="https://img.shields.io/github/release/neuralmagic/deepsparse.svg?style=for-the-badge" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/blob/main/CODE_OF_CONDUCT.md"> - <img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg?color=yellow&style=for-the-badge" height="25" /> + <img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg?color=yellow&style=for-the-badge" height="20" /> </a> <a href="https://www.youtube.com/channel/UCo8dO_WMGYbWCRnj_Dxr4EA"> - <img alt="YouTube" src="https://img.shields.io/badge/-YouTube-red?&style=for-the-badge&logo=youtube&logoColor=white" height="25" /> + <img alt="YouTube" src="https://img.shields.io/badge/-YouTube-red?&style=for-the-badge&logo=youtube&logoColor=white" height="20" /> </a> <a href="https://medium.com/limitlessai"> - <img alt="Medium" src="https://img.shields.io/badge/medium-%2312100E.svg?&style=for-the-badge&logo=medium&logoColor=white" height="25" /> + <img alt="Medium" src="https://img.shields.io/badge/medium-%2312100E.svg?&style=for-the-badge&logo=medium&logoColor=white" height="20" /> </a> <a href="https://twitter.com/neuralmagic"> - <img alt="Twitter" src="https://img.shields.io/twitter/follow/neuralmagic?color=darkgreen&label=Follow&style=social" height="25" /> + <img alt="Twitter" src="https://img.shields.io/twitter/follow/neuralmagic?color=darkgreen&label=Follow&style=social" height="20" /> </a> </div> </div> -A CPU runtime that takes advantage of sparsity within neural networks to reduce compute. Read [more about sparsification](https://docs.neuralmagic.com/user-guide/sparsification). +A CPU runtime that takes advantage of sparsity within neural networks to reduce compute. Read [more about sparsification](https://docs.neuralmagic.com/user-guides/sparsification). Neural Magic's DeepSparse is able to integrate into popular deep learning libraries (e.g., Hugging Face, Ultralytics) allowing you to leverage DeepSparse for loading and deploying sparse models with ONNX. ONNX gives the flexibility to serve your model in a framework-agnostic environment. Support includes [PyTorch,](https://pytorch.org/docs/stable/onnx.html) [TensorFlow,](https://github.com/onnx/tensorflow-onnx) [Keras,](https://github.com/onnx/keras-onnx) and [many other frameworks](https://github.com/onnx/onnxmltools). +## Installation + +Install DeepSparse Community as follows: + +```bash +pip install deepsparse +``` + DeepSparse is available in two editions: 1. [**DeepSparse Community**](#installation) is open-source and free for evaluation, research, and non-production use with our [DeepSparse Community License](https://neuralmagic.com/legal/engine-license-agreement/). 2. [**DeepSparse Enterprise**](https://docs.neuralmagic.com/products/deepsparse-ent) requires a Trial License or [can be fully licensed](https://neuralmagic.com/legal/master-software-license-and-service-agreement/) for production, commercial applications. 
+## 🧰 Hardware Support and System Requirements + +To ensure that your CPU is compatible with DeepSparse, it is recommended to review the [Supported Hardware for DeepSparse](https://docs.neuralmagic.com/user-guides/deepsparse-engine/hardware-support) documentation. + +To ensure that you get the best performance from DeepSparse, it has been thoroughly tested on Python versions 3.7-3.10, ONNX versions 1.5.0-1.12.0, ONNX opset version 11 or higher, and manylinux compliant systems. It is highly recommended to use a [virtual environment](https://docs.python.org/3/library/venv.html) when running DeepSparse. Please note that DeepSparse is only supported natively on Linux. For those using Mac or Windows, running Linux in a Docker or virtual machine is necessary to use DeepSparse. + ## Features +- 👩💻 Pipelines for [NLP](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/transformers), [CV Classification](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/image_classification), [CV Detection](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/yolo), [CV Segmentation](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/yolact) and more! - 🔌 [DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) - 📜 [DeepSparse Benchmark](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark) -- 👩💻 [NLP and Computer Vision Tasks Supported](https://github.com/neuralmagic/deepsparse/tree/main/examples) +- ☁️ [Cloud Deployments and Demos](https://github.com/neuralmagic/deepsparse/tree/main/examples) -## 🧰 Hardware Support and System Requirements +### 👩💻 Pipelines -Review [Supported Hardware for DeepSparse](https://docs.neuralmagic.com/user-guide/deepsparse-engine/hardware-support) to understand system requirements. -DeepSparse works natively on Linux; Mac and Windows require running Linux in a Docker or virtual machine; it will not run natively on those operating systems. +Pipelines are a high-level Python interface for running inference with DeepSparse across select tasks in NLP and CV: -DeepSparse is tested on Python 3.7-3.10, ONNX 1.5.0-1.12.0, ONNX opset version 11+, and manylinux compliant. -Using a [virtual environment](https://docs.python.org/3/library/venv.html) is highly recommended. 
+| NLP | CV | +|-----------------------|---------------------------| +| Text Classification `"text_classification"` | Image Classification `"image_classification"` | +| Token Classification `"token_classification"` | Object Detection `"yolo"` | +| Sentiment Analysis `"sentiment_analysis"` | Instance Segmentation `"yolact"` | +| Question Answering `"question_answering"` | Keypoint Detection `"open_pif_paf"` | +| MultiLabel Text Classification `"text_classification"` | | +| Document Classification `"text_classification"` | | +| Zero-Shot Text Classification `"zero_shot_text_classification"` | | -## Installation -Install DeepSparse Community as follows: +**NLP Example** | Question Answering +```python +from deepsparse import Pipeline -```bash -pip install deepsparse +qa_pipeline = Pipeline.create( + task="question-answering", + model_path="zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni", +) + +inference = qa_pipeline(question="What's my name?", context="My name is Snorlax") ``` +**CV Example** | Image Classification -To install the DeepSparse Enterprise, trial or inquire about licensing for DeepSparse Enterprise, see the [DeepSparse Enterprise documentation](https://docs.neuralmagic.com/products/deepsparse-ent). +```python +from deepsparse import Pipeline + +cv_pipeline = Pipeline.create( + task='image_classification', + model_path='zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95-none', +) + +input_image = "my_image.png" +inference = cv_pipeline(images=input_image) +``` -## Features ### 🔌 DeepSparse Server -DeepSparse Server allows you to serve models and pipelines from the terminal. The server runs on top of the popular FastAPI web framework and Uvicorn web server. Install the server using the following command: +DeepSparse Server is a tool that enables you to serve your models and pipelines directly from your terminal. + +The server is built on top of two powerful libraries: the FastAPI web framework and the Uvicorn web server. This combination ensures that DeepSparse Server delivers excellent performance and reliability. Install with this command: ```bash pip install deepsparse[server] @@ -442,10 +474,9 @@ deepsparse.server \ To look up arguments run: `deepsparse.server --help`. #### Multiple Models -To serve multiple models in your deployment you can easily build a `config.yaml`. In the example below, we define two BERT models in our configuration for the question answering task: +To deploy multiple models in your setup, a `config.yaml` file should be created. In the example provided, two BERT models are configured for the question-answering task: ```yaml -num_cores: 1 num_workers: 1 endpoints: - task: question_answering @@ -458,66 +489,36 @@ endpoints: batch_size: 1 ``` -Finally, after your `config.yaml` file is built, run the server with the config file path as an argument: +After the `config.yaml` file has been created, the server can be started by passing the file path as an argument: ```bash deepsparse.server config config.yaml ``` -[Getting Started with DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) for more info. +Read the [DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) README for further details. ### 📜 DeepSparse Benchmark -The benchmark tool is available on your CLI to run expressive model benchmarks on DeepSparse with minimal parameters. 
+DeepSparse Benchmark, a command-line (CLI) tool, is used to evaluate the DeepSparse Engine's performance with ONNX models. This tool processes arguments, downloads and compiles the network into the engine, creates input tensors, and runs the model based on the selected scenario. Run `deepsparse.benchmark -h` to look up arguments: ```shell -deepsparse.benchmark [-h] [-b BATCH_SIZE] [-shapes INPUT_SHAPES] - [-ncores NUM_CORES] [-s {async,sync}] [-t TIME] - [-nstreams NUM_STREAMS] [-pin {none,core,numa}] - [-q] [-x EXPORT_PATH] - model_path +deepsparse.benchmark [-h] [-b BATCH_SIZE] [-i INPUT_SHAPES] [-ncores NUM_CORES] [-s {async,sync,elastic}] [-t TIME] + [-w WARMUP_TIME] [-nstreams NUM_STREAMS] [-pin {none,core,numa}] [-e ENGINE] [-q] [-x EXPORT_PATH] + model_path ``` -[Getting Started with CLI Benchmarking](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark) includes examples of select inference scenarios: -- Synchronous (Single-stream) Scenario -- Asynchronous (Multi-stream) Scenario +Refer to the [Benchmark](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark) README for examples of specific inference scenarios. -### 👩💻 NLP Inference Example - -```python -from deepsparse import Pipeline +### 🦉 Custom ONNX Model Support -# SparseZoo model stub or path to ONNX file -model_path = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni" +DeepSparse is capable of accepting ONNX models from two sources: -qa_pipeline = Pipeline.create( - task="question-answering", - model_path=model_path, -) - -my_name = qa_pipeline(question="What's my name?", context="My name is Snorlax") -``` +**SparseZoo ONNX**: This is an open-source repository of sparse models available for download. [SparseZoo](https://github.com/neuralmagic/sparsezoo) offers inference-optimized models, which are trained using repeatable sparsification recipes and state-of-the-art techniques from [SparseML](https://github.com/neuralmagic/sparseml). -NLP Tutorials: -- [Getting Started with Hugging Face Transformers 🤗](https://github.com/neuralmagic/deepsparse/tree/main/examples/huggingface-transformers) - -Tasks Supported: -- [Token Classification: Named Entity Recognition](https://neuralmagic.com/use-cases/sparse-named-entity-recognition/) -- [Text Classification: Multi-Class](https://neuralmagic.com/use-cases/sparse-multi-class-text-classification/) -- [Text Classification: Binary](https://neuralmagic.com/use-cases/sparse-binary-text-classification/) -- [Text Classification: Sentiment Analysis](https://neuralmagic.com/use-cases/sparse-sentiment-analysis/) -- [Question Answering](https://neuralmagic.com/use-cases/sparse-question-answering/) - -### 🦉 SparseZoo ONNX vs. Custom ONNX Models - -DeepSparse can accept ONNX models from two sources: - -- **SparseZoo ONNX**: our open-source collection of sparse models available for download. [SparseZoo](https://github.com/neuralmagic/sparsezoo) hosts inference-optimized models, trained on repeatable sparsification recipes using state-of-the-art techniques from [SparseML](https://github.com/neuralmagic/sparseml). - -- **Custom ONNX**: your own ONNX model, can be dense or sparse. Plug in your model to compare performance with other solutions. +**Custom ONNX**: Users can provide their own ONNX models, whether dense or sparse. By plugging in a custom model, users can compare its performance with other solutions. 
```bash > wget https://github.com/onnx/models/raw/main/vision/classification/mobilenet/model/mobilenetv2-7.onnx @@ -539,53 +540,40 @@ engine = compile_model(onnx_filepath, batch_size) outputs = engine.run(inputs) ``` -The [GitHub repository](https://github.com/neuralmagic/deepsparse) includes package APIs along with examples to quickly get started benchmarking and inferencing sparse models. +The [GitHub repository](https://github.com/neuralmagic/deepsparse) repository contains package APIs and examples that help users swiftly begin benchmarking and performing inference on sparse models. ### Scheduling Single-Stream, Multi-Stream, and Elastic Inference -DeepSparse offers up to three types of inferences based on your use case. Read more details here: [Inference Types](https://github.com/neuralmagic/deepsparse/blob/main/docs/source/scheduler.md). +DeepSparse offers different inference scenarios based on your use case. Read more details here: [Inference Types](https://github.com/neuralmagic/deepsparse/blob/main/docs/source/scheduler.md). -1 ⚡ Single-stream scheduling: the latency/synchronous scenario, requests execute serially. [`default`] +⚡ **Single-stream** scheduling: the latency/synchronous scenario, requests execute serially. [`default`] <img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/single-stream.png" alt="single stream diagram" /> -Use Case: It's highly optimized for minimum per-request latency, using all of the system's resources provided to it on every request it gets. +It's highly optimized for minimum per-request latency, using all of the system's resources provided to it on every request it gets. -2 ⚡ Multi-stream scheduling: the throughput/asynchronous scenario, requests execute in parallel. +⚡ **Multi-stream** scheduling: the throughput/asynchronous scenario, requests execute in parallel. <img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/multi-stream.png" alt="multi stream diagram" /> -PRO TIP: The most common use cases for the multi-stream scheduler are where parallelism is low with respect to core count, and where requests need to be made asynchronously without time to batch them. - -3 ⚡ Elastic scheduling: requests execute in parallel, but not multiplexed on individual NUMA nodes. - -Use Case: A workload that might benefit from the elastic scheduler is one in which multiple requests need to be handled simultaneously, but where performance is hindered when those requests have to share an L3 cache. +The most common use cases for the multi-stream scheduler are where parallelism is low with respect to core count, and where requests need to be made asynchronously without time to batch them. ## Resources #### Libraries - [DeepSparse](https://docs.neuralmagic.com/deepsparse/) - - [SparseML](https://docs.neuralmagic.com/sparseml/) - - [SparseZoo](https://docs.neuralmagic.com/sparsezoo/) - - [Sparsify](https://docs.neuralmagic.com/sparsify/) - #### Versions - [DeepSparse](https://pypi.org/project/deepsparse) | stable - - [DeepSparse-Nightly](https://pypi.org/project/deepsparse-nightly/) | nightly (dev) - - [GitHub](https://github.com/neuralmagic/deepsparse/releases) | releases #### Info - - [Blog](https://www.neuralmagic.com/blog/) - - [Resources](https://www.neuralmagic.com/resources/) - ## Community ### Be Part of the Future... And the Future is Sparse! @@ -671,77 +659,115 @@ limitations under the License. 
<img alt="tool icon" src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/icon-deepsparse.png" /> DeepSparse </h1> - <h3> An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application</h3> - <div style="display: flex; align-items: center; justify-content: center; flex-wrap: wrap"> + <h4> An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application</h4> + <div align="center"> <a href="https://docs.neuralmagic.com/deepsparse/"> - <img alt="Documentation" src="https://img.shields.io/badge/documentation-darkred?&style=for-the-badge&logo=read-the-docs" height="25" /> + <img alt="Documentation" src="https://img.shields.io/badge/documentation-darkred?&style=for-the-badge&logo=read-the-docs" height="20" /> </a> <a href="https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ/"> - <img alt="Slack" src="https://img.shields.io/badge/slack-purple?style=for-the-badge&logo=slack" height="25" /> + <img alt="Slack" src="https://img.shields.io/badge/slack-purple?style=for-the-badge&logo=slack" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/issues/"> - <img alt="Support" src="https://img.shields.io/badge/support%20forums-navy?style=for-the-badge&logo=github" height="25" /> + <img alt="Support" src="https://img.shields.io/badge/support%20forums-navy?style=for-the-badge&logo=github" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/actions/workflows/quality-check.yaml"> - <img alt="Main" src="https://img.shields.io/github/workflow/status/neuralmagic/deepsparse/Quality%20Checks/main?label=build&style=for-the-badge" height="25" /> + <img alt="Main" src="https://img.shields.io/github/workflow/status/neuralmagic/deepsparse/Quality%20Checks/main?label=build&style=for-the-badge" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/releases"> - <img alt="GitHub release" src="https://img.shields.io/github/release/neuralmagic/deepsparse.svg?style=for-the-badge" height="25" /> + <img alt="GitHub release" src="https://img.shields.io/github/release/neuralmagic/deepsparse.svg?style=for-the-badge" height="20" /> </a> <a href="https://github.com/neuralmagic/deepsparse/blob/main/CODE_OF_CONDUCT.md"> - <img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg?color=yellow&style=for-the-badge" height="25" /> + <img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg?color=yellow&style=for-the-badge" height="20" /> </a> <a href="https://www.youtube.com/channel/UCo8dO_WMGYbWCRnj_Dxr4EA"> - <img alt="YouTube" src="https://img.shields.io/badge/-YouTube-red?&style=for-the-badge&logo=youtube&logoColor=white" height="25" /> + <img alt="YouTube" src="https://img.shields.io/badge/-YouTube-red?&style=for-the-badge&logo=youtube&logoColor=white" height="20" /> </a> <a href="https://medium.com/limitlessai"> - <img alt="Medium" src="https://img.shields.io/badge/medium-%2312100E.svg?&style=for-the-badge&logo=medium&logoColor=white" height="25" /> + <img alt="Medium" src="https://img.shields.io/badge/medium-%2312100E.svg?&style=for-the-badge&logo=medium&logoColor=white" height="20" /> </a> <a href="https://twitter.com/neuralmagic"> - <img alt="Twitter" src="https://img.shields.io/twitter/follow/neuralmagic?color=darkgreen&label=Follow&style=social" height="25" /> + <img alt="Twitter" 
src="https://img.shields.io/twitter/follow/neuralmagic?color=darkgreen&label=Follow&style=social" height="20" /> </a> </div> </div> -A CPU runtime that takes advantage of sparsity within neural networks to reduce compute. Read [more about sparsification](https://docs.neuralmagic.com/user-guide/sparsification). +A CPU runtime that takes advantage of sparsity within neural networks to reduce compute. Read [more about sparsification](https://docs.neuralmagic.com/user-guides/sparsification). Neural Magic's DeepSparse is able to integrate into popular deep learning libraries (e.g., Hugging Face, Ultralytics) allowing you to leverage DeepSparse for loading and deploying sparse models with ONNX. ONNX gives the flexibility to serve your model in a framework-agnostic environment. Support includes [PyTorch,](https://pytorch.org/docs/stable/onnx.html) [TensorFlow,](https://github.com/onnx/tensorflow-onnx) [Keras,](https://github.com/onnx/keras-onnx) and [many other frameworks](https://github.com/onnx/onnxmltools). +## Installation + +Install DeepSparse Community as follows: + +```bash +pip install deepsparse +``` + DeepSparse is available in two editions: 1. [**DeepSparse Community**](#installation) is open-source and free for evaluation, research, and non-production use with our [DeepSparse Community License](https://neuralmagic.com/legal/engine-license-agreement/). 2. [**DeepSparse Enterprise**](https://docs.neuralmagic.com/products/deepsparse-ent) requires a Trial License or [can be fully licensed](https://neuralmagic.com/legal/master-software-license-and-service-agreement/) for production, commercial applications. +## 🧰 Hardware Support and System Requirements + +To ensure that your CPU is compatible with DeepSparse, it is recommended to review the [Supported Hardware for DeepSparse](https://docs.neuralmagic.com/user-guides/deepsparse-engine/hardware-support) documentation. + +To ensure that you get the best performance from DeepSparse, it has been thoroughly tested on Python versions 3.7-3.10, ONNX versions 1.5.0-1.12.0, ONNX opset version 11 or higher, and manylinux compliant systems. It is highly recommended to use a [virtual environment](https://docs.python.org/3/library/venv.html) when running DeepSparse. Please note that DeepSparse is only supported natively on Linux. For those using Mac or Windows, running Linux in a Docker or virtual machine is necessary to use DeepSparse. + ## Features +- 👩💻 Pipelines for [NLP](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/transformers), [CV Classification](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/image_classification), [CV Detection](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/yolo), [CV Segmentation](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/yolact) and more! - 🔌 [DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) - 📜 [DeepSparse Benchmark](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark) -- 👩💻 [NLP and Computer Vision Tasks Supported](https://github.com/neuralmagic/deepsparse/tree/main/examples) +- ☁️ [Cloud Deployments and Demos](https://github.com/neuralmagic/deepsparse/tree/main/examples) -## 🧰 Hardware Support and System Requirements +### 👩💻 Pipelines -Review [Supported Hardware for DeepSparse](https://docs.neuralmagic.com/user-guide/deepsparse-engine/hardware-support) to understand system requirements. 
-DeepSparse works natively on Linux; Mac and Windows require running Linux in a Docker or virtual machine; it will not run natively on those operating systems. +Pipelines are a high-level Python interface for running inference with DeepSparse across select tasks in NLP and CV: -DeepSparse is tested on Python 3.7-3.10, ONNX 1.5.0-1.12.0, ONNX opset version 11+, and manylinux compliant. -Using a [virtual environment](https://docs.python.org/3/library/venv.html) is highly recommended. +| NLP | CV | +|-----------------------|---------------------------| +| Text Classification `"text_classification"` | Image Classification `"image_classification"` | +| Token Classification `"token_classification"` | Object Detection `"yolo"` | +| Sentiment Analysis `"sentiment_analysis"` | Instance Segmentation `"yolact"` | +| Question Answering `"question_answering"` | Keypoint Detection `"open_pif_paf"` | +| MultiLabel Text Classification `"text_classification"` | | +| Document Classification `"text_classification"` | | +| Zero-Shot Text Classification `"zero_shot_text_classification"` | | -## Installation -Install DeepSparse Community as follows: +**NLP Example** | Question Answering +```python +from deepsparse import Pipeline -```bash -pip install deepsparse +qa_pipeline = Pipeline.create( + task="question-answering", + model_path="zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni", +) + +inference = qa_pipeline(question="What's my name?", context="My name is Snorlax") ``` +**CV Example** | Image Classification -To install the DeepSparse Enterprise, trial or inquire about licensing for DeepSparse Enterprise, see the [DeepSparse Enterprise documentation](https://docs.neuralmagic.com/products/deepsparse-ent). +```python +from deepsparse import Pipeline + +cv_pipeline = Pipeline.create( + task='image_classification', + model_path='zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95-none', +) + +input_image = "my_image.png" +inference = cv_pipeline(images=input_image) +``` -## Features ### 🔌 DeepSparse Server -DeepSparse Server allows you to serve models and pipelines from the terminal. The server runs on top of the popular FastAPI web framework and Uvicorn web server. Install the server using the following command: +DeepSparse Server is a tool that enables you to serve your models and pipelines directly from your terminal. + +The server is built on top of two powerful libraries: the FastAPI web framework and the Uvicorn web server. This combination ensures that DeepSparse Server delivers excellent performance and reliability. Install with this command: ```bash pip install deepsparse[server] @@ -760,10 +786,9 @@ deepsparse.server \ To look up arguments run: `deepsparse.server --help`. #### Multiple Models -To serve multiple models in your deployment you can easily build a `config.yaml`. In the example below, we define two BERT models in our configuration for the question answering task: +To deploy multiple models in your setup, a `config.yaml` file should be created. 
In the example provided, two BERT models are configured for the question-answering task: ```yaml -num_cores: 1 num_workers: 1 endpoints: - task: question_answering @@ -776,66 +801,36 @@ endpoints: batch_size: 1 ``` -Finally, after your `config.yaml` file is built, run the server with the config file path as an argument: +After the `config.yaml` file has been created, the server can be started by passing the file path as an argument: ```bash deepsparse.server config config.yaml ``` -[Getting Started with DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) for more info. +Read the [DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) README for further details. ### 📜 DeepSparse Benchmark -The benchmark tool is available on your CLI to run expressive model benchmarks on DeepSparse with minimal parameters. +DeepSparse Benchmark, a command-line (CLI) tool, is used to evaluate the DeepSparse Engine's performance with ONNX models. This tool processes arguments, downloads and compiles the network into the engine, creates input tensors, and runs the model based on the selected scenario. Run `deepsparse.benchmark -h` to look up arguments: ```shell -deepsparse.benchmark [-h] [-b BATCH_SIZE] [-shapes INPUT_SHAPES] - [-ncores NUM_CORES] [-s {async,sync}] [-t TIME] - [-nstreams NUM_STREAMS] [-pin {none,core,numa}] - [-q] [-x EXPORT_PATH] - model_path +deepsparse.benchmark [-h] [-b BATCH_SIZE] [-i INPUT_SHAPES] [-ncores NUM_CORES] [-s {async,sync,elastic}] [-t TIME] + [-w WARMUP_TIME] [-nstreams NUM_STREAMS] [-pin {none,core,numa}] [-e ENGINE] [-q] [-x EXPORT_PATH] + model_path ``` -[Getting Started with CLI Benchmarking](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark) includes examples of select inference scenarios: -- Synchronous (Single-stream) Scenario -- Asynchronous (Multi-stream) Scenario +Refer to the [Benchmark](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark) README for examples of specific inference scenarios. -### 👩💻 NLP Inference Example +### 🦉 Custom ONNX Model Support -```python -from deepsparse import Pipeline +DeepSparse is capable of accepting ONNX models from two sources: -# SparseZoo model stub or path to ONNX file -model_path = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni" +**SparseZoo ONNX**: This is an open-source repository of sparse models available for download. [SparseZoo](https://github.com/neuralmagic/sparsezoo) offers inference-optimized models, which are trained using repeatable sparsification recipes and state-of-the-art techniques from [SparseML](https://github.com/neuralmagic/sparseml). 
-qa_pipeline = Pipeline.create( - task="question-answering", - model_path=model_path, -) - -my_name = qa_pipeline(question="What's my name?", context="My name is Snorlax") -``` - -NLP Tutorials: -- [Getting Started with Hugging Face Transformers 🤗](https://github.com/neuralmagic/deepsparse/tree/main/examples/huggingface-transformers) - -Tasks Supported: -- [Token Classification: Named Entity Recognition](https://neuralmagic.com/use-cases/sparse-named-entity-recognition/) -- [Text Classification: Multi-Class](https://neuralmagic.com/use-cases/sparse-multi-class-text-classification/) -- [Text Classification: Binary](https://neuralmagic.com/use-cases/sparse-binary-text-classification/) -- [Text Classification: Sentiment Analysis](https://neuralmagic.com/use-cases/sparse-sentiment-analysis/) -- [Question Answering](https://neuralmagic.com/use-cases/sparse-question-answering/) - -### 🦉 SparseZoo ONNX vs. Custom ONNX Models - -DeepSparse can accept ONNX models from two sources: - -- **SparseZoo ONNX**: our open-source collection of sparse models available for download. [SparseZoo](https://github.com/neuralmagic/sparsezoo) hosts inference-optimized models, trained on repeatable sparsification recipes using state-of-the-art techniques from [SparseML](https://github.com/neuralmagic/sparseml). - -- **Custom ONNX**: your own ONNX model, can be dense or sparse. Plug in your model to compare performance with other solutions. +**Custom ONNX**: Users can provide their own ONNX models, whether dense or sparse. By plugging in a custom model, users can compare its performance with other solutions. ```bash > wget https://github.com/onnx/models/raw/main/vision/classification/mobilenet/model/mobilenetv2-7.onnx @@ -857,53 +852,40 @@ engine = compile_model(onnx_filepath, batch_size) outputs = engine.run(inputs) ``` -The [GitHub repository](https://github.com/neuralmagic/deepsparse) includes package APIs along with examples to quickly get started benchmarking and inferencing sparse models. +The [GitHub repository](https://github.com/neuralmagic/deepsparse) repository contains package APIs and examples that help users swiftly begin benchmarking and performing inference on sparse models. ### Scheduling Single-Stream, Multi-Stream, and Elastic Inference -DeepSparse offers up to three types of inferences based on your use case. Read more details here: [Inference Types](https://github.com/neuralmagic/deepsparse/blob/main/docs/source/scheduler.md). +DeepSparse offers different inference scenarios based on your use case. Read more details here: [Inference Types](https://github.com/neuralmagic/deepsparse/blob/main/docs/source/scheduler.md). -1 ⚡ Single-stream scheduling: the latency/synchronous scenario, requests execute serially. [`default`] +⚡ **Single-stream** scheduling: the latency/synchronous scenario, requests execute serially. [`default`] <img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/single-stream.png" alt="single stream diagram" /> -Use Case: It's highly optimized for minimum per-request latency, using all of the system's resources provided to it on every request it gets. +It's highly optimized for minimum per-request latency, using all of the system's resources provided to it on every request it gets. -2 ⚡ Multi-stream scheduling: the throughput/asynchronous scenario, requests execute in parallel. +⚡ **Multi-stream** scheduling: the throughput/asynchronous scenario, requests execute in parallel. 
<img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/multi-stream.png" alt="multi stream diagram" /> -PRO TIP: The most common use cases for the multi-stream scheduler are where parallelism is low with respect to core count, and where requests need to be made asynchronously without time to batch them. - -3 ⚡ Elastic scheduling: requests execute in parallel, but not multiplexed on individual NUMA nodes. - -Use Case: A workload that might benefit from the elastic scheduler is one in which multiple requests need to be handled simultaneously, but where performance is hindered when those requests have to share an L3 cache. +The most common use cases for the multi-stream scheduler are where parallelism is low with respect to core count, and where requests need to be made asynchronously without time to batch them. ## Resources #### Libraries - [DeepSparse](https://docs.neuralmagic.com/deepsparse/) - - [SparseML](https://docs.neuralmagic.com/sparseml/) - - [SparseZoo](https://docs.neuralmagic.com/sparsezoo/) - - [Sparsify](https://docs.neuralmagic.com/sparsify/) - #### Versions - [DeepSparse](https://pypi.org/project/deepsparse) | stable - - [DeepSparse-Nightly](https://pypi.org/project/deepsparse-nightly/) | nightly (dev) - - [GitHub](https://github.com/neuralmagic/deepsparse/releases) | releases #### Info - - [Blog](https://www.neuralmagic.com/blog/) - - [Resources](https://www.neuralmagic.com/resources/) - ## Community ### Be Part of the Future... And the Future is Sparse! @@ -964,7 +946,7 @@ Find this project useful in your research or other communications? Please consid %prep -%autosetup -n deepsparse-1.4.2 +%autosetup -n deepsparse-1.5.0 %build %py3_build @@ -978,20 +960,20 @@ if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi pushd %{buildroot} if [ -d usr/lib ]; then - find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst + find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst fi if [ -d usr/lib64 ]; then - find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst + find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst fi if [ -d usr/bin ]; then - find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst + find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst fi if [ -d usr/sbin ]; then - find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst + find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst fi touch doclist.lst if [ -d usr/share/man ]; then - find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst + find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst fi popd mv %{buildroot}/filelist.lst . @@ -1004,5 +986,5 @@ mv %{buildroot}/doclist.lst . %{_docdir}/* %changelog -* Tue May 30 2023 Python_Bot <Python_Bot@openeuler.org> - 1.4.2-1 +* Thu Jun 08 2023 Python_Bot <Python_Bot@openeuler.org> - 1.5.0-1 - Package Spec generated @@ -1 +1 @@ -ac5922c572828c283ca9cc7b612a9279 deepsparse-1.4.2.tar.gz +f536e5e06b2374c5b69a636afcf2b7a7 deepsparse-1.5.0.tar.gz |
