%global _empty_manifest_terminate_build 0 Name: python-paddlenlp Version: 2.5.2 Release: 1 Summary: Easy-to-use and powerful NLP library with Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including Neural Search, Question Answering, Information Extraction and Sentiment Analysis end-to-end system. License: Apache 2.0 URL: https://github.com/PaddlePaddle/PaddleNLP Source0: https://mirrors.nju.edu.cn/pypi/web/packages/ac/30/204ab2e0e01222060db5684041d5c4a73dec0adb622e3397f88d15b38f94/paddlenlp-2.5.2.tar.gz BuildArch: noarch Requires: python3-jieba Requires: python3-colorlog Requires: python3-colorama Requires: python3-seqeval Requires: python3-dill Requires: python3-multiprocess Requires: python3-datasets Requires: python3-tqdm Requires: python3-paddlefsl Requires: python3-sentencepiece Requires: python3-huggingface-hub Requires: python3-paddle2onnx Requires: python3-Flask-Babel Requires: python3-visualdl Requires: python3-fastapi Requires: python3-uvicorn Requires: python3-typer Requires: python3-rich Requires: python3-ray[tune] Requires: python3-hyperopt Requires: python3-parameterized Requires: python3-sentencepiece Requires: python3-regex Requires: python3-torch Requires: python3-transformers Requires: python3-fast-tokenizer-python Requires: python3-jinja2 Requires: python3-sphinx Requires: python3-sphinx-rtd-theme Requires: python3-readthedocs-sphinx-search Requires: python3-Markdown Requires: python3-sphinx-copybutton Requires: python3-sphinx-markdown-tables Requires: python3-paddlepaddle Requires: python3-ray[tune] Requires: python3-hyperopt Requires: python3-jinja2 Requires: python3-sphinx Requires: python3-sphinx-rtd-theme Requires: python3-readthedocs-sphinx-search Requires: python3-Markdown Requires: python3-sphinx-copybutton Requires: python3-sphinx-markdown-tables Requires: python3-paddlepaddle Requires: python3-parameterized Requires: python3-sentencepiece Requires: python3-regex Requires: python3-torch Requires: python3-transformers Requires: python3-fast-tokenizer-python %description

Features | Installation | Quick Start | API Reference | Community

**PaddleNLP** is an *easy-to-use* and *powerful* NLP library with **Awesome** pre-trained model zoo, supporting wide-range of NLP tasks from research to industrial applications. ## News ๐Ÿ“ข * ๐Ÿ”ฅ **Latest Features** * ๐Ÿ“ƒ Release **[UIE-X](./applications/information_extraction)**, an universal information extraction model that supports both document and text inputs. * โฃ๏ธRelease **[Opinion Mining and Sentiment Analysis Models](./applications/sentiment_analysis/unified_sentiment_extraction)** based on UIE, including abilities of sentence-level and aspect-based sentiment classification, attribute extraction, opinion extraction, attribute aggregation and implicit opinion extraction. * **2022.9.6 [PaddleNLPv2.4](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.4.0) Released!** * ๐Ÿ’Ž NLP Tools: Released **[Pipelines](./pipelines)** which supports turn-key construction of search engine and question answering systems. It features a flexible design that is applicable for all kinds of NLP systems so you can build end-to-end NLP pipelines like Legos! * ๐Ÿ”จ Industrial application: Release **[Complete Solution of Text Classification](./applications/text_classification)** covering various scenarios of text classification: multi-class, multi-label and hierarchical, it also supports **few-shot learning** and the training and optimization of **TrustAI**. Upgrade for [**UIE**](./model_zoo/uie) and release **UIE-M**, support both Chinese and English information extraction in a single model; release the data distillation solution for UIE to break the bottleneck of time-consuming of inference. * ๐Ÿญ AIGC: Release code generation SOTA model [**CodeGen**](./examples/code_generation/codegen) that supports multiple programming languages code generation. Integrate [**Text to Image Model**](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/docs/model_zoo/taskflow.md#%E6%96%87%E5%9B%BE%E7%94%9F%E6%88%90) DALLยทE Mini, Disco Diffusion, Stable Diffusion, let's play and have some fun! * ๐Ÿ’ช Framework upgrade: Release [**Auto Model Compression API**](./docs/compression.md), supports for pruning and quantization automatically, lower the barriers of model compression; Release [**Few-shot Prompt**](./applications/text_classification/multi_class/few-shot), includes the algorithms such as PET, P-Tuning and RGL. ## Features #### ๐Ÿ“ฆ Out-of-Box NLP Toolset #### ๐Ÿค— Awesome Chinese Model Zoo #### ๐ŸŽ›๏ธ Industrial End-to-end System #### ๐Ÿš€ High Performance Distributed Training and Inference ### Out-of-Box NLP Toolset Taskflow aims to provide off-the-shelf NLP pre-built task covering NLU and NLG technique, in the meanwhile with extreamly fast infernece satisfying industrial scenario. ![taskflow1](https://user-images.githubusercontent.com/11793384/159693816-fda35221-9751-43bb-b05c-7fc77571dd76.gif) For more usage please refer to [Taskflow Docs](./docs/model_zoo/taskflow.md). ### Awesome Chinese Model Zoo #### ๐Ÿ€„ Comprehensive Chinese Transformer Models We provide **45+** network architectures and over **500+** pretrained models. Not only includes all the SOTA model like ERNIE, PLATO and SKEP released by Baidu, but also integrates most of the high-quality Chinese pretrained model developed by other organizations. Use `AutoModel` API to **โšกSUPER FASTโšก** download pretrained models of different architecture. We welcome all developers to contribute your Transformer models to PaddleNLP! ```python from paddlenlp.transformers import * ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') bert = AutoModel.from_pretrained('bert-wwm-chinese') albert = AutoModel.from_pretrained('albert-chinese-tiny') roberta = AutoModel.from_pretrained('roberta-wwm-ext') electra = AutoModel.from_pretrained('chinese-electra-small') gpt = AutoModelForPretraining.from_pretrained('gpt-cpm-large-cn') ``` Due to the computation limitation, you can use the ERNIE-Tiny light models to accelerate the deployment of pretrained models. ```python # 6L768H ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') # 6L384H ernie = AutoModel.from_pretrained('ernie-3.0-mini-zh') # 4L384H ernie = AutoModel.from_pretrained('ernie-3.0-micro-zh') # 4L312H ernie = AutoModel.from_pretrained('ernie-3.0-nano-zh') ``` Unified API experience for NLP task like semantic representation, text classification, sentence matching, sequence labeling, question answering, etc. ```python import paddle from paddlenlp.transformers import * tokenizer = AutoTokenizer.from_pretrained('ernie-3.0-medium-zh') text = tokenizer('natural language processing') # Semantic Representation model = AutoModel.from_pretrained('ernie-3.0-medium-zh') sequence_output, pooled_output = model(input_ids=paddle.to_tensor([text['input_ids']])) # Text Classificaiton and Matching model = AutoModelForSequenceClassification.from_pretrained('ernie-3.0-medium-zh') # Sequence Labeling model = AutoModelForTokenClassification.from_pretrained('ernie-3.0-medium-zh') # Question Answering model = AutoModelForQuestionAnswering.from_pretrained('ernie-3.0-medium-zh') ``` #### Wide-range NLP Task Support PaddleNLP provides rich examples covering mainstream NLP task to help developers accelerate problem solving. You can find our powerful transformer [Model Zoo](./model_zoo), and wide-range NLP application [exmaples](./examples) with detailed instructions. Also you can run our interactive [Notebook tutorial](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995) on AI Studio, a powerful platform with **FREE** computing resource.
PaddleNLP Transformer model summary (click to show details)
| Model | Sequence Classification | Token Classification | Question Answering | Text Generation | Multiple Choice | | :----------------- | ----------------------- | -------------------- | ------------------ | --------------- | --------------- | | ALBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | BART | โœ… | โœ… | โœ… | โœ… | โŒ | | BERT | โœ… | โœ… | โœ… | โŒ | โœ… | | BigBird | โœ… | โœ… | โœ… | โŒ | โœ… | | BlenderBot | โŒ | โŒ | โŒ | โœ… | โŒ | | ChineseBERT | โœ… | โœ… | โœ… | โŒ | โŒ | | ConvBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | CTRL | โœ… | โŒ | โŒ | โŒ | โŒ | | DistilBERT | โœ… | โœ… | โœ… | โŒ | โŒ | | ELECTRA | โœ… | โœ… | โœ… | โŒ | โœ… | | ERNIE | โœ… | โœ… | โœ… | โŒ | โœ… | | ERNIE-CTM | โŒ | โœ… | โŒ | โŒ | โŒ | | ERNIE-Doc | โœ… | โœ… | โœ… | โŒ | โŒ | | ERNIE-GEN | โŒ | โŒ | โŒ | โœ… | โŒ | | ERNIE-Gram | โœ… | โœ… | โœ… | โŒ | โŒ | | ERNIE-M | โœ… | โœ… | โœ… | โŒ | โŒ | | FNet | โœ… | โœ… | โœ… | โŒ | โœ… | | Funnel-Transformer | โœ… | โœ… | โœ… | โŒ | โŒ | | GPT | โœ… | โœ… | โŒ | โœ… | โŒ | | LayoutLM | โœ… | โœ… | โŒ | โŒ | โŒ | | LayoutLMv2 | โŒ | โœ… | โŒ | โŒ | โŒ | | LayoutXLM | โŒ | โœ… | โŒ | โŒ | โŒ | | LUKE | โŒ | โœ… | โœ… | โŒ | โŒ | | mBART | โœ… | โŒ | โœ… | โŒ | โœ… | | MegatronBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | MobileBERT | โœ… | โŒ | โœ… | โŒ | โŒ | | MPNet | โœ… | โœ… | โœ… | โŒ | โœ… | | NEZHA | โœ… | โœ… | โœ… | โŒ | โœ… | | PP-MiniLM | โœ… | โŒ | โŒ | โŒ | โŒ | | ProphetNet | โŒ | โŒ | โŒ | โœ… | โŒ | | Reformer | โœ… | โŒ | โœ… | โŒ | โŒ | | RemBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | RoBERTa | โœ… | โœ… | โœ… | โŒ | โœ… | | RoFormer | โœ… | โœ… | โœ… | โŒ | โŒ | | SKEP | โœ… | โœ… | โŒ | โŒ | โŒ | | SqueezeBERT | โœ… | โœ… | โœ… | โŒ | โŒ | | T5 | โŒ | โŒ | โŒ | โœ… | โŒ | | TinyBERT | โœ… | โŒ | โŒ | โŒ | โŒ | | UnifiedTransformer | โŒ | โŒ | โŒ | โœ… | โŒ | | XLNet | โœ… | โœ… | โœ… | โŒ | โœ… |
For more pretrained model usage, please refer to [Transformer API Docs](./docs/model_zoo/index.rst). ### Industrial End-to-end System We provide high value scenarios including information extraction, semantic retrieval, questionn answering high-value. For more details industial cases please refer to [Applications](./applications). #### ๐Ÿ” Neural Search System
For more details please refer to [Neural Search](./applications/neural_search). #### โ“ Question Answering System We provide question answering pipeline which can support FAQ system, Document-level Visual Question answering system based on [๐Ÿš€RocketQA](https://github.com/PaddlePaddle/RocketQA).
For more details please refer to [Question Answering](./applications/question_answering) and [Document VQA](./applications/document_intelligence/doc_vqa). #### ๐Ÿ’Œ Opinion Extraction and Sentiment Analysis We build an opinion extraction system for product review and fine-grained sentiment analysis based on [SKEP](https://arxiv.org/abs/2005.05635) Model.
For more details please refer to [Sentiment Analysis](./applications/sentiment_analysis). #### ๐ŸŽ™๏ธ Speech Command Analysis Integrated ASR Model, Information Extraction, we provide a speech command analysis pipeline that show how to use PaddleNLP and [PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech) to solve Speech + NLP real scenarios.
For more details please refer to [Speech Command Analysis](./applications/speech_cmd_analysis). ### High Performance Distributed Training and Inference #### โšก FastTokenizer: High Performance Text Preprocessing Library
```python AutoTokenizer.from_pretrained("ernie-3.0-medium-zh", use_fast=True) ``` Set `use_fast=True` to use C++ Tokenizer kernel to achieve 100x faster on text pre-processing. For more usage please refer to [FastTokenizer](./fast_tokenizer). #### โšก FastGeneration: High Perforance Generation Library
```python model = GPTLMHeadModel.from_pretrained('gpt-cpm-large-cn') outputs, _ = model.generate( input_ids=inputs_ids, max_length=10, decode_strategy='greedy_search', use_fast=True) ``` Set `use_fast=True` to achieve 5x speedup for Transformer, GPT, BART, PLATO, UniLM text generation. For more usage please refer to [FastGeneration](./fast_generation). #### ๐Ÿš€ Fleet: 4D Hybrid Distributed Training
For more super large-scale model pre-training details please refer to [GPT-3](./examples/language_model/gpt-3). ## Installation ### Prerequisites * python >= 3.7 * paddlepaddle >= 2.3 More information about PaddlePaddle installation please refer to [PaddlePaddle's Website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/conda/linux-conda.html). ### Python pip Installation ``` pip install --upgrade paddlenlp ``` or you can install the latest develop branch code with the following command: ```shell pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html ``` ## Quick Start **Taskflow** aims to provide off-the-shelf NLP pre-built task covering NLU and NLG scenario, in the meanwhile with extreamly fast infernece satisfying industrial applications. ```python from paddlenlp import Taskflow # Chinese Word Segmentation seg = Taskflow("word_segmentation") seg("็ฌฌๅๅ››ๅฑŠๅ…จ่ฟไผšๅœจ่ฅฟๅฎ‰ไธพๅŠž") >>> ['็ฌฌๅๅ››ๅฑŠ', 'ๅ…จ่ฟไผš', 'ๅœจ', '่ฅฟๅฎ‰', 'ไธพๅŠž'] # POS Tagging tag = Taskflow("pos_tagging") tag("็ฌฌๅๅ››ๅฑŠๅ…จ่ฟไผšๅœจ่ฅฟๅฎ‰ไธพๅŠž") >>> [('็ฌฌๅๅ››ๅฑŠ', 'm'), ('ๅ…จ่ฟไผš', 'nz'), ('ๅœจ', 'p'), ('่ฅฟๅฎ‰', 'LOC'), ('ไธพๅŠž', 'v')] # Named Entity Recognition ner = Taskflow("ner") ner("ใ€Šๅญคๅฅณใ€‹ๆ˜ฏ2010ๅนดไนๅทžๅ‡บ็‰ˆ็คพๅ‡บ็‰ˆ็š„ๅฐ่ฏด๏ผŒไฝœ่€…ๆ˜ฏไฝ™ๅ…ผ็พฝ") >>> [('ใ€Š', 'w'), ('ๅญคๅฅณ', 'ไฝœๅ“็ฑป_ๅฎžไฝ“'), ('ใ€‹', 'w'), ('ๆ˜ฏ', '่‚ฏๅฎš่ฏ'), ('2010ๅนด', 'ๆ—ถ้—ด็ฑป'), ('ไนๅทžๅ‡บ็‰ˆ็คพ', '็ป„็ป‡ๆœบๆž„็ฑป'), ('ๅ‡บ็‰ˆ', 'ๅœบๆ™ฏไบ‹ไปถ'), ('็š„', 'ๅŠฉ่ฏ'), ('ๅฐ่ฏด', 'ไฝœๅ“็ฑป_ๆฆ‚ๅฟต'), ('๏ผŒ', 'w'), ('ไฝœ่€…', 'ไบบ็‰ฉ็ฑป_ๆฆ‚ๅฟต'), ('ๆ˜ฏ', '่‚ฏๅฎš่ฏ'), ('ไฝ™ๅ…ผ็พฝ', 'ไบบ็‰ฉ็ฑป_ๅฎžไฝ“')] # Dependency Parsing ddp = Taskflow("dependency_parsing") ddp("9ๆœˆ9ๆ—ฅไธŠๅˆ็บณ่พพๅฐ”ๅœจไบš็‘Ÿยท้˜ฟไป€็ƒๅœบๅ‡ป่ดฅไฟ„็ฝ—ๆ–ฏ็ƒๅ‘˜ๆข…ๅพท้Ÿฆๆฐๅคซ") >>> [{'word': ['9ๆœˆ9ๆ—ฅ', 'ไธŠๅˆ', '็บณ่พพๅฐ”', 'ๅœจ', 'ไบš็‘Ÿยท้˜ฟไป€็ƒๅœบ', 'ๅ‡ป่ดฅ', 'ไฟ„็ฝ—ๆ–ฏ', '็ƒๅ‘˜', 'ๆข…ๅพท้Ÿฆๆฐๅคซ'], 'head': [2, 6, 6, 5, 6, 0, 8, 9, 6], 'deprel': ['ATT', 'ADV', 'SBV', 'MT', 'ADV', 'HED', 'ATT', 'ATT', 'VOB']}] # Sentiment Analysis senta = Taskflow("sentiment_analysis") senta("่ฟ™ไธชไบงๅ“็”จ่ตทๆฅ็œŸ็š„ๅพˆๆต็•…๏ผŒๆˆ‘้žๅธธๅ–œๆฌข") >>> [{'text': '่ฟ™ไธชไบงๅ“็”จ่ตทๆฅ็œŸ็š„ๅพˆๆต็•…๏ผŒๆˆ‘้žๅธธๅ–œๆฌข', 'label': 'positive', 'score': 0.9938690066337585}] ``` ## API Reference - Support [LUGE](https://www.luge.ai/) dataset loading and compatible with Hugging Face [Datasets](https://huggingface.co/datasets). For more details please refer to [Dataset API](https://paddlenlp.readthedocs.io/zh/latest/data_prepare/dataset_list.html). - Using Hugging Face style API to load 500+ selected transformer models and download with fast speed. For more information please refer to [Transformers API](https://paddlenlp.readthedocs.io/zh/latest/model_zoo/index.html). - One-line of code to load pre-trained word embedding. For more usage please refer to [Embedding API](https://paddlenlp.readthedocs.io/zh/latest/model_zoo/embeddings.html). Please find all PaddleNLP API Reference from our [readthedocs](https://paddlenlp.readthedocs.io/). ## Community ### Slack To connect with other users and contributors, welcome to join our [Slack channel](https://paddlenlp.slack.com/). ### WeChat Scan the QR code below with your Wechatโฌ‡๏ธ. You can access to official technical exchange group. Look forward to your participation.
## Citation If you find PaddleNLP useful in your research, please consider cite ``` @misc{=paddlenlp, title={PaddleNLP: An Easy-to-use and High Performance NLP Library}, author={PaddleNLP Contributors}, howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}}, year={2021} } ``` ## Acknowledge We have borrowed from Hugging Face's [Transformers](https://github.com/huggingface/transformers)๐Ÿค— excellent design on pretrained models usage, and we would like to express our gratitude to the authors of Hugging Face and its open source community. ## License PaddleNLP is provided under the [Apache-2.0 License](./LICENSE). %package -n python3-paddlenlp Summary: Easy-to-use and powerful NLP library with Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including Neural Search, Question Answering, Information Extraction and Sentiment Analysis end-to-end system. Provides: python-paddlenlp BuildRequires: python3-devel BuildRequires: python3-setuptools BuildRequires: python3-pip %description -n python3-paddlenlp

Features | Installation | Quick Start | API Reference | Community

**PaddleNLP** is an *easy-to-use* and *powerful* NLP library with **Awesome** pre-trained model zoo, supporting wide-range of NLP tasks from research to industrial applications. ## News ๐Ÿ“ข * ๐Ÿ”ฅ **Latest Features** * ๐Ÿ“ƒ Release **[UIE-X](./applications/information_extraction)**, an universal information extraction model that supports both document and text inputs. * โฃ๏ธRelease **[Opinion Mining and Sentiment Analysis Models](./applications/sentiment_analysis/unified_sentiment_extraction)** based on UIE, including abilities of sentence-level and aspect-based sentiment classification, attribute extraction, opinion extraction, attribute aggregation and implicit opinion extraction. * **2022.9.6 [PaddleNLPv2.4](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.4.0) Released!** * ๐Ÿ’Ž NLP Tools: Released **[Pipelines](./pipelines)** which supports turn-key construction of search engine and question answering systems. It features a flexible design that is applicable for all kinds of NLP systems so you can build end-to-end NLP pipelines like Legos! * ๐Ÿ”จ Industrial application: Release **[Complete Solution of Text Classification](./applications/text_classification)** covering various scenarios of text classification: multi-class, multi-label and hierarchical, it also supports **few-shot learning** and the training and optimization of **TrustAI**. Upgrade for [**UIE**](./model_zoo/uie) and release **UIE-M**, support both Chinese and English information extraction in a single model; release the data distillation solution for UIE to break the bottleneck of time-consuming of inference. * ๐Ÿญ AIGC: Release code generation SOTA model [**CodeGen**](./examples/code_generation/codegen) that supports multiple programming languages code generation. Integrate [**Text to Image Model**](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/docs/model_zoo/taskflow.md#%E6%96%87%E5%9B%BE%E7%94%9F%E6%88%90) DALLยทE Mini, Disco Diffusion, Stable Diffusion, let's play and have some fun! * ๐Ÿ’ช Framework upgrade: Release [**Auto Model Compression API**](./docs/compression.md), supports for pruning and quantization automatically, lower the barriers of model compression; Release [**Few-shot Prompt**](./applications/text_classification/multi_class/few-shot), includes the algorithms such as PET, P-Tuning and RGL. ## Features #### ๐Ÿ“ฆ Out-of-Box NLP Toolset #### ๐Ÿค— Awesome Chinese Model Zoo #### ๐ŸŽ›๏ธ Industrial End-to-end System #### ๐Ÿš€ High Performance Distributed Training and Inference ### Out-of-Box NLP Toolset Taskflow aims to provide off-the-shelf NLP pre-built task covering NLU and NLG technique, in the meanwhile with extreamly fast infernece satisfying industrial scenario. ![taskflow1](https://user-images.githubusercontent.com/11793384/159693816-fda35221-9751-43bb-b05c-7fc77571dd76.gif) For more usage please refer to [Taskflow Docs](./docs/model_zoo/taskflow.md). ### Awesome Chinese Model Zoo #### ๐Ÿ€„ Comprehensive Chinese Transformer Models We provide **45+** network architectures and over **500+** pretrained models. Not only includes all the SOTA model like ERNIE, PLATO and SKEP released by Baidu, but also integrates most of the high-quality Chinese pretrained model developed by other organizations. Use `AutoModel` API to **โšกSUPER FASTโšก** download pretrained models of different architecture. We welcome all developers to contribute your Transformer models to PaddleNLP! ```python from paddlenlp.transformers import * ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') bert = AutoModel.from_pretrained('bert-wwm-chinese') albert = AutoModel.from_pretrained('albert-chinese-tiny') roberta = AutoModel.from_pretrained('roberta-wwm-ext') electra = AutoModel.from_pretrained('chinese-electra-small') gpt = AutoModelForPretraining.from_pretrained('gpt-cpm-large-cn') ``` Due to the computation limitation, you can use the ERNIE-Tiny light models to accelerate the deployment of pretrained models. ```python # 6L768H ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') # 6L384H ernie = AutoModel.from_pretrained('ernie-3.0-mini-zh') # 4L384H ernie = AutoModel.from_pretrained('ernie-3.0-micro-zh') # 4L312H ernie = AutoModel.from_pretrained('ernie-3.0-nano-zh') ``` Unified API experience for NLP task like semantic representation, text classification, sentence matching, sequence labeling, question answering, etc. ```python import paddle from paddlenlp.transformers import * tokenizer = AutoTokenizer.from_pretrained('ernie-3.0-medium-zh') text = tokenizer('natural language processing') # Semantic Representation model = AutoModel.from_pretrained('ernie-3.0-medium-zh') sequence_output, pooled_output = model(input_ids=paddle.to_tensor([text['input_ids']])) # Text Classificaiton and Matching model = AutoModelForSequenceClassification.from_pretrained('ernie-3.0-medium-zh') # Sequence Labeling model = AutoModelForTokenClassification.from_pretrained('ernie-3.0-medium-zh') # Question Answering model = AutoModelForQuestionAnswering.from_pretrained('ernie-3.0-medium-zh') ``` #### Wide-range NLP Task Support PaddleNLP provides rich examples covering mainstream NLP task to help developers accelerate problem solving. You can find our powerful transformer [Model Zoo](./model_zoo), and wide-range NLP application [exmaples](./examples) with detailed instructions. Also you can run our interactive [Notebook tutorial](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995) on AI Studio, a powerful platform with **FREE** computing resource.
PaddleNLP Transformer model summary (click to show details)
| Model | Sequence Classification | Token Classification | Question Answering | Text Generation | Multiple Choice | | :----------------- | ----------------------- | -------------------- | ------------------ | --------------- | --------------- | | ALBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | BART | โœ… | โœ… | โœ… | โœ… | โŒ | | BERT | โœ… | โœ… | โœ… | โŒ | โœ… | | BigBird | โœ… | โœ… | โœ… | โŒ | โœ… | | BlenderBot | โŒ | โŒ | โŒ | โœ… | โŒ | | ChineseBERT | โœ… | โœ… | โœ… | โŒ | โŒ | | ConvBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | CTRL | โœ… | โŒ | โŒ | โŒ | โŒ | | DistilBERT | โœ… | โœ… | โœ… | โŒ | โŒ | | ELECTRA | โœ… | โœ… | โœ… | โŒ | โœ… | | ERNIE | โœ… | โœ… | โœ… | โŒ | โœ… | | ERNIE-CTM | โŒ | โœ… | โŒ | โŒ | โŒ | | ERNIE-Doc | โœ… | โœ… | โœ… | โŒ | โŒ | | ERNIE-GEN | โŒ | โŒ | โŒ | โœ… | โŒ | | ERNIE-Gram | โœ… | โœ… | โœ… | โŒ | โŒ | | ERNIE-M | โœ… | โœ… | โœ… | โŒ | โŒ | | FNet | โœ… | โœ… | โœ… | โŒ | โœ… | | Funnel-Transformer | โœ… | โœ… | โœ… | โŒ | โŒ | | GPT | โœ… | โœ… | โŒ | โœ… | โŒ | | LayoutLM | โœ… | โœ… | โŒ | โŒ | โŒ | | LayoutLMv2 | โŒ | โœ… | โŒ | โŒ | โŒ | | LayoutXLM | โŒ | โœ… | โŒ | โŒ | โŒ | | LUKE | โŒ | โœ… | โœ… | โŒ | โŒ | | mBART | โœ… | โŒ | โœ… | โŒ | โœ… | | MegatronBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | MobileBERT | โœ… | โŒ | โœ… | โŒ | โŒ | | MPNet | โœ… | โœ… | โœ… | โŒ | โœ… | | NEZHA | โœ… | โœ… | โœ… | โŒ | โœ… | | PP-MiniLM | โœ… | โŒ | โŒ | โŒ | โŒ | | ProphetNet | โŒ | โŒ | โŒ | โœ… | โŒ | | Reformer | โœ… | โŒ | โœ… | โŒ | โŒ | | RemBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | RoBERTa | โœ… | โœ… | โœ… | โŒ | โœ… | | RoFormer | โœ… | โœ… | โœ… | โŒ | โŒ | | SKEP | โœ… | โœ… | โŒ | โŒ | โŒ | | SqueezeBERT | โœ… | โœ… | โœ… | โŒ | โŒ | | T5 | โŒ | โŒ | โŒ | โœ… | โŒ | | TinyBERT | โœ… | โŒ | โŒ | โŒ | โŒ | | UnifiedTransformer | โŒ | โŒ | โŒ | โœ… | โŒ | | XLNet | โœ… | โœ… | โœ… | โŒ | โœ… |
For more pretrained model usage, please refer to [Transformer API Docs](./docs/model_zoo/index.rst). ### Industrial End-to-end System We provide high value scenarios including information extraction, semantic retrieval, questionn answering high-value. For more details industial cases please refer to [Applications](./applications). #### ๐Ÿ” Neural Search System
For more details please refer to [Neural Search](./applications/neural_search). #### โ“ Question Answering System We provide question answering pipeline which can support FAQ system, Document-level Visual Question answering system based on [๐Ÿš€RocketQA](https://github.com/PaddlePaddle/RocketQA).
For more details please refer to [Question Answering](./applications/question_answering) and [Document VQA](./applications/document_intelligence/doc_vqa). #### ๐Ÿ’Œ Opinion Extraction and Sentiment Analysis We build an opinion extraction system for product review and fine-grained sentiment analysis based on [SKEP](https://arxiv.org/abs/2005.05635) Model.
For more details please refer to [Sentiment Analysis](./applications/sentiment_analysis). #### ๐ŸŽ™๏ธ Speech Command Analysis Integrated ASR Model, Information Extraction, we provide a speech command analysis pipeline that show how to use PaddleNLP and [PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech) to solve Speech + NLP real scenarios.
For more details please refer to [Speech Command Analysis](./applications/speech_cmd_analysis). ### High Performance Distributed Training and Inference #### โšก FastTokenizer: High Performance Text Preprocessing Library
```python AutoTokenizer.from_pretrained("ernie-3.0-medium-zh", use_fast=True) ``` Set `use_fast=True` to use C++ Tokenizer kernel to achieve 100x faster on text pre-processing. For more usage please refer to [FastTokenizer](./fast_tokenizer). #### โšก FastGeneration: High Perforance Generation Library
```python model = GPTLMHeadModel.from_pretrained('gpt-cpm-large-cn') outputs, _ = model.generate( input_ids=inputs_ids, max_length=10, decode_strategy='greedy_search', use_fast=True) ``` Set `use_fast=True` to achieve 5x speedup for Transformer, GPT, BART, PLATO, UniLM text generation. For more usage please refer to [FastGeneration](./fast_generation). #### ๐Ÿš€ Fleet: 4D Hybrid Distributed Training
For more super large-scale model pre-training details please refer to [GPT-3](./examples/language_model/gpt-3). ## Installation ### Prerequisites * python >= 3.7 * paddlepaddle >= 2.3 More information about PaddlePaddle installation please refer to [PaddlePaddle's Website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/conda/linux-conda.html). ### Python pip Installation ``` pip install --upgrade paddlenlp ``` or you can install the latest develop branch code with the following command: ```shell pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html ``` ## Quick Start **Taskflow** aims to provide off-the-shelf NLP pre-built task covering NLU and NLG scenario, in the meanwhile with extreamly fast infernece satisfying industrial applications. ```python from paddlenlp import Taskflow # Chinese Word Segmentation seg = Taskflow("word_segmentation") seg("็ฌฌๅๅ››ๅฑŠๅ…จ่ฟไผšๅœจ่ฅฟๅฎ‰ไธพๅŠž") >>> ['็ฌฌๅๅ››ๅฑŠ', 'ๅ…จ่ฟไผš', 'ๅœจ', '่ฅฟๅฎ‰', 'ไธพๅŠž'] # POS Tagging tag = Taskflow("pos_tagging") tag("็ฌฌๅๅ››ๅฑŠๅ…จ่ฟไผšๅœจ่ฅฟๅฎ‰ไธพๅŠž") >>> [('็ฌฌๅๅ››ๅฑŠ', 'm'), ('ๅ…จ่ฟไผš', 'nz'), ('ๅœจ', 'p'), ('่ฅฟๅฎ‰', 'LOC'), ('ไธพๅŠž', 'v')] # Named Entity Recognition ner = Taskflow("ner") ner("ใ€Šๅญคๅฅณใ€‹ๆ˜ฏ2010ๅนดไนๅทžๅ‡บ็‰ˆ็คพๅ‡บ็‰ˆ็š„ๅฐ่ฏด๏ผŒไฝœ่€…ๆ˜ฏไฝ™ๅ…ผ็พฝ") >>> [('ใ€Š', 'w'), ('ๅญคๅฅณ', 'ไฝœๅ“็ฑป_ๅฎžไฝ“'), ('ใ€‹', 'w'), ('ๆ˜ฏ', '่‚ฏๅฎš่ฏ'), ('2010ๅนด', 'ๆ—ถ้—ด็ฑป'), ('ไนๅทžๅ‡บ็‰ˆ็คพ', '็ป„็ป‡ๆœบๆž„็ฑป'), ('ๅ‡บ็‰ˆ', 'ๅœบๆ™ฏไบ‹ไปถ'), ('็š„', 'ๅŠฉ่ฏ'), ('ๅฐ่ฏด', 'ไฝœๅ“็ฑป_ๆฆ‚ๅฟต'), ('๏ผŒ', 'w'), ('ไฝœ่€…', 'ไบบ็‰ฉ็ฑป_ๆฆ‚ๅฟต'), ('ๆ˜ฏ', '่‚ฏๅฎš่ฏ'), ('ไฝ™ๅ…ผ็พฝ', 'ไบบ็‰ฉ็ฑป_ๅฎžไฝ“')] # Dependency Parsing ddp = Taskflow("dependency_parsing") ddp("9ๆœˆ9ๆ—ฅไธŠๅˆ็บณ่พพๅฐ”ๅœจไบš็‘Ÿยท้˜ฟไป€็ƒๅœบๅ‡ป่ดฅไฟ„็ฝ—ๆ–ฏ็ƒๅ‘˜ๆข…ๅพท้Ÿฆๆฐๅคซ") >>> [{'word': ['9ๆœˆ9ๆ—ฅ', 'ไธŠๅˆ', '็บณ่พพๅฐ”', 'ๅœจ', 'ไบš็‘Ÿยท้˜ฟไป€็ƒๅœบ', 'ๅ‡ป่ดฅ', 'ไฟ„็ฝ—ๆ–ฏ', '็ƒๅ‘˜', 'ๆข…ๅพท้Ÿฆๆฐๅคซ'], 'head': [2, 6, 6, 5, 6, 0, 8, 9, 6], 'deprel': ['ATT', 'ADV', 'SBV', 'MT', 'ADV', 'HED', 'ATT', 'ATT', 'VOB']}] # Sentiment Analysis senta = Taskflow("sentiment_analysis") senta("่ฟ™ไธชไบงๅ“็”จ่ตทๆฅ็œŸ็š„ๅพˆๆต็•…๏ผŒๆˆ‘้žๅธธๅ–œๆฌข") >>> [{'text': '่ฟ™ไธชไบงๅ“็”จ่ตทๆฅ็œŸ็š„ๅพˆๆต็•…๏ผŒๆˆ‘้žๅธธๅ–œๆฌข', 'label': 'positive', 'score': 0.9938690066337585}] ``` ## API Reference - Support [LUGE](https://www.luge.ai/) dataset loading and compatible with Hugging Face [Datasets](https://huggingface.co/datasets). For more details please refer to [Dataset API](https://paddlenlp.readthedocs.io/zh/latest/data_prepare/dataset_list.html). - Using Hugging Face style API to load 500+ selected transformer models and download with fast speed. For more information please refer to [Transformers API](https://paddlenlp.readthedocs.io/zh/latest/model_zoo/index.html). - One-line of code to load pre-trained word embedding. For more usage please refer to [Embedding API](https://paddlenlp.readthedocs.io/zh/latest/model_zoo/embeddings.html). Please find all PaddleNLP API Reference from our [readthedocs](https://paddlenlp.readthedocs.io/). ## Community ### Slack To connect with other users and contributors, welcome to join our [Slack channel](https://paddlenlp.slack.com/). ### WeChat Scan the QR code below with your Wechatโฌ‡๏ธ. You can access to official technical exchange group. Look forward to your participation.
## Citation If you find PaddleNLP useful in your research, please consider cite ``` @misc{=paddlenlp, title={PaddleNLP: An Easy-to-use and High Performance NLP Library}, author={PaddleNLP Contributors}, howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}}, year={2021} } ``` ## Acknowledge We have borrowed from Hugging Face's [Transformers](https://github.com/huggingface/transformers)๐Ÿค— excellent design on pretrained models usage, and we would like to express our gratitude to the authors of Hugging Face and its open source community. ## License PaddleNLP is provided under the [Apache-2.0 License](./LICENSE). %package help Summary: Development documents and examples for paddlenlp Provides: python3-paddlenlp-doc %description help

Features | Installation | Quick Start | API Reference | Community

**PaddleNLP** is an *easy-to-use* and *powerful* NLP library with **Awesome** pre-trained model zoo, supporting wide-range of NLP tasks from research to industrial applications. ## News ๐Ÿ“ข * ๐Ÿ”ฅ **Latest Features** * ๐Ÿ“ƒ Release **[UIE-X](./applications/information_extraction)**, an universal information extraction model that supports both document and text inputs. * โฃ๏ธRelease **[Opinion Mining and Sentiment Analysis Models](./applications/sentiment_analysis/unified_sentiment_extraction)** based on UIE, including abilities of sentence-level and aspect-based sentiment classification, attribute extraction, opinion extraction, attribute aggregation and implicit opinion extraction. * **2022.9.6 [PaddleNLPv2.4](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.4.0) Released!** * ๐Ÿ’Ž NLP Tools: Released **[Pipelines](./pipelines)** which supports turn-key construction of search engine and question answering systems. It features a flexible design that is applicable for all kinds of NLP systems so you can build end-to-end NLP pipelines like Legos! * ๐Ÿ”จ Industrial application: Release **[Complete Solution of Text Classification](./applications/text_classification)** covering various scenarios of text classification: multi-class, multi-label and hierarchical, it also supports **few-shot learning** and the training and optimization of **TrustAI**. Upgrade for [**UIE**](./model_zoo/uie) and release **UIE-M**, support both Chinese and English information extraction in a single model; release the data distillation solution for UIE to break the bottleneck of time-consuming of inference. * ๐Ÿญ AIGC: Release code generation SOTA model [**CodeGen**](./examples/code_generation/codegen) that supports multiple programming languages code generation. Integrate [**Text to Image Model**](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/docs/model_zoo/taskflow.md#%E6%96%87%E5%9B%BE%E7%94%9F%E6%88%90) DALLยทE Mini, Disco Diffusion, Stable Diffusion, let's play and have some fun! * ๐Ÿ’ช Framework upgrade: Release [**Auto Model Compression API**](./docs/compression.md), supports for pruning and quantization automatically, lower the barriers of model compression; Release [**Few-shot Prompt**](./applications/text_classification/multi_class/few-shot), includes the algorithms such as PET, P-Tuning and RGL. ## Features #### ๐Ÿ“ฆ Out-of-Box NLP Toolset #### ๐Ÿค— Awesome Chinese Model Zoo #### ๐ŸŽ›๏ธ Industrial End-to-end System #### ๐Ÿš€ High Performance Distributed Training and Inference ### Out-of-Box NLP Toolset Taskflow aims to provide off-the-shelf NLP pre-built task covering NLU and NLG technique, in the meanwhile with extreamly fast infernece satisfying industrial scenario. ![taskflow1](https://user-images.githubusercontent.com/11793384/159693816-fda35221-9751-43bb-b05c-7fc77571dd76.gif) For more usage please refer to [Taskflow Docs](./docs/model_zoo/taskflow.md). ### Awesome Chinese Model Zoo #### ๐Ÿ€„ Comprehensive Chinese Transformer Models We provide **45+** network architectures and over **500+** pretrained models. Not only includes all the SOTA model like ERNIE, PLATO and SKEP released by Baidu, but also integrates most of the high-quality Chinese pretrained model developed by other organizations. Use `AutoModel` API to **โšกSUPER FASTโšก** download pretrained models of different architecture. We welcome all developers to contribute your Transformer models to PaddleNLP! ```python from paddlenlp.transformers import * ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') bert = AutoModel.from_pretrained('bert-wwm-chinese') albert = AutoModel.from_pretrained('albert-chinese-tiny') roberta = AutoModel.from_pretrained('roberta-wwm-ext') electra = AutoModel.from_pretrained('chinese-electra-small') gpt = AutoModelForPretraining.from_pretrained('gpt-cpm-large-cn') ``` Due to the computation limitation, you can use the ERNIE-Tiny light models to accelerate the deployment of pretrained models. ```python # 6L768H ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') # 6L384H ernie = AutoModel.from_pretrained('ernie-3.0-mini-zh') # 4L384H ernie = AutoModel.from_pretrained('ernie-3.0-micro-zh') # 4L312H ernie = AutoModel.from_pretrained('ernie-3.0-nano-zh') ``` Unified API experience for NLP task like semantic representation, text classification, sentence matching, sequence labeling, question answering, etc. ```python import paddle from paddlenlp.transformers import * tokenizer = AutoTokenizer.from_pretrained('ernie-3.0-medium-zh') text = tokenizer('natural language processing') # Semantic Representation model = AutoModel.from_pretrained('ernie-3.0-medium-zh') sequence_output, pooled_output = model(input_ids=paddle.to_tensor([text['input_ids']])) # Text Classificaiton and Matching model = AutoModelForSequenceClassification.from_pretrained('ernie-3.0-medium-zh') # Sequence Labeling model = AutoModelForTokenClassification.from_pretrained('ernie-3.0-medium-zh') # Question Answering model = AutoModelForQuestionAnswering.from_pretrained('ernie-3.0-medium-zh') ``` #### Wide-range NLP Task Support PaddleNLP provides rich examples covering mainstream NLP task to help developers accelerate problem solving. You can find our powerful transformer [Model Zoo](./model_zoo), and wide-range NLP application [exmaples](./examples) with detailed instructions. Also you can run our interactive [Notebook tutorial](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995) on AI Studio, a powerful platform with **FREE** computing resource.
PaddleNLP Transformer model summary (click to show details)
| Model | Sequence Classification | Token Classification | Question Answering | Text Generation | Multiple Choice | | :----------------- | ----------------------- | -------------------- | ------------------ | --------------- | --------------- | | ALBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | BART | โœ… | โœ… | โœ… | โœ… | โŒ | | BERT | โœ… | โœ… | โœ… | โŒ | โœ… | | BigBird | โœ… | โœ… | โœ… | โŒ | โœ… | | BlenderBot | โŒ | โŒ | โŒ | โœ… | โŒ | | ChineseBERT | โœ… | โœ… | โœ… | โŒ | โŒ | | ConvBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | CTRL | โœ… | โŒ | โŒ | โŒ | โŒ | | DistilBERT | โœ… | โœ… | โœ… | โŒ | โŒ | | ELECTRA | โœ… | โœ… | โœ… | โŒ | โœ… | | ERNIE | โœ… | โœ… | โœ… | โŒ | โœ… | | ERNIE-CTM | โŒ | โœ… | โŒ | โŒ | โŒ | | ERNIE-Doc | โœ… | โœ… | โœ… | โŒ | โŒ | | ERNIE-GEN | โŒ | โŒ | โŒ | โœ… | โŒ | | ERNIE-Gram | โœ… | โœ… | โœ… | โŒ | โŒ | | ERNIE-M | โœ… | โœ… | โœ… | โŒ | โŒ | | FNet | โœ… | โœ… | โœ… | โŒ | โœ… | | Funnel-Transformer | โœ… | โœ… | โœ… | โŒ | โŒ | | GPT | โœ… | โœ… | โŒ | โœ… | โŒ | | LayoutLM | โœ… | โœ… | โŒ | โŒ | โŒ | | LayoutLMv2 | โŒ | โœ… | โŒ | โŒ | โŒ | | LayoutXLM | โŒ | โœ… | โŒ | โŒ | โŒ | | LUKE | โŒ | โœ… | โœ… | โŒ | โŒ | | mBART | โœ… | โŒ | โœ… | โŒ | โœ… | | MegatronBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | MobileBERT | โœ… | โŒ | โœ… | โŒ | โŒ | | MPNet | โœ… | โœ… | โœ… | โŒ | โœ… | | NEZHA | โœ… | โœ… | โœ… | โŒ | โœ… | | PP-MiniLM | โœ… | โŒ | โŒ | โŒ | โŒ | | ProphetNet | โŒ | โŒ | โŒ | โœ… | โŒ | | Reformer | โœ… | โŒ | โœ… | โŒ | โŒ | | RemBERT | โœ… | โœ… | โœ… | โŒ | โœ… | | RoBERTa | โœ… | โœ… | โœ… | โŒ | โœ… | | RoFormer | โœ… | โœ… | โœ… | โŒ | โŒ | | SKEP | โœ… | โœ… | โŒ | โŒ | โŒ | | SqueezeBERT | โœ… | โœ… | โœ… | โŒ | โŒ | | T5 | โŒ | โŒ | โŒ | โœ… | โŒ | | TinyBERT | โœ… | โŒ | โŒ | โŒ | โŒ | | UnifiedTransformer | โŒ | โŒ | โŒ | โœ… | โŒ | | XLNet | โœ… | โœ… | โœ… | โŒ | โœ… |
For more pretrained model usage, please refer to [Transformer API Docs](./docs/model_zoo/index.rst). ### Industrial End-to-end System We provide high value scenarios including information extraction, semantic retrieval, questionn answering high-value. For more details industial cases please refer to [Applications](./applications). #### ๐Ÿ” Neural Search System
For more details please refer to [Neural Search](./applications/neural_search). #### โ“ Question Answering System We provide question answering pipeline which can support FAQ system, Document-level Visual Question answering system based on [๐Ÿš€RocketQA](https://github.com/PaddlePaddle/RocketQA).
For more details please refer to [Question Answering](./applications/question_answering) and [Document VQA](./applications/document_intelligence/doc_vqa). #### ๐Ÿ’Œ Opinion Extraction and Sentiment Analysis We build an opinion extraction system for product review and fine-grained sentiment analysis based on [SKEP](https://arxiv.org/abs/2005.05635) Model.
For more details please refer to [Sentiment Analysis](./applications/sentiment_analysis). #### ๐ŸŽ™๏ธ Speech Command Analysis Integrated ASR Model, Information Extraction, we provide a speech command analysis pipeline that show how to use PaddleNLP and [PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech) to solve Speech + NLP real scenarios.
For more details please refer to [Speech Command Analysis](./applications/speech_cmd_analysis). ### High Performance Distributed Training and Inference #### โšก FastTokenizer: High Performance Text Preprocessing Library
```python AutoTokenizer.from_pretrained("ernie-3.0-medium-zh", use_fast=True) ``` Set `use_fast=True` to use C++ Tokenizer kernel to achieve 100x faster on text pre-processing. For more usage please refer to [FastTokenizer](./fast_tokenizer). #### โšก FastGeneration: High Perforance Generation Library
```python model = GPTLMHeadModel.from_pretrained('gpt-cpm-large-cn') outputs, _ = model.generate( input_ids=inputs_ids, max_length=10, decode_strategy='greedy_search', use_fast=True) ``` Set `use_fast=True` to achieve 5x speedup for Transformer, GPT, BART, PLATO, UniLM text generation. For more usage please refer to [FastGeneration](./fast_generation). #### ๐Ÿš€ Fleet: 4D Hybrid Distributed Training
For more super large-scale model pre-training details please refer to [GPT-3](./examples/language_model/gpt-3). ## Installation ### Prerequisites * python >= 3.7 * paddlepaddle >= 2.3 More information about PaddlePaddle installation please refer to [PaddlePaddle's Website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/conda/linux-conda.html). ### Python pip Installation ``` pip install --upgrade paddlenlp ``` or you can install the latest develop branch code with the following command: ```shell pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html ``` ## Quick Start **Taskflow** aims to provide off-the-shelf NLP pre-built task covering NLU and NLG scenario, in the meanwhile with extreamly fast infernece satisfying industrial applications. ```python from paddlenlp import Taskflow # Chinese Word Segmentation seg = Taskflow("word_segmentation") seg("็ฌฌๅๅ››ๅฑŠๅ…จ่ฟไผšๅœจ่ฅฟๅฎ‰ไธพๅŠž") >>> ['็ฌฌๅๅ››ๅฑŠ', 'ๅ…จ่ฟไผš', 'ๅœจ', '่ฅฟๅฎ‰', 'ไธพๅŠž'] # POS Tagging tag = Taskflow("pos_tagging") tag("็ฌฌๅๅ››ๅฑŠๅ…จ่ฟไผšๅœจ่ฅฟๅฎ‰ไธพๅŠž") >>> [('็ฌฌๅๅ››ๅฑŠ', 'm'), ('ๅ…จ่ฟไผš', 'nz'), ('ๅœจ', 'p'), ('่ฅฟๅฎ‰', 'LOC'), ('ไธพๅŠž', 'v')] # Named Entity Recognition ner = Taskflow("ner") ner("ใ€Šๅญคๅฅณใ€‹ๆ˜ฏ2010ๅนดไนๅทžๅ‡บ็‰ˆ็คพๅ‡บ็‰ˆ็š„ๅฐ่ฏด๏ผŒไฝœ่€…ๆ˜ฏไฝ™ๅ…ผ็พฝ") >>> [('ใ€Š', 'w'), ('ๅญคๅฅณ', 'ไฝœๅ“็ฑป_ๅฎžไฝ“'), ('ใ€‹', 'w'), ('ๆ˜ฏ', '่‚ฏๅฎš่ฏ'), ('2010ๅนด', 'ๆ—ถ้—ด็ฑป'), ('ไนๅทžๅ‡บ็‰ˆ็คพ', '็ป„็ป‡ๆœบๆž„็ฑป'), ('ๅ‡บ็‰ˆ', 'ๅœบๆ™ฏไบ‹ไปถ'), ('็š„', 'ๅŠฉ่ฏ'), ('ๅฐ่ฏด', 'ไฝœๅ“็ฑป_ๆฆ‚ๅฟต'), ('๏ผŒ', 'w'), ('ไฝœ่€…', 'ไบบ็‰ฉ็ฑป_ๆฆ‚ๅฟต'), ('ๆ˜ฏ', '่‚ฏๅฎš่ฏ'), ('ไฝ™ๅ…ผ็พฝ', 'ไบบ็‰ฉ็ฑป_ๅฎžไฝ“')] # Dependency Parsing ddp = Taskflow("dependency_parsing") ddp("9ๆœˆ9ๆ—ฅไธŠๅˆ็บณ่พพๅฐ”ๅœจไบš็‘Ÿยท้˜ฟไป€็ƒๅœบๅ‡ป่ดฅไฟ„็ฝ—ๆ–ฏ็ƒๅ‘˜ๆข…ๅพท้Ÿฆๆฐๅคซ") >>> [{'word': ['9ๆœˆ9ๆ—ฅ', 'ไธŠๅˆ', '็บณ่พพๅฐ”', 'ๅœจ', 'ไบš็‘Ÿยท้˜ฟไป€็ƒๅœบ', 'ๅ‡ป่ดฅ', 'ไฟ„็ฝ—ๆ–ฏ', '็ƒๅ‘˜', 'ๆข…ๅพท้Ÿฆๆฐๅคซ'], 'head': [2, 6, 6, 5, 6, 0, 8, 9, 6], 'deprel': ['ATT', 'ADV', 'SBV', 'MT', 'ADV', 'HED', 'ATT', 'ATT', 'VOB']}] # Sentiment Analysis senta = Taskflow("sentiment_analysis") senta("่ฟ™ไธชไบงๅ“็”จ่ตทๆฅ็œŸ็š„ๅพˆๆต็•…๏ผŒๆˆ‘้žๅธธๅ–œๆฌข") >>> [{'text': '่ฟ™ไธชไบงๅ“็”จ่ตทๆฅ็œŸ็š„ๅพˆๆต็•…๏ผŒๆˆ‘้žๅธธๅ–œๆฌข', 'label': 'positive', 'score': 0.9938690066337585}] ``` ## API Reference - Support [LUGE](https://www.luge.ai/) dataset loading and compatible with Hugging Face [Datasets](https://huggingface.co/datasets). For more details please refer to [Dataset API](https://paddlenlp.readthedocs.io/zh/latest/data_prepare/dataset_list.html). - Using Hugging Face style API to load 500+ selected transformer models and download with fast speed. For more information please refer to [Transformers API](https://paddlenlp.readthedocs.io/zh/latest/model_zoo/index.html). - One-line of code to load pre-trained word embedding. For more usage please refer to [Embedding API](https://paddlenlp.readthedocs.io/zh/latest/model_zoo/embeddings.html). Please find all PaddleNLP API Reference from our [readthedocs](https://paddlenlp.readthedocs.io/). ## Community ### Slack To connect with other users and contributors, welcome to join our [Slack channel](https://paddlenlp.slack.com/). ### WeChat Scan the QR code below with your Wechatโฌ‡๏ธ. You can access to official technical exchange group. Look forward to your participation.
## Citation If you find PaddleNLP useful in your research, please consider cite ``` @misc{=paddlenlp, title={PaddleNLP: An Easy-to-use and High Performance NLP Library}, author={PaddleNLP Contributors}, howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}}, year={2021} } ``` ## Acknowledge We have borrowed from Hugging Face's [Transformers](https://github.com/huggingface/transformers)๐Ÿค— excellent design on pretrained models usage, and we would like to express our gratitude to the authors of Hugging Face and its open source community. ## License PaddleNLP is provided under the [Apache-2.0 License](./LICENSE). %prep %autosetup -n paddlenlp-2.5.2 %build %py3_build %install %py3_install install -d -m755 %{buildroot}/%{_pkgdocdir} if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi pushd %{buildroot} if [ -d usr/lib ]; then find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst fi if [ -d usr/lib64 ]; then find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst fi if [ -d usr/bin ]; then find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst fi if [ -d usr/sbin ]; then find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst fi touch doclist.lst if [ -d usr/share/man ]; then find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst fi popd mv %{buildroot}/filelist.lst . mv %{buildroot}/doclist.lst . %files -n python3-paddlenlp -f filelist.lst %dir %{python3_sitelib}/* %files help -f doclist.lst %{_docdir}/* %changelog * Sun Apr 23 2023 Python_Bot - 2.5.2-1 - Package Spec generated