Diffstat (limited to 'python-nlu.spec')
-rw-r--r-- | python-nlu.spec | 1055 |
1 files changed, 1055 insertions, 0 deletions
diff --git a/python-nlu.spec b/python-nlu.spec
new file mode 100644
index 0000000..fb7a722
--- /dev/null
+++ b/python-nlu.spec
@@ -0,0 +1,1055 @@
+%global _empty_manifest_terminate_build 0
+Name: python-nlu
+Version: 4.2.0
+Release: 1
+Summary: John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 10000+ of pretrained models in 200+ languages. It enables swift and simple development and research with its powerful Pythonic and Keras inspired API. It is powerd by John Snow Labs powerful Spark NLP library.
+License: Apache Software License
+URL: http://nlu.johnsnowlabs.com
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/99/2e/59c673ff47d321ef315ec97b695da4aa70880dd8897feb46130fd2f84eab/nlu-4.2.0.tar.gz
+BuildArch: noarch
+
+Requires: python3-spark-nlp
+Requires: python3-numpy
+Requires: python3-pyarrow
+Requires: python3-pandas
+Requires: python3-dataclasses
+
+%description
+
+# NLU: The Power of Spark NLP, the Simplicity of Python
+John Snow Labs' NLU is a Python library for applying state-of-the-art text mining, directly on any dataframe, with a single line of code.
+As a facade of the award-winning Spark NLP library, it comes with **1000+** of pretrained models in **100+**, all production-grade, scalable, and trainable, with **everything in 1 line of code.**
+
+
+
+## NLU in Action
+See how easy it is to use any of the **thousands** of models in 1 line of code, there are hundreds of [tutorials](https://nlu.johnsnowlabs.com/docs/en/notebooks) and [simple examples](https://github.com/JohnSnowLabs/nlu/tree/master/examples) you can copy and paste into your projects to achieve State Of The Art easily.
+<img src="http://ckl-it.de/wp-content/uploads/2020/08/My-Video6.gif" width="1800" height="500"/>
+
+## NLU & Streamlit in Action
+This 1 line let's you visualize and play with **1000+ SOTA NLU & NLP models** in **200** languages
+
+```shell
+streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/01_dashboard.py
+```
+<img src="https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/docs/assets/streamlit_docs_assets/gif/start.gif">
+
+NLU provides tight and simple integration into Streamlit, which enables building powerful webapps in just 1 line of code which showcase the.
+View the [NLU&Streamlit documentation](https://nlu.johnsnowlabs.com/docs/en/streamlit_viz_examples) or [NLU & Streamlit examples section](https://github.com/JohnSnowLabs/nlu/tree/master/examples/streamlit).
+The entire GIF demo and
+
+
+## All NLU resources overview
+Take a look at our official NLU page: [https://nlu.johnsnowlabs.com/](https://nlu.johnsnowlabs.com/) for user documentation and examples
+
+| Ressource | Description|
+|-----------------------------------------------------------------------|-------------------------------------------|
+| [Install NLU](https://nlu.johnsnowlabs.com/docs/en/install) | Just run `pip install nlu pyspark==3.0.2`
+| [The NLU Namespace](https://nlu.johnsnowlabs.com/docs/en/namespace) | Find all the names of models you can load with `nlu.load()`
+| [The `nlu.load(<Model>)` function](https://nlu.johnsnowlabs.com/docs/en/load_api) | Load any of the **1000+ models in 1 line**
+| [The `nlu.load(<Model>).predict(data)` function](https://nlu.johnsnowlabs.com/docs/en/predict_api) | Predict on `Strings`, `List of Strings`, `Numpy Arrays`, `Pandas`, `Modin` and `Spark Dataframes`
+| [The `nlu.load(<train.Model>).fit(data)` function](https://nlu.johnsnowlabs.com/docs/en/training) | Train a text classifier for `2-Class`, `N-Classes` `Multi-N-Classes`, `Named-Entitiy-Recognition` or `Parts of Speech Tagging`
+| [The `nlu.load(<Model>).viz(data)` function](https://nlu.johnsnowlabs.com/docs/en/viz_examples) | Visualize the results of `Word Embedding Similarity Matrix`, `Named Entity Recognizers`, `Dependency Trees & Parts of Speech`, `Entity Resolution`,`Entity Linking` or `Entity Status Assertion`
+| [The `nlu.load(<Model>).viz_streamlit(data)` function](https://nlu.johnsnowlabs.com/docs/en/streamlit_viz_examples) | Display an interactive GUI which lets you explore and test every model and feature in NLU in 1 click.
+| [General Concepts](https://nlu.johnsnowlabs.com/docs/en/concepts) | General concepts in NLU
+| [The latest release notes](https://nlu.johnsnowlabs.com/docs/en/release_notes) | Newest features added to NLU
+| [Overview NLU 1-liners examples](https://nlu.johnsnowlabs.com/docs/en/examples) | Most common used models and their results
+| [Overview NLU 1-liners examples for healthcare models](https://nlu.johnsnowlabs.com/docs/en/examples_hc) | Most common used healthcare models and their results
+| [Overview of all NLU tutorials and Examples](https://nlu.johnsnowlabs.com/docs/en/notebooks) | 100+ tutorials on how to use NLU on text datasets for various problems and from various sources like Twitter, Chinese News, Crypto News Headlines, Airline Traffic communication, Product review classifier training,
+| [Connect with us on Slack](https://join.slack.com/t/spark-nlp/shared_invite/zt-lutct9gm-kuUazcyFKhuGY3_0AMkxqA) | Problems, questions or suggestions? We have a very active and helpful community of over 2000+ AI enthusiasts putting NLU, Spark NLP & Spark OCR to good use
+| [Discussion Forum](https://github.com/JohnSnowLabs/spark-nlp/discussions) | More indepth discussion with the community? Post a thread in our discussion Forum
+| [John Snow Labs Medium](https://medium.com/spark-nlp) | Articles and Tutorials on the NLU, Spark NLP and Spark OCR
+| [John Snow Labs Youtube](https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos) | Videos and Tutorials on the NLU, Spark NLP and Spark OCR
+| [NLU Website](https://nlu.johnsnowlabs.com/) | The official NLU website
+|[Github Issues](https://github.com/JohnSnowLabs/nlu/issues) | Report a bug
+
+
+
+
+
+
+## Getting Started with NLU
+To get your hands on the power of NLU, you just need to install it via pip and ensure Java 8 is installed and properly configured. Checkout [Quickstart for more infos](https://nlu.johnsnowlabs.com/docs/en/install)
+```bash
+pip install nlu pyspark==3.0.2
+```
+
+## Loading and predicting with any model in 1 line python
+```python
+import nlu
+nlu.load('sentiment').predict('I love NLU! <3')
+```
+
+## Loading and predicting with multiple models in 1 line
+
+Get 6 different embeddings in 1 line and use them for downstream data science tasks!
+
+```python
+nlu.load('bert elmo albert xlnet glove use').predict('I love NLU! <3')
+```
+
+## What kind of models does NLU provide?
+NLU provides everything a data scientist might want to wish for in one line of code!
+ - NLU provides everything a data scientist might want to wish for in one line of code!
+ - 1000 + pre-trained models
+ - 100+ of the latest NLP word embeddings ( BERT, ELMO, ALBERT, XLNET, GLOVE, BIOBERT, ELECTRA, COVIDBERT) and different variations of them
+ - 50+ of the latest NLP sentence embeddings ( BERT, ELECTRA, USE) and different variations of them
+ - 100+ Classifiers (NER, POS, Emotion, Sarcasm, Questions, Spam)
+ - 300+ Supported Languages
+- Summarize Text and Answer Questions with T5
+- Labeled and Unlabeled Dependency parsing
+ - Various Text Cleaning and Pre-Processing methods like Stemming, Lemmatizing, Normalizing, Filtering, Cleaning pipelines and more
+
+
+## Classifiers trained on many different datasets
+Choose the right tool for the right task! Whether you analyze movies or twitter, NLU has the right model for you!
+
+- trec6 classifier
+- trec10 classifier
+- spam classifier
+- fake news classifier
+- emotion classifier
+- cyberbullying classifier
+- sarcasm classifier
+- sentiment classifier for movies
+- IMDB Movie Sentiment classifier
+- Twitter sentiment classifier
+- NER pretrained on ONTO notes
+- NER trainer on CONLL
+- Language classifier for 20 languages on the wiki 20 lang dataset.
+
+## Utilities for the Data Science NLU applications
+Working with text data can sometimes be quite a dirty job. NLU helps you keep your hands clean by providing components that take away from data engineering intensive tasks.
+
+- Datetime Matcher
+- Pattern Matcher
+- Chunk Matcher
+- Phrases Matcher
+- Stopword Cleaners
+- Pattern Cleaners
+- Slang Cleaner
+
+## Where can I see all models available in NLU?
+For NLU models to load, see [the NLU Namespace](https://nlu.johnsnowlabs.com/docs/en/namespace) or the [John Snow Labs Modelshub](https://modelshub.johnsnowlabs.com/models) or go [straight to the source](https://github.com/JohnSnowLabs/nlu/blob/master/nlu/namespace.py).
+ +## Supported Data Types +- Pandas DataFrame and Series +- Spark DataFrames +- Modin with Ray backend +- Modin with Dask backend +- Numpy arrays +- Strings and lists of strings + +## Overview of all tutorials using the NLU-Library + +In the following tabular, all available tutorials using NLU are listed. These tutorials will help you learn the +usage of the NLU library and on how to use it for your own tasks. Some of the tasks NLU does are +translating from any language to the english language, lemmatizing, tokenizing, cleaning text from +Symbol or unwanted syntax, spellchecking, detecting entities, analyzing sentiments and many more! + +{:.table2} + +| Tutorial Description | NLU Spells Used |Open In Colab | Dataset and Paper References | +|-----------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Albert Word Embeddings with NLU | `albert`, `sentiment pos albert emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_ALBERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | 
[Albert-Paper](https://arxiv.org/pdf/1909.11942.pdf), [Albert on Github](https://github.com/google-research/ALBERT), [Albert on TensorFlow](https://tfhub.dev/s?q=albert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Albert](https://medium.com/spark-nlp/1-line-to-albert-word-embeddings-with-nlu-in-python-1691bc048ed1), [Albert_Embedding](https://nlp.johnsnowlabs.com/2021/06/23/albert_base_uncased_en.html) | +| Bert Word Embeddings with NLU | `bert`, `pos sentiment emotion bert` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_BERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Bert](https://medium.com/spark-nlp/1-line-to-bert-word-embeddings-with-nlu-f50d2b08cddc), [Bert_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| BIOBERT Word Embeddings with NLU | `biobert` , `sentiment pos biobert emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_BIOBERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [BioBert-Paper](https://arxiv.org/abs/1901.08746), [Bert Github](https://github.com/google-research/bert) , [BERT: Deep Bidirectional Transformers](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Biobert](https://medium.com/spark-nlp/1-line-to-biobert-word-embeddings-with-nlu-in-python-7224ab52e131), [Biobert_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/biobert_pubmed_base_cased.html) | +| COVIDBERT Word Embeddings with NLU | `covidbert`, `sentiment covidbert 
pos` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_COVIDBERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [CovidBert-Paper](https://journals.flvc.org/FLAIRS/article/view/128488), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-CovidBert](https://medium.com/spark-nlp/1-line-to-covidbert-word-embeddings-with-nlu-in-python-e67396da2f78), [Covidbert_Embedding](https://nlp.johnsnowlabs.com/2020/08/27/covidbert_large_uncased.html) | +| ELECTRA Word Embeddings with NLU | `electra`, `sentiment pos en.embed.electra emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_ELECTRA_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Electra-Paper](https://arxiv.org/abs/2003.10555), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Electra](https://medium.com/spark-nlp/1-line-to-electra-word-embeddings-with-nlu-in-python-25f749bf3e92), [Electra_Embedding](https://nlp.johnsnowlabs.com/2020/08/27/electra_small_uncased.html) | +| ELMO Word Embeddings with NLU | `elmo`, `sentiment pos elmo emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_ELMo_word_embeddings_and_t-SNE_visualization_example.ipynb) | [ELMO-Paper](https://arxiv.org/abs/1802.05365), [Elmo-TensorFlow](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Elmo](https://medium.com/spark-nlp/1-python-line-for-elmo-word-embeddings-with-john-snow-labs-nlu-628e9b924a3), [Elmo-Embedding](https://nlp.johnsnowlabs.com/2020/01/31/elmo.html) | +| GLOVE Word Embeddings with NLU | `glove`, `sentiment pos glove emotion` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_GLOVE_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Glove-Paper](https://nlp.stanford.edu/pubs/glove.pdf), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Glove](https://medium.com/spark-nlp/1-line-to-glove-word-embeddings-with-nlu-in-python-baed152fff4d) , [Glove_Embedding](https://nlp.johnsnowlabs.com/2020/01/22/glove_100d.html) | +| XLNET Word Embeddings with NLU | `xlnet`, `sentiment pos xlnet emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_XLNET_word_embeddings_and_t-SNE_visualization_example.ipynb) | [XLNet-Paper](https://arxiv.org/abs/1906.08237), [Bert Github](https://github.com/zihangdai/xlnet), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-XLNet](https://medium.com/spark-nlp/1-line-to-xlnet-word-embeddings-with-nlu-in-python-5efc57d7ac79), [Xlnet_Embedding](https://nlp.johnsnowlabs.com/2021/07/07/xlnet_base_cased_en.html) | +| Multiple Word-Embeddings and Part of Speech in 1 Line of code | `bert electra elmo glove xlnet albert pos` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_multiple_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Bert-Paper](https://arxiv.org/pdf/1810.04805.pdf), [Albert-Paper](https://openreview.net/forum?id=H1eA7AEtvS), [ELMO-Paper](https://arxiv.org/abs/1802.05365), [Electra-Paper](https://arxiv.org/abs/2003.10555), [XLNet-Paper](https://arxiv.org/pdf/1906.08237.pdf), [Glove-Paper](https://nlp.stanford.edu/pubs/glove.pdf) | +| Normalzing with NLU | `norm` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_normalizer_example.ipynb) | - | +| Detect sentences with NLU | `sentence_detector.deep`, `sentence_detector.pragmatic`, `xx.sentence_detector` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_sentence_detection_example.ipynb) | [Sentence Detector](https://nlp.johnsnowlabs.com/2020/09/13/sentence_detector_dl_en.html) | +| Spellchecking with NLU | n.a. | n.a. | - | +| Stemming with NLU | `en.stem`, `de.stem` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_stemmer_example.ipynb) | - | +| Stopwords removal with NLU | `stopwords` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_stopwords_removal_example.ipynb) | [Stopwords](https://nlp.johnsnowlabs.com/2020/07/14/stopwords_en.html) | +| Tokenization with NLU | `tokenize` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_tokenization_example.ipynb) | - | +| Normalization of Documents | `norm_document` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/document_normalizer_demo.ipynb) | - | +| Open and Closed book question answering with Google's T5 | `en.t5` , `answer_question` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_question_answering.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf), [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Overview of every task available with T5 | `en.t5.base` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_tasks_summarize_question_answering_and_more.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf), [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Translate between more than 200 Languages in 1 line of code with Marian Models | `tr.translate_to.fr`, `en.translate_to.fr` ,`fr.translate_to.he` , `en.translate_to.de` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/translation_demo.ipynb) | [Marian-Papers](https://marian-nmt.github.io/publications/), [Translation-Pipeline (En to Fr)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_fr_xx.html), [Translation-Pipeline (En to Ger)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_de_xx.html) | +| BERT Sentence Embeddings with NLU | `embed_sentence.bert`, `pos sentiment embed_sentence.bert` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/NLU_BERT_sentence_embeddings_and_t-SNE_visualization_Example.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| ELECTRA Sentence Embeddings with NLU | `embed_sentence.electra`, `pos sentiment embed_sentence.electra` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/NLU_ELECTRA_sentence_embeddings_and_t-SNE_visualization_example.ipynb) | [Electra Paper](https://arxiv.org/abs/2003.10555), [Sentence-Electra-Embedding](https://nlp.johnsnowlabs.com/2020/08/27/sent_electra_small_uncased.html) | +| USE Sentence Embeddings with NLU | `use`, `pos sentiment use emotion` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/NLU_USE_sentence_embeddings_and_t-SNE_visualization_example.ipynb) | [Universal Sentence Encoder](https://arxiv.org/abs/1803.11175), [USE-TensorFlow](https://tfhub.dev/google/universal-sentence-encoder/2), [Sentence-USE-Embedding](https://nlp.johnsnowlabs.com/2020/04/17/tfhub_use_lg.html) | +| Sentence similarity with NLU using BERT embeddings | `embed_sentence.bert`, `use en.embed_sentence.electra embed_sentence.bert` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/sentence_similarirty_stack_overflow_questions.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| Part of Speech tagging with NLU | `pos` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/part_of_speechPOS/NLU_part_of_speech_ANC_example.ipynb) | [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html) | +| NER Aspect Airline ATIS | `en.ner.aspect.airline` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/NER_aspect_airline_ATIS.ipynb) | [NER Airline Model](https://nlp.johnsnowlabs.com/2021/01/25/nerdl_atis_840b_300d_en.html), [Atis intent Dataset](https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem) | +| NLU-NER_CONLL_2003_5class_example | `ner` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/NLU_ner_CONLL_2003_5class_example.ipynb) | [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html) | +| Named-entity recognition with 
Deep Learning ONTO NOTES | `ner.onto` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/NLU_ner_ONTO_18class_example.ipynb) | [NER_Onto](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html) | +| Aspect based NER-Sentiment-Restaurants | `en.ner.aspect_sentiment` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/aspect_based_ner_sentiment_restaurants.ipynb) | - | +| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Chinese | `zh.segment_words`, `zh.pos`, `zh.ner`, `zh.translate_to.en` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/multilingual/chinese_ner_pos_and_tokenization.ipynb) | [Translation-Pipeline (Zh to En)](https://nlp.johnsnowlabs.com/2021/06/04/translate_zh_en_xx.html) | +| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Japanese | `ja.segment_words`, `ja.pos`, `ja.ner`, `ja.translate_to.en` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/multilingual/japanese_ner_pos_and_tokenization.ipynb) | [Translation-Pipeline (Ja to En)](https://nlp.johnsnowlabs.com/2021/06/04/translate_ja_en_xx.html) | +| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Korean | `ko.segment_words`, `ko.pos`, `ko.ner.kmou.glove_840B_300d`, `ko.translate_to.en` | [](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/master/examples/colab/component_examples/multilingual/korean_ner_pos_and_tokenization.ipynb) | - | +| Date Matching | `match.datetime` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/matchers/NLU_date_matching.ipynb) | - | +| Typed Dependency Parsing with NLU | `dep` | 
[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/dependency_parsing/NLU_typed_dependency_parsing_example.ipynb) | [Dependency Parsing ](https://nlp.johnsnowlabs.com/2021/03/27/Typed_Dependency_Parsing_en.html) | +| Untyped Dependency Parsing with NLU | `dep.untyped` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/dependency_parsing/NLU_untyped_dependency_parsing_example.ipynb) | - | +| E2E Classification with NLU | `e2e` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/E2E_classification.ipynb) | [e2e-Model](https://nlp.johnsnowlabs.com/2021/01/21/multiclassifierdl_use_e2e_en.html) | +| Language Classification with NLU | `lang` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/NLU_language_classification.ipynb) | - | +| Cyberbullying Classification with NLU | `classify.cyberbullying` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/cyberbullying_cassification_for_racism_and_sexism.ipynb) | [Cyberbullying-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_cyberbullying_en.html) | +| Sentiment Classification with NLU for Twitter | `emotion` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/emotion_classification.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| Fake News Classification with NLU | `en.classify.fakenews` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/fake_news_classification.ipynb) | [Fakenews-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_fakenews_en.html) | +| Intent Classification 
with NLU | `en.classify.intent.airline` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/intent_classification_airlines_ATIS.ipynb) | [Airline-Intention classifier](https://nlp.johnsnowlabs.com/2021/01/25/classifierdl_use_atis_en.html), [Atis-Dataset](https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem?select=atis_intents.csv) | +| Question classification based on the TREC dataset | `en.classify.questions` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/question_classification.ipynb) | [Question-Classifier](https://nlp.johnsnowlabs.com/2021/01/08/classifierdl_use_trec50_en.html) | +| Sarcasm Classification with NLU | `en.classify.sarcasm` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/sarcasm_classification.ipynb) | [Sarcasm-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_sarcasm_en.html) | +| Sentiment Classification with NLU for Twitter | `en.sentiment.twitter` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/sentiment_classification.ipynb) | [Sentiment_Twitter-Classifier](https://nlp.johnsnowlabs.com/2021/01/18/sentimentdl_use_twitter_en.html) | +| Sentiment Classification with NLU for Movies | `en.sentiment.imdb` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/sentiment_classification_movies.ipynb) | [Sentiment_imdb-Classifier](https://nlp.johnsnowlabs.com/2021/01/15/analyze_sentimentdl_use_imdb_en.html) | +| Spam Classification with NLU | `en.classify.spam` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/spam_classification.ipynb) | 
[Spam-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_spam_en.html) | +| Toxic text classification with NLU | `en.classify.toxic` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/toxic_classification.ipynb) | [Toxic-Classifier](https://nlp.johnsnowlabs.com/2021/01/21/multiclassifierdl_use_toxic_en.html) | +| Unsupervised keyword extraction with NLU using the YAKE algorithm | `yake` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/unsupervised_keyword_extraction_with_YAKE.ipynb) | - | +| Grammatical Chunk Matching with NLU | `match.chunks` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/chunkers/NLU_chunking_example.ipynb) | - | +| Getting n-Grams with NLU | `ngram` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/chunkers/NLU_n-gram.ipynb) | - | +| Assertion | `en.med_ner.clinical en.assert`, `en.med_ner.clinical.biobert en.assert.biobert`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/assertion/assertion_overview.ipynb) | [Healthcare-NER](https://nlp.johnsnowlabs.com/2021/01/27/ner_clinical_en.html), [NER_Clinical-Classifier]( https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_biobert_en.html), [Toxic-Classifier](https://nlp.johnsnowlabs.com/2021/01/26/assertion_dl_biobert_en.html) | +| De-Identification Model overview | `med_ner.jsl.wip.clinical en.de_identify`, `med_ner.jsl.wip.clinical en.de_identify.clinical`, ... 
| [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/de_identification/DeIdentification_model_overview.ipynb) | [NER-Clinical](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_clinical_en.html) | +| Drug Normalization | `norm_drugs` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare//drug_normalization/drug_norm.ipynb) | - | +| Entity Resolution | `med_ner.jsl.wip.clinical en.resolve_chunk.cpt_clinical`, `med_ner.jsl.wip.clinical en.resolve.icd10cm`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/entity_resolution/entity_resolvers_overview.ipynb) | [NER-Clinical](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_clinical_en.html), [Entity-Resolver clinical](https://nlp.johnsnowlabs.com/2021/11/01/sbiobertresolve_icd10cm_augmented_billable_hcc_en.html) | +| Medical Named Entity Recognition | `en.med_ner.ade.clinical`, `en.med_ner.ade.clinical_bert`, `en.med_ner.anatomy`,`en.med_ner.anatomy.biobert`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/medical_named_entity_recognition/overview_medical_entity_recognizers.ipynb) | - | +| Relation Extraction | `en.med_ner.jsl.wip.clinical.greedy en.relation`, `en.med_ner.jsl.wip.clinical.greedy en.relation.bodypart.problem`, ... 
| [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/relation_extraction/overview_relation.ipynb) | - | +| Visualization of NLP-Models with Spark-NLP and NLU | `ner`, `dep.typed`, `med_ner.jsl.wip.clinical resolve_chunk.rxnorm.in`, `med_ner.jsl.wip.clinical resolve.icd10cm` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/visualization/NLU_visualizations_tutorial.ipynb) | [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Dependency Parsing](https://nlp.johnsnowlabs.com/2021/03/27/Typed_Dependency_Parsing_en.html), [NER-Clinical](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_clinical_en.html), [Entity-Resolver (Chunks) clinical](https://nlp.johnsnowlabs.com/2021/04/16/chunkresolve_rxnorm_in_clinical_en.html) | +| NLU Covid-19 Emotion Showcase | `emotion` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_covid_emotion_showcase.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| NLU Covid-19 Sentiment Showcase | `sentiment` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_covid_sentiment_showcase.ipynb) | [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| NLU Airline Emotion Demo | `emotion` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_emotion_airline_demo.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| NLU Airline Sentiment Demo | `sentiment` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_sentiment_airline_demo.ipynb) | [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| Bengali NER Hindi Embeddings for 30 Models | `bn.ner`, `bn.lemma`, `ja.lemma`, 
`am.lemma`, `bh.lemma`,` en.ner.onto.bert.small_l2_128`,.. | [](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/master/examples/release_notebooks/NLU1.1.2_Bengali_ner_Hindi_Embeddings_30_new_models.ipynb) | [Bengali-NER](https://nlp.johnsnowlabs.com/2021/01/27/ner_jifs_glove_840B_300d_bn.html), [Bengali-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_bn.html), [Japanese-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/15/lemma_ja.html), [Amharic-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_am.html) | +| Entity Resolution | `med_ner.jsl.wip.clinical en.resolve.umls`, `med_ner.jsl.wip.clinical en.resolve.loinc`, `med_ner.jsl.wip.clinical en.resolve.loinc.biobert` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/release_notebooks/NLU_3_0_2_release_notebook.ipynb) | - | +| NLU 20 Minutes Crashcourse - the fast Data Science route | `spell`, `sentiment`, `pos`, `ner`, `yake`, `en.t5`, `emotion`, `answer_question`, `en.t5.base` ... 
| [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/AI4_2021/NLU_crash_course_AI4.ipynb) | [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html), [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) , [Spellchecker](https://nlp.johnsnowlabs.com/2021/03/28/spellcheck_dl_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| Chapter 0: Intro: 1-liners | `sentiment`, `pos`, `ner`, `bert`, `elmo`, `embed_sentence.bert` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/NYC_DC_NLP_MEETUP/0_liners_intro.ipynb) | [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html), [Elmo-Embedding](https://nlp.johnsnowlabs.com/2020/01/31/elmo.html), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| Chapter 1: NLU base-features with some classifiers on testdata | `emotion`, `yake`, `stem` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/NYC_DC_NLP_MEETUP/1_NLU_base_features_on_dataset_with_YAKE_Lemma_Stemm_classifiers_NER_.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| Chapter 2: Translation between 300+ languages with Marian | `tr.translate_to.en`, `en.translate_to.fr`, `en.translate_to.he` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/translation_demo.ipynb) | 
[Translation-Pipeline (En to Fr)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_fr_xx.html), [Translation (En to He)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_he_xx.html) | +| Chapter 3: Answer questions and summarize Texts with T5 | `answer_question`, `en.t5`, `en.t5.base` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_question_answering.ipynb) | [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Chapter 4: Overview of T5-Tasks | `en.t5.base` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_tasks_summarize_question_answering_and_more.ipynb) | [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Graph NLU 20 Minutes Crashcourse - State of the Art Text Mining for Graphs | `spell`, `sentiment`, `pos`, `ner`, `yake`, `emotion`, `med_ner.jsl.wip.clinical`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/graph_ai_summit/Healthcare_Graph_NLU_COVID_Tigergraph.ipynb) | [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html), [Spellchecker](https://nlp.johnsnowlabs.com/2021/03/28/spellcheck_dl_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| Healthcare with NLU | `med_ner.human_phenotype.gene_biobert`, `med_ner.ade_biobert`, `med_ner.anatomy`, `med_ner.bacterial_species`,... 
| [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/healthcare_webinar/NLU_healthcare_webinar.ipynb) | - | +| Part 0: Intro: 1-liners | `spell`, `sentiment`, `pos`, `ner`, `bert`, `elmo`, `embed_sentence.bert` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/0_liners_intro.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Bert](https://medium.com/spark-nlp/1-line-to-bert-word-embeddings-with-nlu-f50d2b08cddc) , [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Spellchecker](https://nlp.johnsnowlabs.com/2021/03/28/spellcheck_dl_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html), [Elmo-Embedding](https://nlp.johnsnowlabs.com/2020/01/31/elmo.html) , [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| Part 1: NLU base-features with some classifiers on Testdata | `yake`, `stem`, `ner`, `emotion` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/1_NLU_base_features_on_dataset_with_YAKE_Lemma_Stemm_classifiers_NER_.ipynb) | [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| Part 2: Translate between 200+ Languages in 1 line of code with Marian-Models | `en.translate_to.de`, `en.translate_to.fr`, `en.translate_to.he` | 
[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/2_multilingual_translation_with_marian_intro.ipynb) | [Translation-Pipeline (En to Fr)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_fr_xx.html), [Translation-Pipeline (En to Ger)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_de_xx.html), [Translation (En to He)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_he_xx.html) | +| Part 3: More Multilingual NLP-translations for Asian Languages with Marian | `en.translate_to.hi`, `en.translate_to.ru`, `en.translate_to.zh` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/3_more_multi_lingual_NLP_translation_Asian_languages_with_Marian.ipynb) | [Translation (En to Hi)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_hil_xx.html), [Translation (En to Ru)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_run_xx.html), [Translation (En to Zh)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_zh_xx.html) | +| Part 4: Unsupervised Chinese Keyword Extraction, NER and Translation from Chinese news | `zh.translate_to.en`, `zh.segment_words`, `yake`, `zh.lemma`, `zh.ner` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/4_Unsupervise_Chinese_Keyword_Extraction_NER_and_Translation_from_Chinese_News.ipynb) | [Translation-Pipeline (Zh to En)](https://nlp.johnsnowlabs.com/2021/06/04/translate_zh_en_xx.html), [Zh-Lemmatizer](https://nlp.johnsnowlabs.com/2020/03/19/explain_document_dl.html) | +| Part 5: Multilingual sentiment classifier training for 100+ languages | `train.sentiment`, `xx.embed_sentence.labse train.sentiment` | n.a.
| [Sentence_Embedding.Labse](https://nlp.johnsnowlabs.com/2020/09/23/labse.html) | +| Part 6: Question-answering and Text-summarization with the T5 model | `answer_question`, `en.t5`, `en.t5.base` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/6_T5_question_answering_and_Text_summarization.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf) | +| Part 7: Overview of all tasks available with T5 | `en.t5.base` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/7_T5_SQUAD_GLUE_SUPER_GLUE_TASKS.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf) | +| Part 8: Overview of some of the multilingual models with State Of the Art accuracy (1-liner) | `bn.lemma`, `ja.lemma`, `am.lemma`, `bh.lemma`, `zh.segment_words`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/8_Multi_lingual_ner_pos_stop_words_senti) | [Bengali-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_bn.html), [Japanese-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/15/lemma_ja.html), [Amharic-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_am.html) | +| Overview of some multilingual models available with State Of the Art accuracy (1-liner) | `bn.ner.cc_300d`, `ja.ner`, `zh.ner`, `th.ner.lst20.glove_840B_300D`, `ar.ner` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/python_web_conf/Multi_Linigual_examples.ipynb) | [Bengali-NER](https://nlp.johnsnowlabs.com/2021/01/27/ner_jifs_glove_840B_300d_bn.html) | +| NLU 20 Minutes Crashcourse - the fast Data Science route | - | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/python_web_conf/NLU_crashcourse_py_web.ipynb) | - | + + +# Need help?
+- [Ping us on Slack](https://spark-nlp.slack.com/archives/C0196BQCDPY) +- [Post an issue on GitHub](https://github.com/JohnSnowLabs/nlu/issues) + +# Simple NLU Demos +- [NLU different output levels Demo](https://colab.research.google.com/drive/1C4N3wpC17YzZf9fXHDNAJ5JvSmfbq7zT?usp=sharing) + + +# Features in NLU Overview +* Tokenization +* Trainable Word Segmentation +* Stop Words Removal +* Token Normalizer +* Document Normalizer +* Stemmer +* Lemmatizer +* NGrams +* Regex Matching +* Text Matching +* Chunking +* Date Matcher +* Sentence Detector +* Deep Sentence Detector (Deep learning) +* Dependency parsing (Labeled/unlabeled) +* Part-of-speech tagging +* Sentiment Detection (ML models) +* Spell Checker (ML and DL models) +* Word Embeddings (GloVe and Word2Vec) +* BERT Embeddings (TF Hub models) +* ELMO Embeddings (TF Hub models) +* ALBERT Embeddings (TF Hub models) +* XLNet Embeddings +* Universal Sentence Encoder (TF Hub models) +* BERT Sentence Embeddings (42 TF Hub models) +* Sentence Embeddings +* Chunk Embeddings +* Unsupervised keywords extraction +* Language Detection & Identification (up to 375 languages) +* Multi-class Sentiment analysis (Deep learning) +* Multi-label Sentiment analysis (Deep learning) +* Multi-class Text Classification (Deep learning) +* Neural Machine Translation +* Text-To-Text Transfer Transformer (Google T5) +* Named entity recognition (Deep learning) +* Easy TensorFlow integration +* GPU Support +* Full integration with Spark ML functions +* 1000+ pre-trained models in 200+ languages!
+* Multi-lingual NER models: Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Norwegian, Persian, Polish, Portuguese, Russian, Spanish, Swedish, Urdu and more +* Natural Language inference +* Coreference resolution +* Sentence Completion +* Word sense disambiguation +* Clinical entity recognition +* Clinical Entity Linking +* Entity normalization +* Assertion Status Detection +* De-identification +* Relation Extraction +* Clinical Entity Resolution + + +## Citation + +We have published a [paper](https://www.sciencedirect.com/science/article/pii/S2665963821000063) that you can cite for the NLU library: + +```bibtex +@article{KOCAMAN2021100058, + title = {Spark NLP: Natural language understanding at scale}, + journal = {Software Impacts}, + pages = {100058}, + year = {2021}, + issn = {2665-9638}, + doi = {https://doi.org/10.1016/j.simpa.2021.100058}, + url = {https://www.sciencedirect.com/science/article/pii/S2665963821000063}, + author = {Veysel Kocaman and David Talby}, + keywords = {Spark, Natural language processing, Deep learning, Tensorflow, Cluster}, + abstract = {Spark NLP is a Natural Language Processing (NLP) library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines that can scale easily in a distributed environment. Spark NLP comes with 1100+ pretrained pipelines and models in more than 192+ languages. It supports nearly all the NLP tasks and modules that can be used seamlessly in a cluster. Downloaded more than 2.7 million times and experiencing 9x growth since January 2020, Spark NLP is used by 54% of healthcare organizations as the world’s most widely used NLP library in the enterprise.} +} +``` + + +%package -n python3-nlu +Summary: John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 10000+ of pretrained models in 200+ languages.
It enables swift and simple development and research with its powerful Pythonic and Keras-inspired API. It is powered by John Snow Labs' powerful Spark NLP library. +Provides: python-nlu +BuildRequires: python3-devel +BuildRequires: python3-setuptools +BuildRequires: python3-pip +%description -n python3-nlu + +# NLU: The Power of Spark NLP, the Simplicity of Python +John Snow Labs' NLU is a Python library for applying state-of-the-art text mining, directly on any dataframe, with a single line of code. +As a facade of the award-winning Spark NLP library, it comes with **1000+** pretrained models in **100+** languages, all production-grade, scalable, and trainable, with **everything in 1 line of code.** + + + +## NLU in Action +See how easy it is to use any of the **thousands** of models in 1 line of code. There are hundreds of [tutorials](https://nlu.johnsnowlabs.com/docs/en/notebooks) and [simple examples](https://github.com/JohnSnowLabs/nlu/tree/master/examples) you can copy and paste into your projects to achieve State Of The Art easily. +<img src="http://ckl-it.de/wp-content/uploads/2020/08/My-Video6.gif" width="1800" height="500"/> + +## NLU & Streamlit in Action +This 1 line lets you visualize and play with **1000+ SOTA NLU & NLP models** in **200** languages + +```shell +streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/01_dashboard.py +``` +<img src="https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/docs/assets/streamlit_docs_assets/gif/start.gif"> + +NLU provides tight and simple integration into Streamlit, which enables building powerful webapps in just 1 line of code which showcase the. +View the [NLU&Streamlit documentation](https://nlu.johnsnowlabs.com/docs/en/streamlit_viz_examples) or [NLU & Streamlit examples section](https://github.com/JohnSnowLabs/nlu/tree/master/examples/streamlit).
+The entire GIF demo and + + +## All NLU resources overview +Take a look at our official NLU page: [https://nlu.johnsnowlabs.com/](https://nlu.johnsnowlabs.com/) for user documentation and examples + +| Resource | Description| +|-----------------------------------------------------------------------|-------------------------------------------| +| [Install NLU](https://nlu.johnsnowlabs.com/docs/en/install) | Just run `pip install nlu pyspark==3.0.2` +| [The NLU Namespace](https://nlu.johnsnowlabs.com/docs/en/namespace) | Find all the names of models you can load with `nlu.load()` +| [The `nlu.load(<Model>)` function](https://nlu.johnsnowlabs.com/docs/en/load_api) | Load any of the **1000+ models in 1 line** +| [The `nlu.load(<Model>).predict(data)` function](https://nlu.johnsnowlabs.com/docs/en/predict_api) | Predict on `Strings`, `List of Strings`, `Numpy Arrays`, `Pandas`, `Modin` and `Spark Dataframes` +| [The `nlu.load(<train.Model>).fit(data)` function](https://nlu.johnsnowlabs.com/docs/en/training) | Train a text classifier for `2-Class`, `N-Classes`, `Multi-N-Classes`, `Named-Entity-Recognition` or `Parts of Speech Tagging` +| [The `nlu.load(<Model>).viz(data)` function](https://nlu.johnsnowlabs.com/docs/en/viz_examples) | Visualize the results of `Word Embedding Similarity Matrix`, `Named Entity Recognizers`, `Dependency Trees & Parts of Speech`, `Entity Resolution`, `Entity Linking` or `Entity Status Assertion` +| [The `nlu.load(<Model>).viz_streamlit(data)` function](https://nlu.johnsnowlabs.com/docs/en/streamlit_viz_examples) | Display an interactive GUI which lets you explore and test every model and feature in NLU in 1 click.
+| [General Concepts](https://nlu.johnsnowlabs.com/docs/en/concepts) | General concepts in NLU +| [The latest release notes](https://nlu.johnsnowlabs.com/docs/en/release_notes) | Newest features added to NLU +| [Overview NLU 1-liners examples](https://nlu.johnsnowlabs.com/docs/en/examples) | Most commonly used models and their results +| [Overview NLU 1-liners examples for healthcare models](https://nlu.johnsnowlabs.com/docs/en/examples_hc) | Most commonly used healthcare models and their results +| [Overview of all NLU tutorials and Examples](https://nlu.johnsnowlabs.com/docs/en/notebooks) | 100+ tutorials on how to use NLU on text datasets for various problems and from various sources like Twitter, Chinese News, Crypto News Headlines, Airline Traffic communication, and product review classifier training +| [Connect with us on Slack](https://join.slack.com/t/spark-nlp/shared_invite/zt-lutct9gm-kuUazcyFKhuGY3_0AMkxqA) | Problems, questions or suggestions? We have a very active and helpful community of 2000+ AI enthusiasts putting NLU, Spark NLP & Spark OCR to good use +| [Discussion Forum](https://github.com/JohnSnowLabs/spark-nlp/discussions) | Want a more in-depth discussion with the community? Post a thread in our discussion forum +| [John Snow Labs Medium](https://medium.com/spark-nlp) | Articles and tutorials on NLU, Spark NLP and Spark OCR +| [John Snow Labs Youtube](https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos) | Videos and tutorials on NLU, Spark NLP and Spark OCR +| [NLU Website](https://nlu.johnsnowlabs.com/) | The official NLU website +|[Github Issues](https://github.com/JohnSnowLabs/nlu/issues) | Report a bug + + + + + + +## Getting Started with NLU +To get your hands on the power of NLU, you just need to install it via pip and ensure Java 8 is installed and properly configured.
Check out the [Quickstart](https://nlu.johnsnowlabs.com/docs/en/install) for more information. +```bash +pip install nlu pyspark==3.0.2 +``` + +## Loading and predicting with any model in 1 line of Python +```python +import nlu +nlu.load('sentiment').predict('I love NLU! <3') +``` + +## Loading and predicting with multiple models in 1 line + +Get 6 different embeddings in 1 line and use them for downstream data science tasks! + +```python +nlu.load('bert elmo albert xlnet glove use').predict('I love NLU! <3') +``` + +## What kind of models does NLU provide? +NLU provides everything a data scientist might wish for in one line of code! + - 1000+ pre-trained models + - 100+ of the latest NLP word embeddings (BERT, ELMO, ALBERT, XLNET, GLOVE, BIOBERT, ELECTRA, COVIDBERT) and different variations of them + - 50+ of the latest NLP sentence embeddings (BERT, ELECTRA, USE) and different variations of them + - 100+ Classifiers (NER, POS, Emotion, Sarcasm, Questions, Spam) + - 300+ Supported Languages +- Summarize Text and Answer Questions with T5 +- Labeled and Unlabeled Dependency parsing + - Various Text Cleaning and Pre-Processing methods like Stemming, Lemmatizing, Normalizing, Filtering, Cleaning pipelines and more + + +## Classifiers trained on many different datasets +Choose the right tool for the right task! Whether you analyze movies or Twitter, NLU has the right model for you! + +- trec6 classifier +- trec10 classifier +- spam classifier +- fake news classifier +- emotion classifier +- cyberbullying classifier +- sarcasm classifier +- sentiment classifier for movies +- IMDB Movie Sentiment classifier +- Twitter sentiment classifier +- NER pretrained on OntoNotes +- NER trained on CoNLL +- Language classifier for 20 languages on the wiki 20 lang dataset. + +## Utilities for the Data Science NLU applications +Working with text data can sometimes be quite a dirty job.
NLU helps you keep your hands clean by providing components that take data-engineering-intensive tasks off your plate. + +- Datetime Matcher +- Pattern Matcher +- Chunk Matcher +- Phrases Matcher +- Stopword Cleaners +- Pattern Cleaners +- Slang Cleaner + +## Where can I see all models available in NLU? +For NLU models to load, see [the NLU Namespace](https://nlu.johnsnowlabs.com/docs/en/namespace) or the [John Snow Labs Modelshub](https://modelshub.johnsnowlabs.com/models) or go [straight to the source](https://github.com/JohnSnowLabs/nlu/blob/master/nlu/namespace.py). + +## Supported Data Types +- Pandas DataFrame and Series +- Spark DataFrames +- Modin with Ray backend +- Modin with Dask backend +- Numpy arrays +- Strings and lists of strings + +## Overview of all tutorials using the NLU-Library + +The following table lists all available tutorials using NLU. These tutorials will help you learn +how to use the NLU library for your own tasks. Some of the tasks NLU handles are +translating from any language to English, lemmatizing, tokenizing, cleaning text of +symbols or unwanted syntax, spellchecking, detecting entities, analyzing sentiment and many more!
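The "NLU Spells Used" column of the tutorial table often combines several component names in one space-separated string, such as `sentiment pos albert emotion`. A minimal sketch of building such a string follows. `build_spell` and `predict_with_spell` are hypothetical helpers written for this illustration and are not part of NLU; only the `nlu.load(...).predict(...)` call shape is taken from the examples in this description.

```python
from typing import Iterable


def build_spell(components: Iterable[str]) -> str:
    """Join component names into the space-separated string nlu.load() takes.

    Hypothetical helper for this sketch, not part of the NLU library.
    """
    names = [name.strip() for name in components if name and name.strip()]
    if not names:
        raise ValueError("at least one component name is required")
    return " ".join(names)


def predict_with_spell(components: Iterable[str], text: str):
    """Load the combined spell and predict on `text` (hypothetical wrapper).

    Assumes nlu, pyspark and a Java runtime are installed. The import is
    deferred so that build_spell() stays usable without them.
    """
    import nlu

    return nlu.load(build_spell(components)).predict(text)


# Component names taken from the "NLU Spells Used" column.
print(build_spell(["sentiment", "pos", "emotion"]))  # prints: sentiment pos emotion
```

With NLU installed, `predict_with_spell(["sentiment", "pos", "emotion"], "I love NLU! <3")` would run all three components in one call; the exact shape of the returned data depends on the NLU version.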
+ +{:.table2} + +| Tutorial Description | NLU Spells Used |Open In Colab | Dataset and Paper References | +|-----------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Albert Word Embeddings with NLU | `albert`, `sentiment pos albert emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_ALBERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Albert-Paper](https://arxiv.org/pdf/1909.11942.pdf), [Albert on Github](https://github.com/google-research/ALBERT), [Albert on TensorFlow](https://tfhub.dev/s?q=albert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Albert](https://medium.com/spark-nlp/1-line-to-albert-word-embeddings-with-nlu-in-python-1691bc048ed1), [Albert_Embedding](https://nlp.johnsnowlabs.com/2021/06/23/albert_base_uncased_en.html) | +| Bert Word Embeddings with NLU | `bert`, `pos sentiment emotion bert` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_BERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Bert](https://medium.com/spark-nlp/1-line-to-bert-word-embeddings-with-nlu-f50d2b08cddc), [Bert_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| BIOBERT Word Embeddings with NLU | `biobert` , `sentiment pos biobert emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_BIOBERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [BioBert-Paper](https://arxiv.org/abs/1901.08746), [Bert Github](https://github.com/google-research/bert) , [BERT: Deep Bidirectional Transformers](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Biobert](https://medium.com/spark-nlp/1-line-to-biobert-word-embeddings-with-nlu-in-python-7224ab52e131), [Biobert_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/biobert_pubmed_base_cased.html) | +| COVIDBERT Word Embeddings with NLU | `covidbert`, `sentiment covidbert pos` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_COVIDBERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [CovidBert-Paper](https://journals.flvc.org/FLAIRS/article/view/128488), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-CovidBert](https://medium.com/spark-nlp/1-line-to-covidbert-word-embeddings-with-nlu-in-python-e67396da2f78), 
[Covidbert_Embedding](https://nlp.johnsnowlabs.com/2020/08/27/covidbert_large_uncased.html) | +| ELECTRA Word Embeddings with NLU | `electra`, `sentiment pos en.embed.electra emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_ELECTRA_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Electra-Paper](https://arxiv.org/abs/2003.10555), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Electra](https://medium.com/spark-nlp/1-line-to-electra-word-embeddings-with-nlu-in-python-25f749bf3e92), [Electra_Embedding](https://nlp.johnsnowlabs.com/2020/08/27/electra_small_uncased.html) | +| ELMO Word Embeddings with NLU | `elmo`, `sentiment pos elmo emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_ELMo_word_embeddings_and_t-SNE_visualization_example.ipynb) | [ELMO-Paper](https://arxiv.org/abs/1802.05365), [Elmo-TensorFlow](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Elmo](https://medium.com/spark-nlp/1-python-line-for-elmo-word-embeddings-with-john-snow-labs-nlu-628e9b924a3), [Elmo-Embedding](https://nlp.johnsnowlabs.com/2020/01/31/elmo.html) | +| GLOVE Word Embeddings with NLU | `glove`, `sentiment pos glove emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_GLOVE_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Glove-Paper](https://nlp.stanford.edu/pubs/glove.pdf), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Glove](https://medium.com/spark-nlp/1-line-to-glove-word-embeddings-with-nlu-in-python-baed152fff4d) , [Glove_Embedding](https://nlp.johnsnowlabs.com/2020/01/22/glove_100d.html) | +| XLNET Word 
Embeddings with NLU | `xlnet`, `sentiment pos xlnet emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_XLNET_word_embeddings_and_t-SNE_visualization_example.ipynb) | [XLNet-Paper](https://arxiv.org/abs/1906.08237), [XLNet Github](https://github.com/zihangdai/xlnet), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-XLNet](https://medium.com/spark-nlp/1-line-to-xlnet-word-embeddings-with-nlu-in-python-5efc57d7ac79), [Xlnet_Embedding](https://nlp.johnsnowlabs.com/2021/07/07/xlnet_base_cased_en.html) | +| Multiple Word-Embeddings and Part of Speech in 1 Line of code | `bert electra elmo glove xlnet albert pos` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_multiple_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Bert-Paper](https://arxiv.org/pdf/1810.04805.pdf), [Albert-Paper](https://openreview.net/forum?id=H1eA7AEtvS), [ELMO-Paper](https://arxiv.org/abs/1802.05365), [Electra-Paper](https://arxiv.org/abs/2003.10555), [XLNet-Paper](https://arxiv.org/pdf/1906.08237.pdf), [Glove-Paper](https://nlp.stanford.edu/pubs/glove.pdf) | +| Normalizing with NLU | `norm` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_normalizer_example.ipynb) | - | +| Detect sentences with NLU | `sentence_detector.deep`, `sentence_detector.pragmatic`, `xx.sentence_detector` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_sentence_detection_example.ipynb) | [Sentence Detector](https://nlp.johnsnowlabs.com/2020/09/13/sentence_detector_dl_en.html) | +| Spellchecking with NLU | n.a. | n.a.
| - | +| Stemming with NLU | `en.stem`, `de.stem` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_stemmer_example.ipynb) | - | +| Stopwords removal with NLU | `stopwords` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_stopwords_removal_example.ipynb) | [Stopwords](https://nlp.johnsnowlabs.com/2020/07/14/stopwords_en.html) | +| Tokenization with NLU | `tokenize` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_tokenization_example.ipynb) | - | +| Normalization of Documents | `norm_document` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/document_normalizer_demo.ipynb) | - | +| Open and Closed book question answering with Google's T5 | `en.t5` , `answer_question` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_question_answering.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf), [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Overview of every task available with T5 | `en.t5.base` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_tasks_summarize_question_answering_and_more.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf), [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Translate between more than 200 Languages in 1 line of code with Marian Models | `tr.translate_to.fr`, `en.translate_to.fr` ,`fr.translate_to.he` , `en.translate_to.de` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/translation_demo.ipynb) | [Marian-Papers](https://marian-nmt.github.io/publications/), [Translation-Pipeline (En to Fr)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_fr_xx.html), [Translation-Pipeline (En to Ger)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_de_xx.html) | +| BERT Sentence Embeddings with NLU | `embed_sentence.bert`, `pos sentiment embed_sentence.bert` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/NLU_BERT_sentence_embeddings_and_t-SNE_visualization_Example.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| ELECTRA Sentence Embeddings with NLU | `embed_sentence.electra`, `pos sentiment embed_sentence.electra` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/NLU_ELECTRA_sentence_embeddings_and_t-SNE_visualization_example.ipynb) | [Electra Paper](https://arxiv.org/abs/2003.10555), [Sentence-Electra-Embedding](https://nlp.johnsnowlabs.com/2020/08/27/sent_electra_small_uncased.html) | +| USE Sentence Embeddings with NLU | `use`, `pos sentiment use emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/NLU_USE_sentence_embeddings_and_t-SNE_visualization_example.ipynb) | [Universal Sentence Encoder](https://arxiv.org/abs/1803.11175), [USE-TensorFlow](https://tfhub.dev/google/universal-sentence-encoder/2), [Sentence-USE-Embedding](https://nlp.johnsnowlabs.com/2020/04/17/tfhub_use_lg.html) | +| Sentence similarity with NLU using BERT embeddings | `embed_sentence.bert`, `use en.embed_sentence.electra embed_sentence.bert` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/sentence_similarirty_stack_overflow_questions.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| Part of Speech tagging with NLU | `pos` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/part_of_speechPOS/NLU_part_of_speech_ANC_example.ipynb) | [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html) | +| NER Aspect Airline ATIS | `en.ner.aspect.airline` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/NER_aspect_airline_ATIS.ipynb) | [NER Airline Model](https://nlp.johnsnowlabs.com/2021/01/25/nerdl_atis_840b_300d_en.html), [Atis intent Dataset](https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem) | +| NLU-NER_CONLL_2003_5class_example | `ner` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/NLU_ner_CONLL_2003_5class_example.ipynb) | [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html) | +| Named-entity recognition with Deep Learning ONTO NOTES | `ner.onto` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/NLU_ner_ONTO_18class_example.ipynb) | [NER_Onto](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html) | +| Aspect based NER-Sentiment-Restaurants | `en.ner.aspect_sentiment` 
|[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/aspect_based_ner_sentiment_restaurants.ipynb) | - | +| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Chinese | `zh.segment_words`, `zh.pos`, `zh.ner`, `zh.translate_to.en` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/multilingual/chinese_ner_pos_and_tokenization.ipynb) | [Translation-Pipeline (Zh to En)](https://nlp.johnsnowlabs.com/2021/06/04/translate_zh_en_xx.html) | +| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Japanese | `ja.segment_words`, `ja.pos`, `ja.ner`, `ja.translate_to.en` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/multilingual/japanese_ner_pos_and_tokenization.ipynb) | [Translation-Pipeline (Ja to En)](https://nlp.johnsnowlabs.com/2021/06/04/translate_ja_en_xx.html) | +| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Korean | `ko.segment_words`, `ko.pos`, `ko.ner.kmou.glove_840B_300d`, `ko.translate_to.en` | [](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/master/examples/colab/component_examples/multilingual/korean_ner_pos_and_tokenization.ipynb) | - | +| Date Matching | `match.datetime` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/matchers/NLU_date_matching.ipynb) | - | +| Typed Dependency Parsing with NLU | `dep` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/dependency_parsing/NLU_typed_dependency_parsing_example.ipynb) | [Dependency Parsing ](https://nlp.johnsnowlabs.com/2021/03/27/Typed_Dependency_Parsing_en.html) | +| Untyped Dependency Parsing with NLU | `dep.untyped` | 
[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/dependency_parsing/NLU_untyped_dependency_parsing_example.ipynb) | - | +| E2E Classification with NLU | `e2e` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/E2E_classification.ipynb) | [e2e-Model](https://nlp.johnsnowlabs.com/2021/01/21/multiclassifierdl_use_e2e_en.html) | +| Language Classification with NLU | `lang` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/NLU_language_classification.ipynb) | - | +| Cyberbullying Classification with NLU | `classify.cyberbullying` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/cyberbullying_cassification_for_racism_and_sexism.ipynb) | [Cyberbullying-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_cyberbullying_en.html) | +| Sentiment Classification with NLU for Twitter | `emotion` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/emotion_classification.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| Fake News Classification with NLU | `en.classify.fakenews` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/fake_news_classification.ipynb) | [Fakenews-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_fakenews_en.html) | +| Intent Classification with NLU | `en.classify.intent.airline` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/intent_classification_airlines_ATIS.ipynb) | [Airline-Intention classifier](https://nlp.johnsnowlabs.com/2021/01/25/classifierdl_use_atis_en.html), 
[Atis-Dataset](https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem?select=atis_intents.csv) | +| Question classification based on the TREC dataset | `en.classify.questions` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/question_classification.ipynb) | [Question-Classifier](https://nlp.johnsnowlabs.com/2021/01/08/classifierdl_use_trec50_en.html) | +| Sarcasm Classification with NLU | `en.classify.sarcasm` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/sarcasm_classification.ipynb) | [Sarcasm-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_sarcasm_en.html) | +| Sentiment Classification with NLU for Twitter | `en.sentiment.twitter` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/sentiment_classification.ipynb) | [Sentiment_Twitter-Classifier](https://nlp.johnsnowlabs.com/2021/01/18/sentimentdl_use_twitter_en.html) | +| Sentiment Classification with NLU for Movies | `en.sentiment.imdb` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/sentiment_classification_movies.ipynb) | [Sentiment_imdb-Classifier](https://nlp.johnsnowlabs.com/2021/01/15/analyze_sentimentdl_use_imdb_en.html) | +| Spam Classification with NLU | `en.classify.spam` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/spam_classification.ipynb) | [Spam-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_spam_en.html) | +| Toxic text classification with NLU | `en.classify.toxic` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/toxic_classification.ipynb) | 
[Toxic-Classifier](https://nlp.johnsnowlabs.com/2021/01/21/multiclassifierdl_use_toxic_en.html) | +| Unsupervised keyword extraction with NLU using the YAKE algorithm | `yake` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/unsupervised_keyword_extraction_with_YAKE.ipynb) | - | +| Grammatical Chunk Matching with NLU | `match.chunks` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/chunkers/NLU_chunking_example.ipynb) | - | +| Getting n-Grams with NLU | `ngram` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/chunkers/NLU_n-gram.ipynb) | - | +| Assertion | `en.med_ner.clinical en.assert`, `en.med_ner.clinical.biobert en.assert.biobert`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/assertion/assertion_overview.ipynb) | [Healthcare-NER](https://nlp.johnsnowlabs.com/2021/01/27/ner_clinical_en.html), [NER_Clinical-Classifier]( https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_biobert_en.html), [Toxic-Classifier](https://nlp.johnsnowlabs.com/2021/01/26/assertion_dl_biobert_en.html) | +| De-Identification Model overview | `med_ner.jsl.wip.clinical en.de_identify`, `med_ner.jsl.wip.clinical en.de_identify.clinical`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/de_identification/DeIdentification_model_overview.ipynb) | [NER-Clinical](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_clinical_en.html) | +| Drug Normalization | `norm_drugs` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare//drug_normalization/drug_norm.ipynb) | - | +| Entity Resolution | `med_ner.jsl.wip.clinical en.resolve_chunk.cpt_clinical`, `med_ner.jsl.wip.clinical en.resolve.icd10cm`, ... 
| [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/entity_resolution/entity_resolvers_overview.ipynb) | [NER-Clinical](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_clinical_en.html), [Entity-Resolver clinical](https://nlp.johnsnowlabs.com/2021/11/01/sbiobertresolve_icd10cm_augmented_billable_hcc_en.html) | +| Medical Named Entity Recognition | `en.med_ner.ade.clinical`, `en.med_ner.ade.clinical_bert`, `en.med_ner.anatomy`,`en.med_ner.anatomy.biobert`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/medical_named_entity_recognition/overview_medical_entity_recognizers.ipynb) | - | +| Relation Extraction | `en.med_ner.jsl.wip.clinical.greedy en.relation`, `en.med_ner.jsl.wip.clinical.greedy en.relation.bodypart.problem`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/relation_extraction/overview_relation.ipynb) | - | +| Visualization of NLP-Models with Spark-NLP and NLU | `ner`, `dep.typed`, `med_ner.jsl.wip.clinical resolve_chunk.rxnorm.in`, `med_ner.jsl.wip.clinical resolve.icd10cm` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/visualization/NLU_visualizations_tutorial.ipynb) | [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Dependency Parsing](https://nlp.johnsnowlabs.com/2021/03/27/Typed_Dependency_Parsing_en.html), [NER-Clinical](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_clinical_en.html), [Entity-Resolver (Chunks) clinical](https://nlp.johnsnowlabs.com/2021/04/16/chunkresolve_rxnorm_in_clinical_en.html) | +| NLU Covid-19 Emotion Showcase | `emotion` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_covid_emotion_showcase.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| NLU Covid-19 Sentiment 
Showcase | `sentiment` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_covid_sentiment_showcase.ipynb) | [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| NLU Airline Emotion Demo | `emotion` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_emotion_airline_demo.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| NLU Airline Sentiment Demo | `sentiment` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_sentiment_airline_demo.ipynb) | [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| Bengali NER Hindi Embeddings for 30 Models | `bn.ner`, `bn.lemma`, `ja.lemma`, `am.lemma`, `bh.lemma`,` en.ner.onto.bert.small_l2_128`,.. | [](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/master/examples/release_notebooks/NLU1.1.2_Bengali_ner_Hindi_Embeddings_30_new_models.ipynb) | [Bengali-NER](https://nlp.johnsnowlabs.com/2021/01/27/ner_jifs_glove_840B_300d_bn.html), [Bengali-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_bn.html), [Japanese-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/15/lemma_ja.html), [Amharic-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_am.html) | +| Entity Resolution | `med_ner.jsl.wip.clinical en.resolve.umls`, `med_ner.jsl.wip.clinical en.resolve.loinc`, `med_ner.jsl.wip.clinical en.resolve.loinc.biobert` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/release_notebooks/NLU_3_0_2_release_notebook.ipynb) | - | +| NLU 20 Minutes Crashcourse - the fast Data Science route | `spell`, `sentiment`, `pos`, `ner`, `yake`, `en.t5`, `emotion`, `answer_question`, `en.t5.base` ... 
| [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/AI4_2021/NLU_crash_course_AI4.ipynb) | [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html), [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) , [Spellchecker](https://nlp.johnsnowlabs.com/2021/03/28/spellcheck_dl_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| Chapter 0: Intro: 1-liners | `sentiment`, `pos`, `ner`, `bert`, `elmo`, `embed_sentence.bert` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/NYC_DC_NLP_MEETUP/0_liners_intro.ipynb) | [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html), [Elmo-Embedding](https://nlp.johnsnowlabs.com/2020/01/31/elmo.html), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| Chapter 1: NLU base-features with some classifiers on testdata | `emotion`, `yake`, `stem` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/NYC_DC_NLP_MEETUP/1_NLU_base_features_on_dataset_with_YAKE_Lemma_Stemm_classifiers_NER_.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| Chapter 2: Translation between 300+ languages with Marian | `tr.translate_to.en`, `en.translate_to.fr`, `en.translate_to.he` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/translation_demo.ipynb) | 
[Translation-Pipeline (En to Fr)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_fr_xx.html), [Translation (En to He)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_he_xx.html) | +| Chapter 3: Answer questions and summarize Texts with T5 | `answer_question`, `en.t5`, `en.t5.base` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_question_answering.ipynb) | [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Chapter 4: Overview of T5-Tasks | `en.t5.base` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_tasks_summarize_question_answering_and_more.ipynb) | [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Graph NLU 20 Minutes Crashcourse - State of the Art Text Mining for Graphs | `spell`, `sentiment`, `pos`, `ner`, `yake`, `emotion`, `med_ner.jsl.wip.clinical`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/graph_ai_summit/Healthcare_Graph_NLU_COVID_Tigergraph.ipynb) | [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html), [Spellchecker](https://nlp.johnsnowlabs.com/2021/03/28/spellcheck_dl_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| Healthcare with NLU | `med_ner.human_phenotype.gene_biobert`, `med_ner.ade_biobert`, `med_ner.anatomy`, `med_ner.bacterial_species`,... 
| [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/healthcare_webinar/NLU_healthcare_webinar.ipynb) | - | +| Part 0: Intro: 1-liners | `spell`, `sentiment`, `pos`, `ner`, `bert`, `elmo`, `embed_sentence.bert` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/0_liners_intro.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Bert](https://medium.com/spark-nlp/1-line-to-bert-word-embeddings-with-nlu-f50d2b08cddc) , [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Spellchecker](https://nlp.johnsnowlabs.com/2021/03/28/spellcheck_dl_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html), [Elmo-Embedding](https://nlp.johnsnowlabs.com/2020/01/31/elmo.html) , [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| Part 1: NLU base-features with some classifiers on Testdata | `yake`, `stem`, `ner`, `emotion` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/1_NLU_base_features_on_dataset_with_YAKE_Lemma_Stemm_classifiers_NER_.ipynb) | [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| Part 2: Translate between 200+ Languages in 1 line of code with Marian-Models | `en.translate_to.de`, `en.translate_to.fr`, `en.translate_to.he` | 
[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/2_multilingual_translation_with_marian_intro.ipynb) | [Translation-Pipeline (En to Fr)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_fr_xx.html), [Translation-Pipeline (En to Ger)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_de_xx.html), [Translation (En to He)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_he_xx.html) | +| Part 3: More Multilingual NLP-translations for Asian Languages with Marian | `en.translate_to.hi`, `en.translate_to.ru`, `en.translate_to.zh` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/3_more_multi_lingual_NLP_translation_Asian_languages_with_Marian.ipynb) | [Translation (En to Hi)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_hil_xx.html), [Translation (En to Ru)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_run_xx.html), [Translation (En to Zh)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_zh_xx.html) | +| Part 4: Unsupervise Chinese Keyword Extraction, NER and Translation from chinese news | `zh.translate_to.en`, `zh.segment_words`, `yake`, `zh.lemma`, `zh.ner` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/4_Unsupervise_Chinese_Keyword_Extraction_NER_and_Translation_from_Chinese_News.ipynb) | [Translation-Pipeline (Zh to En)](https://nlp.johnsnowlabs.com/2021/06/04/translate_zh_en_xx.html), [Zh-Lemmatizer](https://nlp.johnsnowlabs.com/2020/03/19/explain_document_dl.html) | +| Part 5: Multilingual sentiment classifier training for 100+ languages | `train.sentiment`, `xx.embed_sentence.labse train.sentiment` | n.a. 
| [Sentence_Embedding.Labse](https://nlp.johnsnowlabs.com/2020/09/23/labse.html) | +| Part 6: Question answering and text summarization with the T5 model | `answer_question`, `en.t5`, `en.t5.base` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/6_T5_question_answering_and_Text_summarization.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf) | +| Part 7: Overview of all tasks available with T5 | `en.t5.base` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/7_T5_SQUAD_GLUE_SUPER_GLUE_TASKS.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf) | +| Part 8: Overview of some of the multilingual models with State Of the Art accuracy (1-liner) | `bn.lemma`, `ja.lemma`, `am.lemma`, `bh.lemma`, `zh.segment_words`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/8_Multi_lingual_ner_pos_stop_words_senti) | [Bengali-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_bn.html), [Japanese-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/15/lemma_ja.html) , [Amharic-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_am.html) | +| Overview of some multilingual models available with State Of the Art accuracy (1-liner) | `bn.ner.cc_300d`, `ja.ner`, `zh.ner`, `th.ner.lst20.glove_840B_300D`, `ar.ner` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/python_web_conf/Multi_Linigual_examples.ipynb) | [Bengali-NER](https://nlp.johnsnowlabs.com/2021/01/27/ner_jifs_glove_840B_300d_bn.html) | +| NLU 20 Minutes Crashcourse - the fast Data Science route | - | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/python_web_conf/NLU_crashcourse_py_web.ipynb) | - | + + +# Need help? 
+- [Ping us on Slack](https://spark-nlp.slack.com/archives/C0196BQCDPY) +- [Post an issue on Github](https://github.com/JohnSnowLabs/nlu/issues) + +# Simple NLU Demos +- [NLU different output levels Demo](https://colab.research.google.com/drive/1C4N3wpC17YzZf9fXHDNAJ5JvSmfbq7zT?usp=sharing) + + +# Features in NLU Overview +* Tokenization +* Trainable Word Segmentation +* Stop Words Removal +* Token Normalizer +* Document Normalizer +* Stemmer +* Lemmatizer +* NGrams +* Regex Matching +* Text Matching +* Chunking +* Date Matcher +* Sentence Detector +* Deep Sentence Detector (Deep learning) +* Dependency parsing (Labeled/unlabeled) +* Part-of-speech tagging +* Sentiment Detection (ML models) +* Spell Checker (ML and DL models) +* Word Embeddings (GloVe and Word2Vec) +* BERT Embeddings (TF Hub models) +* ELMO Embeddings (TF Hub models) +* ALBERT Embeddings (TF Hub models) +* XLNet Embeddings +* Universal Sentence Encoder (TF Hub models) +* BERT Sentence Embeddings (42 TF Hub models) +* Sentence Embeddings +* Chunk Embeddings +* Unsupervised keywords extraction +* Language Detection & Identification (up to 375 languages) +* Multi-class Sentiment analysis (Deep learning) +* Multi-label Sentiment analysis (Deep learning) +* Multi-class Text Classification (Deep learning) +* Neural Machine Translation +* Text-To-Text Transfer Transformer (Google T5) +* Named entity recognition (Deep learning) +* Easy TensorFlow integration +* GPU Support +* Full integration with Spark ML functions +* 1000+ pre-trained models in 200+ languages! 
+* Multi-lingual NER models: Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Norwegian, Persian, Polish, Portuguese, Russian, Spanish, Swedish, Urdu and more +* Natural Language inference +* Coreference resolution +* Sentence Completion +* Word sense disambiguation +* Clinical entity recognition +* Clinical Entity Linking +* Entity normalization +* Assertion Status Detection +* De-identification +* Relation Extraction +* Clinical Entity Resolution + + +## Citation + +We have published a [paper](https://www.sciencedirect.com/science/article/pii/S2665963821000063) that you can cite for the NLU library: + +```bibtex +@article{KOCAMAN2021100058, + title = {Spark NLP: Natural language understanding at scale}, + journal = {Software Impacts}, + pages = {100058}, + year = {2021}, + issn = {2665-9638}, + doi = {https://doi.org/10.1016/j.simpa.2021.100058}, + url = {https://www.sciencedirect.com/science/article/pii/S2665963821000063}, + author = {Veysel Kocaman and David Talby}, + keywords = {Spark, Natural language processing, Deep learning, Tensorflow, Cluster}, + abstract = {Spark NLP is a Natural Language Processing (NLP) library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines that can scale easily in a distributed environment. Spark NLP comes with 1100+ pretrained pipelines and models in more than 192+ languages. It supports nearly all the NLP tasks and modules that can be used seamlessly in a cluster. 
Downloaded more than 2.7 million times and experiencing 9x growth since January 2020, Spark NLP is used by 54% of healthcare organizations as the world’s most widely used NLP library in the enterprise.} +} +``` + + +%package help +Summary: Development documents and examples for nlu +Provides: python3-nlu-doc +%description help + +# NLU: The Power of Spark NLP, the Simplicity of Python +John Snow Labs' NLU is a Python library for applying state-of-the-art text mining, directly on any dataframe, with a single line of code. +As a facade of the award-winning Spark NLP library, it comes with **1000+** pretrained models in **100+** languages, all production-grade, scalable, and trainable, with **everything in 1 line of code.** + + + +## NLU in Action +See how easy it is to use any of the **thousands** of models in 1 line of code. There are hundreds of [tutorials](https://nlu.johnsnowlabs.com/docs/en/notebooks) and [simple examples](https://github.com/JohnSnowLabs/nlu/tree/master/examples) you can copy and paste into your projects to achieve State Of The Art easily. +<img src="http://ckl-it.de/wp-content/uploads/2020/08/My-Video6.gif" width="1800" height="500"/> + +## NLU & Streamlit in Action +This 1 line lets you visualize and play with **1000+ SOTA NLU & NLP models** in **200** languages + +```shell +streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/01_dashboard.py +``` +<img src="https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/docs/assets/streamlit_docs_assets/gif/start.gif"> + +NLU provides tight and simple integration into Streamlit, which enables building powerful webapps in just 1 line of code which showcase the. +View the [NLU&Streamlit documentation](https://nlu.johnsnowlabs.com/docs/en/streamlit_viz_examples) or [NLU & Streamlit examples section](https://github.com/JohnSnowLabs/nlu/tree/master/examples/streamlit). 
+The entire GIF demo and + + +## All NLU resources overview +Take a look at our official NLU page: [https://nlu.johnsnowlabs.com/](https://nlu.johnsnowlabs.com/) for user documentation and examples + +| Resource | Description| +|-----------------------------------------------------------------------|-------------------------------------------| +| [Install NLU](https://nlu.johnsnowlabs.com/docs/en/install) | Just run `pip install nlu pyspark==3.0.2` +| [The NLU Namespace](https://nlu.johnsnowlabs.com/docs/en/namespace) | Find all the names of models you can load with `nlu.load()` +| [The `nlu.load(<Model>)` function](https://nlu.johnsnowlabs.com/docs/en/load_api) | Load any of the **1000+ models in 1 line** +| [The `nlu.load(<Model>).predict(data)` function](https://nlu.johnsnowlabs.com/docs/en/predict_api) | Predict on `Strings`, `List of Strings`, `Numpy Arrays`, `Pandas`, `Modin` and `Spark Dataframes` +| [The `nlu.load(<train.Model>).fit(data)` function](https://nlu.johnsnowlabs.com/docs/en/training) | Train a text classifier for `2-Class`, `N-Classes`, `Multi-N-Classes`, `Named-Entity-Recognition` or `Parts of Speech Tagging` +| [The `nlu.load(<Model>).viz(data)` function](https://nlu.johnsnowlabs.com/docs/en/viz_examples) | Visualize the results of `Word Embedding Similarity Matrix`, `Named Entity Recognizers`, `Dependency Trees & Parts of Speech`, `Entity Resolution`, `Entity Linking` or `Entity Status Assertion` +| [The `nlu.load(<Model>).viz_streamlit(data)` function](https://nlu.johnsnowlabs.com/docs/en/streamlit_viz_examples) | Display an interactive GUI which lets you explore and test every model and feature in NLU in 1 click. 
+| [General Concepts](https://nlu.johnsnowlabs.com/docs/en/concepts) | General concepts in NLU +| [The latest release notes](https://nlu.johnsnowlabs.com/docs/en/release_notes) | Newest features added to NLU +| [Overview NLU 1-liners examples](https://nlu.johnsnowlabs.com/docs/en/examples) | Most commonly used models and their results +| [Overview NLU 1-liners examples for healthcare models](https://nlu.johnsnowlabs.com/docs/en/examples_hc) | Most commonly used healthcare models and their results +| [Overview of all NLU tutorials and Examples](https://nlu.johnsnowlabs.com/docs/en/notebooks) | 100+ tutorials on how to use NLU on text datasets for various problems and from various sources like Twitter, Chinese News, Crypto News Headlines, Airline Traffic communication, Product review classifier training +| [Connect with us on Slack](https://join.slack.com/t/spark-nlp/shared_invite/zt-lutct9gm-kuUazcyFKhuGY3_0AMkxqA) | Problems, questions or suggestions? We have a very active and helpful community of 2000+ AI enthusiasts putting NLU, Spark NLP & Spark OCR to good use +| [Discussion Forum](https://github.com/JohnSnowLabs/spark-nlp/discussions) | Want a more in-depth discussion with the community? Post a thread in our discussion forum +| [John Snow Labs Medium](https://medium.com/spark-nlp) | Articles and tutorials on NLU, Spark NLP and Spark OCR +| [John Snow Labs Youtube](https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos) | Videos and tutorials on NLU, Spark NLP and Spark OCR +| [NLU Website](https://nlu.johnsnowlabs.com/) | The official NLU website +| [Github Issues](https://github.com/JohnSnowLabs/nlu/issues) | Report a bug + + + + + + +## Getting Started with NLU +To get your hands on the power of NLU, you just need to install it via pip and ensure Java 8 is installed and properly configured. 
Checkout [Quickstart for more infos](https://nlu.johnsnowlabs.com/docs/en/install) +```bash +pip install nlu pyspark==3.0.2 +``` + +## Loading and predicting with any model in 1 line python +```python +import nlu +nlu.load('sentiment').predict('I love NLU! <3') +``` + +## Loading and predicting with multiple models in 1 line + +Get 6 different embeddings in 1 line and use them for downstream data science tasks! + +```python +nlu.load('bert elmo albert xlnet glove use').predict('I love NLU! <3') +``` + +## What kind of models does NLU provide? +NLU provides everything a data scientist might want to wish for in one line of code! + - NLU provides everything a data scientist might want to wish for in one line of code! + - 1000 + pre-trained models + - 100+ of the latest NLP word embeddings ( BERT, ELMO, ALBERT, XLNET, GLOVE, BIOBERT, ELECTRA, COVIDBERT) and different variations of them + - 50+ of the latest NLP sentence embeddings ( BERT, ELECTRA, USE) and different variations of them + - 100+ Classifiers (NER, POS, Emotion, Sarcasm, Questions, Spam) + - 300+ Supported Languages +- Summarize Text and Answer Questions with T5 +- Labeled and Unlabeled Dependency parsing + - Various Text Cleaning and Pre-Processing methods like Stemming, Lemmatizing, Normalizing, Filtering, Cleaning pipelines and more + + +## Classifiers trained on many different datasets +Choose the right tool for the right task! Whether you analyze movies or twitter, NLU has the right model for you! + +- trec6 classifier +- trec10 classifier +- spam classifier +- fake news classifier +- emotion classifier +- cyberbullying classifier +- sarcasm classifier +- sentiment classifier for movies +- IMDB Movie Sentiment classifier +- Twitter sentiment classifier +- NER pretrained on ONTO notes +- NER trainer on CONLL +- Language classifier for 20 languages on the wiki 20 lang dataset. + +## Utilities for the Data Science NLU applications +Working with text data can sometimes be quite a dirty job. 
NLU helps you keep your hands clean by providing components that take away from data engineering intensive tasks. + +- Datetime Matcher +- Pattern Matcher +- Chunk Matcher +- Phrases Matcher +- Stopword Cleaners +- Pattern Cleaners +- Slang Cleaner + +## Where can I see all models available in NLU? +For NLU models to load, see [the NLU Namespace](https://nlu.johnsnowlabs.com/docs/en/namespace) or the [John Snow Labs Modelshub](https://modelshub.johnsnowlabs.com/models) or go [straight to the source](https://github.com/JohnSnowLabs/nlu/blob/master/nlu/namespace.py). + +## Supported Data Types +- Pandas DataFrame and Series +- Spark DataFrames +- Modin with Ray backend +- Modin with Dask backend +- Numpy arrays +- Strings and lists of strings + +## Overview of all tutorials using the NLU-Library + +In the following tabular, all available tutorials using NLU are listed. These tutorials will help you learn the +usage of the NLU library and on how to use it for your own tasks. Some of the tasks NLU does are +translating from any language to the english language, lemmatizing, tokenizing, cleaning text from +Symbol or unwanted syntax, spellchecking, detecting entities, analyzing sentiments and many more! 
+ +{:.table2} + +| Tutorial Description | NLU Spells Used |Open In Colab | Dataset and Paper References | +|-----------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Albert Word Embeddings with NLU | `albert`, `sentiment pos albert emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_ALBERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Albert-Paper](https://arxiv.org/pdf/1909.11942.pdf), [Albert on Github](https://github.com/google-research/ALBERT), [Albert on TensorFlow](https://tfhub.dev/s?q=albert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Albert](https://medium.com/spark-nlp/1-line-to-albert-word-embeddings-with-nlu-in-python-1691bc048ed1), [Albert_Embedding](https://nlp.johnsnowlabs.com/2021/06/23/albert_base_uncased_en.html) | +| Bert Word Embeddings with NLU | `bert`, `pos sentiment emotion bert` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_BERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Bert](https://medium.com/spark-nlp/1-line-to-bert-word-embeddings-with-nlu-f50d2b08cddc), [Bert_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| BIOBERT Word Embeddings with NLU | `biobert` , `sentiment pos biobert emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_BIOBERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [BioBert-Paper](https://arxiv.org/abs/1901.08746), [Bert Github](https://github.com/google-research/bert) , [BERT: Deep Bidirectional Transformers](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Biobert](https://medium.com/spark-nlp/1-line-to-biobert-word-embeddings-with-nlu-in-python-7224ab52e131), [Biobert_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/biobert_pubmed_base_cased.html) | +| COVIDBERT Word Embeddings with NLU | `covidbert`, `sentiment covidbert pos` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_COVIDBERT_word_embeddings_and_t-SNE_visualization_example.ipynb) | [CovidBert-Paper](https://journals.flvc.org/FLAIRS/article/view/128488), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-CovidBert](https://medium.com/spark-nlp/1-line-to-covidbert-word-embeddings-with-nlu-in-python-e67396da2f78), 
[Covidbert_Embedding](https://nlp.johnsnowlabs.com/2020/08/27/covidbert_large_uncased.html) | +| ELECTRA Word Embeddings with NLU | `electra`, `sentiment pos en.embed.electra emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_ELECTRA_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Electra-Paper](https://arxiv.org/abs/2003.10555), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Electra](https://medium.com/spark-nlp/1-line-to-electra-word-embeddings-with-nlu-in-python-25f749bf3e92), [Electra_Embedding](https://nlp.johnsnowlabs.com/2020/08/27/electra_small_uncased.html) | +| ELMO Word Embeddings with NLU | `elmo`, `sentiment pos elmo emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_ELMo_word_embeddings_and_t-SNE_visualization_example.ipynb) | [ELMO-Paper](https://arxiv.org/abs/1802.05365), [Elmo-TensorFlow](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Elmo](https://medium.com/spark-nlp/1-python-line-for-elmo-word-embeddings-with-john-snow-labs-nlu-628e9b924a3), [Elmo-Embedding](https://nlp.johnsnowlabs.com/2020/01/31/elmo.html) | +| GLOVE Word Embeddings with NLU | `glove`, `sentiment pos glove emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_GLOVE_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Glove-Paper](https://nlp.stanford.edu/pubs/glove.pdf), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Glove](https://medium.com/spark-nlp/1-line-to-glove-word-embeddings-with-nlu-in-python-baed152fff4d) , [Glove_Embedding](https://nlp.johnsnowlabs.com/2020/01/22/glove_100d.html) | +| XLNET Word 
Embeddings with NLU | `xlnet`, `sentiment pos xlnet emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_XLNET_word_embeddings_and_t-SNE_visualization_example.ipynb) | [XLNet-Paper](https://arxiv.org/abs/1906.08237), [XLNet Github](https://github.com/zihangdai/xlnet), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-XLNet](https://medium.com/spark-nlp/1-line-to-xlnet-word-embeddings-with-nlu-in-python-5efc57d7ac79), [Xlnet_Embedding](https://nlp.johnsnowlabs.com/2021/07/07/xlnet_base_cased_en.html) |
+| Multiple Word-Embeddings and Part of Speech in 1 Line of code | `bert electra elmo glove xlnet albert pos` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/word_embeddings/NLU_multiple_word_embeddings_and_t-SNE_visualization_example.ipynb) | [Bert-Paper](https://arxiv.org/pdf/1810.04805.pdf), [Albert-Paper](https://openreview.net/forum?id=H1eA7AEtvS), [ELMO-Paper](https://arxiv.org/abs/1802.05365), [Electra-Paper](https://arxiv.org/abs/2003.10555), [XLNet-Paper](https://arxiv.org/pdf/1906.08237.pdf), [Glove-Paper](https://nlp.stanford.edu/pubs/glove.pdf) |
+| Normalizing with NLU | `norm` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_normalizer_example.ipynb) | - |
+| Detect sentences with NLU | `sentence_detector.deep`, `sentence_detector.pragmatic`, `xx.sentence_detector` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_sentence_detection_example.ipynb) | [Sentence Detector](https://nlp.johnsnowlabs.com/2020/09/13/sentence_detector_dl_en.html) |
+| Spellchecking with NLU | n.a. | n.a. 
| - | +| Stemming with NLU | `en.stem`, `de.stem` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_stemmer_example.ipynb) | - | +| Stopwords removal with NLU | `stopwords` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_stopwords_removal_example.ipynb) | [Stopwords](https://nlp.johnsnowlabs.com/2020/07/14/stopwords_en.html) | +| Tokenization with NLU | `tokenize` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/NLU_tokenization_example.ipynb) | - | +| Normalization of Documents | `norm_document` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/text_pre_processing_and_cleaning/document_normalizer_demo.ipynb) | - | +| Open and Closed book question answering with Google's T5 | `en.t5` , `answer_question` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_question_answering.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf), [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Overview of every task available with T5 | `en.t5.base` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_tasks_summarize_question_answering_and_more.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf), [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) | +| Translate between more than 200 Languages in 1 line of code with Marian Models | `tr.translate_to.fr`, `en.translate_to.fr` ,`fr.translate_to.he` , `en.translate_to.de` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/translation_demo.ipynb) | [Marian-Papers](https://marian-nmt.github.io/publications/), [Translation-Pipeline (En to Fr)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_fr_xx.html), [Translation-Pipeline (En to Ger)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_de_xx.html) | +| BERT Sentence Embeddings with NLU | `embed_sentence.bert`, `pos sentiment embed_sentence.bert` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/NLU_BERT_sentence_embeddings_and_t-SNE_visualization_Example.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| ELECTRA Sentence Embeddings with NLU | `embed_sentence.electra`, `pos sentiment embed_sentence.electra` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/NLU_ELECTRA_sentence_embeddings_and_t-SNE_visualization_example.ipynb) | [Electra Paper](https://arxiv.org/abs/2003.10555), [Sentence-Electra-Embedding](https://nlp.johnsnowlabs.com/2020/08/27/sent_electra_small_uncased.html) | +| USE Sentence Embeddings with NLU | `use`, `pos sentiment use emotion` |[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/NLU_USE_sentence_embeddings_and_t-SNE_visualization_example.ipynb) | [Universal Sentence Encoder](https://arxiv.org/abs/1803.11175), [USE-TensorFlow](https://tfhub.dev/google/universal-sentence-encoder/2), [Sentence-USE-Embedding](https://nlp.johnsnowlabs.com/2020/04/17/tfhub_use_lg.html) | +| Sentence similarity with NLU using BERT embeddings | `embed_sentence.bert`, `use en.embed_sentence.electra embed_sentence.bert` 
|[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sentence_embeddings/sentence_similarirty_stack_overflow_questions.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) | +| Part of Speech tagging with NLU | `pos` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/part_of_speechPOS/NLU_part_of_speech_ANC_example.ipynb) | [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html) | +| NER Aspect Airline ATIS | `en.ner.aspect.airline` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/NER_aspect_airline_ATIS.ipynb) | [NER Airline Model](https://nlp.johnsnowlabs.com/2021/01/25/nerdl_atis_840b_300d_en.html), [Atis intent Dataset](https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem) | +| NLU-NER_CONLL_2003_5class_example | `ner` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/NLU_ner_CONLL_2003_5class_example.ipynb) | [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html) | +| Named-entity recognition with Deep Learning ONTO NOTES | `ner.onto` |[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/NLU_ner_ONTO_18class_example.ipynb) | [NER_Onto](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html) | +| Aspect based NER-Sentiment-Restaurants | `en.ner.aspect_sentiment` 
|[](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/tutorial_docs/examples/colab/component_examples/named_entity_recognition_NER/aspect_based_ner_sentiment_restaurants.ipynb) | - | +| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Chinese | `zh.segment_words`, `zh.pos`, `zh.ner`, `zh.translate_to.en` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/multilingual/chinese_ner_pos_and_tokenization.ipynb) | [Translation-Pipeline (Zh to En)](https://nlp.johnsnowlabs.com/2021/06/04/translate_zh_en_xx.html) | +| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Japanese | `ja.segment_words`, `ja.pos`, `ja.ner`, `ja.translate_to.en` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/multilingual/japanese_ner_pos_and_tokenization.ipynb) | [Translation-Pipeline (Ja to En)](https://nlp.johnsnowlabs.com/2021/06/04/translate_ja_en_xx.html) | +| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Korean | `ko.segment_words`, `ko.pos`, `ko.ner.kmou.glove_840B_300d`, `ko.translate_to.en` | [](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/master/examples/colab/component_examples/multilingual/korean_ner_pos_and_tokenization.ipynb) | - | +| Date Matching | `match.datetime` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/matchers/NLU_date_matching.ipynb) | - | +| Typed Dependency Parsing with NLU | `dep` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/dependency_parsing/NLU_typed_dependency_parsing_example.ipynb) | [Dependency Parsing ](https://nlp.johnsnowlabs.com/2021/03/27/Typed_Dependency_Parsing_en.html) | +| Untyped Dependency Parsing with NLU | `dep.untyped` | 
[](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/dependency_parsing/NLU_untyped_dependency_parsing_example.ipynb) | - |
+| E2E Classification with NLU | `e2e` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/E2E_classification.ipynb) | [e2e-Model](https://nlp.johnsnowlabs.com/2021/01/21/multiclassifierdl_use_e2e_en.html) |
+| Language Classification with NLU | `lang` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/NLU_language_classification.ipynb) | - |
+| Cyberbullying Classification with NLU | `classify.cyberbullying` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/cyberbullying_cassification_for_racism_and_sexism.ipynb) | [Cyberbullying-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_cyberbullying_en.html) |
+| Emotion Classification with NLU for Twitter | `emotion` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/emotion_classification.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) |
+| Fake News Classification with NLU | `en.classify.fakenews` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/fake_news_classification.ipynb) | [Fakenews-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_fakenews_en.html) |
+| Intent Classification with NLU | `en.classify.intent.airline` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/intent_classification_airlines_ATIS.ipynb) | [Airline-Intention classifier](https://nlp.johnsnowlabs.com/2021/01/25/classifierdl_use_atis_en.html), 
[Atis-Dataset](https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem?select=atis_intents.csv) | +| Question classification based on the TREC dataset | `en.classify.questions` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/question_classification.ipynb) | [Question-Classifier](https://nlp.johnsnowlabs.com/2021/01/08/classifierdl_use_trec50_en.html) | +| Sarcasm Classification with NLU | `en.classify.sarcasm` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/sarcasm_classification.ipynb) | [Sarcasm-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_sarcasm_en.html) | +| Sentiment Classification with NLU for Twitter | `en.sentiment.twitter` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/sentiment_classification.ipynb) | [Sentiment_Twitter-Classifier](https://nlp.johnsnowlabs.com/2021/01/18/sentimentdl_use_twitter_en.html) | +| Sentiment Classification with NLU for Movies | `en.sentiment.imdb` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/sentiment_classification_movies.ipynb) | [Sentiment_imdb-Classifier](https://nlp.johnsnowlabs.com/2021/01/15/analyze_sentimentdl_use_imdb_en.html) | +| Spam Classification with NLU | `en.classify.spam` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/spam_classification.ipynb) | [Spam-Classifier](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_spam_en.html) | +| Toxic text classification with NLU | `en.classify.toxic` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/toxic_classification.ipynb) | 
[Toxic-Classifier](https://nlp.johnsnowlabs.com/2021/01/21/multiclassifierdl_use_toxic_en.html) |
+| Unsupervised keyword extraction with NLU using the YAKE algorithm | `yake` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/unsupervised_keyword_extraction_with_YAKE.ipynb) | - |
+| Grammatical Chunk Matching with NLU | `match.chunks` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/chunkers/NLU_chunking_example.ipynb) | - |
+| Getting n-Grams with NLU | `ngram` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/chunkers/NLU_n-gram.ipynb) | - |
+| Assertion | `en.med_ner.clinical en.assert`, `en.med_ner.clinical.biobert en.assert.biobert`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/assertion/assertion_overview.ipynb) | [Healthcare-NER](https://nlp.johnsnowlabs.com/2021/01/27/ner_clinical_en.html), [NER_Clinical-Classifier](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_biobert_en.html), [Assertion-Model](https://nlp.johnsnowlabs.com/2021/01/26/assertion_dl_biobert_en.html) |
+| De-Identification Model overview | `med_ner.jsl.wip.clinical en.de_identify`, `med_ner.jsl.wip.clinical en.de_identify.clinical`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/de_identification/DeIdentification_model_overview.ipynb) | [NER-Clinical](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_clinical_en.html) |
+| Drug Normalization | `norm_drugs` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare//drug_normalization/drug_norm.ipynb) | - |
+| Entity Resolution | `med_ner.jsl.wip.clinical en.resolve_chunk.cpt_clinical`, `med_ner.jsl.wip.clinical en.resolve.icd10cm`, ... 
| [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/entity_resolution/entity_resolvers_overview.ipynb) | [NER-Clinical](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_clinical_en.html), [Entity-Resolver clinical](https://nlp.johnsnowlabs.com/2021/11/01/sbiobertresolve_icd10cm_augmented_billable_hcc_en.html) | +| Medical Named Entity Recognition | `en.med_ner.ade.clinical`, `en.med_ner.ade.clinical_bert`, `en.med_ner.anatomy`,`en.med_ner.anatomy.biobert`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/medical_named_entity_recognition/overview_medical_entity_recognizers.ipynb) | - | +| Relation Extraction | `en.med_ner.jsl.wip.clinical.greedy en.relation`, `en.med_ner.jsl.wip.clinical.greedy en.relation.bodypart.problem`, ... | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/relation_extraction/overview_relation.ipynb) | - | +| Visualization of NLP-Models with Spark-NLP and NLU | `ner`, `dep.typed`, `med_ner.jsl.wip.clinical resolve_chunk.rxnorm.in`, `med_ner.jsl.wip.clinical resolve.icd10cm` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/visualization/NLU_visualizations_tutorial.ipynb) | [NER-Piple](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Dependency Parsing](https://nlp.johnsnowlabs.com/2021/03/27/Typed_Dependency_Parsing_en.html), [NER-Clinical](https://nlp.johnsnowlabs.com/2021/11/03/ner_profiling_clinical_en.html), [Entity-Resolver (Chunks) clinical](https://nlp.johnsnowlabs.com/2021/04/16/chunkresolve_rxnorm_in_clinical_en.html) | +| NLU Covid-19 Emotion Showcase | `emotion` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_covid_emotion_showcase.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| NLU Covid-19 Sentiment 
Showcase | `sentiment` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_covid_sentiment_showcase.ipynb) | [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| NLU Airline Emotion Demo | `emotion` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_emotion_airline_demo.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) | +| NLU Airline Sentiment Demo | `sentiment` | [![Open In GitHub]()](https://github.com/JohnSnowLabs/nlu/blob/master/examples/kaggle/nlu_sentiment_airline_demo.ipynb) | [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) | +| Bengali NER Hindi Embeddings for 30 Models | `bn.ner`, `bn.lemma`, `ja.lemma`, `am.lemma`, `bh.lemma`,` en.ner.onto.bert.small_l2_128`,.. | [](https://colab.research.google.com/github/Murat-Karadag/nlu/blob/master/examples/release_notebooks/NLU1.1.2_Bengali_ner_Hindi_Embeddings_30_new_models.ipynb) | [Bengali-NER](https://nlp.johnsnowlabs.com/2021/01/27/ner_jifs_glove_840B_300d_bn.html), [Bengali-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_bn.html), [Japanese-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/15/lemma_ja.html), [Amharic-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_am.html) | +| Entity Resolution | `med_ner.jsl.wip.clinical en.resolve.umls`, `med_ner.jsl.wip.clinical en.resolve.loinc`, `med_ner.jsl.wip.clinical en.resolve.loinc.biobert` | [](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/release_notebooks/NLU_3_0_2_release_notebook.ipynb) | - | +| NLU 20 Minutes Crashcourse - the fast Data Science route | `spell`, `sentiment`, `pos`, `ner`, `yake`, `en.t5`, `emotion`, `answer_question`, `en.t5.base` ... 
| [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/AI4_2021/NLU_crash_course_AI4.ipynb) | [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html), [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Pipeline](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html), [Spellchecker](https://nlp.johnsnowlabs.com/2021/03/28/spellcheck_dl_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) |
+| Chapter 0: Intro: 1-liners | `sentiment`, `pos`, `ner`, `bert`, `elmo`, `embed_sentence.bert` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/NYC_DC_NLP_MEETUP/0_liners_intro.ipynb) | [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Pipeline](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html), [Elmo-Embedding](https://nlp.johnsnowlabs.com/2020/01/31/elmo.html), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) |
+| Chapter 1: NLU base features with some classifiers on test data | `emotion`, `yake`, `stem` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/NYC_DC_NLP_MEETUP/1_NLU_base_features_on_dataset_with_YAKE_Lemma_Stemm_classifiers_NER_.ipynb) | [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) |
+| Chapter 2: Translation between 300+ languages with Marian | `tr.translate_to.en`, `en.translate_to.fr`, `en.translate_to.he` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/translation_demo.ipynb) | [Translation-Pipeline (En to Fr)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_fr_xx.html), [Translation (En to He)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_he_xx.html) |
+| Chapter 3: Answer questions and summarize texts with T5 | `answer_question`, `en.t5`, `en.t5.base` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_question_answering.ipynb) | [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) |
+| Chapter 4: Overview of T5 tasks | `en.t5.base` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/sequence2sequence/T5_tasks_summarize_question_answering_and_more.ipynb) | [T5-Model](https://nlp.johnsnowlabs.com/2021/01/08/t5_base_en.html) |
+| Graph NLU 20-Minute Crash Course - State of the Art Text Mining for Graphs | `spell`, `sentiment`, `pos`, `ner`, `yake`, `emotion`, `med_ner.jsl.wip.clinical`, ... | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/graph_ai_summit/Healthcare_Graph_NLU_COVID_Tigergraph.ipynb) | [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Pipeline](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html), [Spellchecker](https://nlp.johnsnowlabs.com/2021/03/28/spellcheck_dl_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html) |
+| Healthcare with NLU | `med_ner.human_phenotype.gene_biobert`, `med_ner.ade_biobert`, `med_ner.anatomy`, `med_ner.bacterial_species`, ... | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/healthcare_webinar/NLU_healthcare_webinar.ipynb) | - |
+| Part 0: Intro: 1-liners | `spell`, `sentiment`, `pos`, `ner`, `bert`, `elmo`, `embed_sentence.bert` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/0_liners_intro.ipynb) | [Bert-Paper](https://arxiv.org/abs/1810.04805), [Bert Github](https://github.com/google-research/bert), [T-SNE](https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf?fbclid=IwA), [T-SNE-Bert](https://medium.com/spark-nlp/1-line-to-bert-word-embeddings-with-nlu-f50d2b08cddc), [Part of Speech](https://nlp.johnsnowlabs.com/2021/03/05/pos_anc.html), [NER-Pipeline](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Spellchecker](https://nlp.johnsnowlabs.com/2021/03/28/spellcheck_dl_en.html), [Sentiment classification](https://nlp.johnsnowlabs.com/2021/03/24/analyze_sentiment_en.html), [Elmo-Embedding](https://nlp.johnsnowlabs.com/2020/01/31/elmo.html), [Bert-Sentence_Embedding](https://nlp.johnsnowlabs.com/2020/08/25/sent_small_bert_L2_128.html) |
+| Part 1: NLU base features with some classifiers on test data | `yake`, `stem`, `ner`, `emotion` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/1_NLU_base_features_on_dataset_with_YAKE_Lemma_Stemm_classifiers_NER_.ipynb) | [NER-Pipeline](https://nlp.johnsnowlabs.com/2021/03/22/onto_recognize_entities_sm_en.html), [Emotion detection](https://nlp.johnsnowlabs.com/2021/01/09/classifierdl_use_emotion_en.html) |
+| Part 2: Translate between 200+ languages in 1 line of code with Marian models | `en.translate_to.de`, `en.translate_to.fr`, `en.translate_to.he` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/2_multilingual_translation_with_marian_intro.ipynb) | [Translation-Pipeline (En to Fr)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_fr_xx.html), [Translation-Pipeline (En to Ger)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_de_xx.html), [Translation (En to He)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_he_xx.html) |
+| Part 3: More multilingual NLP translations for Asian languages with Marian | `en.translate_to.hi`, `en.translate_to.ru`, `en.translate_to.zh` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/3_more_multi_lingual_NLP_translation_Asian_languages_with_Marian.ipynb) | [Translation (En to Hi)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_hil_xx.html), [Translation (En to Ru)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_run_xx.html), [Translation (En to Zh)](https://nlp.johnsnowlabs.com/2021/06/04/translate_en_zh_xx.html) |
+| Part 4: Unsupervised Chinese keyword extraction, NER and translation from Chinese news | `zh.translate_to.en`, `zh.segment_words`, `yake`, `zh.lemma`, `zh.ner` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/4_Unsupervise_Chinese_Keyword_Extraction_NER_and_Translation_from_Chinese_News.ipynb) | [Translation-Pipeline (Zh to En)](https://nlp.johnsnowlabs.com/2021/06/04/translate_zh_en_xx.html), [Zh-Lemmatizer](https://nlp.johnsnowlabs.com/2020/03/19/explain_document_dl.html) |
+| Part 5: Multilingual sentiment classifier training for 100+ languages | `train.sentiment`, `xx.embed_sentence.labse train.sentiment` | n.a. | [Sentence_Embedding.Labse](https://nlp.johnsnowlabs.com/2020/09/23/labse.html) |
+| Part 6: Question answering and text summarization with the T5 model | `answer_question`, `en.t5`, `en.t5.base` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/6_T5_question_answering_and_Text_summarization.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf) |
+| Part 7: Overview of all tasks available with T5 | `en.t5.base` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/7_T5_SQUAD_GLUE_SUPER_GLUE_TASKS.ipynb) | [T5-Paper](https://arxiv.org/pdf/1910.10683.pdf) |
+| Part 8: Overview of some of the multilingual models with state-of-the-art accuracy (1-liner) | `bn.lemma`, `ja.lemma`, `am.lemma`, `bh.lemma`, `zh.segment_words`, ... | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/multi_lingual_webinar/8_Multi_lingual_ner_pos_stop_words_senti) | [Bengali-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_bn.html), [Japanese-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/15/lemma_ja.html), [Amharic-Lemmatizer](https://nlp.johnsnowlabs.com/2021/01/20/lemma_am.html) |
+| Overview of some multilingual models available with state-of-the-art accuracy (1-liner) | `bn.ner.cc_300d`, `ja.ner`, `zh.ner`, `th.ner.lst20.glove_840B_300D`, `ar.ner` | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/python_web_conf/Multi_Linigual_examples.ipynb) | [Bengali-NER](https://nlp.johnsnowlabs.com/2021/01/27/ner_jifs_glove_840B_300d_bn.html) |
+| NLU 20-Minute Crash Course - the fast Data Science route | - | [Open In Colab](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/webinars_conferences_etc/python_web_conf/NLU_crashcourse_py_web.ipynb) | - |
+
+
+# Need help?
+- [Ping us on Slack](https://spark-nlp.slack.com/archives/C0196BQCDPY)
+- [Post an issue on GitHub](https://github.com/JohnSnowLabs/nlu/issues)
+
+# Simple NLU Demos
+- [NLU different output levels Demo](https://colab.research.google.com/drive/1C4N3wpC17YzZf9fXHDNAJ5JvSmfbq7zT?usp=sharing)
+
+
+# Features in NLU Overview
+* Tokenization
+* Trainable Word Segmentation
+* Stop Words Removal
+* Token Normalizer
+* Document Normalizer
+* Stemmer
+* Lemmatizer
+* NGrams
+* Regex Matching
+* Text Matching
+* Chunking
+* Date Matcher
+* Sentence Detector
+* Deep Sentence Detector (Deep learning)
+* Dependency parsing (Labeled/unlabeled)
+* Part-of-speech tagging
+* Sentiment Detection (ML models)
+* Spell Checker (ML and DL models)
+* Word Embeddings (GloVe and Word2Vec)
+* BERT Embeddings (TF Hub models)
+* ELMo Embeddings (TF Hub models)
+* ALBERT Embeddings (TF Hub models)
+* XLNet Embeddings
+* Universal Sentence Encoder (TF Hub models)
+* BERT Sentence Embeddings (42 TF Hub models)
+* Sentence Embeddings
+* Chunk Embeddings
+* Unsupervised keyword extraction
+* Language Detection & Identification (up to 375 languages)
+* Multi-class Sentiment analysis (Deep learning)
+* Multi-label Sentiment analysis (Deep learning)
+* Multi-class Text Classification (Deep learning)
+* Neural Machine Translation
+* Text-To-Text Transfer Transformer (Google T5)
+* Named entity recognition (Deep learning)
+* Easy TensorFlow integration
+* GPU Support
+* Full integration with Spark ML functions
+* 1000+ pre-trained models in 200+ languages!
+* Multi-lingual NER models: Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Norwegian, Persian, Polish, Portuguese, Russian, Spanish, Swedish, Urdu and more
+* Natural Language inference
+* Coreference resolution
+* Sentence Completion
+* Word sense disambiguation
+* Clinical entity recognition
+* Clinical Entity Linking
+* Entity normalization
+* Assertion Status Detection
+* De-identification
+* Relation Extraction
+* Clinical Entity Resolution
+
+
+## Citation
+
+We have published a [paper](https://www.sciencedirect.com/science/article/pii/S2665963821000063) that you can cite for the NLU library:
+
+```bibtex
+@article{KOCAMAN2021100058,
+    title = {Spark NLP: Natural language understanding at scale},
+    journal = {Software Impacts},
+    pages = {100058},
+    year = {2021},
+    issn = {2665-9638},
+    doi = {https://doi.org/10.1016/j.simpa.2021.100058},
+    url = {https://www.sciencedirect.com/science/article/pii/S2665963821000063},
+    author = {Veysel Kocaman and David Talby},
+    keywords = {Spark, Natural language processing, Deep learning, Tensorflow, Cluster},
+    abstract = {Spark NLP is a Natural Language Processing (NLP) library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines that can scale easily in a distributed environment. Spark NLP comes with 1100+ pretrained pipelines and models in more than 192+ languages. It supports nearly all the NLP tasks and modules that can be used seamlessly in a cluster. Downloaded more than 2.7 million times and experiencing 9x growth since January 2020, Spark NLP is used by 54% of healthcare organizations as the world’s most widely used NLP library in the enterprise.}
+}
+```
+
+
+%prep
+%autosetup -n nlu-4.2.0
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-nlu -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Apr 11 2023 Python_Bot <Python_Bot@openeuler.org> - 4.2.0-1
+- Package Spec generated