%global _empty_manifest_terminate_build 0
Name: python-detext
Version: 3.2.0
Release: 1
Summary: Deep text understanding framework for ranking, classification, and language generation
License: BSD-2-Clause
URL: https://pypi.org/project/detext/
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/be/b5/7830e3f839fe0de22356c5e59982dc03706588ccd63e32162a903252ff1b/detext-3.2.0.tar.gz
BuildArch: noarch
%description
**DeText** is a _De_ep **_Text_** understanding framework for NLP-related ranking, classification,
and language generation tasks. It leverages semantic matching using deep neural networks to
understand member intents in search and recommender systems.
As a general NLP framework, DeText can be applied to many tasks, including search & recommendation ranking,
multi-class classification, and query understanding.
More details can be found in the [LinkedIn Engineering blog post](https://engineering.linkedin.com/blog/2020/open-sourcing-detext).
## Highlights
* Natural language understanding powered by state-of-the-art deep neural networks
* automatic feature extraction with deep models
* end-to-end training
* interaction modeling between ranking sources and targets
* A general framework with great flexibility
* customizable model architectures
* multiple text encoder support
* multiple data input types support
* various optimization choices
* standard training flow control
* Easy-to-use
  * Configuration-based modeling (e.g., all configurations through the command line)
## General Model Architecture
DeText supports a general model architecture that contains the following components:
* **Word embedding layer**. It converts a sequence of words into a d-by-n matrix.
* **CNN/BERT/LSTM text encoding layer**. It takes the word embedding matrix as input and maps the text into a fixed-length embedding.
* **Interaction layer**. It generates deep features based on the text embeddings. Options include concatenation, cosine similarity, etc.
* **Wide & deep feature processing**. The traditional (wide) features are combined with the interaction (deep) features in a wide & deep fashion.
* **MLP layer**. The MLP layer combines the wide features and deep features into the final score.
All parameters are jointly updated to optimize the training objective. A minimal code sketch of this data flow follows the figure.
![](detext_model_architecture.png)
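To make the data flow concrete, here is a minimal Keras sketch of the components above. It is an illustration only, not DeText's actual implementation: the layer choices, names, and dimensions are assumptions for the example.
```python
import tensorflow as tf

VOCAB, DIM, SEQ_LEN, NUM_WIDE = 10000, 64, 16, 8  # illustrative sizes

def text_encoder():
    # Word embedding layer + CNN text encoding -> fixed-length embedding
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB, DIM),
        tf.keras.layers.Conv1D(128, kernel_size=3, activation="relu"),
        tf.keras.layers.GlobalMaxPooling1D(),
    ])

query_ids = tf.keras.Input(shape=(SEQ_LEN,), dtype=tf.int32)
doc_ids = tf.keras.Input(shape=(SEQ_LEN,), dtype=tf.int32)
wide_ftrs = tf.keras.Input(shape=(NUM_WIDE,))   # traditional (wide) features

encoder = text_encoder()                        # shared by query and document
q_emb, d_emb = encoder(query_ids), encoder(doc_ids)

# Interaction layer: cosine similarity between the two text embeddings
cosine = tf.keras.layers.Dot(axes=1, normalize=True)([q_emb, d_emb])

# Wide & deep feature processing + MLP scoring layer
deep_ftrs = tf.keras.layers.Concatenate()([q_emb, d_emb, cosine])
hidden = tf.keras.layers.Dense(64, activation="tanh")(
    tf.keras.layers.Concatenate()([wide_ftrs, deep_ftrs]))
score = tf.keras.layers.Dense(1)(hidden)        # relevance score per pair

model = tf.keras.Model([query_ids, doc_ids, wide_ftrs], score)
```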
### Model Configurables
DeText offers great flexibility for clients to build customized networks for their own use cases:
* **LTR/classification layer**: in-house LTR loss implementations or tf-ranking LTR losses; multi-class classification support.
* **MLP layer**: customizable number of layers and number of dimensions.
* **Interaction layer**: supports cosine similarity, Hadamard product, and concatenation (see the sketch after this list).
* **Text embedding layer**: supports CNN, BERT, and LSTM, with customizable parameters for filters, layers, dimensions, etc.
* **Continuous feature normalization**: element-wise rescaling, value normalization.
* **Categorical feature processing**: modeled as entity embeddings.
All of these can be customized via hyper-parameters in the DeText template. Note that tf-ranking is
supported in the DeText framework, i.e., users can choose the LTR losses and metrics defined in tf-ranking.
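For reference, the three interaction options named above can be written in a few TensorFlow ops; the tensors below are random stand-ins for the query/document embeddings, and the shapes are illustrative.
```python
import tensorflow as tf

q = tf.random.normal([4, 64])   # batch of query embeddings (stand-in)
d = tf.random.normal([4, 64])   # batch of document embeddings (stand-in)

# Cosine similarity: one scalar feature per (query, document) pair
cosine = tf.reduce_sum(
    tf.nn.l2_normalize(q, axis=-1) * tf.nn.l2_normalize(d, axis=-1),
    axis=-1, keepdims=True)                     # shape (4, 1)

# Hadamard product: element-wise interaction, same width as the embeddings
hadamard = q * d                                # shape (4, 64)

# Concatenation: keeps both embeddings intact for the MLP to combine
concat = tf.concat([q, d], axis=-1)             # shape (4, 128)
```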
## User Guide
### Dev environment setup
1. Create your virtualenv (Python version >= 3.7)
```shell script
VENV_DIR=<your-venv-dir>  # pick a directory for the virtual environment
python3 -m venv $VENV_DIR # Make sure your python version >= 3.7
source $VENV_DIR/bin/activate # Enter the virtual environment
```
1. Upgrade pip and setuptools version
```shell script
pip3 install -U pip
pip3 install -U setuptools
```
1. Run setup for DeText:
```shell script
pip install -e .
```
1. Verify the environment setup through pytest. If all tests pass, the environment is correctly set up.
```shell script
pytest
```
1. Refer to the training manual ([TRAINING.md](user_guide/TRAINING.md)) to find information about customizing the model:
* Training data format and preparation
* Key parameters to customize and train DeText models
* Detailed information about all DeText training parameters for full customization
1. Train a model using DeText (e.g., [run_detext.sh](test/resources/run_detext.sh))
### Tutorial
If you would like to quickly try out the library, refer to the following tutorial notebooks:
* [text_classification_demo.ipynb](user_guide/notebooks/text_classification_demo.ipynb)
This notebook shows how to use DeText to train a multi-class text classification model on a public query intent
classification dataset. Detailed instructions on data preparation, model training, and model inference are included.
* [autocompletion.ipynb](user_guide/notebooks/autocompletion.ipynb)
This notebook shows how to use DeText to train a text ranking model on a public query auto-completion dataset.
Detailed steps on data preparation, model training, and model inference are included.
## Citation
Please cite DeText in your publications if it helps your research:
```
@manual{guo-liu20-blog,
author = {Weiwei Guo and
Xiaowei Liu and
Sida Wang and
Huiji Gao and
Bo Long},
title = {DeText: A Deep NLP Framework for Intelligent Text Understanding},
url = {https://engineering.linkedin.com/blog/2020/open-sourcing-detext},
year = {2020}
}
@inproceedings{guo-gao19,
author = {Weiwei Guo and
Huiji Gao and
Jun Shi and
Bo Long},
title = {Deep Natural Language Processing for Search Systems},
booktitle = {ACM SIGIR 2019},
year = {2019}
}
@inproceedings{guo-gao19-kdd,
author = {Weiwei Guo and
Huiji Gao and
Jun Shi and
Bo Long and
Liang Zhang and
Bee-Chung Chen and
Deepak Agarwal},
title = {Deep Natural Language Processing for Search and Recommender Systems},
booktitle = {ACM SIGKDD 2019},
year = {2019}
}
@inproceedings{guo-liu20,
author = {Weiwei Guo and
Xiaowei Liu and
Sida Wang and
Huiji Gao and
Ananth Sankar and
Zimeng Yang and
Qi Guo and
Liang Zhang and
Bo Long and
Bee-Chung Chen and
Deepak Agarwal},
title = {DeText: A Deep Text Ranking Framework with BERT},
booktitle = {ACM CIKM 2020},
year = {2020}
}
@inproceedings{jia-long20,
author = {Jun Jia and
Bo Long and
Huiji Gao and
Weiwei Guo and
Jun Shi and
Xiaowei Liu and
Mingzhou Zhou and
Zhoutong Fu and
Sida Wang and
Sandeep Kumar Jha},
title = {Deep Learning for Search and Recommender Systems in Practice},
booktitle = {ACM SIGKDD 2020},
year = {2020}
}
@inproceedings{wang-guo20,
author = {Sida Wang and
Weiwei Guo and
Huiji Gao and
Bo Long},
title = {Efficient Neural Query Auto Completion},
booktitle = {ACM CIKM 2020},
year = {2020}
}
@article{liu-guo20,
author = {Xiaowei Liu and
Weiwei Guo and
Huiji Gao and
Bo Long},
title = {Deep Search Query Intent Understanding},
booktitle = {arXiv preprint arXiv:2008.06759},
year = {2020}
}
```
%package -n python3-detext
Summary: Deep text understanding framework for ranking, classification, and language generation
Provides: python-detext
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-detext
**DeText** is a _De_ep **_Text_** understanding framework for NLP-related ranking, classification,
and language generation tasks. It leverages semantic matching using deep neural networks to
understand member intents in search and recommender systems.
As a general NLP framework, DeText can be applied to many tasks, including search & recommendation ranking,
multi-class classification, and query understanding.
More details can be found in the [LinkedIn Engineering blog post](https://engineering.linkedin.com/blog/2020/open-sourcing-detext).
## Highlights
* Natural language understanding powered by state-of-the-art deep neural networks
* automatic feature extraction with deep models
* end-to-end training
* interaction modeling between ranking sources and targets
* A general framework with great flexibility
* customizable model architectures
* multiple text encoder support
* multiple data input types support
* various optimization choices
* standard training flow control
* Easy-to-use
  * Configuration-based modeling (e.g., all configurations through the command line)
## General Model Architecture
DeText supports a general model architecture that contains the following components:
* **Word embedding layer**. It converts a sequence of words into a d-by-n matrix.
* **CNN/BERT/LSTM text encoding layer**. It takes the word embedding matrix as input and maps the text into a fixed-length embedding.
* **Interaction layer**. It generates deep features based on the text embeddings. Options include concatenation, cosine similarity, etc.
* **Wide & deep feature processing**. The traditional (wide) features are combined with the interaction (deep) features in a wide & deep fashion.
* **MLP layer**. The MLP layer combines the wide features and deep features into the final score.
All parameters are jointly updated to optimize the training objective. A minimal code sketch of this data flow follows the figure.
![](detext_model_architecture.png)
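To make the data flow concrete, here is a minimal Keras sketch of the components above. It is an illustration only, not DeText's actual implementation: the layer choices, names, and dimensions are assumptions for the example.
```python
import tensorflow as tf

VOCAB, DIM, SEQ_LEN, NUM_WIDE = 10000, 64, 16, 8  # illustrative sizes

def text_encoder():
    # Word embedding layer + CNN text encoding -> fixed-length embedding
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB, DIM),
        tf.keras.layers.Conv1D(128, kernel_size=3, activation="relu"),
        tf.keras.layers.GlobalMaxPooling1D(),
    ])

query_ids = tf.keras.Input(shape=(SEQ_LEN,), dtype=tf.int32)
doc_ids = tf.keras.Input(shape=(SEQ_LEN,), dtype=tf.int32)
wide_ftrs = tf.keras.Input(shape=(NUM_WIDE,))   # traditional (wide) features

encoder = text_encoder()                        # shared by query and document
q_emb, d_emb = encoder(query_ids), encoder(doc_ids)

# Interaction layer: cosine similarity between the two text embeddings
cosine = tf.keras.layers.Dot(axes=1, normalize=True)([q_emb, d_emb])

# Wide & deep feature processing + MLP scoring layer
deep_ftrs = tf.keras.layers.Concatenate()([q_emb, d_emb, cosine])
hidden = tf.keras.layers.Dense(64, activation="tanh")(
    tf.keras.layers.Concatenate()([wide_ftrs, deep_ftrs]))
score = tf.keras.layers.Dense(1)(hidden)        # relevance score per pair

model = tf.keras.Model([query_ids, doc_ids, wide_ftrs], score)
```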
### Model Configurables
DeText offers great flexibility for clients to build customized networks for their own use cases:
* **LTR/classification layer**: in-house LTR loss implementations or tf-ranking LTR losses; multi-class classification support.
* **MLP layer**: customizable number of layers and number of dimensions.
* **Interaction layer**: supports cosine similarity, Hadamard product, and concatenation (see the sketch after this list).
* **Text embedding layer**: supports CNN, BERT, and LSTM, with customizable parameters for filters, layers, dimensions, etc.
* **Continuous feature normalization**: element-wise rescaling, value normalization.
* **Categorical feature processing**: modeled as entity embeddings.
All of these can be customized via hyper-parameters in the DeText template. Note that tf-ranking is
supported in the DeText framework, i.e., users can choose the LTR losses and metrics defined in tf-ranking.
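For reference, the three interaction options named above can be written in a few TensorFlow ops; the tensors below are random stand-ins for the query/document embeddings, and the shapes are illustrative.
```python
import tensorflow as tf

q = tf.random.normal([4, 64])   # batch of query embeddings (stand-in)
d = tf.random.normal([4, 64])   # batch of document embeddings (stand-in)

# Cosine similarity: one scalar feature per (query, document) pair
cosine = tf.reduce_sum(
    tf.nn.l2_normalize(q, axis=-1) * tf.nn.l2_normalize(d, axis=-1),
    axis=-1, keepdims=True)                     # shape (4, 1)

# Hadamard product: element-wise interaction, same width as the embeddings
hadamard = q * d                                # shape (4, 64)

# Concatenation: keeps both embeddings intact for the MLP to combine
concat = tf.concat([q, d], axis=-1)             # shape (4, 128)
```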
## User Guide
### Dev environment setup
1. Create your virtualenv (Python version >= 3.7)
```shell script
VENV_DIR=<your-venv-dir>  # pick a directory for the virtual environment
python3 -m venv $VENV_DIR # Make sure your python version >= 3.7
source $VENV_DIR/bin/activate # Enter the virtual environment
```
1. Upgrade pip and setuptools version
```shell script
pip3 install -U pip
pip3 install -U setuptools
```
1. Run setup for DeText:
```shell script
pip install -e .
```
1. Verify the environment setup through pytest. If all tests pass, the environment is correctly set up.
```shell script
pytest
```
1. Refer to the training manual ([TRAINING.md](user_guide/TRAINING.md)) to find information about customizing the model:
* Training data format and preparation
* Key parameters to customize and train DeText models
* Detailed information about all DeText training parameters for full customization
1. Train a model using DeText (e.g., [run_detext.sh](test/resources/run_detext.sh))
### Tutorial
If you would like to quickly try out the library, refer to the following tutorial notebooks:
* [text_classification_demo.ipynb](user_guide/notebooks/text_classification_demo.ipynb)
This notebook shows how to use DeText to train a multi-class text classification model on a public query intent
classification dataset. Detailed instructions on data preparation, model training, and model inference are included.
* [autocompletion.ipynb](user_guide/notebooks/autocompletion.ipynb)
This notebook shows how to use DeText to train a text ranking model on a public query auto-completion dataset.
Detailed steps on data preparation, model training, and model inference are included.
## Citation
Please cite DeText in your publications if it helps your research:
```
@manual{guo-liu20-blog,
author = {Weiwei Guo and
Xiaowei Liu and
Sida Wang and
Huiji Gao and
Bo Long},
title = {DeText: A Deep NLP Framework for Intelligent Text Understanding},
url = {https://engineering.linkedin.com/blog/2020/open-sourcing-detext},
year = {2020}
}
@inproceedings{guo-gao19,
author = {Weiwei Guo and
Huiji Gao and
Jun Shi and
Bo Long},
title = {Deep Natural Language Processing for Search Systems},
booktitle = {ACM SIGIR 2019},
year = {2019}
}
@inproceedings{guo-gao19-kdd,
author = {Weiwei Guo and
Huiji Gao and
Jun Shi and
Bo Long and
Liang Zhang and
Bee-Chung Chen and
Deepak Agarwal},
title = {Deep Natural Language Processing for Search and Recommender Systems},
booktitle = {ACM SIGKDD 2019},
year = {2019}
}
@inproceedings{guo-liu20,
author = {Weiwei Guo and
Xiaowei Liu and
Sida Wang and
Huiji Gao and
Ananth Sankar and
Zimeng Yang and
Qi Guo and
Liang Zhang and
Bo Long and
Bee-Chung Chen and
Deepak Agarwal},
title = {DeText: A Deep Text Ranking Framework with BERT},
booktitle = {ACM CIKM 2020},
year = {2020}
}
@inproceedings{jia-long20,
author = {Jun Jia and
Bo Long and
Huiji Gao and
Weiwei Guo and
Jun Shi and
Xiaowei Liu and
Mingzhou Zhou and
Zhoutong Fu and
Sida Wang and
Sandeep Kumar Jha},
title = {Deep Learning for Search and Recommender Systems in Practice},
booktitle = {ACM SIGKDD 2020},
year = {2020}
}
@inproceedings{wang-guo20,
author = {Sida Wang and
Weiwei Guo and
Huiji Gao and
Bo Long},
title = {Efficient Neural Query Auto Completion},
booktitle = {ACM CIKM 2020},
year = {2020}
}
@article{liu-guo20,
author = {Xiaowei Liu and
Weiwei Guo and
Huiji Gao and
Bo Long},
title = {Deep Search Query Intent Understanding},
booktitle = {arXiv preprint arXiv:2008.06759},
year = {2020}
}
```
%package help
Summary: Development documents and examples for detext
Provides: python3-detext-doc
%description help
**DeText** is a _De_ep **_Text_** understanding framework for NLP-related ranking, classification,
and language generation tasks. It leverages semantic matching using deep neural networks to
understand member intents in search and recommender systems.
As a general NLP framework, DeText can be applied to many tasks, including search & recommendation ranking,
multi-class classification, and query understanding.
More details can be found in the [LinkedIn Engineering blog post](https://engineering.linkedin.com/blog/2020/open-sourcing-detext).
## Highlights
* Natural language understanding powered by state-of-the-art deep neural networks
* automatic feature extraction with deep models
* end-to-end training
* interaction modeling between ranking sources and targets
* A general framework with great flexibility
* customizable model architectures
* multiple text encoder support
* multiple data input types support
* various optimization choices
* standard training flow control
* Easy-to-use
  * Configuration-based modeling (e.g., all configurations through the command line)
## General Model Architecture
DeText supports a general model architecture that contains the following components:
* **Word embedding layer**. It converts a sequence of words into a d-by-n matrix.
* **CNN/BERT/LSTM text encoding layer**. It takes the word embedding matrix as input and maps the text into a fixed-length embedding.
* **Interaction layer**. It generates deep features based on the text embeddings. Options include concatenation, cosine similarity, etc.
* **Wide & deep feature processing**. The traditional (wide) features are combined with the interaction (deep) features in a wide & deep fashion.
* **MLP layer**. The MLP layer combines the wide features and deep features into the final score.
All parameters are jointly updated to optimize the training objective. A minimal code sketch of this data flow follows the figure.
![](detext_model_architecture.png)
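To make the data flow concrete, here is a minimal Keras sketch of the components above. It is an illustration only, not DeText's actual implementation: the layer choices, names, and dimensions are assumptions for the example.
```python
import tensorflow as tf

VOCAB, DIM, SEQ_LEN, NUM_WIDE = 10000, 64, 16, 8  # illustrative sizes

def text_encoder():
    # Word embedding layer + CNN text encoding -> fixed-length embedding
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB, DIM),
        tf.keras.layers.Conv1D(128, kernel_size=3, activation="relu"),
        tf.keras.layers.GlobalMaxPooling1D(),
    ])

query_ids = tf.keras.Input(shape=(SEQ_LEN,), dtype=tf.int32)
doc_ids = tf.keras.Input(shape=(SEQ_LEN,), dtype=tf.int32)
wide_ftrs = tf.keras.Input(shape=(NUM_WIDE,))   # traditional (wide) features

encoder = text_encoder()                        # shared by query and document
q_emb, d_emb = encoder(query_ids), encoder(doc_ids)

# Interaction layer: cosine similarity between the two text embeddings
cosine = tf.keras.layers.Dot(axes=1, normalize=True)([q_emb, d_emb])

# Wide & deep feature processing + MLP scoring layer
deep_ftrs = tf.keras.layers.Concatenate()([q_emb, d_emb, cosine])
hidden = tf.keras.layers.Dense(64, activation="tanh")(
    tf.keras.layers.Concatenate()([wide_ftrs, deep_ftrs]))
score = tf.keras.layers.Dense(1)(hidden)        # relevance score per pair

model = tf.keras.Model([query_ids, doc_ids, wide_ftrs], score)
```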
### Model Configurables
DeText offers great flexibility for clients to build customized networks for their own use cases:
* **LTR/classification layer**: in-house LTR loss implementations or tf-ranking LTR losses; multi-class classification support.
* **MLP layer**: customizable number of layers and number of dimensions.
* **Interaction layer**: supports cosine similarity, Hadamard product, and concatenation (see the sketch after this list).
* **Text embedding layer**: supports CNN, BERT, and LSTM, with customizable parameters for filters, layers, dimensions, etc.
* **Continuous feature normalization**: element-wise rescaling, value normalization.
* **Categorical feature processing**: modeled as entity embeddings.
All of these can be customized via hyper-parameters in the DeText template. Note that tf-ranking is
supported in the DeText framework, i.e., users can choose the LTR losses and metrics defined in tf-ranking.
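For reference, the three interaction options named above can be written in a few TensorFlow ops; the tensors below are random stand-ins for the query/document embeddings, and the shapes are illustrative.
```python
import tensorflow as tf

q = tf.random.normal([4, 64])   # batch of query embeddings (stand-in)
d = tf.random.normal([4, 64])   # batch of document embeddings (stand-in)

# Cosine similarity: one scalar feature per (query, document) pair
cosine = tf.reduce_sum(
    tf.nn.l2_normalize(q, axis=-1) * tf.nn.l2_normalize(d, axis=-1),
    axis=-1, keepdims=True)                     # shape (4, 1)

# Hadamard product: element-wise interaction, same width as the embeddings
hadamard = q * d                                # shape (4, 64)

# Concatenation: keeps both embeddings intact for the MLP to combine
concat = tf.concat([q, d], axis=-1)             # shape (4, 128)
```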
## User Guide
### Dev environment setup
1. Create your virtualenv (Python version >= 3.7)
```shell script
VENV_DIR=<your-venv-dir>  # pick a directory for the virtual environment
python3 -m venv $VENV_DIR # Make sure your python version >= 3.7
source $VENV_DIR/bin/activate # Enter the virtual environment
```
1. Upgrade pip and setuptools version
```shell script
pip3 install -U pip
pip3 install -U setuptools
```
1. Run setup for DeText:
```shell script
pip install -e .
```
1. Verify the environment setup through pytest. If all tests pass, the environment is correctly set up.
```shell script
pytest
```
1. Refer to the training manual ([TRAINING.md](user_guide/TRAINING.md)) to find information about customizing the model:
* Training data format and preparation
* Key parameters to customize and train DeText models
* Detailed information about all DeText training parameters for full customization
1. Train a model using DeText (e.g., [run_detext.sh](test/resources/run_detext.sh))
### Tutorial
If you would like to quickly try out the library, refer to the following tutorial notebooks:
* [text_classification_demo.ipynb](user_guide/notebooks/text_classification_demo.ipynb)
This notebook shows how to use DeText to train a multi-class text classification model on a public query intent
classification dataset. Detailed instructions on data preparation, model training, and model inference are included.
* [autocompletion.ipynb](user_guide/notebooks/autocompletion.ipynb)
This notebook shows how to use DeText to train a text ranking model on a public query auto-completion dataset.
Detailed steps on data preparation, model training, and model inference are included.
## Citation
Please cite DeText in your publications if it helps your research:
```
@manual{guo-liu20-blog,
author = {Weiwei Guo and
Xiaowei Liu and
Sida Wang and
Huiji Gao and
Bo Long},
title = {DeText: A Deep NLP Framework for Intelligent Text Understanding},
url = {https://engineering.linkedin.com/blog/2020/open-sourcing-detext},
year = {2020}
}
@inproceedings{guo-gao19,
author = {Weiwei Guo and
Huiji Gao and
Jun Shi and
Bo Long},
title = {Deep Natural Language Processing for Search Systems},
booktitle = {ACM SIGIR 2019},
year = {2019}
}
@inproceedings{guo-gao19-kdd,
author = {Weiwei Guo and
Huiji Gao and
Jun Shi and
Bo Long and
Liang Zhang and
Bee-Chung Chen and
Deepak Agarwal},
title = {Deep Natural Language Processing for Search and Recommender Systems},
booktitle = {ACM SIGKDD 2019},
year = {2019}
}
@inproceedings{guo-liu20,
author = {Weiwei Guo and
Xiaowei Liu and
Sida Wang and
Huiji Gao and
Ananth Sankar and
Zimeng Yang and
Qi Guo and
Liang Zhang and
Bo Long and
Bee-Chung Chen and
Deepak Agarwal},
title = {DeText: A Deep Text Ranking Framework with BERT},
booktitle = {ACM CIKM 2020},
year = {2020}
}
@inproceedings{jia-long20,
author = {Jun Jia and
Bo Long and
Huiji Gao and
Weiwei Guo and
Jun Shi and
Xiaowei Liu and
Mingzhou Zhou and
Zhoutong Fu and
Sida Wang and
Sandeep Kumar Jha},
title = {Deep Learning for Search and Recommender Systems in Practice},
booktitle = {ACM SIGKDD 2020},
year = {2020}
}
@inproceedings{wang-guo20,
author = {Sida Wang and
Weiwei Guo and
Huiji Gao and
Bo Long},
title = {Efficient Neural Query Auto Completion},
booktitle = {ACM CIKM 2020},
year = {2020}
}
@article{liu-guo20,
author = {Xiaowei Liu and
Weiwei Guo and
Huiji Gao and
Bo Long},
title = {Deep Search Query Intent Understanding},
booktitle = {arXiv preprint arXiv:2008.06759},
year = {2020}
}
```
%prep
%autosetup -n detext-3.2.0
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-detext -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Fri May 05 2023 Python_Bot - 3.2.0-1
- Package Spec generated