%global _empty_manifest_terminate_build 0

Name:           python-pytolemaic
Version:        0.15.4
Release:        1
Summary:        Package for ML model analysis
License:        Free To Use But Restricted
URL:            https://github.com/broundal/Pytolemaic
Source0:        https://mirrors.aliyun.com/pypi/web/packages/c1/7c/697a8443642d286a28b81d4b0790a22f249faac224927ae533c92caf9ae3/pytolemaic-0.15.4.tar.gz
BuildArch:      noarch

Requires:       python3-numpy
Requires:       python3-pandas
Requires:       python3-scipy
Requires:       python3-scikit-learn
Requires:       python3-lime
Requires:       python3-matplotlib
Requires:       python3-xgboost

%description
![PyPI - Version](https://img.shields.io/pypi/v/pytolemaic?color=brightgreen) ![Unittests](https://github.com/Broundal/Pytolemaic/workflows/Unittests/badge.svg?branch=master) ![PyPI - License](https://img.shields.io/pypi/l/pytolemaic?color=orange)

# Pytolemaic

## What is Pytolemaic

The Pytolemaic package analyzes your model and dataset and measures their quality.

The package supports classification/regression models built for tabular datasets (e.g. sklearn's regressors/classifiers), and will also support custom-made models as long as they implement sklearn's API.

The package is intended for personal use and comes with no guarantees. I hope you will find it useful; I will appreciate any feedback you have.

## Install

```
pip install pytolemaic
```

## Basic usage

```
from pytolemaic import PyTrust

pytrust = PyTrust(model=estimator,
                  xtrain=xtrain, ytrain=ytrain,
                  xtest=xtest, ytest=ytest)

# run all analyses and print insights
insights = pytrust.insights()
print("\n".join(insights))

# run all analyses and plot graphs
pytrust.plot()
```

## Supported features

The package contains the following functionalities:

#### On model creation

- **Dataset analysis**: Analysis aimed at detecting issues in the dataset.
- **Sensitivity analysis**: Calculation of feature importance for a given model, either via sensitivity to feature value or via sensitivity to missing values.
- **Vulnerability report**: Based on the feature sensitivity, measures the model's vulnerability with respect to imputation, leakage, and number of features.
- **Scoring report**: Reports the model's score on the test data with a confidence interval.
- **Separation quality**: Measures whether the train and test data come from the same distribution.
- **Overall quality**: Provides overall quality measures.

#### On prediction

- **Prediction uncertainty**: Provides an uncertainty measure for a given model's prediction.
- **Lime explanation**: Provides a Lime explanation for a sample of interest.
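For a fully self-contained starting point, here is a minimal sketch on a toy sklearn dataset (the dataset and estimator choices are illustrative; any estimator implementing sklearn's fit/predict API should work):

```
# minimal end-to-end sketch on a toy tabular dataset
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from pytolemaic import PyTrust

x, y = load_breast_cancer(return_X_y=True)
xtrain, xtest, ytrain, ytest = train_test_split(x, y, random_state=0)

# fit any sklearn-API estimator on the train split
estimator = RandomForestClassifier(random_state=0).fit(xtrain, ytrain)

pytrust = PyTrust(model=estimator,
                  xtrain=xtrain, ytrain=ytrain,
                  xtest=xtest, ytest=ytest)

print("\n".join(pytrust.insights()))
```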
## How to use

Get started by calling the help() function (*recommended!*):

```
from pytolemaic import help

supported_keys = help()
# or
help(key='basic usage')
```

Example of performing all available analyses with PyTrust:

```
from pytolemaic import PyTrust

pytrust = PyTrust(model=estimator,
                  xtrain=xtrain, ytrain=ytrain,
                  xtest=xtest, ytest=ytest)

# run all analyses and get a list of distilled insights
insights = pytrust.insights()
print("\n".join(insights))

# run all analyses and plot all graphs
pytrust.plot()

# print all data gathered (`report` is any of the report objects listed below)
from pprint import pprint
pprint(report.to_dict(printable=True))
```

If you need only a specific analysis (usually to save time):

```
# dataset analysis report
dataset_analysis_report = pytrust.dataset_analysis_report

# feature sensitivity report
sensitivity_report = pytrust.sensitivity_report

# model's performance report
scoring_report = pytrust.scoring_report

# overall model's quality report
quality_report = pytrust.quality_report

# with any of the above reports:
print("\n".join(report.insights()))
report.plot()                           # plot graphs
pprint(report.to_dict(printable=True))  # export report as a dictionary
pprint(report.to_dict_meaning())        # print documentation for the above dictionary
```

Analysis of predictions:

```
# estimate the uncertainty of a prediction
uncertainty_model = pytrust.create_uncertainty_model()

# explain a prediction with Lime
lime_explainer = pytrust.create_lime_explainer()
```

Examples on toy datasets can be found in [/examples/toy_examples/](./examples/toy_examples/).
Examples on 'real-life' datasets can be found in [/examples/interesting_examples/](./examples/interesting_examples/).

## Output examples

#### Sensitivity analysis

- The sensitivity of each feature (\[0,1\], normalized to a sum of 1):

```
'sensitivity_report': {
    'method': 'shuffled',
    'sensitivities': {
        'age': 0.12395,
        'capital-gain': 0.06725,
        'capital-loss': 0.02465,
        'education': 0.05769,
        'education-num': 0.13765,
        ...
    }
}
```

- Simple statistics on the feature sensitivity:

```
'shuffle_stats_report': {
    'n_features': 14,
    'n_low': 1,
    'n_zero': 0
}
```

- Naive vulnerability scores (\[0,1\], lower is better):
  - **Imputation**: sensitivity of the model to missing values.
  - **Leakage**: chance of the model having leaking features.
  - **Too many features**: whether the model is based on too many features.

```
'vulnerability_report': {
    'imputation': 0.35,
    'leakage': 0,
    'too_many_features': 0.14
}
```

#### Scoring report

For each given metric, the score and its confidence interval (CI) are calculated:

```
'recall': {
    'ci_high': 0.763,
    'ci_low': 0.758,
    'ci_ratio': 0.023,
    'metric': 'recall',
    'value': 0.760
},
'auc': {
    'ci_high': 0.909,
    'ci_low': 0.907,
    'ci_ratio': 0.022,
    'metric': 'auc',
    'value': 0.907
}
```

Additionally, the separation quality measures the quality of the score based on the separability (AUC score) between the train and test sets. A value of 1 means the test set has the same distribution as the train set; a value of 0 means the test set has a fundamentally different distribution.

```
'separation_quality': 0.00611
```
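Train/test separability of this kind is commonly estimated via "adversarial validation". A minimal sketch of the idea (illustration only; not Pytolemaic's exact implementation, and the AUC-to-quality mapping shown is just one possible convention):

```
# Illustration: label train rows 0 and test rows 1, then check how well a
# classifier tells them apart. AUC near 0.5 means the sets are indistinguishable.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def separability_auc(xtrain, xtest):
    x = np.vstack([xtrain, xtest])
    y = np.concatenate([np.zeros(len(xtrain)), np.ones(len(xtest))])
    clf = RandomForestClassifier(random_state=0)
    return cross_val_score(clf, x, y, cv=5, scoring='roc_auc').mean()

auc = separability_auc(xtrain, xtest)
quality = 1.0 - 2.0 * abs(auc - 0.5)  # 1.0: indistinguishable, 0.0: fully separable
```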
Combining the above measures into a single number, we provide the overall quality of the model/dataset. A higher quality value (\[0,1\]) means a better dataset/model.

```
'quality_report': {
    'model_quality_report': {
        'model_loss': 0.24,
        'model_quality': 0.41,
        'vulnerability_report': {...}},
    'test_quality_report': {
        'ci_ratio': 0.023,
        'separation_quality': 0.006,
        'test_set_quality': 0},
    'train_quality_report': {
        'train_set_quality': 0.85,
        'vulnerability_report': {...}}
}
```

#### Prediction uncertainty

The module can be used to yield an uncertainty measure for predictions:

```
uncertainty_model = pytrust.create_uncertainty_model(method='confidence')
predictions = uncertainty_model.predict(x_pred)      # same as model.predict(x_pred)
uncertainty = uncertainty_model.uncertainty(x_pred)
```

#### Lime explanation

The module can be used to produce Lime explanations for a sample of interest:

```
explainer = pytrust.create_lime_explainer()
explainer.explain(sample)  # returns a dictionary
explainer.plot(sample)     # produces a graphical explanation
```

%package -n python3-pytolemaic
Summary:        Package for ML model analysis
Provides:       python-pytolemaic
BuildRequires:  python3-devel
BuildRequires:  python3-setuptools
BuildRequires:  python3-pip

%description -n python3-pytolemaic
![PyPI - Version](https://img.shields.io/pypi/v/pytolemaic?color=brightgreen) ![Unittests](https://github.com/Broundal/Pytolemaic/workflows/Unittests/badge.svg?branch=master) ![PyPI - License](https://img.shields.io/pypi/l/pytolemaic?color=orange)

# Pytolemaic

## What is Pytolemaic

The Pytolemaic package analyzes your model and dataset and measures their quality.

The package supports classification/regression models built for tabular datasets (e.g. sklearn's regressors/classifiers), and will also support custom-made models as long as they implement sklearn's API.

The package is intended for personal use and comes with no guarantees. I hope you will find it useful; I will appreciate any feedback you have.

## Install

```
pip install pytolemaic
```

## Basic usage

```
from pytolemaic import PyTrust

pytrust = PyTrust(model=estimator,
                  xtrain=xtrain, ytrain=ytrain,
                  xtest=xtest, ytest=ytest)

# run all analyses and print insights
insights = pytrust.insights()
print("\n".join(insights))

# run all analyses and plot graphs
pytrust.plot()
```

## Supported features

The package contains the following functionalities:

#### On model creation

- **Dataset analysis**: Analysis aimed at detecting issues in the dataset.
- **Sensitivity analysis**: Calculation of feature importance for a given model, either via sensitivity to feature value or via sensitivity to missing values.
- **Vulnerability report**: Based on the feature sensitivity, measures the model's vulnerability with respect to imputation, leakage, and number of features.
- **Scoring report**: Reports the model's score on the test data with a confidence interval.
- **Separation quality**: Measures whether the train and test data come from the same distribution.
- **Overall quality**: Provides overall quality measures.

#### On prediction

- **Prediction uncertainty**: Provides an uncertainty measure for a given model's prediction (see the sketch after this list).
- **Lime explanation**: Provides a Lime explanation for a sample of interest.
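As a quick preview of the prediction-time features, a typical flow looks like this, using only the API documented in the sections below (`pytrust` is assumed to be already constructed; `x_pred` stands for a batch of new samples):

```
# preview of the prediction-time workflow (detailed below)
uncertainty_model = pytrust.create_uncertainty_model()
predictions = uncertainty_model.predict(x_pred)      # same predictions as the base model
uncertainty = uncertainty_model.uncertainty(x_pred)  # per-sample uncertainty estimate

lime_explainer = pytrust.create_lime_explainer()
explanation = lime_explainer.explain(x_pred[0])      # dictionary explaining one sample
```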
## How to use

Get started by calling the help() function (*recommended!*):

```
from pytolemaic import help

supported_keys = help()
# or
help(key='basic usage')
```

Example of performing all available analyses with PyTrust:

```
from pytolemaic import PyTrust

pytrust = PyTrust(model=estimator,
                  xtrain=xtrain, ytrain=ytrain,
                  xtest=xtest, ytest=ytest)

# run all analyses and get a list of distilled insights
insights = pytrust.insights()
print("\n".join(insights))

# run all analyses and plot all graphs
pytrust.plot()

# print all data gathered (`report` is any of the report objects listed below)
from pprint import pprint
pprint(report.to_dict(printable=True))
```

If you need only a specific analysis (usually to save time):

```
# dataset analysis report
dataset_analysis_report = pytrust.dataset_analysis_report

# feature sensitivity report
sensitivity_report = pytrust.sensitivity_report

# model's performance report
scoring_report = pytrust.scoring_report

# overall model's quality report
quality_report = pytrust.quality_report

# with any of the above reports:
print("\n".join(report.insights()))
report.plot()                           # plot graphs
pprint(report.to_dict(printable=True))  # export report as a dictionary
pprint(report.to_dict_meaning())        # print documentation for the above dictionary
```

Analysis of predictions:

```
# estimate the uncertainty of a prediction
uncertainty_model = pytrust.create_uncertainty_model()

# explain a prediction with Lime
lime_explainer = pytrust.create_lime_explainer()
```

Examples on toy datasets can be found in [/examples/toy_examples/](./examples/toy_examples/).
Examples on 'real-life' datasets can be found in [/examples/interesting_examples/](./examples/interesting_examples/).

## Output examples

#### Sensitivity analysis

- The sensitivity of each feature (\[0,1\], normalized to a sum of 1):

```
'sensitivity_report': {
    'method': 'shuffled',
    'sensitivities': {
        'age': 0.12395,
        'capital-gain': 0.06725,
        'capital-loss': 0.02465,
        'education': 0.05769,
        'education-num': 0.13765,
        ...
    }
}
```

- Simple statistics on the feature sensitivity:

```
'shuffle_stats_report': {
    'n_features': 14,
    'n_low': 1,
    'n_zero': 0
}
```

- Naive vulnerability scores (\[0,1\], lower is better):
  - **Imputation**: sensitivity of the model to missing values.
  - **Leakage**: chance of the model having leaking features.
  - **Too many features**: whether the model is based on too many features.

```
'vulnerability_report': {
    'imputation': 0.35,
    'leakage': 0,
    'too_many_features': 0.14
}
```

#### Scoring report

For each given metric, the score and its confidence interval (CI) are calculated:

```
'recall': {
    'ci_high': 0.763,
    'ci_low': 0.758,
    'ci_ratio': 0.023,
    'metric': 'recall',
    'value': 0.760
},
'auc': {
    'ci_high': 0.909,
    'ci_low': 0.907,
    'ci_ratio': 0.022,
    'metric': 'auc',
    'value': 0.907
}
```
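One standard way to obtain such confidence intervals is bootstrapping over the test set. A minimal sketch (illustration only; Pytolemaic's exact method may differ, and `ci_ratio` is the package's own CI-width measure):

```
# Illustration: a 95% bootstrap confidence interval for recall
import numpy as np
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
y_pred = estimator.predict(xtest)

scores = []
for _ in range(1000):
    idx = rng.integers(0, len(ytest), size=len(ytest))  # resample with replacement
    scores.append(recall_score(ytest[idx], y_pred[idx]))

ci_low, ci_high = np.percentile(scores, [2.5, 97.5])
```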
Additionally, the separation quality measures the quality of the score based on the separability (AUC score) between the train and test sets. A value of 1 means the test set has the same distribution as the train set; a value of 0 means the test set has a fundamentally different distribution.

```
'separation_quality': 0.00611
```

Combining the above measures into a single number, we provide the overall quality of the model/dataset. A higher quality value (\[0,1\]) means a better dataset/model.

```
'quality_report': {
    'model_quality_report': {
        'model_loss': 0.24,
        'model_quality': 0.41,
        'vulnerability_report': {...}},
    'test_quality_report': {
        'ci_ratio': 0.023,
        'separation_quality': 0.006,
        'test_set_quality': 0},
    'train_quality_report': {
        'train_set_quality': 0.85,
        'vulnerability_report': {...}}
}
```

#### Prediction uncertainty

The module can be used to yield an uncertainty measure for predictions:

```
uncertainty_model = pytrust.create_uncertainty_model(method='confidence')
predictions = uncertainty_model.predict(x_pred)      # same as model.predict(x_pred)
uncertainty = uncertainty_model.uncertainty(x_pred)
```

#### Lime explanation

The module can be used to produce Lime explanations for a sample of interest:

```
explainer = pytrust.create_lime_explainer()
explainer.explain(sample)  # returns a dictionary
explainer.plot(sample)     # produces a graphical explanation
```

%package help
Summary:        Development documents and examples for pytolemaic
Provides:       python3-pytolemaic-doc

%description help
![PyPI - Version](https://img.shields.io/pypi/v/pytolemaic?color=brightgreen) ![Unittests](https://github.com/Broundal/Pytolemaic/workflows/Unittests/badge.svg?branch=master) ![PyPI - License](https://img.shields.io/pypi/l/pytolemaic?color=orange)

# Pytolemaic

## What is Pytolemaic

The Pytolemaic package analyzes your model and dataset and measures their quality.

The package supports classification/regression models built for tabular datasets (e.g. sklearn's regressors/classifiers), and will also support custom-made models as long as they implement sklearn's API.

The package is intended for personal use and comes with no guarantees. I hope you will find it useful; I will appreciate any feedback you have.

## Install

```
pip install pytolemaic
```

## Basic usage

```
from pytolemaic import PyTrust

pytrust = PyTrust(model=estimator,
                  xtrain=xtrain, ytrain=ytrain,
                  xtest=xtest, ytest=ytest)

# run all analyses and print insights
insights = pytrust.insights()
print("\n".join(insights))

# run all analyses and plot graphs
pytrust.plot()
```

## Supported features

The package contains the following functionalities:

#### On model creation

- **Dataset analysis**: Analysis aimed at detecting issues in the dataset.
- **Sensitivity analysis**: Calculation of feature importance for a given model, either via sensitivity to feature value or via sensitivity to missing values (sketched below).
- **Vulnerability report**: Based on the feature sensitivity, measures the model's vulnerability with respect to imputation, leakage, and number of features.
- **Scoring report**: Reports the model's score on the test data with a confidence interval.
- **Separation quality**: Measures whether the train and test data come from the same distribution.
- **Overall quality**: Provides overall quality measures.

#### On prediction

- **Prediction uncertainty**: Provides an uncertainty measure for a given model's prediction.
- **Lime explanation**: Provides a Lime explanation for a sample of interest.
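The 'shuffled' sensitivity mentioned in the model-creation list above is conceptually similar to permutation importance. A minimal sketch of the idea (illustration only; not the package's actual implementation):

```
# Illustration: shuffle one feature at a time, measure the score drop,
# then normalize so the sensitivities sum to 1.
import numpy as np

def shuffled_sensitivity(estimator, xtest, ytest, seed=0):
    rng = np.random.default_rng(seed)
    base = estimator.score(xtest, ytest)
    drops = []
    for j in range(xtest.shape[1]):
        x_shuffled = xtest.copy()
        x_shuffled[:, j] = rng.permutation(x_shuffled[:, j])  # break feature j
        drops.append(max(base - estimator.score(x_shuffled, ytest), 0.0))
    drops = np.asarray(drops)
    return drops / drops.sum() if drops.sum() > 0 else drops
```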
## How to use

Get started by calling the help() function (*recommended!*):

```
from pytolemaic import help

supported_keys = help()
# or
help(key='basic usage')
```

Example of performing all available analyses with PyTrust:

```
from pytolemaic import PyTrust

pytrust = PyTrust(model=estimator,
                  xtrain=xtrain, ytrain=ytrain,
                  xtest=xtest, ytest=ytest)

# run all analyses and get a list of distilled insights
insights = pytrust.insights()
print("\n".join(insights))

# run all analyses and plot all graphs
pytrust.plot()

# print all data gathered (`report` is any of the report objects listed below)
from pprint import pprint
pprint(report.to_dict(printable=True))
```

If you need only a specific analysis (usually to save time):

```
# dataset analysis report
dataset_analysis_report = pytrust.dataset_analysis_report

# feature sensitivity report
sensitivity_report = pytrust.sensitivity_report

# model's performance report
scoring_report = pytrust.scoring_report

# overall model's quality report
quality_report = pytrust.quality_report

# with any of the above reports:
print("\n".join(report.insights()))
report.plot()                           # plot graphs
pprint(report.to_dict(printable=True))  # export report as a dictionary
pprint(report.to_dict_meaning())        # print documentation for the above dictionary
```

Analysis of predictions:

```
# estimate the uncertainty of a prediction
uncertainty_model = pytrust.create_uncertainty_model()

# explain a prediction with Lime
lime_explainer = pytrust.create_lime_explainer()
```

Examples on toy datasets can be found in [/examples/toy_examples/](./examples/toy_examples/).
Examples on 'real-life' datasets can be found in [/examples/interesting_examples/](./examples/interesting_examples/).

## Output examples

#### Sensitivity analysis

- The sensitivity of each feature (\[0,1\], normalized to a sum of 1):

```
'sensitivity_report': {
    'method': 'shuffled',
    'sensitivities': {
        'age': 0.12395,
        'capital-gain': 0.06725,
        'capital-loss': 0.02465,
        'education': 0.05769,
        'education-num': 0.13765,
        ...
    }
}
```

- Simple statistics on the feature sensitivity:

```
'shuffle_stats_report': {
    'n_features': 14,
    'n_low': 1,
    'n_zero': 0
}
```

- Naive vulnerability scores (\[0,1\], lower is better):
  - **Imputation**: sensitivity of the model to missing values.
  - **Leakage**: chance of the model having leaking features.
  - **Too many features**: whether the model is based on too many features.

```
'vulnerability_report': {
    'imputation': 0.35,
    'leakage': 0,
    'too_many_features': 0.14
}
```
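To build intuition for the imputation score above: one naive way to probe a model's vulnerability to missing values is to impute each feature with a constant and measure the score drop. A minimal sketch (illustration only; not Pytolemaic's actual implementation):

```
# Illustration: replace each feature with its train-set mean and
# measure how much the model's test score degrades.
import numpy as np

def imputation_sensitivity(estimator, xtrain, xtest, ytest):
    base = estimator.score(xtest, ytest)
    drops = []
    for j in range(xtest.shape[1]):
        x_imputed = xtest.copy()
        x_imputed[:, j] = xtrain[:, j].mean()  # naive mean imputation
        drops.append(max(base - estimator.score(x_imputed, ytest), 0.0))
    return np.asarray(drops)
```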
#### Scoring report

For each given metric, the score and its confidence interval (CI) are calculated:

```
'recall': {
    'ci_high': 0.763,
    'ci_low': 0.758,
    'ci_ratio': 0.023,
    'metric': 'recall',
    'value': 0.760
},
'auc': {
    'ci_high': 0.909,
    'ci_low': 0.907,
    'ci_ratio': 0.022,
    'metric': 'auc',
    'value': 0.907
}
```

Additionally, the separation quality measures the quality of the score based on the separability (AUC score) between the train and test sets. A value of 1 means the test set has the same distribution as the train set; a value of 0 means the test set has a fundamentally different distribution.

```
'separation_quality': 0.00611
```

Combining the above measures into a single number, we provide the overall quality of the model/dataset. A higher quality value (\[0,1\]) means a better dataset/model.

```
'quality_report': {
    'model_quality_report': {
        'model_loss': 0.24,
        'model_quality': 0.41,
        'vulnerability_report': {...}},
    'test_quality_report': {
        'ci_ratio': 0.023,
        'separation_quality': 0.006,
        'test_set_quality': 0},
    'train_quality_report': {
        'train_set_quality': 0.85,
        'vulnerability_report': {...}}
}
```

#### Prediction uncertainty

The module can be used to yield an uncertainty measure for predictions:

```
uncertainty_model = pytrust.create_uncertainty_model(method='confidence')
predictions = uncertainty_model.predict(x_pred)      # same as model.predict(x_pred)
uncertainty = uncertainty_model.uncertainty(x_pred)
```

#### Lime explanation

The module can be used to produce Lime explanations for a sample of interest:

```
explainer = pytrust.create_lime_explainer()
explainer.explain(sample)  # returns a dictionary
explainer.plot(sample)     # produces a graphical explanation
```

%prep
%autosetup -n pytolemaic-0.15.4

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-pytolemaic -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Fri Jun 09 2023 Python_Bot <Python_Bot@openeuler.org> - 0.15.4-1
- Package Spec generated