%global _empty_manifest_terminate_build 0
Name: python-tensorflowonspark
Version: 2.2.5
Release: 1
Summary: Deep learning with TensorFlow on Apache Spark clusters
License: Apache 2.0
URL: https://github.com/yahoo/TensorFlowOnSpark
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/95/e3/e75b54b6e5d77b8a7dff55908655b5684d7b48cc04e7e66f359a37fb3202/tensorflowonspark-2.2.5.tar.gz
BuildArch: noarch

%description
# TensorFlowOnSpark

> _TensorFlowOnSpark brings scalable deep learning to Apache Hadoop and Apache Spark clusters._

[![Build Status](https://cd.screwdriver.cd/pipelines/6384/badge)](https://cd.screwdriver.cd/pipelines/6384)
[![Package](https://img.shields.io/badge/package-pypi-blue.svg)](https://pypi.org/project/tensorflowonspark/)
[![Downloads](https://img.shields.io/pypi/dm/tensorflowonspark.svg)](https://img.shields.io/pypi/dm/tensorflowonspark.svg)
[![Documentation](https://img.shields.io/badge/Documentation-latest-blue.svg)](https://yahoo.github.io/TensorFlowOnSpark/)

By combining salient features from the [TensorFlow](https://www.tensorflow.org) deep learning framework with [Apache Spark](http://spark.apache.org) and [Apache Hadoop](http://hadoop.apache.org), TensorFlowOnSpark enables distributed deep learning on a cluster of GPU and CPU servers. It enables both distributed TensorFlow training and inferencing on Spark clusters, with the goal of minimizing the code changes required to run existing TensorFlow programs on a shared grid. Its Spark-compatible API helps manage the TensorFlow cluster with the following steps:

1. **Startup** - launches the TensorFlow main function on the executors, along with listeners for data/control messages.
1. **Data ingestion**
   - **InputMode.TENSORFLOW** - leverages TensorFlow's built-in APIs to read data files directly from HDFS.
   - **InputMode.SPARK** - sends Spark RDD data to the TensorFlow nodes via a `TFNode.DataFeed` class.
     Note that we leverage the [Hadoop Input/Output Format](https://github.com/tensorflow/ecosystem/tree/master/hadoop) to access TFRecords on HDFS.
1. **Shutdown** - shuts down the TensorFlow workers and PS nodes on the executors.

## Table of Contents

- [Background](#background)
- [Install](#install)
- [Usage](#usage)
- [API](#api)
- [Contribute](#contribute)
- [License](#license)

## Background

TensorFlowOnSpark was developed by Yahoo for large-scale distributed deep learning on our Hadoop clusters in Yahoo's private cloud.

TensorFlowOnSpark provides some important benefits (see [our blog](https://developer.yahoo.com/blogs/157196317141/)) over alternative deep learning solutions.

* Easily migrate existing TensorFlow programs with <10 lines of code change.
* Support all TensorFlow functionalities: synchronous/asynchronous training, model/data parallelism, inferencing and TensorBoard.
* Server-to-server direct communication achieves faster learning when available.
* Allow datasets on HDFS and other sources to be pushed by Spark or pulled by TensorFlow.
* Easily integrate with your existing Spark data processing pipelines.
* Easily deploy on cloud or on-premises, on CPUs or GPUs.

## Install

TensorFlowOnSpark is provided as a pip package, which can be installed on single machines via:

```
# for tensorflow>=2.0.0
pip install tensorflowonspark

# for tensorflow<2.0.0
pip install tensorflowonspark==1.4.4
```

For distributed clusters, please see our [wiki site](../../wiki) for detailed documentation for specific environments, such as our getting started guides for [single-node Spark Standalone](https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_Standalone), [YARN clusters](../../wiki/GetStarted_YARN) and [AWS EC2](../../wiki/GetStarted_EC2).

Note: the Windows operating system is not currently supported due to [this issue](https://github.com/yahoo/TensorFlowOnSpark/issues/36).
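
As a small illustration of the version split in the install commands, the following Python sketch picks the pip requirement string for a given TensorFlow version. The helper name `tfos_requirement` is hypothetical and is not part of TensorFlowOnSpark; only the two requirement strings are taken from the install instructions.

```python
def tfos_requirement(tf_version: str) -> str:
    """Return the pip requirement matching a TensorFlow version string.

    Hypothetical helper for illustration; not part of TensorFlowOnSpark.
    """
    # Only the major version matters for the split described in the text.
    major = int(tf_version.split(".")[0])
    if major >= 2:
        # Unpinned package, as listed for tensorflow>=2.0.0.
        return "tensorflowonspark"
    # Pin listed for tensorflow<2.0.0.
    return "tensorflowonspark==1.4.4"


if __name__ == "__main__":
    for version in ("2.5.0", "1.15.5"):
        print(version, "->", tfos_requirement(version))
```

The returned string can be passed to `pip install` unchanged.
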

## Usage

To use TensorFlowOnSpark with an existing TensorFlow application, follow our [Conversion Guide](../../wiki/Conversion-Guide), which describes the required changes. Additionally, our [wiki site](../../wiki) has pointers to some presentations which provide an overview of the platform.

**Note: since TensorFlow 2.x breaks API compatibility with TensorFlow 1.x, the examples have been updated accordingly. If you are using TensorFlow 1.x, you will need to check out the `v1.4.4` tag for compatible examples and instructions.**

## API

[API Documentation](https://yahoo.github.io/TensorFlowOnSpark/) is automatically generated from the code.

## Contribute

Please join the [TensorFlowOnSpark user group](https://groups.google.com/forum/#!forum/TensorFlowOnSpark-users) for discussions and questions. If you have a question, please review our [FAQ](../../wiki/Frequently-Asked-Questions) before posting.

Contributions are always welcome. For more information, please see our [guide for getting involved](Contributing.md).

## License

The use and distribution terms for this software are covered by the Apache 2.0 license. See the [LICENSE](LICENSE) file for terms.

%package -n python3-tensorflowonspark
Summary: Deep learning with TensorFlow on Apache Spark clusters
Provides: python-tensorflowonspark
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip

%description -n python3-tensorflowonspark
# TensorFlowOnSpark

> _TensorFlowOnSpark brings scalable deep learning to Apache Hadoop and Apache Spark clusters._

[![Build Status](https://cd.screwdriver.cd/pipelines/6384/badge)](https://cd.screwdriver.cd/pipelines/6384)
[![Package](https://img.shields.io/badge/package-pypi-blue.svg)](https://pypi.org/project/tensorflowonspark/)
[![Downloads](https://img.shields.io/pypi/dm/tensorflowonspark.svg)](https://img.shields.io/pypi/dm/tensorflowonspark.svg)
[![Documentation](https://img.shields.io/badge/Documentation-latest-blue.svg)](https://yahoo.github.io/TensorFlowOnSpark/)

By combining salient features from the [TensorFlow](https://www.tensorflow.org) deep learning framework with [Apache Spark](http://spark.apache.org) and [Apache Hadoop](http://hadoop.apache.org), TensorFlowOnSpark enables distributed deep learning on a cluster of GPU and CPU servers. It enables both distributed TensorFlow training and inferencing on Spark clusters, with the goal of minimizing the code changes required to run existing TensorFlow programs on a shared grid. Its Spark-compatible API helps manage the TensorFlow cluster with the following steps:

1. **Startup** - launches the TensorFlow main function on the executors, along with listeners for data/control messages.
1. **Data ingestion**
   - **InputMode.TENSORFLOW** - leverages TensorFlow's built-in APIs to read data files directly from HDFS.
   - **InputMode.SPARK** - sends Spark RDD data to the TensorFlow nodes via a `TFNode.DataFeed` class. Note that we leverage the [Hadoop Input/Output Format](https://github.com/tensorflow/ecosystem/tree/master/hadoop) to access TFRecords on HDFS.
1. **Shutdown** - shuts down the TensorFlow workers and PS nodes on the executors.
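
The three steps above can be pictured with a toy, in-process model. This is only a sketch under stated assumptions: threads stand in for Spark executors, a queue stands in for the data/control channel, and every name in it (`ToyCluster`, `train`, `shutdown`, `count_records`) is hypothetical. It does not use or reproduce the TensorFlowOnSpark API, and it models only the push-style ingestion described for **InputMode.SPARK**.

```python
import queue
import threading

_STOP = object()  # sentinel used as the shutdown control message


class ToyCluster:
    """Toy stand-in for the three lifecycle steps; NOT the TensorFlowOnSpark API."""

    def __init__(self, num_workers, main_fn):
        self.feed = queue.Queue()     # data/control channel to the workers
        self.results = queue.Queue()  # one return value per worker
        # Startup: launch main_fn on every "executor" (a thread here).
        self.workers = [
            threading.Thread(target=self._run, args=(main_fn,))
            for _ in range(num_workers)
        ]
        for worker in self.workers:
            worker.start()

    def _run(self, main_fn):
        self.results.put(main_fn(self._batches()))

    def _batches(self):
        # Listener: yield partitions until the shutdown message arrives.
        while True:
            item = self.feed.get()
            if item is _STOP:
                return
            yield item

    def train(self, partitions):
        # Data ingestion (push style): the driver sends each partition
        # to whichever worker reads the channel next.
        for partition in partitions:
            self.feed.put(partition)

    def shutdown(self):
        # Shutdown: one stop message per worker, then wait for them to exit.
        for _ in self.workers:
            self.feed.put(_STOP)
        for worker in self.workers:
            worker.join()
        return sorted(self.results.get() for _ in self.workers)


def count_records(batches):
    """Example main function: counts the records this worker received."""
    return sum(len(batch) for batch in batches)


if __name__ == "__main__":
    cluster = ToyCluster(num_workers=2, main_fn=count_records)
    cluster.train([[1, 2, 3], [4, 5], [6]])
    print(sum(cluster.shutdown()))  # total records seen across all workers
```

Because the queue is first-in, first-out, every partition is consumed before any stop message, so no data is dropped at shutdown. How the partitions split between the two workers depends on thread scheduling.
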

## Table of Contents

- [Background](#background)
- [Install](#install)
- [Usage](#usage)
- [API](#api)
- [Contribute](#contribute)
- [License](#license)

## Background

TensorFlowOnSpark was developed by Yahoo for large-scale distributed deep learning on our Hadoop clusters in Yahoo's private cloud.

TensorFlowOnSpark provides some important benefits (see [our blog](https://developer.yahoo.com/blogs/157196317141/)) over alternative deep learning solutions.

* Easily migrate existing TensorFlow programs with <10 lines of code change.
* Support all TensorFlow functionalities: synchronous/asynchronous training, model/data parallelism, inferencing and TensorBoard.
* Server-to-server direct communication achieves faster learning when available.
* Allow datasets on HDFS and other sources to be pushed by Spark or pulled by TensorFlow.
* Easily integrate with your existing Spark data processing pipelines.
* Easily deploy on cloud or on-premises, on CPUs or GPUs.

## Install

TensorFlowOnSpark is provided as a pip package, which can be installed on single machines via:

```
# for tensorflow>=2.0.0
pip install tensorflowonspark

# for tensorflow<2.0.0
pip install tensorflowonspark==1.4.4
```

For distributed clusters, please see our [wiki site](../../wiki) for detailed documentation for specific environments, such as our getting started guides for [single-node Spark Standalone](https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_Standalone), [YARN clusters](../../wiki/GetStarted_YARN) and [AWS EC2](../../wiki/GetStarted_EC2).

Note: the Windows operating system is not currently supported due to [this issue](https://github.com/yahoo/TensorFlowOnSpark/issues/36).

## Usage

To use TensorFlowOnSpark with an existing TensorFlow application, follow our [Conversion Guide](../../wiki/Conversion-Guide), which describes the required changes. Additionally, our [wiki site](../../wiki) has pointers to some presentations which provide an overview of the platform.

**Note: since TensorFlow 2.x breaks API compatibility with TensorFlow 1.x, the examples have been updated accordingly. If you are using TensorFlow 1.x, you will need to check out the `v1.4.4` tag for compatible examples and instructions.**

## API

[API Documentation](https://yahoo.github.io/TensorFlowOnSpark/) is automatically generated from the code.

## Contribute

Please join the [TensorFlowOnSpark user group](https://groups.google.com/forum/#!forum/TensorFlowOnSpark-users) for discussions and questions. If you have a question, please review our [FAQ](../../wiki/Frequently-Asked-Questions) before posting.

Contributions are always welcome. For more information, please see our [guide for getting involved](Contributing.md).

## License

The use and distribution terms for this software are covered by the Apache 2.0 license. See the [LICENSE](LICENSE) file for terms.

%package help
Summary: Development documents and examples for tensorflowonspark
Provides: python3-tensorflowonspark-doc

%description help
# TensorFlowOnSpark

> _TensorFlowOnSpark brings scalable deep learning to Apache Hadoop and Apache Spark clusters._

[![Build Status](https://cd.screwdriver.cd/pipelines/6384/badge)](https://cd.screwdriver.cd/pipelines/6384)
[![Package](https://img.shields.io/badge/package-pypi-blue.svg)](https://pypi.org/project/tensorflowonspark/)
[![Downloads](https://img.shields.io/pypi/dm/tensorflowonspark.svg)](https://img.shields.io/pypi/dm/tensorflowonspark.svg)
[![Documentation](https://img.shields.io/badge/Documentation-latest-blue.svg)](https://yahoo.github.io/TensorFlowOnSpark/)

By combining salient features from the [TensorFlow](https://www.tensorflow.org) deep learning framework with [Apache Spark](http://spark.apache.org) and [Apache Hadoop](http://hadoop.apache.org), TensorFlowOnSpark enables distributed deep learning on a cluster of GPU and CPU servers.
It enables both distributed TensorFlow training and inferencing on Spark clusters, with the goal of minimizing the code changes required to run existing TensorFlow programs on a shared grid. Its Spark-compatible API helps manage the TensorFlow cluster with the following steps:

1. **Startup** - launches the TensorFlow main function on the executors, along with listeners for data/control messages.
1. **Data ingestion**
   - **InputMode.TENSORFLOW** - leverages TensorFlow's built-in APIs to read data files directly from HDFS.
   - **InputMode.SPARK** - sends Spark RDD data to the TensorFlow nodes via a `TFNode.DataFeed` class. Note that we leverage the [Hadoop Input/Output Format](https://github.com/tensorflow/ecosystem/tree/master/hadoop) to access TFRecords on HDFS.
1. **Shutdown** - shuts down the TensorFlow workers and PS nodes on the executors.

## Table of Contents

- [Background](#background)
- [Install](#install)
- [Usage](#usage)
- [API](#api)
- [Contribute](#contribute)
- [License](#license)

## Background

TensorFlowOnSpark was developed by Yahoo for large-scale distributed deep learning on our Hadoop clusters in Yahoo's private cloud.

TensorFlowOnSpark provides some important benefits (see [our blog](https://developer.yahoo.com/blogs/157196317141/)) over alternative deep learning solutions.

* Easily migrate existing TensorFlow programs with <10 lines of code change.
* Support all TensorFlow functionalities: synchronous/asynchronous training, model/data parallelism, inferencing and TensorBoard.
* Server-to-server direct communication achieves faster learning when available.
* Allow datasets on HDFS and other sources to be pushed by Spark or pulled by TensorFlow.
* Easily integrate with your existing Spark data processing pipelines.
* Easily deploy on cloud or on-premises, on CPUs or GPUs.

## Install

TensorFlowOnSpark is provided as a pip package, which can be installed on single machines via:

```
# for tensorflow>=2.0.0
pip install tensorflowonspark

# for tensorflow<2.0.0
pip install tensorflowonspark==1.4.4
```

For distributed clusters, please see our [wiki site](../../wiki) for detailed documentation for specific environments, such as our getting started guides for [single-node Spark Standalone](https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_Standalone), [YARN clusters](../../wiki/GetStarted_YARN) and [AWS EC2](../../wiki/GetStarted_EC2).

Note: the Windows operating system is not currently supported due to [this issue](https://github.com/yahoo/TensorFlowOnSpark/issues/36).

## Usage

To use TensorFlowOnSpark with an existing TensorFlow application, follow our [Conversion Guide](../../wiki/Conversion-Guide), which describes the required changes. Additionally, our [wiki site](../../wiki) has pointers to some presentations which provide an overview of the platform.

**Note: since TensorFlow 2.x breaks API compatibility with TensorFlow 1.x, the examples have been updated accordingly. If you are using TensorFlow 1.x, you will need to check out the `v1.4.4` tag for compatible examples and instructions.**

## API

[API Documentation](https://yahoo.github.io/TensorFlowOnSpark/) is automatically generated from the code.

## Contribute

Please join the [TensorFlowOnSpark user group](https://groups.google.com/forum/#!forum/TensorFlowOnSpark-users) for discussions and questions. If you have a question, please review our [FAQ](../../wiki/Frequently-Asked-Questions) before posting.

Contributions are always welcome. For more information, please see our [guide for getting involved](Contributing.md).

## License

The use and distribution terms for this software are covered by the Apache 2.0 license. See the [LICENSE](LICENSE) file for terms.

%prep
%autosetup -n tensorflowonspark-2.2.5

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-tensorflowonspark -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Fri Apr 21 2023 Python_Bot - 2.2.5-1
- Package Spec generated