%global _empty_manifest_terminate_build 0
Name: python-cstreet
Version: 1.1.2
Release: 1
Summary: CStreet is a Python script (Python 3.6, 3.7 or 3.8) for cell state trajectory construction using a k-nearest neighbors graph algorithm for time-series single-cell RNA-seq data.
License: MIT License
URL: https://github.com/yw-Hua/CStreet
Source0: https://mirrors.aliyun.com/pypi/web/packages/f9/cd/4c74db35a79bf79c48aedcba8a3b2a38561c93be8e8c41a199d1349f604c/cstreet-1.1.2.tar.gz
BuildArch: noarch
Requires: python3-pandas
Requires: python3-numpy
Requires: python3-scanpy
Requires: python3-anndata
Requires: python3-networkx
Requires: python3-colorama
Requires: python3-fa2
Requires: python3-retrying
Requires: python3-igraph
Requires: python3-louvain
%description
# CStreet: a computed Cell State trajectory inference method for time-series single-cell RNA-seq data
**| [Overview](#overview) | [Installation](#installation) | [Quick Start](#quick-start) | [Parameter Details](#parameter-details) | [Run CStreet in python interface](#run-cstreet-in-python-interface) | [Data&Code](#supplementary-datacode) | [Citation](#citation) |**

## Overview
CStreet is a cell state trajectory inference method for time-series single-cell RNA-seq data. It is written in Python (3.6, 3.7 or 3.8) and is available as a command line tool and a Python library to meet the needs of different users.
CStreet uses time-series information to construct the k-nearest neighbors connections within and between time points. Then, CStreet calculates the connection probabilities of cell states and visualizes the trajectory, which may include multiple starting points and paths, using a force-directed layout method.
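The idea of building k-nearest neighbors connections both within one time point and between two time points can be illustrated with a minimal sketch. This is not CStreet's implementation: the helper `knn_indices`, the brute-force Euclidean search and the toy 2-D coordinates are all assumptions made for illustration only.

```python
import math


def knn_indices(queries, references, k, exclude_self=False):
    """For each query point, return the indices of its k nearest reference points.

    Brute-force Euclidean search; hypothetical helper, not part of CStreet.
    Set exclude_self=True when queries and references are the same list.
    """
    neighbors = []
    for qi, q in enumerate(queries):
        dists = []
        for ri, r in enumerate(references):
            if exclude_self and ri == qi:
                continue
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, r)))
            dists.append((d, ri))
        dists.sort()
        neighbors.append([ri for _, ri in dists[:k]])
    return neighbors


# Toy 2-D embeddings of cells at two time points (made-up values).
t1 = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
t2 = [(0.2, 0.1), (5.1, 5.0), (5.2, 4.9)]

# Connections within time point t1 (a cell is not its own neighbor).
within_t1 = knn_indices(t1, t1, k=1, exclude_self=True)

# Connections from each t1 cell to its nearest cell at t2.
between_t1_t2 = knn_indices(t1, t2, k=1)

print(within_t1)
print(between_t1_t2)
```

In this toy setting, edges found within a time point and edges found between consecutive time points could then be counted per cell state to estimate how strongly two states are connected.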
## Installation
CStreet has been packaged and uploaded to [PyPI](https://pypi.org/project/cstreet/). Before installing, make sure that pip3, the package installer for Python, is available. If you do not have pip3 on your machine, click [here](https://pip.pypa.io/en/stable/) to install it. Then, CStreet and its dependencies can be installed using a single command.
$ pip3 install cstreet
You may experience errors when installing or updating packages because pip3 is changing the way it resolves dependency conflicts. If that happens, we recommend the command below.
$ pip3 install cstreet --use-feature=2020-resolver
You may need to use the command below to add the default installation path of pip3 to your system path.
$ export PATH=~/.local/bin:$PATH
Then, type the command below to check whether CStreet has been installed successfully. (On the first run, the initial configuration of CStreet may take about one minute.)
$ CStreet -h
## Quick Start
### Step 1. Install CStreet following the tutorial above.
### Step 2. Input preparation
CStreet utilizes time-series expression levels in tab-delimited format or AnnData format as input.
The cell state information can be generated using the built-in clustering function of CStreet or input by the user.
We provide a small [test dataset](https://github.com/yw-Hua/CStreet/tree/master/test/test_data) containing normalized expression levels and the state information at three time points.
Alternatively, type the commands below to download it.
$ wget https://github.com/yw-Hua/CStreet/raw/master/test/test_data.zip
$ unzip test_data.zip
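To make the tab-delimited input format more concrete, here is a small sketch that parses an expression matrix and, if cells are stored in columns (the situation the `-T` option describes), transposes it so that each row is one cell. The exact file layout is an assumption: a header row of IDs and a first column of IDs. The function `read_matrix` and the sample text are hypothetical and not part of CStreet.

```python
import csv
import io


def read_matrix(handle, cells_in_columns=False):
    """Parse a tab-delimited matrix into (cell_ids, gene_ids, values).

    Assumed layout (not confirmed by CStreet's documentation): the first row
    holds column IDs after an empty corner field, and the first field of each
    following row is the row ID. `values` is returned with one row per cell.
    """
    rows = list(csv.reader(handle, delimiter="\t"))
    column_ids = rows[0][1:]
    row_ids = [row[0] for row in rows[1:]]
    values = [[float(x) for x in row[1:]] for row in rows[1:]]
    if cells_in_columns:
        # Rows are genes and columns are cells: transpose to cells x genes.
        values = [list(column) for column in zip(*values)]
        return column_ids, row_ids, values
    return row_ids, column_ids, values


# Made-up two-gene, two-cell matrix with cells in columns.
sample_text = "\tcellA\tcellB\nGene1\t1\t0\nGene2\t3\t4\n"
cell_ids, gene_ids, values = read_matrix(io.StringIO(sample_text), cells_in_columns=True)
print(cell_ids, gene_ids, values)
```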
### Step 3. Operation of CStreet
Type the command below to run CStreet.
$ CStreet -i test_data/ExpressionMatrix_t1.txt test_data/ExpressionMatrix_t2.txt test_data/ExpressionMatrix_t3.txt -s test_data/CellStates_t1.txt test_data/CellStates_t2.txt test_data/CellStates_t3.txt -n test
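When there are many time points, the command line above can be assembled from a script. The sketch below only builds the argument list using the `-i`, `-s`, `-n` and `-o` options shown in the usage line; the helper `build_cstreet_command` is hypothetical and not part of CStreet.

```python
import shlex


def build_cstreet_command(expression_files, state_files=None,
                          project_name=None, output_dir=None):
    """Build the CStreet argument list (hypothetical helper)."""
    if not expression_files:
        raise ValueError("at least one expression matrix is required")
    command = ["CStreet", "-i"] + list(expression_files)
    if state_files:
        command += ["-s"] + list(state_files)
    if project_name:
        command += ["-n", project_name]
    if output_dir:
        command += ["-o", output_dir]
    return command


timepoints = ["t1", "t2", "t3"]
command = build_cstreet_command(
    ["test_data/ExpressionMatrix_%s.txt" % t for t in timepoints],
    ["test_data/CellStates_%s.txt" % t for t in timepoints],
    project_name="test",
)

# Shell-quoted form, for logging or copy and paste.
command_line = " ".join(shlex.quote(arg) for arg in command)
print(command_line)
```

The resulting list can be passed to `subprocess.run(command, check=True)` once CStreet is installed and on the system path.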
### Step 4. Output
The output directory has the tree structure shown below. It includes the clustered cell state information (if not provided by the user), the connection probabilities of the cell states and a visualization of the inferred cell state trajectories.
├── ProjectName_CStreetTopology.pdf
├── ProjectName_CellStatesConnCytoscape.txt
├── SupplementaryFigures
│   ├── ProjectName_t1_ForceDirectedLayout.pdf
│   ├── ProjectName_t1_LouvainUMAPClustering.pdf
│   ├── ProjectName_t1_LouvainUMAPClusteringCoordinates.txt
│   ├── ProjectName_t2_ForceDirectedLayout.pdf
│   ├── ProjectName_t2_LouvainUMAPClustering.pdf
│   ├── ProjectName_t2_LouvainUMAPClusteringCoordinates.txt
│   └── ...
└── SupplementaryResults
    ├── ProjectName_BetweenTimePoints_CellStatesConnectionProbabilities.txt
    ├── ProjectName_t1_FilteredCellInfo.txt
    ├── ProjectName_t1_FilteredGeneInfo.txt
    ├── ProjectName_t1_CellStatesConnectionProbabilities.txt
    ├── ProjectName_t2_FilteredCellInfo.txt
    ├── ProjectName_t2_FilteredGeneInfo.txt
    ├── ProjectName_t2_CellStatesConnectionProbabilities.txt
    └── ...

## Parameter Details
The parameter details of CStreet are as follows:
usage: CStreet [-h] <-i ExpMatrix1 ExpMatrix2 ExpMatrix3 ...> [-s CellStates1 CellStates2 CellStates3 ...] [-n ProjectName] [-o OutputDir] [options]
CStreet is a cell states trajectory inference method for time-series single-cell RNA-seq data.
optional arguments:
-h, --help show this help message and exit
-v, --version show version number of CStreet and exit
-i ExpMatrix1 ExpMatrix2 ExpMatrix3 ...
This indicates expression matrixes, which contain the
time-series expression level as read counts or
normalized values in tab-delimited format. For
example: '-i ExpressionMatrix_t1.txt
ExpressionMatrix_t2.txt ExpressionMatrix_t3.txt'
indicates the input of 3 timepoint expression
matrixes.
-T, --Input_CellinCol
This determines whether the expression level of one
cell is displayed as a column in the expression
matrixes. For example, '-T' indicates that the gene
expression levels of one cell are displayed in a
column and the expression levels of one gene across
all cells are displayed in a row.
-o OutputDir
The project directory, which is used to save all
output files. DEFAULT: "./".
-n ProjectName
The project name, which is used to generate output
file names as a prefix. DEFAULT: "CStreet"
-s CellStates1 CellStates2 CellStates3 ...
An optional parameter that uses CStreet's built-in
dimensionality reduction and clustering methods to
perform clustering without knowing the cell states
(DEFAULT) or accepts the user's input. The input files
should contain the cell state information sharing the
same cell ID in the expression matrixes in
tab-delimited format.
The number of principal components to use, which is
enabled only if the cell state information is not
provided. It can be set from 1 to the minimum
dimension size of the expression matrixes. DEFAULT:
The number of nearest neighbors to be searched, which
is enabled only if the cell state information is not
provided. It should be in the range of 2 to 100 in
general. DEFAULT: 15.
The resolution of the Louvain algorithm, which is
enabled only if the cell state information is not
provided. A higher resolution means that more and
smaller clusters are found. DEFAULT: 0.1.
--Switch_DeadCellFilter {ON,OFF}
The switch for the dead cell filter, which filters
cell outliers based on the count percent of
mitochondrial genes. DEFAULT: "ON".
The maximum count percent of mitochondrial genes
needed for a cell to pass filtering, which is enabled
only if '--Switch_DeadCellFilter' is "ON". DEFAULT:
--Switch_LowCellNumGeneFilter {ON,OFF}
The switch for the low cell number gene filter, which
retains genes that are expressed in at least a certain
number of cells. DEFAULT: "ON".
The minimum number of cells expressed that is required
for a gene to pass filtering, which is enabled only if
'--Switch_LowCellNumGeneFilter' is "ON". DEFAULT: 3.
--Switch_LowGeneCellsFilter {ON,OFF}
The switch for the low gene number cell filter, which
retains cells with at least a certain number of genes
expressed. DEFAULT: "ON".
The minimum number of genes expressed that is required
for a cell to pass filtering, which is enabled only if
'--Switch_LowGeneCellsFilter' is "ON". DEFAULT: 200.
--Switch_Normalize {ON,OFF}
The switch to enable total read count normalization
for each cell. DEFAULT: "ON".
The base of normalization, which is enabled only if
'--Switch_Normalize' is "ON". If the DEFAULT is chosen,
it is CPM normalization. DEFAULT: 1e6.
--Switch_LogTransform {ON,OFF}
The switch to logarithmize the expression matrix.
The distance metric to use for kNN. It can be set to
"euclidean" or "correlation". DEFAULT: "euclidean".
The number of principal components to use within a
timepoint. It can be set from 1 to the minimum
dimension size of the expression matrixes. DEFAULT:
The number of nearest neighbors to be searched within
a timepoint. It should be in the range of 2 to 100 in
general. DEFAULT: 15.
The number of principal components to use between
timepoints. It can be set from 1 to the minimum
dimension size of the expression matrixes. DEFAULT:
The number of nearest neighbors to be searched between
timepoints. It should be in the range of 2 to 100 in
general. DEFAULT: 15.
The number of repeated sampling trials used to
estimate the connection probability. DEFAULT: 5.
The seed of repeated sampling trials used to estimate
the connection probability. DEFAULT: 0.
Figure size of the result figure. For example:
'--FigureParam_FigureSize 6 7' indicates width is 6 and
height is 7. DEFAULT: 6 7.
The width of the label box in the result figure. For
example, '--FigureParam_LabelBoxWidth 10' means that
10 characters will be shown in the label box of the
resulting figure. DEFAULT: 10.
The minimum probability of edge for each cell state
that is displayed, which will only be used for
visualization. It can be a number between 0 and 1 or
"OTSU". When OTSU is selected, it will be
automatically estimated using OTSU's Method (OTSU,
1979). DEFAULT: "OTSU".
The maximum number of outdegrees for each cell state
that is displayed, which will only be used for
visualization. DEFAULT: 10.
The minimum cell number of each cell state that is
displayed, which will only be used for visualization.
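The preprocessing switches described above (the low gene number cell filter, total read count normalization and log transformation) can be sketched as follows. This is an illustration under stated assumptions, not CStreet's code: the function `preprocess` is hypothetical, the filter is applied before normalization, and the natural logarithm of (value + 1) is assumed for the log transform. The thresholds 200 and 1e6 are the defaults quoted in the help text.

```python
import math


def preprocess(counts, min_genes=200, scale=1e6, log_transform=False):
    """Filter and normalize a cells x genes count matrix (hypothetical helper).

    counts: list of per-cell lists of raw read counts.
    min_genes: minimum number of expressed genes for a cell to be kept.
    scale: normalization base; 1e6 corresponds to CPM.
    log_transform: apply log(1 + x) after normalization (assumed variant).
    """
    kept = []
    for cell in counts:
        expressed_genes = sum(1 for count in cell if count > 0)
        total = sum(cell)
        if expressed_genes < min_genes or total == 0:
            continue  # cell fails the low gene number filter
        normalized = [count / total * scale for count in cell]
        if log_transform:
            normalized = [math.log1p(value) for value in normalized]
        kept.append(normalized)
    return kept


# Tiny made-up matrix: 3 cells x 3 genes, with a lowered threshold for the demo.
raw_counts = [[1, 3, 0], [0, 0, 5], [2, 2, 4]]
cpm = preprocess(raw_counts, min_genes=2)
print(cpm)
```

With `min_genes=2`, the second cell is dropped because only one gene is expressed, and each remaining row sums to the normalization base.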
## Run CStreet in Python interface
CStreet can also be used step by step in the Python interface and easily integrated into custom scripts. [Here](https://nbviewer.jupyter.org/github/yw-Hua/CStreet/blob/master/tutorial.ipynb) is a tutorial written using [Jupyter Notebook](https://jupyter.org/).
## Supplementary Data&Code
All supplementary code and data used to run the comparisons, including cell type annotations, are available [here](https://1drv.ms/u/s!Atc1l0sjoiRyhaxOqZYIGkgOyI3Lhg?e=bfhR9u).
The [AnnData h5ad file format](https://anndata.readthedocs.io/en/stable/index.html) is used to store the single-cell expression data and additional annotations in our supplementary files.
## Citation
Zhao C, Xiu W, Hua Y, Zhang N, Zhang Y. CStreet: a computed Cell State trajectory inference method for time-series single-cell RNA sequencing data. Bioinformatics (Oxford, England). 2021 Jul. DOI: 10.1093/bioinformatics/btab488.
%package -n python3-cstreet
Summary: CStreet is a Python script (Python 3.6, 3.7 or 3.8) for cell state trajectory construction using a k-nearest neighbors graph algorithm for time-series single-cell RNA-seq data.
Provides: python-cstreet
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-cstreet
%package help
Summary: Development documents and examples for cstreet
Provides: python3-cstreet-doc
%description help
%prep
%autosetup -n cstreet-1.1.2

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-cstreet -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%changelog
* Tue Jun 20 2023 Python_Bot - 1.1.2-1
- Package Spec generated