%global _empty_manifest_terminate_build 0
Name: python-abstar
Version: 0.6.1
Release: 1
Summary: VDJ assignment and antibody sequence annotation. Scalable from a single sequence to billions of sequences.
License: MIT License
URL: https://www.github.com/briney/abstar
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/cc/f7/9513bb778b9745b6ae0c07434e87a3bcd3a6380dc5fedd7d96d83b05dcb5/abstar-0.6.1.tar.gz
BuildArch: noarch
Requires: python3-biopython
Requires: python3-celery
Requires: python3-dask[complete]
Requires: python3-matplotlib
Requires: python3-numpy
Requires: python3-pandas
Requires: python3-parasail
Requires: python3-pyarrow
Requires: python3-pymongo
Requires: python3-pytest
%description

# abstar
VDJ assignment and antibody sequence annotation. Scalable from a single sequence to billions of sequences.
- Source code: [github.com/briney/abstar](https://github.com/briney/abstar)
- Documentation: [abstar.readthedocs.org](http://abstar.readthedocs.org)
- Download: [pypi.python.org/pypi/abstar](https://pypi.python.org/pypi/abstar)
- Docker: [hub.docker.com/r/briney/abstar/](https://hub.docker.com/r/briney/abstar/)
### install
`pip install abstar`
### use
To run abstar on a single FASTA or FASTQ file:
`abstar -i <input-file> -o <output-directory> -t <temp-directory>`
To iteratively run abstar on all files in an input directory:
`abstar -i <input-directory> -o <output-directory> -t <temp-directory>`
To run abstar using the included test data as input:
`abstar -o <output-directory> -t <temp-directory> --use-test-data`
When using the abstar test data, note that although the test data file contains 1,000 sequences, one of the test sequences is not a valid antibody recombination. Only 999 sequences should be processed successfully.
When using BaseSpace as the input data source, you can optionally provide all of the required directories:
`abstar -i <input-directory> -o <output-directory> -t <temp-directory> -b`
Or you can simply provide a single project directory, and all required directories will be created in the project directory:
`abstar -p <project_directory> -b`
### additional options
`-l LOG_LOCATION, --log LOG_LOCATION` Change the log directory location. Default is the parent directory of `<output_directory>`.
`-m, --merge` Input directory should contain paired FASTQ (or gzipped FASTQ) files. Paired files will be merged with PANDAseq prior to processing with abstar. Note that when using the BaseSpace option (`-b, --basespace`), this option is implied.
`-b, --basespace` Download a sequencing run from BaseSpace, which is Illumina's cloud storage environment. Since Illumina sequencers produce paired-end reads, `--merge` is implied.
`-u N, --uaid N` Sequences contain a unique antibody ID (UAID, or molecular barcode) of length N. The UAID is parsed from the beginning of each input sequence and added to the JSON output. A negative value causes the UAID to be parsed from the end of the sequence instead.
`-s SPECIES, --species SPECIES` Select the species from which the input sequences are derived. Supported options are 'human', 'mouse', and 'macaque'. Default is 'human'.
`-c, --cluster` Runs abstar in distributed mode on a Celery cluster.
`-h, --help` Prints detailed information about all runtime options.
`-D, --debug` Enables much more verbose logging.
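The sign convention of the `-u N, --uaid N` option described above can be illustrated with a small standalone sketch. The `split_uaid` helper is hypothetical and is not part of abstar's API. Returning the sequence with the UAID removed is an assumption made for illustration, as the text above only says that the UAID is parsed and added to the output.

```python
from typing import Tuple


def split_uaid(sequence: str, n: int) -> Tuple[str, str]:
    """Hypothetical helper: return (uaid, remainder) for a UAID of length abs(n).

    Positive n takes the UAID from the start of the sequence,
    negative n takes it from the end, and 0 means no UAID.
    """
    if n == 0:
        return "", sequence
    if n > 0:
        return sequence[:n], sequence[n:]
    # Negative n: the last abs(n) characters are the UAID.
    return sequence[n:], sequence[:n]


read = "ACGTACGTTTGGCC"
front_uaid, front_rest = split_uaid(read, 4)
back_uaid, back_rest = split_uaid(read, -4)
```

With `n = 4` the first four bases are treated as the barcode. With `n = -4` the last four are used.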
### api
Most core abstar functions are available through a public API, making it easier to run abstar as a component of integrated analysis pipelines. See the abstar [documentation](http://abstar.readthedocs.org) for more detail about the API.
### helper scripts
A few helper scripts are included with abstar:
`batch_mongoimport` automates the import of multiple JSON output files into a MongoDB database.
`build_abstar_germline_db` creates abstar germline databases from IMGT-gapped FASTA files of V, D and J gene segments.
`make_basespace_credfile` makes a credentials file for BaseSpace, which is required if downloading sequences from BaseSpace with abstar. Developer credentials are required, and the process for obtaining them is explained [here](https://support.basespace.illumina.com/knowledgebase/articles/403618-python-run-downloader)
### testing
To run the test suite, clone or download the repository and run `pytest ./` from the top-level directory.
### requirements
Python 3.8+
abutils
biopython
celery
nwalign3
pymongo
pytest
scikit-bio
All of the above dependencies can be installed with pip, and will be installed automatically when installing abstar with pip.
If you're new to Python, a great way to get started is to install the [Anaconda Python distribution](https://www.continuum.io/downloads), which includes pip as well as a ton of useful scientific Python packages.
sequence merging requires [PANDAseq](https://github.com/neufeld/pandaseq)
batch_mongoimport requires [MongoDB](http://www.mongodb.org/)
BaseSpace downloading requires the [BaseSpace Python SDK](https://github.com/basespace/basespace-python-sdk)
%package -n python3-abstar
Summary: VDJ assignment and antibody sequence annotation. Scalable from a single sequence to billions of sequences.
Provides: python-abstar
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-abstar

# abstar
VDJ assignment and antibody sequence annotation. Scalable from a single sequence to billions of sequences.
- Source code: [github.com/briney/abstar](https://github.com/briney/abstar)
- Documentation: [abstar.readthedocs.org](http://abstar.readthedocs.org)
- Download: [pypi.python.org/pypi/abstar](https://pypi.python.org/pypi/abstar)
- Docker: [hub.docker.com/r/briney/abstar/](https://hub.docker.com/r/briney/abstar/)
### install
`pip install abstar`
### use
To run abstar on a single FASTA or FASTQ file:
`abstar -i <input-file> -o <output-directory> -t <temp-directory>`
To iteratively run abstar on all files in an input directory:
`abstar -i <input-directory> -o <output-directory> -t <temp-directory>`
To run abstar using the included test data as input:
`abstar -o <output-directory> -t <temp-directory> --use-test-data`
When using the abstar test data, note that although the test data file contains 1,000 sequences, one of the test sequences is not a valid antibody recombination. Only 999 sequences should be processed successfully.
When using BaseSpace as the input data source, you can optionally provide all of the required directories:
`abstar -i <input-directory> -o <output-directory> -t <temp-directory> -b`
Or you can simply provide a single project directory, and all required directories will be created in the project directory:
`abstar -p <project_directory> -b`
### additional options
`-l LOG_LOCATION, --log LOG_LOCATION` Change the log directory location. Default is the parent directory of `<output_directory>`.
`-m, --merge` Input directory should contain paired FASTQ (or gzipped FASTQ) files. Paired files will be merged with PANDAseq prior to processing with abstar. Note that when using the BaseSpace option (`-b, --basespace`), this option is implied.
`-b, --basespace` Download a sequencing run from BaseSpace, which is Illumina's cloud storage environment. Since Illumina sequencers produce paired-end reads, `--merge` is implied.
`-u N, --uaid N` Sequences contain a unique antibody ID (UAID, or molecular barcode) of length N. The UAID is parsed from the beginning of each input sequence and added to the JSON output. A negative value causes the UAID to be parsed from the end of the sequence instead.
`-s SPECIES, --species SPECIES` Select the species from which the input sequences are derived. Supported options are 'human', 'mouse', and 'macaque'. Default is 'human'.
`-c, --cluster` Runs abstar in distributed mode on a Celery cluster.
`-h, --help` Prints detailed information about all runtime options.
`-D, --debug` Enables much more verbose logging.
### api
Most core abstar functions are available through a public API, making it easier to run abstar as a component of integrated analysis pipelines. See the abstar [documentation](http://abstar.readthedocs.org) for more detail about the API.
### helper scripts
A few helper scripts are included with abstar:
`batch_mongoimport` automates the import of multiple JSON output files into a MongoDB database.
`build_abstar_germline_db` creates abstar germline databases from IMGT-gapped FASTA files of V, D and J gene segments.
`make_basespace_credfile` makes a credentials file for BaseSpace, which is required if downloading sequences from BaseSpace with abstar. Developer credentials are required, and the process for obtaining them is explained [here](https://support.basespace.illumina.com/knowledgebase/articles/403618-python-run-downloader)
### testing
To run the test suite, clone or download the repository and run `pytest ./` from the top-level directory.
### requirements
Python 3.8+
abutils
biopython
celery
nwalign3
pymongo
pytest
scikit-bio
All of the above dependencies can be installed with pip, and will be installed automatically when installing abstar with pip.
If you're new to Python, a great way to get started is to install the [Anaconda Python distribution](https://www.continuum.io/downloads), which includes pip as well as a ton of useful scientific Python packages.
sequence merging requires [PANDAseq](https://github.com/neufeld/pandaseq)
batch_mongoimport requires [MongoDB](http://www.mongodb.org/)
BaseSpace downloading requires the [BaseSpace Python SDK](https://github.com/basespace/basespace-python-sdk)
%package help
Summary: Development documents and examples for abstar
Provides: python3-abstar-doc
%description help

# abstar
VDJ assignment and antibody sequence annotation. Scalable from a single sequence to billions of sequences.
- Source code: [github.com/briney/abstar](https://github.com/briney/abstar)
- Documentation: [abstar.readthedocs.org](http://abstar.readthedocs.org)
- Download: [pypi.python.org/pypi/abstar](https://pypi.python.org/pypi/abstar)
- Docker: [hub.docker.com/r/briney/abstar/](https://hub.docker.com/r/briney/abstar/)
### install
`pip install abstar`
### use
To run abstar on a single FASTA or FASTQ file:
`abstar -i <input-file> -o <output-directory> -t <temp-directory>`
To iteratively run abstar on all files in an input directory:
`abstar -i <input-directory> -o <output-directory> -t <temp-directory>`
To run abstar using the included test data as input:
`abstar -o <output-directory> -t <temp-directory> --use-test-data`
When using the abstar test data, note that although the test data file contains 1,000 sequences, one of the test sequences is not a valid antibody recombination. Only 999 sequences should be processed successfully.
When using BaseSpace as the input data source, you can optionally provide all of the required directories:
`abstar -i <input-directory> -o <output-directory> -t <temp-directory> -b`
Or you can simply provide a single project directory, and all required directories will be created in the project directory:
`abstar -p <project_directory> -b`
### additional options
`-l LOG_LOCATION, --log LOG_LOCATION` Change the log directory location. Default is the parent directory of `<output_directory>`.
`-m, --merge` Input directory should contain paired FASTQ (or gzipped FASTQ) files. Paired files will be merged with PANDAseq prior to processing with abstar. Note that when using the BaseSpace option (`-b, --basespace`), this option is implied.
`-b, --basespace` Download a sequencing run from BaseSpace, which is Illumina's cloud storage environment. Since Illumina sequencers produce paired-end reads, `--merge` is implied.
`-u N, --uaid N` Sequences contain a unique antibody ID (UAID, or molecular barcode) of length N. The UAID is parsed from the beginning of each input sequence and added to the JSON output. A negative value causes the UAID to be parsed from the end of the sequence instead.
`-s SPECIES, --species SPECIES` Select the species from which the input sequences are derived. Supported options are 'human', 'mouse', and 'macaque'. Default is 'human'.
`-c, --cluster` Runs abstar in distributed mode on a Celery cluster.
`-h, --help` Prints detailed information about all runtime options.
`-D, --debug` Enables much more verbose logging.
### api
Most core abstar functions are available through a public API, making it easier to run abstar as a component of integrated analysis pipelines. See the abstar [documentation](http://abstar.readthedocs.org) for more detail about the API.
### helper scripts
A few helper scripts are included with abstar:
`batch_mongoimport` automates the import of multiple JSON output files into a MongoDB database.
`build_abstar_germline_db` creates abstar germline databases from IMGT-gapped FASTA files of V, D and J gene segments.
`make_basespace_credfile` makes a credentials file for BaseSpace, which is required if downloading sequences from BaseSpace with abstar. Developer credentials are required, and the process for obtaining them is explained [here](https://support.basespace.illumina.com/knowledgebase/articles/403618-python-run-downloader)
### testing
To run the test suite, clone or download the repository and run `pytest ./` from the top-level directory.
### requirements
Python 3.8+
abutils
biopython
celery
nwalign3
pymongo
pytest
scikit-bio
All of the above dependencies can be installed with pip, and will be installed automatically when installing abstar with pip.
If you're new to Python, a great way to get started is to install the [Anaconda Python distribution](https://www.continuum.io/downloads), which includes pip as well as a ton of useful scientific Python packages.
sequence merging requires [PANDAseq](https://github.com/neufeld/pandaseq)
batch_mongoimport requires [MongoDB](http://www.mongodb.org/)
BaseSpace downloading requires the [BaseSpace Python SDK](https://github.com/basespace/basespace-python-sdk)
%prep
%autosetup -n abstar-0.6.1
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-abstar -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Tue May 30 2023 Python_Bot <Python_Bot@openeuler.org> - 0.6.1-1
- Package Spec generated