python-ldpred.spec



%global _empty_manifest_terminate_build 0
Name:		python-LDpred
Version:	1.0.11
Release:	1
Summary:	A Python package that adjusts GWAS summary statistics for the effects of linkage disequilibrium (LD)
License:	MIT
URL:		https://github.com/bvilhjal/ldpred
Source0:	https://mirrors.nju.edu.cn/pypi/web/packages/87/49/3ca0efdb5913b672e1073e3146e933b0b31d9071b9cdb0df2321abf065b2/LDpred-1.0.11.tar.gz
BuildArch:	noarch

Requires:	python3-h5py
Requires:	python3-scipy
Requires:	python3-plinkio

%description

# LDpred #


LDpred is a Python based software package that adjusts GWAS summary statistics
for the effects of linkage disequilibrium (LD).  The details of the method are
described in Vilhjalmsson et al. (AJHG 2015) [http://www.cell.com/ajhg/abstract/S0002-9297(15)00365-1]

* The current version is 1.0.11

### News ###

Recent improvements have focused on making LDpred more robust, addressing issues highlighted by recent publications (Ge et al., Nat Comm 2019; Choi and O'Reilly, GigaScience 2019; Privé et al., AJHG 2019).

- Nov 20th, 2019, v. 1.0.11: Implemented LDpred-fast, Joel Mefford's sparsified BLUP prediction (Mefford, thesis 2018). LDpred-fast is suitable for polygenic diseases/traits when LDpred-gibbs fails to converge or is too slow.

- Oct 21st, 2019, v. 1.0.10: LDpred-gibbs now reports LDpred-inf effects for SNPs in long-range LD regions (Price et al., AJHG 2008). This improves convergence of the algorithm substantially when applied to large datasets.

- Oct 17th, 2019, v. 1.0.8: Fixed a bug in LDpred, which may improve convergence for gibbs.

- Oct 11th, 2019, v. 1.0.7: Improved accuracy and robustness.
  - Now able to handle variants with p-values rounded down to 0. 
  - Fixed a serious bug that caused sample sizes in the summary stats file to not always be used correctly when provided.
  - LDpred gibbs can now handle sample sizes that differ per variant effect, if they are parsed from the summary stats.
  - LDpred now estimates the heritability separately for each chromosome by default.


## Getting Started ##
LDpred can be installed using pip on most systems by typing

`pip install ldpred`

### Requirements ###
LDpred currently requires three Python packages to be installed and in path.  These
are **h5py** [http://www.h5py.org/](http://www.h5py.org/), **scipy** [http://www.scipy.org/](http://www.scipy.org/)
and **libplinkio** [https://github.com/mfranberg/libplinkio](https://github.com/mfranberg/libplinkio).  Lastly, LDpred
has currently only been tested with **Python 3.6+**.

The first two packages **h5py** and **scipy** are commonly used Python packages, and pre-installed on many computer systems. The last **libplinkio** package can be installed using **pip** (https://pip.pypa.io/en/latest/quickstart.html), which is also pre-installed on many systems.

With **pip**, one can install **libplinkio** using the following command:

`pip install plinkio`

or if you need to install it locally you can try

`pip install --user plinkio`

With these three packages in place, you should be all set to install and use LDpred.

### Installing LDpred ###

As with most Python packages, configuring LDpred is simple.  You can use **pip** to install it by typing

`pip install ldpred`

This should automatically take care of dependencies.  The examples below assume ldpred has been installed using pip.

Alternatively you can use **git** (which is installed on most systems) and clone this repository using the following git command:

`git clone https://github.com/bvilhjal/ldpred.git`

Finally, you can also download the source files and place them somewhere.

With the Python source code in place and the three packages **h5py**, **scipy**, and **libplinkio** installed, you should be ready to use LDpred.


### How to run tests ###
A couple of simulated data examples can be found in the **test_data** directory.  These datasets were simulated using two different values of *p* (fraction of causal markers) and with heritability set to 0.1. The sample size used when simulating the summary statistics is 10,000.


### Code Contributions ###
I encourage users to extend the code, and adapt it to their needs.  Currently there are no formal guidelines set for
contributions, and pull requests will be reviewed on a case by case basis.

### Who do I talk to? ###
If you have any questions or trouble getting the method to work, first look at the issues to see if the problem has been reported there.  Also, you can check if some of the cloned LDpred repos have addressed your issue.

In emergencies, please contact Bjarni Vilhjalmsson (bjarni.vilhjalmsson@gmail.com), but expect slow replies.  

## Using LDpred ##
A typical LDpred workflow consists of 3 steps:

### Step 1: Coordinate data ###
The first step is a data synchronization step, where two or three data sets, genotypes and summary statistics, are synchronized.  This generates an HDF5 file which contains the synchronized genotypes.  This step can be done by running

`ldpred coord`

Use --help for detailed options.  This step requires at least one genotype file (the LD reference genotypes), where we recommend at least 1000 unrelated individuals with the same ancestry make-up as the individuals from which the summary statistics were obtained.  Another genotype file can also be given if the user intends to validate the predictions using a separate set of genotypes.

### Step 2: Generate LDpred SNP weights ###
After generating the coordinated data file, one can apply LDpred to the synchronized dataset.  This step can be done by running

`ldpred gibbs`

Use --help for detailed options.  This step generates two files: an LD file with LD information for the given LD radius, and a file with the re-weighted effect estimates.  The LD file spares the user from having to generate the LD information again when trying, e.g., different values of **p** (the fraction of causal variants). However, it is re-generated if a different LD radius is given.  The other file that LDpred generates contains the LDpred-adjusted effect estimates.

### Step 3: Generating individual risk scores ###
Individual risk scores can be generated using the following command

`ldpred score`

Use --help for detailed options.  This step calculates polygenic risk scores for the individuals in the validation data if given; otherwise it treats the LD reference genotypes as validation genotypes.  A phenotype file, a covariate file, and a plink-formatted principal components file can also be provided.
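
As a rough sketch of the three-step workflow above, the snippet below only prints the `--help` invocation for each subcommand in order (`coord`, `gibbs`, `score`) and does not run LDpred itself. The helper name `ldpred_steps` is hypothetical and not part of LDpred. No input or output options are shown, because they depend on your data and are best read from each subcommand's help text.

```shell
#!/bin/sh
# Hypothetical helper (not part of LDpred): list the workflow subcommands
# in the order described above, each with --help for its detailed options.
ldpred_steps() {
    for step in coord gibbs score; do
        printf 'ldpred %s --help\n' "$step"
    done
}

ldpred_steps
```

Running any of the printed lines in a shell where LDpred is installed should display the options for that step.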


### Additional methods: LD-pruning + Thresholding ###
In addition to the LDpred gibbs sampler and infinitesimal model methods, the package also implements LD-pruning + Thresholding as an alternative method. You can run this using the following command

`ldpred p+t`

This method often yields better predictions than LDpred when the LD reference panel is small, or when the training data is very large (due to problems with gibbs sampler convergence).

### Tests ###
You can check whether LDpred works on your system by running the following test

`ldpred-unittest`

Note that passing this test does not guarantee that LDpred works in all situations.

### Citation ###
Please cite [this paper](https://doi.org/10.1016/j.ajhg.2015.09.001)

### Acknowledgements ###
Thanks to all who provided bug reports and contributed code.


%package -n python3-LDpred
Summary:	A Python package that adjusts GWAS summary statistics for the effects of linkage disequilibrium (LD)
Provides:	python-LDpred
BuildRequires:	python3-devel
BuildRequires:	python3-setuptools
BuildRequires:	python3-pip
%description -n python3-LDpred

# LDpred #


LDpred is a Python based software package that adjusts GWAS summary statistics
for the effects of linkage disequilibrium (LD).  The details of the method are
described in Vilhjalmsson et al. (AJHG 2015) [http://www.cell.com/ajhg/abstract/S0002-9297(15)00365-1]

* The current version is 1.0.11

### News ###

Recent improvements have focused on making LDpred more robust, addressing issues highlighted by recent publications (Ge et al., Nat Comm 2019; Choi and O'Reilly, GigaScience 2019; Privé et al., AJHG 2019).

- Nov 20th, 2019, v. 1.0.11: Implemented LDpred-fast, Joel Mefford's sparsified BLUP prediction (Mefford, thesis 2018). LDpred-fast is suitable for polygenic diseases/traits when LDpred-gibbs fails to converge or is too slow.

- Oct 21st, 2019, v. 1.0.10: LDpred-gibbs now reports LDpred-inf effects for SNPs in long-range LD regions (Price et al., AJHG 2008). This improves convergence of the algorithm substantially when applied to large datasets.

- Oct 17th, 2019, v. 1.0.8: Fixed a bug in LDpred, which may improve convergence for gibbs.

- Oct 11th, 2019, v. 1.0.7: Improved accuracy and robustness.
  - Now able to handle variants with p-values rounded down to 0. 
  - Fixed a serious bug that caused sample sizes in the summary stats file to not always be used correctly when provided.
  - LDpred gibbs can now handle sample sizes that differ per variant effect, if they are parsed from the summary stats.
  - LDpred now estimates the heritability separately for each chromosome by default.


## Getting Started ##
LDpred can be installed using pip on most systems by typing

`pip install ldpred`

### Requirements ###
LDpred currently requires three Python packages to be installed and in path.  These
are **h5py** [http://www.h5py.org/](http://www.h5py.org/), **scipy** [http://www.scipy.org/](http://www.scipy.org/)
and **libplinkio** [https://github.com/mfranberg/libplinkio](https://github.com/mfranberg/libplinkio).  Lastly, LDpred
has currently only been tested with **Python 3.6+**.

The first two packages **h5py** and **scipy** are commonly used Python packages, and pre-installed on many computer systems. The last **libplinkio** package can be installed using **pip** (https://pip.pypa.io/en/latest/quickstart.html), which is also pre-installed on many systems.

With **pip**, one can install **libplinkio** using the following command:

`pip install plinkio`

or if you need to install it locally you can try

`pip install --user plinkio`

With these three packages in place, you should be all set to install and use LDpred.

### Installing LDpred ###

As with most Python packages, configuring LDpred is simple.  You can use **pip** to install it by typing

`pip install ldpred`

This should automatically take care of dependencies.  The examples below assume ldpred has been installed using pip.

Alternatively you can use **git** (which is installed on most systems) and clone this repository using the following git command:

`git clone https://github.com/bvilhjal/ldpred.git`

Finally, you can also download the source files and place them somewhere.

With the Python source code in place and the three packages **h5py**, **scipy**, and **libplinkio** installed, you should be ready to use LDpred.


### How to run tests ###
A couple of simulated data examples can be found in the **test_data** directory.  These datasets were simulated using two different values of *p* (fraction of causal markers) and with heritability set to 0.1. The sample size used when simulating the summary statistics is 10,000.


### Code Contributions ###
I encourage users to extend the code, and adapt it to their needs.  Currently there are no formal guidelines set for
contributions, and pull requests will be reviewed on a case by case basis.

### Who do I talk to? ###
If you have any questions or trouble getting the method to work, first look at the issues to see if the problem has been reported there.  Also, you can check if some of the cloned LDpred repos have addressed your issue.

In emergencies, please contact Bjarni Vilhjalmsson (bjarni.vilhjalmsson@gmail.com), but expect slow replies.  

## Using LDpred ##
A typical LDpred workflow consists of 3 steps:

### Step 1: Coordinate data ###
The first step is a data synchronization step, where two or three data sets, genotypes and summary statistics, are synchronized.  This generates an HDF5 file which contains the synchronized genotypes.  This step can be done by running

`ldpred coord`

Use --help for detailed options.  This step requires at least one genotype file (the LD reference genotypes), where we recommend at least 1000 unrelated individuals with the same ancestry make-up as the individuals from which the summary statistics were obtained.  Another genotype file can also be given if the user intends to validate the predictions using a separate set of genotypes.

### Step 2: Generate LDpred SNP weights ###
After generating the coordinated data file, one can apply LDpred to the synchronized dataset.  This step can be done by running

`ldpred gibbs`

Use --help for detailed options.  This step generates two files: an LD file with LD information for the given LD radius, and a file with the re-weighted effect estimates.  The LD file spares the user from having to generate the LD information again when trying, e.g., different values of **p** (the fraction of causal variants). However, it is re-generated if a different LD radius is given.  The other file that LDpred generates contains the LDpred-adjusted effect estimates.

### Step 3: Generating individual risk scores ###
Individual risk scores can be generated using the following command

`ldpred score`

Use --help for detailed options.  This step calculates polygenic risk scores for the individuals in the validation data if given; otherwise it treats the LD reference genotypes as validation genotypes.  A phenotype file, a covariate file, and a plink-formatted principal components file can also be provided.


### Additional methods: LD-pruning + Thresholding ###
In addition to the LDpred gibbs sampler and infinitesimal model methods, the package also implements LD-pruning + Thresholding as an alternative method. You can run this using the following command

`ldpred p+t`

This method often yields better predictions than LDpred when the LD reference panel is small, or when the training data is very large (due to problems with gibbs sampler convergence).

### Tests ###
You can check whether LDpred works on your system by running the following test

`ldpred-unittest`

Note that passing this test does not guarantee that LDpred works in all situations.

### Citation ###
Please cite [this paper](https://doi.org/10.1016/j.ajhg.2015.09.001)

### Acknowledgements ###
Thanks to all who provided bug reports and contributed code.


%package help
Summary:	Development documents and examples for LDpred
Provides:	python3-LDpred-doc
%description help

# LDpred #


LDpred is a Python based software package that adjusts GWAS summary statistics
for the effects of linkage disequilibrium (LD).  The details of the method are
described in Vilhjalmsson et al. (AJHG 2015) [http://www.cell.com/ajhg/abstract/S0002-9297(15)00365-1]

* The current version is 1.0.11

### News ###

Recent improvements have focused on making LDpred more robust, addressing issues highlighted by recent publications (Ge et al., Nat Comm 2019; Choi and O'Reilly, GigaScience 2019; Privé et al., AJHG 2019).

- Nov 20th, 2019, v. 1.0.11: Implemented LDpred-fast, Joel Mefford's sparsified BLUP prediction (Mefford, thesis 2018). LDpred-fast is suitable for polygenic diseases/traits when LDpred-gibbs fails to converge or is too slow.

- Oct 21st, 2019, v. 1.0.10: LDpred-gibbs now reports LDpred-inf effects for SNPs in long-range LD regions (Price et al., AJHG 2008). This improves convergence of the algorithm substantially when applied to large datasets.

- Oct 17th, 2019, v. 1.0.8: Fixed a bug in LDpred, which may improve convergence for gibbs.

- Oct 11th, 2019, v. 1.0.7: Improved accuracy and robustness.
  - Now able to handle variants with p-values rounded down to 0. 
  - Fixed a serious bug that caused sample sizes in the summary stats file to not always be used correctly when provided.
  - LDpred gibbs can now handle sample sizes that differ per variant effect, if they are parsed from the summary stats.
  - LDpred now estimates the heritability separately for each chromosome by default.


## Getting Started ##
LDpred can be installed using pip on most systems by typing

`pip install ldpred`

### Requirements ###
LDpred currently requires three Python packages to be installed and in path.  These
are **h5py** [http://www.h5py.org/](http://www.h5py.org/), **scipy** [http://www.scipy.org/](http://www.scipy.org/)
and **libplinkio** [https://github.com/mfranberg/libplinkio](https://github.com/mfranberg/libplinkio).  Lastly, LDpred
has currently only been tested with **Python 3.6+**.

The first two packages **h5py** and **scipy** are commonly used Python packages, and pre-installed on many computer systems. The last **libplinkio** package can be installed using **pip** (https://pip.pypa.io/en/latest/quickstart.html), which is also pre-installed on many systems.

With **pip**, one can install **libplinkio** using the following command:

`pip install plinkio`

or if you need to install it locally you can try

`pip install --user plinkio`

With these three packages in place, you should be all set to install and use LDpred.

### Installing LDpred ###

As with most Python packages, configuring LDpred is simple.  You can use **pip** to install it by typing

`pip install ldpred`

This should automatically take care of dependencies.  The examples below assume ldpred has been installed using pip.

Alternatively you can use **git** (which is installed on most systems) and clone this repository using the following git command:

`git clone https://github.com/bvilhjal/ldpred.git`

Finally, you can also download the source files and place them somewhere.

With the Python source code in place and the three packages **h5py**, **scipy**, and **libplinkio** installed, you should be ready to use LDpred.


### How to run tests ###
A couple of simulated data examples can be found in the **test_data** directory.  These datasets were simulated using two different values of *p* (fraction of causal markers) and with heritability set to 0.1. The sample size used when simulating the summary statistics is 10,000.


### Code Contributions ###
I encourage users to extend the code, and adapt it to their needs.  Currently there are no formal guidelines set for
contributions, and pull requests will be reviewed on a case by case basis.

### Who do I talk to? ###
If you have any questions or trouble getting the method to work, first look at the issues to see if the problem has been reported there.  Also, you can check if some of the cloned LDpred repos have addressed your issue.

In emergencies, please contact Bjarni Vilhjalmsson (bjarni.vilhjalmsson@gmail.com), but expect slow replies.  

## Using LDpred ##
A typical LDpred workflow consists of 3 steps:

### Step 1: Coordinate data ###
The first step is a data synchronization step, where two or three data sets, genotypes and summary statistics, are synchronized.  This generates an HDF5 file which contains the synchronized genotypes.  This step can be done by running

`ldpred coord`

Use --help for detailed options.  This step requires at least one genotype file (the LD reference genotypes), where we recommend at least 1000 unrelated individuals with the same ancestry make-up as the individuals from which the summary statistics were obtained.  Another genotype file can also be given if the user intends to validate the predictions using a separate set of genotypes.

### Step 2: Generate LDpred SNP weights ###
After generating the coordinated data file, one can apply LDpred to the synchronized dataset.  This step can be done by running

`ldpred gibbs`

Use --help for detailed options.  This step generates two files: an LD file with LD information for the given LD radius, and a file with the re-weighted effect estimates.  The LD file spares the user from having to generate the LD information again when trying, e.g., different values of **p** (the fraction of causal variants). However, it is re-generated if a different LD radius is given.  The other file that LDpred generates contains the LDpred-adjusted effect estimates.

### Step 3: Generating individual risk scores ###
Individual risk scores can be generated using the following command

`ldpred score`

Use --help for detailed options.  This step calculates polygenic risk scores for the individuals in the validation data if given; otherwise it treats the LD reference genotypes as validation genotypes.  A phenotype file, a covariate file, and a plink-formatted principal components file can also be provided.


### Additional methods: LD-pruning + Thresholding ###
In addition to the LDpred gibbs sampler and infinitesimal model methods, the package also implements LD-pruning + Thresholding as an alternative method. You can run this using the following command

`ldpred p+t`

This method often yields better predictions than LDpred when the LD reference panel is small, or when the training data is very large (due to problems with gibbs sampler convergence).

### Tests ###
You can check whether LDpred works on your system by running the following test

`ldpred-unittest`

Note that passing this test does not guarantee that LDpred works in all situations.

### Citation ###
Please cite [this paper](https://doi.org/10.1016/j.ajhg.2015.09.001)

### Acknowledgements ###
Thanks to all who provided bug reports and contributed code.


%prep
%autosetup -n LDpred-1.0.11

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-LDpred -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue May 30 2023 Python_Bot <Python_Bot@openeuler.org> - 1.0.11-1
- Package Spec generated