Fix missing output for nonsense and essential_splice impacts by FerriolCalvet · Pull Request #22 · bbglab/omega

FerriolCalvet · 2024-04-22T09:51:17Z

Problem

Missing outputs for nonsense and essential_splice impacts:

gene    sample  impact  dnds    pvalue  lower   upper
ARID1A  P19_0044_BDO_01 missense        1.0398697       0.8973031       0.49979153      1.8980241
ARID1A  P19_0044_BDO_01 nonsense                                
ARID1A  P19_0044_BDO_01 essential_splice                                
EP300   P19_0044_BDO_01 missense        1.798834        0.028565407     1.0808547       3.3448544
EP300   P19_0044_BDO_01 nonsense                                
EP300   P19_0044_BDO_01 essential_splice

Possible reasons

The lambdas vectors contain 0s, and the TensorFlow Probability functions do not seem to handle them well.

For missense:

l equal to [0.28558427 0.32617283 0.05533479 0.13334824 0.15745339 0.09242073
 0.08529546 0.30062553 0.20218417 0.2424905  0.08098787 0.27460638
 0.13486527 0.32502422 0.00684538 0.14846061 0.01359925 0.0575599
 0.04244946 0.01481647 0.01848553 0.05727808 0.06567501 0.09264341
 0.09275755 0.06764409 0.01564826 0.02131398 0.05997038 0.06238931
 0.01891936 0.12842454 0.0938124  0.12310488 0.17774798 0.04028124
 0.14306849 0.46776822 0.30274367 0.25046015 0.08864892 0.09785648
 0.07335839 0.25764883 0.108295   0.12602973 0.11663748 0.02268062
 0.07946959 0.05692631 0.01860485 0.02654143 0.02170661 0.04168417
 0.13690487 0.03569017 0.01251115 0.05133861 0.01112057 0.07303728
 0.0157018  0.01001406 0.00796137 0.01362443 0.16251975 0.08476225
 0.3185019  0.06119071 0.03862971 0.07041591 0.07634285 0.04007868
 0.08172115 0.01557059 0.09000723 0.0718038  0.02061916 0.03482537
 0.05826616 0.03305257 0.17029199 0.01897544 0.01860485 0.00884715
 0.007079   0.01311499 0.07524361 0.01182637 0.11420705 0.01738286
 0.03394764 0.01460746 0.0127372  0.01234038 0.01358624 0.01617238]

For nonsense:

l equal to [1.2262212e-02 2.1095350e-02 4.3984400e-03 3.9292038e-03 9.4510019e-03
 3.0427321e-03 4.3038931e-03 1.4362105e-02 1.8824339e-03 5.5378429e-03
 2.5045760e-03 2.4753483e-03 7.2978713e-02 4.8018314e-02 8.4807668e-03
 3.2509621e-02 5.8391481e-04 3.7227089e-03 2.6390641e-03 4.3657818e-04
 5.9400496e-05 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00
 0.0000000e+00 0.0000000e+00 0.0000000e+00 1.3675435e-02 0.0000000e+00
 0.0000000e+00 0.0000000e+00 3.1742804e-02 2.1667564e-03 2.5523536e-02
 0.0000000e+00 5.5269703e-02 7.7360771e-03 2.5631344e-02 0.0000000e+00
 5.3811915e-02 2.7419524e-03 0.0000000e+00 0.0000000e+00 3.2564253e-02
 0.0000000e+00 5.9980094e-03 0.0000000e+00 5.0469371e-03 3.4141592e-03
 2.1656503e-03 1.6842084e-03 0.0000000e+00 8.1689312e-04 5.1226473e-04
 2.5927636e-04 1.7851978e-04 1.2167569e-03 1.9530891e-04 0.0000000e+00
 7.4089942e-03 1.8911036e-03 4.8782704e-03 1.7760373e-03 0.0000000e+00
 0.0000000e+00 3.4833457e-03 7.0859439e-04 0.0000000e+00 0.0000000e+00
 0.0000000e+00 0.0000000e+00 0.0000000e+00 1.3559515e-04 0.0000000e+00
 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00
 1.0814865e-02 1.1380531e-03 2.1656503e-03 5.6140282e-04 0.0000000e+00
 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 1.3559515e-04
 0.0000000e+00 0.0000000e+00 1.8345994e-03 0.0000000e+00 0.0000000e+00
 0.0000000e+00]
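
One plausible mechanism, shown only as an illustration: if the likelihood is Poisson-like (an assumption, the PR does not state the model) and the log-probability is computed with the textbook formula, a channel with lambda = 0 and 0 observed mutations evaluates `0 * log(0)`, which is NaN in floating point. The sketch below uses plain NumPy with a hypothetical helper `naive_poisson_logpmf`; it is not the TensorFlow Probability code path used by omega, and whether the same thing happens there is unverified.

```python
import math
import numpy as np

def naive_poisson_logpmf(k, lam):
    """Textbook Poisson log-pmf: k*log(lam) - lam - log(k!). Hypothetical helper."""
    k = np.asarray(k, dtype=float)
    lam = np.asarray(lam, dtype=float)
    log_factorial = np.array([math.lgamma(x + 1.0) for x in k])
    # Silence the warnings so the NaN can be inspected directly.
    with np.errstate(divide="ignore", invalid="ignore"):
        return k * np.log(lam) - lam - log_factorial

# Toy values: the middle channel has lambda = 0 and no observed mutations.
lam = np.array([0.0123, 0.0, 0.0211])
counts = np.array([0, 0, 1])

logp = naive_poisson_logpmf(counts, lam)
total = logp.sum()
print(np.isnan(logp[1]), np.isnan(total))  # True True
```

A single NaN term makes the summed log-likelihood NaN, which would be enough to leave an optimiser without a usable objective.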

Explanation

Only a limited number of changes can produce a nonsense impact (stop codon creation, start codon truncation ... ), and the same holds for the essential_splice variant types.

The 0s in the lambdas vector are therefore real 0s, so filling them in would not be correct. We would also never expect to see a mutation with that impact in a channel where it has no probability (lambda = 0).

What we can do instead is remove the positions of the lambdas vector where lambda = 0, and remove the same positions from the mutations vector. This changes the shape from 96 channels to however many channels remain after filtering.

This is what I implemented. (see file changes in this PR)
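
The filtering described above can be sketched as follows. This is a minimal NumPy illustration, not the code changed in this PR: the function name `drop_zero_lambda_channels`, the toy vectors, and the sanity check on mutations in zero-lambda channels are all assumptions made for the example.

```python
import numpy as np

def drop_zero_lambda_channels(lambdas, mutations):
    """Keep only the channels whose expected rate (lambda) is positive."""
    lambdas = np.asarray(lambdas, dtype=float)
    mutations = np.asarray(mutations)
    if lambdas.shape != mutations.shape:
        raise ValueError("lambdas and mutations must have the same shape")
    keep = lambdas > 0  # boolean mask over the channels
    # Assumed sanity check: a mutation in a lambda = 0 channel is unexpected.
    if np.any(mutations[~keep] > 0):
        raise ValueError("observed mutations in a channel with lambda = 0")
    # The same mask is applied to both vectors so positions stay aligned.
    return lambdas[keep], mutations[keep]

# Toy example with 5 channels instead of 96.
lambdas = np.array([0.012, 0.0, 0.021, 0.0, 0.004])
mutations = np.array([0, 0, 1, 0, 0])

kept_lambdas, kept_mutations = drop_zero_lambda_channels(lambdas, mutations)
print(kept_lambdas.shape, kept_mutations.shape)  # (3,) (3,)
```

Using one mask for both vectors is what keeps each remaining lambda paired with its own mutation count.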

After solving it

gene    sample  impact  dnds    pvalue  lower   upper
ARID1A  P19_0044_BDO_01 missense        1.0398697       0.8973031       0.49979153      1.8980241
ARID1A  P19_0044_BDO_01 nonsense        0.21724111      0.34792298      0.21724111      3.645586
ARID1A  P19_0044_BDO_01 essential_splice        0.21734934      0.5623948       0.21734934      9.230412
EP300   P19_0044_BDO_01 missense        1.798834        0.028565407     1.0808547       3.3448544
EP300   P19_0044_BDO_01 nonsense        3.609274        0.09682387      0.71474063      13.938253
EP300   P19_0044_BDO_01 essential_splice        0.21835195      0.5825151       0.21835195      10.202482

koszulordie

It remains unclear why the previous implementation failed to produce an output when the lambda (expectation) vectors had zero-valued components. Still, the proposed solution makes sense: in theory it should give the same behavior as intended by the original MLE implementation, and it seems to work in practice.
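
One way to see why dropping the zero-lambda channels should not change the estimate, assuming (this is not stated in the PR) that the mutation count $n_i$ in channel $i$ is modelled as Poisson with mean $\omega \lambda_i$:

$$
\log L(\omega) = \sum_{i} \left[ n_i \log(\omega \lambda_i) - \omega \lambda_i - \log n_i! \right]
$$

A channel with $\lambda_i = 0$ can only have $n_i = 0$, which occurs with probability 1, so its contribution to $\log L(\omega)$ is exactly 0 for every $\omega$. Hence

$$
\log L(\omega) = \sum_{i \,:\, \lambda_i > 0} \left[ n_i \log(\omega \lambda_i) - \omega \lambda_i - \log n_i! \right],
$$

and under this assumption the maximiser over the filtered channels is the same as the one the full 96-channel likelihood was meant to have.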

FerriolCalvet added 2 commits April 21, 2024 16:44

wip fix: no output when lambdas have 0 values

e94534c

- happens for nonsense and essential_splice impacts

fix: output for nonsense and essential_splice

276ef64

- fixed bug in subsetting the vectors not working

FerriolCalvet requested a review from koszulordie April 22, 2024 09:51

FerriolCalvet self-assigned this Apr 22, 2024

FerriolCalvet added the bug Something isn't working label Apr 22, 2024

FerriolCalvet linked an issue Apr 22, 2024 that may be closed by this pull request

No output for nonsense and essential_splice impacts #21

Closed

koszulordie approved these changes Apr 22, 2024

FerriolCalvet merged commit ac87eca into dev/package_singlesample Apr 22, 2024

FerriolCalvet deleted the dev/21-nonsense-n-essential_splice branch April 22, 2024 16:30
