ZeroDivisionError: float division by zero #449

Closed
f12345zxcvbnm2 opened this issue Feb 13, 2025 · 3 comments

@f12345zxcvbnm2

Summary:
ZeroDivisionError: float division by zero

Description:
I am trying to run the average_nucleotide_identity.py script on a dataset of bacterial genomes.
My command was:
average_nucleotide_identity.py -i ./0.6_test/ -o ./7_ANI -m ANIm -g -v

Traceback (most recent call last):
  File "/data/h01003/miniconda3/envs/pyani/bin/average_nucleotide_identity.py", line 10, in <module>
    sys.exit(run_main())
  File "/data/h01003/miniconda3/envs/pyani/lib/python3.8/site-packages/pyani/scripts/average_nucleotide_identity.py", line 998, in run_main
    results = method_function(args, infiles, org_lengths)
  File "/data/h01003/miniconda3/envs/pyani/lib/python3.8/site-packages/pyani/scripts/average_nucleotide_identity.py", line 559, in calculate_anim
    results = anim.process_deltadir(deltadir, org_lengths, logger=logger)
  File "/data/h01003/miniconda3/envs/pyani/lib/python3.8/site-packages/pyani/anim.py", line 480, in process_deltadir
    ) = parse_delta(deltafile)
  File "/data/h01003/miniconda3/envs/pyani/lib/python3.8/site-packages/pyani/anim.py", line 396, in parse_delta
    avrg_ID = sum(weighted_identical_bases) / sum(aligned_bases)
ZeroDivisionError: division by zero

pyani Version:
pyani 0.3.0

Python Version:
Python 3.8.19

Operating System:
Ubuntu 20.04.6 LTS

here are my files:
2_bacteria.zip

@peterjc
Collaborator

peterjc commented Feb 13, 2025

Your two bacteria are so different they have no meaningful alignment in nucmer, so the ANIm value ought to be reported as null or NA. Or perhaps zero.

Practically speaking I guess you have probably tried this on a larger set and narrowed it down to a small test case? If so, you've already identified an outlier best left out of the analysis.
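
A minimal sketch of the kind of guard described above, not pyani's actual code: the variable names `weighted_identical_bases` and `aligned_bases` come from the traceback, while the function `average_identity` and the choice of `None` as the "no alignment" value are assumptions made for illustration.

```python
from typing import Optional, Sequence


def average_identity(
    weighted_identical_bases: Sequence[float],
    aligned_bases: Sequence[int],
) -> Optional[float]:
    """Return the alignment-length-weighted identity, or None if nothing aligned.

    Hypothetical helper: mirrors the division on the failing line of the
    traceback, but checks the denominator first.
    """
    total_aligned = sum(aligned_bases)
    if total_aligned == 0:
        # No alignments between the two genomes: report "no value"
        # instead of raising ZeroDivisionError.
        return None
    return sum(weighted_identical_bases) / total_aligned


# Two genomes with no nucmer alignments at all.
print(average_identity([], []))
# Two alignments of 100 and 50 bases with 90 and 45 identical bases.
print(average_identity([90.0, 45.0], [100, 50]))
```

Returning `None` rather than zero keeps "no alignment found" distinct from a genuinely measured identity, at the cost of every caller having to handle the missing value.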

@f12345zxcvbnm2
Author

Greetings, and thank you very much for your response. I picked these two genomes out of a set of 300 after reviewing the nucmer_output folder. The situation is now a bit tricky: I need to identify and filter out, across the entire dataset, the genomes for which ANI cannot be calculated. Does pyani have a filtering function for this? If not, it would be very helpful to include one in a future update.
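
As a stopgap, one way to find the problem comparisons is to scan the nucmer output directory for delta files that contain no alignments. This is only a sketch under stated assumptions and not a pyani feature: it relies on the MUMmer delta format marking each aligned sequence pair with a header line starting with `>`, the function names `delta_has_alignments` and `find_empty_deltas` are hypothetical, and the `*.delta` glob pattern may need changing (for example to `*.filter`) depending on which files your run produces.

```python
from pathlib import Path
from typing import List


def delta_has_alignments(delta_path: Path) -> bool:
    """Return True if the delta file holds at least one alignment header.

    Assumes the MUMmer delta layout: two preamble lines, then one line
    starting with '>' per aligned sequence pair.
    """
    with open(delta_path) as handle:
        return any(line.startswith(">") for line in handle)


def find_empty_deltas(delta_dir: Path, pattern: str = "*.delta") -> List[Path]:
    """List delta files in delta_dir that contain no alignments."""
    return sorted(
        path
        for path in Path(delta_dir).glob(pattern)
        if not delta_has_alignments(path)
    )


if __name__ == "__main__":
    # Hypothetical location: adjust to the nucmer_output folder of your run.
    for path in find_empty_deltas(Path("./7_ANI/nucmer_output")):
        print(path.name)
```

Genomes that appear in many of the listed file names are the likely outliers to leave out before rerunning the analysis.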

@peterjc
Collaborator

peterjc commented Feb 13, 2025

The team is aware of this issue and thinking about how to best handle it as part of a larger engineering effort rewriting pyANI: https://github.com/pyani-plus/pyani-plus

Note pyANI-plus is still unfinished and has not had a public release, but for this specific example the failed comparisons are handled gracefully with a null recorded in the database. This does currently limit the downstream analysis options - see pyani-plus/pyani-plus#272 in particular. If you fancy trying this out, some early end-user feedback would be welcome... but there is no end-user facing documentation as yet beyond the command line help.
