ZeroDivisionError: float division by zero #449

f12345zxcvbnm2 · 2025-02-13T09:22:14Z

Summary:
ZeroDivisionError: float division by zero

Description:
I am trying to run the average_nucleotide_identity.py script on a dataset of bacterial genomes.
My command was:
average_nucleotide_identity.py -i ./0.6_test/ -o ./7_ANI -m ANIm -g -v

Traceback (most recent call last):
  File "/data/h01003/miniconda3/envs/pyani/bin/average_nucleotide_identity.py", line 10, in <module>
    sys.exit(run_main())
  File "/data/h01003/miniconda3/envs/pyani/lib/python3.8/site-packages/pyani/scripts/average_nucleotide_identity.py", line 998, in run_main
    results = method_function(args, infiles, org_lengths)
  File "/data/h01003/miniconda3/envs/pyani/lib/python3.8/site-packages/pyani/scripts/average_nucleotide_identity.py", line 559, in calculate_anim
    results = anim.process_deltadir(deltadir, org_lengths, logger=logger)
  File "/data/h01003/miniconda3/envs/pyani/lib/python3.8/site-packages/pyani/anim.py", line 480, in process_deltadir
    ) = parse_delta(deltafile)
  File "/data/h01003/miniconda3/envs/pyani/lib/python3.8/site-packages/pyani/anim.py", line 396, in parse_delta
    avrg_ID = sum(weighted_identical_bases) / sum(aligned_bases)
ZeroDivisionError: division by zero

pyani Version:
pyani 0.3.0

Python Version:
Python 3.8.19

Operating System:
Ubuntu 20.04.6 LTS

here are my files:
2_bacteria.zip


peterjc · 2025-02-13T10:59:40Z

Your two bacteria are so different they have no meaningful alignment in nucmer, so the ANIm value ought to be reported as null or NA. Or perhaps zero.
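
To illustrate the point, here is a minimal sketch of how the failing division in the traceback could be guarded so that a comparison with no aligned bases yields a null instead of raising. The variable names weighted_identical_bases and aligned_bases are taken from the traceback; the helper average_identity is hypothetical and is not part of pyani.

```python
from typing import Optional, Sequence


def average_identity(
    weighted_identical_bases: Sequence[float],
    aligned_bases: Sequence[int],
) -> Optional[float]:
    """Return the weighted average identity, or None if nothing aligned.

    Hypothetical helper for illustration only; not pyani's actual code.
    """
    total_aligned = sum(aligned_bases)
    if total_aligned == 0:
        # nucmer reported no alignments for this pair, so the identity
        # is undefined; report a null rather than dividing by zero.
        return None
    return sum(weighted_identical_bases) / total_aligned
```

Whether the caller should then record NA or zero is a separate decision; returning None just keeps that choice explicit.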

Practically speaking, I guess you tried this on a larger set and narrowed it down to a small test case? If so, you have already identified an outlier that is best left out of the analysis.

f12345zxcvbnm2 · 2025-02-13T11:15:23Z

Greetings, and thank you very much for your response. I picked these two genomes out of a set of 300 after reviewing the nucmer_output folder. The situation is now a bit tricky: I need to identify and filter out every genome in the full dataset for which ANIm cannot be calculated. Does pyani have a filtering function for this? If not, it would be very helpful if such a feature could be included in a future update.
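
In the meantime, one possible workaround is to scan the nucmer output for comparisons that contain no alignment records. The sketch below assumes the files follow the MUMmer delta format, where each aligned sequence pair is introduced by a line starting with ">", and that files are named REFERENCE_vs_QUERY with a .filter extension. Both the naming and the extension are assumptions about the output layout, and the function names are hypothetical; adjust the glob pattern to match what is actually in your nucmer_output folder.

```python
from collections import Counter
from pathlib import Path


def delta_has_alignments(path: Path) -> bool:
    """True if the delta file contains at least one '>' record header."""
    with open(path) as handle:
        return any(line.startswith(">") for line in handle)


def find_empty_comparisons(nucmer_dir, pattern="*.filter"):
    """List delta files under nucmer_dir that contain no alignments.

    The default pattern is an assumption; use "*.delta" if appropriate.
    """
    return sorted(
        p for p in Path(nucmer_dir).glob(pattern) if not delta_has_alignments(p)
    )


def count_failures_per_genome(empty_files):
    """Count how often each genome appears in an empty comparison.

    Assumes file stems of the form REFERENCE_vs_QUERY.
    """
    counts = Counter()
    for path in empty_files:
        ref, sep, qry = path.stem.partition("_vs_")
        if sep:  # skip files that do not follow the assumed naming
            counts.update([ref, qry])
    return counts
```

A genome that shows up in many empty comparisons is a likely outlier, for example via count_failures_per_genome(find_empty_comparisons("./7_ANI/nucmer_output")).most_common(10), and could be removed from the input directory before rerunning.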

peterjc · 2025-02-13T11:58:07Z

The team is aware of this issue and thinking about how to best handle it as part of a larger engineering effort rewriting pyANI: https://github.com/pyani-plus/pyani-plus

Note pyANI-plus is still unfinished and has not had a public release, but for this specific example the failed comparisons are handled gracefully with a null recorded in the database. This does currently limit the downstream analysis options - see pyani-plus/pyani-plus#272 in particular. If you fancy trying this out, some early end-user feedback would be welcome... but there is no end-user facing documentation as yet beyond the command line help.

f12345zxcvbnm2 closed this as completed Feb 20, 2025
