test_timeline_ancestor_detach_idempotent_success: increase timeout #10464

Closed
erikgrinaker opened this issue Jan 21, 2025 · 1 comment · Fixed by #10490
Labels: a/test/flaky (Area: related to flaky tests), c/storage (Component: storage)

Comments

@erikgrinaker
Contributor

Consider increasing the timeout here, unless there are signs of actual problems.

See https://neonprod.grafana.net/d/fddp4rvg7k2dcf/regression-test-failures?orgId=1&var-test_name=test_timeline_ancestor_detach_idempotent_success%5Bshards_initial_after2%5D&from=now-7d&to=now&timezone=utc&var-restrict=true&var-max_count=100&var-reference=$__all&var-ignore_reference=refs%2Fpull%2F0000%2Fmerge

test_runner/regress/test_timeline_detach_ancestor.py:639: in test_timeline_ancestor_detach_idempotent_success
    env.storage_controller.reconcile_until_idle()
test_runner/fixtures/neon_fixtures.py:2091: in reconcile_until_idle
    raise RuntimeError("Timeout in reconcile_until_idle")
E   RuntimeError: Timeout in reconcile_until_idle
@erikgrinaker added the a/test/flaky and c/storage labels on Jan 21, 2025
@arpad-m
Member

arpad-m commented Jan 23, 2025

Hmm yeah, looking at the storcon logs of this run and filtering for "Applying optimization", it seems to migrate the shards one by one from one location to the other, so there is constant progress and not a hang. The same goes for two other flaky failures I looked at.

So increasing the timeout makes the most sense I think.
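
For anyone repeating this check on another failure, here is a rough sketch of the log filtering described above. It only assumes that the storage controller log is available as lines of text and that the relevant lines contain the string "Applying optimization"; the function name and the sample log lines are made up for illustration and do not reflect the real log format.

```python
from typing import Iterable, List


def find_optimization_lines(
    log_lines: Iterable[str], marker: str = "Applying optimization"
) -> List[int]:
    """Return the 0-based indices of log lines that contain `marker`.

    Matches spread across the whole log up to the point of the timeout
    suggest steady progress rather than a hang.
    """
    return [i for i, line in enumerate(log_lines) if marker in line]


# Hypothetical log excerpt, invented for this example.
sample_log = [
    "INFO Applying optimization: shard 0001",
    "INFO heartbeat",
    "INFO Applying optimization: shard 0002",
    "INFO Applying optimization: shard 0003",
]
matches = find_optimization_lines(sample_log)
print(f"{len(matches)} optimizations, last at line {matches[-1]}")
```

If the matches stop well before the timeout fires, that would point at a hang rather than slow progress, and a longer timeout would not help.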

github-merge-queue bot pushed a commit that referenced this issue Jan 23, 2025
Sometimes, especially when the host running the tests is overloaded, we
can run into reconcile timeouts in
`test_timeline_ancestor_detach_idempotent_success`, making the test
flaky. By increasing the timeouts from 30 seconds to 120 seconds, we can
address the flakiness.

Fixes #10464
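
To illustrate why a longer timeout helps when progress is steady but slow, here is a minimal sketch of a poll-until-idle loop with a configurable timeout. It is not the actual `reconcile_until_idle` fixture code: the function name, the `reconcile_all` callback, and the parameter names are assumptions made for this example. Only the error message and the 120 second value come from this thread.

```python
import time
from typing import Callable


def reconcile_until_idle_sketch(
    reconcile_all: Callable[[], int],
    timeout_secs: float = 120.0,
    poll_interval_secs: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until `reconcile_all()` reports zero pending reconciles.

    Returns the number of polls performed. Raises RuntimeError if work is
    still pending once `timeout_secs` has elapsed.
    """
    deadline = clock() + timeout_secs
    polls = 0
    while True:
        polls += 1
        # `reconcile_all` is a hypothetical callback returning the number
        # of reconciles that are still pending.
        if reconcile_all() == 0:
            return polls
        if clock() >= deadline:
            raise RuntimeError("Timeout in reconcile_until_idle")
        sleep(poll_interval_secs)
```

With a loop of this shape, a run that migrates shards one at a time on an overloaded host fails only because the deadline is too short, so raising `timeout_secs` changes nothing else about the behavior.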