This file can be used by CI instances to skip tests known to be flaky and
focus on tests with no false negatives.
Details
- Reviewers
- None
- Group Reviewers
hg-reviewers
Diff Detail
- Repository
- rHG Mercurial
- Lint
Lint Skipped - Unit
Unit Tests Skipped
Event Timeline
Another way to do this could be to introduce another special keyword, similar to "slow", so that tests that are known flaky can #require flaky or even mark a specific block with #if flaky. Obviously fixing test-remotefilelog-bgprefetch.t is a better idea altogether, but I'm not going to claim that it's easy.
For this specific test case, it's not flaky on our Buildbot instance (https://buildbot.mercurial-scm.org/builders/hg%20tests). It only happens on exotic filesystems (like inside a Docker container) and when running the tests in parallel.
Flaky tests on our buildbot instances are usually fixed quite quickly, this contrib file is more for tests that are flaky outside our buildbot setup.
The big difference between #require slow and #require flaky is that we don't want to run the slow tests by default but we want to run the flaky test by default if not specified otherwise. I don't know enough about the test runner to evaluate the amount of work need to support a --no-flaky-tests flag,
@av6 is a --no-flaky-tests flag what you had in mind?
Something like that, yes. I suggested that flag seeing how this change introduces a quite specific file with just one line. And I thought "if it's important, let's integrate it deeper into the test runner".
But a flag for run-tests.py is not the only alternative. Now that I think about #require, can test-remotefilelog-bgprefetch.t be marked with #require not-an-exotic-fs-inside-docker (or whatever the actual thing that causes flakiness if it's known)? hghave.py already checks various filesystem features, why not make it check for the cause of issue6083?
For this specific test, we are not sure that it's the exact issue, it's our best lead. I'm afraid that identifying the underlying cause will be 90% of the work fixing it.
My goal is to allow people running tests outside our buildbot and gcc build farm to being able to quickly identify which test error are expected or not. And for CI setups, we want the flaky test to not impact the return code as it would makes the build fails in most of CI setups.
Identifying the root cause of the issue is a problem that takes time and expertise, the docker exotic fs lead is actually from @durin42, I would have not guessed it. I would like to find a way for people to report that a test is flaky on their setup without requiring them to dig too deep in the issue.
I would be ok with the --no-flaky-tests flag, even if I found the #require flaky a bit confusing.