This is an archive of the discontinued Mercurial Phabricator instance.

tests: add a file listing all the flaky tests
AbandonedPublic

Authored by lothiraldan on Mar 12 2019, 5:58 AM.

Details

Reviewers
None
Group Reviewers
hg-reviewers
Summary

This file can be used by CI instances to skip tests known to be flaky and
focus on tests with no false negatives.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

lothiraldan created this revision.Mar 12 2019, 5:58 AM
av6 added a subscriber: av6.Mar 12 2019, 8:04 AM

Another way to do this could be to introduce another special keyword, similar to "slow", so that tests that are known flaky can #require flaky or even mark a specific block with #if flaky. Obviously fixing ​test-remotefilelog-bgprefetch.t is a better idea altogether, but I'm not going to claim that it's easy.

In D6122#89169, @av6 wrote:

Another way to do this could be to introduce another special keyword, similar to "slow", so that tests that are known flaky can #require flaky or even mark a specific block with #if flaky. Obviously fixing ​test-remotefilelog-bgprefetch.t is a better idea altogether, but I'm not going to claim that it's easy.

For this specific test case, it's not flaky on our Buildbot instance (https://buildbot.mercurial-scm.org/builders/hg%20tests). It only happens on exotic filesystems (like inside a Docker container) and when running the tests in parallel.

Flaky tests on our buildbot instances are usually fixed quite quickly, this contrib file is more for tests that are flaky outside our buildbot setup.

The big difference between #require slow and #require flaky is that we don't want to run the slow tests by default but we want to run the flaky test by default if not specified otherwise. I don't know enough about the test runner to evaluate the amount of work need to support a --no-flaky-tests flag,

@av6 is a --no-flaky-tests flag what you had in mind?

av6 added a comment.Mar 21 2019, 8:31 AM

@av6 is a --no-flaky-tests flag what you had in mind?

Something like that, yes. I suggested that flag seeing how this change introduces a quite specific file with just one line. And I thought "if it's important, let's integrate it deeper into the test runner".

But a flag for run-tests.py is not the only alternative. Now that I think about #require, can ​test-remotefilelog-bgprefetch.t be marked with #require not-an-exotic-fs-inside-docker (or whatever the actual thing that causes flakiness if it's known)? hghave.py already checks various filesystem features, why not make it check for the cause of issue6083?

In D6122#89819, @av6 wrote:

@av6 is a --no-flaky-tests flag what you had in mind?

Something like that, yes. I suggested that flag seeing how this change introduces a quite specific file with just one line. And I thought "if it's important, let's integrate it deeper into the test runner".
But a flag for run-tests.py is not the only alternative. Now that I think about #require, can ​test-remotefilelog-bgprefetch.t be marked with #require not-an-exotic-fs-inside-docker (or whatever the actual thing that causes flakiness if it's known)? hghave.py already checks various filesystem features, why not make it check for the cause of issue6083?

For this specific test, we are not sure that it's the exact issue, it's our best lead. I'm afraid that identifying the underlying cause will be 90% of the work fixing it.

My goal is to allow people running tests outside our buildbot and gcc build farm to being able to quickly identify which test error are expected or not. And for CI setups, we want the flaky test to not impact the return code as it would makes the build fails in most of CI setups.

Identifying the root cause of the issue is a problem that takes time and expertise, the docker exotic fs lead is actually from @durin42, I would have not guessed it. I would like to find a way for people to report that a test is flaky on their setup without requiring them to dig too deep in the issue.

I would be ok with the --no-flaky-tests flag, even if I found the #require flaky a bit confusing.

In D6122#89819, @av6 wrote:

@av6 is a --no-flaky-tests flag what you had in mind?

Something like that, yes. I suggested that flag seeing how this change introduces a quite specific file with just one line. And I thought "if it's important, let's integrate it deeper into the test runner".
But a flag for run-tests.py is not the only alternative. Now that I think about #require, can ​test-remotefilelog-bgprefetch.t be marked with #require not-an-exotic-fs-inside-docker (or whatever the actual thing that causes flakiness if it's known)? hghave.py already checks various filesystem features, why not make it check for the cause of issue6083?

For this specific test, we are not sure that it's the exact issue, it's our best lead. I'm afraid that identifying the underlying cause will be 90% of the work fixing it.
My goal is to allow people running tests outside our buildbot and gcc build farm to being able to quickly identify which test error are expected or not. And for CI setups, we want the flaky test to not impact the return code as it would makes the build fails in most of CI setups.
Identifying the root cause of the issue is a problem that takes time and expertise, the docker exotic fs lead is actually from @durin42, I would have not guessed it. I would like to find a way for people to report that a test is flaky on their setup without requiring them to dig too deep in the issue.
I would be ok with the --no-flaky-tests flag, even if I found the #require flaky a bit confusing.

@av6 What is your final thoughts on this patch?

lothiraldan abandoned this revision.Oct 5 2019, 11:14 AM