This is an archive of the discontinued Mercurial Phabricator instance.

dirs: reject consecutive slashes in paths
ClosedPublic

Authored by durin42 on Nov 6 2019, 12:16 AM.

Details

Summary

We shouldn't ever see those, and the fuzzer go really excited that if
it gives us a 65k string with 55k slashes in it we use a lot of RAM.

This is a better fix than what I tried in D7105. It was suggested by
Yuya, and I verified it does in fact cause the fuzzer to not OOM.

This is a revision of D7234, but with the missing set of an error
added. I added a unit test of the dirs behavior because I needed to
reason more carefully about the failure modes around consecutive
slashes.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

durin42 created this revision.Nov 6 2019, 12:16 AM
durin42 edited the summary of this revision. (Show Details)Nov 6 2019, 12:18 AM
durin42 updated this revision to Diff 17619.
durin42 edited the summary of this revision. (Show Details)Nov 6 2019, 1:10 AM
durin42 updated this revision to Diff 17620.
indygreg accepted this revision.Nov 7 2019, 3:34 AM
This revision is now accepted and ready to land.Nov 7 2019, 3:34 AM
This revision was automatically updated to reflect the committed changes.

Sorry to necropost, but since this broke the Rust implementation, I was wondering what the best approach would be to replicate this behavior, and I am starting to think that this should be reverted.

IIUC, currently any new path passes through the pathauditor first for validation, so any further check would be completely redundant.
My intuition is that adding this check is purely here to satisfy the fuzzer, but would never happen in real life. Adding checks to this (very) internal data structure comes at a cost, both in performance and in code ergonomics.

Sorry to necropost, but since this broke the Rust implementation, I was wondering what the best approach would be to replicate this behavior, and I am starting to think that this should be reverted.

Oh, you mean the Rust version doesn't do the same rejection?

Given that you're about to do a hash lookup, I'm a little skeptical that a endswith('/') check would show up meaningfully in a profiler, but I'm willing to be proven wrong?

IIUC, currently any new path passes through the pathauditor first for validation, so any further check would be completely redundant.

Plausible, but I'd like some sort of test coverage demonstrating that.

My intuition is that adding this check is purely here to satisfy the fuzzer, but would never happen in real life. Adding checks to this (very) internal data structure comes at a cost, both in performance and in code ergonomics.

Yes, this was largely added to make the fuzzer not get stuck on OOM conditions. That said, if the pathauditor can't catch this, we need to defend against this DoS vector at this layer, and it's such a small check at this layer I'm inclined to keep it unless it is measurably slowing down real uses...

Sorry to necropost, but since this broke the Rust implementation, I was wondering what the best approach would be to replicate this behavior, and I am starting to think that this should be reverted.

Oh, you mean the Rust version doesn't do the same rejection?
Given that you're about to do a hash lookup, I'm a little skeptical that a endswith('/') check would show up meaningfully in a profiler, but I'm willing to be proven wrong?

IIUC, currently any new path passes through the pathauditor first for validation, so any further check would be completely redundant.

Plausible, but I'd like some sort of test coverage demonstrating that.

My intuition is that adding this check is purely here to satisfy the fuzzer, but would never happen in real life. Adding checks to this (very) internal data structure comes at a cost, both in performance and in code ergonomics.

Yes, this was largely added to make the fuzzer not get stuck on OOM conditions. That said, if the pathauditor can't catch this, we need to defend against this DoS vector at this layer, and it's such a small check at this layer I'm inclined to keep it unless it is measurably slowing down real uses...

I should add: if we can substantiate that such a path can't make it through the pathauditor (and we have tests for that in the pathauditor layer so we don't break that in the future!) we can push this consecutive-slashes check into dirs_fuzzer.cc and remove it from dirs.c.

Oh, you mean the Rust version doesn't do the same rejection?

It does not, currently.

Given that you're about to do a hash lookup, I'm a little skeptical that a endswith('/') check would show up meaningfully in a profiler, but I'm willing to be proven wrong?

I have the same intuition you do, it's mostly about not repeating the same operations that have any impact at all on performance/code.

I should add: if we can substantiate that such a path can't make it through the pathauditor (and we have tests for that in the pathauditor layer so we don't break that in the future!) we can push this consecutive-slashes check into dirs_fuzzer.cc and remove it from dirs.c.

It was my candid intuition that the pathauditor was already "ensured" as the barrier for path-based vulnerabilities.
Since my plan for the Rust implementation was to enforce that very fact, I guess I'll be the one to write said tests when I write a Rust pathauditor. It should come up not too long after my current work (in-progress) about matchers, since handling unknown files in any capacity will require the pathauditor.

I'm not 100% sure of the implications of writing the aforementioned tests, if it's too much work, I can replicate this check in Rust for the time being.

In practice, this means the tests consistently fails when testing with Rust. Can we either have a quick fix of the Rust code or a temporary backout of this (until we Rust code is fixed)?