util: use ~ as a suffix for a temp file in the same directory as a source file
ClosedPublic

Authored by mbolin on Aug 21 2017, 8:39 PM.

Details

Summary

Tools like Buck have patterns to ignore the creation of files (in the working
copy) that match certain patterns:

https://github.com/facebook/buck/blob/39278a4f0701c5239eae148968dc1ed4cc8661f7/src/com/facebook/buck/cli/Main.java#L259-L299

When Buck sees a new source file (as reported by Watchman), it has to invalidate
a number of caches associated with the directory that contains the file.
Using a standard suffix, such as ~, would make it easier for Buck and others
to filter out these types of file creation events.

The other uses of tempfile.mkstemp() in Hg do not appear to be problematic
because they (generally speaking) do not specify the dir parameter, so the
new file is created in the system-appropriate temp directory, which is outside
the working copy.

Test Plan

make tests

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
mbolin created this revision.Aug 21 2017, 8:39 PM
quark retitled this revision from Use ~ as a suffix when creating a temp file in the same directory as a source file. to util: use ~ as a suffix when creating a temp file in the same directory as a source file.Aug 21 2017, 9:17 PM
mbolin retitled this revision from util: use ~ as a suffix when creating a temp file in the same directory as a source file to util: use ~ as a suffix for temp file in the same directory as a source file.Aug 23 2017, 12:48 AM
mbolin retitled this revision from util: use ~ as a suffix for temp file in the same directory as a source file to util: use ~ as a suffix for a temp file in the same directory as a source file.
durin42 accepted this revision as: durin42.Aug 23 2017, 11:48 AM

I don't object to this, but maybe others do. Reviewers not on vacation, please feel encouraged to push this.

mbolin edited the test plan for this revision. (Show Details)Aug 23 2017, 1:36 PM
smf added a subscriber: smf.Aug 23 2017, 2:33 PM
In D468#7748, @durin42 wrote:

I don't object to this, but maybe others do. Reviewers not on vacation, please feel encouraged to push this.

Well, I had some objections on the mailing list. It's sad that those don't show up in phabricator. Basically, I'd like a more unified approach for all types of temp files (commit message, histedit, conflicts, etc).

quark added a comment.Aug 23 2017, 3:04 PM
In D468#7833, @smf wrote:

Basically, I'd like a more unified approach for all types of temp files (commit message, histedit, conflicts, etc).

I think this patch is about low-level util function that shouldn't be coupled with ui or config. It has value on its own and I don't think such "unified approach" should block this patch - we can always add configs and make callers of util.mktempcopy pass suffix down here AFTER this patch.

mercurial/util.py
1517

Could you make suffix a keyword argument here, default to ~?

Therefore we can give callsite control about what extension it wants.

1528

And use suffix=suffix here.

quark requested changes to this revision.Aug 23 2017, 3:09 PM

Back to you to add the keyword argument.

This revision now requires changes to proceed.Aug 23 2017, 3:09 PM
mbolin requested review of this revision.Aug 23 2017, 3:26 PM

@quark Rather than exposing the additional keyword argument now, why don't we wait until there is a compelling use-case to use something other than ~? As it stands, creating an API for this would just open things up to misuse.

quark added a comment.Aug 23 2017, 3:30 PM
In D468#7846, @mbolin wrote:

@quark Rather than exposing the additional keyword argument now, why don't we wait until there is a compelling use-case to use something other than ~? As it stands, creating an API for this would just open things up to misuse.

@smf: I think this is a question for you.

I'm personally fine with either way.

smf added a comment.Aug 23 2017, 4:37 PM
In D468#7836, @quark wrote:
In D468#7833, @smf wrote:

Basically, I'd like a more unified approach for all types of temp files (commit message, histedit, conflicts, etc).

I think this patch is about low-level util function that shouldn't be coupled with ui or config. It has value on its own and I don't think such "unified approach" should block this patch - we can always add configs and make callers of util.mktempcopy pass suffix down here AFTER this patch.

Um, what? That's the whole point of review. Yes, I know you think you can do this AFTER (why are we yelling?). I don't like this patch as-is and am frankly tired of trying to jam stuff through review and promise to clean up later. I am against this patch.

In D468#7908, @smf wrote:
In D468#7836, @quark wrote:
In D468#7833, @smf wrote:

Basically, I'd like a more unified approach for all types of temp files (commit message, histedit, conflicts, etc).

I think this patch is about low-level util function that shouldn't be coupled with ui or config. It has value on its own and I don't think such "unified approach" should block this patch - we can always add configs and make callers of util.mktempcopy pass suffix down here AFTER this patch.

Um, what? That's the whole point of review. Yes, I know you think you can do this AFTER (why are we yelling?). I don't like this patch as-is and am frankly tired of trying to jam stuff through review and promise to clean up later. I am against this patch.

As a bystander here, I'm not even sure what the proposals are. Can one of you summarize what the competing ideas are?

(In general I agree with smf that a more unified error-resistant API should be a goal, and I'm -0 on adding complexity to hg in the name of buck performance if there's a better proposal on the table (but if there is I've missed it.))

smf added a comment.Aug 23 2017, 4:45 PM
In D468#7909, @durin42 wrote:
In D468#7908, @smf wrote:
In D468#7836, @quark wrote:
In D468#7833, @smf wrote:

Basically, I'd like a more unified approach for all types of temp files (commit message, histedit, conflicts, etc).

I think this patch is about low-level util function that shouldn't be coupled with ui or config. It has value on its own and I don't think such "unified approach" should block this patch - we can always add configs and make callers of util.mktempcopy pass suffix down here AFTER this patch.

Um, what? That's the whole point of review. Yes, I know you think you can do this AFTER (why are we yelling?). I don't like this patch as-is and am frankly tired of trying to jam stuff through review and promise to clean up later. I am against this patch.

As a bystander here, I'm not even sure what the proposals are. Can one of you summarize what the competing ideas are?

(In general I agree with smf that a more unified error-resistant API should be a goal, and I'm -0 on adding complexity to hg in the name of buck performance if there's a better proposal on the table (but if there is I've missed it.))

From the mailing list (just posting this here for now so I can reply to it in the next message):

Jun Wu <quark@fb.com> writes:

I think this patch is about changing the suffix. I guess some less powerful
editors may only support matching file types by suffixes.

Yeah, it seems that way.

Please correct me if I'm wrong, but I failed to see how the
"experimental.editortmpinhg" setting could be used as-is to change the
suffix. Therefore I think the patch is still needed.

Maybe we can change it to a config option? Like, instead of saying:

extra_defaults = {
    'prefix': 'editor',
    'suffix': '.txt',
}

Changing ".txt" there to be a config option (maybe
experimental.editorsuffix).

That seems a bit over complicated to me. Why not just just use the
random tmp as a directory instead of a file while renaming the file at
the same time:

/tmp/XYZ123/hg-editor (or HG_EDITOR or whatever)

This would allow 'editortmpinhg'[1] to just be .hg/hg-editor (or
HG_EDITOR etc). The '~' seems a bit like a quick hack and I think I'd
prefer to do this cleanly. My logic here is:

  1. if we can append a character to the suffix, then we should be able to

change the directory

  1. we might want different 'file types' for different tmp files and '~' seems that it might not get us far enough. For instance: a) editor, b) conflicts c) histedit etc

For (a) and (c), I hacked something together for my fork of magit
(called mahgic) so that I can have different modes for commit (to show
the diff) and for histedit (so that 'tab' and 'enter' will show the
commit at the point).

[1] I planned on making editortmpinhg a more thorough thing so that
everything would be in the .hg directory so that programs (like emacs)
would be able to find the correct repo given only the tmp file.

@smf As I put in the summary, I think this use of tempfile.mkstemp() is different than the others in the codebase because it uses the dir= argument to create a file in the working copy. As such, I'd argue that it's reasonable to consider it separately from the others.

In particular, I think it's independent of how temp files are created for things like commit messages as those are paths that are intended to be exposed to the user insofar as they are opened in the user's editor. As it stands, I have D464 out for review as a first step to impose some order on path names for those types of files.

As far as this change is concerned, the focus is to preserve the existing behavior of mktempcopy() while minimizing its impact on other developer tools.

smf added a comment.Aug 23 2017, 4:49 PM
In D468#7910, @smf wrote:

That seems a bit over complicated to me. Why not just just use the
random tmp as a directory instead of a file while renaming the file at
the same time:

/tmp/XYZ123/hg-editor (or HG_EDITOR or whatever)

This would allow 'editortmpinhg'[1] to just be .hg/hg-editor (or
HG_EDITOR etc). The '~' seems a bit like a quick hack and I think I'd
prefer to do this cleanly. My logic here is:

  1. if we can append a character to the suffix, then we should be able to change the directory
  2. we might want different 'file types' for different tmp files and '~' seems that it might not get us far enough. For instance: a) editor, b) conflicts c) histedit etc

    For (a) and (c), I hacked something together for my fork of magit (called mahgic) so that I can have different modes for commit (to show the diff) and for histedit (so that 'tab' and 'enter' will show the commit at the point).

    [1] I planned on making editortmpinhg a more thorough thing so that everything would be in the .hg directory so that programs (like emacs) would be able to find the correct repo given only the tmp file.

Basically, I was pondering aloud if '~' would be enough to future-proof us and if we shouldn't just rename all temp files to something unique (HG_EDITOR for commits, HG_HISTEDIT for histedit, etc). What I was hoping for was a discussion on that.

@smf Personally, I think that D464 is a better place to have that discussion.

In D468#7914, @smf wrote:

Basically, I was pondering aloud if '~' would be enough to future-proof us and if we shouldn't just rename all temp files to something unique (HG_EDITOR for commits, HG_HISTEDIT for histedit, etc). What I was hoping for was a discussion on that.

This sounds like a good idea. However, if I'm understanding @mbolin correctly, we /also/ want a ~ suffix (or similar) so that devtools don't need to grow hg-specific logic for tempfiles (since, in the case of a ~ suffix, we'll be enough like vim that they will already correctly ignore things). Do I have a reasonable handle on this, or is there more nuance that I'm missing?

smf added a comment.Aug 23 2017, 4:52 PM
In D468#7912, @mbolin wrote:

@smf As I put in the summary, I think this use of tempfile.mkstemp() is different than the others in the codebase because it uses the dir= argument to create a file in the working copy. As such, I'd argue that it's reasonable to consider it separately from the others.

In particular, I think it's independent of how temp files are created for things like commit messages as those are paths that are intended to be exposed to the user insofar as they are opened in the user's editor. As it stands, I have D464 out for review as a first step to impose some order on path names for those types of files.

As far as this change is concerned, the focus is to preserve the existing behavior of mktempcopy() while minimizing its impact on other developer tools.

Jesus, I confused myself on this patch. Yes, you're right. I completely mistook this for other patches I worked on mucking around with the tmp directories.

That's definitely on me, sorry!

smf added a comment.Aug 23 2017, 4:54 PM
In D468#7916, @durin42 wrote:
In D468#7914, @smf wrote:

Basically, I was pondering aloud if '~' would be enough to future-proof us and if we shouldn't just rename all temp files to something unique (HG_EDITOR for commits, HG_HISTEDIT for histedit, etc). What I was hoping for was a discussion on that.

This sounds like a good idea. However, if I'm understanding @mbolin correctly, we /also/ want a ~ suffix (or similar) so that devtools don't need to grow hg-specific logic for tempfiles (since, in the case of a ~ suffix, we'll be enough like vim that they will already correctly ignore things). Do I have a reasonable handle on this, or is there more nuance that I'm missing?

Yeah, I don't know what happened. I guess I misread this and thought it was similar to work I did before. I withdraw my concerns and apologize for derailing this.

smf added a comment.Aug 23 2017, 4:54 PM
In D468#7915, @mbolin wrote:

@smf Personally, I think that D464 is a better place to have that discussion.

Sigh. Yes, this is what I was confused about.

@smf so are you OK with this patch as-is?

smf added a comment.Aug 23 2017, 4:59 PM
In D468#7920, @mbolin wrote:

@smf so are you OK with this patch as-is?

Yeah, should be fine.

quark resigned from this revision.Aug 24 2017, 12:53 AM

@durin42 I believe we've come to a resolution on this thread that this is fine as-is, so are you OK to take this now?

durin42 accepted this revision.Aug 29 2017, 11:20 AM
This revision is now accepted and ready to land.Aug 29 2017, 11:20 AM
This revision was automatically updated to reflect the committed changes.