graph: improve graph output by using Unicode characters
ClosedPublic

Authored by johnstiles on May 26 2018, 2:25 AM.

Details

Summary

This extension beautifies log -G output by using Unicode characters.

A terminal with UTF-8 support and a monospace Unicode font are required.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
johnstiles created this revision.May 26 2018, 2:25 AM
yuja added a subscriber: yuja.May 28 2018, 9:48 AM
This extension beautifies log -G output by using Unicode characters.

A terminal with UTF-8 support and a monospace Unicode font are required.

First, I'm not sure if we'll want to maintain this extension in core since
unicode generally sucks.

That said, can you fix style nits detected by test-check-code.t and add some
tests so that anyone loving unicode outputs can start reviewing and queue this?

+def extsetup(ui):
+ def convertedges(line):
+ def prettyedge(before, edge, after):
+ if edge == '~': return '╧'
+ if edge == 'X': return '╳'
+ if edge == '/': return '╱'
+ if edge == '-': return '─'
+ if edge == '|': return '│'
+ if edge == ':': return '┆'
+ if edge == '\\': return '╲'

+ if edge == '+':
+ if before == ' ' and not after == ' ': return '├'
+ if after == ' ' and not before == ' ': return '┤'
+ return '┼'

+ if node == 'o': return '◯'
+ if node == '@': return '⌾'
+ if node == '*': return '∗'
+ if node == 'x': return '◌'
+ if node == '_': return '╤'

These non-ASCII characters will have to be expressed in `\\xNN' form for
Python 3 compatibility. Any strings in Mercurial are bytes.

+ def outputprettygraph(orig, ui, graph, *args, **kwargs):
+ if not ui.plain('graph') and sys.stdout.encoding == "UTF-8":

s/sys.stdout.encoding/encoding.encoding/

The canonical encoding source is encoding.encoding. And we'll have to check
encoding._wide because most of the glyphs used here are "East Asian Ambiguous"
characters.

https://www.unicode.org/reports/tr11/tr11-33.html

johnstiles updated this revision to Diff 8918.May 28 2018, 6:05 PM

Converted non-ASCII characters to `\xNN' form for Python 3 compatibility.
Converted sys.stdout.encoding to encoding.encoding.

johnstiles updated this revision to Diff 8919.May 28 2018, 7:25 PM

Addressed warnings found by test-check-code.t.

johnstiles updated this revision to Diff 8920.May 28 2018, 7:40 PM

Corrected indenting issue. (In converting from 2-space to 4-space indents, I misaligned this block.)

johnstiles updated this revision to Diff 8921.May 29 2018, 3:03 AM

getprettygraphnode now honors HGPLAIN and UI encoding type.
Added tests.

yuya, I believe these patches should address all of your concerns except for encoding._wide. I am not sure what your expectation is for that. I don't think it would make sense to silently disable the extension if encoding._wide is set; IMO that would cause user confusion. If a user sees that the log lines are not lining up properly, that is easy to understand, diagnose and fix (change your Terminal font). If the extension just silently doesn't work, that is much harder to fix.

Let me know if you see anything else that you would like addressed! Thanks :)

yuja added a comment.May 29 2018, 9:22 AM

except for encoding._wide. I am not sure what your expectation is for that.
I don't think it would make sense to silently disable the extension if
encoding._wide is set; IMO that would cause user confusion.

I think that's also true for encoding != 'UTF-8'. The simplest solution
would be to show a warning when loading the extension.

def extsetup(ui):
    if encoding.encoding != 'UTF-8' or encoding._wide:
        ui.warn(_('unsupported encoding or terminal setting\n'))
        return
smf added a subscriber: smf.May 29 2018, 10:17 PM

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

yuya, I believe these patches should address all of your concerns except for `encoding._wide`. I am not sure what your expectation is for that. I don't think it would make sense to silently disable the extension if `encoding._wide` is set; IMO that would cause user confusion. If a user sees that the log lines are not lining up properly, that is easy to understand, diagnose and fix (change your Terminal font). If the extension just silently doesn't work, that is much harder to fix.

Let me know if you see anything else that you would like addressed! Thanks :)

Oh, crazy. I just started hacking something like this up last week. I'll
have to check this out.

In D3665#57649, @smf wrote:

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

yuya, I believe these patches should address all of your concerns except for `encoding._wide`. I am not sure what your expectation is for that. I don't think it would make sense to silently disable the extension if `encoding._wide` is set; IMO that would cause user confusion. If a user sees that the log lines are not lining up properly, that is easy to understand, diagnose and fix (change your Terminal font). If the extension just silently doesn't work, that is much harder to fix.

Let me know if you see anything else that you would like addressed! Thanks :)

Oh, crazy. I just started hacking something like this up last week. I'll
have to check this out.

Awesome! Great minds think alike ;)

If you figure out a way to improve the look of diagonal lines transitioning into a straight vertical line, let me know... I'm still not in love with those. We need more box characters!

smf added a comment.May 29 2018, 10:47 PM

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

In https://phab.mercurial-scm.org/D3665#57649, @smf wrote:

> johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:
>
> > johnstiles added a comment.
> > 
> >   yuya, I believe these patches should address all of your concerns except for `encoding._wide`. I am not sure what your expectation is for that. I don't think it would make sense to silently disable the extension if `encoding._wide` is set; IMO that would cause user confusion. If a user sees that the log lines are not lining up properly, that is easy to understand, diagnose and fix (change your Terminal font). If the extension just silently doesn't work, that is much harder to fix.
> >   
> >   Let me know if you see anything else that you would like addressed! Thanks :)
>
> Oh, crazy. I just started hacking something like this up last week. I'll
>  have to check this out.
>
> - F93615: signature.asc <https://phab.mercurial-scm.org/F93615>


Awesome! Great minds think alike ;)

:-)

If you figure out a way to improve the look of diagonal lines transitioning into a straight vertical line, let me know... I'm still not in love with those. We need more box characters!

Yeah, that's going to be difficult. I started down this rabbit hole when
I was looking for a way to display diagonal dotted lines. I'm not sure
if there is an answer.

johnstiles updated this revision to Diff 8926.May 30 2018, 3:29 AM
johnstiles updated this revision to Diff 8927.May 30 2018, 3:35 AM

A proper warning is now issued when the encoding is not UTF-8, or when East-Asian ambiguous characters will be rendered as wide characters.
The tests have been updated to check these warnings.
Yuya, I believe this should address everything we discussed. Let me know if you see a need for any further changes.

yuja added a comment.May 30 2018, 8:31 AM
A proper warning is now issued when the encoding is not UTF-8, or when East-Asian ambiguous characters will be rendered as wide characters.
The tests have been updated to check these warnings.
Yuya, I believe this should address everything we discussed. Let me know if you see a need for any further changes.

Yeah, the changes look good to me, thanks. Sean and others, can you review
and queue this if it seems good to go into the core extension? I'm 0 on this
feature.

A couple more nits:

--- tests/test-check-commit.t
+++ tests/test-check-commit.t.err
@@ -25,3 +25,14 @@
   >     fi
   >   done
   > fi
+  Revision 2c4617bda693 does not comply with rules
+  ------------------------------------------------------
+  167: adds double empty line
+   +
+  235: adds double empty line
+   +
+  1209: adds double empty line
+   +
+  1403: adds double empty line
+   +
+  

ERROR: test-check-commit.t output changed
!...
--- tests/test-check-pylint.t
+++ tests/test-check-pylint.t.err
@@ -19,3 +19,22 @@
   ------------------------------------ (?)
   Your code has been rated at 10.00/10 (?)
    (?)
+  Using config file $TESTTMP/fakerc
+  ************* Module hgext.beautifygraph
+  C: 38,29: More than one statement on a single line (multiple-statements)
+  C: 39,29: More than one statement on a single line (multiple-statements)
+  C: 40,29: More than one statement on a single line (multiple-statements)
+  C: 41,29: More than one statement on a single line (multiple-statements)
+  C: 42,29: More than one statement on a single line (multiple-statements)
+  C: 43,29: More than one statement on a single line (multiple-statements)
+  C: 44,29: More than one statement on a single line (multiple-statements)
+  C: 60,24: More than one statement on a single line (multiple-statements)
+  C: 61,24: More than one statement on a single line (multiple-statements)
+  C: 62,24: More than one statement on a single line (multiple-statements)
+  C: 63,24: More than one statement on a single line (multiple-statements)
+  C: 64,24: More than one statement on a single line (multiple-statements)
  1. beautifygraph.py - improve graph output by using Unicode characters

Can you add a copyright line and GPL placeholder?

A terminal with UTF-8 support and a monospace Unicode font are required.

Strictly speaking, the font doesn't matter. It's up to terminal emulator
how to lay out each character.

from mercurial import encoding
from mercurial import extensions
from mercurial import templatekw
from mercurial import graphmod

We prefer `from mercurial import (

foo,
bar,
baz,

)`.

FWIW, this isn't rendering nicely with PuTTY on Windows 10. The U+25EF ◯ glyph is being truncated on the right side. I'm also seeing empty squares for U+233E ⌾ and other code points. Using Courier New as the terminal font. LANG=en_US.UTF-8 and TERM=xterm-256color, so Mercurial detects the encoding as UTF-8.

Not a show stopper. But a bit disappointing. I can probably safely blame PuTTY/Windows here.

FWIW, this isn't rendering nicely with PuTTY on Windows 10. The U+25EF ◯ glyph is being truncated on the right side. I'm also seeing empty squares for U+233E ⌾ and other code points. Using Courier New as the terminal font. LANG=en_US.UTF-8 and TERM=xterm-256color, so Mercurial detects the encoding as UTF-8.

Not a show stopper. But a bit disappointing. I can probably safely blame PuTTY/Windows here.

I haven't tested in Windows, unfortunately! I've run the extension in OS X Terminal and on Linux in xfce4-terminal and it's been fine, but Windows text rendering is obviously going to be different :(

Have you experimented with different fonts?

I installed Dejavu Sans Mono from https://dejavu-fonts.github.io/ and it works great!

I installed Dejavu Sans Mono from https://dejavu-fonts.github.io/ and it works great!

That's great to hear!

How do you like it? It's a pretty simple extension but it definitely scratches an itch for me.

I think this is a cool idea! I could nitpick some of the glyph choices (e.g. U+233E ⌾ is really small and harder to read than @ and U+25CC ◌ looks like a circle and therefore the standard node type). But overall I like it!

We could roll this functionality into core pretty easily and put it behind an e.g. ui.graph-unicode config setting. Or we could find some way to templatize the graph characters so people could modify them more easily. Having all the edge and node characters hard-coded feels a bit awkward. But I think this is scope bloat and shouldn't block this feature from landing. Others may have different opinions...

hgext/beautifygraph.py
37

These don't need to be nested functions / closures. Please move to the module level.

johnstiles added a comment.EditedMay 31 2018, 12:59 AM

I think this is a cool idea! I could nitpick some of the glyph choices (e.g. U+233E ⌾ is really small and harder to read than @ and U+25CC ◌ looks like a circle and therefore the standard node type). But overall I like it!

We could roll this functionality into core pretty easily and put it behind an e.g. ui.graph-unicode config setting. Or we could find some way to templatize the graph characters so people could modify them more easily. Having all the edge and node characters hard-coded feels a bit awkward. But I think this is scope bloat and shouldn't block this feature from landing. Others may have different opinions...

The glyphs can vary pretty significantly depending on your OS and font of choice. Here's an album of how things render in OS X Terminal: https://imgur.com/a/xFkj4zv

I generally use Menlo so it looked great for me during testing, but some of the results are underwhelming for sure. Also, I'm not actually convinced my terminal is rendering Roboto Mono properly as the box characters don't even come close to full height.

That being said, having done this test, I would totally be open to replacing ⌾ and ◯ . I don't like how large the hollow circle can get relative to the double-circle. There are a lot of potential circle candidates:

U+25EF ◯ <- today's o
U+233E ⌾ <- today's @
U+25CC ◌ <- today's x
U+235F ⍟
U+25CD ◍
U+26AC ⚬
U+25CB ○
U+274D ❍
U+2689 ⚉
U+23E3 ⏣
U+25CF ●
U+29BB ⦻
U+2A37 ⨷
U+FFEE ○

I'm finding that on OS X, the following two glyphs get good results in almost every font, and great results in Menlo and DejaVu Sans Mono. I wish they popped a little more in Ubuntu Mono but they are still better than the current choices. I'll switch over to Linux now to verify there.
U+235F ⍟
U+25CB ○

I experimented some more in Linux and got some surprisingly different outcomes. Fallback rules definitely vary! Anyway, I think these glyphs are going to work the best across the widest variety of fonts. They are in the same character group so they should all be roughly the same size and shape, and in most monospace fonts they are large enough to be visually distinct.

U+25CB ○ <- changeset
U+25CC ◌ <- obsolete
U+25CD ◍ <- active

I hope these look good on Windows! :)

johnstiles updated this revision to Diff 8939.May 31 2018, 1:59 AM
johnstiles marked an inline comment as done.

OK, I've fixed the latest nits in this diff, as well as changing the circle characters as discussed to:

U+25CB ○ <- changeset
U+25CC ◌ <- obsolete (as before)
U+25CD ◍ <- active

How do the new glyphs look in PuTTY with Deja Vu?
Is there anything else that you need me to look at in this patch?

Just let me know if there's anything I need to do to keep things progressing. :)

Hi there @indygreg and @yuja -- are you happy with the diff as it currently stands? Do you think this needs any additional work?

spectral accepted this revision.Jun 8 2018, 3:53 PM
spectral added a subscriber: spectral.

lgtm, but yuja had asked for sean and others to look, and I don't have push access :)

smf added a comment.Jun 14 2018, 7:02 PM

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

Hi there @indygreg and @yuja  -- are you happy with the diff as it currently stands? Do you think this needs any additional work?

Sorry for the late reply; was having email filtering issues (hopefully
fixed now).

Overall, I'm +0 on it because I think the "real" solution to this is to
templatize the graph glpyhs so any user could use whatever unicode they
desire. However, that's a tall order I wouldn't put on you so I'm fine
with putting this in and adding a note along the lines of, "the path to
get this in core is to get the graph log templated."

The only hiccup I keep having is the misalignment of graphs like these:

Do you think there's any way to fix those? I suspect the answer is "no"
unless someone wants to write their own font?

In D3665#58616, @smf wrote:

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

Hi there @indygreg and @yuja  -- are you happy with the diff as it currently stands? Do you think this needs any additional work?

Sorry for the late reply; was having email filtering issues (hopefully
fixed now).

Overall, I'm +0 on it because I think the "real" solution to this is to
templatize the graph glpyhs so any user could use whatever unicode they
desire. However, that's a tall order I wouldn't put on you so I'm fine
with putting this in and adding a note along the lines of, "the path to
get this in core is to get the graph log templated."

I had originally pushed for this to be templating or at least additional configuration for the graphlog, we currently have ways of changing the | and : markers via config (not template), but these don't support unicode because of https://www.mercurial-scm.org/repo/hg/file/tip/mercurial/graphmod.py#l476, which is for this: https://www.mercurial-scm.org/repo/hg/file/tip/tests/test-glog.t#l3072

smf added a comment.Jun 14 2018, 7:23 PM

spectral (Kyle Lippincott) <phabricator@mercurial-scm.org> writes:

spectral added a comment.

In https://phab.mercurial-scm.org/D3665#58616, @smf wrote:

> johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:
>
> > johnstiles added a comment.
> > 
> >   Hi there @indygreg and @yuja  -- are you happy with the diff as it currently stands? Do you think this needs any additional work?
>
> Sorry for the late reply; was having email filtering issues (hopefully
>  fixed now).
>
> Overall, I'm +0 on it because I think the "real" solution to this is to
>  templatize the graph glpyhs so any user could use whatever unicode they
>  desire. However, that's a tall order I wouldn't put on you so I'm fine
>  with putting this in and adding a note along the lines of, "the path to
>  get this in core is to get the graph log templated."


I had originally pushed for this to be templating or at least additional configuration for the graphlog, we currently have ways of changing the | and : markers via config (not template), but these don't support unicode because of https://www.mercurial-scm.org/repo/hg/file/tip/mercurial/graphmod.py#l476, which is for this: https://www.mercurial-scm.org/repo/hg/file/tip/tests/test-glog.t#l3072

So, you'd add more config knobs and fix that issue?

It all depends on your terminal font. It looks like the font you're using now has issues with U+2506 ┆ and is otherwise working as expected.

Most likely it is rendering ┆ as full-width instead of narrow-width. I'm almost positive this issue is fixable if you are willing to experiment with a different font or Terminal settings, because I've never seen that particular artifact. Usually it's been the circle glyphs that want to render over-wide. :)

If you are willing to experiment with it, there are a bunch of very similar glyphs that we could swap out and use instead if you think it would help the general case. (https://unicode-search.net/unicode-namesearch.pl?term=DASH) I'm certainly willing to explore it, but it would help to know a little more detail about your current setup (Terminal? Font? OS?) to see if I can repro the issue locally and see whether this would actually be widespread.

Broader picture: on the whole I think we can pick a really good starting point that works for the vast majority of users' setups, but we are unlikely to land on a 100% solution for all fonts and terminals. Some fonts just look bad or won't align perfectly. Some layouts will still have little gaps in them where Unicode doesn't provide the box characters we need for a perfect layout. On the other hand, the existing ASCII mode looks unaligned and goofy on all machines all of the time, so having an optional mode that looks better for most users is still a step in the right direction :)

Additional thoughts: I would love to add features to this patch, such as glyph customizability or even color coding. I can certainly get it done in a followup. On the other hand, I don't want to sink tons of free time into the process it if I can't get this simple version through review. This is a spare-time project for me and I don't need to add flexibility options if it's only ever going to exist on my PC. I'd like to at least get a foothold in, then work on improvements once this has landed, instead of working for weeks just to risk having the feature rejected at the end of the process and having nothing to show for it.

In D3665#58616, @smf wrote:

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

Hi there @indygreg and @yuja  -- are you happy with the diff as it currently stands? Do you think this needs any additional work?

Sorry for the late reply; was having email filtering issues (hopefully
fixed now).

Overall, I'm +0 on it because I think the "real" solution to this is to
templatize the graph glpyhs so any user could use whatever unicode they
desire. However, that's a tall order I wouldn't put on you so I'm fine
with putting this in and adding a note along the lines of, "the path to
get this in core is to get the graph log templated."

The only hiccup I keep having is the misalignment of graphs like these:

Do you think there's any way to fix those? I suspect the answer is "no"
unless someone wants to write their own font?











Side note: I'm unfamiliar with the lingo here, what's +0 on a feature? Indifference?

smf added a comment.Jun 14 2018, 8:48 PM

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

It all depends on your terminal font. It looks like the font you're using now has issues with U+2506 ┆ and is otherwise working as expected.

Yeah, that was my fault. I forgot I had a different font for unicode
symbols to fix a bug in powerline. Unfortunately, it still looks similar
when I put everything to one font setting: they all look a tiny bit
broken with the fonts I've tried. That list isn't large: Fira, DejuVu
Sans (from earlier in this thread), and a random assortment of system
fonts.

Do you have any other fonts I could try?

Most likely it is rendering ┆ as full-width instead of narrow-width. I'm almost positive this issue is fixable if you are willing to experiment with a different font or Terminal settings, because I've never seen that particular artifact. Usually it's been the circle glyphs that want to render over-wide. :)

If you are willing to experiment with it, there are a bunch of very similar glyphs that we could swap out and use instead if you think it would help the general case. (https://unicode-search.net/unicode-namesearch.pl?term=DASH) I'm certainly willing to explore it, but it would help to know a little more detail about your current setup (Terminal? Font? OS?) to see if I can repro the issue locally and see whether this would actually be widespread.

Broader picture: on the whole I think we can pick a really good starting point that works for the vast majority of users' setups, but we are unlikely to land on a 100% solution for all fonts and terminals. Some fonts just look bad or won't align perfectly. Some layouts will still have little gaps in them where Unicode doesn't provide the box characters we need for a perfect layout. On the other hand, the existing ASCII mode looks unaligned and goofy on all machines all of the time, so having an optional mode that looks better for most users is still a step in the right direction :)

Agreed. Maybe would could put some font suggestions in the help? (I
don't mean amending this patch, by the way, just asking in general).

Additional thoughts: I would love to add features to this patch, such as glyph customizability or even color coding. I can certainly get it done in a followup. On the other hand, I don't want to sink tons of free time into the process it if I can't get this simple version through review. This is a spare-time project for me and I don't need to add flexibility options if it's only ever going to exist on my PC. I'd like to at least get a foothold in, then work on improvements once this has landed, instead of working for weeks just to risk having the feature rejected at the end of the process and having nothing to show for it.

Oh, yeah, I totally agree. Apologies for the back-and-forth on this.
Lemme play around with some fonts and whatnot after you get back on some
suggestions, and then let's try to get this landed :-)

smf added a comment.Jun 14 2018, 8:49 PM

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

Side note: I'm unfamiliar with the lingo here, what's +0 on a feature? Indifference?

Ah, yeah, sorry. To me, at least, it's a "sure but I don't want to be
blamed if it fails" ;-P (basically, "I won't block this"). Maybe that
number is wrong though. Let's play with some fonts and see what we can
get out of it.

I have put together an album of various fonts from my discussions with @yuja earlier in the thread: https://imgur.com/a/xFkj4zv

I'm a Menlo or DejaVu guy myself.

smf added a comment.Jun 14 2018, 8:58 PM

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

I have put together an album of various fonts from my discussions with @yuja earlier in the thread: https://imgur.com/a/xFkj4zv

I'm a Menlo or DejaVu guy myself.

Ah, I only just now realized each image had the font name in it >_< OK,
cool! Lemme play with these tonight and get back to you tomorrow (feel
free to ping me if I forget or am late :-))

smf added a comment.Jun 15 2018, 5:50 PM

I've looked this over today and have queued this up :-) Unfortunately, though, the metadata doesn't seem right? I'm not getting your name or email (nor timestamp) for the patch. Do you want me to use the same name from 24e517600b29 (John Stiles <johnstiles@gmail.com>)?

In D3665#58828, @smf wrote:

I've looked this over today and have queued this up :-) Unfortunately, though, the metadata doesn't seem right? I'm not getting your name or email (nor timestamp) for the patch. Do you want me to use the same name from 24e517600b29 (John Stiles <johnstiles@gmail.com>)?

That's perfect, thanks!
The diff came from hg diff --git so maybe hg's git-diff emulation mode is lacking those fields.

smf added a comment.Jun 15 2018, 6:12 PM

johnstiles (John Stiles) <phabricator@mercurial-scm.org> writes:

johnstiles added a comment.

In https://phab.mercurial-scm.org/D3665#58828, @smf wrote:

> I've looked this over today and have queued this up :-) Unfortunately, though, the metadata doesn't seem right? I'm not getting your name or email (nor timestamp) for the patch. Do you want me to use the same name from https://phab.mercurial-scm.org/rHG24e517600b2986a3d5855456b3cdf0830ea0a79e (John Stiles <johnstiles@gmail.com>)?


That's perfect, thanks!
The diff came from `hg diff --git` so maybe hg's git-diff emulation mode is lacking those fields.

Ah, that would be why. 'hg diff' is for viewing the diff of files. 'hg
export' is the command that will also include author, date, branch name,
etc.

smf added a comment.Jun 15 2018, 6:33 PM

Could you send me the output of hg export REV | head or do you just want me to add your name and not worry about the date/time, etc.?

I'm not worried about the date/time info. You can just add in my name.
Thanks :)

quark added a subscriber: quark.EditedJun 16 2018, 2:21 AM

Maybe I should change cmd.exe font. But here's what I got pasting the text into the console:

EDIT: Depends on fonts. Some non-English fonts can render them while the most common ones like Consolas, Courier New will have difficulty.

The cmd.exe runs in utf-8 codepage (chcp 65001).

What are you trying to demonstrate here? I'm lost.

In D3665#58858, @quark wrote:

Maybe I should change cmd.exe font. But here's what I got pasting the text into the console:

quark added a comment.Jun 16 2018, 3:00 AM

What are you trying to demonstrate here? I'm lost.

I'm sorry you feel lost. Since Windows was mentioned in the thread and I happen to have a Windows system, I thought it's somehow useful to post a real cmd.exe screenshot with these characters - they do NOT render well. If that's not clear or is unrelated to the change, please ignore.

Without more context I have no idea what you are trying to show. Windows is
certainly capable of rendering Unicode characters in the console. It is
also very possible to get ? characters if you're running a
non-Unicode-aware tool or if there are encoding mix-up issues. "type con"
and pasting in text from a text editor doesn't really prove anything one
way or the other. (What editor? What encoding did it think the data was?
How does "type con" handle Unicode text? etc.)

Run the actual hgext and take a screenshot of what it generates if you want
to give a more useful data point for us.

quark added a comment.Jun 16 2018, 3:38 AM

Without more context I have no idea what you are trying to show. Windows is
certainly capable of rendering Unicode characters in the console. It is
also very possible to get ? characters if you're running a
non-Unicode-aware tool or if there are encoding mix-up issues. "type con"
and pasting in text from a text editor doesn't really prove anything one
way or the other. (What editor? What encoding did it think the data was?
How does "type con" handle Unicode text? etc.)

Run the actual hgext and take a screenshot of what it generates if you want
to give a more useful data point for us.

Copy-pasting would render utf-8 characters correctly if the font supports them. It does not seem to be related to the current codepage.

I edited the screenshot to show an explicit type a-utf-8-file with "utf-8" (65001) codepage, with different fonts.

Note: Mercurial will raise "unknown encoding: cp65001" in this case so it's probably safe if you have the encoding.encoding != 'utf8' check. But anything rendered using the utf-8 characters (be it from "git bash hg -G ... > a.txt" or whatever POSIX-like shells) will render sub-optimally in cmd.exe.

Are you capable of running the extension?

The output of type is irrelevant to me. The behavior of python.exe when outputting to the Windows shell is all that really matters here. If encoding.encoding reports UTF8, it should work or there's an Hg issue. If it reports something else, the extension will disable itself anyway.

In D3665#58863, @quark wrote:

Without more context I have no idea what you are trying to show. Windows is
certainly capable of rendering Unicode characters in the console. It is
also very possible to get ? characters if you're running a
non-Unicode-aware tool or if there are encoding mix-up issues. "type con"
and pasting in text from a text editor doesn't really prove anything one
way or the other. (What editor? What encoding did it think the data was?
How does "type con" handle Unicode text? etc.)

Run the actual hgext and take a screenshot of what it generates if you want
to give a more useful data point for us.

Copy-pasting would render utf-8 characters correctly if the font supports them. It does not seem to be related to the current codepage.

I edited the screenshot to show an explicit type a-utf-8-file with "utf-8" (65001) codepage, with different fonts.

Note: Mercurial will raise "unknown encoding: cp65001" in this case so it's probably safe if you have the encoding.encoding != 'utf8' check. But anything rendered using the utf-8 characters (be it from "git bash hg -G ... > a.txt" or whatever POSIX-like shells) will render sub-optimally in cmd.exe.

quark added a comment.EditedJun 16 2018, 2:10 PM

Are you capable of running the extension?

The output of type is irrelevant to me. The behavior of python.exe when outputting to the Windows shell is all that really matters here. If encoding.encoding reports UTF8, it should work or there's an Hg issue. If it reports something else, the extension will disable itself anyway.

Running hg installed in WSL inside cmd.exe, encoding.encoding is utf-8. Since it's still using cmd.exe font, it cannot render those fancy characters correctly.

The previous type outputs demonstrate clearly that the cmd.exe fonts are incapable of rendering those UTF-8 characters. So I don't think they are irrelevant, and I'm not going to post another screenshot.

Can you provide a screenshot of the actual Windows behavior?

quark added a comment.Jun 16 2018, 3:08 PM

To be clear, I have no interest in +1 or -1 this feature, and I'm not interested in spending more time testing it. I think I have made it very clear that Windows (at least WSL) is going to be a headache. Not to say, Linux (as my primary OS) font rendering is another story that might surprise you.

At a low-level, the Unicode specification does not define some characters to be "wide" or "narrow" explicitly. Using them is like "undefined behavior" depending on the actual font. encoding._wide cannot be accurate for two reasons - 1. no way to get the font details; 2. width is per character, not a global thing.

I have seen people testing software on macOS only, then think it improves everyone's life and proudly announce the feature. As a Linux/Windows user, I feel sad.

For what it's worth, it works great on Linux, so no need to feel sad. That's my primary dev environment. If you aren't interested in actually testing the extension itself I'm not sure why you are posting here, but thank you for the feedback about type con issues with UTF8 on Windows. It was very elucidating.

quark added a comment.Jun 16 2018, 3:42 PM

Since you mentioned Linux... Here is what your extension renders on my Linux terminal:

The characters do not render as question marks, because I have fonts covering them and fontconfig smartly choose fallback fonts (Windows does similar things).

I'm not saying others render the same as mine, but I'd repeat - "width" is per character, and is generally "undefined" - some of them are wide, some are narrow.

You can say "it works great on *my* Linux", but not other's Linux. Depending on what fonts are installed, what the rendering engine is, what fontconfig says about font substitution rules, etc. There are just too many ways to get ugly results. And it still complies the Unicode standard - since it's undefined.

Looks like your font is missing the dashed vertical line, and has an oddly small regular-circle glyph. I don't recognize the font at all so I can't really speak much more to that. Fortunately though...

(a) it's an extension which isn't on by default
(b) your font choices are your own. That doesn't look like any default font I've ever seen, so this is unlikely to affect many others

I cannot possibly promise that this extension will look good in every OS, font and terminal. That's not a reasonable goal. If I thought looking perfect everywhere was an option, I'd be patching the mainline graph view instead of making an extension that opts-out by default.

Do you have any ideas to improve the situation? There has been plenty of font discussion in the thread already; this is not really covering any new ground.

quark added a comment.Jun 16 2018, 4:17 PM

Looks like your font is missing the dashed vertical line, and has an oddly small regular-circle glyph. I don't recognize the font at all so I can't really speak much more to that. Fortunately though...

(a) it's an extension which isn't on by default
(b) your font choices are your own. That doesn't look like any default font I've ever seen, so this is unlikely to affect many others

Off topic. But there are many CJK fonts that you probably haven't seen. People using them are not a minority. So it's unfair to say "this is unlikely to affect many others".

I cannot possibly promise that this extension will look good in every OS, font and terminal. That's not a reasonable goal. If I thought looking perfect everywhere was an option, I'd be patching the mainline graph view instead of making an extension that opts-out by default.

Do you have any ideas to improve the situation? There has been plenty of font discussion in the thread already; this is not really covering any new ground.

I believe the most "correct" solution is to revise the Unicode standard and upgrade fonts. But that cannot happen anytime soon. I do feel sad about the current situation.

I don’t disagree with your premise, but the comments section of this patch are not really the right venue for this discussion.

indygreg requested changes to this revision.Jun 16 2018, 4:42 PM

I think having better display of graph symbols is a compelling end-user feature and should be part of Mercurial.

It is apparent that no set of advanced glyphs works in all terminals and fonts. But that shouldn't mean people can't opt in to using more visually appealing glyphs in graph output.

I think this extension is a step in the right direction. But I also don't feel comfortable exposing this extension to all users just yet because of the many environments it doesn't work in.

If we add (EXPERIMENTAL) to the first line of the module docstring (so it doesn't show up in hg help output), I'm comfortable accepting this extension as is. We can remove the experimental state of the feature once support for customizing glyphs is added. That can be done in the extension or (preferably) in core, alleviating the need for an extension.

While I'm here, I think a reasonable follow-up feature would is to have built-in "profiles" for graph glyphs so users could e.g. start with a set of glyphs that are known to work reasonably well given a certain terminal emulator, font, etc. We could also provide a hg debuggraphlog or similar that printed examples of the various built-in profiles so users could pick one that renders the best. This could be exposed via a ui.graph-symbols config option or similar.

This revision now requires changes to proceed.Jun 16 2018, 4:42 PM
johnstiles updated this revision to Diff 9116.Jun 16 2018, 5:09 PM

Thanks for the concrete feedback. I've uploaded a new diff. The (EXPERIMENTAL) tag has been added and I am now using hg export to generate the diff.

indygreg accepted this revision.Jun 16 2018, 5:17 PM
This revision is now accepted and ready to land.Jun 16 2018, 5:17 PM

🤩👍

Next patch, emoji!

indygreg requested changes to this revision.Jun 16 2018, 5:25 PM

I'm getting a few test errors applying this against the latest revision of hg repo:

--- /Users/gps/src/hg-committed/tests/test-check-commit.t
+++ /Users/gps/src/hg-committed/tests/test-check-commit.t.err
@@ -25,3 +25,10 @@
   >     fi
   >   done
   > fi
+  Revision 82a9045ac1a2 does not comply with rules
+  ------------------------------------------------------
+  1225: adds double empty line
+   +
+  1419: adds double empty line
+   +
+
--- /Users/gps/src/hg-committed/tests/test-check-module-imports.t
+++ /Users/gps/src/hg-committed/tests/test-check-module-imports.t.err
@@ -43,3 +43,5 @@
   > -X tests/test-lock.py \
   > -X tests/test-verify-repo-operations.py \
   > | sed 's-\\-/-g' | $PYTHON "$import_checker" -
+  hgext/beautifygraph.py:17: imports from mercurial not lexically sorted: graphmod < templatekw
+  [1]
--- /Users/gps/src/hg-committed/tests/test-check-code.t
+++ /Users/gps/src/hg-committed/tests/test-check-code.t.err
@@ -12,9 +12,13 @@
   > -X hgext/fsmonitor/pywatchman \
   > -X mercurial/thirdparty \
   > | sed 's-\\-/-g' | "$check_code" --warnings --per-file=0 - || false
+  hgext/beautifygraph.py:9:
+   > '''beautify log -G output by using Unicode characters (EXPERIMENTAL)
+   trailing whitespace
   Skipping i18n/polib.py it has no-che?k-code (glob)
   Skipping mercurial/statprof.py it has no-che?k-code (glob)
   Skipping tests/badserverext.py it has no-che?k-code (glob)
+  [1]
  File "$TESTTMP/printrevset.py", line 31, in printrevset
    ui.write(smartset.prettyformat(revs) + b'\n')
AttributeError: 'module' object has no attribute 'prettyformat'

Please rebase this against the latest Mercurial revision, fix the static analysis failures, and resubmit.

You should be able to test via: cd tests; ./run-tests.py -l -j4 test-check-commit.t test-check-module-imports.t test-check-code.t test-glog-beautifygraph.t.

This revision now requires changes to proceed.Jun 16 2018, 5:25 PM
smf added a comment.Jun 16 2018, 7:31 PM

Greg and Jun,

Before anyone else requests more changes or anything, this is a first time contributor and I was trying to be a bit more flexible as a reviewer. I actually queued the patch yesterday but due to the test failures, I haven't pushed. I'll personally fix up the tests and add the experimental flag.

John,

Don't worry about updating this patch or rebasing. I've already (locally) added "EXPERIMENTAL" and am fixing the other test failures, too. I just haven't finished yet because I'm trying to also finish my household chores :-P I'll post a diff of what I tweaked when I finish this weekend. Thanks for bearing with me on this discussion!

Thanks for the assist, @smf ! I appreciate it.

This revision was automatically updated to reflect the committed changes.
smf added a comment.Jun 16 2018, 9:07 PM

Thanks for the assist, @smf ! I appreciate it.

Sure, no problem :-)

By the way, here's the diff of what I changed:

diff --git a/hgext/beautifygraph.py b/hgext/beautifygraph.py
index 7ff3c08..254d2cc 100644
--- a/hgext/beautifygraph.py
+++ b/hgext/beautifygraph.py
@@ -4,23 +4,23 @@
 # Copyright 2018 John Stiles <johnstiles@gmail.com>
 #
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
 
-'''This extension beautifies log -G output by using Unicode characters.
+'''beautify log -G output by using Unicode characters (EXPERIMENTAL)
 
    A terminal with UTF-8 support and monospace narrow text are required.
 '''
 
 from __future__ import absolute_import
 
 from mercurial.i18n import _
 from mercurial import (
     encoding,
     extensions,
+    graphmod,
     templatekw,
-    graphmod
 )
 
 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
 # be specifying the version(s) of Mercurial they are tested with, or
diff --git a/tests/test-duplicateoptions.py b/tests/test-duplicateoptions.py
index 4511a89..397eca4 100644
--- a/tests/test-duplicateoptions.py
+++ b/tests/test-duplicateoptions.py
@@ -4,11 +4,11 @@ from mercurial import (
     commands,
     extensions,
     ui as uimod,
 )
 
-ignore = {b'highlight', b'win32text', b'factotum'}
+ignore = {b'highlight', b'win32text', b'factotum', b'beautifygraph'}
 
 if os.name != 'nt':
     ignore.add(b'win32mbcs')
 
 disabled = [ext for ext in extensions.disabled().keys() if ext not in ignore]
diff --git a/tests/test-glog-beautifygraph.t b/tests/test-glog-beautifygraph.t
index c3d1fb7..e62334f 100644
--- a/tests/test-glog-beautifygraph.t
+++ b/tests/test-glog-beautifygraph.t
@@ -89,10 +89,14 @@ o  (0) root
   >   logcmdutil,
   >   revsetlang,
   >   smartset,
   > )
   > 
+  > from mercurial.utils import (
+  >   stringutil,
+  > )
+  > 
   > def logrevset(repo, pats, opts):
   >     revs = logcmdutil._initialrevs(repo, opts)
   >     if not revs:
   >         return None
   >     match, pats, slowpath = logcmdutil._makematcher(repo, revs, pats, opts)
@@ -109,11 +113,11 @@ o  (0) root
   >             else:
   >                 tree = []
   >             ui = repo.ui
   >             ui.write(b'%r\n' % (opts.get(b'rev', []),))
   >             ui.write(revsetlang.prettyformat(tree) + b'\n')
-  >             ui.write(smartset.prettyformat(revs) + b'\n')
+  >             ui.write(stringutil.prettyrepr(revs) + b'\n')
   >             revs = smartset.baseset()  # display no revisions
   >         return revs, filematcher
   >     extensions.wrapfunction(logcmdutil, 'getrevs', printrevset)
   >     aliases, entry = cmdutil.findcmd(b'log', commands.table)
   >     entry[1].append((b'', b'print-revset', False,
@@ -324,1003 +328,1001 @@ The extension should not turn on if we'r
   
 
 The rest of our tests will use the default narrow text UTF-8.
 
   $ hg log -G -q
-  \xe2\x8c\xbe  34:fea3ac5810e0 (esc)
+  \xe2\x97\x8d  34:fea3ac5810e0 (esc)
   \xe2\x94\x82 (esc)
-  \xe2\x94\x82 \xe2\x97\xaf  33:68608f5145f9 (esc)
+  \xe2\x94\x82 \xe2\x97\x8b  33:68608f5145f9 (esc)

[... elided the rest]
smf added a comment.Jun 16 2018, 9:18 PM
In D3665#58976, @smf wrote:

Thanks for the assist, @smf ! I appreciate it.

Sure, no problem :-)

By the way, here's the diff of what I changed:

diff --git a/hgext/beautifygraph.py b/hgext/beautifygraph.py
index 7ff3c08..254d2cc 100644
--- a/hgext/beautifygraph.py
+++ b/hgext/beautifygraph.py
@@ -4,23 +4,23 @@
 # Copyright 2018 John Stiles <johnstiles@gmail.com>
 #
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
 
-'''This extension beautifies log -G output by using Unicode characters.
+'''beautify log -G output by using Unicode characters (EXPERIMENTAL)

In the Mercurial project, we use "EXPERIMENTAL" as a flag to mean "we have the option to remove this later." In general, we have a very strong breaking change policy (as in, we try to never have one). The only exceptions to that rule that I'm aware of are if it's a security problem (e.g. default clone of subrepositories changed due to major security flaws) or was broken from day 1.

   A terminal with UTF-8 support and monospace narrow text are required.
'''

from __future__ import absolute_import

from mercurial.i18n import _
from mercurial import (
    encoding,
    extensions,

+ graphmod,

templatekw,
  • graphmod )

I'll go through and briefly explain each hunk (just for reference and if any other newcomers find this patch). The graphmod import needed to be earlier due to our code style checker.

  1. Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
  2. extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
  3. be specifying the version(s) of Mercurial they are tested with, or diff --git a/tests/test-duplicateoptions.py b/tests/test-duplicateoptions.py index 4511a89..397eca4 100644
    • a/tests/test-duplicateoptions.py +++ b/tests/test-duplicateoptions.py @@ -4,11 +4,11 @@ from mercurial import ( commands, extensions, ui as uimod, )

      -ignore = {b'highlight', b'win32text', b'factotum'} +ignore = {b'highlight', b'win32text', b'factotum', b'beautifygraph'}

I only found this test breaking on the gcc compile farm with the output changing to "beautifygraph: unsupported encoding, UTF-8 required" so I choose to skip importing that extension in this test.

if os.name != 'nt':
    ignore.add(b'win32mbcs')

disabled = [ext for ext in extensions.disabled().keys() if ext not in ignore]

diff --git a/tests/test-glog-beautifygraph.t b/tests/test-glog-beautifygraph.t
index c3d1fb7..e62334f 100644

  • a/tests/test-glog-beautifygraph.t +++ b/tests/test-glog-beautifygraph.t @@ -89,10 +89,14 @@ o (0) root > logcmdutil, > revsetlang, > smartset, > ) > + > from mercurial.utils import ( + > stringutil, + > )

Ah, yes, a classic race condition with another dev changing something that you're working on.

> def logrevset(repo, pats, opts):
>     revs = logcmdutil._initialrevs(repo, opts)
>     if not revs:
>         return None
>     match, pats, slowpath = logcmdutil._makematcher(repo, revs, pats, opts)

@@ -109,11 +113,11 @@ o (0) root

>             else:
>                 tree = []
>             ui = repo.ui
>             ui.write(b'%r\n' % (opts.get(b'rev', []),))
>             ui.write(revsetlang.prettyformat(tree) + b'\n')
  • > ui.write(smartset.prettyformat(revs) + b'\n') + > ui.write(stringutil.prettyrepr(revs) + b'\n')

As I mentioned above, it seems Yuya changed this and it landed before your patch.

>             revs = smartset.baseset()  # display no revisions
>         return revs, filematcher
>     extensions.wrapfunction(logcmdutil, 'getrevs', printrevset)
>     aliases, entry = cmdutil.findcmd(b'log', commands.table)
>     entry[1].append((b'', b'print-revset', False,

@@ -324,1003 +328,1001 @@ The extension should not turn on if we'r

  

The rest of our tests will use the default narrow text UTF-8.

  $ hg log -G -q
  • \xe2\x8c\xbe 34:fea3ac5810e0 (esc) + \xe2\x97\x8d 34:fea3ac5810e0 (esc)

Ah, I think you forgot to update the test output when changing some of the characters you used (at least I hope that was the problem here). All I did to update the whole test at large was run with: run-tests.py -l tests/test-glog-beautifygraph.t -i. The -i meaning "interactively ask the user to accept this change" which will automatically change the test file for you.

\xe2\x94\x82 (esc)
  • \xe2\x94\x82 \xe2\x97\xaf 33:68608f5145f9 (esc) + \xe2\x94\x82 \xe2\x97\x8b 33:68608f5145f9 (esc)

    [... elided the rest] `

The rest of the diff was just more of the same as the previous error.

Hope that helps!

yuja added a comment.Jun 16 2018, 11:37 PM
> (a) it's an extension which isn't on by default
>  (b) your font choices are your own. That doesn't look like any default font I've ever seen, so this is unlikely to affect many others

Off topic. But there are many CJK fonts that you probably haven't seen. People using them are not a minority. So it's unfair to say "this is unlikely to affect many others".

For the record, as a person living in emoji country, I understand the
narrow/wide character madness as well as fonts issues. And we're trained to
avoid any 8-bit characters unless necessary because only ASCII (except 0x5c)
is trustworthy.

That's the reason I thought I wasn't the right person to judge the value
of this extension. If some people love this feature, we shouldn't stop it
just because it wouldn't work in many CJK environments.