Python 3.7+ will "coerce" the locale (specifically LC_CTYPE) in many situations,
and this can prevent chg from starting. My previous fix in D7550 did not cover
all situations correctly, but hopefully this fix does.
The C side of chg will set CHGORIG_LC_CTYPE in its environment before starting
the command server and before calling setenv on the command server.
When calculating the environment hash, we use the value from CHGORIG_LC_CTYPE,
intentionally ignoring any modifications that Python may have made to LC_CTYPE
during command server startup.
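
A minimal sketch of that hash calculation, not the actual chgserver.py code.
The helper names and the use of SHA-1 are hypothetical. The encoding of
CHGORIG_LC_CTYPE ("1" followed by the original value when LC_CTYPE was set,
"0" when it was unset) is an assumption inferred from the example in this
message:

```python
import hashlib

ORIG_KEY = "CHGORIG_LC_CTYPE"  # variable name taken from this message


def env_for_hash(env):
    """Return a copy of env with LC_CTYPE as the user originally had it."""
    env = dict(env)
    orig = env.pop(ORIG_KEY, None)
    if orig is not None:
        # Assumed encoding: "1<value>" = originally set, "0" = originally unset.
        if orig.startswith("1"):
            env["LC_CTYPE"] = orig[1:]
        else:
            env.pop("LC_CTYPE", None)
    return env


def env_hash(env):
    """Hash the environment, ignoring Python's coercion of LC_CTYPE."""
    items = sorted(env_for_hash(env).items())
    return hashlib.sha1(repr(items).encode("utf-8")).hexdigest()[:12]
```

With this, a server whose LC_CTYPE was coerced to C.UTF-8 hashes the same as
an uncoerced environment with the user's original LC_CTYPE.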
When chg calls setenv on the command server, the command server will see
CHGORIG_LC_CTYPE in the environment-to-set, and will NOT overwrite its own
LC_CTYPE with the one from the environment-to-set. This preserves the
modifications that Python made during startup. We still calculate the hash
from CHGORIG_LC_CTYPE, so we detect environment changes even when they don't
change the value the server actually ends up with. Example:
- LC_CTYPE=invalid_1 chg cmd
- Py3.7 sets LC_CTYPE=C.UTF-8 on Linux
- CHGORIG_LC_CTYPE=1invalid_1
- Environment hash is computed as if 'LC_CTYPE=invalid_1', even though the server really has LC_CTYPE=C.UTF-8
- LC_CTYPE=invalid_2 chg cmd
- Connect to the existing server, call setenv
- Calculate hash as if 'LC_CTYPE=invalid_2', even though the server's actual value is identical to that of the other command server (C.UTF-8)
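
The setenv handling described above could look roughly like this. It is a
sketch under assumptions, not the real chgserver.py implementation:
apply_setenv is a hypothetical name and plain dicts stand in for the process
environment.

```python
def apply_setenv(server_env, new_env):
    """Replace server_env with new_env, but keep the server's own LC_CTYPE
    when the client sent CHGORIG_LC_CTYPE (i.e. a chg that knows about
    Python's locale coercion)."""
    updated = dict(new_env)
    if "CHGORIG_LC_CTYPE" in new_env:
        # Do not take LC_CTYPE from the client; keep what the server has.
        updated.pop("LC_CTYPE", None)
        if "LC_CTYPE" in server_env:
            updated["LC_CTYPE"] = server_env["LC_CTYPE"]
    server_env.clear()
    server_env.update(updated)


# Second invocation from the example: the server was started with
# LC_CTYPE=invalid_1, which Python coerced to C.UTF-8.
server = {"LC_CTYPE": "C.UTF-8", "CHGORIG_LC_CTYPE": "1invalid_1"}
apply_setenv(server, {"LC_CTYPE": "invalid_2", "CHGORIG_LC_CTYPE": "1invalid_2"})
print(server["LC_CTYPE"])          # C.UTF-8
print(server["CHGORIG_LC_CTYPE"])  # 1invalid_2
```

The coerced LC_CTYPE survives, while CHGORIG_LC_CTYPE now carries the new
original value, so a hash based on it differs from the first server's hash.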
Detecting a change that has no effect isn't a huge issue in practice. It can
cause two functionally identical command servers to be started. This should not
be considered an observable/intentional effect, and it may change in the future.
This is hopefully a more future-proof fix than the original one in D7550: we
won't have to worry about behavior changes (or incorrect readings of the current
behavior) in future versions of Python. If more environment variables end up
being modified, it's a simple one-line fix in chg.c to preserve those as well.
Important Caveat: if something causes one of these variables to change *inside*
the hg serve process, we're going to end up persisting that value. Example:
- Command server starts up, Python sets LC_CTYPE=C.UTF-8
- Some extension sets LC_CTYPE=en_US.UTF-8 in the environment
- The next invocation of chg will call setenv, saying via CHGORIG_LC_CTYPE that the variable should not be in the environment
- chgserver.py will preserve LC_CTYPE=en_US.UTF-8
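
The caveat can be reproduced with a small model of the server's setenv
handling. Everything here is illustrative: setenv_keeping_lc_ctype is a
hypothetical helper, and "0" as the marker for "LC_CTYPE was originally unset"
is an assumed encoding.

```python
def setenv_keeping_lc_ctype(server_env, requested_env):
    """Return the server's new environment after a setenv request."""
    result = dict(requested_env)
    if "CHGORIG_LC_CTYPE" in requested_env:
        # Ignore the client's LC_CTYPE and keep the server's current one.
        result.pop("LC_CTYPE", None)
        if "LC_CTYPE" in server_env:
            result["LC_CTYPE"] = server_env["LC_CTYPE"]
    return result


server_env = {"LC_CTYPE": "C.UTF-8"}    # set by Python at startup
server_env["LC_CTYPE"] = "en_US.UTF-8"  # changed later by some extension

# Next invocation: the user has no LC_CTYPE in their environment.
server_env = setenv_keeping_lc_ctype(server_env, {"CHGORIG_LC_CTYPE": "0"})
print(server_env["LC_CTYPE"])  # en_US.UTF-8
```

The server cannot tell a value set by Python at startup from one set later by
an extension, so whatever is current at setenv time is what gets kept.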
This is quite unlikely and would previously have caused a different problem:
- Command server starts up, let's assume py2 and so LC_CTYPE is unmodified
- Some extension sets LC_CTYPE=en_US.UTF-8 in the environment
- The next invocation of chg will call setenv, saying LC_CTYPE shouldn't be in the environment
- chgserver.py will say that the environment hash doesn't match and redirect chg to a new server
- chg will create that server and use it, but it will have an identical hash to the previous one (since at startup the extension has not yet modified LC_CTYPE). This should be fine; it will then run the command as normal.
- Every time chg is run, it restarts the command server due to this issue, slowing everything down :)