This is an archive of the discontinued Mercurial Phabricator instance.

hgweb: log error before attempting I/O
ClosedPublic

Authored by indygreg on Jan 30 2019, 3:56 PM.

Details

Summary

Previously, an uncaught exception during HTTP request serving would
attempt to send an error response then log the exception.

If an exception occurred during I/O, this exception would be
raised and the original exception wouldn't be logged.

This commit changes behavior so the original exception is logged
first, before we attempt to do anything else. This ensures the
exception is logged.

This change resulted in new tracebacks appearing in various tests.

Because tracebacks can vary between Python versions, we added a
simple script to filter the stack part of traceback lines. This
makes testing much simpler, as we don't need to glob over lines
and make lines conditional.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

indygreg created this revision.Jan 30 2019, 3:56 PM
This revision was automatically updated to reflect the committed changes.

While this is better, it triggers a few failures on Windows. I've seen these connection resets with hg serve outside of the test harness. Any ideas? It almost seems like the client is closing the socket before all of the data is sent. (This failure occurs on py2)

--- c:/Users/Matt/projects/hg/tests/test-hgweb-filelog.t
+++ c:/Users/Matt/projects/hg/tests/test-hgweb-filelog.t.err
@@ -1843,5 +1843,69 @@
 errors

   $ cat errors.log
+  $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/log/3/a?linerange=1':
+  Traceback (most recent call last):
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 102, in do_POST
+      self.do_write()
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 95, in do_write
+      self.do_hgweb()
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 205, in do_hgweb
+      self._done()
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 258, in _done
+      self.wfile.write('0\r\n\r\n')
+    File "C:\Python27\Lib\socket.py", line 328, in write
+      self.flush()
+    File "C:\Python27\Lib\socket.py", line 307, in flush
+      self._sock.sendall(view[write_offset:write_offset+buffer_size])
+  error: [Errno 10053] An established connection was aborted by the software in your host machine
+
+  $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/log/3/a?linerange=1:a':
+  Traceback (most recent call last):
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 102, in do_POST
+      self.do_write()
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 95, in do_write
+      self.do_hgweb()
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 202, in do_hgweb
+      self._write(chunk)
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 253, in _write
+      self.wfile.write(data)
+    File "C:\Python27\Lib\socket.py", line 328, in write
+      self.flush()
+    File "C:\Python27\Lib\socket.py", line 307, in flush
+      self._sock.sendall(view[write_offset:write_offset+buffer_size])
+  error: [Errno 10054] $ECONNRESET$
+
+  $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/log/3/a?linerange=1:2&linerange=3:4':
+  Traceback (most recent call last):
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 102, in do_POST
+      self.do_write()
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 95, in do_write
+      self.do_hgweb()
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 202, in do_hgweb
+      self._write(chunk)
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 253, in _write
+      self.wfile.write(data)
+    File "C:\Python27\Lib\socket.py", line 328, in write
+      self.flush()
+    File "C:\Python27\Lib\socket.py", line 307, in flush
+      self._sock.sendall(view[write_offset:write_offset+buffer_size])
+  error: [Errno 10054] $ECONNRESET$
+
+  $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/log/3/a?linerange=3:2':
+  Traceback (most recent call last):
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 102, in do_POST
+      self.do_write()
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 95, in do_write
+      self.do_hgweb()
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 202, in do_hgweb
+      self._write(chunk)
+    File "c:\Users\Matt\projects\hg\mercurial\hgweb\server.py", line 253, in _write
+      self.wfile.write(data)
+    File "C:\Python27\Lib\socket.py", line 328, in write
+      self.flush()
+    File "C:\Python27\Lib\socket.py", line 307, in flush
+      self._sock.sendall(view[write_offset:write_offset+buffer_size])
+  error: [Errno 10054] $ECONNRESET$
+

   $ cd ..

ERROR: test-hgweb-filelog.t output changed
!.
--- c:/Users/Matt/projects/hg/tests/test-hgweb.t
+++ c:/Users/Matt/projects/hg/tests/test-hgweb.t.err
@@ -913,16 +913,12 @@
   $ cat errors.log | "$PYTHON" $TESTDIR/filtertraceback.py
   $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/?cmd=spam': (glob)
   Traceback (most recent call last):
-  error: [Errno 104] $ECONNRESET$
-
-  $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/spam': (glob)
-  Traceback (most recent call last):
-  error: [Errno 104] $ECONNRESET$
-
-  $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/spam/tip/foo': (glob)
-  Traceback (most recent call last):
-  error: [Errno 104] $ECONNRESET$
-
+  error: [Errno 10054] $ECONNRESET$\r (esc)
+
+  $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/spam/tip/foo':\r (esc)
+  Traceback (most recent call last):\r (esc)
+  error: [Errno 10054] $ECONNRESET$\r (esc)
+  \r (esc)
   $ rm -f errors.log

 Uncaught exceptions result in a logged error and canned HTTP response

ERROR: test-hgweb.t output changed
!
Failed test-hgweb-filelog.t: output changed
Failed test-hgweb.t: output changed

@durin42 and I were both able to produce intermittent failures with this change. Why, I'm not sure.

Does the Windows regression occur with this patch or with D5751? I ask because the behavior of socket.shutdown() is documented to vary by platform. I wouldn't at all be surprised if D5751 caused the regression seen here.

@durin42 and I were both able to produce intermittent failures with this change. Why, I'm not sure.
Does the Windows regression occur with this patch or with D5751? I ask because the behavior of socket.shutdown() is documented to vary by platform. I wouldn't at all be surprised if D5751 caused the regression seen here.

It was this patch.

Like I said, I’ve seen connection resets in the non-daemonized serve output (don’t recall if it was push, pull, or both). I thought it was weird, but there were no other signs of problems, so I wrote it off as some odd Windows thing. (Not sure that I’ve seen it on Linux or OS X too)

@durin42 and I were both able to produce intermittent failures with this change. Why, I'm not sure.
Does the Windows regression occur with this patch or with D5751? I ask because the behavior of socket.shutdown() is documented to vary by platform. I wouldn't at all be surprised if D5751 caused the regression seen here.

It was this patch.
Like I said, I’ve seen connection resets in the non-daemonized serve output (don’t recall if it was push, pull, or both). I thought it was weird, but there were no other signs of problems, so I wrote it off as some odd Windows thing. (Not sure that I’ve seen it on Linux or OS X too)

Does D5764 help the situation? It made a big difference for me.

@durin42 and I were both able to produce intermittent failures with this change. Why, I'm not sure.
Does the Windows regression occur with this patch or with D5751? I ask because the behavior of socket.shutdown() is documented to vary by platform. I wouldn't at all be surprised if D5751 caused the regression seen here.

It was this patch.
Like I said, I’ve seen connection resets in the non-daemonized serve output (don’t recall if it was push, pull, or both). I thought it was weird, but there were no other signs of problems, so I wrote it off as some odd Windows thing. (Not sure that I’ve seen it on Linux or OS X too)

Does D5764 help the situation? It made a big difference for me.

That helps a lot. I was able to run this ~200 times in --loop without issue (with py3). Then I got a few flyers:

--- d:/mercurial/tests/test-hgweb-filelog.t
+++ d:/mercurial/tests/test-hgweb-filelog.t.err
@@ -1843,5 +1843,19 @@
 errors

   $ cat errors.log
+  $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/log/3/a?linerange=1:a':
+  Traceback (most recent call last):
+    File "d:\mercurial\mercurial\hgweb\server.py", line 102, in do_POST
+      self.do_write()
+    File "d:\mercurial\mercurial\hgweb\server.py", line 95, in do_write
+      self.do_hgweb()
+    File "d:\mercurial\mercurial\hgweb\server.py", line 204, in do_hgweb
+      self._write(chunk)
+    File "d:\mercurial\mercurial\hgweb\server.py", line 255, in _write
+      self.wfile.write(data)
+    File "C:\Program Files\Python37\Lib\socketserver.py", line 796, in write
+      self._sock.sendall(b)
+  ConnectionAbortedError: [WinError 10053] An established connection was aborted by the software in your host machine
+

   $ cd ..

ERROR: test-hgweb-filelog.t output changed
!..................
--- d:/mercurial/tests/test-hgweb.t
+++ d:/mercurial/tests/test-hgweb.t.err
@@ -911,6 +911,10 @@
 errors

   $ cat errors.log | "$PYTHON" $TESTDIR/filtertraceback.py
+  $LOCALIP - - [$ERRDATE$] Exception happened during processing request '/spam':\r (esc)
+  Traceback (most recent call last):\r (esc)
+  ConnectionAbortedError: [WinError 10053] An established connection was aborted by the software in your host machine\r (esc)
+  \r (esc)
   $ rm -f errors.log

 Uncaught exceptions result in a logged error and canned HTTP response

ERROR: test-hgweb.t output changed

So there might be a small race here, but I'm not concerned, given how rare it is.