#26594 closed defect (fixed)

Tor becomes unresponsive after configuration handling on Windows

Reported by: ahf
Owned by:
Priority: High
Milestone: Tor: 0.3.5.x-final
Component: Core Tor/Tor
Version: Tor: unspecified
Severity: Normal
Keywords: regression
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

When starting Tor from master (as of commit adcd1d8b9ac09f3abc11e2e3187fe363ad3df2fd), Tor never progresses past the initial configuration handling that happens before we bootstrap.

When starting Tor the following output will appear:

PS C:\Users\ahf\AppData\Local\Packages\TheDebianProject.DebianGNULinux_76v4gfsz19hv4\LocalState\rootfs\home\ahf\src\github.com\ahf\tor-win32\src\tor> .\src\or\tor.exe
Jul 01 20:58:33.065 [notice] Tor 0.3.5.0-alpha-dev (git-adcd1d8b9ac09f3a) running on Windows 8 with Libevent 2.1.8-stable, OpenSSL 1.0.2n, Zlib 1.2.11, Liblzma N/A, and Libzstd N/A.
Jul 01 20:58:33.068 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Jul 01 20:58:33.069 [notice] This version is not a stable Tor release. Expect more bugs than usual.
Jul 01 20:58:33.084 [notice] Configuration file "C:\Users\ahf\AppData\Roaming\tor\torrc" not present, using reasonable defaults.
Jul 01 20:58:33.086 [warn] Path for GeoIPFile (<default>) is relative and will resolve to C:\Users\ahf\AppData\Local\Packages\TheDebianProject.DebianGNULinux_76v4gfsz19hv4\LocalState\rootfs\home\ahf\src\github.com\ahf\tor-win32\src\tor\<default>. Is this what you wanted?
Jul 01 20:58:33.086 [warn] Path for GeoIPv6File (<default>) is relative and will resolve to C:\Users\ahf\AppData\Local\Packages\TheDebianProject.DebianGNULinux_76v4gfsz19hv4\LocalState\rootfs\home\ahf\src\github.com\ahf\tor-win32\src\tor\<default>. Is this what you wanted?

After this the process becomes unresponsive. The "Scheduler type KISTLite has been enabled." NOTICE log message is expected here, but is never shown.

Bisecting the issue

I tried to bisect the issue, and it looks like it first showed up after commit 696f6f15697260255146d634e1529202cc4c2b77 (the commit itself doesn't compile on Windows, but the following couple of commits are related fixes to that patch). The first commit that I can compile where this issue appears is 0b7452eeb2f2dee7acefee2d3ca2cb402a877ea1.

Analysis

I managed to track the issue down to a call to strcasecmp() in config_lines_eq() in src/lib/encoding/confline.c. I added the following debug output to get a slightly better understanding of what was going on (since gdb wasn't of much help here):

diff --git a/src/lib/encoding/confline.c b/src/lib/encoding/confline.c
index 7f535b321..4544465d3 100644
--- a/src/lib/encoding/confline.c
+++ b/src/lib/encoding/confline.c
@@ -14,6 +14,9 @@
 #include "lib/string/util_string.h"

 #include <string.h>
+#include <stdio.h>
+
+#define AHF_DEBUG(...) do { printf(__VA_ARGS__); fflush(stdout); } while (0)

 /** Helper: allocate a new configuration option mapping 'key' to 'val',
  * append it to *<b>lst</b>. */
@@ -232,8 +235,26 @@ int
 config_lines_eq(config_line_t *a, config_line_t *b)
 {
   while (a && b) {
+    AHF_DEBUG("a: %p\n", a);
+    AHF_DEBUG("b: %p\n", b);
+
+    AHF_DEBUG("a->key:   %p\n", a->key);
+    AHF_DEBUG("a->value: %p\n", a->value);
+    AHF_DEBUG("a->next:  %p\n\n", a->next);
+
+    AHF_DEBUG("b->key:   %p\n", b->key);
+    AHF_DEBUG("b->value: %p\n", b->value);
+    AHF_DEBUG("b->next:  %p\n\n", b->next);
+
+    AHF_DEBUG("a->key: '%s'\n", a->key);
+    AHF_DEBUG("a->value: '%s'\n", a->value);
+
+    AHF_DEBUG("b->key: '%s'\n", b->key);
+    AHF_DEBUG("b->value: '%s'\n", b->value);
+
     if (strcasecmp(a->key, b->key) || strcmp(a->value, b->value))
       return 0;
     a = a->next;
     b = b->next;
   }

Tor would now output the following when started:

$ ./src/or/tor.exe
Jul 01 21:24:47.029 [notice] Tor 0.3.5.0-alpha-dev (git-0b7452eeb2f2dee7) running on Very recent version of Windows [major=10,minor=0] with Libevent 2.1.8-stable, OpenSSL 1.0.2o, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd N/A.
Jul 01 21:24:47.029 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Jul 01 21:24:47.029 [notice] This version is not a stable Tor release. Expect more bugs than usual.
Jul 01 21:24:47.039 [notice] Configuration file "C:\Users\ahf\AppData\Roaming\tor\torrc" not present, using reasonable defaults.
Jul 01 21:24:47.041 [warn] Path for GeoIPFile (<default>) is relative and will resolve to C:\msys64\home\ahf\src\github.com\ahf\tor\<default>. Is this what you wanted?
Jul 01 21:24:47.041 [warn] Path for GeoIPv6File (<default>) is relative and will resolve to C:\msys64\home\ahf\src\github.com\ahf\tor\<default>. Is this what you wanted?
a: 0000000001C6C9E0
b: 0000000001C6CE60
a->key:   0000000001C6CC50
a->value: 00000000035DAC70
a->next:  0000000000000000

b->key:   0000000001C6CDA0
b->value: 00000000035DACC0
b->next:  0000000000000000

a->key: 'TestingV3AuthInitialVotingInterval'
a->value: '1800'
b->key: 'TestingV3AuthInitialVotingInterval'
b->value: '1800'

And after that it would become unresponsive again.

If we change the call from strcasecmp() to strcmp() in config_lines_eq(), Tor progresses and bootstraps successfully:

$ ./src/or/tor.exe
Jul 01 21:29:19.667 [notice] Tor 0.3.5.0-alpha-dev (git-0b7452eeb2f2dee7) running on Very recent version of Windows [major=10,minor=0] with Libevent 2.1.8-stable, OpenSSL 1.0.2o, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd N/A.
Jul 01 21:29:19.667 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Jul 01 21:29:19.667 [notice] This version is not a stable Tor release. Expect more bugs than usual.
Jul 01 21:29:19.678 [notice] Configuration file "C:\Users\ahf\AppData\Roaming\tor\torrc" not present, using reasonable defaults.
Jul 01 21:29:19.679 [warn] Path for GeoIPFile (<default>) is relative and will resolve to C:\msys64\home\ahf\src\github.com\ahf\tor\<default>. Is this what you wanted?
Jul 01 21:29:19.679 [warn] Path for GeoIPv6File (<default>) is relative and will resolve to C:\msys64\home\ahf\src\github.com\ahf\tor\<default>. Is this what you wanted?
a: 000000000139CB30
b: 000000000139C950
a->key:   000000000139C8C0
a->value: 000000000365AF30
a->next:  0000000000000000

b->key:   000000000139C7D0
b->value: 000000000365AF40
b->next:  0000000000000000

a->key: 'TestingV3AuthInitialVotingInterval'
a->value: '1800'
b->key: 'TestingV3AuthInitialVotingInterval'
b->value: '1800'

[ ... some debug output omitted here ... ]

Jul 01 21:29:19.683 [notice] Scheduler type KISTLite has been enabled.
Jul 01 21:29:19.683 [notice] Opening Socks listener on 127.0.0.1:9050
Jul 01 21:29:19.000 [notice] Bootstrapped 0%: Starting
Jul 01 21:29:20.000 [notice] Starting with guard context "default"
Jul 01 21:29:20.000 [notice] Bootstrapped 80%: Connecting to the Tor network

But since our configuration keys are supposed to be case-insensitive, this is not a fix for the problem.

Reproducing

I have tested this in two different environments: 64-bit Tor compiled via msys2 and 32-bit Tor compiled using my own build scripts from https://github.com/ahf/tor-win32 in a Debian WSL (Windows Subsystem for Linux) container. The results are the same. All of it was done on Windows 10.

Change History (24)

comment:1 Changed 14 months ago by ahf

Just a short note after some confusion in #tor-dev. I only *compile* Tor in WSL, but run it *outside* of WSL in the normal Windows environment. Building Tor where the target is a "native" Linux binary runs fine via WSL (and isn't interesting for this issue).

comment:2 Changed 14 months ago by ahf

Hello71 on #tor-dev says that the issue is reproducible in Wine as well.

comment:4 Changed 14 months ago by ahf

It is possible, yes.

My next step is to figure out why the unresponsiveness happens. I have a bit of a feeling that my understanding of the Windows platform is against me here, but I'm going to try to figure it out.

comment:5 Changed 14 months ago by cypherpunks

Reduce code to something like

strcasecmp("TestingV3AuthInitialVotingInterval", "TestingV3AuthInitialVotingInterval")

Does it hang?

comment:6 Changed 14 months ago by ahf

Here are the stack traces:

ahf@SHU MINGW64 ~
$ ps aux | grep tor
     3272    1092    3272       5452  pty0      197609 11:23:46 /home/ahf/src/github.com/ahf/tor/src/or/tor

ahf@SHU MINGW64 ~
$ gdb -p 3272
GNU gdb (GDB) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-w64-mingw32".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 3272
[New Thread 3272.0x16f0]
[New Thread 3272.0x1aec]
[New Thread 3272.0x15a4]
[New Thread 3272.0x1a88]
[New Thread 3272.0x23bc]
[New Thread 3272.0x1be4]
Reading symbols from C:\msys64\usr\bin\bash.exe...(no debugging symbols found)...done.
(gdb) thread apply all bt

Thread 6 (Thread 3272.0x1be4):
#0  0x00007ffb5495d881 in ntdll!DbgBreakPoint ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb549899fb in ntdll!DbgUiRemoteBreakin ()
   from C:\Windows\SYSTEM32\ntdll.dll
#2  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#3  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#4  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 5 (Thread 3272.0x23bc):
#0  0x00007ffb54959f74 in ntdll!ZwReadFile ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb50d72b36 in ReadFile () from C:\Windows\System32\KernelBase.dll
#2  0x0000000180122c9a in wait_sig(void*) ()
   from C:\msys64\usr\bin\msys-2.0.dll
#3  0x0000000180044693 in cygthread::callfunc(bool) ()
   from C:\msys64\usr\bin\msys-2.0.dll
#4  0x0000000180044c4a in cygthread::stub(void*) ()
   from C:\msys64\usr\bin\msys-2.0.dll
#5  0x0000000180045663 in _cygtls::call2(unsigned int (*)(void*, void*), void*, void*) () from C:\msys64\usr\bin\msys-2.0.dll
#6  0x0000000180045714 in _cygtls::call(unsigned int (*)(void*, void*), void*)
    () from C:\msys64\usr\bin\msys-2.0.dll
#7  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#8  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#9  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 4 (Thread 3272.0x1a88):
#0  0x00007ffb5495d804 in ntdll!ZwWaitForWorkViaWorkerFactory ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb548df856 in ntdll!RtlReleaseSRWLockExclusive ()
   from C:\Windows\SYSTEM32\ntdll.dll
#2  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#3  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#4  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 3 (Thread 3272.0x15a4):
#0  0x00007ffb5495d804 in ntdll!ZwWaitForWorkViaWorkerFactory ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb548df856 in ntdll!RtlReleaseSRWLockExclusive ()
   from C:\Windows\SYSTEM32\ntdll.dll
#2  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#3  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#4  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 2 (Thread 3272.0x1aec):
#0  0x00007ffb5495d804 in ntdll!ZwWaitForWorkViaWorkerFactory ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb548df856 in ntdll!RtlReleaseSRWLockExclusive ()
   from C:\Windows\SYSTEM32\ntdll.dll
#2  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#3  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#4  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (Thread 3272.0x16f0):
#0  0x00007ffb54959f34 in ntdll!ZwWaitForSingleObject ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb50d79252 in WaitForSingleObjectEx ()
   from C:\Windows\System32\KernelBase.dll
#2  0x00000001800eb103 in pinfo::maybe_set_exit_code_from_windows() ()
   from C:\msys64\usr\bin\msys-2.0.dll
#3  0x00000001800eb331 in pinfo::exit(unsigned int) ()
   from C:\msys64\usr\bin\msys-2.0.dll
#4  0x00000001801287b5 in child_info_spawn::worker(char const*, char const* const*, char const* const*, int, int, int) () from C:\msys64\usr\bin\msys-2.0.dll
#5  0x0000000180129409 in spawnve () from C:\msys64\usr\bin\msys-2.0.dll
#6  0x000000018011bedb in _sigfe () from C:\msys64\usr\bin\msys-2.0.dll
#7  0x0000000100426660 in ?? ()
#8  0x0000000000080000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)

comment:7 in reply to:  5 Changed 14 months ago by ahf

Replying to cypherpunks:

Does it hang?

No, it does not.

The file test.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
        int ret = 0;

        ret = strcasecmp("TestingV3AuthInitialVotingInterval", "TestingV3AuthInitialVotingInterval");

        printf("Ret: %d\n", ret);

        return EXIT_SUCCESS;
}

Compiling and running it:

ahf@SHU MINGW64 ~ $ x86_64-w64-mingw32-gcc test.c
ahf@SHU MINGW64 ~ $ ./a.exe
Ret: 0

comment:8 Changed 14 months ago by cypherpunks

x86_64-w64-mingw32-gcc test.c

Optimization "-O2" required maybe?

comment:9 Changed 14 months ago by teor

Did you attach gdb to bash or tor?

Reading symbols from C:\msys64\usr\bin\bash.exe...(no debugging symbols
found)...done.

comment:10 in reply to:  9 Changed 14 months ago by ahf

Replying to teor:

Did you attach gdb to bash or tor?

I'm not sure why it thinks it's bash. When I give it the explicit binary it looks the same:

$ gdb -p 3272 src/or/tor.exe
GNU gdb (GDB) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-w64-mingw32".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from src/or/tor.exe...done.
Attaching to program `C:\msys64\home\ahf\src\github.com\ahf\tor\src\or\tor.exe', process 3272
[New Thread 3272.0x16f0]
[New Thread 3272.0x1aec]
[New Thread 3272.0x15a4]
[New Thread 3272.0x1a88]
[New Thread 3272.0x23bc]
[New Thread 3272.0x1b60]
(gdb) bt
#0  0x00007ffb5495d881 in ntdll!DbgBreakPoint ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb549899fb in ntdll!DbgUiRemoteBreakin ()
   from C:\Windows\SYSTEM32\ntdll.dll
#2  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#3  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#4  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) thread apply all bt

Thread 6 (Thread 3272.0x1b60):
#0  0x00007ffb5495d881 in ntdll!DbgBreakPoint ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb549899fb in ntdll!DbgUiRemoteBreakin ()
   from C:\Windows\SYSTEM32\ntdll.dll
#2  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#3  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#4  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 5 (Thread 3272.0x23bc):
#0  0x00007ffb54959f74 in ntdll!ZwReadFile ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb50d72b36 in ReadFile () from C:\Windows\System32\KernelBase.dll
#2  0x0000000180122c9a in wait_sig(void*) ()
   from C:\msys64\usr\bin\msys-2.0.dll
#3  0x0000000180044693 in cygthread::callfunc(bool) ()
   from C:\msys64\usr\bin\msys-2.0.dll
#4  0x0000000180044c4a in cygthread::stub(void*) ()
   from C:\msys64\usr\bin\msys-2.0.dll
#5  0x0000000180045663 in _cygtls::call2(unsigned int (*)(void*, void*), void*, void*) () from C:\msys64\usr\bin\msys-2.0.dll
#6  0x0000000180045714 in _cygtls::call(unsigned int (*)(void*, void*), void*)
    () from C:\msys64\usr\bin\msys-2.0.dll
#7  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#8  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#9  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 4 (Thread 3272.0x1a88):
#0  0x00007ffb5495d804 in ntdll!ZwWaitForWorkViaWorkerFactory ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb548df856 in ntdll!RtlReleaseSRWLockExclusive ()
   from C:\Windows\SYSTEM32\ntdll.dll
#2  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#3  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#4  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 3 (Thread 3272.0x15a4):
#0  0x00007ffb5495d804 in ntdll!ZwWaitForWorkViaWorkerFactory ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb548df856 in ntdll!RtlReleaseSRWLockExclusive ()
   from C:\Windows\SYSTEM32\ntdll.dll
#2  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#3  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#4  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 2 (Thread 3272.0x1aec):
#0  0x00007ffb5495d804 in ntdll!ZwWaitForWorkViaWorkerFactory ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb548df856 in ntdll!RtlReleaseSRWLockExclusive ()
   from C:\Windows\SYSTEM32\ntdll.dll
#2  0x00007ffb53283034 in KERNEL32!BaseThreadInitThunk ()
   from C:\Windows\System32\kernel32.dll
#3  0x00007ffb54931431 in ntdll!RtlUserThreadStart ()
   from C:\Windows\SYSTEM32\ntdll.dll
#4  0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (Thread 3272.0x16f0):
#0  0x00007ffb54959f34 in ntdll!ZwWaitForSingleObject ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x00007ffb50d79252 in WaitForSingleObjectEx ()
   from C:\Windows\System32\KernelBase.dll
#2  0x00000001800eb103 in pinfo::maybe_set_exit_code_from_windows() ()
   from C:\msys64\usr\bin\msys-2.0.dll
#3  0x00000001800eb331 in pinfo::exit(unsigned int) ()
   from C:\msys64\usr\bin\msys-2.0.dll
#4  0x00000001801287b5 in child_info_spawn::worker(char const*, char const* const*, char const* const*, int, int, int) () from C:\msys64\usr\bin\msys-2.0.dll
#5  0x0000000180129409 in spawnve () from C:\msys64\usr\bin\msys-2.0.dll
#6  0x000000018011bedb in _sigfe () from C:\msys64\usr\bin\msys-2.0.dll
#7  0x0000000100426660 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

comment:11 Changed 14 months ago by ahf

Compiling Tor with -O0 and -O1 makes the issue go away and Tor bootstraps correctly. When compiling with -O2 or -O3, the issue appears again.

comment:12 in reply to:  8 Changed 14 months ago by ahf

Replying to cypherpunks:

x86_64-w64-mingw32-gcc test.c

Optimization "-O2" required maybe?

Same good result with -O1 and -O2.

comment:13 Changed 14 months ago by cypherpunks

Same good result

Yeah, gcc computes the final result at compile time for such small snippets if optimization is enabled.

Maybe something like

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) __attribute__((optimize("O0")));

int main(int argc, char *argv[])
{
        int ret = 0;

        ret = strcasecmp("TestingV3AuthInitialVotingInterval", "TestingV3AuthInitialVotingInterval");

        printf("Ret: %d\n", ret);

        return EXIT_SUCCESS;
}

Compiled with something like

x86_64-w64-mingw32-gcc -O2 test.c

Will gcc keep optimization enabled for strcasecmp?

comment:14 Changed 14 months ago by cypherpunks

This one

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) __attribute__((optimize("O0")));
int strcasecmp(const char *s1, const char *s2) __attribute__((optimize("O2")));

int main(int argc, char *argv[])
{
        int ret = 0;

        ret = strcasecmp("TestingV3AuthInitialVotingInterval", "TestingV3AuthInitialVotingInterval");

        printf("Ret: %d\n", ret);

        return EXIT_SUCCESS;
}

comment:15 Changed 14 months ago by ahf

Here's the control flow of the unoptimized (left) and optimized (right) versions: https://i.imgur.com/C2n7LUK.png

comment:16 in reply to:  14 Changed 14 months ago by ahf

Replying to cypherpunks:

This one

[ ... ]

Result:

ahf@SHU MINGW64 ~ $ x86_64-w64-mingw32-gcc test.c
In file included from test.c:3:0:
test.c:6:5: warning: '_stricmp' redeclared without dllimport attribute: previous dllimport ignored [-Wattributes]
 int strcasecmp(const char *s1, const char *s2) __attribute__((optimize("O2")));
     ^
ahf@SHU MINGW64 ~ $ ./a.exe
Ret: 0
ahf@SHU MINGW64 ~ $ x86_64-w64-mingw32-gcc -O2 test.c
test.c:6:1: warning: optimization attribute on 'strcasecmp' follows definition but the attribute doesn't match [-Wattributes]
 int strcasecmp(const char *s1, const char *s2) __attribute__((optimize("O2")));
 ^~~
In file included from test.c:3:0:
C:/msys64/mingw64/x86_64-w64-mingw32/include/string.h:117:28: note: previous definition of 'strcasecmp' was here
   __CRT_INLINE int __cdecl strcasecmp (const char *__sz1, const char *__sz2) { return _stricmp (__sz1, __sz2); }
                            ^~~~~~~~~~
ahf@SHU MINGW64 ~ $ ./a.exe
Ret: 0

comment:17 Changed 14 months ago by cypherpunks

Result:

Ok, it's not about strcasecmp itself, it's about the optimization of config_lines_eq. gcc generates a nice loop there.

comment:18 Changed 14 months ago by ahf

The test case test_storagedir_read_labeled() reproduces this issue.

Can be triggered using: ./src/test/test.exe storagedir/read_labeled --verbose --no-fork.

comment:19 in reply to:  17 Changed 14 months ago by ahf

Replying to cypherpunks:

Ok, it's not about strcasecmp itself, it's about the optimization of config_lines_eq. gcc generates a nice loop there.

I think so too. The GCC output doesn't even make any calls to strcasecmp() here.

comment:20 Changed 14 months ago by cypherpunks

Btw, #20424
What gcc version?

comment:21 Changed 14 months ago by cypherpunks

The test case test_storagedir_read_labeled() reproduces this issue.
Can be triggered using: ./src/test/test.exe storagedir/read_labeled --verbose --no-fork.

Maybe related to some case of #24857.

comment:22 Changed 14 months ago by nickm

Status: new → needs_review

See https://github.com/torproject/tor/pull/197 for the cause and a proposed solution.

comment:23 Changed 14 months ago by ahf

Status: needs_review → merge_ready

Patch LGTM. Thanks!

comment:24 Changed 14 months ago by nickm

Keywords: regression added
Milestone: Tor: unspecified → Tor: 0.3.5.x-final
Resolution: fixed
Status: merge_ready → closed

ok; merged!
