I don't know what this error is about. I have seen many thousands of these log lines (650k), even though they say that future instances of the warning will be silenced.
Dec 17 01:17:40.000 [warn] {HANDSHAKE} 20190 connections have failed:
Dec 17 01:17:40.000 [warn] {HANDSHAKE} 20190 connections died in state connect()ing with SSL state (No SSL object)
Dec 17 01:17:40.000 [warn] {BUG} tor_bug_occurred_(): Bug: pubsub_publish.c:37: pubsub_pub_: Non-fatal assertion !(! d) failed. (Future instances of this warning will be silenced.) (on Tor 0.4.2.5 )
Dec 17 01:17:40.000 [warn] {BUG} Bug: Tor 0.4.2.5: Non-fatal assertion !(! d) failed in pubsub_pub_ at pubsub_publish.c:37. (Stack trace not available) (on Tor 0.4.2.5 )
Dec 17 01:17:40.000 [warn] {BUG} tor_bug_occurred_(): Bug: pubsub_publish.c:37: pubsub_pub_: Non-fatal assertion !(! d) failed. (Future instances of this warning will be silenced.) (on Tor 0.4.2.5 )
Dec 17 01:17:40.000 [warn] {BUG} Bug: Tor 0.4.2.5: Non-fatal assertion !(! d) failed in pubsub_pub_ at pubsub_publish.c:37. (Stack trace not available) (on Tor 0.4.2.5 )
Dec 17 01:17:40.000 [warn] {CONTROL} Problem bootstrapping. Stuck at 0% (starting): Starting. (Connection timed out [WSAETIMEDOUT ]; TIMEOUT; count 20192; recommendation warn; host CE3FE883C6C9EF475EA097DC3E33A6F32B852DA1 at 78.129.218.56:443)
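The "!(! d)" in these warnings is a non-fatal BUG()-style check on the dispatcher pointer `d`; it fires because `d` is still NULL, i.e. the publish/subscribe subsystem was never initialized. Below is a self-contained toy sketch of that kind of check; it is not Tor's actual source, and all names and bodies here are illustrative assumptions:

```c
/* Toy illustration (NOT Tor's real code) of why "Non-fatal assertion
 * !(! d) failed" appears: the publish entry point checks a global
 * dispatcher pointer, and if pubsub was never initialized the pointer
 * is still NULL, so the check trips on every publish attempt. */
#include <stdio.h>
#include <stddef.h>

typedef struct dispatch_t { int ready; } dispatch_t;

/* Set by pubsub initialization; never set in the failing code path. */
static dispatch_t *the_dispatcher = NULL;

/* Simplified stand-in for a BUG()-style macro: warn, then yield 1. */
#define BUG(cond)                                                        \
  ((cond) ?                                                              \
   (fprintf(stderr, "Bug: Non-fatal assertion !(%s) failed.\n", #cond),  \
    1) : 0)

static int
pubsub_pub_(const char *msg)
{
  dispatch_t *d = the_dispatcher;
  if (BUG(! d))          /* reports "!(! d)" when d is NULL */
    return -1;           /* the message is dropped, but we keep running */
  printf("delivered: %s\n", msg);
  return 0;
}

int
main(void)
{
  /* pubsub never initialized, so every publish trips the check: */
  pubsub_pub_("bootstrap event");
  return 0;
}
```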
Okay. Does this happen right when you start, or after a while? Does it happen every time you try to run Tor, or only with certain configuration options?
Dec 30 12:19:49.000 [warn] tor_bug_occurred_(): Bug: pubsub_publish.c:37: pubsub_pub_: Non-fatal assertion !(! d) failed. (Future instances of this warning will be silenced.) (on Tor 0.4.2.5 )
Dec 30 12:19:49.000 [warn] Bug: Tor 0.4.2.5: Non-fatal assertion !(! d) failed in pubsub_pub_ at pubsub_publish.c:37. (Stack trace not available) (on Tor 0.4.2.5 )
These lines started appearing after upgrading from 0.4.1.7 to 0.4.2.5.
That was for my relay.
When I tried to run the same 0.4.2.5 binaries with a clean configuration inside VirtualBox, there were no such errors.
Will try to dig deeper.
@[comment:4 nickm]: looks like the problem is again related to the Windows service mode of operation.
I was able to reproduce the problem inside a VM with a clean data directory: [notices.log].
When the error was reported, my tor was also running as a Windows service and not in any other mode; I can confirm that. But since I never saw this happen again, I sadly cannot tell you more about it here.
This is a minimal fix to avoid possible stability issues. For a longer-term fix, see #32883 (moved)
Thanks! Looks good to me, but I haven't tested it on Windows. Maybe someone else who knows how to do that could do so? If not, I'm also ok with merging this as-is because it seems like a less-common use case.
23:15 <+nickm> catalyst: if we can't find a tester for #32778, do you think we should go ahead and merge?
23:15 -zwiebelbot:#tor-meeting- tor#32778: pubsub_pub_ - [needs_information] - https://bugs.torproject.org/32778
23:16 < catalyst> nickm: yeah, it seems reasonable by inspection, and appveyor seems to be ok with it
23:16 < catalyst> i think we lack automated testing for that functionality
But it would be really good if we could get some testing on this before we ship it.
There was a conflict with #32883 (moved), which replaces the pubsub calls added in this ticket with calls to tor_run_main() (which then calls the pubsub functions). Therefore, I did an "ours" merge, to avoid taking any changes from this PR into master. (We'll resolve future conflicts when we backport this change.)
Unfortunately, the "ours" merge means that this code won't be tested in master.
But this PR seems reasonable to me for backport. The pattern of calls in nt_service_main() and nt_service_body() matches the pubsub call pattern used by other platforms in tor_run_main() before #32883 (moved). (And used by all platforms after #32883 (moved).)
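For illustration, here is a rough, self-contained sketch of that call pattern. The function bodies are stub stand-ins (assumptions, not the real implementations); the point is only the ordering: an install step before configuration handling and a connect step before the main loop, matching what tor_run_main() does on other platforms:

```c
/* Rough sketch (stubs, NOT Tor's actual code) of the call pattern
 * described above: the NT-service entry path performs the same pubsub
 * setup as tor_run_main(), so the dispatcher exists before the main
 * loop publishes anything. */
#include <stdio.h>

static void pubsub_install(void) { puts("pubsub: channels registered"); }
static void pubsub_connect(void) { puts("pubsub: subscribers connected"); }
static void do_main_loop(void)   { puts("main loop: publishing events"); }

/* Stand-in for the service worker started by the NT service manager. */
static void
nt_service_body(void)
{
  pubsub_install();   /* before options/config handling */
  /* ... option parsing and subsystem init would happen here ... */
  pubsub_connect();   /* after subscribers exist, before the main loop */
  do_main_loop();
}

int
main(void)
{
  /* In the real code, the NT service control dispatcher would invoke
   * nt_service_main()/nt_service_body(); we call it directly here. */
  nt_service_body();
  return 0;
}
```

Skipping this setup and jumping straight to the main loop would leave the dispatcher NULL, which is consistent with the warnings quoted above.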
Given the lack of testing, I'm marking it for backport after 0.4.3-stable.
When I try to launch it in service mode, the process is created, but nothing happens after that: no lines in the log file, no ports opened, and very low RAM consumption. It looks like a complete hang.
4025cbada6ab01247275faec8dd32ae857ef0fd5 does not work at all. Looks like no one tested these fixes, sorry.
Thanks for the information. That commit seems to be on master and includes the more extensive fix for #32883 (moved). Would you please try the more minimal fix at https://github.com/torproject/tor/pull/1634 ?
tor-0.4.2.5 (52d386c9) + nmathewson:ticket32778_041 (54eec534) looks good so far: no pubsub_pub errors in logs.
Thanks for the info! I've reopened #32883 (moved) so we can work on the issue in master.
Hello privacy friends!
I'm the opener. I was slow to respond on the ticket.
Replying to catalyst:
wait a week to see if there's any more feedback on the smaller patch.
I have now compiled the linked minimal patch ("Initialize publish/subscribe code when running as an NT service", https://github.com/torproject/tor/pull/1634) applied against tor-0.4.2.5 on MINGW64_NT-10.0-18363, and ran the x64 tor in NT service mode.
The log message no longer appears for me so far. Looks good. :)
This is not a review in any way, just some short feedback.
Don't worry, we're not planning on dropping Windows relay support any time soon.
We brainstorm lots of different ideas during our release retrospectives. If you want to see what we're actually going to do, look at the "POSSIBLE LESSONS TO TAKE FROM THIS RETROSPECTIVE" section.
In particular, we said:
Some parts of our code base are identifiably under-tested or badly held together. They include:
...
Our Windows testing and QA is severely underpowered.
You can help us test our Windows code by running Windows relays, and reporting bugs. If you can, run our alpha releases, so you catch bugs early. (If people don't report bugs in a feature, then we might assume no-one is using it. That makes it more likely to be dropped.)
I didn't do this detailed analysis. I've CC'd the people who did, in case they want to comment more.