I have dozens of v3 onions launched via the ADD_ONION control port command. Sometimes they work, sometimes they do not, even though "GETINFO onions/current" shows them as working. I have to restart tor to make them work. I have some v2 onions too, and they always work with no trouble at all under the same conditions.
Trac: Username: time_attacker
Might be related to #26549: v3 ephemeral onions don't work very well. This will be fixed with #25552 (patch available and will soon be merged to master).
If that's not the issue then logs would help. Thanks!
So I've been having this issue, but rarely. This weekend, it happened to me in the morning: one of my services wasn't computing the same hashring as the client. No matter how many times I restarted the client, with the latest consensus, the hashrings were always different. So the service was the issue.
Every single parameter on the service side was correct in order to compute the right hashring (SRV, time period num from the ns->valid_after, replica, ...).
My investigation led me to the hs_service_descriptor_t->time_period_num value. In theory, every descriptor is built for one specific time period, and time periods don't overlap. When we build a descriptor, we record the time period num it is built for and then never change it (which in theory should be OK). But descriptor rotation happens at each new SRV, which happens 12h before a new time period.
Thus, I believe we have an issue where a descriptor can fall between two time periods, leading to something like current desc.: TP - 1 and next desc.: TP + 1. That means there is a window of up to 12h where the current time period num has no descriptor at all, and thus the service is unreachable.
See build_descriptors_for_new_service(): there is something problematic there. We use now to check whether we are between TP and SRV and, if so, we get the previous/current time period num, but this time using the valid_after. These two times can be offset from each other, which can lead to missing a time period num for the descriptor we are building.
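To illustrate how the two clocks can disagree, here is a minimal Python sketch. It is not tor's code. It assumes the default time period length of 1440 minutes and the 12h rotation offset described in the v3 onion service spec, and the helper names (time_period_num, utc_ts) and the example timestamps are made up for illustration.

```python
from datetime import datetime, timezone

# Assumed defaults (not read from a consensus):
TIME_PERIOD_LENGTH_MIN = 1440  # one time period is 24h
ROTATION_OFFSET_MIN = 12 * 60  # 12 voting periods of 60 minutes each


def time_period_num(unix_ts: int) -> int:
    """Time period number for a Unix timestamp (hypothetical helper)."""
    minutes_since_epoch = unix_ts // 60
    return (minutes_since_epoch - ROTATION_OFFSET_MIN) // TIME_PERIOD_LENGTH_MIN


def utc_ts(year, month, day, hour, minute) -> int:
    """Unix timestamp for a UTC wall-clock time (hypothetical helper)."""
    return int(datetime(year, month, day, hour, minute,
                        tzinfo=timezone.utc).timestamp())


# Made-up scenario: the consensus became valid at 11:00 UTC, but the
# local clock already reads 12:05 UTC, just past the time period boundary.
valid_after = utc_ts(2018, 10, 1, 11, 0)
now = utc_ts(2018, 10, 1, 12, 5)

tp_from_valid_after = time_period_num(valid_after)
tp_from_now = time_period_num(now)

# The two sources land in different time periods, so a decision made
# with `now` and a number taken from `valid_after` may not match.
print(tp_from_now - tp_from_valid_after)  # prints 1
```

If one part of the logic decides which descriptor to build from now and another takes the time period num from valid_after, a descriptor can end up tagged with the neighbouring period, which is the kind of gap suspected above.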
I'm running an experiment right now that should confirm the theory. I'll have results in hopefully less than 48h.
I'm promoting this to 035 milestone and flagging it for backport since if we are correct, this is a major reachability issue.
Trac: Severity: Critical to Major; Milestone: N/A to Tor: 0.3.5.x-final; Priority: Medium to Very High; Keywords: N/A deleted, 034-backport, 032-backport, 033-backport added
We are currently investigating many HS reachability bugs. This ticket most likely falls under one or more of those issues. Since there are no action items here, I'll close this, but rest assured that tickets for the known issues have been opened and are being worked on.
For the record, I still have an experimental patch tracking the time period number, so if my service ever becomes unreachable again, I'll at least have those results. It hasn't happened since my post.
Trac: Resolution: N/A to wontfix; Status: needs_information to closed