Opened 16 months ago
Closed 15 months ago
#24119 closed enhancement (fixed)
channel_rsa_id_group_set_badness spends a lot of time in malloc/free
Reported by: | Hello71 | Owned by: | Hello71 |
---|---|---|---|
Priority: | Medium | Milestone: | Tor: 0.3.3.x-final |
Component: | Core Tor/Tor | Version: | |
Severity: | Minor | Keywords: | tor-channel, tor-sched, 032-backport |
Cc: | Actual Points: | ||
Parent ID: | #23777 | Points: | |
Reviewer: | isis | Sponsor: |
Description
in particular, a very large proportion (~25%) of allocations made seem to be "temporary", i.e. an allocation is made and then freed before any other allocations are made. possibly a portion of these are due to the second loop, but I was wondering if it is possible that the function is called with an empty list, and if that is a problem.
regardless, I have written a patch to use ht instead of only smartlists, which should very slightly increase the memory usage and moderately decrease the CPU usage in this function.
Child Tickets
Attachments (4)
Change History (26)
comment:1 Changed 16 months ago by
Status: | new → needs_review |
---|
comment:2 Changed 16 months ago by
Keywords: | tor-channel tor-sched added |
---|
comment:3 Changed 16 months ago by
Status: | needs_review → needs_revision |
---|
comment:4 Changed 16 months ago by
fixed patch to actually add the first element to the set. still wondering if this function truly needs to be called every single second.
comment:5 Changed 16 months ago by
Keywords: | 032-backport added |
---|
Changed 16 months ago by
Attachment: | 0001-Rewrite-channel_rsa_id_group_set_badness-with-ht-241.patch added |
---|
comment:6 Changed 16 months ago by
Status: | needs_revision → needs_review |
---|
I think this patch finally actually does the same work as the other one, just faster.
Changed 16 months ago by
Attachment: | 0001-Simplify-channel_rsa_id_group_set_badness-24119.patch added |
---|
comment:7 Changed 16 months ago by
I think this one is even better, but given that the first one took me about six tries, this one is probably wrong too.
Changed 16 months ago by
Attachment: | 0001-Add-fast-paths-to-channel_rsa_id_group_set_badness.patch added |
---|
Changed 16 months ago by
Attachment: | 0002-Simplify-channel_rsa_id_group_set_badness-24119.patch added |
---|
comment:8 Changed 16 months ago by
so I realised that we can just avoid all allocations in the vastly more common case that there is only one ed25519 identity for an rsa id group. 0001-Add-fast-paths-to-channel_rsa_id_group_set_badness.patch does this first step. 0002-Simplify-channel_rsa_id_group_set_badness-24119.patch applied on top additionally rewrites the nested loop into a sort and a single loop. performance is probably better with huge n, but with the small numbers we are likely to have here, it's really between which one is simpler. I think mine is, but that might just be because I've spent several days staring at it.
tl;dr pls review and apply 0001-Add-fast-paths-to-channel_rsa_id_group_set_badness.patch, 0002-Simplify-channel_rsa_id_group_set_badness-24119.patch is probably nice to have but not needed
comment:9 Changed 16 months ago by
Owner: | set to Hello71 |
---|---|
Status: | needs_review → assigned |
setting owner. (We'll take a look at this once review-group-24 is done and review-group-25 is open.)
comment:10 Changed 16 months ago by
Status: | assigned → needs_review |
---|
comment:11 Changed 16 months ago by
Keywords: | review-group-25 added |
---|
comment:12 Changed 15 months ago by
Status: | needs_review → needs_revision |
---|
On the first patch: I think connection_or_expire_
should probably have a name more like connection_or_mark_bad_if_too_old
or connection_or_set_singleton_badness
or something ? The current name makes it sound like it always expires the connection.
On the second patch: First, please don't use memcmp() directly. Use one of the aliases or function in di_ops.h instead.
I think there's a bug on the loop in your second patch: connection_or_group_set_badness_ will get called on every Ed25519 ID except for the last one.
comment:13 Changed 15 months ago by
https://cgit.alxu.ca/tor.git/log/?h=bug24119 resolves the problems. I don't know if you agree about fast_memcmp here, but there seem to be similar uses in other sorting helpers. also I dunno about the underscore here, because this part seems to already be back and forth between connections and channels.
comment:14 Changed 15 months ago by
Status: | needs_revision → needs_review |
---|
comment:15 Changed 15 months ago by
Reviewer: | → isis |
---|---|
Sponsor: | → Sponsor8-can |
Calling this a Sponsor8 because resource consumption effects mobile; feel free to change if there's a better sponsor.
comment:17 Changed 15 months ago by
Hi, and sorry about the delay!
I've fixed a few things I noticed in a new branch, bug24119
, in my public repository (https://git.torproject.org/nickm/tor.git). How does this look to you?
comment:18 Changed 15 months ago by
Keywords: | review-group-25 review-group-26 removed |
---|---|
Status: | needs_review → merge_ready |
I think this is ready to merge, if you agree with the changes.
comment:19 Changed 15 months ago by
lgtm. (I wonder if ed25519_pubkey_eq could safely use fast_memcmp... although it doesn't really matter here)
comment:20 Changed 15 months ago by
I think that would be correct if all callers ensure that either side channels are (effectively) unobservable, such as in these background tasks, or that the foreign party has proved that it owns the private key for this public key. the former doesn't apply everywhere, and the latter sounds nearly impossible to ensure.
tl;dr lgtm, ignore part about ed25519_pubkey_eq in previous comment
comment:21 Changed 15 months ago by
Sponsor: | Sponsor8-can |
---|
https://trac.torproject.org/projects/tor/wiki/org/sponsors/Sponsor8 says it is Android, but this change affects only nodes with many channels open, mainly participating nodes (relays, directories, etc). clients are unlikely to have this type of load, and mobile clients especially so.
comment:22 Changed 15 months ago by
Resolution: | → fixed |
---|---|
Status: | merge_ready → closed |
Okay; squashed and merged to master!
hm... annoyingly, this patch only seems to reduce the allocations from 30% to 25%, and that's probably noise. I will investigate whether this API can be changed to be more efficient.