Opened 8 years ago

Closed 8 years ago

#5684 closed enhancement (implemented)

Should we stop sanitizing nicknames in bridge descriptors?

Reported by: karsten Owned by:
Priority: Medium Milestone:
Component: Metrics/CollecTor Version:
Severity: Keywords:
Cc: aagbsn, bastik.public@… Actual Points: 6
Parent ID: Points:
Reviewer: Sponsor:

Description

When we started making sanitized bridge descriptors available on the metrics website we replaced all contained nicknames with "Unnamed". The reason was that "bridge nicknames might give hints on the location of the bridge if chosen without care; e.g. a bridge nickname might be very similar to the operators' relay nicknames which might be located on adjacent IP addresses."
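For illustration, the replacement described above could look roughly like the following sketch. It is not the sanitizer's actual code. It assumes the nickname is the second token on the descriptor's "router" line, and the function name `sanitize_nickname` is made up for this example.

```python
def sanitize_nickname(descriptor: str, placeholder: str = "Unnamed") -> str:
    """Replace the nickname on the "router" line with a placeholder.

    Assumes a line of the form "router <nickname> <address> ...";
    all other lines are passed through unchanged.
    """
    sanitized_lines = []
    for line in descriptor.splitlines():
        parts = line.split(" ")
        # Only the "router" line carries the nickname in this sketch.
        if parts[0] == "router" and len(parts) > 2:
            parts[1] = placeholder
            line = " ".join(parts)
        sanitized_lines.append(line)
    return "\n".join(sanitized_lines)
```

Dropping the sanitizing step would simply mean passing the "router" line through untouched.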

This was an easy decision back then, because we didn't use the nickname for anything. This has changed with #5629 where we try to count EC2 bridges which all have a similar nickname. So, while we don't have that information, there'd now be a use for it. Another advantage of having bridge nicknames would be that they're easier to look up in a status website like Atlas (which doesn't support searching for bridges yet). We should re-consider whether it still makes sense to sanitize nicknames in bridge descriptors or not.
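If nicknames were available, counting bridges that share a naming scheme could be as simple as the sketch below. The prefix used in the usage note is a hypothetical placeholder, not the confirmed EC2 naming scheme, and `count_by_prefix` is a name invented for this example.

```python
from collections import Counter


def count_by_prefix(nicknames, prefixes):
    """Count nicknames that start with one of the given prefixes.

    Matching is case-insensitive; each nickname is counted at most once,
    under the first prefix it matches.
    """
    counts = Counter()
    for nickname in nicknames:
        lowered = nickname.lower()
        for prefix in prefixes:
            if lowered.startswith(prefix.lower()):
                counts[prefix] += 1
                break
    return counts
```

For example, `count_by_prefix(nicknames, ["ec2bridge"])` would tally all nicknames beginning with the assumed prefix "ec2bridge".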

Regarding the reasoning above, couldn't an adversary just scan adjacent IP addresses of all known relays, not just the ones with similar nicknames? And are we giving away anything else with the nicknames?

Change History (9)

comment:1 Changed 8 years ago by aagbsn

Cc: aagbsn added

Some options:

  1. Do not sanitize the nicknames
    • counting Tor Cloud bridges would be easy
    • might reveal bridges whose nicknames correlate with public relays
    • reveals approximate network location (AWS)

More than half of all bridges have a unique nickname. How do we estimate the risk of not sanitizing nicknames?

  2. Mention Tor Cloud in the extra-info 'platform' string
    • counting Tor Cloud bridges would be easy
    • could indicate approximate network location (AWS)
    • Tor Cloud image would need to be updated and re-deployed

Is this possible? How does Tor decide what to put in the platform string?
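On the risk question for the first option, one rough way to put a number on it would be to measure what fraction of bridge nicknames closely resemble some relay nickname. The sketch below uses Python's `difflib.SequenceMatcher` as the similarity measure. The 0.8 threshold and the function name are arbitrary assumptions, and string similarity is only a proxy for what an adversary could actually infer.

```python
from difflib import SequenceMatcher


def similar_fraction(bridge_nicknames, relay_nicknames, threshold=0.8):
    """Return the fraction of bridge nicknames similar to any relay nickname.

    Similarity is the SequenceMatcher ratio on lower-cased names; a bridge
    counts as a hit if at least one relay reaches the threshold.
    """
    if not bridge_nicknames:
        return 0.0
    relays = [relay.lower() for relay in relay_nicknames]
    hits = 0
    for bridge in bridge_nicknames:
        lowered = bridge.lower()
        if any(SequenceMatcher(None, lowered, relay).ratio() >= threshold
               for relay in relays):
            hits += 1
    return hits / len(bridge_nicknames)
```

The comparison is quadratic in the number of nicknames, and the default nickname "Unnamed" would probably have to be excluded on both sides before the result means anything.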

comment:2 in reply to:  1 Changed 8 years ago by rransom

Replying to aagbsn:

  2. Mention Tor Cloud in the extra-info 'platform' string
    • counting Tor Cloud bridges would be easy
    • could indicate approximate network location (AWS)
    • Tor Cloud image would need to be updated and re-deployed

Is this possible? How does Tor decide what to put in the platform string?

This would require a patch to Tor -- possibly a patch to the Tor package used in the Tor Cloud images, or possibly an upstream patch to add an ExtraPlatformInformationString torrc option.

comment:3 in reply to:  1 Changed 8 years ago by karsten

Replying to aagbsn:

Some options:

  1. Do not sanitize the nicknames
    • counting Tor Cloud bridges would be easy
    • might reveal bridges whose nicknames correlate with public relays
    • reveals approximate network location (AWS)

More than half of all bridges have a unique nickname. How do we estimate the risk of not sanitizing nicknames?

The typical approach in the past was to describe the suggested change on tor-dev, ask people if they think it's a bad idea and why, and if nobody objects, make the new data available one or two weeks later. If there are no general concerns about the idea, I'll move the discussion to tor-dev.

  2. Mention Tor Cloud in the extra-info 'platform' string
    • counting Tor Cloud bridges would be easy
    • could indicate approximate network location (AWS)
    • Tor Cloud image would need to be updated and re-deployed

Is this possible? How does Tor decide what to put in the platform string?

Yes, this is possible by doing what Robert suggests. And it's probably even the cleaner approach to encode this information in the platform string instead of the nickname. Drawbacks are that we won't learn about previously deployed EC2 bridges, and that status websites like Atlas wouldn't benefit from this solution. Maybe we should do both.
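If the platform string did carry such a marker, counting would not need nicknames at all. A minimal sketch, assuming descriptors contain a line starting with "platform " and that patched images add a marker such as "Tor Cloud" to it. Both the marker text and the function name are hypothetical; no such marker exists without the patch discussed above.

```python
def count_platform_marker(descriptors, marker="Tor Cloud"):
    """Count descriptors whose "platform" line contains the given marker.

    Each descriptor is a string holding one descriptor's lines; it is
    counted at most once.
    """
    count = 0
    for descriptor in descriptors:
        for line in descriptor.splitlines():
            if line.startswith("platform ") and marker in line:
                count += 1
                break
    return count
```

As noted above, this would only see bridges deployed from updated images.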

comment:4 Changed 8 years ago by bastik

Cc: bastik.public@… added

This is to CC myself and to ask something that I would have asked on the list otherwise.

Couldn't you add a setting for bridges like

  • sanitize0/1

where 1 is default?

For the ones that you wanna count you can set it to 0.
Bridge operators that want to use something like Atlas can set it to 0 as well.

comment:5 in reply to:  4 Changed 8 years ago by karsten

Replying to bastik:

Couldn't you add a setting for bridges like

  • sanitize0/1

where 1 is default?

For the ones that you wanna count you can set it to 0.
Bridge operators that want to use something like Atlas can set it to 0 as well.

Adding an option only because the devs don't want to make a decision is in general a pretty bad idea. How can a user decide if a dev cannot? I'd guess that 95% of bridge operators would never see this option and the remaining 5% wouldn't know how to set it right. That would make the data almost unusable for counting EC2 bridges and for Atlas, and we'd generate support requests for no good reason. No, we should decide whether we can safely include all original nicknames, and if not, we should keep sanitizing all of them.

comment:6 in reply to:  5 Changed 8 years ago by bastik

Replying to karsten:

Replying to bastik:

Couldn't you add a setting for bridges like

  • sanitize0/1

where 1 is default?

For the ones that you wanna count you can set it to 0.
Bridge operators that want to use something like Atlas can set it to 0 as well.

Adding an option only because the devs don't want to make a decision is in general a pretty bad idea.

Might be true, but maybe the devs can't decide.

How can a user decide if a dev cannot?

They would know if relay and bridge name share a naming scheme.

I'd guess that 95% of bridge operators would never see this option and the remaining 5% wouldn't know how to set it right. That would make the data almost unusable for counting EC2 bridges and for Atlas, and we'd generate support requests for no good reason.

I thought the Tor people would create the Tor image and can control the setting.

I don't love the idea, but wanted to add it for discussion. Maybe I should have added myself as CC, only.

No, we should decide whether we can safely include all original nicknames, and if not, we should keep sanitizing all of them.

For my understanding you, the Tor people, can't do that. Names can be changed. How to define safe? Please don't feel "forced" to reply. I really don't want to start a discussion here.

comment:7 in reply to:  6 Changed 8 years ago by karsten

Replying to bastik:

Replying to karsten:

How can a user decide if a dev cannot?

They would know if relay and bridge name share a naming scheme.

In order to make this decision, operators would have to understand that they should use a different scheme for naming their bridges than for their relays. As I said on tor-dev, that's yet one more thing to tell them, and it's likely going to generate support requests for no good reason.

I'd guess that 95% of bridge operators would never see this option and the remaining 5% wouldn't know how to set it right. That would make the data almost unusable for counting EC2 bridges and for Atlas, and we'd generate support requests for no good reason.

This is the case for newly created EC2 images. It doesn't apply to existing EC2 images which are not updated. We'd also not learn about past statistics, and this wouldn't help Atlas at all. All in all, this config option is a usability nightmare that leaves us with mostly useless statistics.

No, we should decide whether we can safely include all original nicknames, and if not, we should keep sanitizing all of them.

For my understanding you, the Tor people, can't do that. Names can be changed. How to define safe?

I think this is something developers have to decide, not users. Note that this isn't about a single bridge that can be located via nickname similarity. It's about not letting the attack become successful enough to make it attractive. If the adversary could locate 1% of bridges via nickname similarity, they probably wouldn't care. Also, if we can double the number of bridges by getting more funding for EC2 bridges and making it easier for operators to check how their bridge is doing via Atlas, that's a win.

Please don't feel "forced" to reply. I really don't want to start a discussion here.

Oh, discussion is good. Please feel free to post any thoughts you have either here or on tor-dev. I'm not at all trying to kill the discussion.

comment:8 in reply to:  7 Changed 8 years ago by bastik

Replying to karsten:

Replying to bastik:

Replying to karsten:

How can a user decide if a dev cannot?

They would know if relay and bridge name share a naming scheme.

In order to make this decision, operators would have to understand that they should use a different scheme for naming their bridges than for their relays. As I said on tor-dev, that's yet one more thing to tell them, and it's likely going to generate support requests for no good reason.

True.

I'd guess that 95% of bridge operators would never see this option and the remaining 5% wouldn't know how to set it right. That would make the data almost unusable for counting EC2 bridges and for Atlas, and we'd generate support requests for no good reason.

This is the case for newly created EC2 images. It doesn't apply to existing EC2 images which are not updated. We'd also not learn about past statistics, and this wouldn't help Atlas at all. All in all, this config option is a usability nightmare that leaves us with mostly useless statistics.

I hadn't thought about that. I agree about the usability. Sure statistics would be mostly useless.

No, we should decide whether we can safely include all original nicknames, and if not, we should keep sanitizing all of them.

For my understanding you, the Tor people, can't do that. Names can be changed. How to define safe?

I think this is something developers have to decide, not users. Note that this isn't about a single bridge that can be located via nickname similarity. It's about not letting the attack become successful enough to make it attractive. If the adversary could locate 1% of bridges via nickname similarity, they probably wouldn't care. Also, if we can double the number of bridges by getting more funding for EC2 bridges and making it easier for operators to check how their bridge is doing via Atlas, that's a win.

I agree that only the devs can make the decision. It has to be a "global"/general decision. I should have been more verbose. My point was (and I should have made that clear) that an adversary may learn about bridge locations via nickname similarity. That couldn't be called safe, but it has to be put in relation to other things an adversary can do. Might be safe enough. So I agree with the adversary thing.

Please don't feel "forced" to reply. I really don't want to start a discussion here.

Oh, discussion is good. Please feel free to post any thoughts you have either here or on tor-dev. I'm not at all trying to kill the discussion.

I agree here as well. I picked this "channel" because it creates less noise.

comment:9 Changed 8 years ago by karsten

Actual Points: 6
Resolution: implemented
Status: new → closed

Sanitized bridge descriptors now contain bridge nicknames. Closing.