Opened 5 years ago

Closed 3 months ago

#7520 closed project (fixed)

Design a social bridge distributor

Reported by: aagbsn
Owned by: isis
Priority: Medium
Milestone:
Component: Obfuscation/BridgeDB
Version:
Severity: Normal
Keywords: bridgedb-socdist, isisExB, isis2015Q3Q4
Cc: phw, Matthew.Finkel@…, isis@…, wfn
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

The sixth strategy outlined at https://svn.torproject.org/svn/projects/design-paper/blocking.html#tth_sEc7.4 describes a social bridge distribution strategy:

The sixth strategy ties in the social network design with public bridges and a reputation system.
We pick some seeds — trusted people in blocked areas — and give them each a few dozen bridge addresses and a few delegation tokens.

Some other services use a similar system to try and restrict the set of users using an invite system. One example is private bittorrent trackers.

In an email, I described such a system:

Here's a simple concept for how this model might be applied to bridge
distribution:

The basic idea is:
1. Create a handful of tokens that can be exchanged for an account
that may request a bridge.
2. Periodically give accounts some tokens to hand out to their friends.

Most (all?) of the private trackers employ a ratio system - and you
lose your account if you don't maintain a ratio above a certain
threshold. That is, they try to separate users by behavior, and drop
the ones whose behavior is undesirable.

In the context of bridges, we want to be able to separate users into
two groups: users who use bridges, and users who block them.

1. Each time a user is given a bridge, note the bridges given to that user.
2. Each time a bridge is blocked, increment a per-user counter for
every user given that bridge.
2a. Shuffle the affected users so that the same users are not given
the same bridges twice. If using a hashring, a key consisting of the
user-id+counter might be sufficient.
3. Periodically, rank users by this counter, and drop the worst N
percent of users.
4. Periodically, allocate new account tokens in proportion to
available bridges to random users in the 100-N percent.
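
The numbered steps above can be sketched roughly as follows. This is a minimal illustration, not BridgeDB code: the class name `SocialDistributor`, its methods, and the use of SHA-256 for the ring position are all assumptions made for the example. The only ideas taken from the email are the per-user counter, the `user-id+counter` hashring key, dropping the worst N percent, and handing tokens to random remaining users.

```python
import hashlib
import random
from collections import defaultdict


class SocialDistributor:
    """Toy model of steps 1-4; every name here is hypothetical."""

    def __init__(self, bridges):
        self.bridges = sorted(bridges)
        self.assigned = defaultdict(set)     # user id -> bridges given (step 1)
        self.block_count = defaultdict(int)  # user id -> blocking events (step 2)

    def _ring_position(self, user_id):
        # Step 2a: key on user-id + counter, so a user moves to a different
        # part of the ring after each blocking event that affected them.
        key = "%s:%d" % (user_id, self.block_count[user_id])
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.bridges)

    def give_bridge(self, user_id):
        """Step 1: hand out a bridge and remember who received it."""
        if not self.bridges:
            return None
        start = self._ring_position(user_id)
        for offset in range(len(self.bridges)):
            bridge = self.bridges[(start + offset) % len(self.bridges)]
            if bridge not in self.assigned[user_id]:
                self.assigned[user_id].add(bridge)
                return bridge
        return None  # this user has already been given every bridge

    def bridge_blocked(self, bridge):
        """Step 2: increment the counter of every user given this bridge."""
        for user_id, given in self.assigned.items():
            if bridge in given:
                self.block_count[user_id] += 1

    def drop_worst(self, n_percent):
        """Step 3: rank users by counter and drop the worst N percent."""
        ranked = sorted(self.assigned,
                        key=lambda u: self.block_count[u], reverse=True)
        dropped = ranked[:len(ranked) * n_percent // 100]
        for user_id in dropped:
            del self.assigned[user_id]
            self.block_count.pop(user_id, None)
        return dropped

    def allocate_tokens(self, tokens_per_bridge, rng=random):
        """Step 4: tokens in proportion to bridges, to random remaining users."""
        users = list(self.assigned)
        if not users:
            return []
        total = int(len(self.bridges) * tokens_per_bridge)
        return [rng.choice(users) for _ in range(total)]
```

Note that this sketch treats every blocking event as equally incriminating for every holder of the bridge, which is the simplification the email itself makes.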

Child Tickets

Ticket | Status | Owner | Summary | Component
#7521 | closed | isis | Design a system for generating delegation tokens | Obfuscation/BridgeDB
#7522 | closed | isis | Design a user interface for redeeming invite tokens | Obfuscation/BridgeDB
#7523 | closed | isis | Decide whether reputation should be tracked between accounts | Obfuscation/BridgeDB
#7524 | closed | | Design a system for applying an allocator strategy | Obfuscation/BridgeDB
#7525 | closed | isis | Design a system for tracking bridge assignment metrics. | Obfuscation/BridgeDB

Change History (22)

comment:1 Changed 5 years ago by aagbsn

Cc: ioerror added

ioerror pointed out that we may not want to do step 3. Ideally, the implementation would be flexible so that upon deployment one can choose from various strategies and not provide a blueprint for gaming the distributor.

For example, accounts should not get dropped if a bridge is detected by DPI and blocked. Since we may not know how a bridge was blocked, we might prefer a strategy whereby users more strongly correlated with blocking events are isolated rather than removed.
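
One way to read "isolated rather than removed" is to keep every account but split accounts into pools by how often their bridges were blocked. The sketch below assumes a simple threshold on a per-user blocking counter; the function name `assign_pools`, the pool names, and the threshold are hypothetical and only illustrate the idea.

```python
def assign_pools(block_counts, isolate_above):
    """Split users into pools instead of dropping them.

    block_counts: dict mapping user id -> number of blocking events
    isolate_above: counter value above which a user is isolated
    """
    pools = {"general": [], "isolated": []}
    for user_id, count in sorted(block_counts.items()):
        name = "isolated" if count > isolate_above else "general"
        pools[name].append(user_id)
    return pools
```

Users in the isolated pool would then only share bridges with each other, so a bridge lost to DPI costs an honest user some reputation but not the account.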

comment:2 Changed 5 years ago by phw

Cc: phw added

I feel like we need to provide incentives for people to actually give bridges to friends. From an egoistic point of view: why would they do that? Distributing bridges to other people increases the probability of the bridge getting blocked and the system classifying you as a "bad user" (I'm a pessimist).

Another problem might be the first node in the invitation graph: the users who receive bridges first. If we only have a small set of trusted people, the entire system might be ineffective. If we have some automated way to handle this, then we can expect large-scale sybil attacks like the bridges.tpo crawling.

Some papers which thought about models for this problem:
http://freehaven.net/anonbib/#proximax11
http://www.cs.kau.se/philwint/censorbib/#Mahdian2010
http://www.cs.kau.se/philwint/censorbib/#Sovran2008

comment:3 Changed 5 years ago by phobos

Keywords: SponsorL added

comment:4 in reply to:  2 Changed 5 years ago by aagbsn

Replying to phw:

I feel like we need to provide incentives for people to actually give bridges to friends. From an egoistic point of view: why would they do that? Distributing bridges to other people increases the probability of the bridge getting blocked and the system classifying you as a "bad user" (I'm a pessimist).

I suppose that a user could just redeem an invite themselves for additional bridges without sharing, and there's not much we could do about that.

I do think people like helping their friends (hey, I'm a pessimist too, but I still have friends ;-)), and think the fact that sharing an invite with a bad user could (slightly, depending on the total number of bridges available) increase the chance that your bridges are blocked is in fact good -- it should prevent users from giving bridges out to people who they don't trust to some degree. Isn't that the point of a social distributor? To me, that's exactly the behavior we want. The incentive here is that people who have invites possess something that other people want, gifting things is something only people who "have" can do, and each act of gifting strokes the ego.

Another problem might be the first node in the invitation graph: the users who receive bridges first. If we only have a small set of trusted people, the entire system might be ineffective. If we have some automated way to handle this, then we can expect large-scale sybil attacks like the bridges.tpo crawling.

This is a good point. We should think about what the initial invitation size should be, and who should get an invite. I think it's OK to experiment with a small initial pool (and a corresponding small initial allocation of bridges) and see what happens. In fact, maybe we would like to split the social distributor into several pools - give out invites to people who "seem legit" on irc, twitter, and email and again, observe what happens.

Some papers which thought about models for this problem:
http://freehaven.net/anonbib/#proximax11
http://www.cs.kau.se/philwint/censorbib/#Mahdian2010
http://www.cs.kau.se/philwint/censorbib/#Sovran2008

Thanks! I'll read these now.

comment:5 Changed 5 years ago by aagbsn

Ah, so one thing that Proximax does is evaluate users of the system by a metric "user-hours", referred to as yield, in order to figure out how to allocate a sparse set of bridges effectively.

BridgeDB already obtains the extra-info field bridge-ips for each bridge, which consists of approximate user-counts per country.
BridgeDB also learns which bridges are blocked by parsing a file; however, we don't yet produce this list of blocked bridges. Also, the file format does not include a timestamp field, and blocking events are grouped into a single line. If we want to extrapolate bridge availability we will probably want a more descriptive format, and BridgeDB should learn to track some sort of per-country available-uptime metric.

Alternately, we could extract yield entirely from the bridge-ips line by tracking users seen over time. This could be manipulated by a dishonest bridge operator, or an attacker who generates traffic to known bridges to boost their ranking and obtain more trust in this system.
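
As a rough illustration of extracting yield from bridge-ips alone, the sketch below sums users multiplied by reporting-interval hours for one country. It assumes the line has the form `bridge-ips cn=16,ir=8` and that the caller already knows the length of each reporting interval. The function names are hypothetical, and the result inherits the weakness described above: it trusts whatever the bridge reports.

```python
def parse_bridge_ips(line):
    """Parse a line such as 'bridge-ips cn=16,ir=8' into {'cn': 16, 'ir': 8}."""
    parts = line.split(None, 1)
    if len(parts) < 2 or parts[0] != "bridge-ips":
        return {}
    counts = {}
    for item in parts[1].strip().split(","):
        country, _, value = item.partition("=")
        if country and value.isdigit():
            counts[country] = int(value)
    return counts


def user_hours(samples, country):
    """Approximate yield for one country.

    samples: list of (interval_hours, bridge_ips_line) tuples, one per
    reporting interval of a single bridge.
    """
    return sum(hours * parse_bridge_ips(line).get(country, 0)
               for hours, line in samples)
```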

I'm not sure what we can do about this. Are there other ways to estimate bridge usage without just trusting the bridge self-reporting?

comment:6 Changed 5 years ago by arma

Fyi, I also write about this problem in the 'putting it all together' section at the end of https://blog.torproject.org/blog/research-problem-five-ways-test-bridge-reachability

comment:7 in reply to:  5 Changed 5 years ago by arma

Replying to aagbsn:

Alternately, we could extract yield entirely from the bridge-ips line by tracking users seen over time. This could be manipulated by a dishonest bridge operator, or an attacker who generates traffic to known bridges to boost their ranking and obtain more trust in this system.

I think it's a really hard theoretical problem to distinguish 'real' usage from artificial usage added by an adversary who controls the country that we're trying to measure usage from.

I'm not sure what we can do about this. Are there other ways to estimate bridge usage without just trusting the bridge self-reporting?

We could do spot checking from trusted users in-country, to make sure that the bridge remains reachable during the time that it's reporting high load. That's a variant of the reachability testing approaches from the blog post above.

I should also note that Damon told me a year or so back that he wants to pick up the Proximax work and get some grants and some grad students to work on it. This ticket in particular sounds like it needs a few research papers written before we have a good handle on what we should deploy. In particular, one of the first things I'd want to see is a list of attacks on Proximax that aim to skew its results.

comment:8 Changed 5 years ago by arma

Keywords: SponsorZ added; SponsorL removed

comment:9 Changed 5 years ago by jhd

Hello! I am a grad student working on a thesis that is related to Tor bridge distribution. I'm not funded by a grant, so mine is sort of a solo effort, but I thought I might try to contribute a little bit before I finish my thesis up.

I should also note that Damon told me a year or so back that he wants to pick up the Proximax work and get some grants and some grad students to work on it. This ticket in particular sounds like it needs a few research papers written before we have a good handle on what we should deploy. In particular, one of the first things I'd want to see is a list of attacks on Proximax that aim to skew its results.

While doing some preliminary research, I found a very recent paper on Tor bridge distribution. See the citation below (pdf is available via Google Scholar).

Wang, Q., Lin, Z., Borisov, N., & Hopper, N. J. rBridge: User Reputation based Tor Bridge Distribution with Privacy Preservation.

They have a similar approach to distributing bridges as defined above (credit/reputation based system), and they claim to outperform Proximax. Some disadvantages I see are that their privacy preservation feature introduces some overhead and also forces a random selection of bridges for each user (Wouldn't it be better if you could determine the bridges to give a new user based on who invited them?).

comment:10 Changed 4 years ago by sysrqb

(I'll post more later, but for now...)

After reading rBridge, Proximax, Kaleidoscope, Tor's blocking resistance paper:

some thoughts on a future system:

  • We will want multiple pools (possibly three, to start off: Automated distribution, manual distribution, reserve)
  • Use of a credential system that awards users via allocation of credits seems like a good idea
  • Awarding credits based on a bridge's user-hours value seems like a good idea
  • We should try to add the "intrinsic risk" of a bridge into the reputation calculations
  • Without the use of NIPK and OT, the BridgeDB operators MUST be trusted
  • Reputation should not only be based on a social tree
  • We can use the bridge's geoip stats to *help* determine when the bridge has been blocked within a zone
  • Bridges can be selected based on the user's identity rather than location. (Really, how bad is random selection?)
  • Do we want to maintain an ID system (ex. Persona)?
  • We need reachability testing...yesterday
  • When we determine (within a reasonably high probability) that a user is a censor and/or in cahoots with one, only supply blocked bridges
  • GEO IP tracking by a bridge needs to distinguish between direct connections and connections via PT
  • Can we use standalone PT nodes within a censored zone to obscure a connection between a PT client and bridge?
  • How do we prevent sybils when we have registered users?
  • If we don't track the social graph, can we somehow factor it into our calculations? (assuming a registered user may distribute her bridges to friends)
  • Is the use of FQDN as bad an idea as I think it is?
  • Low plausible deniability that you don't have credentials if you use Tor

Isis has several really good ideas too (Persona was one of them, now that I think about it).

comment:11 in reply to:  5 Changed 4 years ago by sysrqb

Replying to aagbsn:

BridgeDB already obtains the extra-info field bridge-ips for each bridge, which consists of approximate user-counts per country.

I wonder if this will be "good enough" until we implement a better solution. Initially I didn't think this was a good idea, but if the value that we extract from this is "From which country do we no longer see users", then I think this should work.
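
The question "from which country do we no longer see users" could be answered with something like the sketch below. It assumes a per-bridge history of parsed bridge-ips counts, one dict per reporting period; the function name and the `min_users` and `quiet_periods` thresholds are hypothetical and would need tuning against real data.

```python
def vanished_countries(history, min_users=8, quiet_periods=2):
    """Return countries that used to reach a bridge but no longer do.

    history: list of {country: user_count} dicts, oldest period first.
    A country is reported if it had at least min_users in some earlier
    period and zero users in each of the last quiet_periods periods.
    """
    if len(history) <= quiet_periods:
        return set()
    earlier, recent = history[:-quiet_periods], history[-quiet_periods:]
    seen = {country
            for period in earlier
            for country, count in period.items()
            if count >= min_users}
    return {country for country in seen
            if all(period.get(country, 0) == 0 for period in recent)}
```

A disappearing country is only a hint that the bridge was blocked there, not proof: users may simply have moved to other bridges.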

Alternately, we could extract yield entirely from the bridge-ips line by tracking users seen over time. This could be manipulated by a dishonest bridge operator, or an attacker who generates traffic to known bridges to boost their ranking and obtain more trust in this system.

I think this value is difficult to determine. Maybe (initially) user-feedback is "good enough" in this situation?

Four scenarios we need to take into account:

1) benevolent bridge with no connection to censor

When censor blocks bridge, geoip stats accurately reflect this. user-hours can be calculated with sufficient accuracy.

2) benevolent bridge with connection to censor

When censor blocks bridge, geoip stats *may* accurately reflect this. Unknown if user-hours can be calculated accurately.

3) malicious bridge with no connection to censor

When censor blocks bridge, geoip stats are not reliable and *may not* reflect this. The bridge may have artificially inflated or deflated the stats it reported throughout its operation. User-hours calculation should be assumed to be inaccurate.

4) malicious bridge with connection to (or is) censor

When censor blocks bridge, geoip stats are not reliable and likely *do not* reflect this blocking. Stats probably will report artificially inflated usages from the censored zone, thus reducing the probability it is removed from the distributor. This also shrinks the number of new bridges a user can obtain.

If we appropriately handle 1 and 4, this should include adequate countermeasures for 2 and 3.

comment:12 in reply to:  10 ; Changed 4 years ago by asn

Replying to sysrqb:

(I'll post more later, but for now...)

After reading rBridge, Proximax, Kaleidoscope, Tor's blocking resistance paper:

some thoughts on a future system:

I also agree that cherry-picking features we like from all these schemes might be a good way to design a decent future BridgeDB. Adding some notes of my own:

  • We will want multiple pools (possibly three, to start off: Automated distribution, manual distribution, reserve)
  • Use of a credential system that awards users via allocation of credits seems like a good idea
  • Awarding credits based on a bridge's user-hours value seems like a good idea

You mean based on bridge uptime (like the rBridge paper)? I also like this idea.

  • We should try to add the "intrinsic risk" of a bridge into the reputation calculations
  • Without the use of NIPK and OT, the BridgeDB operators MUST be trusted

Yeah, the threat model of BridgeDB will have to remain the same on this matter.
I'd also like BridgeDB to have all those fancy rBridge cryptowan^Wfeatures (ZKP/OT/etc.), but I really doubt we can implement them efficiently/securely/in the next 5 years. I don't know a single widely used application with oblivious transfer capabilities.

BTW, as far as the BridgeDB threat model goes, note that all these reputation-based systems probably require BridgeDB to keep accounting logs for users ("user X got bridges at TIMESTAMP", "user X invited user Y", etc.). This is not the case with the current BridgeDB.

  • Reputation should not only be based on a social tree
  • We can use the bridge's geoip stats to *help* determine when the bridge has been blocked within a zone
  • Bridges can be selected based on the user's identity rather than location. (Really, how bad is random selection?)
  • Do we want to maintain an ID system (ex. Persona)?

By ID system, you mean some kind of identifier per user other than an IP address? I guess we will need such a system, if we want to build the whole invitation/credential-based idea.

Will the bridge selection happen based on the ID of the user, or the IP address of the user? For example, Persona is based on a single email address; will a user who creates multiple Persona IDs be able to get more than a single bunch of bridges?

  • We need reachability testing...yesterday
  • When we determine (within a reasonably high probability) that a user is a censor and/or in cahoots with one, only supply blocked bridges
  • GEO IP tracking by a bridge needs to distinguish between direct connections and connections via PT
  • Can we use standalone PT nodes within a censored zone to obscure a connection between a PT client and bridge?
  • How do we prevent sybils when we have registered users?
  • If we don't track the social graph, can we somehow factor it into our calculations? (assuming a registered user may distribute her bridges to friends)
  • Is the use of FQDN as bad an idea as I think it is?

FQDN? What do you mean?

  • Low plausible deniability that you don't have credentials if you use Tor

What do you mean on this one?

Thanks for looking into this! I like where it's going.

comment:13 in reply to:  12 Changed 4 years ago by sysrqb

Cc: Matthew.Finkel@… added

Thanks for the feedback asn!

Replying to asn:

Replying to sysrqb:

(I'll post more later, but for now...)

After reading rBridge, Proximax, Kaleidoscope, Tor's blocking resistance paper:

some thoughts on a future system:

I also agree that cherry-picking features we like from all these schemes might be a good way to design a decent future BridgeDB. Adding some notes of my own:

  • We will want multiple pools (possibly three, to start off: Automated distribution, manual distribution, reserve)
  • Use of a credential system that awards users via allocation of credits seems like a good idea
  • Awarding credits based on a bridge's user-hours value seems like a good idea

You mean based on bridge uptime (like the rBridge paper)? I also like this idea.

Yup. From what I can tell, the idea actually originated in the Proximax paper and then was improved in rBridge.

  • We should try to add the "intrinsic risk" of a bridge into the reputation calculations
  • Without the use of NIPK and OT, the BridgeDB operators MUST be trusted

Yeah, the threat model of BridgeDB will have to remain the same on this matter.
I'd also like BridgeDB to have all those fancy rBridge cryptowan^Wfeatures (ZKP/OT/etc.), but I really doubt we can implement them efficiently/securely/in the next 5 years. I don't know a single widely used application with oblivious transfer capabilities.

Yeah, sad reality. I made this point specifically regarding aagbsn's BridgeHerder idea and the idea that third parties can run BridgeDB social distributors. BridgeHerder really is a good idea, but I hadn't really thought about it until I read the rBridge paper. After reading it I realized how dangerous a compromised BridgeDB instance can be to a user's anonymity.

BTW, as far as the BridgeDB threat model goes, note that all these reputation-based systems probably require BridgeDB to keep accounting logs for users ("user X got bridges at TIMESTAMP", "user X invited user Y", etc.). This is not the case with the current BridgeDB.

This is true and a valid point (which is where the zero knowledge implementation shines) and I think we need to discuss/debate/design the best way to handle this information such that the users are put in the least amount of danger.

  • Reputation should not only be based on a social tree
  • We can use the bridge's geoip stats to *help* determine when the bridge has been blocked within a zone
  • Bridges can be selected based on the user's identity rather than location. (Really, how bad is random selection?)
  • Do we want to maintain an ID system (ex. Persona)?

By ID system, you mean some kind of identifier per user other than an IP address? I guess we will need such a system, if we want to build the whole invitation/credential-based idea.

Right, we will need some way to create/maintain accounts. I know, at the least, mikeperry and isis were looking at persona (independently). We must be careful about which implementation we use, as well.

Will the bridge selection happen based on the ID of the user, or the IP address of the user? For example, Persona is based on a single email address; will a user who creates multiple Persona IDs be able to get more than a single bunch of bridges?

Good question. rBridge selects all bridges randomly; I haven't decided if I like this yet. If we don't want to do that, we could base selection on the MAC of the userid + nonce (or something similar). I think how we choose to select bridges will really depend on how bridges are stored (i.e. will we still use rings?).
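
A minimal sketch of selection based on a MAC of the userid + nonce, assuming bridges are still stored on a hashring. The helper names, the choice of HMAC-SHA256, and the way bridges are placed on the ring are assumptions for illustration and are not taken from BridgeDB's code.

```python
import bisect
import hashlib
import hmac


def ring_position(secret_key, user_id, nonce):
    """Position on the ring: hex HMAC-SHA256 over 'userid|nonce'."""
    message = ("%s|%s" % (user_id, nonce)).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def select_bridges(secret_key, user_id, nonce, bridges, count=3):
    """Pick `count` consecutive bridges clockwise from the user's position."""
    # Each bridge sits at the SHA-256 of its own identifier (an assumption).
    ring = sorted((hashlib.sha256(b.encode("utf-8")).hexdigest(), b)
                  for b in bridges)
    positions = [position for position, _ in ring]
    start = bisect.bisect_left(positions,
                               ring_position(secret_key, user_id, nonce))
    return [ring[(start + i) % len(ring)][1]
            for i in range(min(count, len(ring)))]
```

Because the key is secret, a user cannot predict which identity would land on which bridges. Changing the nonce moves the same user to a different ring position without changing the identity.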

  • We need reachability testing...yesterday
  • When we determine (within a reasonably high probability) that a user is a censor and/or in cahoots with one, only supply blocked bridges
  • GEO IP tracking by a bridge needs to distinguish between direct connections and connections via PT
  • Can we use standalone PT nodes within a censored zone to obscure a connection between a PT client and bridge?
  • How do we prevent sybils when we have registered users?
  • If we don't track the social graph, can we somehow factor it into our calculations? (assuming a registered user may distribute her bridges to friends)
  • Is the use of FQDN as bad an idea as I think it is?

FQDN? What do you mean?

Sorry, Fully-Qualified Domain Names. Proximax relies on the clients using domain names instead of IP addresses. Multiple bridges are mapped to the same domain name, and are used in round-robin. When an IP address is blocked, Proximax assigns a new bridge's IP address to the domain name, etc. If the domain name is blocked, then Proximax simply deploys a new domain.

I have some reservations about this idea, but I haven't decided if they're well-founded yet.
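
The domain-name mechanism as described in this comment can be modelled with a small sketch. This is not Proximax's actual implementation; the class `DomainPool` and its methods are hypothetical and only mirror the three behaviours mentioned: round-robin resolution, replacing a blocked address, and replacing a blocked domain.

```python
class DomainPool:
    """Toy model of several bridge addresses behind one domain name."""

    def __init__(self, domain, active, spare):
        self.domain = domain
        self.active = list(active)  # addresses currently served for the domain
        self.spare = list(spare)    # unused bridge addresses held in reserve
        self._next = 0

    def resolve(self):
        """Return the next active address in round-robin order."""
        if not self.active:
            return None
        address = self.active[self._next % len(self.active)]
        self._next += 1
        return address

    def address_blocked(self, address):
        """Swap a blocked address for a spare one, if any is left."""
        if address in self.active:
            self.active.remove(address)
            if self.spare:
                self.active.append(self.spare.pop(0))

    def domain_blocked(self, new_domain):
        """Deploy a new domain in front of the same addresses."""
        self.domain = new_domain
```

The sketch makes one reservation easy to see: anyone who can resolve the domain repeatedly learns every active address behind it.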

  • Low plausible deniability that you don't have credentials if you use Tor

What do you mean on this one?

Currently if you're using Tor in a censored zone you may have received bridge addresses from BridgeDB, or a friend, or help@tpo, etc. If we switch to a social distributor model based on rBridge, then if a person is using Tor, they most likely either received a bridge from BridgeDB or help@. If they received it from BridgeDB then they must have credentials, and if those credentials fall into the wrong hands it would be detrimental (blocked bridges and, depending on the regime, dangerous for the user).

Thanks for looking into this! I like where it's going.

This was more brain-dump than something constructive; I appreciate the feedback though. I'll work on writing Something-Constructive soon.

comment:14 Changed 4 years ago by isis

Cc: isis@… added

*First, what I've researched thus far. Later, I will respond to the last few comments. My eye cannot handle this many glaring computer screens right now.*

I spent a lot of time at first thinking about PoW schemes to protect BridgeDB
from a malicious user requesting bridges and then burning them. To see notes
on all the research I did on that, see Appendix A. The research is irrelevant,
because an email from phw:

And I also wasted^Wspent quite some time thinking about PoW schemes for
scanning resistance and bridge distribution. I came to the conclusion that the
bridge churn rate might not be high enough for it to make sense. I have some
details in Section 4.1.1 here:
http://www.cs.kau.se/philwint/pdf/scramblesuit2013.pdf Let me know if you have
some thoughts; I'd love to chat more about this.

and reading phw's ScrambleSuit paper [0], mentioned above, convinced me that
PoW schemes cannot ever be made to work.

I read the rBridge [12] and Proximax [13] papers, and I must agree with the
rBridge authors that the granularity of categories for classifying whether or
not a distribution tree is "infected" with malicious users is not fine-grained
enough to have results as good as rBridge's (see the "Comparison with
Proximax" section in [12]).

Also asn and I spoke very briefly on IRC about possible implementations of
rBridge. asn thinks the crypto needs to be more widely reviewed. I agree,
mostly...though, I would note that some of the simpler propositions for
privacy preservation in Section 5 of the rBridge paper, such as the Pedersen
secrets [14] (essentially a "newer" version of Shamir's secret sharing
algorithm, where "newer" means "1992") are pretty well-established. However, I
would need a bit of time to read up on the Oblivious Transfer (OT) scheme
utilised [15] -- which most of the rBridge privacy preservation depends upon
-- as well as the authors of that paper's [14] updates to their protocol
[16][17], and more recently published articles on OT ([18] for one). First,
I'm not a "real cryptographer". Second, this will take me a while, due to my
recent injury. TTS doesn't do well with LaTeX algorithms.

Third, that's beside the point. BridgeDB doesn't currently preserve privacy
(as far as I can tell), and if it were compromised all the bridges would be
leaked anyway. But in the meantime, without implementing Section 5 of the
rBridge paper, we could implement the rest, and then perhaps have a real
proposal to a funder later for implementing the privacy-preserving features,
possibly including some way to incentivize real cryptographers to take a look
at the proofs in Sections 5 and 6 of rBridge.

Other possible solutions (though not tailored to distribution of Tor bridges)
that I have investigated somewhat, and which could use further research if it
is decided that rBridge is non-optimal: One was using Mozilla's Persona
[19][20] to implement blinded bridge-user account authentication (mostly
because mikeperry has been praising Persona on tor-dev and got me interested
in it). Another was IBM Research's IDEmix [21][22][23], though I put off
looking into that more because there seemed to be easier, faster paths to
improving BridgeDB (namely rBridge). Of the two, IDEmix seems more privacy
preserving, as tokens are completely blinded, as opposed to Persona, where the
centralised Persona server can see which identity (based on email address) is
logging in to which service, from where (meaning which webpage, etc.), and
when. Other papers which have been mentioned but which I haven't yet gotten
to reading are BridgeHerder [24] and a "Kaleidoscope" paper that sysrqb
mentions.

Concerning Metrics:


Eventually, I think it will be necessary to conduct further research to
determine if rBridge's privacy-preservation scheme for bridge users'
identities is something we wish to implement, or if we should continue looking
at any of the alternatives listed above, or others.

I think this research, and the implementation of some type of
privacy-preservation scheme will be necessary, for several reasons, mostly
concerning safely obtaining metrics which would enable us to improve bridge
distribution, uptime, and bridge user connectivity (though also out of concern
at having the compromise of BridgeDB be a single point-of-failure for users in
censored regions to obtain internet access):

1) There isn't currently a way to safely and accurately get metrics for
patterns in which bridges are getting burned (sysrqb also makes this point
in comment #12, https://trac.torproject.org/projects/tor/ticket/7520#comment:12).
If there were a way to take these metrics, while still preserving the
anonymity of the client, then we could track which client is using what
without knowing who the client is.

2) There isn't an easy way for clients, if/when they do connect to the Tor
network, to report which bridges they had previously tried which did not
work for them. Since bridges already can collect the geolocational data on
connecting clients, it would be nice to have the ability for a client to say
which bridges are unreachable, without giving away (to the bridge currently
in use, as it might be malicious) any information on which bridges were
previously tried (though it is probably safe to assume that if the other
bridges are blocked and the current, malicious bridge is colluding with the
censor, then the censor has already blocked the other bridges tried and thus
already knows about them).

Payment-based Bridge Distribution Schemes:


Tor developers and assistants should spend their time doing sysadmin work, and
bridges should be easily runnable by third parties. aagbsn mentioned an idea
to setup a "cloud bridge management" system where a provider would have a
simple interface for deploying new bridges and taking anonymous payments, and
users could subscribe to N bridges, helping to pay for the cost of running the
bridges.[11] I don't especially like this idea, see Appendix A for why I think
it doesn't help much.

If "Concerning Metrics" #2 were solved, a provider could receive reports from
clients' connections to the other bridges which would tell the provider that a
bridge that the provider runs is unreachable, which the provider could then
corroborate with reports from other users in the same country/region. There
could even be an alert system, i.e. "email me if N clients from the same
country report that the same bridge is down" or possibly/eventually automation
of deploying a new/replacement bridge instance.
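
The alert rule "email me if N clients from the same country report that the same bridge is down" could look roughly like this. The report tuple layout and the function name are assumptions, and how reports would reach the provider without revealing which bridges a client tried is exactly the unsolved part of "Concerning Metrics" #2.

```python
from collections import defaultdict


def bridges_to_alert(reports, threshold):
    """Return (bridge, country) pairs reported down by enough clients.

    reports: iterable of (client_id, country, bridge) tuples.
    threshold: N, the number of distinct clients from one country that
    must report the same bridge before an alert is raised.
    """
    reporters = defaultdict(set)
    for client_id, country, bridge in reports:
        # A set counts each client once, however often it repeats a report.
        reporters[(bridge, country)].add(client_id)
    return sorted(key for key, clients in reporters.items()
                  if len(clients) >= threshold)
```

Counting distinct clients only helps if client identifiers are hard to mint; otherwise a censor could file false reports to trigger needless bridge replacement.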

WOT / contact list schemes:


1) Using the PGP WOT seems like low-hanging fruit. Sending an email with a
key in the strong set could possibly be a useful test for clients requesting
bridges. However, it would be possible to generate a lot of keys, have them
all sign each other, and then upload them to the key servers. It would be a
good idea to decrease the amount of incentivisation for a censor to take
such actions. Requiring at least one of the signatures to have been made
before this PGP-WOT-client-authentication system is deployed is one way to
do it, though it is rather obviously exclusionary. Another way might be to
have a trust path to a Tor developer, or other whitelisted parties, although
this is exclusive and problematic for obvious reasons.

2) Another option would be to have a way to hook into a Google account, to
provide clients with the ability to send bridge "invites" to people on their
contact list. This does not sound very sustainable, as every time a social
network provider updated their API, the code for BridgeDB would need to
change. Also, many censoring countries are moving towards having their own
alternatives to US-based services, such as Weibo and Baidu in China, and
vKontakte in Russia.

Questions:

  • Do Chinese social networks use OpenID?

============================================================================

What to do ASAP:

1) Go through BridgeDB's codebase and assess the difficulty and time
requirements for implementing rBridge, Sections 1-4 only.

2) Write a proposal for fixing/improving/maintaining BridgeDB.

It should include:

  • Solutions and time requirements for the points in asn's list of concerns about BridgeDB
  • Better sanitisation of BridgeDB's logs #3797, #4771
  • Improve BridgeDB's knowledge of current Tor exit nodes (so that they aren't used to obtain bridges) #4405, #7750
  • Improve the email and website interfaces, including better commands/responses for email distribution #8705, #1562, #1610, #3061, #3573, #5851, #6125, #7296, #5655, #8616
  • Modifying the format of BridgeDB's response to provide extra fields for transport protocols like ScrambleSuit and obfsproxy (not sure if the latter was already implemented) #5119, #9013
  • Improving the documentation for BridgeDB code for future maintainers.

3) Hopefully someone funds it.

4) Do things.

What to do long term:

1) Evaluate and discuss the usefulness of directions for further research as
outlined above.

2) Do that research, rinse, repeat.

============================================================================

Appendix A


I re-read the old Rivest, Shamir, and Wagner paper [1] on time-lock puzzles,
as well as the scrypt paper [2] and some documentation on blockchain
computation in Bitcoin, trying to brainstorm a working PoW scheme. I also looked into
automated CPU-scaling for MPI clusters[3], the latest custom ASIC hardware for
SHA256 hash digest computation [4][5][6], and the current state of CMOS-based
and single-electron transistors and their efficiency for carrying out the four
steps for AES encryption[7][8][9][10] to try to get an idea of what the actual
economic and time costs would be for a determined attacker. I think that a
single determined attacker with roughly my capability levels, roughly three
months of development time, and roughly $10,000 USD could break any of the PoW
schemes which remain usable for Tor bridge clients. They might need more
funding for hardware if we picked a work function which doesn't already have
specialized chips/software in production for the underlying computations --
but nevertheless, this is a rather low bar, and it wouldn't buy us much in
terms of decreasing the bridge enumeration rate.

Incidentally, I also read a PETS 2008 paper titled "PAR: Payment for Anonymous
Routing". [11] I was thinking about actual payment systems because aagbsn had
mentioned at some point an idea to create an easy mechanism for outside
entities to set up a sort of systems management system for a bunch of TorCloud
bridge instances and take anonymous payment from users for private
bridges. While I admit that I have no idea how "e-cash", as mentioned in the
PAR paper, works, I know that the scheme they described would be completely
traceable with bitcoin. Plus, our goal is to withstand attacks by
government-level adversaries who are less economically-, CPU-, and memory-
bounded than bridge users. Payment mechanisms would only allow the rich to
subvert the "protections" afforded to bridges.

ScrambleSuit is really interesting, has an elegantly simple morphing
algorithm, and provides protection against timing and replay attacks. However,
I wanted the PoW so that obtaining bridges through the current distribution
mechanisms would be more expensive. At present, as aagbsn noted at one
point (in person, not on this thread), "$20k could buy enough googlemail
addresses to get all the bridges out of the email distributor". Since the PoW
scheme isn't going to work, I'm trying to find a better way to *distribute*
the bridge addresses -- which now also potentially includes ScrambleSuit's
master shared secret.

References:


[0] http://www.cs.kau.se/philwint/pdf/scramblesuit2013.pdf "ScrambleSuit: A Polymorph Network Protocol to Circumvent Censorship"
[1] http://people.csail.mit.edu/rivest/RivestShamirWagner-timelock.ps "Time-lock Puzzles and Timed-release Crypto"
[2] https://www.tarsnap.com/scrypt/scrypt.pdf "STRONGER KEY DERIVATION VIA SEQUENTIAL MEMORY-HARD FUNCTIONS"
[3] https://datawrangling.s3.amazonaws.com/elasticwulf_pycon_talk.pdf "MPI Cluster Programming with Python and Amazon EC2"
[4] http://www.butterflylabs.com/faq/ see "What is 'Hosting' and how do I setup my order to be 'Hosted'?"
[5] https://en.bitcoin.it/wiki/Mining_hardware_comparison#ASIC
[6] http://electronics.stackexchange.com/questions/7042/how-much-does-it-cost-to-have-a-custom-asic-made
[7] http://www.ecs.umass.edu/~jiazhao/custom_aes_liang_li.pdf "A Full-custom Design of AES SubByte Module with Signal Independent Power Consumption"
[8] http://arxiv.org/pdf/1203.4811.pdf "Few Electron Limit of n-type Metal Oxide Semiconductor Single Electron Transistors"
[9] Closing the Power Gap Between ASIC and Custom: Tools and Techniques for Low Power Design. Chinnery, David. Keutzer, Kurt William. pp.115- http://books.google.com/books?id=Pektbnxx6G4C&pg=PA115&lpg=PA115&ots=dqirO074Fj&dq=custom+asic+aes
[10] https://en.wikipedia.org/wiki/Tensilica_Instruction_Extension "Tensilica Instruction Extension"
[11] http://cs.gmu.edu/~astavrou/research/Par_PET_2008.pdf "PAR: Payment for Anonymous Routing"
[12] http://www-users.cs.umn.edu/~hopper/rbridge_ndss13.pdf "rBridge: User Reputation based Tor Bridge Distribution with Privacy Preservation"
[13] http://cseweb.ucsd.edu/~klevchen/mml-fc11.pdf "Proximax: A Measurement Based System for Proxies Dissemination"
[14] http://www.cs.huji.ac.il/~ns/Papers/pederson91.pdf "Non-interactive and information-theoretic secure verifiable secret sharing"
[19] https://github.com/mozilla/id-specs/blob/prod/browserid/index.md "Specifications related to Mozilla's Identity Effort."
[20] https://current.trovebox.com A website using Mozilla's Persona for authentication.
[21] http://www.zurich.ibm.com/security/idemix/ "IBM Research: Identity Mixer"
[22] https://idemix.wordpress.com/2009/09/29/pub-anonymous-credentials/ Papers related to anonymous authentication in IDEmix.
[23] http://domino.research.ibm.com/library/cyberdig.nsf/papers/EEB54FF3B91C1D648525759B004FBBB1/$File/rz3730_revised.pdf IDEmix specification

Not Yet Read:


[14] http://www.pinkas.net/PAPERS/effot.ps "Efficient Oblivious Transfer Protocols"
[15] http://logic.pdmi.ras.ru/ics/papers/ot.pdf "Computationally Secure Oblivious Transfer"
[16] http://www.pinkas.net/ "Homepage of Benny Pinkas" includes bibliography of this cryptographer's papers
[17] http://www.iacr.org/archive/asiacrypt2002/25010142/25010142.pdf "Efficient Oblivious Transfer in the Bounded-Storage Model"
[24] https://trac.torproject.org/projects/tor/ticket/7207 "BridgeHerder: A tool to manage bridges"

comment:15 in reply to:  14 Changed 4 years ago by rransom

Replying to isis:

Also asn and I spoke very briefly on IRC about possible implementations of
rBridge. asn thinks the crypto needs to be more widely reviewed.

The critical piece of cryptography for rBridge is the ‘k-TAA (k-times anonymous authentication) blind signature scheme’ which can additionally prove knowledge of exponents and group elements satisfying certain equations. The particular k-TAA scheme they recommend/require requires a pairing-friendly group; the pairing is the most problematic part of the protocol.

I agree,
mostly...though, I would note that some of the simpler propositions for
privacy preservation in Section 5 of the rBridge paper, such as the Pedersen
secrets [14] (essentially a "newer" version of Shamir's secret sharing
algorithm, where "newer" means "1992") are pretty well-established.

They use Pedersen's commitment scheme (section 3 of the paper, on page 3 of the PDF), not his secret-sharing scheme. Since it's a TTS-hostile bitmap, I'll summarize it here, using the notation of an additive group: The system parameters contain two elements G and H of a group of prime order q, such that no one knows n such that G = nH. A commitment is a group element C. To create a commitment to an exponent s, the committer chooses a secret exponent t uniformly at random and computes C = sG + tH. To ‘open’ a commitment C, the committer discloses s and t satisfying the same equation. Any two openings of a single commitment to different values of s disclose the discrete logarithm of H with respect to G.
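
As a rough illustration of the commitment scheme summarised above, here is a toy Python sketch. It is not taken from the rBridge paper or from BridgeDB. It uses multiplicative notation (C = G^s * H^t mod p) rather than the additive notation above, and the parameters p = 2039, q = 1019, G = 4 and H = 9 are assumptions chosen only so that the example runs: the group is far too small to be secure, and the discrete logarithm between G and H is trivially computable here. All function names are hypothetical. Requires Python 3.8+ for the modular inverse via `pow`.

```python
import secrets

# Toy parameters (assumption, NOT secure): P = 2*Q + 1 is a safe prime,
# and G, H are squares mod P, so both generate the subgroup of prime order Q.
P = 2039
Q = 1019
G = 4
H = 9


def commit(s, t=None):
    """Commit to exponent s; returns (C, t) with C = G^s * H^t mod P."""
    if t is None:
        t = secrets.randbelow(Q)  # secret blinding exponent, uniform in [0, Q)
    c = (pow(G, s, P) * pow(H, t, P)) % P
    return c, t


def verify(c, s, t):
    """Check that (s, t) opens the commitment c."""
    return c == (pow(G, s, P) * pow(H, t, P)) % P


def extract_log(s1, t1, s2, t2):
    """Given two openings of one commitment with s1 != s2, return x
    such that H = G^x mod P.

    G^s1 * H^t1 = G^s2 * H^t2  implies  G^(s1 - s2) = H^(t2 - t1),
    so x = (s1 - s2) / (t2 - t1) mod Q.
    """
    return ((s1 - s2) * pow((t2 - t1) % Q, -1, Q)) % Q
```

`commit(42)` returns a commitment together with the blinding exponent t, and `verify(c, 42, t)` accepts that opening. `extract_log` is the binding argument in code form: anyone who can open one commitment to two different values of s has thereby computed the discrete logarithm relating G and H.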

However, I
would need a bit of time to read up on the Oblivious Transfer (OT) scheme
utilised [15] -- which most of the rBridge privacy preservation depends upon
-- as well as the authors of that paper's [14] updates to their protocol
[16][17], and more recently published articles on OT ([18] for one).

The rBridge protocol does not depend on the specific choice of OT protocol, and does not require that it be performed using the same pairing-friendly group as the rest of the protocol.

However, I don't believe that OT is useful in rBridge. Its purpose is to prevent a malicious bridge distributor from learning a user's identity, but a distributor could still offer the user an oblivious choice of 1000 bridges on different ports of a single IP address, and recognize the user when he/she/it connects to that IP address.
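
A minimal Python sketch of that linking attack, to make the argument concrete. Nothing here is real OT or real BridgeDB code: `oblivious_pick` is only a stand-in for an OT protocol in which the distributor does not learn which entry was chosen, the addresses come from the 203.0.113.0/24 documentation range, and every name is hypothetical.

```python
import secrets


def malicious_offer(user_slot, n_bridges=1000):
    """Malicious distributor: all bridges offered to one user share a single
    IP address and differ only in port."""
    ip = f"203.0.113.{user_slot}"
    return [(ip, 20000 + i) for i in range(n_bridges)]


def oblivious_pick(offer):
    """Stand-in for OT: the distributor does not see which entry is taken."""
    return secrets.choice(offer)


def recognise(connecting_ip, slot_to_user):
    """The distributor maps the IP a client connects to back to the user
    who was offered that IP, regardless of the port chosen."""
    slot = int(connecting_ip.rsplit(".", 1)[1])
    return slot_to_user.get(slot)
```

Whichever of the 1000 entries the user picks, the IP address alone identifies them, so hiding the choice from the distributor gains nothing in this scenario.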

(Also, you omitted reference 18.)

comment:16 Changed 3 years ago by wfn

Cc: wfn added

comment:17 Changed 3 years ago by isis

Keywords: bridgedb-socdist added

comment:18 Changed 3 years ago by isis

Keywords: isisExB isis2015Q3Q4 added

comment:19 Changed 3 months ago by isis

Cc: ioerror removed
Owner: set to isis
Severity: Normal
Status: new → assigned

The design is called Hyphae, and it is described in a paper written by Henry de Valence and me (which still needs a concrete implementation section, but is otherwise complete enough to do such an implementation).

comment:20 Changed 3 months ago by isis

Summary: Design and implement a social distributor for BridgeDB → Design a social bridge distributor

I'm changing this ticket to refer to only the design, since "design and implementation" is overly broad. As such, I'm calling this task done as of April 2017. The new ticket for tracking implementation is #22775.

comment:21 Changed 3 months ago by isis

Type: enhancement → project

comment:22 Changed 3 months ago by isis

Keywords: SponsorZ removed
Resolution: fixed
Status: assigned → closed