Opened 8 years ago

Closed 14 months ago

#4666 closed task (wontfix)

Write proposal for proof-of-work service

Reported by: mikeperry Owned by: mikeperry
Priority: Medium Milestone:
Component: Archived/Ponies Version:
Severity: Normal Keywords: SponsorZ-large, archived-closed-2018-07-04
Cc: aagbsn, phobos, arma, hellais, nickm, g.koppen@…, pebbles@… Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

aagbsn and I dreamed up a simple proof-of-work service a week or two ago. It would work like this:

  1. A web service runs at a URL and provides signed or HMACed products-of-primes combined with a timestamp and a unique sequence number. The primes should be chosen such that factoring their product takes on the order of several seconds on most hardware.
  2. Torbutton periodically fetches batches of these products-of-primes over Tor and factors them into their component primes in a background thread, until a reserve pool is filled.
  3. DoS-sensitive websites can advertise an additional header and/or accept N factored products issued within the last T minutes in exchange for captcha-free/unblocked service. The products are encoded as cookies, GET parameters, or POST parameters.
  4. Factored products can be reused on a site until the user either clicks "New Identity" or the products become too old for use at that site.
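The steps above could be sketched roughly as follows. This is an illustrative sketch only: the primes here are tiny so the example runs instantly (a real deployment would size them so factoring takes several seconds), the verifier would check a public-key signature rather than share the issuer's HMAC key, and all names (`issue_puzzle`, `SERVICE_KEY`, etc.) are hypothetical:

```python
import hashlib
import hmac
import itertools
import time

SERVICE_KEY = b"issuer-secret"  # hypothetical issuer key (a real service would sign)

def issue_puzzle(p, q, seq):
    """Step 1: the service publishes an HMACed product-of-primes
    with a timestamp and a unique sequence number."""
    product = p * q
    ts = int(time.time())
    msg = f"{product}:{ts}:{seq}".encode()
    tag = hmac.new(SERVICE_KEY, msg, hashlib.sha256).hexdigest()
    return {"product": product, "ts": ts, "seq": seq, "tag": tag}

def factor(n):
    """Step 2: the client factors the product in the background.
    Trial division here; real primes would make this take seconds."""
    for d in itertools.count(2):
        if n % d == 0:
            return d, n // d

def verify(puzzle, p, q, max_age):
    """Step 3: the site checks authenticity, freshness (the T knob),
    and that the client really produced the factorization."""
    msg = f"{puzzle['product']}:{puzzle['ts']}:{puzzle['seq']}".encode()
    expected = hmac.new(SERVICE_KEY, msg, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(puzzle["tag"], expected):
        return False
    if time.time() - puzzle["ts"] > max_age:
        return False
    return p > 1 and q > 1 and p * q == puzzle["product"]
```

A site tunes its two knobs by requiring N such verified puzzles (step 3's "N factored products") and by passing a smaller `max_age` when scraping volume rises.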

This system provides two knobs for sites to throttle traffic in response to current scraping levels. On days where scraping query volume is low, sites can be lenient in terms of how many proof of work units they require for use, and how recently they must have been issued. If scraping volume increases, sites can increase N and/or reduce T, to change the amount of work that must be done by scrapers in order to receive service.

Alternatively or in addition, we can investigate using a Nymble-like (or possibly simpler) system to allow the creation of single identifiers in exchange for some number of proof-of-work units as well. Converting units of work into identifiers (and preventing re-use of work using sequence numbers) may simplify implementation on the DoS-sensitive service side, by enabling the service to concern itself only with limiting the rate of queries per identifier.
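The work-to-identifier exchange described above might look like the following sketch, where sequence numbers prevent re-use of work and the service then only rate-limits per identifier (all names are hypothetical, and the work units are assumed to have already been verified as in the description):

```python
import hashlib

class IdentifierBroker:
    """Sketch: exchange some number of verified proof-of-work units
    for a single opaque identifier, using sequence numbers to
    prevent the same work from being redeemed twice."""

    def __init__(self, units_required):
        self.units_required = units_required
        self.spent_seqs = set()  # sequence numbers already redeemed

    def redeem(self, work_units):
        # work_units: list of (seq, proof) pairs, assumed already verified
        seqs = {seq for seq, _proof in work_units}
        if len(seqs) < self.units_required:
            return None  # not enough distinct units of work
        if seqs & self.spent_seqs:
            return None  # some of this work was already spent
        self.spent_seqs |= seqs
        digest = hashlib.sha256(repr(sorted(seqs)).encode()).hexdigest()
        return digest[:16]  # opaque identifier the service can rate-limit
```

The DoS-sensitive service then never sees the work itself, only a per-identifier query rate.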

Child Tickets

Change History (20)

comment:1 Changed 8 years ago by mikeperry

If we're extra-clever, we could find some way to partner with a distributed computing effort that provides some benefit to society and/or allows us to monetize the work done as well, as opposed to simply doing factoring or similar asymmetric busywork.

comment:2 Changed 8 years ago by mikeperry

The first property we need from the distributed computing service is the ability to authenticate unique units of computational work, either via signed tokens from the service, or some public aspect of the work itself.

The second property we need is that the individual units of work be small. We don't want to be burning someone's battery down for 30 minutes just for one unit of work (only to have them shut down Tor Browser long before the unit completes, or otherwise experience excessive captchas for a long period of time while waiting for more units of work).

The third property we need is the ability to easily submit the work over Tor without risking linkability through the additional submission of machine info, OS info, distributed client version info, etc etc etc.

We should do a quick rundown of existing distributed computing efforts to see whether any of them meet our needs before falling back on the prime-factorization idea.

comment:3 Changed 8 years ago by Sebastian

I'm not sure what this would gain me as a service operator... It might slow down real DDoS attacks a little, but if I have one troll on my forum, I bet that troll won't mind using a few cores to continuously prove work and keep posting spam. Did I miss a use case here? It seems to me that the services that ban Tor are mostly concerned with individuals, not so much with real DDoS.

comment:4 Changed 8 years ago by mikeperry

To be honest, I was primarily thinking about very large services. Search engines, large social media sites, and other services that measure abuse primarily in terms of the computational+monetary cost to them. My primary goal is to be able to provide a search engine with a "clean" feed of users that statistically can be regarded as mostly not bots (see #3962).

However, that doesn't mean it doesn't work for other sites too. Your small site admin could flip a switch that requires new accounts to have completed increasingly larger amounts of work, until the trolls give up and go away. The Nymble literature caters to this use case, but the only limiting resource the literature seriously suggests is the IP address (which is not scarce at all if you are clever).

comment:6 in reply to:  5 Changed 8 years ago by mikeperry

Replying to rransom:

See http://www.cl.cam.ac.uk/~rnc1/proofwork.pdf.

It appears the "failure" of proof-of-work in this paper hinges critically on the one-to-many ability of legitimate email to be sent to multiple recipients and via mailing lists, which would cause bottlenecks for a considerable percentage of legitimate users. Web services do not have this property in general.

Furthermore, email proof-of-work is competing against rather highly effective spam filters. Web services don't really have such highly-effective general-purpose technology. Our proof of work service would only be competing against Captchas and IP bans.

Also of interest is that as spam has become less profitable in recent years, spam volumes have actually dropped considerably. Increasing the cost of sending mail did in fact lower spam volume. It is just that improved filtering, user education, and botnet takedowns are what caused this particular increase in cost, not proof of work.

I am not convinced :).

comment:7 Changed 8 years ago by mikeperry

https://en.wikipedia.org/wiki/List_of_distributed_computing_projects

Most distributed computing projects seem to take on the order of an hour on average, but perhaps some of these are tunable for very short work units:
http://www.boinc-wiki.info/BOINC_FAQ:_Performance#What.27s_the_average_processing_time_per_Work_Unit.3F

comment:8 Changed 8 years ago by mikeperry

According to http://www.boinc-wiki.info/Choosing_a_BOINC_Powered_Project, other promising candidates include:

  1. ABC@Home
  2. The Lattice Project
  3. SETI @ Home
  4. Rosetta @ Home

This list excludes a couple projects from the FAQ entry in my previous comment because of either closed-source software or infrequent work availability.

All of these either have low unit-of-work times or can be configured to perform smaller tasks. Though on the lower bound, we still appear to be looking at at least half an hour of work. Additionally, the memory requirements for Rosetta in particular seem rather obscene.

In general, BOINC does seem to have some support for an "anonymous" mode that does not transmit your platform information to the server:
http://www.boinc-wiki.info/BOINC_Anonymous_Platform_Mechanism

comment:9 Changed 8 years ago by mikeperry

Some additional comments:

The Lattice Project is a generalized grid. It has several sub-projects of varying complexity, not all of which have availability.

Leiden Classical seems to also have short units of work, but potentially doesn't have full availability.

So far I'm not seeing a lot of convincing options.

We may have to make some mailing list posts to see whether any major, fully committed projects might be willing to provide us with short units of work. Otherwise it seems like we're falling back to prime factorization or other asymmetric busywork.

comment:10 Changed 8 years ago by cypherpunks

See if Folding@home will work. Also, you could just mine Bitcoins and donate them to TTP. SETI@home is just looking for nearby space aliens; that would be a colossal waste.

comment:11 in reply to:  10 ; Changed 8 years ago by mikeperry

Replying to cypherpunks:

See if Folding@home will work. Also, you could just mine Bitcoins and donate them to TTP. SETI@home is just looking for nearby space aliens; that would be a colossal waste.

The problem with both Folding@Home and Bitcoin is that the units of work are waaaay up there, on the order of a day to days. Rosetta does similar computations to Folding@Home with shorter units of work, but has insane RAM requirements.

So if SETI could give us short units of work with low RAM requirements, I think it's certainly a more productive option than factoring products of small primes. After all, "Where is everybody???" https://en.wikipedia.org/wiki/Fermi_paradox#Name

comment:12 Changed 7 years ago by mikeperry

Cc: hellais nickm added

Every time I bring this up, someone complains about mobile users. I see two solutions:

  1. Mobile users only generate tokens when plugged in to AC with a full battery.
  2. We add a nymble layer, and mobile users can get nymble ids via either computation or SMS message, their choice.

These solutions are orthogonal. I would like to avoid adding the nymble layer, because it's complex and we don't *really* need it. But if we do add it, we have a solution for mobile that doesn't *force* anyone to use SMS/non-tor.

comment:13 Changed 7 years ago by mikeperry

Keywords: SponsorZ-large added

The sum total of all of this work is a large task.

comment:14 Changed 6 years ago by mikeperry

Memory-bound puzzles might be a better option here for Mobile, actually:
https://research.microsoft.com/pubs/54395/memory-longer-acm.pdf

A search for "Memory-bound puzzles" on Google turns up about a dozen more papers.
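The idea behind the memory-bound puzzles in that paper is that puzzle cost is dominated by cache-missing lookups into a table too large to fit in cache, so desktop and mobile CPUs solve them at much closer speeds than CPU-bound puzzles. A minimal sketch of that style of puzzle (illustrative only; names and the index-mixing constant are assumptions, not the paper's construction):

```python
import hashlib

def memory_bound_walk(seed, table, steps):
    """Sketch of a memory-bound puzzle: a pseudorandom walk through
    a large table, where each lookup depends on the previous one,
    forcing serial memory accesses rather than raw computation."""
    idx = int.from_bytes(hashlib.sha256(seed).digest(), "big") % len(table)
    acc = 0
    for _ in range(steps):
        value = table[idx]
        acc ^= value
        # next index depends on the value just read, so lookups cannot
        # be parallelized or precomputed without holding the whole table
        idx = (value ^ (idx * 2654435761)) % len(table)
    return acc
```

The verifier, holding the same table, recomputes the walk cheaply relative to issuing it; for mobile the knob is the table size rather than CPU speed.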

comment:15 Changed 6 years ago by gk

Cc: g.koppen@… added

comment:16 in reply to:  11 Changed 6 years ago by dusty

Cc: pebbles@… added

Replying to mikeperry:

The problem with both Folding@Home and Bitcoin is that the units of work are waaaay up there, on the order of a day to days.

Work units do not have to be large to be monetizable for Bitcoin and related networks, see for example https://en.bitcoin.it/wiki/P2Pool .

One may find that Bitcoin as a solution here has a wide variety of other notable (but solvable) problems.

Really if my browser is forcing me to do random proof of work in the background, I'd like to be able to decide what kind of work it's doing myself, even if that means writing some kind of authenticated plugin.

comment:17 Changed 4 years ago by isis

Component: Company → Ponies

Recomponentising from "Company" to "Ponies". (Why was this ever in "Company" anyway?)

comment:18 Changed 3 years ago by cass

Severity: Normal

This ticket is tagged SponsorZ, but it looks like its path forward is unclear (and now it's in ponies). Do we still want to seek funding for it or can I clear it out of SponsorZ?

comment:19 Changed 3 years ago by nickm

I think this might fall under our tor-dos heading. If it's a good idea, we could look for funding for it! I'm only about 50% sure it's a good idea though. We might want to mention this as a possible future avenue in anti-dos work, but we should think more before we commit to doing it specifically.

comment:20 Changed 14 months ago by teor

Keywords: archived-closed-2018-07-04 added
Resolution: wontfix
Status: new → closed

Close all tickets in archived components