Opened 6 months ago

Closed 6 months ago

#34286 closed defect (fixed)

gettor appears to be in an email loop war with a .sk address

Reported by: arma Owned by: cohosh
Priority: Medium Milestone:
Component: Applications/GetTor Version:
Severity: Normal Keywords:
Cc: traumschule, hiro, gaba, phw, cohosh Actual Points:
Parent ID: Points:
Reviewer: phw Sponsor:

Description

I happened to be looking at eugeni's mail.log for other debugging, and saw that approximately 25% of the lines in mail.log contain the string gettor.

(Yesterday, eugeni's postfix had 460k lines in it, and 101k of them said "gettor" in them. Today in the first hour or so, it's 7k out of 25k.)

Does gettor get into fights with external addresses, where it replies to the bounce, gets another bounce and replies to that, etc?

There are probably smart guidelines for avoiding mail loop wars, like not answering names that start with mailer-domain, checking for the presence of an X-Something-Something header, or rate limiting responses to a given address.

And this is a great case where unifying how bridgedb handles its email answers, and how gettor does it, will save a lot of headache.

Child Tickets

Change History (11)

comment:1 Changed 6 months ago by cohosh

Owner: set to cohosh
Status: newassigned

Aha! Perhaps this is the reason we're getting so many repeat requests for icelandic versions of Tor Browser? I thought someone wrote a bot but this seems much more likely.

comment:2 in reply to:  description Changed 6 months ago by cohosh

Reviewer: phw
Status: assignedneeds_review

Here's a patch: https://gitlab.torproject.org/torproject/anti-censorship/gettor-project/gettor/-/merge_requests/10

This was a really good catch.

Replying to arma:

There are probably smart guidelines for avoiding mail loop wars, like not answering names that start with mailer-domain, checking for the presence of an X-Something-Something header, or rate limiting responses to a given address.

I couldn't find any best practices while poking around, so I went with ignoring all emails from mailer-daemon@ addresses. It will work for most cases for now.

We have a rate-limiter in place, but it only ensures that a single user doesn't request links too many times per minute. These means at least one of these auto-generation loop emails is still getting in every minute. I don't want to limit the total requests received from a given email because it's reasonable to expect someone to want to download Tor multiple times in their life (see also #33123).

And this is a great case where unifying how bridgedb handles its email answers, and how gettor does it, will save a lot of headache.

Yeah, from what I can tell it actually looks like bridgedb might handle this by rate limiting (I'll need to look into it further). We might not want to handle this in the same way since bridgedb's reasons for rate limiting (preventing enumeration) are different from gettor's.

But in general I agree that this a case in which it is a pain to repeatedly solve problems for each system separately.

comment:3 Changed 6 months ago by phw

Status: needs_reviewmerge_ready

Looks good to me!

On a slightly related note: I believe that an email's body is supposed to be separated by two (rather than one) newlines from its header. GetTor's unit tests are using only one (and mix \n with \r\n). Python's email module is also confused by this and thinks that the body is part of the To field:

In [1]: from email import message_from_string
In [3]: m=message_from_string("From: MAILER-DAEMON@mx1.riseup.net\nSubject: Undelivered Mail Returned to Sender\r\nTo: gettor@torproject.org\n osx en\n")
In [6]: m.items()
Out[6]: 
[('From', 'MAILER-DAEMON@mx1.riseup.net'),
 ('Subject', 'Undelivered Mail Returned to Sender'),
 ('To', 'gettor@torproject.org\n osx en')]

This seems like something we should fix.

comment:4 in reply to:  3 Changed 6 months ago by cohosh

Replying to phw:

Looks good to me!

Thanks, merged and deployed at 2020-05-23T14:21:52+0000.

On a slightly related note: I believe that an email's body is supposed to be separated by two (rather than one) newlines from its header. GetTor's unit tests are using only one (and mix \n with \r\n). Python's email module is also confused by this and thinks that the body is part of the To field:

Created #34300 for this, thanks!

comment:5 Changed 6 months ago by cohosh

Status: merge_readyassigned

Okay I'm still getting the repeated requests so I looked at the emails in the database and found the following:

gettor@torproject.org|links|windows|is|email|20200523142434|ONHOLD
postmaster@[REDACTED].sk|links|windows|is|email|20200523142435|ONHOLD

So our list was incomplete :) Setting this back to assigned to add GetTor's own email and postmaster@ emails to the list we won't respond to.

comment:6 Changed 6 months ago by cohosh

Status: assignedneeds_review

comment:7 Changed 6 months ago by phw

Status: needs_reviewmerge_ready

Looks good to me! I left a nitpick as comment.

comment:8 Changed 6 months ago by cohosh

Resolution: fixed
Status: merge_readyclosed

Merged and deployed at 2020-05-29T19:04:49+0000.

Seems to be fixed now :)

comment:9 Changed 6 months ago by phw

Resolution: fixed
Status: closedreopened

I'm reopening this because we seem to have overlooked a bug. GetTor chocked earlier today and I found the following exception in its log file:

2020-05-29 19:04:43+0000 [email parser] Error while parsing email content: [Failure instance: Traceback: <class 'KeyError'>: 'command'
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:311:addCallbacks
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:654:_runCallbacks
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:1613:unwindGenerator
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:1529:_cancellableInlineCallbacks
	--- <exception caught here> ---
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:1418:_inlineCallbacks
	/srv/gettor.torproject.org/home/gettor/gettor/parse/email.py:256:parse_callback
	].

When GetTor detects an autoresponder, it returns an empty request dictionary, {}. GetTor then calls parse_callback, which assumes that the given request has the "command" key but that's not the case.

Here's a fix:
https://gitlab.torproject.org/tpo/anti-censorship/gettor-project/gettor/-/merge_requests/13

Version 0, edited 6 months ago by phw (next)

comment:10 Changed 6 months ago by cohosh

Status: reopenedmerge_ready

lgtm. Thanks for catching this!

comment:11 Changed 6 months ago by cohosh

Resolution: fixed
Status: merge_readyclosed

Merged and deployed at 2020-05-30T02:03:11+0000

Note: See TracTickets for help on using tickets.