Opened 3 months ago

Last modified 5 weeks ago

#27836 new defect

RSS feed https authentication

Reported by: atagar
Owned by: qbi
Priority: Medium
Milestone:
Component: Internal Services/Service - trac
Version:
Severity: Normal
Keywords:
Cc: hiro, ahf
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Hi trac admins. I maintain the r2e feed that provides our tor-wiki-changes@ list [1]. It's been broken for a while and today I spent a few hours digging into why.

This list relies on trac's RSS feed [2]. HTTPS looks to be broken on it. In particular...

  • Firefox can view the feed, but unlike other trac pages indicates the site is insecure ('Page Info > Security' says the page is unencrypted and lacks any certificate).
  • Curl responds with a 403...
% curl -s 'https://trac.torproject.org/projects/tor/timeline?wiki=on&format=rss' | grep 'Forbidden'
      Error: Forbidden – Tor Bug Tracker & Wiki
          <h1>Error: Forbidden</h1>
  • Python's urllib (which is what r2e uses) fails with a 403 as well. It can access other trac pages...
import urllib.request
request = urllib.request.Request('https://trac.torproject.org/')
# Fetching the front page succeeds.
urllib.request.urlopen(request)

... but the feed fails with...

import urllib.request
request = urllib.request.Request('https://trac.torproject.org/projects/tor/timeline?wiki=on&format=rss')
# Fetching the feed raises HTTPError (403).
urllib.request.urlopen(request)
% python3 scrap.py 
Traceback (most recent call last):
  File "scrap.py", line 4, in <module>
    urllib.request.urlopen(request)
  File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
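
To see what the server actually sends back rather than stopping at the traceback, the request can be wrapped so that an HTTP error is returned as a status code plus headers. This is only a sketch: `probe` and its `opener` parameter are names made up for illustration and are not part of r2e.

```python
import urllib.error
import urllib.request

FEED_URL = 'https://trac.torproject.org/projects/tor/timeline?wiki=on&format=rss'


def probe(url, opener=urllib.request.urlopen):
    """Return (status, headers) for url.

    HTTP errors such as a 403 are returned as ordinary results instead of
    being raised. `opener` is a hypothetical hook so the function can be
    exercised without touching the network.
    """
    try:
        with opener(urllib.request.Request(url)) as response:
            return response.status, dict(response.headers)
    except urllib.error.HTTPError as exc:
        # HTTPError carries the status code and the response headers.
        return exc.code, dict(exc.headers)
```

Going by the traceback above, `probe(FEED_URL)` would be expected to report a 403 status for the feed, and the returned headers may hint at why the request was refused.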

At this point I'm pretty stumped. Something's broken with the feed and I'm unsure how to work around it. We don't care about encryption on these requests, so if providing a plain HTTP feed is easier than fixing this then I'm happy to go with that too (presently http is a 302 redirect to https).

Thanks!

[1] https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-wiki-changes
[2] https://trac.torproject.org/projects/tor/timeline?wiki=on&format=rss

Change History (2)

comment:1 Changed 5 weeks ago by traumschule

Cc: hiro ahf added

Hi atagar! The feed only works when logged in, so a solution would be to make the script log in first and send a cookie with the RSS request. Unfortunately this is necessary to prevent spam. AFAIK ahf is working on a similar script to scrape trac.
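
A rough sketch of the "send a cookie with the RSS request" half of that suggestion, using only urllib. It assumes a session cookie has already been obtained by logging in some other way; the login step is not shown because this site's login form is not described here. The helper name `authenticated_request` is made up for illustration, and the cookie name `trac_auth` is an assumption that should be checked against what the site actually sets.

```python
import urllib.request

FEED_URL = 'https://trac.torproject.org/projects/tor/timeline?wiki=on&format=rss'


def authenticated_request(url, session_cookie, cookie_name='trac_auth'):
    """Build a Request that carries an existing login cookie.

    `cookie_name` defaults to 'trac_auth', which is an assumption about
    the cookie this trac instance uses for logged-in sessions.
    """
    request = urllib.request.Request(url)
    request.add_header('Cookie', '{}={}'.format(cookie_name, session_cookie))
    return request
```

The returned object can be passed to `urllib.request.urlopen()` in place of the plain request from the ticket description.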

comment:2 Changed 5 weeks ago by atagar

Ah! Thanks traumschule. I could log in if this was my own script, but since this is r2e it's gonna require me to adjust their codebase a bit.

Can we drop the login requirement on a per-page basis? The RSS feed is read-only and as such there's no risk of spam. It sounds like this is simply collateral damage from spam elsewhere causing a site-wide restriction.
