Opened 5 months ago
Last modified 4 months ago
#27836 new defect
RSS feed https authentication
| Reported by: | atagar | Owned by: | qbi |
|---|---|---|---|
| Priority: | Medium | Milestone: | |
| Component: | Internal Services/Service - trac | Version: | |
| Severity: | Normal | Keywords: | |
| Cc: | hiro, ahf | Actual Points: | |
| Parent ID: | | Points: | |
| Reviewer: | | Sponsor: | |
Description
Hi trac admins. I maintain the r2e feed that provides our tor-wiki-changes@ list [1]. It's been broken for a while and today I spent a few hours digging into why.
This list relies on trac's RSS feed [2]. HTTPS looks to be broken on it. In particular...
- Firefox can view the feed, but unlike other trac pages indicates the site is insecure ('Page Info > Security' says the page is unencrypted and lacks any certificate).
- Curl responds with a 403...
    % curl -s 'https://trac.torproject.org/projects/tor/timeline?wiki=on&format=rss' | grep 'Forbidden'
    Error: Forbidden – Tor Bug Tracker & Wiki
    <h1>Error: Forbidden</h1>
- Python's urllib (which is what r2e uses) fails with a 403 as well. It can access other trac pages...
    import urllib.request

    request = urllib.request.Request('https://trac.torproject.org/')
    urllib.request.urlopen(request)
... but the feed fails with...
    import urllib.request

    request = urllib.request.Request('https://trac.torproject.org/projects/tor/timeline?wiki=on&format=rss')
    urllib.request.urlopen(request)

    % python3 scrap.py
    Traceback (most recent call last):
      File "scrap.py", line 4, in <module>
        urllib.request.urlopen(request)
      File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/lib/python3.6/urllib/request.py", line 532, in open
        response = meth(req, response)
      File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/lib/python3.6/urllib/request.py", line 570, in error
        return self._call_chain(*args)
      File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
        result = func(*args)
      File "/usr/lib/python3.6/urllib/request.py", line 650, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 403: Forbidden
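For anyone reproducing this, a minimal sketch that catches the error and prints the status and a response header instead of the raw traceback may make the failure easier to compare across URLs. The helper names `describe_http_error` and `fetch` are hypothetical and not part of r2e; only standard `urllib` calls are used.

```python
import urllib.error
import urllib.request

FEED_URL = 'https://trac.torproject.org/projects/tor/timeline?wiki=on&format=rss'


def describe_http_error(url, err):
    """Summarize an HTTPError as one line (hypothetical helper, for logging)."""
    headers = err.headers
    content_type = headers.get('Content-Type', 'unknown') if headers is not None else 'unknown'
    return '%s -> HTTP %d %s (Content-Type: %s)' % (url, err.code, err.reason, content_type)


def fetch(url, timeout=30):
    """Return the response body, or None if the server answers with an HTTP error."""
    request = urllib.request.Request(url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # A 403 here means the server refused the request; it is not a TLS failure.
        print(describe_http_error(url, err))
        return None


if __name__ == '__main__':
    fetch(FEED_URL)
```

Running it against both the front page and the feed URL should show whether the 403 is specific to the feed.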
At this point I'm pretty stumped. Something's broken with the feed and I'm unsure how to work around it. We don't care about encryption on these requests so if providing a plain http feed is easier than fixing this then happy to go with that too (presently http is a 302 to https).
Thanks!
[1] https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-wiki-changes
[2] https://trac.torproject.org/projects/tor/timeline?wiki=on&format=rss
Change History (2)
comment:1 Changed 4 months ago by
Cc: hiro, ahf added
comment:2 Changed 4 months ago by
Hi atagar! The feed only works when logged in, so one solution would be to make the script log in first and send a cookie with the RSS request. Unfortunately this is necessary to prevent spam. AFAIK ahf is working on a similar script to scrape trac.
Ah! Thanks traumschule. I could log in if this were my own script, but since this is r2e it's gonna require me to adjust their codebase a bit.
Can we drop the login requirement on a per-page basis? The RSS feed is read-only, so there's no risk of spam. It sounds like this is simply collateral damage from spam elsewhere causing a site-wide restriction.
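The log-in-first approach suggested in this thread could look roughly like the sketch below. Nothing in the ticket confirms the details, so treat these as assumptions to check against the actual login page: the `/login` path, the form field names `user` and `password`, and the hidden `__FORM_TOKEN` field. The helper names are hypothetical.

```python
import http.cookiejar
import re
import urllib.parse
import urllib.request

# ASSUMPTIONS: login path and form field names are guesses, not taken from the ticket.
BASE_URL = 'https://trac.torproject.org/projects/tor'
LOGIN_URL = BASE_URL + '/login'
FEED_URL = BASE_URL + '/timeline?wiki=on&format=rss'


def extract_form_token(html):
    """Return the hidden form token from the login page, or None if absent."""
    match = re.search(r'name="__FORM_TOKEN"\s+value="([^"]+)"', html)
    return match.group(1) if match else None


def build_login_body(user, password, token=None):
    """URL-encode the login form fields (field names are assumed)."""
    fields = {'user': user, 'password': password}
    if token:
        fields['__FORM_TOKEN'] = token
    return urllib.parse.urlencode(fields).encode('ascii')


def fetch_feed(user, password, timeout=30):
    """Log in, keep the session cookie in a jar, then request the feed with it."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

    # Load the login page first so any token cookie and hidden field are available.
    with opener.open(LOGIN_URL, timeout=timeout) as page:
        token = extract_form_token(page.read().decode('utf-8', 'replace'))

    # Submit the credentials; the opener stores whatever cookies the server sets.
    with opener.open(LOGIN_URL, data=build_login_body(user, password, token), timeout=timeout):
        pass

    # The same opener now sends those cookies along with the feed request.
    with opener.open(FEED_URL, timeout=timeout) as feed:
        return feed.read()
```

Using one opener with an `HTTPCookieProcessor` keeps cookie handling out of the calling code, which may make it easier to slot into an existing fetcher.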