Opened 7 years ago

Closed 7 years ago

#6484 closed defect (worksforme)

host=*.domain wildcarding broken

Reported by: grarpamp
Owned by: pde
Priority: Medium
Component: HTTPS Everywhere/EFF-HTTPS Everywhere

Description

"
https://www.eff.org/https-everywhere/rulesets

A target may, however, contain a wildcard in one portion of the domain (like *.google.com or google.*, but *.google.* would not work). A wildcard on the left will match arbitrarily deep subdomains (for instance, *.facebook.com will match s-static.ak.facebook.com).

Exception: currently this is not true for a target host that is less than three levels deep. <target host="*.com"> would match thing.com but not very.thing.com. We would consider changing that if anybody needs to use it. <target host="*"> means a ruleset should be tested for every single URL.
"

Sure, all fine. But put host='*.2ld.tld' in the XML config, browse to 4ld.3ld.2ld.tld, and you will not be rewritten. So it's either broken or contradicts the documentation.

I would fix it by sticking to the PCRE syntax (pcre.org / www.regular-expressions.info/pcre.html) that the rules overall seem to conform to, rather than keeping the separately documented hack...
host='.*\.2ld\.tld$' is not so bad... assuming I got this report right, that is.

If not, what regex are the rules written in?

Change History (4)

comment:1 Changed 7 years ago by grarpamp

Trac seems to interpret the leading caret and drop it from the text. The pattern was meant to start with a caret: host='^.*\.2ld\.tld$'

comment:2 Changed 7 years ago by pde

Do you have an example of a ruleset where this is broken?

Note that the <target host> elements in the rulesets are /not/ regular expressions. They have to be static strings so that they can be looked up efficiently in a hash table. The code that processes them is here:

https://gitweb.torproject.org/https-everywhere.git/blob/HEAD:/src/chrome/content/code/HTTPSRules.js#l504

the targets data structure is mostly created here:

https://gitweb.torproject.org/https-everywhere.git/blob/HEAD:/src/chrome/content/code/HTTPSRules.js#l309

The Chromium code is similar, IIRC.
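To illustrate why static strings allow a hash-table lookup, here is a minimal sketch of the idea. It is not the code from HTTPSRules.js: the names `candidateTargets`, `rulesetsFor`, and `targets` are hypothetical, and the exact set of keys tried is an assumption chosen to agree with the documented behaviour quoted in the description (a left wildcard matches deeper subdomains only when the target is at least three levels deep).

```javascript
// Hypothetical helper: list every static key that could match `host`.
function candidateTargets(host) {
  const labels = host.split(".");
  const keys = [host];

  // Replace each label with "*" in turn: a.b.c -> *.b.c, a.*.c, a.b.*
  for (let i = 0; i < labels.length; i++) {
    const copy = labels.slice();
    copy[i] = "*";
    keys.push(copy.join("."));
  }

  // Strip labels from the left so that a left wildcard also matches deeper
  // subdomains. At least two labels always remain after the "*", which is
  // why "*.com" is never tried for "very.thing.com" (assumed behaviour).
  for (let i = 2; i <= labels.length - 2; i++) {
    keys.push("*." + labels.slice(i).join("."));
  }
  return keys;
}

// Hypothetical targets table: static host string -> ruleset names.
const targets = new Map([["*.2ld.tld", ["Example"]]]);

// Look up each candidate key; no regular expression is evaluated here.
function rulesetsFor(host) {
  const found = [];
  for (const key of candidateTargets(host)) {
    for (const name of targets.get(key) || []) {
      if (!found.includes(name)) found.push(name);
    }
  }
  return found;
}
```

Under these assumptions, `rulesetsFor("4ld.3ld.2ld.tld")` finds the ruleset registered under `*.2ld.tld`, while a target of `*.com` would be tried for `thing.com` but not for `very.thing.com`.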

comment:3 Changed 7 years ago by grarpamp

I probably first saw it in a rule update ticket I helped with, like pof.com.
For a simple test you could maybe use host=*.org with a rule from www.torproject.org to www.freebsd.org, and see where browsing to www.torproject.org takes you.
I can follow that source code in some ways, but I am not at that level yet.

comment:4 Changed 7 years ago by pde

Resolution: worksforme
Status: new → closed

This worked for me:

{{{
<ruleset name="Freebsd.org">
  <target host="*.org" />
  <rule from="http://(www\.)?freebsd\.org/" to="https://www.torproject.org/" />
</ruleset>
}}}
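As a rough illustration of the two stages, the target only selects the ruleset, and the rule's `from` pattern is then applied to the URL as a regular expression. The sketch below copies `from` and `to` exactly as they appear in the ruleset above; `applyRule` is a hypothetical helper, not part of HTTPS Everywhere.

```javascript
// Pattern and replacement copied from the ruleset in this comment.
const from = new RegExp("http://(www\\.)?freebsd\\.org/");
const to = "https://www.torproject.org/";

// Hypothetical helper: return the rewritten URL, or null if the rule
// does not apply to this URL.
function applyRule(url) {
  return from.test(url) ? url.replace(from, to) : null;
}
```

With this sketch, `applyRule("http://www.freebsd.org/")` yields the torproject.org URL, and a URL on another .org host is left alone even though the `*.org` target selected the ruleset.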

Note that if you're trying to redirect to HTTP, it won't work unless you add a "downgrade" attribute to the rule, like this:

https://gitweb.torproject.org/https-everywhere.git/blob/3.0development.5:/src/chrome/content/rules/Lenovo.xml#l50
