Opened 6 years ago

Closed 5 years ago

Last modified 5 years ago

#10174 closed enhancement (fixed)

Ruleset bloat -> memory usage, startup time. Replace by HTTPSF

Reported by: Faziri
Owned by: pde
Priority: Medium
Milestone:
Component: HTTPS Everywhere/EFF-HTTPS Everywhere
Version:
Severity:
Keywords:
Cc: marnick.leau@…
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Problem:
HTTPS Everywhere consumes over 30 MB of RAM according to WMI and over 10 MB according to about:memory, and browser startup is noticeably slower with it installed.

Cause:
The default ruleset is absolutely massive: over 3 MB of XML.
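
For a sense of scale, each covered site is described by a small XML ruleset along these lines (an illustrative entry, not copied from the shipped file), and the default ruleset bundles thousands of them:

{{{
<!-- Illustrative ruleset entry, not taken from the actual default ruleset -->
<ruleset name="Example Site">
  <target host="example.com" />
  <target host="www.example.com" />
  <rule from="^http://(www\.)?example\.com/"
        to="https://www.example.com/" />
</ruleset>
}}}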

Suggestion:
Remove the default ruleset. Establish tighter cooperation with HTTPS Finder to compensate, or even implement HTTPSF's functionality directly in HTTPSE.

Reasoning:
1) It has the same problem as Adblock Plus' EasyList and everything else based on community rules stored client-side: 99.99% of the rules go unused and only bog down the client without ever being useful to an individual user. It's messy, inefficient and not very helpful, since many of a user's regular sites aren't in the ruleset anyway.
2) Replacing the bundled rulesets with HTTPS Finder cooperation means users keep ONLY the rulesets for the sites they themselves visit, making the ruleset 100% efficient at the measly cost of a single button click per new website.

In fact, why not just integrate HTTPS Finder (or its functionality anyway) into HTTPSE? And while you're at it, why not upload it to AMO so we can get updates and everything?

HTTPSF broke in the latest Firefox/Pale Moon and there hasn't been any activity on the code to fix it for three months. HTTPSF not working means HTTPSE not working for many people who browse a lot. Essentially, HTTPSE partially depends on HTTPSF on the user end, and merging the two would improve the reliability of the whole concept.

Please consider it; I miss these two a lot, but HTTPSF does not work and HTTPSE is just a bloated mess right now.

Child Tickets

Change History (5)

comment:1 Changed 6 years ago by arma

Component: - Select a component → EFF-HTTPS Everywhere
Owner: set to pde

comment:2 Changed 6 years ago by zyan

Ugh, this is a problem. You are not the first to complain about the memory issue.

I guess the way to combine us with HTTPS Finder would be to mark some set of the existing stable rules as important enough to be included in every HTTPS Everywhere download. Then when a user encounters a new host that isn't in the ruleset, HTTPS Finder checks if it can force HTTPS, generates a new rule, and adds it to the ruleset.
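
A rough sketch of how that combined flow could work, in TypeScript-style pseudocode (all names here, including probeHttps, the Ruleset shape, and the two maps, are assumptions for illustration, not actual HTTPSE or HTTPS Finder APIs):

{{{
// Hypothetical sketch of the proposed HTTPSE + HTTPS Finder flow (TypeScript).
// None of these functions or types are real extension APIs.

interface Ruleset {
  targetHost: string;
  rewrite: (url: string) => string;
}

const stableRulesets = new Map<string, Ruleset>(); // shipped "important" rules
const userRulesets = new Map<string, Ruleset>();   // rules generated locally

async function onNavigate(url: URL): Promise<void> {
  const host = url.hostname;
  if (stableRulesets.has(host) || userRulesets.has(host)) {
    return; // host already covered, rewrite as usual
  }
  // HTTPS Finder step: probe whether the host also answers over HTTPS.
  if (await probeHttps(host)) {
    userRulesets.set(host, {
      targetHost: host,
      rewrite: (u) => u.replace(/^http:/, "https:"),
    });
  }
}

// Assumed helper: check that the HTTPS version of the site responds at all.
async function probeHttps(host: string): Promise<boolean> {
  try {
    const resp = await fetch(`https://${host}/`, { method: "HEAD" });
    return resp.ok;
  } catch {
    return false;
  }
}
}}}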

This brings up some questions:

  • If HTTPS Finder finds that a domain doesn't support HTTPS, does it test it again in the future? If not, for how long does it remember test results?
  • What happens when a new rule gets added to the stable HTTPS Everywhere ruleset, and someone already has another version of that rule that was generated with HTTPS Finder? Do we overwrite their version when they upgrade?
  • One of the features of HTTPS Everywhere currently is that it doesn't leak information about someone's browsing history by default. This is not true of HTTPS Finder. There you have the choice of either keeping a persistent file with a list of every HTTPS site that you've visited or deleting this file periodically and recreating all of your custom rules from scratch.

BTW, there are potentially easier ways to decrease memory usage that would work for the near future. I'm testing putting as many HTTPS Everywhere rules as possible into the browser's HSTS list and seeing if that cuts down memory usage dramatically.
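
Roughly, the candidates for that are the rules that do nothing more than "always upgrade this exact host to HTTPS"; a small illustrative filter (the SimpleRule shape is assumed, not the real HTTPSE data model):

{{{
// Hypothetical filter (TypeScript): pick out rulesets simple enough to be
// expressed as an HSTS entry instead of an in-extension rewrite rule.

interface SimpleRule {
  targetHost: string;
  from: string;
  to: string;
}

function hstsCandidates(rules: SimpleRule[]): string[] {
  return rules
    .filter((r) => r.from === "^http:" && r.to === "https:")
    .map((r) => r.targetHost);
}

// Each candidate host would then be treated as if it had sent
//   Strict-Transport-Security: max-age=31536000
// so the browser's HSTS store, rather than the extension, performs the upgrade.
}}}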

Regarding AMO, we haven't put ourselves in the catalog in the past because EFF's privacy policy is more protective than Mozilla's.

-Yan

comment:3 Changed 6 years ago by Faziri

It's not just the RAM usage (though lower is always better), but mostly the performance impact of parsing so many rules and keeping them in memory. Installing EasyList in Adblock Plus is a comparable action (even though EasyList isn't half the size), and even that creates massive lag.

Shortening the filter list would definitely help. A very small list of filters covering only the most common domains, which practically everyone is sure to visit from time to time, would be great.

1) I'd say maybe a week or so. Keep a map of domain → date entries that acts as a whitelist and delete entries older than a week (see the sketch after this list).
2) Overwrite a user's rules by default; the HTTPSE developers and list managers can be expected to know better when it comes to writing the most useful/correct filters. The list should be kept small enough that an update to the built-in filters can be shown to the user, so that (s)he can opt out of overwriting specific rules ("Select which rules you'd like to overwrite with the latest update: [list with checkboxes]"). Conflicts between the update and existing filters can be detected and shown.
3) Why does it keep a list of all sites you've visited? All it should keep is the whitelist of domains that (temporarily) don't need to be checked for HTTPS and the .xml user rules it creates. Acquiring either list can reveal part of your browsing history, but that seems an odd thing to be concerned about: you could record someone's entire browsing history anyway just by monitoring the destinations of their traffic. Anyone close enough to get hold of those lists in the add-on is more than close enough to eavesdrop on the destination IPs.
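
The whitelist from point 1 could be as simple as this sketch (the one-week window and all names are illustrative, not taken from either extension):

{{{
// Minimal sketch (TypeScript) of the domain → last-failed-test map from point 1.

const WEEK_MS = 7 * 24 * 60 * 60 * 1000;
const noHttpsSeen = new Map<string, number>(); // host -> timestamp of failed probe

function recordFailure(host: string): void {
  noHttpsSeen.set(host, Date.now());
}

function shouldProbeAgain(host: string): boolean {
  const last = noHttpsSeen.get(host);
  if (last === undefined) return true;   // never tested before
  if (Date.now() - last > WEEK_MS) {
    noHttpsSeen.delete(host);            // entry older than a week: test again
    return true;
  }
  return false;                          // tested recently, skip the probe
}
}}}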

Adding the large filter list to the browser's HSTS list probably won't help much performance-wise: it still has the same problem of being huge and 99% irrelevant to the individual user.

Not meaning to shoot you down or anything, just pointing out what I think of it. :)

Last edited 6 years ago by Faziri

comment:4 Changed 5 years ago by pde

Resolution: fixed
Status: new → closed

HTTPS Finder produces incorrect rulesets that break thousands of websites, so there's no way we can ship it.

However, the startup time and RAM usage issues were real. Git master has a new SQLite ruleset storage engine implemented by jsha. It works very well at our current size (10K rulesets) and should scale further.
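
As a rough illustration of why that helps (the schema, table name, and db interface below are guesses for illustration, not the actual implementation): targets become indexed rows that are looked up per request host on demand, instead of a multi-megabyte XML blob parsed at startup.

{{{
// Illustrative only (TypeScript + SQL strings); not the real storage engine.

const schema = `
  CREATE TABLE IF NOT EXISTS targets (
    host    TEXT PRIMARY KEY,   -- e.g. "www.example.com" or "*.example.com"
    ruleset TEXT NOT NULL       -- serialized ruleset body, parsed lazily
  );
`;

// Hypothetical db interface: one indexed lookup per request host.
async function rulesetFor(
  db: { get(sql: string, params: string[]): Promise<{ ruleset: string } | undefined> },
  host: string
): Promise<string | undefined> {
  const row = await db.get("SELECT ruleset FROM targets WHERE host = ?", [host]);
  return row?.ruleset;
}
}}}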

comment:5 in reply to: 4 Changed 5 years ago by Faziri

Replying to pde:

HTTPS Finder produces incorrect rulesets that break thousands of websites, so there's no way we can ship it.

However, the startup time and RAM usage issues were real. Git master has a new SQLite ruleset storage engine implemented by jsha. It works very well at our current size (10K rulesets) and should scale further.

The only site I've ever had that problem on is DeviantArt; it works flawlessly on every other website.
