Come up with a better interface between metrics-db and metrics-web
We should rethink the interface between metrics-db and metrics-web to make it easier for other people to create a better metrics or TorStatus-like website.
The current distinction is mostly that metrics-db contains all parts that need to be run periodically, triggered by the operating system, whereas metrics-web has the code to display the website, triggered by web users. The main interface between the two is the database schema.
The current distinction is not ideal, because changes to the website affect both metrics-db (for the database schema) and metrics-web (for the display parts). We should move the database into metrics-web and stop using it as the interface between the two. A better interface would be the various descriptor formats that we use for metrics. Using files as the interface also makes more sense because what we publish to the world is tarballs, not a database. Today, someone who wants to set up a metrics website needs to set up metrics-db to import descriptors into a database and metrics-web for the website part. It would be better to give them the tarballs and just metrics-web. If they don't like the resulting website, they can easily change or replace metrics-web and do their own thing.
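To illustrate what "tarballs as the interface" could mean for a third party, here is a minimal Python sketch that reads raw descriptor files straight out of a tarball without any database in between. The function name `iter_descriptors` and the idea of handing back raw bytes are assumptions made for illustration; nothing here reflects existing metrics-db or metrics-web code, and the tarball layout is not assumed beyond "regular files inside an archive".

```python
import tarfile
from typing import Iterator, Tuple


def iter_descriptors(tarball_path: str) -> Iterator[Tuple[str, bytes]]:
    """Yield (member name, raw bytes) for every regular file in a tarball.

    Hypothetical helper: a consumer such as metrics-web would pass the
    bytes on to whatever descriptor parser it uses.
    """
    # "r:*" lets tarfile detect the compression transparently.
    with tarfile.open(tarball_path, "r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links and other special entries
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            with extracted:
                yield member.name, extracted.read()
```

A website implementation could loop over `iter_descriptors("some-archive.tar.bz2")` (the file name is a placeholder) and decide for itself how to store or index the contents.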
Here's the proposed distinction: metrics-db shall do all the data processing and sanitizing and metrics-web shall import descriptors into a database and run the website. In particular, they would perform the following tasks:
metrics-db:
- read relay descriptors from a local Tor data directory and/or weasel's directory archive script, and download missing relay descriptors from the directory authorities
- read and sanitize bridge descriptors and bridge pool assignments
- download and archive exit lists
- download and archive GetTor files
- download and archive Torperf files
- write all output files to the local file system
metrics-web:
- read input files from the local file system
- import relay descriptors into a database
- provide relay search capability
- provide ExoneraTor capability
- process relay and bridge descriptors for daily user statistics
- process Torperf files for Torperf statistics
- process GetTor files for GetTor statistics
- display customizable graphs
- provide CSV files
- make tarballs for download
- analyze current consensus and votes for possible voting process problems
- provide a list of currently running relays
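The metrics-web side of the split above ("read input files from the local file system" and "import relay descriptors into a database") could be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the table name `descriptor_file`, the function `import_descriptor_files`, and the use of SQLite are all hypothetical, and the sketch stores raw file contents rather than parsing descriptors into a real schema. The point is that the database is created and owned entirely by the importing side, so metrics-db only needs to write files.

```python
import hashlib
import os
import sqlite3


def import_descriptor_files(input_dir: str, connection: sqlite3.Connection) -> int:
    """Import every not-yet-seen file below input_dir; return how many were added.

    Hypothetical sketch: files are keyed by the SHA-1 of their content, so
    running the import again on the same directory adds nothing.
    """
    # The schema lives on the importing side only (assumed table layout).
    connection.execute(
        "CREATE TABLE IF NOT EXISTS descriptor_file ("
        " sha1 TEXT PRIMARY KEY, path TEXT NOT NULL, content BLOB NOT NULL)"
    )
    changes_before = connection.total_changes
    for directory, _subdirs, filenames in os.walk(input_dir):
        for filename in sorted(filenames):
            path = os.path.join(directory, filename)
            with open(path, "rb") as handle:
                content = handle.read()
            digest = hashlib.sha1(content).hexdigest()
            # OR IGNORE skips files whose content was imported earlier.
            connection.execute(
                "INSERT OR IGNORE INTO descriptor_file (sha1, path, content)"
                " VALUES (?, ?, ?)",
                (digest, os.path.relpath(path, input_dir), content),
            )
    connection.commit()
    return connection.total_changes - changes_before
```

With this shape, a periodic job on the website host could call `import_descriptor_files` on the directory that metrics-db writes to, and changes to the schema would never touch metrics-db.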
If we want to implement this ticket, a fair amount of code will move from metrics-db to metrics-web. I might start on this unless someone tells me it's a really stupid idea.