Provide an index.json file on Tor Metrics containing stats files
We have discussed separating the data-aggregating part of metrics-web from the website part in the past. Here's a plan to make this happen:
- We provide a new index file on Tor Metrics listing all stats files specified on the Statistics page, including path, size, and last-modified time. Example (with just a single file):
  {
    "index_created": "2017-08-21 13:10",
    "path": "https://metrics.torproject.org",
    "directories": [
      {
        "path": "stats",
        "files": [
          {
            "path": "servers.csv",
            "size": 4794794,
            "last_modified": "2017-08-21 00:29"
          }
        ]
      }
    ]
  }
- The new index file will be available under https://metrics.torproject.org/index/index.json (does not exist yet), as well as in compressed form with .gz, .xz, etc. suffixes.
- The new file will be written right after running the periodic update twice per day as part of this script.
- We might even include an "implementation_version" field as discussed in #21414 (moved).
- We start using that file by putting a new table at the top of the Statistics page that lists all available files together with their size, last update time, and a link to their specification, like a table of contents. So far so good, but this alone is not yet worth the effort. That comes next!
- In the next step we write a little internal downloader that is part of the website part of metrics-web. That downloader periodically fetches the index.json file to see if there are updates to stats files. If there are, it downloads these files and stores them locally for rserve to produce new graphs based on the new data.
- Now we can set up a second metrics-web instance somewhere that has the sole purpose of aggregating data. We might want to call it https://metrics2.torproject.org/ (or some other name, if we can settle on one). We point the periodic downloader to that host and fetch newly updated CSV files from there. And we turn off data-aggregating modules on the actual Tor Metrics website host. (Maybe it's easier to find a smaller host for the website and move that part, while keeping the data-aggregating parts in place. Whatever.)
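To illustrate the index-writing step, here is a minimal sketch of how such a file could be produced after the periodic update. This is not the metrics-web implementation: the function names `build_index` and `write_index`, the choice of Python, and the use of UTC for timestamps are all assumptions made for illustration. Only the field names and the timestamp layout are taken from the example above.

```python
import json
import os
from datetime import datetime, timezone

# Timestamp layout copied from the example index; UTC is an assumption.
TIME_FORMAT = "%Y-%m-%d %H:%M"


def format_time(epoch_seconds):
    """Format seconds since the epoch like the example, e.g. '2017-08-21 00:29'."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime(TIME_FORMAT)


def build_index(base_dir, base_url, subdirs=("stats",), now=None):
    """Describe all regular files in the given subdirectories of base_dir.

    Hypothetical helper: `subdirs` and `now` are illustration-only parameters.
    """
    if now is None:
        now = datetime.now(timezone.utc).timestamp()
    directories = []
    for subdir in subdirs:
        dir_path = os.path.join(base_dir, subdir)
        files = []
        for name in sorted(os.listdir(dir_path)):
            full_path = os.path.join(dir_path, name)
            if not os.path.isfile(full_path):
                continue  # skip nested directories in this sketch
            info = os.stat(full_path)
            files.append({
                "path": name,
                "size": info.st_size,
                "last_modified": format_time(info.st_mtime),
            })
        directories.append({"path": subdir, "files": files})
    return {
        "index_created": format_time(now),
        "path": base_url,
        "directories": directories,
    }


def write_index(index, out_path):
    """Write to a temporary file first so readers never see a partial index."""
    tmp_path = out_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(index, handle, indent=2)
    os.replace(tmp_path, out_path)
```

Writing to a temporary file and renaming it means a downloader that polls the index at the wrong moment still gets either the old or the new version, never a truncated one. The compressed variants could be produced from the same output file in a follow-up step.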
Does this make sense?
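To make the downloader step more concrete, here is a rough sketch of the update check it could perform. Everything here is an assumption for discussion, not existing metrics-web code: the names `list_remote_files`, `find_updates`, and `sync`, the local state file, and the rule that a file URL is the top-level "path" joined with the directory and file paths. A file counts as updated when its size or last-modified time differs from what was recorded on the previous run.

```python
import json
import os
import shutil
import urllib.request


def list_remote_files(index):
    """Yield (relative_path, url, size, last_modified) for each indexed file.

    Assumes URLs are formed as <index path>/<directory path>/<file path>.
    """
    base_url = index["path"].rstrip("/")
    for directory in index.get("directories", []):
        for entry in directory.get("files", []):
            relative = directory["path"] + "/" + entry["path"]
            yield (relative, base_url + "/" + relative,
                   entry["size"], entry["last_modified"])


def find_updates(index, known):
    """Return the files whose size or last_modified differs from `known`.

    `known` maps relative paths to [size, last_modified] from the last run.
    """
    updates = []
    for relative, url, size, last_modified in list_remote_files(index):
        if known.get(relative) != [size, last_modified]:
            updates.append((relative, url, size, last_modified))
    return updates


def sync(index_url, local_dir, state_path):
    """Fetch the index, download changed files, and record the new state."""
    known = {}
    if os.path.exists(state_path):
        with open(state_path, encoding="utf-8") as handle:
            known = json.load(handle)
    with urllib.request.urlopen(index_url, timeout=60) as response:
        index = json.load(response)
    for relative, url, size, last_modified in find_updates(index, known):
        target = os.path.join(local_dir, relative)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        tmp_path = target + ".tmp"
        with urllib.request.urlopen(url, timeout=60) as response, \
                open(tmp_path, "wb") as out:
            shutil.copyfileobj(response, out)
        os.replace(tmp_path, target)  # swap in the complete file only
        known[relative] = [size, last_modified]
    with open(state_path, "w", encoding="utf-8") as handle:
        json.dump(known, handle)
```

Pointing the downloader at a different aggregating host would then only mean changing `index_url`, provided the index on that host carries its own base URL in "path".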