Opened 2 years ago

Closed 13 months ago

#23285 closed enhancement (wontfix)

Provide an index.json file on Tor Metrics containing stats files

Reported by: karsten
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Statistics
Version:
Severity: Normal
Keywords: metrics-2018
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

We have previously discussed separating the data-aggregating part of metrics-web from the website part. Here's a plan to make this happen:

  • We provide a new index file on Tor Metrics containing all stats files specified on the Statistics page, including path, size, and last-modified time. Example (with just a single file):
{
  "index_created": "2017-08-21 13:10",
  "path": "https://metrics.torproject.org",
  "directories": [
    {
      "path": "stats",
      "files": [
        {
          "path": "servers.csv",
          "size": 4794794,
          "last_modified": "2017-08-21 00:29"
        }
      ]
    }
  ]
}
  • The new index file will be available under https://metrics.torproject.org/index/index.json (does not exist yet), as well as in compressed variants (.gz, .xz, etc.).
  • The new file will be written right after running the periodic update twice per day as part of this script.
  • We might even include an "implementation_version" field as discussed in #21414.
  • We start using that file by putting a new table at the top of the Statistics page that lists all available files together with their size, last update time, and a link to their specification, like a table of contents. So far, this alone is not yet worth the effort. That comes next!
  • In the next step we write a little internal downloader that is part of the website part of metrics-web. That downloader periodically fetches the index.json file to see if there are updates to stats files. If there are, it downloads these files and stores them locally for rserve to produce new graphs based on the new data.
  • Now we can set up a second metrics-web instance somewhere that has the sole purpose of aggregating data. We might want to call it https://metrics2.torproject.org/ (or some other name, if we can settle on one). We point the periodic downloader to that host and fetch newly updated CSV files from there. And we turn off data-aggregating modules on the actual Tor Metrics website host. (Maybe it's easier to find a smaller host for the website and move that part, while keeping the data-aggregating parts in place. Whatever.)
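
To make the plan above a bit more concrete, here is a minimal Python sketch of the two halves: writing the index on the aggregating side and deciding what to re-fetch on the website side. The field names come from the example above. Everything else is an assumption for illustration only: the function names build_index and files_to_fetch are hypothetical, timestamps are assumed to be UTC, and the "known" mapping stands in for whatever state the downloader would keep between runs. The actual download and the hand-off to rserve are left out.

```python
import os
from datetime import datetime, timezone

# Assumed timestamp format, copied from the example index ("2017-08-21 13:10").
TIME_FORMAT = "%Y-%m-%d %H:%M"


def build_index(base_url, root, directory="stats"):
    """Build the index structure for all regular files in root/directory."""
    files = []
    dir_path = os.path.join(root, directory)
    for name in sorted(os.listdir(dir_path)):
        full_path = os.path.join(dir_path, name)
        if not os.path.isfile(full_path):
            continue  # skip subdirectories and other non-files
        st = os.stat(full_path)
        # Assumption: last-modified times are reported in UTC.
        modified = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
        files.append({
            "path": name,
            "size": st.st_size,
            "last_modified": modified.strftime(TIME_FORMAT),
        })
    return {
        "index_created": datetime.now(timezone.utc).strftime(TIME_FORMAT),
        "path": base_url,
        "directories": [{"path": directory, "files": files}],
    }


def files_to_fetch(index, known):
    """Return (url, entry) pairs for files that are new or have changed.

    `known` is a hypothetical local state mapping each URL to the
    (size, last_modified) pair seen on the previous run.
    """
    updates = []
    for directory in index["directories"]:
        for entry in directory["files"]:
            url = "/".join([index["path"], directory["path"], entry["path"]])
            if known.get(url) != (entry["size"], entry["last_modified"]):
                updates.append((url, entry))
    return updates
```

Serializing the result of build_index with json.dumps(index, indent=2) gives the layout from the example. Comparing both size and last-modified time is just one possible change check; a checksum field would be more robust but is not part of the proposal.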

Does this make sense?

Change History (5)

comment:1 Changed 2 years ago by karsten

Component: Metrics/Website → Metrics/Statistics

Moving all tickets that relate to the data-aggregating modules rather than the website parts of metrics-web to Metrics/Statistics.

comment:2 Changed 2 years ago by karsten

Keywords: metrics-2018 added

comment:3 Changed 2 years ago by karsten

Keywords: metrics-2017 added; metrics-2018 removed

comment:4 Changed 21 months ago by iwakeh

Keywords: metrics-2018 added; metrics-2017 removed

Will be completed in 2018.

comment:5 Changed 13 months ago by karsten

Resolution: wontfix
Status: new → closed

With #27000 this ticket doesn't really make sense anymore. Closing.
