Opened 2 weeks ago

Last modified 31 hours ago

#29425 needs_revision enhancement

Write integration tests for data-processing modules

Reported by: karsten
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Statistics
Version:
Severity: Normal
Keywords: metrics-roadmap-2019-q2
Cc: metrics-team
Actual Points:
Parent ID:
Points: 8
Reviewer: irl
Sponsor:

Description

We discussed in Brussels that we'll need at least integration tests for metrics-web in order to make code changes like the Java 8 Date/Time API update.

I started working on this. Here's what I did:

  • Picked a small set of descriptors as test data, sufficient to produce at least some output as .csv files.
  • Wrote a script that runs all data-processing modules.
  • Ran the script once to get the output that we would expect from future test runs.

The result is too big (IMHO) to add to the Git repository. That's why I uploaded it here:

https://people.torproject.org/~karsten/volatile/metrics-web-integ-tests.tar

shasum -a 256 metrics-web-integ-tests.tar 
728c4e4ee184f2260cd30286f2925aa627d7b3d572a236b646021cc1b461de10  metrics-web-integ-tests.tar

It would be great if somebody else besides me tries this out and verifies that their run produces the same output.
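
One way to do that verification is to compare the freshly generated .csv files against the expected ones, file by file. A minimal sketch, assuming both sets of files sit in two directories; the function name `compare_csv_output` and the directory names in the usage line are placeholders and are not taken from the actual script:

```shell
# Hypothetical helper: compare a directory of freshly generated output
# against a directory holding the expected output from an earlier run.
compare_csv_output() {
  expected_dir="$1"
  actual_dir="$2"
  # diff -r walks both trees and reports differing or missing files;
  # it exits non-zero if anything differs.
  if diff -r "$expected_dir" "$actual_dir"; then
    echo "output matches expected results"
  else
    echo "output differs from expected results" >&2
    return 1
  fi
}

# Usage (placeholder directory names):
# compare_csv_output expected/ actual/
```

A byte-for-byte comparison like this only works if the modules produce deterministic output, for example stable row ordering in the .csv files.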

A good next step after that would be to talk about where/how to put this under version control. If that's impossible, we might be able to reduce the test data size a bit more, but maybe not as substantially as we'd want.

Oh, and we could probably fetch libs from Debian rather than shipping them. I didn't bother for now.

Change History (5)

comment:1 Changed 2 weeks ago by karsten

Status: changed from new to needs_review

comment:2 Changed 12 days ago by irl

Keywords: metrics-roadmap-2019-q2 added
Points: 8

comment:3 in reply to:  description Changed 2 days ago by karsten

Replying to karsten:

A good next step after that would be to talk about where/how to put this under version control. If that's impossible, we might be able to reduce the test data size a bit more, but maybe not as substantially as we'd want.

I gave this some more thought. How about we create a new metrics-test.git repository for this? Testing metrics-web could be the start, and we could easily extend the tests to other metrics code bases in the future. To be clear, unit tests would still live in the respective repositories; this would just be for integration/system tests.

comment:4 Changed 2 days ago by irl

Reviewer: irl

Planned for review party tomorrow.

comment:5 Changed 31 hours ago by irl

Status: changed from needs_review to needs_revision

git-annex is what I have used in the past for storing large test data in Git repositories. Our Git server does not currently support this, but we could look at adding support. If it requires us to upgrade gitolite, then that is not an easy project; if we are already running a new enough version, then it could just be a few lines in the config.

I looked at running the script:

Cloning metrics-web Git repository into metrics-web/ subdirectory...
Cloning into 'metrics-web'...
remote: Counting objects: 14553, done.
remote: Compressing objects: 100% (1213/1213), done.
remote: Total 14553 (delta 977), reused 503 (delta 233)
Receiving objects: 100% (14553/14553), 16.55 MiB | 2.64 MiB/s, done.
Resolving deltas: 100% (8303/8303), done.
Bootstrapping development environment...
Submodule 'src/build' (https://git.torproject.org/metrics-base.git) registered for path 'src/build'
Submodule 'src/submods/metrics-lib' (https://git.torproject.org/metrics-lib.git) registered for path 'src/submods/metrics-lib'
Cloning into '/tmp/metrics-web-integ-tests/metrics-web/src/build'...
Cloning into '/tmp/metrics-web-integ-tests/metrics-web/src/submods/metrics-lib'...
Submodule path 'src/build': checked out 'e639c697e9e94c6dbb26e946e5247c20a62c0661'
Submodule path 'src/submods/metrics-lib': checked out '23927c2777f273c42ad3e75fc0a2940ed8eb4bf6'
Submodule 'src/build' (https://git.torproject.org/metrics-base) registered for path 'src/build'
Cloning into '/tmp/metrics-web-integ-tests/metrics-web/src/submods/metrics-lib/src/build'...
Submodule path 'src/build': checked out 'e639c697e9e94c6dbb26e946e5247c20a62c0661'
Replacing absolute paths in build.xml with relative paths...
sed: can't read s/.srv.metrics.torproject.org.metrics/./: No such file or directory
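
The sed error above is what GNU sed prints when it is called in the BSD/macOS form `sed -i '' 's/.../.../' file`: GNU sed takes the empty string as the script and the substitution expression as a file name. That is only a guess at the cause, since the script itself is not shown here. Assuming that is what happened, one form that both implementations accept is to attach a backup suffix directly to `-i` and delete the backup afterwards. The function name below is hypothetical; the substitution expression is copied from the error message, and build.xml is the file named in the preceding log line.

```shell
# Hypothetical helper: in-place substitution that works with GNU and BSD sed.
# Both accept a backup suffix attached directly to -i (no space in between).
replace_absolute_paths() {
  file="$1"
  sed -i.bak -e 's/.srv.metrics.torproject.org.metrics/./' "$file" &&
    rm -f "$file.bak"
}

# Usage:
# replace_absolute_paths build.xml
```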

I like the concept though. This is the sort of thing we could run in CI or Vagrant, so that it always runs in a disposable environment.
