Is non-determinism in test helper deployment or MLab-ns API acceptable?
This issue was automatically migrated from github issue https://github.com/TheTorProject/ooni-probe/issues/118.
Close this ticket with a yes / no.
The MLab initialize.sh script for Ooni randomly selects which test helper binds to a given port. The requirement is for the same port to provide multiple distinct test helpers, so the current strategy is to partition the MLab slices (and thus IP addresses) for each port according to how many helpers require that port. The random selection accomplishes this in a stateless / configuration-free manner.
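To make the trade-off concrete, here is a minimal sketch (not the actual initialize.sh logic) contrasting a random choice with one possible stateless but repeatable alternative: hashing the slice's IP and port onto the list of helpers that need that port. The function names, helper names and IP address are hypothetical, and hashing is only one assumption about how determinism could be obtained without configuration.

```python
import hashlib
import random


def pick_helper_random(helpers):
    """Random, stateless choice: a different helper may be picked on every run."""
    return random.choice(helpers)


def pick_helper_deterministic(helpers, slice_ip, port):
    """Stateless but repeatable: the same (slice_ip, port) always maps to the same helper."""
    ordered = sorted(helpers)  # make the result independent of input order
    digest = hashlib.sha256(f"{slice_ip}:{port}".encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]


if __name__ == "__main__":
    helpers_for_port_80 = ["helper-a", "helper-b"]  # hypothetical helper names
    print(pick_helper_random(helpers_for_port_80))
    print(pick_helper_deterministic(helpers_for_port_80, "192.0.2.10", 80))
```

Note that hashing does not guarantee an even split of slices across helpers; with few slices, a port could end up with one helper missing entirely, which an explicit assignment table would avoid at the cost of configuration.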
Meanwhile, the probe will use the mlab-ns web service to request test helpers and a collector prior to running a net-test. This service currently responds non-deterministically (with various constraints and prioritizations such as scoring based on load).
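As an illustration only, the sketch below shows how a load-weighted selection could be made reproducible by accepting an explicit seed. It does not reflect the real mlab-ns implementation or API; the function name, the candidate format and the weighting formula are all assumptions.

```python
import random


def choose_server(candidates, seed=None):
    """Pick one hostname from (hostname, load) pairs, favoring lower load.

    With seed=None the choice is non-deterministic; with a fixed seed the
    same candidates always yield the same hostname.
    """
    rng = random.Random(seed)
    ordered = sorted(candidates)  # stable order regardless of input order
    # Hypothetical weighting: lightly loaded servers get larger weights.
    weights = [1.0 / (1.0 + load) for _, load in ordered]
    hostname, _load = rng.choices(ordered, weights=weights, k=1)[0]
    return hostname


if __name__ == "__main__":
    servers = [("helper1.example.org", 0.2), ("helper2.example.org", 0.9)]
    print(choose_server(servers))           # may differ between runs
    print(choose_server(servers, seed=42))  # identical on every run
```

If something like this were adopted, recording the seed in the report would let a reader reconstruct which helper a probe was given, though a seed known to a censor would also make the choice predictable.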
The question is: Are these two sources of non-determinism a problem?
For scientific repeatability, randomness adds noise. For diagnostic purposes, determinism can make it simpler to understand logs or report data. For security reasons, censors might be able to game the non-determinism to favor particular test results. It may be that none of these concerns is strong enough to justify a change, especially considering the dev cost of removing the non-determinism.
If the answer is "no", there's a dev cost implication for mlab-ns, which should be coordinated with MLab.