The task manager is currently recursive: _fillSlots calls _run, which calls _fillSlots again on success or failure. As a result, when many tasks fail very quickly, the default Python recursion limit (1000) is very likely to be exceeded.
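The cycle can be modelled without Twisted. The sketch below is a hypothetical stand-in, not the code in ooni/managers.py: the class name, the pending/active/failures attributes, the helper functions and the synchronous task calls are all assumptions made for illustration. It only aims to show that when every task fails immediately, each task adds stack frames instead of returning to the caller.

```python
import sys
from collections import deque


class RecursiveManager(object):
    """Hypothetical model of the _fillSlots -> _run -> _fillSlots cycle."""

    def __init__(self, tasks, concurrency=1):
        self.pending = deque(tasks)
        self.concurrency = concurrency
        self.active = 0
        self.failures = 0

    def _fillSlots(self):
        # Start pending tasks while there are free slots.
        while self.pending and self.active < self.concurrency:
            self.active += 1
            self._run(self.pending.popleft())

    def _run(self, task):
        try:
            task()
        except ValueError:
            self.failures += 1
        self.active -= 1
        # Re-entering here nests one more _fillSlots/_run pair per task.
        self._fillSlots()


def failing_task():
    # Stand-in for a measurement that fails at once on invalid input.
    raise ValueError("invalid input")


def overflows(manager):
    """Return True if draining the manager exhausts the interpreter stack."""
    try:
        manager._fillSlots()
    except RuntimeError:  # RecursionError subclasses RuntimeError on Python 3
        return True
    return False


# Enough instantly failing tasks to go past the recursion limit.
n_tasks = 2 * sys.getrecursionlimit()
print(overflows(RecursiveManager([failing_task] * n_tasks)))
```

With only a handful of tasks the same class drains its queue normally, which is presumably why the problem shows up only with long inputs.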
To reproduce this bug, run a test with a long invalid input, for example http_requests:
It is correct that this test fails; however, it fails in a surprising manner:
Unhandled error in Deferred:
Unhandled Error
Traceback (most recent call last):
  File "/ooni-probe/ooni/managers.py", line 153, in _failed
    super(LinkedTaskManager, self)._failed(result, task)
  File "/ooni-probe/ooni/managers.py", line 44, in _failed
    task.done.errback(failure)
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/internet/defer.py", line 423, in errback
    self._startRunCallbacks(fail)
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/internet/defer.py", line 577, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/ooni-probe/ooni/director.py", line 188, in measurementFailed
    log.msg("Failed doing measurement: %s" % measurement)
  File "/ooni-probe/ooni/utils/log.py", line 62, in msg
    print "%s" % msg
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/python/log.py", line 505, in write
    msg(message, printed=1, isError=self.isError)
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/python/threadable.py", line 53, in sync
    return function(self, *args, **kwargs)
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/python/log.py", line 185, in msg
    actualEventDict = (context.get(ILogContext) or {}).copy()
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/python/context.py", line 121, in getContext
    return self.currentContext().getContext(key, default)
exceptions.RuntimeError: maximum recursion depth exceeded
I think this bug is a good opportunity to discuss a possible refactoring of the task scheduler code. It may be a good idea to draw some inspiration from txrdq: https://github.com/terrycojones/txrdq
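As one input to that discussion, here is a minimal sketch of how the _fillSlots/_run cycle could be flattened with a re-entrancy guard, so that a nested call to _fillSlots returns immediately and the outer loop starts the next task. It is a Twisted-free model and not a patch: the class name, the _filling flag and the other attributes are hypothetical, tasks are assumed to complete synchronously, and txrdq may well take a different approach.

```python
import sys
from collections import deque


class GuardedManager(object):
    """Hypothetical task manager whose _fillSlots cannot nest."""

    def __init__(self, tasks, concurrency=1):
        self.pending = deque(tasks)
        self.concurrency = concurrency
        self.active = 0
        self.failures = 0
        self._filling = False  # True while the fill loop is on the stack

    def _fillSlots(self):
        if self._filling:
            # A loop further up the stack will pick up the next task.
            return
        self._filling = True
        try:
            while self.pending and self.active < self.concurrency:
                self.active += 1
                self._run(self.pending.popleft())
        finally:
            self._filling = False

    def _run(self, task):
        try:
            task()
        except ValueError:
            self.failures += 1
        self.active -= 1
        # Still safe to call: the guard turns this into a no-op
        # when we were started from inside the fill loop.
        self._fillSlots()


def failing_task():
    # Stand-in for a measurement that fails at once on invalid input.
    raise ValueError("invalid input")


# More instantly failing tasks than the recursion limit allows frames.
n_tasks = 2 * sys.getrecursionlimit()
manager = GuardedManager([failing_task] * n_tasks)
manager._fillSlots()
print(manager.failures == n_tasks)
```

The stack depth stays constant here no matter how many tasks fail. In the real Deferred-based code, another option might be to schedule the next _fillSlots call through the reactor (for example with reactor.callLater(0, ...)) so that each call starts from a fresh stack; which of the two fits the existing managers better is an open question.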