Opened 5 years ago

Closed 5 years ago

#6439 closed defect (fixed)

Arm periodically freezes

Reported by: atagar
Owned by: atagar
Priority: Very High
Milestone:
Component: Core Tor/Nyx
Version:
Severity:
Keywords:
Cc: ioerror
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Arm version 1.4.4 had an issue where it froze after running for several days. I managed to get a reliable repro for this sort of glitch and tracked it to an interaction between readline and screen which has been fixed in 1.4.5, but the original problem persists.

At this point I'm at a loss for what is causing it or how to diagnose the problem since it takes days to manifest. At present I'm trying a sort of git bisect to narrow down the issue. I'll update this ticket as I narrow things down.

Commit e249dc8 (version 1.4.5.0) - broken
Commit d0bb81a (version 1.4.2.0) - testing...
Commit f403ccc (version 1.4.0.0) - works
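
The manual bisect tracked in this ticket can be sketched as follows. This is only an illustration, not a script that was used here: `next_commit_to_test` is a hypothetical helper, and the commit list holds just the hashes named in this ticket (oldest first), not arm's full history. Because each trial takes days, the helper only picks the next candidate; the verdict is recorded by hand.

```python
from typing import Dict, List, Optional

# Commits named in this ticket, oldest first (not arm's full history).
COMMITS = [
    "f403ccc", "d0bb81a", "6cf4836", "edcde43",
    "3f71b51", "b86e5bf", "a570d9f", "e249dc8",
]


def next_commit_to_test(commits: List[str], verdicts: Dict[str, str]) -> Optional[str]:
    """Pick the midpoint between the newest 'works' and oldest 'broken' commit.

    verdicts maps a commit hash to 'works' or 'broken' and must contain at
    least one of each. Returns None once the two are adjacent, meaning the
    oldest 'broken' commit is the first bad one.
    """
    good = max(i for i, c in enumerate(commits) if verdicts.get(c) == "works")
    bad = min(i for i, c in enumerate(commits) if verdicts.get(c) == "broken")
    if good > bad:
        raise ValueError("a 'works' commit is newer than a 'broken' one")
    if bad - good <= 1:
        return None
    return commits[(good + bad) // 2]
```

With an intermittent failure like this one, a "works" verdict is only provisional: a commit that survives one run may still fail when given more time, so verdicts may need to be revised and the search repeated.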

Change History (15)

comment:1 Changed 5 years ago by atagar

Commit e249dc8 (version 1.4.5.0) - broken
Commit 6cf4836 (version 1.4.3.0) - testing...
Commit d0bb81a (version 1.4.2.0) - works
Commit f403ccc (version 1.4.0.0) - works

comment:2 Changed 5 years ago by atagar

Commit e249dc8 (version 1.4.5.0) - broken
Commit edcde43 (version 1.4.4.1) - testing...
Commit 6cf4836 (version 1.4.3.0) - works
Commit d0bb81a (version 1.4.2.0) - works
Commit f403ccc (version 1.4.0.0) - works

comment:3 Changed 5 years ago by atagar

Commit e249dc8 (version 1.4.5.0) - broken
Commit 3f71b51 - testing...
Commit edcde43 (version 1.4.4.1) - works
Commit 6cf4836 (version 1.4.3.0) - works
Commit d0bb81a (version 1.4.2.0) - works
Commit f403ccc (version 1.4.0.0) - works

comment:4 Changed 5 years ago by ioerror

  • Cc ioerror added

comment:5 Changed 5 years ago by atagar

Commit e249dc8 (version 1.4.5.0) - broken
Commit a570d9f - testing...
Commit 3f71b51 - mostly works *
Commit edcde43 (version 1.4.4.1) - works
Commit 6cf4836 (version 1.4.3.0) - works
Commit d0bb81a (version 1.4.2.0) - works
Commit f403ccc (version 1.4.0.0) - works

* Didn't freeze, but borders failed to render and it started experiencing some input glitches. This was just before the curses/readline module fix, so that's probably the issue.

comment:6 Changed 5 years ago by atagar

Commit e249dc8 (version 1.4.5.0) - broken
Commit a570d9f - broken
Commit b86e5bf - testing...
Commit 3f71b51 - mostly works
Commit edcde43 (version 1.4.4.1) - works
Commit 6cf4836 (version 1.4.3.0) - works
Commit d0bb81a (version 1.4.2.0) - works
Commit f403ccc (version 1.4.0.0) - works

Yay, finally reproduced the broken behavior. The wide characters used to draw the box around dialogs (for instance the help dialog) were replaced by plain characters, and character input was garbled (for instance, arrow keys were interpreted as a 'q').

I'm a little worried that the problem really lies between edcde43 and 3f71b51, with the readline issues masking whatever the real (maybe related) root cause is. If this turns out to be the case then I'll disable the interpreter panel so I can, at least, narrow our woes to that.

comment:7 Changed 5 years ago by atagar

Commit e249dc8 (version 1.4.5.0) - broken
Commit a570d9f - broken
Commit b86e5bf - broken
Commit 3f71b51 - mostly works
Commit edcde43 (version 1.4.4.1) - works
Commit 6cf4836 (version 1.4.3.0) - works
Commit d0bb81a (version 1.4.2.0) - works
Commit f403ccc (version 1.4.0.0) - works

Yup, the screwy behavior seems to be related to the readline fix in b86e5bf. That's... frustrating. Giving 3f71b51 another try to see how much that brokenness resembles the later badness. Then I'll probably try removing the readline module entirely (it's not needed for the interpreter panel; it's only used for the non-curses interface).

comment:8 Changed 5 years ago by atagar

On second thought, giving edcde43 (version 1.4.4.1) another try. That had the interpreter panel, so if that was really the issue then it should be exhibiting the bug. The 1.4.5 changes were mostly bug fixes on top of that, and I could have sworn that 1.4.4 exhibited terminal glitches.

If it really is fine then the readline import might be a red herring.

comment:9 Changed 5 years ago by atagar

Ah ha, guess it just needed more time. edcde43 exhibited the same issues as 3f71b51...

Commit e249dc8 (version 1.4.5.0) - broken
Commit a570d9f - broken
Commit b86e5bf - broken
Commit 3f71b51 - mostly works
Commit edcde43 (version 1.4.4.1) - mostly works
Commit 6cf4836 (version 1.4.3.0) - works
Commit d0bb81a (version 1.4.2.0) - works
Commit f403ccc (version 1.4.0.0) - works

Dropping the interpreter panel via the armrc and giving that a shot. If the interpreter panel is indeed the culprit then I might just turn it off by default until we have a fix. I've never heard of people using it... which is sad because it's a very sweet feature.

comment:10 Changed 5 years ago by atagar

Yup, ran edcde43 for eight days without the interpreter panel and it worked perfectly. Seeing if dropping the panel addresses the issue in e249dc8 too (it might be a different issue).

comment:11 Changed 5 years ago by atagar

Nope, still busted. The bisect indicates that the interpreter panel added instability and my attempt to fix it (b86e5bf) made things worse. It's strange, then, that dropping the panel didn't remedy the situation. I also tried commenting out the interpretorPanel import in e249dc8 to be doubly sure that it wasn't inadvertently triggering something, but no luck.

I'm trying 3f71b51 with a similar import removal. If it breaks then I'll try a bisect with this change, since that could indicate that the interpreter panel is a red herring after all.

comment:12 Changed 5 years ago by atagar

Nope, had arm spaz with 3f71b51. Going back to bisect between this and 6cf4836.

[ with interpreter panel removed ]
Commit e249dc8 (version 1.4.5.0) - broken
Commit 3f71b51 - broken
Commit edcde43 (version 1.4.4.1) - [untested]
Commit 6cf4836 (version 1.4.3.0) - testing...

comment:13 Changed 5 years ago by atagar

Ah ha! There was an import that I wasn't accounting for in starter.py - that would explain how we were still seeing screwy interpreter-related behavior. Dropping that and re-testing e249dc8.
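
A stray import like this can be caught by asking Python which modules actually got loaded, rather than by reading import statements file by file. A minimal sketch, assuming hypothetical names: `loaded_suspects` and the module names passed to it are illustrative and not taken from arm's source.

```python
import sys
from typing import Iterable, List


def loaded_suspects(suspects: Iterable[str]) -> List[str]:
    """Return the suspect module names that are present in sys.modules.

    A module appears in sys.modules once anything in the process has
    imported it, regardless of which file did the importing.
    """
    return [name for name in suspects if name in sys.modules]


# Hypothetical usage, after the application's start-up imports have run:
#   print(loaded_suspects(["readline", "curses"]))
```

An empty result for a module that was supposedly removed confirms that no remaining code path pulled it in during start-up; it says nothing about imports that happen later at runtime.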

comment:14 Changed 5 years ago by atagar

So far so good. Well, actually when I just checked arm the screen was completely screwed up, but pressing any key to refresh the page cleaned it up (probably just a normal screen/curses hiccup). The arm session is otherwise healthy, not exhibiting the earlier issues.

I'll give it some more time and if it continues to be healthy I'll drop the interpreter panel from arm.

comment:15 Changed 5 years ago by atagar

  • Resolution set to fixed
  • Status changed from new to closed

Three more days and no freeze. Screen's ACS support broke, but this is a known issue for some terminals...

https://gitweb.torproject.org/arm.git/blob/HEAD:/README#l108
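
For context on the alternate character set (ACS) issue mentioned above: on VT100-style terminals, borders are drawn by switching to an alternate character set in which ordinary letters stand for line-drawing glyphs (`q` for a horizontal line, `x` for a vertical line, and `l`, `k`, `m`, `j` for the corners). If the terminal fails to make that switch, the letters show through instead of the lines. The sketch below only illustrates that effect next to a plain ASCII fallback; `render_box` and both glyph tables are hypothetical and are neither arm's code nor the workaround described in its README.

```python
from typing import Dict, List

# Letters the VT100 alternate character set maps to box-drawing glyphs.
VT100_ACS_LETTERS = {"ul": "l", "ur": "k", "ll": "m", "lr": "j", "h": "q", "v": "x"}

# Plain ASCII stand-ins that render the same on any terminal.
ASCII_FALLBACK = {"ul": "+", "ur": "+", "ll": "+", "lr": "+", "h": "-", "v": "|"}


def render_box(width: int, height: int, glyphs: Dict[str, str]) -> List[str]:
    """Return the rows of an empty box drawn with the given glyph table."""
    if width < 2 or height < 2:
        raise ValueError("a box needs at least 2 columns and 2 rows")
    top = glyphs["ul"] + glyphs["h"] * (width - 2) + glyphs["ur"]
    middle = glyphs["v"] + " " * (width - 2) + glyphs["v"]
    bottom = glyphs["ll"] + glyphs["h"] * (width - 2) + glyphs["lr"]
    return [top] + [middle] * (height - 2) + [bottom]
```

Rendering with `VT100_ACS_LETTERS` shows what a border looks like when the character-set switch is lost, while `ASCII_FALLBACK` shows a border that does not depend on ACS support at all.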

Dropped the interpreter panel from arm. It'll live on as a separate application, but as a part of arm it's been impressively unused.

https://gitweb.torproject.org/arm.git/commitdiff/582bd8e556ea26cc1c7e2ba2aa3a6b8d859fdd2b
