Opened 8 months ago

Closed 4 weeks ago

#28152 closed enhancement (fixed)

Gettor code refactor with Python Twisted

Reported by: ilv Owned by: hiro
Priority: Very High Milestone:
Component: Applications/GetTor Version:
Severity: Major Keywords: gettor-roadmap
Cc: gaba, phw Actual Points:
Parent ID: #28232 Points:
Reviewer: phw Sponsor: Sponsor19

Description

Code refactor

Gettor needs some love. It should be more robust to make it: easier to maintain (by me or somebody else), to know when it is working or not, and to allow more developers to contribute to it.

For the above, I propose to refactor the current code and turn it into a twisted daemon [1, 2]. This would preserve the main logic of the current system and add all the benefits of having a daemonized application. This service approach considers two main parts:

  1. Distribution channels. Whenever gettor receives a request or sends a reply it uses a channel (e.g. e-mail). Each channel could be handled by one or more services. These services would be constantly fetching and updating information in a SQLite database to know how to proceed.

In the case of e-mail, there should be a script that receives messages forwarded by the MTA, process them, and add a request with a given status to the SQLite database. On the other hand, a service running on background will be fetching ready-to-be-sent requests from the database and send e-mails with the requested information.

For a twitter bot, a single service that receives DMs, process them and send replies would be enough.

  1. Tor Browser sync. A service constantly checking new Tor Browser releases, downloading the new packages and updating the SQLite database with the new links.

The logging system provided by twistd is easy to use and works very well. This will solve one of the problems with the current code and the use of logging, also providing useful information for debugging and statistics. Log rotation is automatic.

I have developed a similar service using twistd. Adapting it to gettor would be fairly easy and it would take me a few weeks of spare time.

Twisted is not installed on getulum, so I will collect all the needed packages and ask for them to be installed.

1: https://twistedmatrix.com/documents/current/core/howto/application.html.
2: https://twistedmatrix.com/documents/current/core/howto/basics.html#twistd

Child Tickets

TicketTypeStatusOwnerSummary
#1593taskclosedkanerImplement test (-t switch) functionality
#28091taskclosedtraumschulePort GetTor to python3
#28905taskclosedtpaPlease install packages for twisted gettor

Change History (20)

comment:1 Changed 8 months ago by traumschule

See also #28091

comment:2 Changed 8 months ago by traumschule

Parent ID: #28232

#28232 now groups GetTor working areas.

comment:3 Changed 8 months ago by hiro

Hi I have been looking at the code of gettor and bridge db. I am thinking many of the functionalities implemented in twistd in bridgedb could be easily ported.

Also I think it would help to have scripts between:

  • development scripts
  • install scripts

With developmente scripts I would group everything that can be used or is needed while developing or testing gettor.
With install scripts I would group scripts that can be used to install and deploy the service.

Last edited 8 months ago by hiro (previous) (diff)

comment:4 Changed 4 months ago by pili

Cc: hiro added

comment:5 Changed 2 months ago by gaba

Keywords: gettor-roadmap added
Sponsor: Sponsor19

comment:6 Changed 2 months ago by gaba

Cc: hiro removed
Owner: changed from ilv to hiro
Status: newassigned

comment:7 Changed 2 months ago by gaba

Status: assignedneeds_review

comment:8 Changed 2 months ago by hiro

Current branch: https://gitweb.torproject.org/user/hiro/gettor.git/tree/?h=refactoring
Also we should review the list of people currently able to contribute to gettor. At the moment this list includes:
arma ilv sukhbir

comment:9 Changed 8 weeks ago by gaba

Reviewer: phw

Adding phw to review this code.

comment:10 Changed 7 weeks ago by phw

I had a look at the branch and here are my thoughts:

  • I find it a bit difficult to understand what some of the commits are trying to accomplish. For example, 9f5394e7 says "Start with main task" but I'm not quite sure what that means. Another example: commit 5271b12c says "Update script". A more descriptive commit message would be "Use Python 3-style format message." Here's a page that provides an excellent overview of crisp and useful commit messages: https://git-scm.com/book/en/v2/Distributed-Git-Contributing-to-a-Project
  • Several commits seem to group unrelated changes together, in a single commit. For example, 309e4e38 says "Create script to add links to db" but it also adds and modifies code comments. The modification of comments should be a separate commit.
  • The last line in 61973769 should probably be print_footer() and not print_footer.
  • Nitpick: Do we really need spaces in between the \n in d4777d66?
  • For substantial changes, it's helpful to provide more than just a one-line summary in the git commit message. For example, I was curious what the purpose of the restructuring in ba190034 was. Does it make maintenance easier? A short paragraph under the one-line summary wouldn't leave me guessing :) Commit 02ae46d0 is another example. It changes several hundred lines of code, yet the commit message only says "Update structure and code".
  • What's the purpose of the giant TAGS file that was added in commit c9531362?

Hiro, do you mind cleaning up the commit messages to make them more descriptive? Please let me know if I can help. Also, let me know if there's any code that you want me to pay particular attention to.

comment:11 Changed 7 weeks ago by gaba

Cc: gaba added

comment:12 Changed 7 weeks ago by phw

Cc: phw added

comment:13 Changed 7 weeks ago by hiro

I think it would make much more sense to document the code in a wiki possibly on dip.tp.o than going through the commits again. A big part of what I did was fixing what was in the repo from Isra, what he had testing on prod and what could work with our current repo.

comment:14 Changed 7 weeks ago by phw

I think it's a great idea to document the code, but it's orthogonal to my point that we should strive for clean git histories. Also, I'm happy to help, and polish the git history to the best of my abilities! Do you want me to make a pass over it?

Finally, we should take care of the following remark, which seems to be an actual bug:

The last line in ​61973769 should probably be print_footer() and not print_footer.

comment:15 Changed 6 weeks ago by hiro

Hi phw,
I have condensed the git history into fewer comments and cleaned up some of the files generated by the coverage script (like the HUGE TAGS file).
Also I think this would help you to visualise it: https://dip.torproject.org/hiro/gettor/merge_requests/1/commits

Also fixed the print bug.

comment:16 Changed 6 weeks ago by phw

Thanks, hiro! The commit history looks much better now.

comment:17 in reply to:  8 Changed 6 weeks ago by arma

Replying to hiro:

Also we should review the list of people currently able to contribute to gettor. At the moment this list includes:
arma ilv sukhbir

I am happy to be removed from this list if it makes roles clearer.

I also support some new people being added to the list.

Last edited 6 weeks ago by arma (previous) (diff)

comment:18 Changed 4 weeks ago by phw

I hear that the refactored code is already deployed. Is there anything left before we can close this ticket?

comment:19 Changed 4 weeks ago by gaba

I didn't close it before because we had https://trac.torproject.org/projects/tor/ticket/28339 as a child but I will "unchild" it and then close this one.

comment:20 Changed 4 weeks ago by gaba

Resolution: fixed
Status: needs_reviewclosed

Thanks!

Note: See TracTickets for help on using tickets.