Opened 5 weeks ago

Closed 4 weeks ago

#29789 closed defect (fixed)

practracker.py codec exception in some locales

Reported by: catalyst
Owned by: catalyst
Priority: Medium
Milestone: Tor: 0.4.1.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: easy, tools, teor-merge, nickm-merge
Cc:
Actual Points: 0.1
Parent ID:
Points: 0.1
Reviewer: asn
Sponsor: Sponsor31-can

Description

practracker.py, implemented in #29221, seems to have a locale dependency when run under python3. If the locale isn't a UTF-8 locale, non-ASCII (UTF-8 encoded) characters in the sources can result in an exception:

$ LANG=en_US.US-ASCII make check-best-practices PYTHON=python
python ../scripts/maint/practracker/practracker.py ..
mirkwood:build-norust tlyu$ LANG=en_US.US-ASCII make check-best-practices
python3 ../scripts/maint/practracker/practracker.py ..
Traceback (most recent call last):
  File "../scripts/maint/practracker/practracker.py", line 151, in <module>
    main()
  File "../scripts/maint/practracker/practracker.py", line 134, in main
    found_new_issues = consider_all_metrics(files_list)
  File "../scripts/maint/practracker/practracker.py", line 89, in consider_all_metrics
    found_new_issues |= consider_metrics_for_file(fname, f)
  File "../scripts/maint/practracker/practracker.py", line 104, in consider_metrics_for_file
    found_new_issues |= consider_file_size(fname, f)
  File "../scripts/maint/practracker/practracker.py", line 51, in consider_file_size
    file_size = metrics.get_file_len(f)
  File "/Users/tlyu/src/tor/scripts/maint/practracker/metrics.py", line 11, in get_file_len
    for i, l in enumerate(f):
  File "/Users/tlyu/src/brew/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 14: ordinal not in range(128)
make: *** [check-best-practices] Error 1

I'm also seeing this on gitlab.com CI, but I don't know offhand what its locale environment variables are.
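
A minimal sketch of the likely cause, assuming the files are opened in text mode without an explicit encoding: Python 3's open() then decodes with the locale's preferred encoding, so a UTF-8 byte sequence fails under an ASCII locale. The sample line and the try_decode helper are made up for illustration and are not part of practracker.

```python
import locale

# With no encoding= argument, Python 3's open() in text mode decodes with
# the locale's preferred encoding, which this call reports.
print(locale.getpreferredencoding(False))

# A UTF-8 encoded source line containing a non-ASCII character (made up for
# illustration). The "i with diaeresis" is encoded as the bytes 0xc3 0xaf.
raw = u"/* na\u00efve */\n".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or a short error string if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


print(repr(try_decode(raw, "utf-8")))  # decodes cleanly
print(try_decode(raw, "ascii"))        # fails on the 0xc3 byte
```

The 0xc3 byte here is the same lead byte reported in the traceback above, which is consistent with an ASCII codec meeting a two-byte UTF-8 sequence.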

We might want to use the encoding= keyword parameter to open(), but I think that would no longer be python2 compatible.
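
One option that may sidestep the python2 concern is io.open(), which accepts encoding= on Python 2.6+ and is the same function as the built-in open() on Python 3. The sketch below is only an illustration of that approach, not necessarily the change that was merged; count_lines_utf8 is a hypothetical helper that takes a path, unlike metrics.get_file_len, which takes an open file object.

```python
import io


def count_lines_utf8(path):
    """Count the lines in a file, decoding as UTF-8 regardless of locale.

    Hypothetical helper for illustration only.
    """
    count = 0
    # io.open() takes encoding= on both Python 2.6+ and Python 3, so the
    # result no longer depends on LANG or LC_ALL.
    with io.open(path, "r", encoding="utf-8") as f:
        for count, _line in enumerate(f, 1):
            pass
    return count
```

Calling count_lines_utf8() on a source file that contains non-ASCII characters should give the same count under LANG=en_US.US-ASCII as under a UTF-8 locale, because the decoding no longer consults the locale.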

Change History (9)

comment:1 Changed 5 weeks ago by catalyst

There's a change in b4b8fa4899fcde9983f66a6310878ea47186e5eb to checkIncludes.py that looks like it might help. Maybe we should copy it (or refactor it into some shared maintenance script utility code)?

comment:2 Changed 5 weeks ago by catalyst

Keywords: easy tools added
Points: 0.1

comment:3 Changed 4 weeks ago by catalyst

Owner: set to catalyst
Status: new → assigned

comment:4 Changed 4 weeks ago by catalyst

Actual Points: 0.1
Status: assigned → needs_review

comment:5 Changed 4 weeks ago by teor

Milestone: Tor: unspecified → Tor: 0.4.1.x-final

comment:6 Changed 4 weeks ago by asn

Reviewer: asn

comment:7 Changed 4 weeks ago by asn

Status: needs_review → merge_ready

LGTM!

comment:8 Changed 4 weeks ago by teor

Keywords: teor-merge nickm-merge added

comment:9 Changed 4 weeks ago by teor

Resolution: fixed
Status: merge_ready → closed

Looks good to me.

Merged #29823 and #29789 to master.
