Opened 2 years ago

Closed 2 years ago

#19143 closed defect (fixed)

Change the BadContent rules

Reported by: cypherpunks Owned by: qbi
Priority: Medium Milestone:
Component: Internal Services/Service - trac Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

The BadContent page contains the http: line. I suspect this rule is the reason CAPTCHAs are shown whenever a submission to the wiki or the bug tracker contains a link.

Also, the phone number lines are triggered by submissions that contain ticket numbers.

Change History (3)

comment:1 Changed 2 years ago by qbi

Resolution: fixed
Status: new → closed

I changed the regex for the phone numbers a bit, so it should now match the spam attempts we saw earlier.

The http: rule keeps trac from being overrun by spammers. It catches most of the attempts, and removing this line would render the site completely unusable.

comment:2 in reply to:  1 Changed 2 years ago by cypherpunks

Resolution: fixed
Status: closed → reopened

Replying to qbi:

> I changed the regex for the phone numbers a bit, so it should now match the spam attempts we saw earlier.

It seems the new regex let one spam attempt through. These phone numbers use groups of numbers which the regex currently isn't catching.

A better regex would be 1[ -\.]*8[ -\.]*[0-9]+[ -\.]*[0-9]+[ -\.]*[0-9]+. FWIW, Trac's own BadContent page contains an even longer regex that appears to block the same spam (see the last line).
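To make the proposal concrete, here is a minimal Python sketch that runs the suggested pattern against a few strings. The sample strings are invented for illustration and are not taken from real spam, and Python's `re` module is only a stand-in for however the BadContent rules are actually evaluated. Note that `[ -\.]` is a character range from space through `.`, so it also accepts other punctuation in that range, such as parentheses.

```python
import re

# Pattern proposed in this comment, copied verbatim.
PHONE_RE = re.compile(r"1[ -\.]*8[ -\.]*[0-9]+[ -\.]*[0-9]+[ -\.]*[0-9]+")

def looks_like_phone_spam(text):
    """Return True if the proposed pattern matches anywhere in text."""
    return PHONE_RE.search(text) is not None

# Hypothetical sample inputs, invented for this sketch.
samples = [
    "call 1-800-555-0100 now",
    "call 1 (888) 555 0100 now",
    "see ticket #19143",
    "see ticket #18143",
]

for s in samples:
    print(f"{looks_like_phone_spam(s)!s:5} {s}")
```

Under these assumptions, the first two samples and `#18143` match while `#19143` does not. Any run of five or more digits that starts with 18 satisfies the pattern, which is the kind of ticket-number false positive mentioned in the description.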

> The http: rule keeps trac from being overrun by spammers. It catches most of the attempts, and removing this line would render the site completely unusable.

The site also becomes harder to use for legitimate users, but if that's really the only solution I guess we have to live with it :(

comment:3 Changed 2 years ago by qbi

Resolution: fixed
Status: reopened → closed

The regex did match in this case, but the spammer had accumulated enough positive score that the spam came through anyway.
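The interaction described here can be pictured with a small sketch. It is not Trac's actual spam filter code: the function names, the rule list, and the point values below are all hypothetical. It only assumes that a BadContent hit subtracts points while other signals add points, and that a submission is rejected when the total falls below zero.

```python
import re

# Hypothetical rule list standing in for the BadContent page.
BAD_CONTENT = [re.compile(r"http:")]

def content_penalty(text, points_per_hit=5):
    """Negative score: one penalty per rule that matches the text."""
    hits = sum(1 for rule in BAD_CONTENT if rule.search(text))
    return -points_per_hit * hits

def is_rejected(text, other_scores):
    """Reject when the combined score drops below zero (assumed threshold)."""
    total = content_penalty(text) + sum(other_scores)
    return total < 0

spam = "buy now http://example.invalid/pills"

# No positive signals: the regex hit alone rejects the post.
print(is_rejected(spam, other_scores=[]))       # True
# Enough positive signals outweigh the same regex hit.
print(is_rejected(spam, other_scores=[4, 3]))   # False
```

In this toy model a matching rule is just one input among several, so a submitter with enough positive score can still get a matching post accepted.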

The regex from Trac's own BadContent page seems to miss some of the spam attempts we saw. So for now I'll stick with our regex, but I will keep watching it and modify it if I see the need.

Most of the spam attempts we see in our trac are simple URLs. The spammer usually tries to post the same URL 3 to 10 times and then switches to another URL, which they also try 3 to 10 times. The only thing those links have in common is the http: part. This kind of spam makes up most of what we see, and it arrives at a rate of 1 to 2 attempts per minute.
