Visitor

 • 

5 Messages

Friday, February 3rd, 2023 6:41 PM

What is the source of the built-in Regex processor so that I can test out regex expressions for functionality?

Hello,

I have been trying different email filter capabilities on the Sender/From field using Regex strings and have had some success. I think I could improve on that if I knew which Regex analyzer is built into the email filter subsystem, so that I know what to test and implement. I am not a fan of trial and error, and there are several 'editions' of Regex analyzers.

Below are some of the Regex conditions I have had success with. The filter moves the incoming message to the Trash folder when ANY of the conditions is met. To build them, I used 'View Source' on a selected email, searched for the 'From' line, and examined the full string for unique parts to match. Typically I match the last part of the DNS name, which is the country code, and anchor the match to the end of the string by putting the '$' character at the end of the expression. I used the "\" character to keep the "." character from being interpreted as a wildcard, which is its default meaning in most Regex analyzers from what I have found.

\.ru$
\.cn$
\.in$
\.it$
\.tw$
\.ae$
\.jp$
\.th$
\.in$
\.nl$
\.id$
\.online$
\.es$
\.my$
\.pe$
\.lk$
\.id$
\.fr$
\.co$
\.de$
\.cl$
\.br$
[.@]edu

What I am curious about is whether the "[.@]" string will match both "@edu" and ".edu" anywhere in the From field, so that it works in any part of a DNS name. I get a lot of junk emails that have "edu" somewhere in their DNS name, with no consistency.
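One way to check these expressions offline is a short Python script. This is only a sketch: Python's re module is a stand-in and is not the engine behind the Comcast filter, so the results hold only as far as these simple constructs (escaped dot, '$' anchor, bracket list) behave the same way there. The sample From values are made up for illustration.

```python
import re

# Made-up From values for illustration only.
samples = [
    "news@mailer.example.ru",
    "Some Sender <promo@offers.example.ru>",
    "admin@edu-offers.example.com",
    "info@school.edu.example.org",
    "someone@education.example.com",
    "friend@example.com",
]

tld_ru = re.compile(r"\.ru$")      # literal ".ru" at the very end of the string
edu_any = re.compile(r"[.@]edu")   # "." or "@" followed by "edu", anywhere


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


for s in samples:
    print(f"{s!r}: ru={matches(tld_ru, s)} edu={matches(edu_any, s)}")
```

In this sketch "[.@]edu" does match both "@edu" and ".edu" wherever they appear, but it also matches a name such as "@education", and "\.ru$" misses an address that ends in ">".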

I am also interested to know whether the 'Header' condition with the 'Name' field means that we can specify a field within an email header, such as 'Return-Path', and then use a Regex expression to match against that field's contents to decide if the email should be trashed.

I found a number of interactive Regex testers on the web, such as https://regex101.com/, for trying out expression strings, but the various testers differ in the nuances of how a Regex expression is interpreted. I want to use a tester that is closest to how the Comcast email filter processing is going to perform when an email comes in.

My thanks for your help on this!

Paul

Accepted Solution

BruceW

Gold Problem Solver

 • 

24.5K Messages

2 months ago

... I want to use a tester that is closest to how the Comcast email filter processing ...

In response to the question "What regex syntax does webmail use?", employee @XfinityGabrielS wrote: "That is based on the sieve specification here: https://datatracker.ietf.org/doc/html/draft-murchison-sieve-regex#section-3".

You may find other items mentioned in https://forums.xfinity.com/conversations/email/email-filter-rules-difference/62fab3e42ff2c66589fcf7d5 of interest. Note that email addresses are sometimes enclosed in <angle brackets>, and sometimes not. Good luck!

Please be aware that there are 2 kinds of responses in this Forum: Replies and Comments. When you Comment on a post by scrolling down to "Comment on this post here...", I am notified of your response. But if you select Reply, I am NOT notified and may not be aware of your response.

Visitor

 • 

5 Messages

2 months ago

BruceW...

My thanks for the information on the REGEX expression analyzer!

That helped me tie together other information and change how I am doing the Sender/From REGEX filters to get rid of the increasing quantity of junk emails coming into my account. The info I factored in was the list of country code DNS suffix strings defined as of 2/5/2023. I converted my original REGEX filters into a set of 13 new ones that have the following values.

@*(\.ac|\.ad|\.ae|\.af|\.ag|\.ai|\.al|\.am|\.an|\.ao|\.aq|\.ar|\.as|\.at|\.au|\.aw|\.ax|\.az|\.ba|\.bb|\.bd)$
@*(\.be|\.bf|\.bg|\.bh|\.bi|\.bj|\.bl|\.bm|\.bn|\.bo|\.br|\.bq|\.bs|\.bt|\.bv|\.bw|\.by|\.bz|\.ca|\.cc|\.cd)$
@*(\.cf|\.cg|\.ch|\.ci|\.ck|\.cl|\.cm|\.cn|\.co|\.cr|\.cs|\.cu|\.cv|\.cw|\.cx|\.cy|\.cz|\.dd|\.de|\.dj|\.dk)$
@*(\.dm|\.do|\.dz|\.ec|\.ee|\.eg|\.eh|\.er|\.es|\.et|\.eu|\.fi|\.fj|\.fk|\.fm|\.fo|\.fr|\.ga|\.gb|\.gd|\.ge)$
@*(\.gf|\.gg|\.gh|\.gi|\.gl|\.gm|\.gn|\.gp|\.gq|\.gr|\.gs|\.gt|\.gu|\.gw|\.gy|\.hk|\.hm|\.hn|\.hr|\.ht|\.hu)$
@*(\.id|\.ie|\.il|\.im|\.in|\.io|\.iq|\.ir|\.is|\.it|\.je|\.jm|\.jo|\.jp|\.ke|\.kg|\.kh|\.ki|\.km|\.kn|\.kp)$
@*(\.kr|\.kw|\.ky|\.kz|\.la|\.lb|\.lc|\.li|\.lk|\.lr|\.ls|\.lt|\.lu|\.lv|\.ly|\.ma|\.mc|\.md|\.me|\.mf|\.mg)$
@*(\.mh|\.mk|\.ml|\.mm|\.mn|\.mo|\.mp|\.mq|\.mr|\.ms|\.mt|\.mu|\.mv|\.mw|\.mx|\.my|\.mz|\.na|\.nc|\.ne|\.nf)$
@*(\.ng|\.ni|\.nl|\.no|\.np|\.nr|\.nu|\.nz|\.om|\.pa|\.pe|\.pf|\.pg|\.ph|\.pk|\.pl|\.pm|\.pn|\.pr|\.ps|\.pt)$
@*(\.pw|\.py|\.qa|\.re|\.ro|\.rs|\.ru|\.rw|\.sa|\.sb|\.sc|\.sd|\.se|\.sg|\.sh|\.si|\.sj|\.sk|\.sl|\.sm|\.sn)$
@*(\.so|\.sr|\.ss|\.st|\.su|\.sv|\.sx|\.sy|\.sz|\.tc|\.td|\.tf|\.tg|\.th|\.tj|\.tk|\.tl|\.tm|\.tn|\.to|\.tp)$
@*(\.tr|\.tt|\.tv|\.tw|\.tz|\.ua|\.ug|\.uk|\.um|\.us|\.uy|\.uz|\.va|\.vc|\.ve|\.vg|\.vi|\.vn|\.vu|\.wf|\.ws)$
@*(\.ye|\.yt|\.yu|\.za|\.zm|\.zr|\.zw)$

Yes, I know that .US is in this list but I have yet to get an email with that country code at the end. And I am willing to deal with the few emails that get through these filter rules due to having a terminating '>' character on the FROM line.
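For anyone copying these, a quick offline check of two details may help. This is a sketch in Python, whose re module is only a stand-in for the filter's engine, with made-up addresses and a shortened three-suffix list. In regex syntax "@*" means zero or more "@" characters, so the leading "@*" should not change which strings match, and the '$' anchor is what lets an address ending in '>' slip through.

```python
import re

# Shortened, illustrative versions of the patterns above.
with_prefix = re.compile(r"@*(\.ru|\.cn|\.br)$")
without_prefix = re.compile(r"(\.ru|\.cn|\.br)$")

# Made-up From values for illustration only.
samples = [
    "promo@offers.example.ru",
    "Promo <promo@offers.example.ru>",
    "friend@example.com",
]


def hit(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


for s in samples:
    print(f"{s!r}: with '@*'={hit(with_prefix, s)} without={hit(without_prefix, s)}")
```

Both forms give the same answer for every sample here, and the address wrapped in angle brackets is not matched by either.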

There is one part of my original post that got lost in the weeds of my original thread. That being:

              I am also interested to know if the 'Header' condition with the 'Name' field means that we can specify fields within an email header such as 'Return-Path' and then using a Regex expression to match against what the email contains to figure out if the email should be trashed.

Can you help with this part as well?

Thank you!

Paul

BruceW

Gold Problem Solver

 • 

24.5K Messages

2 months ago

... 'Header' condition ... 'Name' ... 'Return-Path' ... Regex expression ...

Should be fine. Is it not working correctly?

Also, my notes tell me that if I use Regex ".tld>?$" against an email address it should match addresses ending in ".tld" as well as those ending in ".tld>".
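A small Python sketch of that idea follows, with "ru" standing in for "tld" and made-up addresses. Python's re module is an assumption here and is not the filter's own engine. It also compares the unescaped "." with an escaped "\.", since an unescaped dot matches any single character.

```python
import re

loose = re.compile(r".ru>?$")    # unescaped "." matches any single character
strict = re.compile(r"\.ru>?$")  # escaped "\." matches only a literal dot

# Made-up From values for illustration only.
samples = [
    "promo@offers.example.ru",
    "Promo <promo@offers.example.ru>",
    "someone@example.guru",
    "friend@example.com",
]

for s in samples:
    print(f"{s!r}: loose={bool(loose.search(s))} strict={bool(strict.search(s))}")
```

The ">?" makes the closing bracket optional, so both forms catch the address with and without ">". Only the loose form also catches the ".guru" address, which is why escaping the dot is the safer choice.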


Visitor

 • 

5 Messages

2 months ago

BruceW..

Now that is a super sweet addition! I find the subtle nuances of REGEX expressions when strung together are hard to get a grip on.

I have updated my 13 REGEX filter lines to now be the following to take care of the trailing ">" character in the FROM field.

@*(\.ac|\.ad|\.ae|\.af|\.ag|\.ai|\.al|\.am|\.an|\.ao|\.aq|\.ar|\.as|\.at|\.au|\.aw|\.ax|\.az|\.ba|\.bb|\.bd)>?$
@*(\.be|\.bf|\.bg|\.bh|\.bi|\.bj|\.bl|\.bm|\.bn|\.bo|\.br|\.bq|\.bs|\.bt|\.bv|\.bw|\.by|\.bz|\.ca|\.cc|\.cd)>?$
@*(\.cf|\.cg|\.ch|\.ci|\.ck|\.cl|\.cm|\.cn|\.co|\.cr|\.cs|\.cu|\.cv|\.cw|\.cx|\.cy|\.cz|\.dd|\.de|\.dj|\.dk)>?$
@*(\.dm|\.do|\.dz|\.ec|\.ee|\.eg|\.eh|\.er|\.es|\.et|\.eu|\.fi|\.fj|\.fk|\.fm|\.fo|\.fr|\.ga|\.gb|\.gd|\.ge)>?$
@*(\.gf|\.gg|\.gh|\.gi|\.gl|\.gm|\.gn|\.gp|\.gq|\.gr|\.gs|\.gt|\.gu|\.gw|\.gy|\.hk|\.hm|\.hn|\.hr|\.ht|\.hu)>?$
@*(\.id|\.ie|\.il|\.im|\.in|\.io|\.iq|\.ir|\.is|\.it|\.je|\.jm|\.jo|\.jp|\.ke|\.kg|\.kh|\.ki|\.km|\.kn|\.kp)>?$
@*(\.kr|\.kw|\.ky|\.kz|\.la|\.lb|\.lc|\.li|\.lk|\.lr|\.ls|\.lt|\.lu|\.lv|\.ly|\.ma|\.mc|\.md|\.me|\.mf|\.mg)>?$
@*(\.mh|\.mk|\.ml|\.mm|\.mn|\.mo|\.mp|\.mq|\.mr|\.ms|\.mt|\.mu|\.mv|\.mw|\.mx|\.my|\.mz|\.na|\.nc|\.ne|\.nf)>?$
@*(\.ng|\.ni|\.nl|\.no|\.np|\.nr|\.nu|\.nz|\.om|\.pa|\.pe|\.pf|\.pg|\.ph|\.pk|\.pl|\.pm|\.pn|\.pr|\.ps|\.pt)>?$
@*(\.pw|\.py|\.qa|\.re|\.ro|\.rs|\.ru|\.rw|\.sa|\.sb|\.sc|\.sd|\.se|\.sg|\.sh|\.si|\.sj|\.sk|\.sl|\.sm|\.sn)>?$
@*(\.so|\.sr|\.ss|\.st|\.su|\.sv|\.sx|\.sy|\.sz|\.tc|\.td|\.tf|\.tg|\.th|\.tj|\.tk|\.tl|\.tm|\.tn|\.to|\.tp)>?$
@*(\.tr|\.tt|\.tv|\.tw|\.tz|\.ua|\.ug|\.uk|\.um|\.us|\.uy|\.uz|\.va|\.vc|\.ve|\.vg|\.vi|\.vn|\.vu|\.wf|\.ws)>?$
@*(\.ye|\.yt|\.yu|\.za|\.zm|\.zr|\.zw)>?$
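Lines in this form can be generated rather than typed by hand. The helper below is a hypothetical sketch in Python (build_patterns is not part of any Comcast tool). It assumes a plain list of suffixes and a chunk size of 21 per line, which mirrors the full lines above.

```python
import re


def build_patterns(tlds, chunk_size=21):
    """Build filter lines in the thread's form from a list of suffixes.

    tlds: suffixes without the leading dot, e.g. ["ac", "ad"].
    chunk_size: suffixes per line (21 mirrors the lines above; an assumption).
    """
    patterns = []
    for i in range(0, len(tlds), chunk_size):
        chunk = tlds[i:i + chunk_size]
        alternation = "|".join(r"\." + t for t in chunk)
        patterns.append(f"@*({alternation})>?$")
    return patterns


# Illustrative short list; substitute the full suffix list as needed.
for line in build_patterns(["ac", "ad", "ae", "af", "ag"], chunk_size=3):
    re.compile(line)  # sanity check that the line is a valid expression
    print(line)
```

Each printed line can then be pasted into its own filter condition.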

My thanks for your help on this!

Paul

Visitor

 • 

5 Messages

2 months ago

BruceW...

I did some more testing regarding the use of the 'Header' condition in a filter rule and found my problem.

I had all upper case and the trailing ":" character in the 'Name' field of the filter. It worked once I copied the field name exactly as it is displayed, minus the ":", in the 'View Source' view of an email, which shows the fields you can select within the header of the email.

Once I got past that part, using a 'Contains' or REGEX expression worked like a charm!
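For testing a header rule offline, something like the following Python sketch can help. It uses the standard email module on a made-up message; header_matches is a hypothetical helper, and none of this is Comcast's implementation. One difference to keep in mind: Python looks up header names case-insensitively, which may not be how the webmail filter treats the 'Name' field.

```python
import re
from email import message_from_string

# Made-up raw message source for illustration only.
raw = (
    "Return-Path: <bounce@mailer.example.ru>\n"
    "From: Promo <promo@offers.example.com>\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)


def header_matches(raw_message, name, pattern):
    """Return True if any header called `name` has a value matching `pattern`."""
    msg = message_from_string(raw_message)
    values = msg.get_all(name) or []  # get_all returns None when the header is absent
    return any(re.search(pattern, v) for v in values)


print(header_matches(raw, "Return-Path", r"\.ru>?$"))
print(header_matches(raw, "Return-Path:", r"\.ru>?$"))
print(header_matches(raw, "From", r"\.ru>?$"))
```

The second call includes the trailing ":" in the name, so no header is found and nothing can match, which is the same kind of mistake described above.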

My thanks for your time!

Paul

BruceW

Gold Problem Solver

 • 

24.5K Messages

2 months ago

... I had all upper case and the trailing ":" character in the 'Name' field ...

Yeah, that would do it. It's such a shame that Comcast is so documentation-averse. Having some instructions for this stuff, preferably with a few examples, would really help!

Visitor

 • 

5 Messages

2 months ago

BruceW..

That is the other reason I have been rather verbose in this thread: to provide some details on my education/travels in using filters, along with some examples that folks can copy/paste to use on their own.

Overall a bumpy but fun journey. 

And I am very happy with the junking of spam/phishing emails coming in from locations around our planet to my inbox.  Always a good thing!

Take care.

Paul
