Mailing List Archive

A regular expression that matches SPF records
Today, I finally got around to doing something that has been on my
TODO list for about 6 months. I took the ABNF out of the SPF spec and
converted it into a regular expression. So, I now have a regular
expression that matches valid SPF records and rejects invalid ones.

I used egrep to run this regular expression over a list of 591475 SPF
records that I had found in the .com domains and it took 1.25 seconds
on my 900MHz PIII. I think this shows that you can easily do complete
syntax checking on all SPF records without any significant performance
penalty.
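
As a rough illustration of the kind of bulk check described above, here is a minimal Ruby sketch that runs a pattern over a list of records and times the pass. The pattern `SIMPLE_SPF` and the sample records are assumptions made up for this example; the pattern is a deliberately simplified stand-in, not the full SPF grammar, and a real run would read one record per line from a file.

```ruby
# Simplified stand-in pattern (an assumption for illustration): it only
# checks for the version tag followed by space-separated terms and does
# NOT validate the individual mechanisms or modifiers.
SIMPLE_SPF = /\Av=spf1(?: +[^ ]+)* *\z/i

# Hypothetical sample data standing in for a large list of records.
records = [
  "v=spf1 mx -all",
  "v=spf1 ip4:192.0.2.0/24 ~all",
  "v=spf2 mx -all",
  "spf1 a -all",
]

# Time a single pass over all records using the monotonic clock.
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
valid = records.select { |record| SIMPLE_SPF.match?(record) }
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts "#{valid.size} of #{records.size} records matched in #{elapsed} seconds"
```

Swapping in the full expression and a real record file would give a timing comparable in spirit to the egrep run, though the numbers will depend on the regex engine and the hardware.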


Using this regular expression, I have already discovered a couple of
bugs in the SPF spec's ABNF and several more in the test suite. I
think this regular expression can be used to help develop and/or test
your SPF implementations.
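
One way a regular expression can help test an implementation is as an independent oracle: run both over the same records and look at where they disagree. The sketch below assumes a tiny stand-in pattern (`ORACLE`, covering only `all`, `mx` and `a`) and a hypothetical hand-written checker (`toy_checker_accepts?`) with a deliberate bug. Neither is taken from any real SPF library.

```ruby
# Stand-in oracle (an assumption for illustration): "v=spf1" followed by
# terms from a tiny subset, each with an optional qualifier.
ORACLE = /\Av=spf1(?: +[+\-?~]?(?:all|mx|a))* *\z/i

# Hypothetical implementation under test. It has a deliberate bug:
# it does not allow a qualifier in front of a mechanism.
def toy_checker_accepts?(record)
  parts = record.split(" ")
  return false unless parts.shift&.downcase == "v=spf1"
  parts.all? { |term| %w[all mx a].include?(term.downcase) }
end

samples = ["v=spf1 mx all", "v=spf1 mx -all", "v=spf1 bogus", "v=spf2 all"]

# Keep only the records on which the two checkers give different verdicts.
disagreements = samples.reject do |record|
  ORACLE.match?(record) == toy_checker_accepts?(record)
end

disagreements.each { |record| puts "verdicts differ for: #{record}" }
```

A disagreement does not say which side is wrong. It only flags a record worth checking against the spec, which is also how bugs in the ABNF or in a test suite can surface.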



I have linked to the regular expression on the
http://www.schlitt.net/spf/tests/ webpage. It currently comes in two
forms, one that works for egrep, and one that uses Ruby/Perl-style
"extended" regular expressions.


Just for giggles, here is the "extended" version of the regular
expression:

%r{[Vv]=[Ss][Pp][Ff]1
(?:\x20+
(?:[\x2b\x2d\x3f~]?
(?:[Aa][Ll][Ll]|
[Ii][Nn][Cc][Ll][Uu][Dd][Ee]:
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d)|
[Aa]
(?::
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d))?(?:(?:/\d+)?(?://\d+)?)?|
[Mm][Xx]
(?::
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d))?(?:(?:/\d+)?(?://\d+)?)?|
[Pp][Tt][Rr]
(?::
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d))?|
[Ii][Pp]4:(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])(?:/\d+)?|
[Ii][Pp]6:
(?:::|
(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}|
(?:[0-9A-Fa-f]{1,4}:){1,8}:|
(?:[0-9A-Fa-f]{1,4}:){7}:[0-9A-Fa-f]{1,4}|
(?:[0-9A-Fa-f]{1,4}:){6}(?::[0-9A-Fa-f]{1,4}){1,2}|
(?:[0-9A-Fa-f]{1,4}:){5}(?::[0-9A-Fa-f]{1,4}){1,3}|
(?:[0-9A-Fa-f]{1,4}:){4}(?::[0-9A-Fa-f]{1,4}){1,4}|
(?:[0-9A-Fa-f]{1,4}:){3}(?::[0-9A-Fa-f]{1,4}){1,5}|
(?:[0-9A-Fa-f]{1,4}:){2}(?::[0-9A-Fa-f]{1,4}){1,6}|
[0-9A-Fa-f]{1,4}:(?::[0-9A-Fa-f]{1,4}){1,7}|
:(?::[0-9A-Fa-f]{1,4}){1,8}|
(?:[0-9A-Fa-f]{1,4}:){6}(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])
\x2e(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){6}:(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])
\x2e(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){5}:(?:[0-9A-Fa-f]{1,4}:)?
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){4}:(?:[0-9A-Fa-f]{1,4}:){0,2}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){3}:(?:[0-9A-Fa-f]{1,4}:){0,3}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){2}:(?:[0-9A-Fa-f]{1,4}:){0,4}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
[0-9A-Fa-f]{1,4}::(?:[0-9A-Fa-f]{1,4}:){0,5}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
::(?:[0-9A-Fa-f]{1,4}:){0,6}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5]))(?:/\d+)?|
[Ee][Xx][Ii][Ss][Tt][Ss]:
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d))|
[Rr][Ee][Dd][Ii][Rr][Ee][Cc][Tt]=
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d)|
[Ee][Xx][Pp]=
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d)|
[A-Za-z][\x2d\x2e0-9A-Z_a-z]*=
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*))*\x20*}x
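
The expression above repeats the same fragments many times because each grammar rule is expanded inline. As a sketch of how such a pattern can be assembled from named pieces, the Ruby below rebuilds only the ip4 mechanism from the octet fragment that appears in the expression. The constant names are made up for this example, and the `\A`/`\z` anchors are an added assumption so that a single term can be tested on its own.

```ruby
# One decimal octet, 0-255, with no leading zeros (same fragment as above).
OCTET = /(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])/

# Dotted quad built by interpolating the octet pattern four times.
IP4_NETWORK = /#{OCTET}\.#{OCTET}\.#{OCTET}\.#{OCTET}/

# Optional prefix length; as in the expression above, the digits are not
# range-checked, so "/99" is accepted syntactically.
IP4_CIDR = %r{(?:/\d+)?}

# Optional qualifier: "+", "-", "?" or "~".
QUALIFIER = /[+\-?~]?/

# A whole ip4 term, anchored so it can be tested in isolation.
IP4_TERM = /\A#{QUALIFIER}[Ii][Pp]4:#{IP4_NETWORK}#{IP4_CIDR}\z/

["ip4:192.0.2.1/24", "-ip4:10.0.0.255", "ip4:256.1.1.1", "ip4:1.2.3"].each do |term|
  puts "#{term}: #{IP4_TERM.match?(term) ? 'accepted' : 'rejected'}"
end
```

Building the full expression this way keeps each piece close to its grammar rule, at the cost of needing a language that can interpolate one pattern into another. A flat version like the one above is what a tool such as egrep requires.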




-wayne

-------
To unsubscribe, change your address, or temporarily deactivate your subscription,
please go to http://v2.listbox.com/member/?listname=spf-devel@v2.listbox.com

Re: A regular expression that matches SPF records
wayne wrote:

> I used egrep to run this regular expression over a list of 591475 SPF
> records that I had found in the .com domains and it took 1.25 seconds
> on my 900MHz PIII. I think this shows that you can easily do complete
> syntax checking on all SPF records without any significant performance
> penalty.

Hi Wayne,

I wrote a PCRE regex for SPF a couple of days ago - if you wouldn't
mind, I'd like to run it over your 600K records.


/Per Jessen, Zürich

Re: A regular expression that matches SPF records
In <d9r34m$un0$1@saturn.local.net> Per Jessen <per@computer.org> writes:

> wayne wrote:
>
>> I used egrep to run this regular expression over a list of 591475 SPF
>> records that I had found in the .com domains and it took 1.25 seconds
>> on my 900MHz PIII. I think this shows that you can easily do complete
>> syntax checking on all SPF records without any significant performance
>> penalty.
>
> I wrote a PCRE regex for SPF a couple of days ago - if you wouldn't
> mind, I'd like to run it over your 600K records.


Sorry for not getting back to you sooner.

The list of SPF records that I used can be found at:

http://www.schlitt.net/spf/spf_records_050125_com_raw.gz

As implied by the file name, these SPF records were the ones that I
found when I did a survey of all .com domains as of 2005/01/25.


-wayne
