
Mini_Rex: a tool for creating regexes from a list of words

Mini_Rex reads a list of strings (one per line; blanks and other characters
are permitted) and generates a Perl regular expression that matches any of
the input strings.
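
The factored output shown further down (shared prefixes such as "v(?:...)")
suggests a trie-style construction. The following is only a minimal sketch of
that general technique in Python, not Mini_Rex's actual implementation; the
names build_trie, trie_to_regex and words_to_regex are hypothetical.

```python
import re

def build_trie(words):
    """Build a nested-dict trie; the key "" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root

def trie_to_regex(node):
    """Return a regex fragment matching any suffix stored below this node."""
    end_here = "" in node
    branches = [re.escape(ch) + trie_to_regex(child)
                for ch, child in sorted(node.items()) if ch != ""]
    if not branches:
        return ""
    if len(branches) == 1 and not end_here:
        return branches[0]
    group = "(?:" + "|".join(branches) + ")"
    # A word may end at this node, so everything after it is optional.
    return group + "?" if end_here else group

def words_to_regex(lines):
    """One string per line; inner blanks are kept, empty lines are skipped."""
    words = [line.rstrip("\n") for line in lines]
    return trie_to_regex(build_trie([w for w in words if w]))
```

For instance, words_to_regex(["ab", "ac"]) yields a(?:b|c). This sketch only
merges common prefixes and escapes with Python's re.escape, so its output will
not be character-for-character identical to what Mini_Rex prints.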

Mini_Rex is invoked as follows:

    mini_rex [-help] [-l <num>] [-i] [-prefix="prefix string"]

        -i                        ignore character case
        -l <num>                  maximum pattern length
                                  (default: no pattern length limit)
        -prefix="prefix string"   prepend "prefix string" to each output line
                                  (default: no prefix)

As an example, let's feed Mini_Rex a text file with 80 or so spellings
of the ever-popular V-word:

% mini_rex -prefix='body Viagra_OBFU' -i < viagra.txt
body Viagra_OBFU /(?:\\\/(?:1agr4|\ \|\ a\ g\ r\ a)|v(?:1\@gra
|\ (?:1\ (?:\@\ g\ r\ |a\ g\ r\ ?)a|\!\ a\ g\ r\ a|\|\ a\ g\ r\ a
|i(?:\ (?:\@\ g\ (?:\.r\.\ a|r\ [\@a])|a(?:\ g\ r\ [\@a]|\.g\ r\ a))
|\.\@\.g\.r\.a)|l\ (?:\@\ g\ r\ \@|a\ g\ r\
|\|i\|a\|g\|r\|a|\~(?:\ i\ \~\ a\~\ g\~\ r\~\ |i(?:\~a\~g\~r\~
|ag\~r))a|_(?:1_\@_g_|i(?:\ a\ g\ |\.a\.g\.|_a_g_))r_a|i(?:\ (?:\ ag
|ag\ )ra|\&acirc\;gra|\-(?:\-a\,g\-r\-\-r\,|ag\.r)a|\.(?:a\.)?g\.r\.a
|\@raga|a(?:g(?:\-g?ra|gra|rax)|rgg?ra)|magera)|jiagmra|ye\ agrah))/i
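
A generated pattern of this size is hard to check by eye. One way to gain
confidence is to confirm that every input string still matches it. The sketch
below does this with Python's re module, on the assumption that the pattern
uses only syntax common to Perl and Python; the helper name unmatched is
hypothetical.

```python
import re

def unmatched(pattern, words, ignore_case=False):
    """Return the words that the pattern fails to match in full."""
    flags = re.IGNORECASE if ignore_case else 0
    compiled = re.compile(pattern, flags)
    return [word for word in words if compiled.fullmatch(word) is None]
```

An empty result means every word in the list is covered. Pass the pattern
without the surrounding slashes and the trailing "i", and set ignore_case
instead.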

Other tools:

Expand_regex, a tool that assists in understanding and debugging complicated
regular expressions.

Split_mail, a tool for splitting large mbox files into smaller ones, each
containing at most a specified number of messages.
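
The splitting idea can be sketched with Python's standard mailbox module.
This is an illustration only, not Split_mail itself; the function name
split_mbox and the numbered output naming scheme (prefix.000, prefix.001,
and so on) are assumptions.

```python
import mailbox

def split_mbox(src_path, out_prefix, max_messages):
    """Copy messages from src_path into chunks of at most max_messages."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    out_paths = []
    out = None
    src = mailbox.mbox(src_path, create=False)
    try:
        for index, message in enumerate(src):
            if index % max_messages == 0:
                # Start a new output file for each full chunk.
                if out is not None:
                    out.close()
                path = "%s.%03d" % (out_prefix, len(out_paths))
                out = mailbox.mbox(path)
                out_paths.append(path)
            out.add(message)
    finally:
        if out is not None:
            out.close()
        src.close()
    return out_paths
```

Note that mailbox.mbox appends to an output file that already exists, so the
output prefix should point at fresh file names.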

Extract_url's, a tool that extracts URLs from the messages in an mbox file
and validates them by attempting to fetch the referenced page.
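
The extract-then-fetch approach might look roughly like the following Python
sketch. It is not the tool's actual code: the URL pattern is a deliberately
simple assumption, and the names extract_urls, urls_in_mbox and is_fetchable
are hypothetical.

```python
import http.client
import mailbox
import re
import urllib.request

# Simplified assumption of what counts as a URL: http(s) up to whitespace,
# an angle bracket, or a quote.
URL_RE = re.compile(r"https?://[^\s<>\"']+")

def extract_urls(text):
    """Return distinct http(s) URLs in text, in order of first appearance."""
    found = []
    for url in URL_RE.findall(text):
        url = url.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        if url not in found:
            found.append(url)
    return found

def urls_in_mbox(path):
    """Collect URLs from the text parts of every message in an mbox file."""
    urls = []
    box = mailbox.mbox(path, create=False)
    try:
        for message in box:
            for part in message.walk():
                if part.get_content_maintype() != "text":
                    continue
                payload = part.get_payload(decode=True) or b""
                for url in extract_urls(payload.decode("utf-8", "replace")):
                    if url not in urls:
                        urls.append(url)
    finally:
        box.close()
    return urls

def is_fetchable(url, timeout=10):
    """Return True if fetching the URL succeeds; any failure counts as False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        return False
```

A caller would typically run is_fetchable over the list returned by
urls_in_mbox. Fetching addresses taken from untrusted mail can alert the
sender, so this is best done from an isolated host.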