crm114 20040409-BlameMarys-auto.1

NAME

crm114 - The Controllable Regex Mutilator

SYNOPSIS

crm [flags] crmfile

    -d N       enter debugger after running N cycles; omitting N means N equals 0
    -e         do not import any environment variables
    -h         print help text
    -p         generate an execution-time-spent profile on exit
    -P N       max program lines
    -q m       mathmode (0,1 = alg/RPN only in EVAL; 2,3 = alg/RPN everywhere)
    -s N       new feature file (.css) size is N (default 1 meg+1 featureslots)
    -S N       new feature file (.css) size is N rounded to 2^I+1 featureslots
    -t         user trace output
    -T         implementors trace output (only for the masochistic!)
    -u dir     chdir to directory dir before starting execution
    -v         print CRM114 version identification and exit
    -w N       max data window (bytes, default 16 megs)
    --         signals the end of CRM114 flags; prior flags are not seen by the user program; subsequent args are not processed by CRM114
    --foo      creates the user variable :foo: with the value SET
    --x=y      creates the user variable :x: with the value y
    -{ stmts}  execute the statements inside the {} brackets
    crmfile    .crm file name

DESCRIPTION

CRM114 is a language designed to write filters in. It caters to filtering email, system log streams, html, and other marginally human-readable ASCII that may occasion to grace your computer.

CRM114's unique strengths are its data structure (everything is a string, and a string can overlap another string), its ability to work on truly infinitely long input streams, its ability to use extremely advanced classifiers to sort text, and its ability to do approximate regular expressions (that is, regexes that don't quite match) via the TRE regex library.

CRM114 also sports a very powerful subprocess control facility, and a unique syntax and program structure that puts the fun back in programming (OK, you can run away screaming now). The syntax is declensional rather than positional; the type of quote marks around an argument determines what that argument will be used for.

The typical CRM114 program uses regex operations more often than addition (in fact, math was only added to TRE in the waning days of 2003, well after CRM114 had been in daily use for over a year and a half).

In other words, crm114 is a very very powerful mutagenic filter that happens to be a programming language as well.

The filtering style of the CRM-114 discriminator is based on the fact that most spam, normal log file messages, or other uninteresting data is easily categorized by a few characteristic patterns (such as "Mortgage leads", "advertise on the internet", and "mail-order toner cartridges"). CRM114 may also be useful to folks who are on multiple interlocking mailing lists.

In a bow to Unix-style flexibility, by default CRM114 reads its input from standard input and sends its output to standard output. Note that the default action has a zero-length output. Redirection and use of other input or output files is possible, as well as the use of windowing, either delimiter-based or time-based, for real-time continuous applications.

CRM114 can be used for more than mail filtering; consider it to be a version of grep with super powers. If perl is a seventy-bladed Swiss Army knife, CRM114 is a razor-sharp katana that can talk.

INVOCATION

Absent the -{ program } flag, the first argument is taken to be the name of a file containing a crm114 program; subsequent arguments are merely supplied as :_argN: values. Use single quotes around command-line programs '-{ like this }' to prevent the shell from doing odd things to them.

CRM114 can be directly invoked by the shell if the first line of your program file uses the shell standard, as in:

#! /usr/bin/crm

You can use CRM114 flags on the shell-standard invocation line, and hide them with '--' from the program itself; '--' incidentally prevents the invoking user from changing any CRM114 invocation flags.

Flags should be located after any positional variables on the command line. Flags are visible as :_argN: variables, so you can create your own flags for your own programs (separate CRM114 and user flags with '--'). Two examples of how to do this:

./foo.crm bar mugga < baz  -t -w 150000
./foo.crm -t -w 1500000 -- bar < baz mugga

One example of how not to do this:

./foo.crm -t -w 150000 bar < baz mugga

(That's WRONG!)
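
The role of '--' described above can be modeled in a few lines. What follows is a minimal Python sketch of the convention, not CRM114's own argument parser; the function name split_at_double_dash is hypothetical, and the sketch assumes the first bare '--' on the line is the separator.

```python
def split_at_double_dash(args):
    """Split an argument list at the first bare '--'.

    Returns (before, after). With no '--' present, every argument
    lands in 'before' and 'after' is empty.
    """
    if "--" in args:
        i = args.index("--")
        return args[:i], args[i + 1:]
    return list(args), []


# Arguments taken from the second example above (shell redirection removed).
before, after = split_at_double_dash(["-t", "-w", "1500000", "--", "bar", "mugga"])
print(before)  # the CRM114 flags, hidden from the user program
print(after)   # the arguments CRM114 leaves unprocessed
```

In this model the list before '--' corresponds to flags that CRM114 consumes and hides from the program, and the list after it corresponds to arguments that CRM114 does not process.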

You can put a list of user-settable vars on the #!/usr/bin/crm invocation line. CRM114 will print these out when a program is invoked directly (e.g. "./myprog.crm -h", not "crm myprog.crm -h") with the -h (for help) flag. Note that this works ONLY with bash on Linux; the *BSDs have a different bash interpretation, and this does not work there.

Example:

#!/usr/bin/crm  -( var1 var2=A var2=B var2=C )

This allows only var1 and var2 to be set on the command line. If a variable is not assigned a value, the user can set any value desired. If the variable is equated to a set of values, those are the only values allowed.

Another example:

#!/usr/bin/crm  -( var1 var2=foo )  --

This allows var1 to be set to any value; var2 may be set only to foo or left unset; and no other variables may be set, nor may invocation flags be changed (because of the trailing '--'). Since '--' also blocks '-h' for help, such programs should provide their own help facility.
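
The rules for the -( ... ) list can be restated as a small lookup. The Python sketch below is only a model of those rules under stated assumptions, not CRM114's implementation: the names parse_allowed and is_permitted are hypothetical, the input is assumed to be the text between '-(' and ')', and a variable listed bare is assumed to stay unrestricted even if a name=value entry for it also appears.

```python
def parse_allowed(spec):
    """Turn 'var1 var2=A var2=B' into {name: None or set of values}.

    None means the variable may be set to any value.
    """
    allowed = {}
    for token in spec.split():
        name, sep, value = token.partition("=")
        if not sep:
            # Bare name: any value is acceptable (assumption: this wins).
            allowed[name] = None
        elif allowed.get(name, set()) is not None:
            # Repeated name=value entries accumulate the permitted values.
            allowed.setdefault(name, set()).add(value)
    return allowed


def is_permitted(allowed, name, value):
    """Return True if setting name to value is consistent with the list."""
    if name not in allowed:
        return False
    values = allowed[name]
    return values is None or value in values


allowed = parse_allowed("var1 var2=A var2=B var2=C")
print(is_permitted(allowed, "var1", "anything"))  # True: var1 is unrestricted
print(is_permitted(allowed, "var2", "B"))         # True: B is listed
print(is_permitted(allowed, "var2", "D"))         # False: D is not listed
print(is_permitted(allowed, "var3", "x"))         # False: var3 is not in the list
```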

VARIABLES

Variable names and locations start with a : , end with a : , and may contain only characters that have ink (i.e. the [:graph:] class), with a few exceptions.
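
As a rough illustration of the naming rule, the Python sketch below checks a candidate name against it. This is an approximation under stated assumptions, not CRM114's own check: the helper name looks_like_var_name is hypothetical, [:graph:] is taken to mean the ASCII range 0x21 through 0x7E, and the exceptions to the rule are not modeled, since they are not spelled out here.

```python
import re

# ASCII members of the POSIX [:graph:] class run from 0x21 ('!') to 0x7E ('~').
# Assumption: at least one such character sits between the two colons.
VAR_NAME = re.compile(r":[\x21-\x7e]+:")


def looks_like_var_name(text):
    """Return True if text starts and ends with ':' and holds only inked characters."""
    return VAR_NAME.fullmatch(text) is not None


print(looks_like_var_name(":here:"))                      # True
print(looks_like_var_name(":every-where_0123+45%6789:"))  # True
print(looks_like_var_name(":has space:"))                 # False: space has no ink
print(looks_like_var_name("no_colons"))                   # False: missing delimiters
```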

Examples: :here:, :ThErE:, :every-where_0123+45%6789:, :this_is_a_very_very_long_var_name_that_does_not_tell_us_much:.

Builtin variables: