Distfile Processing

We've seen how to use distrib to select machines. For each selected machine, distrib processes a Distfile, and then uses it to rdist files. The Distfile is intended to be an rdist Distfile after this processing, but might not be. In that situation, you probably wouldn't want to run rdist on it. We'll get to that later.

Distfiles

By default, distrib uses a file named "Distfile" (or "distfile" if there is no "Distfile") as the Distfile. You can specify another one with the -f option (use "-" as the filename to read it from stdin).

The original intent was for Distfile to be an rdist Distfile using the macros defined for the machines. Instead of explicitly putting in the target hostname, for example, you could just put HOST, and for each machine it would use the Distfile tailored to that machine.

This processing is done using m4. Because of this, much more than simple text substitution can be done. There are whole books on m4, so we won't go into too much detail (though the beauty, and the hazard, of it is its simplicity), but see the later examples for some common uses. (I also recommend you go bone up on m4 at this point.)

Besides expanding the macros set by the configuration file for each machine, distrib also runs m4 (with the same macros used on the Distfile) over each file listed in the Distfile with '@'s around its name (e.g., "@Makefile@"), and replaces that name with the name of the processed file. This happens only if the Distfile is actually named "Distfile" or "distfile", or if the -F or -c option is used.
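
To make the '@' convention concrete, here is a minimal Python sketch of the name-rewriting step only. It is not distrib's code: the pattern (no white-space or '@' inside the name), the function names, and the "/tmp/distrib." naming scheme are all assumptions made for illustration, and the m4 run on each file is left out.

```python
import re

# Assumed pattern: a name with no white-space or '@' inside, wrapped in '@'s.
AT_NAME = re.compile(r"@([^@\s]+)@")


def rewrite_at_names(distfile_text, processed_name):
    """Replace each @name@ with processed_name(name); return (text, names)."""
    names = []

    def swap(match):
        name = match.group(1)
        names.append(name)  # remember which files would need an m4 pass
        return processed_name(name)

    return AT_NAME.sub(swap, distfile_text), names


def tmp_name(name):
    # Hypothetical naming scheme for the processed copy of a file.
    return "/tmp/distrib." + name


text = "FILES = ( @Makefile@ README )\n"
new_text, found = rewrite_at_names(text, tmp_name)
print(new_text, end="")
print(found)
```

In this sketch only "Makefile" is collected for processing; "README" has no '@'s around it and passes through untouched.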

So not only can we customize Distfile by machine, but each file listed in the Distfile can also be tailored based on the attributes of the machine. Does your head hurt yet?

More on Configuration Files

Besides empty lines, comments, and machine entries, a configuration file can also contain lines that just set macros. These lines are of the form "NAME=value". The same form can be used to unset a macro (set it equal to a dot). Macros stay set until you change them, so you can use them in place of columns that would be constant or nearly so.

Column entries in machines can also contain just a dot (.) to unset the macro (we've seen that mentioned in relation to HASSRC and -S). If you want to set a macro to have an empty value, use "". If you actually want to set it to be a period, you have to put it in quotes.
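
The value rules above (a bare dot unsets, "" is empty, a quoted dot is a literal period) can be sketched in Python as follows. This only models the rules as stated here. The function names are hypothetical, the white-space stripping is an assumption, and backslash escapes are deliberately not handled.

```python
def apply_value(macros, name, raw):
    """Apply one raw entry (from a NAME=value line or a column) to macros."""
    if raw == ".":
        macros.pop(name, None)  # bare dot: unset the macro
    elif len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        macros[name] = raw[1:-1]  # quoted: "" gives empty, "." gives a period
    else:
        macros[name] = raw  # anything else is taken literally
    return macros


def apply_assignment(macros, line):
    """Handle a 'NAME=value' line (stripping white-space is an assumption)."""
    name, _, raw = line.partition("=")
    return apply_value(macros, name.strip(), raw.strip())


macros = {"HASSRC": "1"}
for line in ["OS=solaris", "HASSRC=.", 'EMPTY=""', 'DOT="."']:
    apply_assignment(macros, line)
print(macros)
```

After these four lines, HASSRC is gone entirely, while EMPTY is still defined with an empty value. That is the distinction between unsetting a macro and setting it to nothing.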

If the value you want to set contains double quotes, you have to backslash them, but this only works on an = line, not in a column. This is a bug.

Additionally, macros can be set on distrib's command line with the -D option. These are overridden by macros specified (set or unset) in the configuration file, as if the -D macros had been set before the configuration file was read.
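
A small Python sketch of that precedence, under the assumption that it behaves like "start from the -D macros, then apply the configuration file on top". The function name and the use of None to stand for an unset (a bare dot) are illustrative choices, not anything distrib defines.

```python
def effective_macros(command_line, config_assignments):
    """Start from the -D macros, then let the configuration file override."""
    macros = dict(command_line)
    for name, value in config_assignments:
        if value is None:  # None stands for an unset (a bare dot)
            macros.pop(name, None)
        else:
            macros[name] = value
    return macros


from_dash_d = {"OS": "linux", "DEBUG": "1", "SITE": "lab"}
from_config = [("OS", "solaris"), ("DEBUG", None)]
result = effective_macros(from_dash_d, from_config)
print(result)
```

Here OS is replaced, DEBUG is unset, and SITE survives because the configuration file never mentions it.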

Distrib and M4

Depending on which version of m4 you're using, you may need to take more precautions with certain types of files during processing. Scripts for the UNIX shells, for instance, are often full of quotation marks which will do odd things when processed by m4.

We recommend the use of GNU m4 on all platforms so you don't have to worry about inconsistencies among different vendors. It also has some useful features to make things easier.

If you need to run a different version of m4 than the one compiled in, you can use the environment variable M4_PATH to choose a different one. You could also use this to run a completely different program in place of m4, but that's of dubious utility.

You'll notice we tend to only define macros with all-capital names. This is certainly not a requirement of either m4 or distrib. It's primarily to cut down on unwanted macro expansion and to make input files easier to read.

It is also common practice to change the quote characters used by m4, as well as to quote any parts of the file that shouldn't have macros run on them (most of most files). These practices will be described more fully in the examples.

The -G option is useful for doing selections more complicated than those offered by distrib's other options. We'll see some detailed examples, but for now, think about expressions you could write that would produce "a non-empty string containing any character other than white-space or the digit zero (0)" based on the macros you already have set. There are a lot of possibilities.
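
The quoted rule is easy to misread, so here is a Python sketch of it, assuming it is applied to the text the -G expression expands to. The function name is hypothetical, and Python's notion of white-space may differ slightly from distrib's.

```python
def selects(expanded):
    """True if the text has any character that is neither white-space nor '0'."""
    return any(ch != "0" and not ch.isspace() for ch in expanded)


for sample in ["", "0", " 0 \n", "00", "10", "yes"]:
    print(repr(sample), selects(sample))
```

Note that "00" does not select, since it contains nothing but zeros, while "10" does.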

One more thing you can do with m4 is process your configuration file. If a configuration file has a name that ends in ".mcf", it is run through m4 before distrib sees it. Use distrib's -D option to set macros to tailor that file. This is most useful in situations where a configuration file might not be editable, such as on a CD or other read-only filesystem, but needs to be modified for different uses.
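
As a rough Python sketch of the decision described above, the following builds the m4 command line for a ".mcf" file and leaves other files alone. How distrib really invokes m4 is not documented here, so the function name, the fallback "m4" (standing in for the compiled-in path), and the idea of passing distrib's -D macros straight through as m4's own -D option are all assumptions.

```python
import os


def config_command(path, defines, environ=os.environ):
    """Return an m4 argv for a ".mcf" file, or None if the file is read as-is."""
    if not path.endswith(".mcf"):
        return None
    # M4_PATH wins if set; "m4" stands in for the compiled-in default.
    m4 = environ.get("M4_PATH") or "m4"
    args = [m4]
    for name, value in defines.items():
        args.append("-D{}={}".format(name, value))  # m4's -DNAME=VALUE form
    args.append(path)
    return args


print(config_command("hosts.mcf", {"SITE": "lab"}, environ={}))
print(config_command("hosts.cf", {"SITE": "lab"}, environ={}))
```

The sketch only builds the argument list; it does not run m4, so it can be tried on a machine without m4 installed.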

When is a Distfile not a "Distfile"?

As we mentioned above in relation to the -F option, distrib handles files differently if it believes they're actually Distfiles intended for rdist.

What other files could we process? Certainly if we just wanted to tailor a file, but not rdist it, we could give it directly to distrib as the Distfile. That could be useful.

It could also be used for other scripts or lists that contain other files' names but are not actual Distfiles. For example, distrib could tailor a program in another language (such as Perl) that will in turn handle other files processed by distrib.

Part of the Process

Remember that the actions listed above are done for each selected machine (sequentially and in configuration-file order in current implementations, but who knows what the future holds?). If you want to handle machines in groups instead of one by one, select one member of each group and then make a script for the rest, or run distrib again.

Once we have a processed Distfile, distrib either sends it to stdout if the -E option is given, or uses it for the next step.