Distrib Runs Rdist (or Not)

So far, we've seen that distrib is used to select machines. For each selected machine, a Distfile (and any listed files whose names are quoted with '@'s) is processed with m4. Once this is done (and still for each selected machine), distrib takes some action. By default, that's running rdist with the processed Distfile.

Running Rdist

If we really want to run rdist on the processed Distfile, we're in luck, because that's the default action. Most of distrib's options are just passed down to rdist or influence its behavior. Brush up on your rdist and look over the distrib options to see what you can do.

Many people will be most interested in -P, which sets the transport mechanism for rdist, so you could use, for example, ssh instead of rsh.

You can also pass anything you want to rdist by using the -o option. (This might be more useful if "rdist" isn't rdist.)

Recall that the files listed in the processed Distfile may themselves be processed. When that happens, their names will change, but distrib does a good job of hiding that from you, as long as you don't try to do anything "clever" in your Distfile.

... or Not!

Sending files via rdist is just one thing you can do with distrib and, while useful, it's far from the most interesting one.

The UNIX philosophy is that lots of programs that do small jobs well can be used together to do many tasks. If we take this approach to distrib, we see that being able to run any other program on the processed Distfile instead of rdist opens up a lot of possibilities.

In fact, we can use nearly any program in place of rdist. There are several ways to do this, and a few limitations on each approach.

The Ol' Bait and Switch

Why isn't there an option to specify which program to run? There probably should be, but you don't really need it. If the m4 macro or environment variable RDIST_PATH is set, that's the program that will be run. It will usually be run with the options "-f -" as well as whatever other rdist options result from the options given to distrib. It will get the processed Distfile as its stdin.

If the options given will confuse your program or you can't make use of the processed Distfile on stdin, you could write a wrapper script or use a different approach.
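As a rough sketch of such a wrapper, assuming only what was said above (rdist-style options such as "-f -" on the command line, the processed Distfile on stdin): the function below throws the options away, saves stdin to a temporary file, and hands that file to another program as an ordinary argument. The names run_on_distfile and REAL_PROGRAM are made up for this illustration; distrib knows nothing about them.

```shell
#!/bin/sh
# Hypothetical wrapper to name in RDIST_PATH.
# Assumptions: called with rdist-style options (e.g. "-f -") and given the
# processed Distfile on stdin. REAL_PROGRAM is an illustrative variable only.

run_on_distfile() {
    # "$@" is ignored on purpose: the rdist options would confuse the target.
    tmp=$(mktemp) || return 1
    cat >"$tmp"                        # capture the processed Distfile
    "${REAL_PROGRAM:-cat}" "$tmp"      # pass it along as a plain file argument
    status=$?
    rm -f "$tmp"                       # clean up whether or not it worked
    return "$status"
}

# In a stand-alone script, the last line would be:
#   run_on_distfile "$@"
```

With REAL_PROGRAM unset this just cats the Distfile back out, which is a handy way to see exactly what your program would have received.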

If you use the environment variable RDIST_PATH, don't forget that you changed it. You might want to unset it or set it to the location of a real rdist (or whatever it was before you changed it).

Instead of setting RDIST_PATH in the environment, you can set it as a macro. This allows you to set it by machine, and doesn't mess with your environment. If it's set in both places, the macro takes precedence. If the macro is unset, it goes back to the environment value.
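One way to avoid forgetting is never to change your shell's environment in the first place. In a Bourne-style shell, an assignment placed in front of a command applies to that command only. The sketch below uses sh -c as a stand-in for the distrib command so it runs anywhere, and /usr/local/bin/my-wrapper is a made-up path.

```shell
# Start from a clean slate for the demonstration.
unset RDIST_PATH

# The assignment is visible only to this one command (a stand-in for distrib).
RDIST_PATH=/usr/local/bin/my-wrapper sh -c 'echo "during: ${RDIST_PATH:-unset}"'
# prints "during: /usr/local/bin/my-wrapper"

# The calling shell never saw the change, so there is nothing to undo.
echo "after: ${RDIST_PATH:-unset}"
# prints "after: unset"
```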

If your Distfile is intended to always be run with a different program than a real rdist, putting the program in the configuration file as a macro is the best approach. If it is only rarely run with a different program, the environment variable is easier.

One For All

Another option that we've briefly mentioned already is to use distrib's -E option. Just as cc -E emits the preprocessor's output instead of sending it on to be compiled, distrib -E just prints the processed Distfiles to stdout. The good and bad news is that all the processed Distfiles (one for each selected machine) will be emitted together.

This could be useful if we just wanted to generate a list. distrib has a built-in option, -H, that does this for a degenerate Distfile containing only "HOST" (the manual page lists an equivalent invocation), so distrib -H just prints HOST for each machine. If there's no HOST macro defined, you won't get anything, of course.

Similarly, you could use a small Distfile to print any other attribute(s) of the selected machines. Such a list could be informative in its own right or could form a script or a list of arguments for use with xapply.
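Here is a minimal sketch of turning such a list into a script. It assumes the list arrives one host name per line, and it uses a made-up list_hosts function in place of distrib so that it runs without distrib installed. The host names and the ssh/uptime command are illustrative only.

```shell
# Stand-in for `distrib -H` so the sketch runs without distrib installed.
# Assumption: one host name per line. The names are invented.
list_hosts() {
    printf '%s\n' alpha.example.com beta.example.com
}

# Read host names on stdin and print one command line per host.
# Printing instead of executing lets us look the script over first.
hosts_to_script() {
    while read -r host; do
        [ -n "$host" ] || continue     # skip blank lines
        printf 'ssh %s uptime\n' "$host"
    done
}

list_hosts | hosts_to_script
# prints:
#   ssh alpha.example.com uptime
#   ssh beta.example.com uptime
```

Once the generated commands look right, the same pipeline could be fed from distrib -H (with whatever options select your machines) and piped on to sh.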

Choose Your Own Adventure

Now that we've selected machines and processed some files, we get to do something with the Distfile(s) (and indirectly, maybe other files as well). By default, that action is to run rdist, and we can pass it options to change its behavior.

We can also run another command in place of rdist in some cases if we want to handle each processed Distfile separately. This isn't usually necessary, but it's a nice feature.

Finally, we can use the output of distrib (the processed Distfile(s) catted together) to feed another command. This could be used to generate a list or file or in some kind of a loop to execute other commands.

With these alternatives available, distrib provides us with a shorthand for many types of tasks including, but certainly not limited to, file distribution.