A filter is assigned to a file by use of the Filter= directive in the file's record in its index.cache file. For example, the lines
File=foo.gz
Content-type=text/plain
Filter=/usr/local/bin/zcat
Content-encoding=none
cause the compressed file foo.gz to be uncompressed on the fly and served to the client as a text/plain document. Notice that the content-encoding line is necessary to override the default action of wndex, which is to infer from the ".gz" suffix that the content-encoding is x-gzip. If the compressed file were named simply "foo" then the content-encoding line would be unnecessary.
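As a rough sketch of what a decompressing filter such as zcat does in this arrangement, the Python below copies a gzip stream to plain output. It assumes that the server attaches the file being filtered to the filter's standard input and serves whatever the filter writes to standard output. The names gunzip_stream and main are made up for illustration and are not part of WN.

```python
import gzip
import io
import shutil
import sys


def gunzip_stream(src, dst):
    """Copy the gzip-compressed bytes in src to dst, decompressed."""
    with gzip.GzipFile(fileobj=src, mode="rb") as unpacked:
        shutil.copyfileobj(unpacked, dst)


def main():
    # Assumption: stdin is the file being filtered, stdout goes to the client.
    gunzip_stream(sys.stdin.buffer, sys.stdout.buffer)


if __name__ == "__main__":
    # In-memory demonstration; an installed filter would call main() instead.
    packed = io.BytesIO(gzip.compress(b"hello\n"))
    out = io.BytesIO()
    gunzip_stream(packed, out)
    print(out.getvalue())  # prints b'hello\n'
```

Note that the script prints no HTTP headers of its own.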
The value of the Filter= directive is a path to a file, which can take one of three forms. If the path begins with a '/' then it is relative to the system root. If it begins with ~/ then it is relative to the WN hierarchy root. Otherwise it is relative to the directory containing the index file.
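The three path forms can be illustrated with a small resolver. This is only a sketch of the rule as stated here, not WN's own code; the function name resolve_filter_path and the example directories are hypothetical.

```python
import posixpath


def resolve_filter_path(value, wn_root, index_dir):
    """Return the absolute path that a Filter= value refers to."""
    if value.startswith("~/"):
        # Relative to the WN hierarchy root.
        return posixpath.join(wn_root, value[2:])
    if value.startswith("/"):
        # Relative to the system root, i.e. already absolute.
        return value
    # Anything else is relative to the directory holding the index file.
    return posixpath.join(index_dir, value)


if __name__ == "__main__":
    root, here = "/usr/local/wn", "/usr/local/wn/docs"  # hypothetical paths
    for v in ("/usr/local/bin/zcat", "~/bin/myfilter", "somescript"):
        print(v, "->", resolve_filter_path(v, root, here))
```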
If a file has a filter, only that file will be filtered, not any wrappers or includes.
The ability to filter files can be restricted in several ways. If WN is invoked with the -e option then no includes, filters, or CGI programs will be executed.
The -E option in conjunction with the -t or -T options restricts the use of filters to those listed in index.cache files owned by trusted users or groups. The -u option allows only the use of filters owned by the owner of the index.cache file which lists them.
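The ownership rule behind the -u option can be pictured as a comparison of file owners. The sketch below is not taken from WN; it assumes a Unix-like system where the owner is the numeric user ID reported by stat, and the function name filter_allowed_by_owner is hypothetical.

```python
import os


def filter_allowed_by_owner(filter_path, index_cache_path):
    """Return True if both files are owned by the same user ID."""
    filter_uid = os.stat(filter_path).st_uid
    index_uid = os.stat(index_cache_path).st_uid
    return filter_uid == index_uid
```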
The directory directive
Default-Filter=/path2/filter
specifies that files in this directory should all be treated as if the Filter file directive had been set to /path2/filter. To override this setting and specify no Filter use the "Filter=<none>" directive.
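The interaction between the directory default and a per-file setting can be summarized as follows. This is a sketch under one assumption not stated here: that a file's own Filter= value takes precedence over Default-Filter=. The function name effective_filter is hypothetical.

```python
def effective_filter(file_filter, default_filter):
    """Pick the filter that applies to one file record.

    file_filter    -- value of the file's Filter= directive, or None if absent
    default_filter -- value of the directory's Default-Filter=, or None
    """
    if file_filter == "<none>":
        return None  # explicit opt-out of the directory default
    if file_filter is not None:
        return file_filter  # assumption: a per-file filter wins
    return default_filter


if __name__ == "__main__":
    print(effective_filter(None, "/path2/filter"))      # prints /path2/filter
    print(effective_filter("<none>", "/path2/filter"))  # prints None
```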
The first thing to note is that there is no requirement that the filter program or script actually make use of the file being filtered. (This file must exist though.) Thus if an empty file "foo" is created and has an index file entry like
File=foo
Content-type=text/html
Filter=somescript
Attributes=parse,cgi
then the output of the program "somescript" will be served. A program or script used in this way differs somewhat from a CGI script, in that no headers should be supplied by the script as WN will automatically provide them. For example, while a CGI script typically starts with printing "Content-type: text/html" followed by a blank line, this should not be done for "somescript" in the index entry above, because WN will automatically provide the appropriate HTTP headers based on the "Content-type=text/html" line in the index file.
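A script playing the role of "somescript" might look like the following Python sketch. The page content and the function name render_page are invented for illustration. The point is only that the script writes the document body and nothing else.

```python
import sys


def render_page():
    """Build the HTML body; no HTTP headers are included."""
    return (
        "<html><head><title>Hello</title></head>\n"
        "<body><p>Generated by a filter.</p></body></html>\n"
    )


if __name__ == "__main__":
    # The contents of "foo" on standard input are ignored. Unlike a CGI
    # script, nothing like "Content-type: text/html" is printed first.
    sys.stdout.write(render_page())
```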
An important difference between filters and CGI programs is that the output of filters can be parsed while CGI output cannot. The fact that you want the output parsed must be signalled by the use of an Attributes=parse line.
If you wish to have all the standard CGI environment variables made available to the filter program you can do so by adding the line
Attributes=cgi
to the file record. A list of these environment variables can be found in Appendix D. (Also see the sample CGI script located in the file /docs/examples/sample.cgi, which accompanies the WN distribution.)
One difference between CGI scripts and filters is that with filters there is no way to have a non-empty PATH_INFO environment variable since anything appended to the path part of the URL will be interpreted as a path to an actual file. Of course the "query" part of a URL (everything after a '?') will work for filters as well as CGI scripts and its contents will be put in the QUERY_STRING environment variable.
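For instance, a filter could read the query part of the URL as in the Python sketch below. It assumes the file record carries Attributes=cgi so that QUERY_STRING is present in the environment; the function name render and the HTML layout are hypothetical.

```python
import html
import os
from urllib.parse import parse_qs


def render(query_string):
    """Return an HTML list of the decoded query parameters."""
    params = parse_qs(query_string, keep_blank_values=True)
    items = "".join(
        "<li>{}: {}</li>\n".format(html.escape(k), html.escape(", ".join(v)))
        for k, v in sorted(params.items())
    )
    return "<ul>\n" + items + "</ul>\n"


if __name__ == "__main__":
    # An unset QUERY_STRING is treated as an empty query.
    print(render(os.environ.get("QUERY_STRING", "")), end="")
```

Escaping the values before printing them matters here, since the query string is supplied by the client.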
Another difference between CGI and filter scripts is in the handling of POST or PUT data. A CGI script reads the data provided by the client on its standard input. This is not possible for a filter, since its standard input is attached to the file it is supposedly filtering. To use the PUT or POST method with a filtered file, the Attributes=post directive must be used, since otherwise the server will not permit a POST or PUT. It is then possible to read the POSTed data by opening and reading the temporary file containing this data. The name of this file changes with each request, but if Attributes=cgi is used then the name is given in the environment variable HTTP_POST_FILE or HTTP_PUT_FILE, depending on the method used to submit the data.
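A filter could pick up the submitted data along these lines. This Python sketch assumes both Attributes=post and Attributes=cgi are set on the file record, so that the server accepts the request and exports the temporary file name. The function name read_submitted_data is hypothetical.

```python
import os


def read_submitted_data(environ):
    """Return the raw request body as bytes, or None if no data file is named."""
    for var in ("HTTP_POST_FILE", "HTTP_PUT_FILE"):
        path = environ.get(var)
        if path:
            with open(path, "rb") as f:
                return f.read()
    return None


if __name__ == "__main__":
    body = read_submitted_data(os.environ)
    if body is None:
        print("<p>No submitted data.</p>")
    else:
        print("<p>Received {} bytes.</p>".format(len(body)))
```

Standard input is left untouched here, since it carries the filtered file rather than the client's data.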
One advantage of using a filter instead of a CGI script is that it may have slightly better security. With a filter the name of the executed program is never visible outside the server. It is not in any URL and it is not in any served file. Perhaps a more important feature is that no arguments can be supplied to a filter except those listed in the index file filter entry. Unlike CGI scripts, it is not possible for a remote user to supply any arguments whatsoever to the script.