John DiMarco on Computing (and occasionally other things)
I welcome comments by email to jdd at cs.toronto.edu.

Fri 11 Jul 2008 20:46

Web shell scripting
It is very handy to be able to write a quick script. UNIX and its derivatives have long been superb at making this possible: they offer a great many utilities designed to be used both from the command line and within scripts, and their shells have all the control structures one might expect from any programming language. In fact, the traditional UNIX philosophy is to write small programs that do one thing well, then combine them using scripts into rich and powerful applications. Indeed, the UNIX scripting environment is a rich one. But it is difficult to write shell scripts for the web. The UNIX scripting environment is designed for files, not web forms, whose contents arrive as URL-encoded or multipart-encoded data. Hence, while UNIX shell scripts are sometimes used for web applications (CGI scripts), such scripts are relatively rare and generally frowned upon. The reason is no surprise: URL-encoded and multipart-encoded data are complex to parse, and shell scripts that parse such data using sed, awk, etc. tend to be slow and hard to get right.

But this is easily fixed. If UNIX shell scripts like files, then they should be fed files. Hence I've written a small program (in C and lex), urldecode (ftp://ftp.cs.toronto.edu/pub/jdd/urldecode), that parses URL-encoded and multipart-encoded data and converts it into files. No complex file encoding is used: urldecode reads the data, creates a directory, then populates it with files such that each filename is a variable name and each file contains that variable's value. So all a web shell script needs to do to parse form data is run urldecode on the data received from a web form, then read the results out of suitably named files. While this is hardly a replacement for PHP or .NET, it does provide a surprisingly simple and straightforward way to script for the web, because it lets all the handy utilities of the UNIX shell environment be used to process web data. That's useful.
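
To give a feel for the consuming side, here is a minimal sketch of a shell script reading form values from such a directory. Only the layout (one file per variable, named after the variable, containing its value) comes from the description above. The helper form_value, the field names "name" and "comment", and the use of mktemp are assumptions made for illustration. The directory is built by hand here as a stand-in for the decoding step, since the exact urldecode invocation is not covered in this post.

```sh
#!/bin/sh
# Sketch only: reads form variables from a directory laid out as
# "file name = variable name, file contents = variable value".

# Hypothetical helper: print the value of form variable $2 found in
# directory $1. Prints nothing if that variable was not submitted.
form_value() {
    if [ -f "$1/$2" ]; then
        cat "$1/$2"
    fi
}

# Stand-in for the decoding step: create by hand the kind of directory
# described above, as if a form with two fields had been decoded.
formdir=$(mktemp -d)
trap 'rm -rf "$formdir"' EXIT
printf '%s' 'Ada Lovelace' > "$formdir/name"
printf '%s' 'Hello, world & goodbye' > "$formdir/comment"

# From here on it is ordinary shell scripting: the form data are just files.
printf 'Content-Type: text/plain\r\n\r\n'
printf 'Name: %s\n' "$(form_value "$formdir" name)"
printf 'Comment length: %s characters\n' \
    "$(form_value "$formdir" comment | wc -c | tr -d ' ')"
```

In a real CGI script, the hand-built directory would be replaced by whatever directory the decoder produced from the submitted form data. The rest of the script would not change, which is the point: once the values are files, cat, wc, grep and friends all apply directly.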


