haserl(1) | General Commands Manual | haserl(1) |
haserl - A cgi scripting program for embedded environments
#!/usr/bin/haserl [--shell=pathspec] [--upload-dir=dirspec] [--upload-handler=handler] [--upload-limit=limit] [--accept-all] [--accept-none] [--silent] [--debug]
[ text ] [ <% shell script %> ] [ text ] ...
Haserl is a small cgi wrapper that allows "PHP" style cgi programming, but uses a UNIX bash-like shell or Lua as the programming language. It is very small, so it can be used in embedded environments, or where something like PHP is too big.
It combines three features into a small cgi engine:
This is a summary of the command-line options. Please see the OPTIONS section under the long option name for a complete description.
-a, --accept-all
-n, --accept-none
-d, --debug
-s, --shell
-S, --silent
-U, --upload-dir
-u, --upload-limit
-H, --upload-handler
To include shell parameters do not use the --shell=/bin/sh format. Instead, use the alternative format without the "=", as in --shell "/bin/bash --norc". Be sure to quote the option string to protect any special characters.
If compiled with Lua libraries, the string "lua" selects an integrated Lua VM. This string is case sensitive. Example: --shell=lua
An alternative is "luac". This causes the haserl and lua parsers to be disabled, and the script is assumed to be a precompiled lua chunk. See LUAC below for more information.
In general, the web server sets up several environment variables, and then uses fork or another method to run the CGI script. If the script uses the haserl interpreter, the following happens:
The environment is scanned for HTTP_COOKIE, which may have been set by the web server. If it exists, the parsed contents are placed in the local environment.
The environment is scanned for REQUEST_METHOD, which was set by the web server. Based on the request method, standard input is read and parsed. The parsed contents are placed in the local environment.
The script is tokenized, parsing haserl code blocks from raw text. Raw text is converted into "echo" statements, and then all tokens are sent to the sub-shell.
haserl forks and a sub-shell (typically /bin/sh) is started.
All tokens are sent to the STDIN of the sub-shell, with a trailing exit command.
When the sub-shell terminates, the haserl interpreter performs final cleanup and then terminates.
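The hand-off to the sub-shell described above can be imitated in a few lines of shell. This is only a sketch: the exact statements haserl generates are an assumption here, and render_sketch is a hypothetical name. It shows raw text turned into an output statement, a code block passed through verbatim, and the trailing exit, all fed to the STDIN of a sub-shell.

```shell
# Hypothetical illustration of the token stream for the template
#   Today is <% echo Monday %>
# The generated statements are assumed, not taken from haserl itself.
render_sketch() {
    {
        printf '%s\n' "printf '%s' 'Today is '"   # raw text becomes an output statement
        printf '%s\n' "echo Monday"               # code block, sent verbatim
        printf '%s\n' "exit"                      # trailing exit command
    } | sh                                        # the sub-shell reads it on STDIN
}

render_sketch
```

The sub-shell never sees the template itself, only the generated statements.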
The haserl interpreter will decode data sent via the HTTP_COOKIE environment variable, and the GET or POST method from the client, and store them as environment variables that can be accessed by haserl. The name of the variable follows the name given in the source, except that a prefix (FORM_) is prepended. For example, if the client sends "foo=bar", the environment variable is FORM_foo=bar.
For the HTTP_COOKIE method, variables are also stored with the prefix (COOKIE_) added. For example, if HTTP_COOKIE includes "foo=bar", the environment variable is COOKIE_foo=bar.
For the GET method, data sent in the form %xx is translated into the character it represents, and variables are also stored with the prefix (GET_) added. For example, if QUERY_STRING includes "foo=bar", the environment variable is GET_foo=bar.
For the POST method, variables are also stored with the prefix (POST_) added. For example, if the post stream includes "foo=bar", the environment variable is POST_foo=bar.
Also, for the POST method, if the data is sent using multipart/form-data encoding, the data is automatically decoded. This is typically used when files are uploaded from a web client using <input type=file>.
If the client sends data both by POST and GET methods, then haserl will parse only the data that corresponds with the REQUEST_METHOD variable set by the web server, unless the accept-all option has been set. For example, a form called via POST method, but having a URI of some.cgi?foo=bar&otherdata=something will have the POST data parsed, and the foo and otherdata variables are ignored.
If the web server defines a HTTP_COOKIE environment variable, the cookie data is parsed. Cookie data is parsed before the GET or POST data, so in the event of two variables of the same name, the GET or POST data overwrites the cookie information.
When multiple instances of the same variable are sent from different sources, the FORM_variable will be set according to the order in which variables are processed. HTTP_COOKIE is always processed first, followed by the REQUEST_METHOD. If the accept-all option has been set, then HTTP_COOKIE is processed first, followed by the method not specified by REQUEST_METHOD, followed by the REQUEST_METHOD. The last instance of the variable will be used to set FORM_variable. Note that the variables are also separately created as COOKIE_variable, GET_variable and POST_variable. This allows the use of overlapping names from each source.
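The prefix and ordering rules above can be sketched in plain shell. This is an illustration only, not haserl's implementation: parse_pairs is a hypothetical helper, it does no %xx decoding, and the input strings are made up. It assumes a POST request with accept-all set, so the GET data is processed first and the POST data last.

```shell
# parse_pairs PREFIX DATA
# Hypothetical helper: stores each name=value pair from DATA as
# PREFIX_name and as FORM_name. Later calls overwrite FORM_name.
parse_pairs() {
    prefix=$1
    old_ifs=$IFS
    IFS='&'
    set -f                          # no filename expansion while splitting
    for pair in $2; do
        case $pair in
            *=*) ;;
            *) continue ;;          # skip items without a value
        esac
        name=${pair%%=*}
        value=${pair#*=}
        case $name in
            ''|*[!A-Za-z0-9_]*) continue ;;   # accept only safe variable names
        esac
        eval "${prefix}_${name}=\$value"
        eval "FORM_${name}=\$value"
    done
    set +f
    IFS=$old_ifs
}

parse_pairs GET  'foo=from_get&only_get=1'    # processed first
parse_pairs POST 'foo=from_post'              # processed last, wins in FORM_

echo "FORM_foo=$FORM_foo GET_foo=$GET_foo POST_foo=$POST_foo"
```

The per-source variables keep their own values, while FORM_foo holds whatever was processed last.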
When multiple instances of the same variable are sent from the same source, only the last one is saved. To keep all copies (for multi-selects, for instance), add "[]" to the end of the variable name. All results will be returned, separated by newlines. For example, host=Enoch&host=Esther&host=Joshua results in "FORM_host=Joshua". host[]=Enoch&host[]=Esther&host[]=Joshua results in "FORM_host=Enoch\nEsther\nJoshua"
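A script can walk through such a newline-separated value one line at a time. The following is a minimal sketch; list_hosts is a hypothetical helper, and FORM_host is set by hand here to the value haserl would produce for the "[]" example above.

```shell
# Value as described above for host[]=Enoch&host[]=Esther&host[]=Joshua
FORM_host='Enoch
Esther
Joshua'

# Hypothetical helper: print one <li> element per submitted value.
list_hosts() {
    printf '%s\n' "$1" | while IFS= read -r host; do
        printf '<li>%s</li>\n' "$host"
    done
}

list_hosts "$FORM_host"
```

Reading with IFS= and -r keeps leading spaces and backslashes in each value intact.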
The following language structures are recognized by haserl.
<% [shell script] %>
    Anything enclosed by <% %> tags is sent to the sub-shell for execution. The text is sent verbatim.
<%in pathspec %>
    Include another file verbatim in this script. The file is included when the script is initially parsed.
<%= expression %>
    Print the shell expression. Syntactic sugar for "echo expr".
<%# comment %>
    Comment block. Anything in a comment block is not parsed. Comments can be nested and can contain other haserl elements.
#!/usr/local/bin/haserl
content-type: text/plain

<%# This is a sample "env" script %>
<% env %>
Prints the results of the env command as a mime-type "text/plain" document. This is the haserl version of the common printenv cgi.
#!/usr/local/bin/haserl
Content-type: text/html

<html>
<body>
<table border=1><tr>
<% for a in Red Blue Yellow Cyan; do %>
<td bgcolor="<% echo -n "$a" %>"><% echo -n "$a" %></td>
<% done %>
</tr></table>
</body>
</html>
Sends a mime-type "text/html" document to the client, with an html table whose elements are labeled with their background color.
#!/usr/local/bin/haserl
content-type: text/html

<% # define a user function
table_element() {
    echo "<td bgcolor=\"$1\">$1</td>"
}
%>
<html>
<body>
<table border=1><tr>
<% for a in Red Blue Yellow Cyan; do %>
<% table_element $a %>
<% done %>
</tr></table>
</body>
</html>
Same as above, but uses a shell function instead of embedded html.
#!/usr/local/bin/haserl
content-type: text/html

<html><body>
<h1>Sample Form</h1>
<form action="<% echo -n $SCRIPT_NAME %>" method="GET">
<% # Do some basic validation of FORM_textfield
# To prevent common web attacks
FORM_textfield=$( echo "$FORM_textfield" | sed "s/[^A-Za-z0-9 ]//g" )
%>
<input type=text name=textfield Value="<% echo -n "$FORM_textfield" | tr a-z A-Z %>" cols=20>
<input type=submit value=GO>
</form></body></html>
Prints a form. If the client enters text in the form, the CGI is reloaded (defined by $SCRIPT_NAME) and the textfield is sanitized to prevent web attacks, then the form is redisplayed with the text the user entered. The text is uppercased.
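The filtering step in this example can be tried on its own outside of haserl. The sketch below wraps the same sed expression in a hypothetical sanitize function; the sample input is made up.

```shell
# Hypothetical helper wrapping the sed filter from the example above:
# everything except letters, digits and spaces is removed.
sanitize() {
    printf '%s' "$1" | sed 's/[^A-Za-z0-9 ]//g'
}

sanitize 'Hello <script>alert(1)</script> world'
echo
```

This is a whitelist filter: characters are dropped unless explicitly allowed, so markup and quoting characters never reach the Value attribute.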
#!/usr/local/bin/haserl --upload-limit=4096 --upload-dir=/tmp
content-type: text/html

<html><body>
<form action="<% echo -n $SCRIPT_NAME %>" method=POST enctype="multipart/form-data" >
<input type=file name=uploadfile>
<input type=submit value=GO>
<br>
<% if test -n "$HASERL_uploadfile_path"; then %>
<p>
You uploaded a file named <b><% echo -n $FORM_uploadfile_name %></b>, and it was
temporarily stored on the server as <i><% echo $HASERL_uploadfile_path %></i>. The
file was <% cat $HASERL_uploadfile_path | wc -c %> bytes long.</p>
<% rm -f $HASERL_uploadfile_path %><p>Don't worry, the file has just been deleted
from the web server.</p>
<% else %>
You haven't uploaded a file yet.
<% fi %>
</form>
</body></html>
Displays a form that allows for file uploading. This is accomplished by using the --upload-limit option and by setting the form enctype to multipart/form-data. If the client sends a file, some information about the file is printed and the file is then deleted. Otherwise, the form states that the client has not uploaded a file.
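The upload check in this example can be factored into a small shell function. This is a sketch under assumptions: upload_size is a hypothetical helper, and it expects the value of HASERL_uploadfile_path as used in the example above. The path is quoted throughout so that unusual file names do not split into several words.

```shell
# Hypothetical helper: print the size in bytes of an uploaded file, or
# fail if the path is empty or does not name a regular file.
upload_size() {
    [ -n "$1" ] && [ -f "$1" ] || return 1
    wc -c < "$1" | tr -d ' '
}

if size=$(upload_size "${HASERL_uploadfile_path:-}"); then
    echo "Received $size bytes"
    rm -f "$HASERL_uploadfile_path"     # remove the temporary file when done
else
    echo "No file uploaded"
fi
```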
#!/usr/local/bin/haserl
<% echo -en "content-type: text/html\r\n\r\n" %>
<html><body>
...
</body></html>
To fully comply with the HTTP specification, headers should be terminated using CR+LF, rather than the normal unix LF line termination only. The above syntax can be used to produce RFC 2616 compliant headers.
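Support for "echo -en" varies between shells, so printf may be a more portable way to produce the same bytes. The following is a sketch, not part of haserl; emit_headers is a hypothetical helper name.

```shell
# Hypothetical helper: print a content-type header followed by the empty
# line that ends the header block, each terminated with CR+LF.
emit_headers() {
    printf 'content-type: %s\r\n\r\n' "$1"
}

emit_headers 'text/html'
```

Inside a haserl script this would be called from a <% %> block before any raw text is emitted.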
In addition to the environment variables inherited from the web server, the following environment variables are always defined at startup:
These variables can be modified or overwritten within the script, although the ones starting with "HASERL_" are informational only, and do not affect the running script.
There is much literature regarding the dangers of using shell to program CGI scripts. haserl contains some protections to mitigate this risk.
It is safe to use this "dangerous" variable in shell scripts by enclosing it in quotes; although validation should be done on all input fields.
If compiled with lua support, --shell=lua will enable lua as the script language instead of bash shell. The environment variables (SCRIPT_NAME, SERVER_NAME, etc) are placed in the ENV table, and the form variables are placed in the FORM table. For example, the self-referencing form above can be written like this:
#!/usr/local/bin/haserl --shell=lua
content-type: text/html

<html><body>
<h1>Sample Form</h1>
<form action="<% io.write(ENV["SCRIPT_NAME"]) %>" method="GET">
<% -- Do some basic validation of FORM.textfield
-- To prevent common web attacks
-- (the "or" guard covers the first load, when no textfield was sent)
FORM.textfield=string.gsub(FORM.textfield or "", "[^%a%d]", "")
%>
<input type=text name=textfield Value="<% io.write (string.upper(FORM.textfield)) %>" cols=20>
<input type=submit value=GO>
</form></body></html>
The <%= operator is syntactic sugar for io.write (tostring( ... )) So, for example, the Value= line above could be written: Value="<%= string.upper(FORM.textfield) %>" cols=20>
haserl lua scripts can use the function haserl.loadfile(filename) to process a target script as a haserl (lua) script. The function returns a type of "function".
For example,
bar.lsp
<% io.write ("Hello World" ) %>
Your message is <%= gvar %>
-- End of Include file --
foo.haserl
#!/usr/local/bin/haserl --shell=lua
<% m = haserl.loadfile("bar.lsp")
gvar = "Run as m()"
m()
gvar = "Load and run in one step"
haserl.loadfile("bar.lsp")() %>
Running foo will produce:
Hello World
Your message is Run as m()
-- End of Include file --
Hello World
Your message is Load and run in one step
-- End of Include file --
This function makes it possible to have nested haserl server pages - page snippets that are processed by the haserl tokenizer.
The luac "shell" is a precompiled lua chunk, so interactive editing and testing of scripts is not possible. However, haserl can be compiled with luac support only, and this allows lua support even in a small memory environment. All haserl lua features listed above are still available. (If luac is the only shell built into haserl, the haserl.loadfile is disabled, as the haserl parser is not compiled in.)
Here is an example of a trivial script, converted into a luac cgi script:
Given the file test.lua:
print ("Content-Type: text/plain\n")
print ("Your UUID for this run is: " .. ENV.SESSIONID)
It can be compiled with luac:
And then the haserl header added to it:
Alternatively, it is possible to develop an entire website using the standard lua shell, and then have haserl itself preprocess the scripts for the luac compiler as part of a build process. To do this, use --shell=lua, and develop the website. When ready to build the runtime environment, add the --debug line to your lua scripts, and run them outputting the results to .lua source files. For example:
Given the haserl script test.cgi:
#!/usr/bin/haserl --shell=lua --debug
Content-Type: text/plain

Your UUID for this run is <%= ENV.SESSIONID %>
Precompile, compile, and add the haserl luac header:
./test.cgi > test.lua
luac -s -o test.luac test.lua
echo '#!/usr/bin/haserl --shell=luac' | cat - test.luac >luac.cgi
Old versions of haserl used <? ?> as token markers, instead of <% %>. Haserl will fall back to using <? ?> if <% does not appear anywhere in the script.
When files are uploaded using RFC-2388, a temporary file is created. The name of the file is stored in FORM_variable_name, POST_variable_name, and HASERL_variable_name. Only HASERL_variable_name should be used - the others can be overwritten by a malicious client.
The name "haserl" comes from the Bavarian word for "bunny." At first glance it may be small and cute, but haserl is more like the bunny from Monty Python & The Holy Grail. In the words of Tim the Wizard, "That's the most foul, cruel & bad-tempered rodent you ever set eyes on!"
Haserl can be thought of as the cgi equivalent of netcat. Both are small, powerful, and have very little in the way of extra features. Like netcat, haserl attempts to do its job with the least amount of extra "fluff".
Nathan Angelacos <nangel@users.sourceforge.net>
php(http://www.php.net) uncgi(http://www.midwinter.com/~koreth/uncgi.html) cgiwrapper(http://cgiwrapper.sourceforge.net)
October 2010 |