FBB::CGI - handles GET and POST submitted form data
#include <bobcat/cgi>
Linking option: -lbobcat
The class CGI offers an interface to data submitted by
web-forms. The data is sent to a script handling the data using a
<form action="/path/to/form/script"> stanza.
Very often this is indeed a script, such as a Perl script, but there is no need
to use a scripting language. The class CGI allows C++
programmers to process the form with a compiled executable, which usually
results in faster processing and, at construction time, benefits from the
type safety offered by C++. The class CGI automatically handles data
submitted using the GET method as well as data submitted using the POST
method.
By default the class’s constructor writes the customary
Content-type header lines to the standard output stream. Additional
(html) output of a reply page must be provided by other code. Therefore, a
program processing an uploaded form will have an organization comparable to
the following basic setup:
// assume includes and namespace std/FBB were defined
int main()
{
    CGI cgi;
    cout << "<html><body>\n";
    if (parametersOK(cgi))
    {
        process(cgi);
        generateReplyPage();
    }
    else
        generateErrorPage();
    cout << "</body></html>\n";
}
When errors in the received form-data are detected an error
message is written to the standard output stream and an
FBB::Exception exception is thrown.
FBB
All constructors, members, operators and manipulators, mentioned in this
man-page, are defined in the namespace FBB.
- o
- CGI::MapStringVector:
A shorthand for std::unordered_map<std::string,
std::vector<std::string> >, which is the data type in
which form-variables are stored.
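To illustrate the shape of this type, the following sketch splits a URL-encoded query string into such a map. It is not Bobcat's implementation: splitQuery is a hypothetical helper, the type alias is declared locally so the code compiles without Bobcat, and percent-decoding is deliberately left out.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Same shape as CGI::MapStringVector, declared locally for this sketch.
using MapStringVector =
    std::unordered_map<std::string, std::vector<std::string>>;

// Hypothetical helper (not part of FBB::CGI): splits "a=1&b=2&a=3" into
// name/value pairs. Values are stored as found, without decoding.
MapStringVector splitQuery(std::string const &query)
{
    MapStringVector result;
    std::size_t begin = 0;

    while (begin <= query.size())
    {
        std::size_t end = query.find('&', begin);
        if (end == std::string::npos)
            end = query.size();

        std::string field = query.substr(begin, end - begin);
        std::size_t eq = field.find('=');

        if (!field.empty())
        {
            if (eq == std::string::npos)
                result[field].push_back("");    // name without a value
            else
                result[field.substr(0, eq)].push_back(field.substr(eq + 1));
        }
        begin = end + 1;
    }
    return result;
}
```

A name that occurs more than once ends up with several strings in its vector, which is why the mapped type is a vector rather than a single string.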
The CGI::Method enumeration specifies values indicating the
way the form’s data were submitted:
- o
- CGI::UNDETERMINED:
Used internally indicating that the form’s method was neither
GET nor POST.
- o
- CGI::GET:
Indicates that the GET method was used when submitting the
form’s data;
- o
- CGI::POST:
Indicates that the POST method was used when submitting the
form’s data.
The CGI::Create enumeration is used to request or suppress
creation of the directory to contain any file uploaded by a form:
- o
- CGI::DONT_CREATE_PATH:
When uploading files, the destination directory must exist;
- o
- CGI::CREATE_PATH:
When uploading files, the destination directory will be created.
- o
- CGI(bool defaultEscape = true, char const *header =
"Content-type: text/html", std::ostream &out =
std::cout):
The default constructor writes the standard content type header to the
standard output stream and will use std::cout for output.
Specifying 0 as header suppresses outputting the
Content-type line. Otherwise the content type line is also followed
by two \r\n character combinations. By default all characters in
retrieved form-variables are escaped. The overloaded insertion operators
(see below) can be used to modify the default set of characters to escape.
The backslash is used as the escape character. The escape-prefix is not
used if the defaultEscape value is specified as false and if
no insertions into the CGI object were performed. The copy and move
constructors are available.
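The header handling described above can be pictured with a small stand-alone sketch. The functions writeHeader and headerText are hypothetical names introduced here for illustration; they only mimic the documented behavior (header followed by two \r\n combinations, nothing at all for a 0 header) and are not taken from Bobcat's sources.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes the header followed by two \r\n
// combinations, or nothing when header is a null pointer.
void writeHeader(std::ostream &out, char const *header)
{
    if (header != nullptr)
        out << header << "\r\n\r\n";
}

// Convenience wrapper returning what writeHeader would have written.
std::string headerText(char const *header)
{
    std::ostringstream out;
    writeHeader(out, header);
    return out.str();
}
```

The empty line produced by the second \r\n is what separates the header section from the body of the reply page.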
Note: the following three insertion operators, defining
sets of characters that should be escaped, can only be used before calling
any of the param, begin or end members. As soon as one of
these latter three members has been called the set of characters to be
escaped is fixed and attempts to modify that set are silently ignored.
- o
- char const *operator[](std::string const &key) const:
The index operator returns the value of the environment variable specified
as the index. 0 is returned if the variable specified at key is not
defined.
- o
- CGI &operator<<(std::string const &accept):
This member’s actions are suppressed once param, begin or
end (see below) has been called.
- The insertion operator can be used to fine-tune the set of characters that
are escaped in strings returned by param (see below). Depending on
the value of the constructor’s defaultEscape parameter
characters inserted into the CGI object will or will not be escaped
by a backslash.
- If the constructor’s defaultEscape parameter was specified
as true then the insertion operator can be used to define a set of
characters that are not escaped.
- If defaultEscape was specified as false then the insertion
operator will define a set of characters that will be escaped.
- The backslash itself is always escaped and a request to use it unescaped is
silently ignored.
- The accept string can be specified as a regular expression
character set, without the usual surrounding square brackets. E.g., an
insertion like cgi << "-a-z0-9" defines the set
consisting of the dash, the lower case letters and the digits.
- Individual characters, character ranges (using the dash to specify a
range) and all standard character classes ([:alnum:], [:alpha:],
[:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:],
[:space:], [:upper:], and [:xdigit:]) can be used to specify
a set of characters. In addition to these standard character classes the
class [:cgi:] can be used to define the set consisting of the
characters " ' ` ; and \.
- Note that standard and [:cgi:] character classes do require
square brackets.
- When a series of insertions is performed then the union of the sets
defined by these insertions is used.
- Note: using unescaped single quotes, double quotes, backtick
characters and semicolons in CGI-programs might be risky and is not
advised.
- o
- CGI &operator<<(int c):
This member’s actions are suppressed once param, begin or
end (see below) has been called.
- This insertion operator is used to change the default escape handling of a
single character c. The int parameter is cast internally to
a char.
- o
- CGI &operator<<(std::pair<char, char> range):
This member’s actions are suppressed once param, begin or
end (see below) has been called.
- This insertion operator can be used to change the default escape handling
of a range of characters. The pair’s second character must be equal
to or exceed the position of the pair’s first character in the
ASCII collating sequence or the member will have no effect.
- o
- std::ostream &std::operator<<(std::ostream &out, CGI
const &cgi):
CGI objects can be inserted into ostreams to display the
characters that will appear escaped in strings returned by the
param() member function. Each character for which isprint()
returns true will be displayed as character, surrounded by single
quotes. For all other characters their ASCII values are displayed. Each
character is displayed on a line by itself.
- The copy and move assignment operators are available.
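The interplay between defaultEscape and the character and range insertion operators can be modeled with a simple lookup table. The class EscapeSet below is a hypothetical stand-in written for this illustration, not Bobcat code. It assumes that the backslash becomes (and stays) escaped as soon as any insertion is performed, which is one reading of the rules above, and it leaves out the string (character-class) form of the insertion operator.

```cpp
#include <array>
#include <string>
#include <utility>

// Illustrative model only; names and layout are assumptions.
class EscapeSet
{
    bool d_defaultEscape;
    std::array<bool, 256> d_escape;

    // Invert the default handling of one character. Assumption: after
    // any insertion the backslash itself is escaped.
    void flip(unsigned ch)
    {
        d_escape[ch] = !d_defaultEscape;
        d_escape[static_cast<unsigned char>('\\')] = true;
    }

    public:
        explicit EscapeSet(bool defaultEscape = true)
        :
            d_defaultEscape(defaultEscape)
        {
            d_escape.fill(defaultEscape);
        }

        // Single character.
        EscapeSet &operator<<(int c)
        {
            flip(static_cast<unsigned char>(c));
            return *this;
        }

        // Range: no effect when second precedes first.
        EscapeSet &operator<<(std::pair<char, char> range)
        {
            unsigned first = static_cast<unsigned char>(range.first);
            unsigned last = static_cast<unsigned char>(range.second);
            for (unsigned ch = first; ch <= last; ++ch)
                flip(ch);
            return *this;
        }

        // Prefix every character in the escape set with a backslash.
        std::string apply(std::string const &value) const
        {
            std::string result;
            for (char ch: value)
            {
                if (d_escape[static_cast<unsigned char>(ch)])
                    result += '\\';
                result += ch;
            }
            return result;
        }
};
```

With EscapeSet set(true) followed by set << std::make_pair('a', 'z'), lower case letters pass through unchanged while a semicolon is still prefixed by a backslash.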
- o
- CGI::MapStringVector::const_iterator begin():
Returns the begin iterator of the form’s parameter map. Iterator
values unequal to end (see below) point to a pair of values, the
first of which is the name of a field defined by the form, the second is a
vector of strings containing the field’s value(s). See also the
description of the param member below.
- o
- CGI::MapStringVector::const_iterator end():
Returns the end iterator of the form’s parameter map.
- o
- unsigned long long maxUploadSize() const:
Returns the current maximum file upload size in bytes.
- o
- CGI::Method method() const:
Returns the method that was used when the form was submitted (either
CGI::GET or CGI::POST).
- o
- std::vector<std::string> const &param(std::string const
&variable):
Returns the value of the form-variable specified by the function’s
argument. An empty vector is returned if the variable was not provided by
the form’s data.
- If the same variable was specified multiple times or if its value extends
over multiple lines (only with multipart/form-data) then the vector
contains multiple strings.
- With GET and POST methods not using
multipart/form-data input fields extending over multiple lines are
stored in one string, using \r\n combinations between those
lines.
- When files are uploaded the vectors contain sets of four strings. The
first string provides the path name of the uploaded file; the second string
provides the file name specified in the form itself (so it is the name of
the file at the remote location); the third string shows the content type
specified by the remote browser (e.g., application/octet-stream),
the fourth string contains OK if the file was successfully uploaded
and truncated if the file was truncated. Existing files will not be
overwritten. When uploading a file a usable filename must be found within
100 trials.
- o
- std::string param1(std::string const &variable) const:
Returns the first element of the vector<string> returned by the
param member or an empty string if variable was not defined
by the received form.
- o
- std::string const &query() const:
Returns the query-string submitted with CGI::GET or CGI::POST
forms (if the POSTed form specified
ENCTYPE="multipart/form-data" the query string is
empty).
- o
- report():
The report member silently returns if no errors were encountered
while processing form-data. Otherwise, the html file generated by
the CGI program displays a line starting with FBB::CGI,
followed by the status report.
- The following status report messages are presently defined:
- Content-Disposition not recognized in:, which is followed by the
line where the Content-Disposition was expected. This may occur
when processing multipart/form data.
- Invalid multipart/form-data. This message can be generated when
reading lines while processing multipart/form data.
- GET/POST REQUEST_METHOD not found. This message is shown if the
program couldn’t find the form’s REQUEST_METHOD type
(i.e., GET or POST).
- Invalid CONTENT_LENGHT in POSTed form. This message is shown if the
content-length header has an incorrect value.
- Content-Type not found for file-field, followed by the
file’s field name. This message is shown if no Content-Type
specification was found in an uploaded form.
- Can’t open a file to write an uploaded file. This message
indicates that the CGI program was unable to open a file to write an
uploaded file to. This can be caused by an overfull disk or partition or
by incorrect write-permissions.
- multipart/form-data: no end-boundary found. This message is shown
if the end-boundary was missing in a multipart/form-data form.
- o
- void setFileDestination(std::string const &path, std::string
const &prefix = "", Create create = CREATE_PATH):
This member is used to specify the path and prefix of uploaded files.
Uploaded files will be stored at path/prefixNr where Nr is
an internally used number starting at one. When CREATE_PATH is
specified path must be available or the CGI object must be
able to create the path. If DONT_CREATE_PATH is specified the
specified path must be available. If not, an FBB::Exception
exception will be thrown.
- o
- void setMaxUploadSize(size_t maxSize, int unit =
'M'):
This member can be used to change the maximum size of uploaded files. Its
default value is 100Mb. The unit can be one of b (bytes),
K (Kbytes), M (Mbytes, the default) or G (Gbytes).
Unit-specifiers are interpreted case insensitively. File uploads will
continue until the maximum upload size is exceeded, followed by discarding
any remainder.
- o
- void swap(CGI &other):
The current and other object are swapped.
The first time one of the param(), begin() or end() members is called
these members may detect errors in the received form data. If so, an
error message is written to the standard output stream and an
FBB::Exception exception will be thrown.
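As an illustration of the unit specifiers accepted by setMaxUploadSize, the sketch below converts a size and unit into a byte count. The function toBytes is hypothetical and not part of FBB::CGI. It assumes that K, M and G denote powers of 1024, uses 'M' as the default to mirror the default argument in the member's signature, and throws on an unknown unit, a choice made for this sketch only since the text above does not say how such a unit is handled.

```cpp
#include <cctype>
#include <stdexcept>

// Hypothetical helper: size expressed in `unit` converted to bytes.
// Units are interpreted case insensitively; K = 1024 bytes is assumed.
unsigned long long toBytes(unsigned long long size, int unit = 'M')
{
    switch (std::toupper(unit))
    {
        case 'B': return size;
        case 'K': return size << 10;
        case 'M': return size << 20;
        case 'G': return size << 30;
    }
    throw std::invalid_argument("unit must be one of b, K, M or G");
}
```

Under these assumptions toBytes(100) corresponds to the documented 100Mb default, and toBytes(2, 'k') and toBytes(2, 'K') give the same result.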
- o
- std::string dos2unix(std::string const &text):
This member converts all \r\n character combinations in text
into plain \n characters, returning the converted text.
- o
- std::string unPercent(std::string const &text):
This member converts all %xx encoded characters into their
corresponding ASCII values. Also, + characters are converted to
single blank spaces. The converted string is returned.
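The two conversions described above can be sketched without Bobcat as follows. The functions dos2unixSketch and unPercentSketch are stand-ins written for this illustration, not the library's code. Copying a malformed % sequence unchanged is an assumption of the sketch, since the text above does not specify that case.

```cpp
#include <cctype>
#include <string>

// Replace every \r\n combination by a plain \n; a lone \r is kept.
std::string dos2unixSketch(std::string const &text)
{
    std::string result;
    for (std::size_t idx = 0; idx != text.size(); ++idx)
    {
        if (text[idx] == '\r' && idx + 1 != text.size()
            && text[idx + 1] == '\n')
            continue;               // drop \r, the \n is copied next
        result += text[idx];
    }
    return result;
}

// Decode %xx sequences and turn + into a blank space.
std::string unPercentSketch(std::string const &text)
{
    std::string result;
    for (std::size_t idx = 0; idx != text.size(); ++idx)
    {
        char ch = text[idx];
        if (ch == '+')
            result += ' ';
        else if (ch == '%' && idx + 2 < text.size()
                 && std::isxdigit(static_cast<unsigned char>(text[idx + 1]))
                 && std::isxdigit(static_cast<unsigned char>(text[idx + 2])))
        {
            // two hexadecimal digits follow: convert them to one char
            result += static_cast<char>(
                        std::stoi(text.substr(idx + 1, 2), nullptr, 16));
            idx += 2;
        }
        else
            result += ch;           // assumption: malformed % kept as is
    }
    return result;
}
```

Applied to a value such as Submit+%20Query, both the + and the %20 become blank spaces, so the decoded string contains two consecutive spaces.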
#include "main.ih"

void showParam(CGI::MapStringVector::value_type const &mapValue)
{
    cout << "Param: " << mapValue.first << '\n';
    for (auto &str: mapValue.second)
        cout << " " << CGI::dos2unix(str) << "\n"
                " ";
    cout << '\n';
}

int main(int argc, char **argv)
try
{
    Arg &arg = Arg::initialize("evhm:", argc, argv);
    // usage and version are in the source archive in .../cgi/driver
    // arg.versionHelp(usage, version, 2);
    ifstream in(arg[0]);
    string line;
    while (getline(in, line))
    {
        size_t pos = line.find('=');
        if (pos == string::npos)
            continue;
        // set environment vars simulating
        // a GET form
        if (setenv(line.substr(0, pos).c_str(),
                   line.substr(pos + 1).c_str(), true) == 0)
        {
            if (arg.option('e'))
                cout << line.substr(0, pos).c_str() << '=' <<
                        line.substr(pos + 1).c_str() << '\n';
        }
        else
            cout << "FAILED: setenv " << line << '\n';
    }
    CGI cgi(false); // chars are not escaped
    cgi << arg[1];
    if (arg.option(&line, 'm'))
        cgi.setMaxUploadSize(A2x(line), *line.rbegin());
    cout << "Max upload size (b): " << cgi.maxUploadSize() << '\n';
    CGI::Method method = cgi.method();
    cout << "To escape:\n" <<
            cgi << "\n"
            "Method: " << (method == CGI::GET ? "GET" : "POST") <<
            '\n';
    cout << "Query string: " << cgi.query() << '\n';
    cout << "Submit string: `" << cgi.param1("submit") << "'\n";
    for (auto &mapElement: cgi)
        showParam(mapElement);
    cout << "END OF PROGRAM\n";
}
catch (exception const &err)
{
    cout << err.what() << '\n';
    return 1;
}
catch (...)
{
    return 1;
}
To test the program’s get form processing, call it
as driver get '[:cgi:]', with the file
get containing:
INFO=This is an abbreviated set of environment variables
SERVER_ADMIN=f.b.brokken@rug.nl
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
QUERY_STRING=hidden=hidval&submit=Submit+%20Query
To test the program’s post form processing, call it
as driver post1 '[:cgi:]', using post1
and post1.cin found in Bobcat’s source archive under
../cgi/driver.
bobcat/cgi - defines the class interface
- o
- bobcat_4.08.06-x.dsc: detached signature;
- o
- bobcat_4.08.06-x.tar.gz: source archive;
- o
- bobcat_4.08.06-x_i386.changes: change log;
- o
- libbobcat1_4.08.06-x_*.deb: debian package holding the
libraries;
- o
- libbobcat1-dev_4.08.06-x_*.deb: debian package holding the
libraries, headers and manual pages;
- o
- http://sourceforge.net/projects/bobcat: public archive location;
Bobcat is an acronym of `Brokken’s Own Base Classes And
Templates’.
This is free software, distributed under the terms of the GNU
General Public License (GPL).
Frank B. Brokken (f.b.brokken@rug.nl).