Access Control Mechanisms
=========================

This document contains information about mechanisms available in mod_wsgi
for controlling who can access a WSGI application. This includes coverage
of support for HTTP Basic and Digest authentication mechanisms, as well
as server side mechanisms for authorisation and host access control.

HTTP User Authentication
------------------------

The HTTP protocol supports user authentication mechanisms for clients
through the 'Authorization' header. The two main examples for this are
the Basic and Digest authentication mechanisms.

Unlike other HTTP headers, the authorisation header is not passed through
to a WSGI application by default. This is the case as doing so could leak
information about passwords through to a WSGI application which should not
be able to see them when Apache is performing authentication.

If Apache is performing authentication, a WSGI application can still find
out what type of authentication scheme was used by checking the variable
``AUTH_TYPE`` of the WSGI application environment. The login name of the
authorised user can be determined by checking the variable
``REMOTE_USER``.

If it is desired that the WSGI application be responsible for handling user
authentication, then it is necessary to explicitly configure mod_wsgi to
pass the required headers through to the application. This can be done by
specifying the WSGIPassAuthorization directive in the appropriate context
and setting it to 'On'. Note that prior to mod_wsgi version 2.0c5, this
directive could not be used in .htaccess files.

When passing of authorisation information is enabled, the authorisation
headers are passed through to a WSGI application in the
``HTTP_AUTHORIZATION`` variable of the WSGI application environment when
the equivalent HTTP request header is present. You will still need to
provide your own code to process the header and perform the required hand
shaking with the client to indicate whether the client is permitted access.

Apache Authentication Provider
------------------------------

When Apache 2.2 was released, it introduced the concept of authentication
providers. That is, Apache implements the hand shaking with the client for
authentication mechanisms such as Basic and Digest. All that the user
server side code needs to provide is a means of authenticating the actual
credentials of the user trying to gain access to the site.

This greatly simplified the implementation of client authentication as the
hand shaking for a particular authentication mechanism was implemented only
once in Apache and it wasn't necessary for each authentication module to
duplicate it. This was particularly good for the Digest authentication
mechanism which was non trivial to implement correctly.

Using mod_wsgi 2.0 or later, it is possible using the WSGIAuthUserScript
directive to define a Python script file containing code which performs the
authenticating of user credentials as outlined.

The required Apache configuration for defining the authentication provider
for Basic authentication when using Apache 2.2 would be::

    AuthType Basic
    AuthName "Top Secret"
    AuthBasicProvider wsgi
    WSGIAuthUserScript /usr/local/wsgi/scripts/auth.wsgi
    Require valid-user

The 'auth.wsgi' script would then need to contain a 'check_password()'
function with a sample as shown below::

    def check_password(environ, user, password):
        if user == 'spy':
            if password == 'secret':
                return True
            return False
        return None

This function should validate that the user exists in the user database and
that the password is correct. If the user does not exist at all, then the
result should be 'None'. If the user does exist, the result should be
'True' or 'False' depending on whether the password was valid.

If wishing to use Digest authentication, the configuration for Apache 2.2
would instead be::

    AuthType Digest
    AuthName "Top Secret"
    AuthDigestProvider wsgi
    WSGIAuthUserScript /usr/local/wsgi/scripts/auth.wsgi
    Require valid-user

The name of the required authentication function for Digest authentication
is 'get_realm_hash()'. The result of the function must be 'None' if the
user doesn't exist, or a hash string encoding the user name, authentication
realm and password::

    import md5

    def get_realm_hash(environ, user, realm):
        if user == 'spy':
            value = md5.new()
            # user:realm:password
            value.update('%s:%s:%s' % (user, realm, 'secret'))
            hash = value.hexdigest()
            return hash
        return None

By default the auth providers are executed in context of first interpreter
created by Python, ie., '%{GLOBAL}' and always in the Apache child
processes, never in a daemon process. The interpreter can be overridden
using the 'application-group' option to the script directive. The namespace
for authentication groups is shared with that for application groups
defined by WSGIApplicationGroup.

Because the auth provider is always run in the Apache child processes and
never in the context of a mod_wsgi daemon process, if the authentication
check is making use of the internals of some Python web framework, it is
recommended that the application using that web framework also be run in
embedded mode and the same application group. This is the case as the
Python web frameworks often bring in a huge amount of code even if using
only one small part of them. This will result in a lot of memory being used
in the Apache child processes just to support the auth provider.

If mod_authn_alias is being loaded into Apache, then an aliased auth
%rovider can also be defined::

    <AuthnProviderAlias wsgi django>
    WSGIAuthUserScript /usr/local/django/mysite/apache/auth.wsgi \
     application-group=django
    </AuthnProviderAlias>

    WSGIScriptAlias / /usr/local/django/mysite/apache/django.wsgi

    <Directory /usr/local/django/mysite/apache>
    Order deny,allow
    Allow from all

    WSGIApplicationGroup django

    AuthType Basic
    AuthName "Django Site"
    AuthBasicProvider django
    Require valid-user
    </Directory>

An authentication script for Django might then be something like::

    import os, sys
    sys.path.append('/usr/local/django')
    os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'

    from django.contrib.auth.models import User
    from django import db

    def check_password(environ, user, password):
        db.reset_queries() 

        kwargs = {'username': user, 'is_active': True} 

        try: 
            try: 
                user = User.objects.get(**kwargs) 
            except User.DoesNotExist: 
                return None

            if user.check_password(password): 
                return True
            else: 
                return False
        finally: 
            db.connection.close() 

For both Basic and Digest authentication providers, the 'environ' dictionary
passed as first argument is a cut down version of what would be supplied
to the actual WSGI application. This includes the 'wsgi.errors' object for
the purposes of logging error messages associated with the request.

Any configuration defined by !SetEnv directives is not passed in the
'environ' dictionary because doing so would allow users to override the
configuration specified in such a way from a '.htaccess' file.
Configuration should as a result be placed into the script file itself.

Although authentication providers were a new feature in Apache 2.2, the
mod_wsgi module emulates the functionality so that the above can also be
used with Apache 2.0. In using Apache 2.0, the required Apache configuration
is however slightly different and needs to be::

    AuthType Basic
    AuthName "Top Secret"
    WSGIAuthUserScript /usr/local/wsgi/scripts/auth.wsgi
    AuthAuthoritative Off
    Require valid-user

When using Apache 2.0 however, only support for Basic authentication
mechanism is provided. It is not possible to use Digest authentication.
When using Apache 1.3, this feature is not available at all.

The benefit of using the Apache authentication provider mechanism rather
than the WSGI application doing it all itself, is that it can be used to
control access to a number of WSGI applications at the same time as well as
static files or dynamic pages implemented by other Apache modules using
other programming languages such as PHP or Perl. The mechanism could even
be used to control access to CGI scripts.

Apache Group Authorisation
--------------------------

As compliment to the authentication provider mechanism, mod_wsgi 2.0 also
provides a mechanism for implementing group authorisation using the Apache
'Require' directive. To use this in conjunction with an inbuilt Apache
authentication provider such as a password file, the following Apache
configuration would be used::

    AuthType Basic
    AuthName "Top Secret"
    AuthBasicProvider dbm
    AuthDBMUserFile /usr/local/wsgi/accounts.dbm
    WSGIAuthGroupScript /usr/local/wsgi/scripts/auth.wsgi
    Require group secret-agents
    Require valid-user

The 'auth.wsgi' script would then need to contain a 'groups_for_user()'
function with a sample as shown below::

    def groups_for_user(environ, user):
        if user == 'spy':
            return ['secret-agents']
        return ['']

The function should supply a list of groups the user is a member of or
an empty list otherwise.

The feature may be used with any authentication provider, including one
defined using WSGIAuthUserScript.

The 'environ' dictionary passed as first argument is a cut down version of
what would be supplied to the actual WSGI application. This includes the
'wsgi.errors' object for the purposes of logging error messages associated
with the request.

Any configuration defined by !SetEnv directives is not passed in the
'environ' dictionary because doing so would allow users to override the
configuration specified in such a way from a '.htaccess' file.
Configuration should as a result be placed into the script file itself.

Configuration of group authorisation is the same whether Apache 2.0 or 2.2
is used. The feature is not available when using Apache 1.3.

By default the group authorisation code is always executed in the context
of the first interpreter created by Python, ie., '%{GLOBAL}', and always in
the Apache child processes, never in a daemon process. The interpreter can
be overridden using the 'application-group' option to the script directive.

Host Access Controls
--------------------

The authentication provider and group authorisation features help to control
access based on the identity of a user. Using mod_wsgi 2.0 it is also
possible to limit access based on the machine which the client is connecting
from. The path to the script is defined using the WSGIAccessScript
directive::

    WSGIAccessScript /usr/local/wsgi/script/access.wsgi

The name of the function that must exist in the script file is
'allow_access()'.  It must return True or False::

    def allow_access(environ, host):
        return host in ['localhost', '::1']

The 'environ' dictionary passed as first argument is a cut down version of
what would be supplied to the actual WSGI application. This includes the
'wsgi.errors' object for the purposes of logging error messages associated
with the request.

Any configuration defined by !SetEnv directives is not passed in the
'environ' dictionary because doing so would allow users to override the
configuration specified in such a way from a '.htaccess' file.
Configuration should as a result be placed into the script file itself.

By default the access checking code is executed in context of the first
interpreter created by Python, ie., '%{GLOBAL}', and always in the Apache
child processes, never in a daemon process. The interpreter used can be
overridden using the 'application-group' option to the script directive.