What is mod_authz_svn_db?
Posted by Christopher Wojno Sun, 19 Aug 2007 20:31:00 GMT
Abstract
The Authorization Subversion Database Apache Module is an implementation of database-driven permissions for groups and individuals on Subversion repositories. Think mod_authz_svn [1], but instead of using a file to back your permissions, you use a database. This allows a single server to host a collection of Subversion repositories, control them (via a web application written in PHP/Ruby/ASP), and create permission policies for them.
Motivation
While attending USC, Trevor Johns and I were stymied by the lack of companion collaboration software for classes which either required it or lent themselves well to practice with those tools. While most students e-mailed their code to each other (or worse: shared their user accounts), Trevor and I were the only ones with the resources to deploy a source code management server of our own. Even in our capstone classes where the use of Subversion was attempted, its complexity quickly dazed the rest of the students and, in my case, Subversion fell by the wayside.
Trevor had more luck, as he spent hours developing a smiling slide presentation which explained (with more clarity than my demonstration) the benefits and use of subversion. He succeeded in getting the class to use subversion, but the deployment was still a nightmare.
History
As you may or may not be aware, Trevor and I are members of UPE, a computer science honor society. That is related only in that Ross Boucher, a fellow member, suggested that UPE create some sort of software to manage subversion projects and permissions.
I took a special interest in the project and began planning its development. I realized that we would need two separate components: the web application to provide an interface for policy setting, and the means to enforce the policies. Enforcement fell to an Apache2 (sorry Apache1 users) module. I dubbed it mod_authz_svn_db (Module Authorization Subversion Database) after mod_authz_svn, mod_authz_mysql, mod_authz_pgsql2, and mod_authz_dbd.
As a small note of interest, there was a heated battle about just using a flat file with Subversion’s mod_authz_svn module (which already existed) to manage the permissions. However, the biggest problems concern flat-file parsing for thousands of users and repositories, and concurrency (writing such a huge file is bound to provoke inconsistencies in permissions). That balloons to file-system level locking. Is it ugly enough for you yet? Imagine parsing a 5,000+ line authorization file in a PHP application to figure out who has access to what. Then, imagine WRITING that data back to the file. And whatever help you if your PHP application hiccups mid-write. BYE BYE file! Better write a backup script! See why I wanted a database back-end? I’m not saying it’s impossible, it’s just not for me and maybe not for you.
Structure
I wanted this module to be useful for more than just our little web application (SwitchYard). There is no reason why other people should be forced to use our web application in order to use Subversion. I therefore designed it to be as modular as possible. Mod_authz_svn_db does not depend on SwitchYard. You are able to use it in conjunction with other implementations.
Conforming to my modularity motif, I developed a “base” module where the bulk of the heavy lifting and programming lives. Therefore, all database specific code can be delegated to a select few functions: open, close, query. Porting this module to other databases is extremely easy. To support a new database, simply include the “base” module code and define the database specific functions. This reduces errors and speeds integration with new database technology. It also allows for database speed-up customizations (write the code optimized for your database). By far, the most difficult part is writing the SQL in C-code.
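To make the open/query/close split concrete, here is a minimal sketch of how a base module might talk to a backend through three function pointers. This is not the module's actual code: the type names, function signatures, and the toy in-memory backend (including the user "alice") are all assumptions made up for illustration. A real backend would wrap the MySQL or PostgreSQL client library instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of a permission lookup. */
typedef enum { PERM_NONE = 0, PERM_READ = 1, PERM_WRITE = 2 } perm_t;

/* The three database-specific operations named in the text.
 * The signatures are assumptions, not the module's real API. */
typedef struct authz_backend {
    int  (*open)(void **conn, const char *conninfo);
    int  (*query)(void *conn, const char *user, const char *repo,
                  const char *path, perm_t *out);
    void (*close)(void *conn);
} authz_backend;

/* "Base" logic: knows nothing about any particular database. */
static int base_check_access(const authz_backend *be, const char *conninfo,
                             const char *user, const char *repo,
                             const char *path, perm_t *out)
{
    void *conn = NULL;
    int rc;

    assert(be != NULL && out != NULL);
    *out = PERM_NONE;
    if (be->open(&conn, conninfo) != 0)
        return -1;                      /* could not connect */
    rc = be->query(conn, user, repo, path, out);
    be->close(conn);
    return rc;
}

/* Toy in-memory backend standing in for a real database. */
typedef struct { const char *user, *repo, *path; perm_t perm; } mem_row;

static const mem_row mem_rows[] = {
    { "anonymous", "repositoryX", "/", PERM_READ  },
    { "alice",     "repositoryX", "/", PERM_WRITE },
};

static int mem_open(void **conn, const char *conninfo)
{
    (void)conninfo;                     /* nothing to connect to */
    *conn = (void *)mem_rows;
    return 0;
}

static int mem_query(void *conn, const char *user, const char *repo,
                     const char *path, perm_t *out)
{
    const mem_row *rows = conn;
    size_t i;

    *out = PERM_NONE;                   /* default when no row matches */
    for (i = 0; i < sizeof mem_rows / sizeof mem_rows[0]; i++) {
        if (strcmp(rows[i].user, user) == 0 &&
            strcmp(rows[i].repo, repo) == 0 &&
            strcmp(rows[i].path, path) == 0) {
            *out = rows[i].perm;
            break;
        }
    }
    return 0;
}

static void mem_close(void *conn) { (void)conn; }

static const authz_backend mem_backend = { mem_open, mem_query, mem_close };
```

Calling base_check_access(&mem_backend, "", "alice", "repositoryX", "/", &perm) runs the full open, query, close sequence. Supporting another database under this sketch would mean supplying a different authz_backend value and leaving base_check_access alone.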
Admittedly, I used the mod_authz_svn module source code as a template (knowing nothing about Apache module programming, I figured working code is a good place to start). Some of the old structure still remains, but all of the controlling code has been gutted and replaced. I used the per-directory configuration structure to hold the database configuration. As the database configuration is flexible, it also ballooned the number of variables that are stored. Each of those had to be connected to the Apache configuration file. Because the permissions were no longer being read from a file, I had to replace the file-reading with a database call (3 of them: open, query, close).
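As a rough illustration of what "connecting each variable to the configuration file" involves, the sketch below maps directive names onto fields of a per-directory configuration struct using field offsets. It is plain C rather than real Apache module code, and the struct fields and the "ExampleDB..." directive names are hypothetical; the module's actual directives are not listed in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-directory database settings. */
typedef struct {
    const char *host;
    const char *port;
    const char *user;
    const char *password;
    const char *database;
} db_dir_config;

/* One entry per directive: its name and the field it fills in. */
typedef struct {
    const char *directive;  /* made-up directive name */
    size_t      offset;     /* offset of the target field in db_dir_config */
} directive_slot;

static const directive_slot slots[] = {
    { "ExampleDBHost",     offsetof(db_dir_config, host)     },
    { "ExampleDBPort",     offsetof(db_dir_config, port)     },
    { "ExampleDBUser",     offsetof(db_dir_config, user)     },
    { "ExampleDBPassword", offsetof(db_dir_config, password) },
    { "ExampleDBName",     offsetof(db_dir_config, database) },
};

/* Store `value` in the field that `name` maps to.
 * Returns 0 on success, -1 if the directive is unknown. */
static int set_directive(db_dir_config *cfg, const char *name,
                         const char *value)
{
    size_t i;

    assert(cfg != NULL && name != NULL);
    for (i = 0; i < sizeof slots / sizeof slots[0]; i++) {
        if (strcmp(slots[i].directive, name) == 0) {
            const char **field =
                (const char **)((char *)cfg + slots[i].offset);
            *field = value;
            return 0;
        }
    }
    return -1;
}
```

Every new setting costs one struct field plus one table row, which is why a flexible database configuration makes the number of stored variables grow so quickly.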
How it works
It is an Apache2 module, so it works just as those work. Here’s the environment setup:
- Build the module (or otherwise obtain it built)
- Tell Apache where to find it (when Apache starts up, it will load that code in as a shared library and execute the hook scripts so that the module is installed and called correctly)
- Build/acquire mod_auth_mysql or mod_auth_pgsql2
- Configure mod_auth_pgsql2 or mod_auth_mysql in the httpd.conf or in the .htaccess (as applicable)
- Configure your httpd.conf (or .htaccess files) to use the mod_dav_svn and the mod_authz_svn_db
- Configure the location of your repository
- Create a repository
- Create the database as per the schema
- Put in a user (or define the anonymous group and permissions)
- (Re)Start Apache
Once you have a working environment, here is how requests are handled for anonymous users:
- Browser/SVN client requests (GET) document within your repositories path; Example: svn.myserver.com/svnrepos/repositoryX/document.txt
- Apache handles the GET request and passes it to the authorization chain
- mod_authz_svn_db is invoked (as it sits in the aforementioned chain)
- mod_authz_svn_db looks at the database for the user, in this case, since no authorization is present, the default of “anonymous” is used
- Database returns a list of permissions associated with the repository and path for anonymous
- Module rejects if no permissions found (permissions can be inherited from parent directories), or if permission is explicitly denied
- Apache returns the success code (200 OK) if anonymous is allowed or an authorization failure (401) if anonymous is not allowed. If you are authorized in part, but not in another, you will receive a 403 (Forbidden) error, meaning you are not authorized, even if you are authorized to view other parts of the repository.
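The request flow above can be sketched in a few dozen lines of C. This is one plausible reading of the steps, not the module's real logic: the rule table stands in for database rows, the user "alice" and the "/private" path are invented, and the choice to answer 401 for anonymous requests and 403 for named users is an assumption about how the two failure codes are split.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { RULE_ALLOW, RULE_DENY } rule_kind;
typedef struct { const char *user, *repo, *path; rule_kind kind; } rule;

/* Hypothetical rows, standing in for what the database would return. */
static const rule rules[] = {
    { "anonymous", "repositoryX", "/",        RULE_ALLOW },
    { "anonymous", "repositoryX", "/private", RULE_DENY  },
    { "alice",     "repositoryX", "/private", RULE_ALLOW },
};

/* Exact-match lookup for one user, repository, and path. */
static const rule *find_rule(const char *user, const char *repo,
                             const char *path)
{
    size_t i;
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (strcmp(rules[i].user, user) == 0 &&
            strcmp(rules[i].repo, repo) == 0 &&
            strcmp(rules[i].path, path) == 0)
            return &rules[i];
    }
    return NULL;
}

/* Inheritance: try the path itself, then each parent up to "/". */
static const rule *nearest_rule(const char *user, const char *repo,
                                const char *path)
{
    char buf[256];
    size_t len = strlen(path);

    if (len == 0 || len >= sizeof buf || path[0] != '/')
        return NULL;
    memcpy(buf, path, len + 1);
    for (;;) {
        const rule *r = find_rule(user, repo, buf);
        char *slash;

        if (r != NULL)
            return r;
        if (strcmp(buf, "/") == 0)
            return NULL;                /* reached the root, nothing found */
        slash = strrchr(buf, '/');
        if (slash == buf)
            buf[1] = '\0';              /* parent is the root */
        else
            *slash = '\0';              /* drop the last path component */
    }
}

/* Returns the HTTP status for the request. A NULL or empty auth_user
 * means no credentials were sent, so "anonymous" is used. */
static int authorize(const char *auth_user, const char *repo,
                     const char *path)
{
    int is_anon = (auth_user == NULL || auth_user[0] == '\0');
    const char *user = is_anon ? "anonymous" : auth_user;
    const rule *r;

    assert(repo != NULL && path != NULL);
    r = nearest_rule(user, repo, path);
    if (r != NULL && r->kind == RULE_ALLOW)
        return 200;
    /* No rule found, or explicitly denied. */
    return is_anon ? 401 : 403;
}
```

With these rows, an anonymous request for /document.txt inherits the allow rule on "/", while an anonymous request for anything under /private hits the explicit deny and is asked to log in.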
Anonymous access has two flavors: it can be an explicit pseudo-user (you log in as anonymous and supply a password, but you can explicitly reject anonymous users in the database) or it can be an understood default pseudo-user (you never log in; the module automatically assumes your user name is anonymous). Either way, you can control, in very fine detail, what anonymous users can see or do. When you come to a section of the repository that needs permission beyond the anonymous user, you’ll be prompted for a username and password.
For non-anonymous users, it works EXACTLY the same as the explicit anonymous users, only you must provide a user name and password. Remember: anonymous is a special, reserved, username for anonymous users (so you can’t use it to identify a specific user).
What can you do with it?
Well, you can create a single directory for lots of repositories. That’s not novel, mod_dav_svn already lets you do that. You can also deploy permissions for your repositories. That’s not new either, mod_authz_svn lets you do that. So what can you do? You can support your permissions with a database. So you can build a web application to manage your repositories and permissions. Trevor and I are working on a Ruby on Rails version we call “SwitchYard.”
Like the name? I thought of it myself ;-). We’re using Rails, so we had to go with the train theme. Real-life switch yards multiplex few tracks onto many tracks so trains can be connected with various cars and freight. In the same way, SwitchYard is the place where code can be moved, shared, and collaborated on to form a select, productive set of applications built on all the sub-components. SwitchYard, with the help of mod_authz_svn_db and friends, allows you to give users wide and fine control over their own repositories. No need to get a system administrator involved.
SwitchYard talks to the database, and the Apache module also talks to the database. Combined, you can do nearly anything, even develop an online community of coders… such as CollabNet.
What databases?
Right now, the module only supports MySQL and PostgreSQL (my favorite). There is an edge version of SQLite3, but it has not been approved for the trunk build yet. While the source code is provided for each, you have the option of building only the module for the database you desire. Or you can build all and play!
How can I get it?
Originally, Trevor graciously hosted it on his server at: http://tjohns.net/svn/mod_authz_svn_db/
I’ve moved it to my own server because Trevor shouldn’t have to host my stuff. It is available here: http://svn.wojno.com/mod_authz_svn_db
I’ve decided to release this one on the FreeBSD license under the CollabNet license (I borrowed some of their code, so credit is due). I have provided basic make files, guaranteed to build on FreeBSD 6.1. I make no guarantees about other platforms (or any platform for that matter), though Michael Rodgers has informed me that it works (with minor tweaks) on SuSE 10.2. No Windows support yet. I’ll have to think about how to approach that one. I’m sure it will build and run, I just am not sure how easy it will be to configure that build script.
Anyway: please, take a look and tell me what you think. Took me about a week to code the base, and it’s been on-going development as of late for the database specific stuff. I hope that one day, this may replace mod_authz_svn! Wish us luck!
[1] This comes with Subversion; you need to build it along with your Subversion installation. Sorry, there is no page dedicated to this module from Subversion at this time (that I could find).
