The Struts Web application framework facilitates building robust Web applications. Java Authentication and Authorization Services (JAAS) is a rich API for adding pluggable security modules to applications. These powerful services work well together; however, their combination can also add some complexity to maintenance and enhancement tasks. Keeping the mappings in an application’s Struts configuration file(s) synchronized with the JAAS security framework policy file can be a challenge.
As the size and complexity of your site increases and the number of people contributing expands, it is possible (even probable) that the site’s configuration files will get out of synch. One common scenario is that a developer might add a page to the policy file in his test environment and then inadvertently fail to commit the updated file to source control. Alternatively, perhaps a colleague’s merge overwrites your updates to the policy file that support a new page. Inserting an automated utility into your build process to alert you when configuration files aren’t in synch is far better than your help desk getting calls from frustrated users who can’t access pages that were formerly available to them.
For example, here are two snippets, one from a jaas.policy file and one from a struts-config.xml file:
Jaas.policy file snippet
permission com.tillman.service.security.auth.URLPermission "/login.shtml";
Struts-config.xml file snippet
It’s easy enough to look at the few lines cut from the two configuration files above to determine whether they match, but what if you have multiple configuration files and hundreds of pages or more in your site? In such a situation, checking the files manually would be unrealistic. Certainly, if you have automated testing tools you might catch some synchronization issues in QA testing, but it’s far less trouble to correct problems during the development build or developer unit testing cycle.
Fortunately, the Struts configuration files are XML and the JAAS policy file is a structured text file; therefore, they can be parsed and processed in an automated fashion. This article describes an audit utility that iterates over the XML configuration files in a given directory, parses the
Having the source code in front of you will make it easier to understand what is going on. Please review the sidebar Network Connection Required to avoid frustration running the code. I’ve included sample struts-config.xml and jaas.policy files in the download for your convenience.
To work through the code, you’ll need Python 2.3 or later (available for free from http://www.python.org) and a working understanding of the Python programming language.
It would be helpful if you’re familiar with Struts and with the Java Authentication and Authorization Services (JAAS), which is part of JDK 1.4 and later and is available as a separate download for JDK 1.3. However, you can follow along and learn something from the text processing and parsing techniques presented even if you have no Struts or JAAS experience. I’ve assumed that readers have some understanding of XML parsing using SAX, and some familiarity with object-oriented development and regular expressions.
The Audit Utility
The audit utility (hereafter called the auditor) begins its work by creating a Python list of the files referenced in the Permissions section of the JAAS policy file. As I mentioned, this is a structured file, so it is possible to process it; however, the task isn’t as straightforward as parsing an XML document. I’ve hard-coded the path to the policy file in the sample code that accompanies the article, but you could just as easily read this value from a properties file or pass it as a command line parameter.
The createPolicyFileList() method reads the JAAS file and then uses a regular expression to cull out the lines that specify a JAAS permission (see the sample permission file bundled with the downloadable source). Python provides the re regular expression library as a part of its core API, so it isn’t difficult to compile a regular expression pattern so only the “Permission” lines are processed.
q = re.compile("URLPermission")
As the method reads each line of the file, it splits each line containing a JAAS permission. Each token in the line becomes an element in the plist variable (line 3 in the code below), which is a Python list data structure. Line 4 compiles another regular expression that finds page names with a .shtml extension, which is an arbitrary, pre-defined extension used in my environment to indicate a page served via the Struts framework. If you use a different extension, you’ll have to modify this line of code.
Editor’s Note: In the code snippets in this article, some of the Python indentations have been altered to suit the formatting of this article’s Web page. Double check the source code download to verify the indentations if you cut-and-paste these code snippets into your own project.
1. for line in inputfile:
2.     if q.search(line):
3.         plist = q.split(line)
4.         page = re.compile(r"\w*\.shtml")
5.         if page.search(plist[1]):
6.             pageName = plist[1][string.rfind(plist[1], "/")+1:string.rfind(plist[1], ".")]
7.             policyPages.append(pageName)
Executing the split() method at line 3 on the line below from the sample JAAS policy file seems like it ought to result in a three-element list, but it produces only two. Because the split pattern is the literal text URLPermission, the line is divided at that single match into the text before it and the text after it. The text containing buyFromGrainger.shtml therefore becomes the [1] element of the plist variable (see line 3 above).
permission com.grainger.URLPermission "/buyFromGrainger.shtml"
What you really need though is just the page name without the extension, so line 6 takes a slice of the plist variable culling out the text between the slash and the period. Line 7 adds the page name to the policyPages list that will be returned by the method. The next portion of the code deals with parsing and extracting the paths from the Struts configuration file(s) and checking them against this list.
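The policy-file step described above can be sketched as a small standalone function. This is a minimal sketch, not the code from the download: it assumes a current Python 3 interpreter rather than the Python 2.3 the article targets, and the function name extract_policy_pages and the sample lines are made up for illustration.

```python
import re

# Assumed names: extract_policy_pages and the sample lines are illustrative only.
PERMISSION = re.compile("URLPermission")
PAGE = re.compile(r"\w+\.shtml")


def extract_policy_pages(lines):
    """Return the extension-less page names found on URLPermission lines."""
    pages = []
    for line in lines:
        if not PERMISSION.search(line):
            continue
        # Splitting on the literal pattern leaves the text before it in
        # element 0 and the text after it (the quoted URL) in element 1.
        target = PERMISSION.split(line)[1]
        if PAGE.search(target):
            # Keep only the text between the last slash and the last period.
            pages.append(target[target.rfind("/") + 1:target.rfind(".")])
    return pages


sample = [
    "grant {",
    'permission com.grainger.URLPermission "/buyFromGrainger.shtml";',
    'permission java.util.PropertyPermission "user.home", "read";',
    "};",
]
print(extract_policy_pages(sample))
```

Passing in any iterable of lines, rather than a hard-coded file path, keeps the function easy to exercise against an open file or a list of test strings.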
Parsing the Configuration Files
Parsing the configuration files starts with a recursive walk through the directory that contains the configuration files. You specify this location via the first command line argument. The second command line argument is the pattern for the file type you’re interested in matching as the code “walks” the directory tree (*.xml in this case). The pattern match is important because there could be other non-XML configuration files in the directory, such as tag library descriptor files with a .tld extension.
After instantiating the XML SAX parser and ContentHandler, the method invokes the listFiles() method, which is chock full of good Pythonic stuff. The method defines a class called Bunch used to collect the arguments needed by the visit() method as it processes the directory tree. It passes visit() as a parameter to the os.path.walk() method, which is part of the standard Python library and is a real labor saver when working with the file system. In Python, methods are first-class objects, so you can pass method callbacks as arguments.
1. for name in files:
2.     fullname = os.path.normpath(os.path.join(dirname, name))
3.     if arg.return_folders or os.path.isfile(fullname):
4.         for pattern in arg.pattern_list:
5.             if fnmatch.fnmatch(name, pattern):
6.                 parseFile(fullname)
7.                 break
8. #Block recursion if recursion was prohibited
9. if not arg.recurse: files[:] = []
The central block of the visit() method loops through the files in the given directory. If a file name matches the *.xml pattern specified on the command line (line 5 above), the method invokes parseFile(fullname) at line 6 to start parsing that configuration file.
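For readers on a current interpreter, the same walk-and-match idea can be sketched with os.walk(), since os.path.walk() no longer exists in Python 3. This is a hedged sketch under that assumption; find_config_files and the file names in the demonstration are hypothetical and are not taken from the download.

```python
import fnmatch
import os
import tempfile


def find_config_files(root, patterns=("*.xml",), recurse=True):
    """Return normalized paths under root whose names match any pattern."""
    matches = []
    for dirname, subdirs, files in os.walk(root):
        for name in sorted(files):
            if any(fnmatch.fnmatch(name, p) for p in patterns):
                matches.append(os.path.normpath(os.path.join(dirname, name)))
        if not recurse:
            # Emptying the list in place stops os.walk from descending,
            # the same trick the visit() callback uses on its files list.
            subdirs[:] = []
    return matches


# Small demonstration in a throwaway directory (file names are made up).
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("struts-config.xml", "taglib.tld",
                os.path.join("sub", "struts-admin.xml")):
        with open(os.path.join(root, rel), "w") as handle:
            handle.write("<x/>")
    found = [os.path.relpath(p, root) for p in find_config_files(root)]
    print(sorted(found))
```

Returning the matching paths instead of parsing inside the loop is a design choice made for this sketch; it separates finding files from parsing them, whereas the article's visit() callback calls parseFile() directly.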
The ActionPathHandler is the ContentHandler callback class for the XML parsing process. In compliance with the SAX API it defines the startElement() method so that each configuration file XML element will be parsed and evaluated.
1. #for each action see if path is in the policy file.
2. def startElement(self, name, attrs):
3.     if name == 'action':
4.         self.path = attrs.get('path', "")
5.         try:
6.             pageName = self.path[1:len(self.path)]
7.             if(self.policyPages.index(pageName) >= 0):
8.                 pass
9.         except ValueError:
10.            print "not found: " + self.path
In Struts configuration files the action mapping contains a path attribute that holds the name of the page that will map a particular URL to an action class. The action mapping’s path attribute doesn’t include the .shtml extension so this is the name that should appear in the JAAS list. If it doesn’t, the auditor reports the file’s absence. This is also the name that would appear in a Web request. Therefore, a URL to retrieve home.jsp as shown in line 3 below would look something like http://www.doug.com/start.shtml. This sort of indirection can be challenging to understand at first but it is one of the true strengths of Struts.
Snippet from the sample struts-config.xml configuration file (see the sample struts-config.xml file included in the download)
The auditor passes the ActionPathHandler to the SAX parser as it starts to parse the file. The parser is responsible for invoking the ContentHandler methods to pull out the various XML elements, entities, etc. The code at line 4 above gets the path attribute from the action element.
In this case, the page name that we want to check against the JAAS configuration file starts with a leading slash (see the sample configuration file or the snippet above), so line 6 extracts just the page name using a Python ‘slice’ of the path string. Line 7 uses the built-in Python list index(data_to_find) method to check whether the page name exists in the list of pages that were previously culled from the JAAS policy file. The index() method returns an integer that represents the location of the first match found in the list and raises a ValueError when there is no match. Since the auditor is interested only in configuration file pages that aren’t in the policy file, it catches the ValueError when a page doesn’t appear (see line 9) and prints out the path to the console along with a short message (see line 10).
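The handler logic described above can be tried in isolation with a short script. This is a minimal sketch that assumes Python 3; the sample XML, its action paths, and its class names are hypothetical. It also swaps the list index() call and ValueError handling for a set membership test, which reports the same missing pages.

```python
import xml.sax

# Hypothetical sample data; the action paths and class names are made up.
SAMPLE_CONFIG = b"""<struts-config>
  <action-mappings>
    <action path="/start" type="example.StartAction"/>
    <action path="/checkout" type="example.CheckoutAction"/>
  </action-mappings>
</struts-config>"""


class ActionPathHandler(xml.sax.ContentHandler):
    """Collect action paths that have no matching policy page."""

    def __init__(self, policy_pages):
        super().__init__()
        self.policy_pages = set(policy_pages)
        self.missing = []

    def startElement(self, name, attrs):
        if name != "action":
            return
        path = attrs.get("path", "")
        # Drop the leading slash before comparing with the policy names.
        if path[1:] not in self.policy_pages:
            self.missing.append(path)


handler = ActionPathHandler(["start", "login"])
xml.sax.parseString(SAMPLE_CONFIG, handler)
for path in handler.missing:
    print("not found: " + path)
```

Collecting the missing paths in a list, instead of only printing them, would let a build script fail when the list is non-empty.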
Struts and JAAS are a potent combination for building robust and secure Web applications. With the help of this auditor utility, customized to meet your needs, you can spend less time worrying about keeping these files in synch and debugging problems when things go awry, and more time writing the code that is critical to your business’ success.