Parsing the Configuration Files
Parsing starts with a recursive walk through the directory that contains the configuration files. You specify this location via the first command-line argument. The second command-line argument is the pattern for the file type you want to match as the code "walks" the directory tree: *.xml in this case. The pattern match is important because the directory may contain other, non-XML configuration files, such as tag library descriptor files with a .tld extension.
After instantiating the XML SAX parser and ContentHandler, the method invokes the listFiles() method, which is chock full of good Pythonic stuff. The method defines a class called Bunch to collect the arguments needed by the visit() method as it processes the directory tree. It passes visit() as a parameter to os.path.walk(), which is part of the standard Python 2 library (Python 3 replaces it with os.walk()) and is a real labor saver when working with the file system. In Python, functions and methods are first-class objects, so you can pass them as callback arguments.
1. for name in files:
2.     fullname = os.path.normpath(os.path.join(dirname, name))
3.     if arg.return_folders or os.path.isfile(fullname):
4.         for pattern in arg.pattern_list:
5.             if fnmatch.fnmatch(name, pattern):
6.                 parseFile(fullname)
7.                 break
8. # Block recursion if recursion was prohibited
9. if not arg.recurse: files[:] = []
The central block of the visit() method loops through the files in the given directory. If a file name matches the *.xml pattern specified on the command line (line 5 above), the method invokes parseFile(fullname) at line 6 to start parsing that configuration file.
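For readers on Python 3, where os.path.walk() no longer exists, the same walk-and-match idea can be sketched with os.walk(). This is a minimal sketch rather than the auditor's actual code: the names list_files and results are assumptions, the Bunch class is a guess at the usual attribute-container idiom, and the function collects matching paths instead of calling parseFile() directly.

```python
import fnmatch
import os


class Bunch:
    """Attribute container: Bunch(recurse=True).recurse is True."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


def list_files(root, patterns=("*",), recurse=True):
    """Return paths of files under root whose names match any pattern."""
    arg = Bunch(recurse=recurse, pattern_list=list(patterns), results=[])
    for dirname, subdirs, files in os.walk(root):
        for name in sorted(files):
            fullname = os.path.normpath(os.path.join(dirname, name))
            # Keep the file if its name matches at least one pattern.
            if any(fnmatch.fnmatch(name, p) for p in arg.pattern_list):
                arg.results.append(fullname)
        if not arg.recurse:
            # Emptying the list in place stops os.walk from descending.
            subdirs[:] = []
    return arg.results
```

Calling list_files(config_dir, ["*.xml"]) would return the XML files to hand to the parser while skipping .tld files in the same tree.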
The ActionPathHandler is the ContentHandler callback class for the XML parsing process. In compliance with the SAX API, it defines the startElement() method so that each XML element in a configuration file is evaluated as it is parsed.
1.  # For each action, see if its path is in the policy file.
2.  def startElement(self, name, attrs):
3.      if name == 'action':
4.          self.path = attrs.get('path', "")
5.          try:
6.              pageName = self.path[1:]
7.              if self.policyPages.index(pageName) >= 0:
8.                  pass
9.          except ValueError:
10.             print("not found: " + self.path)
In Struts configuration files, the action mapping contains a path attribute that holds the name used to map a particular URL to an action class. The path attribute doesn't include the .shtml extension, so this is the name that should appear in the JAAS list. If it doesn't, the auditor reports its absence. This is also the name that appears in a Web request. Therefore, a URL to retrieve home.jsp, as shown in line 2 below, would look something like http://www.doug.com/start.shtml. This sort of indirection can be challenging to understand at first, but it is one of the true strengths of Struts.
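The relationship between an action path, the page name checked against the policy list, and the request URL can be sketched in a few lines. The helper names page_name and request_url are hypothetical, and the sketch assumes (as the example URL suggests) that requests ending in .shtml are routed to Struts.

```python
def page_name(action_path):
    """Drop the leading slash from a Struts action path, if present."""
    return action_path[1:] if action_path.startswith("/") else action_path


def request_url(base_url, action_path, extension=".shtml"):
    """Build the URL a browser would request for the given action path."""
    # Assumption: the servlet mapping appends the extension to the path.
    return base_url.rstrip("/") + action_path + extension
```

With these helpers, page_name("/start") gives "start", the name expected in the JAAS list, and request_url("http://www.doug.com", "/start") gives the example URL from the text.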
Snippet from the sample struts-config.xml configuration file
1. <action-mappings>
2.   <action path="/start" forward="/home.jsp" />
3.   <action path="/parseablePage" scope="request"
             type="com.doug.AnAction">
4.     <forward name="error" path="/Error.jsp"/>
5.   </action>
6. </action-mappings>
The auditor passes the ActionPathHandler to the SAX parser as it starts to parse the file. The parser is responsible for invoking the ContentHandler methods as it encounters the various XML elements, entities, and so on. Line 4 of the startElement() listing gets the path attribute from the action element.
In this case, the page name that we want to check against the JAAS policy file starts with a leading slash (see the sample configuration file or the snippet above), so line 6 of the startElement() listing extracts just the page name using a Python slice of the path string. Line 7 uses the built-in list method index(data_to_find) to check whether the page name exists in the list of pages previously culled from the JAAS policy file. The index() method returns an integer giving the position of the first match in the list and raises a ValueError if there is no match. Since the auditor is interested only in configuration file pages that aren't in the policy file, it catches the ValueError when a page doesn't appear (see line 9) and prints the path to the console along with a short message (see line 10).
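To see the handler working end to end, here is a minimal Python 3 sketch that runs a SAX parse over the sample snippet. It is an illustration, not the auditor's original code: the audit() helper, the missing list, and the SAMPLE_CONFIG constant are assumptions, and it swaps the index()/ValueError check for a plain membership test, collecting unmatched paths instead of printing them.

```python
import xml.sax

# The action-mappings snippet from the sample struts-config.xml.
SAMPLE_CONFIG = b"""<action-mappings>
  <action path="/start" forward="/home.jsp" />
  <action path="/parseablePage" scope="request" type="com.doug.AnAction">
    <forward name="error" path="/Error.jsp"/>
  </action>
</action-mappings>"""


class ActionPathHandler(xml.sax.ContentHandler):
    """Record every action path whose page name is not in the policy list."""

    def __init__(self, policy_pages):
        super().__init__()
        self.policy_pages = set(policy_pages)
        self.missing = []

    def startElement(self, name, attrs):
        if name != "action":
            return  # ignore <forward> and any other elements
        path = attrs.get("path", "")
        page = path[1:]  # drop the leading slash
        if page not in self.policy_pages:
            self.missing.append(path)


def audit(xml_bytes, policy_pages):
    """Parse the configuration bytes and return paths absent from the policy."""
    handler = ActionPathHandler(policy_pages)
    xml.sax.parseString(xml_bytes, handler)
    return handler.missing
```

If the policy list contains only "start", audit(SAMPLE_CONFIG, ["start"]) returns ["/parseablePage"], the one action the policy file does not cover.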
Struts and JAAS are a potent combination for building robust and secure Web applications. With the help of this auditor utility, customized to meet your needs, you can spend less time worrying about keeping these files in sync and debugging problems when things go awry, and more time writing the code that is critical to your business's success.