Generating Reports and Statistics in PHP : Page 2

Discover the PHP libraries that help you generate statistics and reports that analyze data from text files, XML, or relational databases.



The XML_Statistics PEAR Package
To analyze XML documents, you can install and use the XML_Statistics package, which provides methods for obtaining statistics about tags, attributes, entities, processing instructions, data blocks, and CDATA sections for any well-formed XML document.

To install the package, use this command (version 0.1 is a beta release):

   > pear install --alldeps XML_Statistics-0.1
Authors' Note: Use the --alldeps option when installing, because the XML_Statistics package depends on the XML_Parser PEAR package, an XML parser based on PHP's built-in XML extension. At the time of writing, the latest release of XML_Parser is 1.2.8 (stable).


After installing the package, you'll find the code for the base class of the XML_Statistics package in the file Statistics.php, in the XML directory of your PEAR installation (the examples below load it as 'XML/Statistics.php'). Here are some of its main functions:

  • boolean analyzeFile(mixed $file, string $filename): This function analyzes an XML document loaded from a file path or URL. The examples in this article pass only the path. To see the results of the analysis, use the countX() and getX() methods.
  • integer countTag([string $tagname = null]): This function returns the number of occurrences of a tag in the entire XML document. The tag name is passed to the function through the $tagname argument.
  • integer countAttribute([string $attribute = null], [string $tagname = null]): This function returns the number of occurrences of an attribute. You pass the attribute name to the function via the $attribute argument. If you don't specify the second argument, $tagname, then the function searches for the specified attribute in the entire XML document; otherwise, it limits the search range to the specified tag.
  • integer getMaxDepth(): This function returns the maximum nesting level in the document, the "depth" of the XML tree.
  • integer countTagsInDepth(integer $depth): This function returns the number of tags that "live" at the specified depth. The root tag depth is zero.
  • integer countExternalEntity([string $name = null]): This function returns the number of occurrences of external entities. If you don't specify an entity name then the function counts the total number of external entities; otherwise, it counts only the occurrences of the specified entity.
  • integer getCDataLength(): This function returns the total length of all CDATA sections.
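
To make the counting methods above more concrete, here is a minimal sketch of how tag, attribute, and depth counts can be gathered with PHP's built-in XML extension, the same extension that XML_Parser builds on. This is not the XML_Statistics source: the function name sketchXmlStats(), the array keys, and the sample document are all hypothetical, and the sketch assumes the root tag sits at depth zero, as described for countTagsInDepth().

```php
<?php
// Illustrative sketch only: this is NOT the XML_Statistics implementation.
// sketchXmlStats() and its array keys are hypothetical names.
function sketchXmlStats(string $xml): array
{
    $tagCounts   = [];  // occurrences per tag name
    $attrCounts  = [];  // occurrences per attribute name
    $depthCounts = [];  // number of tags at each depth (root is depth 0)
    $depth       = -1;  // becomes 0 when the root element opens
    $maxDepth    = 0;

    $parser = xml_parser_create();
    // keep tag names as written instead of upper-casing them
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    // called for every opening tag
    $open = function ($p, $name, $attrs) use (&$tagCounts, &$attrCounts, &$depthCounts, &$depth, &$maxDepth) {
        $depth++;
        $tagCounts[$name]    = ($tagCounts[$name] ?? 0) + 1;
        $depthCounts[$depth] = ($depthCounts[$depth] ?? 0) + 1;
        foreach (array_keys($attrs) as $attr) {
            $attrCounts[$attr] = ($attrCounts[$attr] ?? 0) + 1;
        }
        $maxDepth = max($maxDepth, $depth);
    };

    // called for every closing tag
    $close = function ($p, $name) use (&$depth) {
        $depth--;
    };

    xml_set_element_handler($parser, $open, $close);

    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException((string) xml_error_string(xml_get_error_code($parser)));
    }

    return [
        'totalTags'   => array_sum($tagCounts),
        'tagCounts'   => $tagCounts,
        'attrCounts'  => $attrCounts,
        'depthCounts' => $depthCounts,
        'maxDepth'    => $maxDepth,
    ];
}

// Hypothetical sample document (not the myxml.xml file from Listing 1)
$sample = '<library><book type="novel"><title>A</title><author>B</author></book>'
        . '<book type="essay"><title>C</title></book></library>';

$sketch = sketchXmlStats($sample);
echo "Total tags: " . $sketch['totalTags'] . "\n";
echo "Maximum depth: " . $sketch['maxDepth'] . "\n";
```

A single pass with open and close handlers is enough for all of these counts, because the current depth is just the number of tags that have been opened and not yet closed.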


Listing 1 contains an XML document used for test purposes in the example code (see the file myxml.xml in the downloadable code).

The following PHP example uses the XML document in Listing 1 to retrieve some basic statistics about tags, attributes, and CDATA sections:

   <?php
     // import Statistics.php
     require_once 'XML/Statistics.php';
       
     // ignore whitespace
     $stat = new XML_Statistics(array("ignoreWhitespace" => true));
       
     // analyze a file or URL
     $result = $stat->analyzeFile("myxml.xml");
       
     if ($stat->isError($result)) {
       die("Error: " . $result->getMessage());
     } 
     else {
       // total number of tags
       echo "Total tags: " . $stat->countTag()."<br>";
       
       // count the occurrences of the 'type' attribute
       echo "Occurrences of attribute type: " . $stat->countAttribute("type")."<br>";
             
       // get the maximum depth
       echo "Maximum depth: " . $stat->getMaxDepth()."<br>";
   
       // count the total number of tags at depth 3
       echo "The number of tags in depth 3: " . $stat->countTagsInDepth(3)."<br>";
       
       // count the occurrences of data blocks
       echo "Data chunks: " . $stat->countDataChunks()."<br>";
   
       // get the total length reported by getCDataLength()
       echo "Length of all data chunks: " . $stat->getCDataLength()."<br>";
     }
   ?>

The output of this example is:

   Total tags: 16
   Occurrences of attribute type: 2
   Maximum depth: 3
   The number of tags in depth 3: 10
   Data chunks: 6
   Length of all data chunks: 93

You can combine the XML statistics with statistical functions such as max, min, midrange, sum, variance, and quartiles by installing the Math_Stats PEAR package, which works well in conjunction with XML_Statistics. You install Math_Stats like this:

   pear install Math_Stats

The most commonly used functions of the Math_Stats package are:

  • mixed calcBasic([boolean $returnErrorObject = true]): This function calculates a basic set of statistics.
  • mixed calcFull([boolean $returnErrorObject = true]): This function calculates a full set of statistics.
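
As a rough illustration of the kind of values such statistics involve, the following plain-PHP sketch computes a midrange and a median for a small data set. It does not use Math_Stats and makes no claim about how the package computes its results or which keys it returns; the function names sketchMidrange() and sketchMedian() and the sample data are hypothetical, and both functions assume a non-empty array of numbers.

```php
<?php
// Plain-PHP sketch; NOT the Math_Stats implementation.
// Both helpers are hypothetical and expect a non-empty numeric array.

// midrange: the midpoint between the smallest and largest value
function sketchMidrange(array $data): float
{
    return (min($data) + max($data)) / 2;
}

// median: the middle value of the sorted data
// (the mean of the two middle values when the count is even)
function sketchMedian(array $data): float
{
    sort($data); // sorts the local copy only
    $n   = count($data);
    $mid = intdiv($n, 2);

    return $n % 2 === 1
        ? (float) $data[$mid]
        : ($data[$mid - 1] + $data[$mid]) / 2;
}

// Hypothetical data, for example the number of occurrences per tag name
$counts = [3, 1, 4, 1, 5];

echo "Midrange: " . sketchMidrange($counts) . "\n";
echo "Median: " . sketchMedian($counts) . "\n";
```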



The next example demonstrates both these functions using the myxml.xml document in Listing 1:

   
   <?php
   
     //import Statistics.php and Stats.php
     require_once 'XML/Statistics.php';
     require_once 'Math/Stats.php';
   
     $stat = new XML_Statistics();
     $result = $stat->analyzeFile("myxml.xml");
       
     if ($stat->isError($result)) {
       die("Error: " . $result->getMessage());
     } 
     else {
       // get the number of tags per tagname
       $tags = $stat->getTagOccurences();
   
       // use Math_Stats class
       $stats = new Math_Stats();
         
       // set the data
       $stats->setData($tags);
   
       // calculates a basic set of statistics
       $stats1 = $stats->calcBasic();
   
       // calculates a full set of statistics
       $stats2 = $stats->calcFull();
   
       echo "<pre>";
       echo '<b><u>A basic set of statistics</u></b><br /><br />';
   
       // print a basic set of statistics 
       print_r($stats1);
   
       echo '<br /><b><u>A full set of statistics</u></b><br /><br />';
   
       // print a full set of statistics
       print_r($stats2);
   
       echo "</pre>";
     }
   ?>
   

The "basic statistics" portion of the output of this example is shown below. Listing 2 shows the full output.

   A basic set of statistics
   Array
   (
       [min] => 1
       [max] => 2
       [sum] => 16
       [sum2] => 20
       [count] => 14
       [mean] => 1.1428571428571
       [stdev] => 0.36313651960128
       [variance] => 0.13186813186813
   )
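
The figures above can be checked by hand. With a minimum of 1, a maximum of 2, a count of 14, and a sum of 16, the data must consist of twelve tag names that occur once and two that occur twice, assuming whole-number counts. The plain-PHP sketch below rebuilds the same values from that data. It is not the Math_Stats code: the function name sketchBasicStats() is hypothetical, and the n - 1 (sample) denominator for the variance is inferred from the output shown above rather than taken from the package documentation.

```php
<?php
// Plain-PHP check of the "basic statistics" output; NOT the Math_Stats code.
// sketchBasicStats() is a hypothetical helper for a non-empty numeric array
// with at least two values.
function sketchBasicStats(array $data): array
{
    $count = count($data);
    $sum   = array_sum($data);
    $mean  = $sum / $count;

    $sum2 = 0;           // sum of squared values
    $squaredDevs = 0.0;  // sum of squared deviations from the mean
    foreach ($data as $x) {
        $sum2 += $x * $x;
        $squaredDevs += ($x - $mean) ** 2;
    }

    // sample variance: divide by n - 1 (inferred from the output above)
    $variance = $squaredDevs / ($count - 1);

    return [
        'min'      => min($data),
        'max'      => max($data),
        'sum'      => $sum,
        'sum2'     => $sum2,
        'count'    => $count,
        'mean'     => $mean,
        'stdev'    => sqrt($variance),
        'variance' => $variance,
    ];
}

// twelve tag names occurring once, two occurring twice
$data  = array_merge(array_fill(0, 12, 1), array_fill(0, 2, 2));
$basic = sketchBasicStats($data);

print_r($basic);
```

Here sum2 is the sum of the squared values (12 × 1 + 2 × 4 = 20), and the variance works out to (20 − 14 × (16/14)²) / 13 ≈ 0.1319, which agrees with the output above.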

Now that you've seen how to generate reports and statistics from text files and XML documents, you can move on to generating them from data stored in relational databases.


