The Object-Oriented Evolution of PHP

Few people know this, but when PHP as we know it today was being molded, back in the summer of 1997, there were no plans for it to have any object-oriented capabilities. Andi Gutmans and I were working to create a powerful, robust and efficient Web language loosely based on the PHP/FI 2.0 and C syntax. As a matter of fact, we got pretty far without having any notion of classes or objects; it was to be a purely structured language. However, on August 27th of that year, PHP’s object capabilities changed.

When classes were introduced to the code base of what was to become PHP 3.0, they were added as syntactic sugar for accessing collections. PHP already had the notion of associative array collections, and the new classes were nothing but a neat new way of accessing them. However, as time has shown, this new syntax had a much more far-reaching effect on PHP than was originally intended.

Another thing that most people don’t know is that by the time PHP 3.0 was officially released in mid-1998, while it was gaining momentum at a staggering rate (see Figure 1), Andi Gutmans and I were already determined to rewrite the language implementation. We were well aware that users liked PHP as it existed at the time. But as the authors of the engine we knew what was going on under the hood, and we couldn’t live peacefully with that. The rewrite, which was later dubbed the ‘Zend Engine’ (Zend being a combination of Zeev and Andi), initiated and became one of the core components of the second revolution that PHP experienced in just over a year.

Figure 1: PHP Usage as of October, 2002

This revolution, however, left PHP’s object model mostly unchanged from version 3; it was still very simple. Objects were still very much syntactic sugar for associative arrays, and didn’t offer users too many additional features.

Objects in the Old Days
So, what could one do with objects back in the days of PHP 3.0 or, for that matter, with the current PHP 4.0 version? Not much, really. Objects were essentially containers for properties, like associative arrays. The biggest difference was that objects had to belong to a class. Classes, as in other languages, contained a collection of both properties and methods (functions). You could create instances (objects) of a class using the new operator. The object model supported single inheritance, which let users extend (specialize) an existing class without having to write it from scratch or copy it. Finally, PHP 4.0 also added the ability to call methods of a specific class, from both within and outside of object contexts.
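As a rough illustration of that feature set, here is a minimal sketch. The classes Person and Employee and their members are hypothetical and not taken from any real code base; the code sticks to PHP 4 era constructs (such as the var keyword) but is written so that it also runs on current PHP versions.

```php
<?php
// Hypothetical classes, for illustration only.
class Person
{
    var $name = 'unknown';   // PHP 4 style property declaration

    function describe()
    {
        return 'Person: ' . $this->name;
    }
}

// Single inheritance: Employee specializes Person.
class Employee extends Person
{
    var $company = 'none';

    function describe()
    {
        // Call the method of a specific class from within an object context.
        return Person::describe() . ' (works at ' . $this->company . ')';
    }
}

$employee = new Employee();   // instantiate with the new operator
$employee->name = 'Joe';
$employee->company = 'Acme';

// The same data held in an associative array, for comparison.
$asArray = array('name' => 'Joe', 'company' => 'Acme');

$description = $employee->describe();
print $description . "\n";    // prints "Person: Joe (works at Acme)"
```

Apart from the methods and the class membership, the object carries the same data as the array, which is why the two were so closely related in the original implementation.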

One of the biggest twists in PHP’s history was that despite the very limited functionality, and despite a host of problems and limitations, object-oriented programming in PHP thrived and became the most popular paradigm for the growing numbers of off-the-shelf PHP applications. This trend, which was mostly unexpected, caught PHP in a sub-optimal situation. It became apparent that objects were not behaving like objects in other OO languages, and were instead behaving like associative arrays.

The Limitations of the Old Object Model
The most problematic aspect of the PHP 3 / PHP 4 object model was that objects were passed around by value, and not by reference. What does that mean? Here’s an example.

Let’s say you have a simple, somewhat useless function, called myFunction():

   function myFunction($arg)
   {
      $arg = 5;
   }

And you call this function as follows:

   $myArgument = 7;
   myFunction($myArgument);
   print $myArgument;

As you probably know, the call to myFunction() will leave $myArgument unchanged; what is sent to myFunction() is a copy of $myArgument’s value, and not $myArgument itself. This type of argument passing is called passing arguments by value. Passing arguments by value is done by most structured languages and is extremely useful, as it allows you to write your functions or call other people’s functions without worrying about side effects they may have on variables outside their scope.

However, consider the following example:

   function wed($bride, $groom)
   {
      if ($bride->setHusband($groom)
         && $groom->setWife($bride)) {
         return true;
      }
      else {
         return false;
      }
   }

   wed($joanne, $joe);
   print areMarried($joanne, $joe);

Author Note: Implementing Woman::setHusband(), Man::setWife() and areMarried() is left as an exercise for the reader.

What is the return value of areMarried()? One would hope that the two newlyweds would manage to stay married at least until the following line of code, but as you may have guessed, they wouldn’t. areMarried() will confirm that they got divorced just as soon as they got married. Why?

The reason is simple. Objects in PHP 3.0 and 4.0 are not ‘special’. They behave like any other kind of variable; in other words, when you pass $joanne and $joe to wed(), you don’t really pass them. Instead, you pass clones or replicas of them. So, while their clones end up being married inside wed(), the real $joe and $joanne remained safely distant from the sacrament of holy matrimony, in their protected outer-scope.
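The copy-on-pass behavior cannot be reproduced directly on a current PHP runtime, but it can be approximated. The sketch below is one possible way to do so, under two loudly stated assumptions: the Woman, Man and areMarried() implementations are hypothetical (the article leaves them to the reader), and the explicit clone statements stand in for the implicit copy that PHP 3 and 4 made when an object was passed to a function.

```php
<?php
// Hypothetical implementations; the article leaves these to the reader.
class Woman
{
    public $husband = null;

    function setHusband($groom)
    {
        $this->husband = $groom;
        return true;
    }
}

class Man
{
    public $wife = null;

    function setWife($bride)
    {
        $this->wife = $bride;
        return true;
    }
}

function areMarried($woman, $man)
{
    return $woman->husband !== null && $man->wife !== null;
}

// Approximates PHP 3/4 semantics: the function works on replicas.
function wedOldStyle($bride, $groom)
{
    $bride = clone $bride;   // assumption: stands in for the implicit copy
    $groom = clone $groom;
    return $bride->setHusband($groom) && $groom->setWife($bride);
}

$joanne = new Woman();
$joe = new Man();

$ceremonyResult = wedOldStyle($joanne, $joe);
$stillMarried = areMarried($joanne, $joe);

var_dump($ceremonyResult); // bool(true): the replicas were married
var_dump($stillMarried);   // bool(false): the originals were never touched
```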

Of course, PHP 3 and 4 did give you an option to force your variables to be passed by reference, so you could let functions change the arguments that were passed to them in the outer scope. To do that, you could define the prototype for wed() using the ampersand (&) symbol to tell PHP that its arguments should be passed by-reference, instead of by-value. For example:

   function wed(&$bride, &$groom)

With this new implementation of wed(), Joanne and Joe would have better luck (or not, depending on your point of view).
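The effect of the ampersand is easiest to see with a plain value. The following sketch uses a hypothetical by-reference variant of the earlier myFunction(); the name myFunctionByRef() is made up for this illustration.

```php
<?php
// Hypothetical by-reference variant of myFunction().
function myFunctionByRef(&$arg)
{
    $arg = 5;   // writes through to the caller's variable
}

$myArgument = 7;
myFunctionByRef($myArgument);
print $myArgument . "\n"; // prints 5
```

Because $arg is bound to the caller's variable rather than to a copy of its value, the assignment inside the function is visible after the call returns.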

However, it gets more complicated than that. For instance, what if you want to return an object from a function, by reference? What if you want to make modifications to $this inside the constructor, without worrying about what may happen when it gets copied back from new’s result into the container variable (if you don’t know what I’m talking about here, say “Hallelujah.”)?

While PHP 3 and 4 did address these problems to a certain extent by providing syntactic hacks to pass around objects by reference, they never addressed the core of the problem, which is that objects and other types of values are not created equal. Therefore, objects should be passed around by reference unless stated otherwise.

The Answer: Zend Engine 2
Being finally convinced that objects are indeed special creatures and deserve their own distinct behavior was only the first step. We had to come up with a way of doing this without interfering with the rest of the semantics of PHP, and preferably, without having to rewrite the whole of PHP itself. Luckily, the solution came in the form of a big light bulb that emerged above Andi Gutmans’ head just over a year ago. His idea was to replace objects with object handles. The object handles would essentially be numbers, indices in a global object table. Much like any other kind of variable, they would be passed and returned by value. Thanks to this new level of indirection, we would be moving around handles to the objects and not the objects themselves. In effect, this means that PHP behaves as if the objects themselves are passed by reference.

Let’s go back to Joe and Joanne. How would wed() behave differently now? First, $joanne and $joe will no longer be objects, but rather, object handles, let’s say 4 and 7 respectively. These integer handles point to slots in some global objects table where the actual objects reside. When we send them to wed(), the local variables $bride and $groom will receive the values 4 and 7, setHusband() will change the object referenced by 4, setWife() will change the object referenced by 7, and when wed() returns, $joanne and $joe will already be living the first day of the rest of their lives together (see Figure 2).
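On a runtime with handle semantics, the same scenario can be sketched as follows. The Woman and Man classes are again hypothetical stand-ins, and sameHandle() is a made-up helper; it relies on spl_object_id(), which is available from PHP 7.2 onward and returns the integer handle of an object. The actual handle values (4 and 7 in the text) are not something a script should depend on.

```php
<?php
// Hypothetical minimal classes, as the article leaves them to the reader.
class Woman
{
    public $husband = null;
    function setHusband($groom) { $this->husband = $groom; return true; }
}

class Man
{
    public $wife = null;
    function setWife($bride) { $this->wife = $bride; return true; }
}

function wed($bride, $groom)
{
    // $bride and $groom hold copies of the handles, not copies of the objects.
    return $bride->setHusband($groom) && $groom->setWife($bride);
}

// Hypothetical helper: true when both variables carry the same handle.
function sameHandle($a, $b)
{
    return spl_object_id($a) === spl_object_id($b);
}

$joanne = new Woman();
$joe = new Man();
wed($joanne, $joe);

$married = ($joanne->husband === $joe) && ($joe->wife === $joanne);
var_dump($married);                          // bool(true)
var_dump(sameHandle($joanne, $joe->wife));   // bool(true)
```

No ampersands are needed: the handles are copied by value, yet every copy leads to the same underlying object.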

Figure 2: The wed() function in the Zend Development Environment

What Do These New Capabilities Mean to Developers?
Alright, so the ending to the story is now more idyllic, but what does it mean to PHP developers? It means quite a number of things. First, it means that your applications will run faster, because passing objects by reference requires much less data copying. For instance, when you send $joe to a function, rather than creating a replica and copying his name, birth date, parents’ names, list of former addresses, social security number and whatnot, PHP will only have to pass one object handle, one integer. Obviously, another direct result of this is that it saves a significant amount of memory: storing an integer requires much less space than storing a full-fledged replica of an object.

But perhaps more important, the new object model makes object-oriented programming in PHP much more powerful and intuitive. No longer will you have to mess with cryptic & characters to get the job done. No longer will you have to worry about whether changes you make to the object inside the constructor will survive the dreaded new-operator behavior. No longer will you ever have to stay up until 2:00AM tracking elusive bugs! OK, maybe I’m lying with that last one; but seriously, the new object model significantly reduces the object-related stay-up-until-2:00AM type of bugs. In turn, that greatly increases the feasibility of using PHP for large-scale projects.

What Else Is New?
As one might expect, the Zend Engine 2 packs in quite a few other features along with its brand new object model. Some of the features further enhance object-oriented capabilities, such as private member variables and methods, static variables and language-level aggregation. Most notable is the revolutionized interaction with external component models, such as Java, COM/DCOM and .NET via overloading.
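Two of these additions, private members and static class variables, can be sketched briefly. The Counter class below is hypothetical, and the syntax shown is the one found in released PHP 5 and later versions, which may differ in detail from what was planned at the time of writing.

```php
<?php
// Hypothetical class illustrating private members and a static variable.
class Counter
{
    private static $instances = 0;  // shared by all Counter objects
    private $count = 0;             // not accessible from outside the class

    function __construct()
    {
        self::$instances++;
    }

    function increment()
    {
        return ++$this->count;
    }

    static function instances()
    {
        return self::$instances;
    }
}

$a = new Counter();
$b = new Counter();
$a->increment();
$latest = $a->increment();

print $latest . "\n";               // prints 2
print Counter::instances() . "\n";  // prints 2
```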

PHP 4.0 was the first version to introduce this sort of integration, but the new implementation is much quicker, more complete, more reliable and far easier to maintain and extend. These elements mean that PHP 5.0 will play very nicely in your existing Java or .NET based setup. You’ll be able to use your existing components inside PHP transparently, as if they were regular PHP objects. Unlike PHP 4.0, which had a special implementation for such overloaded objects, PHP 5.0 uses the same interface for all objects, including native PHP objects. This feature ensures that PHP objects and overloaded objects behave in exactly the same way.

Finally, the Zend Engine 2 also adds exception handling to PHP. To date, the sad reality is that most developers write code that does not handle error situations gracefully. It’s not uncommon to see sites that spit out cryptic database errors to your browser, rather than displaying a more user-friendly “An error has occurred” message. A key reason for this is that handling error situations with existing versions of PHP is a daunting task: you actually have to check the return value of each and every function. Adding a set_error_handler() function made managing errors slightly easier, but still left a lot to be desired. Adding exception handling to PHP lets developers achieve fine-grained error recovery, but more importantly, it facilitates graceful application-wide error recovery.
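A minimal sketch of both levels of recovery might look like this. The functions connectToDatabase() and renderPage() are hypothetical, the empty-string check merely stands in for a real connection failure, and the anonymous function passed to set_exception_handler() requires PHP 5.3 or later.

```php
<?php
// Hypothetical function; the check stands in for a real connection attempt.
function connectToDatabase($dsn)
{
    if ($dsn === '') {
        throw new Exception('No data source configured');
    }
    return 'connected to ' . $dsn;
}

// Hypothetical page renderer with fine-grained recovery.
function renderPage($dsn)
{
    try {
        return connectToDatabase($dsn);
    } catch (Exception $e) {
        // Keep the cryptic detail in the log, show a friendly message.
        error_log($e->getMessage());
        return 'An error has occurred';
    }
}

// Application-wide fallback for anything no catch block handled.
set_exception_handler(function ($e) {
    error_log('Unhandled: ' . $e->getMessage());
    print "An error has occurred\n";
});

print renderPage('') . "\n";            // prints "An error has occurred"
print renderPage('mysql:demo') . "\n";  // prints "connected to mysql:demo"
```

The caller no longer inspects a return value after every call; a failure anywhere inside the try block transfers control to the handler.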

The release of PHP 5.0, powered by the Zend Engine 2.0, will mark a significant step forward in PHP’s evolution as one of the key Web platforms in the world today. While keeping its firm commitment to users who prefer using the functional structured syntax of PHP, the new version will provide a giant leap ahead for those who are interested in its object-oriented capabilities, especially for companies developing large-scale applications.
