The new language features of PHP V5 can significantly improve the maintainability and reliability of your code. This article shows how to take advantage of these features when migrating code developed in PHP V4 to PHP V5.
PHP V5 is a significant improvement over PHP V4. New language features make it easier to build reliable class libraries and maintain them. In addition, the rewritten standard library brings PHP more in line with its fellow Web languages, such as the Java™ programming language. Let's take a look at some of PHP's new object-oriented features and learn how to migrate existing PHP V4 code to PHP V5.
First, let's look at how the new language features, and PHP's creators, have changed the way objects are built compared with PHP V4. The idea with V5 was to create an industrial-strength language for Web application development. That meant understanding the limitations of PHP V4, then taking proven language constructs from other languages (such as Java, C#, C++, Ruby, and Perl) and incorporating them into PHP.
The first and most important new feature is access protection for class methods and instance variables - the public, protected and private keywords. This feature lets class designers keep control over the internals of a class while telling users of the class which members are accessible and which are not.
In PHP V4, all code is public. In PHP V5, class designers can declare which code is visible to the outside world (public), which code is visible only inside the class (private), and which code is visible only to the class and its subclasses (protected). Without these access controls, developing code in a large team or distributing code as a library is hindered, because users of those classes are likely to call the wrong methods or access member variables that should be private.
Another big new feature is the keywords interface and abstract, which allow contract programming. Contract programming means that one class provides a contract to another class - in other words: "This is what I'm going to do, and you don't need to know how it's done." All classes that implement the interface adhere to this contract. All users of an interface agree to use only the methods specified in the interface. The abstract keyword makes working with interfaces very easy, as I'll explain later.
These two key features - access control and contract programming - allow large teams of coders to work more smoothly with large code bases. They also let IDEs provide richer language-aware features. This article not only addresses several migration issues, but also spends some time explaining how to use these major new language features.
Access Control
To demonstrate the new language features, I used a class called Configuration. This simple class contains configuration items for the Web application - for example, the path to the images directory. Ideally, this information would reside in a file or database. Listing 1 shows a simplified version.
Listing 1. access.php4
<?php
class Configuration
{
  var $_items = array();
  function Configuration() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  function get( $key ) {
    return $this->_items[ $key ];
  }
}
$c = new Configuration();
echo( $c->get( 'imgpath' )."\n" );
?>
This is a completely orthodox PHP V4 class. The member variable holds the list of configuration items, the constructor loads the items, and the access method named get() returns the item's value.
Running the script produces the following output on the command line:
% php access.php4
images
%
Very good! This result means that the code runs normally and that the value of the imgpath configuration item is set and read correctly.
The first step in converting this class to PHP V5 is to rename the constructor. In PHP V5, the method of initializing an object (constructor) is called __construct. This small change is shown below.
Listing 2. access1.php5
<?php
class Configuration
{
  var $_items = array();
  function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  function get( $key ) {
    return $this->_items[ $key ];
  }
}
$c = new Configuration();
echo( $c->get( 'imgpath' )."\n" );
?>
The change this time is small: the class now simply follows the PHP V5 convention. The next step is to add access control to the class so that users of the class cannot directly read or write the $_items member variable. This change is shown below.
Listing 3. access2.php5
<?php
class Configuration
{
  private $_items = array();
  public function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}
$c = new Configuration();
echo( $c->get( 'imgpath' )."\n" );
?>
If a user of this object tries to access the item array directly, access is denied because the array is marked private. Fortunately, the get() method still gives users the read access they need.
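The effect of the private keyword can be observed without triggering a fatal error. The following is a minimal sketch, assuming PHP 5 or later; it reuses the class from Listing 3 and relies only on the built-in get_object_vars() function and the ReflectionProperty class.

```php
<?php
// Same shape as Listing 3: private storage, public accessor.
class Configuration
{
  private $_items = array();
  public function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}

$c = new Configuration();

// Called from outside the class, get_object_vars() lists only the
// properties that are accessible from here, so $_items is missing.
$visible = get_object_vars( $c );
echo( count( $visible )."\n" );

// Reflection confirms the declared visibility.
$prop = new ReflectionProperty( 'Configuration', '_items' );
echo( ( $prop->isPrivate() ? 'private' : 'not private' )."\n" );

// The public accessor still works.
echo( $c->get( 'imgpath' )."\n" );

// Uncommenting the next line stops the script with a
// "Cannot access private property" error.
// echo( $c->_items[ 'imgpath' ] );
```

The commented-out line at the end is the direct access that PHP V4 would have allowed silently.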
To illustrate how to use protected permissions, I need another class, which must inherit from the Configuration class. I called that class DBConfiguration and assumed that the class would read the configuration values from the database. This setup is shown below.
Listing 4. access3.php5
<?php
class Configuration
{
  protected $_items = array();
  public function __construct() {
    $this->load();
  }
  protected function load() { }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}
class DBConfiguration extends Configuration
{
  protected function load() {
    $this->_items[ 'imgpath' ] = 'images';
  }
}
$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>
This listing shows the correct usage of the protected keyword. The base class defines a method named load(). Subclasses override load() to add data to the item array. The load() method is internal to the class and its subclasses, so it is not visible to any external consumer. If load() were declared private, subclasses could not override it.
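The difference between private and protected in this situation can be checked with a small sketch. The class names PrivateBase, PrivateChild, ProtectedBase, and ProtectedChild, and the $loadedBy property, are hypothetical and exist only for this illustration.

```php
<?php
// Base class whose load() is private: calls made from inside
// PrivateBase always resolve to PrivateBase::load().
class PrivateBase
{
  public $loadedBy = 'nobody';
  public function __construct() {
    $this->load();
  }
  private function load() {
    $this->loadedBy = 'PrivateBase';
  }
}
class PrivateChild extends PrivateBase
{
  // This is a separate method, not an override of the private one.
  protected function load() {
    $this->loadedBy = 'PrivateChild';
  }
}

// Base class whose load() is protected: subclasses can override it.
class ProtectedBase
{
  public $loadedBy = 'nobody';
  public function __construct() {
    $this->load();
  }
  protected function load() {
    $this->loadedBy = 'ProtectedBase';
  }
}
class ProtectedChild extends ProtectedBase
{
  protected function load() {
    $this->loadedBy = 'ProtectedChild';
  }
}

$a = new PrivateChild();
$b = new ProtectedChild();
echo( $a->loadedBy."\n" ); // prints "PrivateBase"
echo( $b->loadedBy."\n" ); // prints "ProtectedChild"
```

With the private version, the subclass method is never called by the base-class constructor, which is why Listing 4 declares load() as protected.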
I don't really like the design in Listing 4, but I chose it because I had to give the DBConfiguration class access to the item array. I would prefer to have the item array maintained entirely by the Configuration class, so that as other subclasses are added, those classes won't need to know how the item array is maintained. I made the following changes.
Listing 5. access4.php5
<?php
class Configuration
{
  private $_items = array();
  public function __construct() {
    $this->load();
  }
  protected function load() { }
  protected function add( $key, $value ) {
    $this->_items[ $key ] = $value;
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}
class DBConfiguration extends Configuration
{
  protected function load() {
    $this->add( 'imgpath', 'images' );
  }
}
$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>
The item array can now be private because subclasses use the protected add() method to add configuration items to the list. The Configuration class can change the way it stores and reads configuration items without regard to its subclasses. As long as load() and add() keep behaving the same way, subclasses should have no problems.
For me, the added access control is the main reason to consider moving to PHP V5. Is it just because Grady Booch said that PHP V5 is one of the four major object-oriented languages? No, it is because I once had to maintain 100 KLOC of C++ code in which all methods and members were defined as public. It took me three days to clean up these definitions, and in the process I significantly reduced the number of errors and improved maintainability. Why? Because without access control, it is impossible to know how objects use other objects, and it is impossible to change anything without knowing what might break. With C++, at least I had the compiler to help. PHP does not come with a compiler, so this type of access control becomes even more important.
Contract Programming
The next important feature to take advantage of when migrating from PHP V4 to PHP V5 is support for contract programming through interfaces and abstract classes and methods. Listing 6 shows a version of the Configuration class in which a PHP V4 coder attempts to build a basic interface without using the interface keyword at all.
Listing 6. interface.php4
<?php
class IConfiguration
{
  function get( $key ) { }
}
class Configuration extends IConfiguration
{
  var $_items = array();
  function Configuration() {
    $this->load();
  }
  function load() { }
  function get( $key ) {
    return $this->_items[ $key ];
  }
}
class DBConfiguration extends Configuration
{
  function load() {
    $this->_items[ 'imgpath' ] = 'images';
  }
}
$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>
The listing starts with a small IConfiguration class that defines the interface provided by the Configuration class and its derived classes. This interface defines the contract between the class and all its users. The contract states that all classes that implement IConfiguration must provide a get() method and that all users of IConfiguration must restrict themselves to the get() method.
The code above also runs in PHP V5, but it is better to use the built-in interface system, as shown below.
Listing 7. interface1.php5
<?php
interface IConfiguration
{
  function get( $key );
}
class Configuration implements IConfiguration
{
  ...
}
class DBConfiguration extends Configuration
{
  ...
}
$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>
On the one hand, readers can see more clearly what is going on; on the other hand, a single class can implement multiple interfaces. Listing 8 shows how to extend the Configuration class to implement the Iterator interface, which is built into PHP.
Listing 8. interface2.php5
<?php
interface IConfiguration {
  ...
}
class Configuration implements IConfiguration, Iterator
{
  private $_items = array();
  public function __construct() {
    $this->load();
  }
  protected function load() { }
  protected function add( $key, $value ) {
    $this->_items[ $key ] = $value;
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
  // Iterator methods, delegating to the array's internal pointer
  public function rewind() { reset($this->_items); }
  public function current() { return current($this->_items); }
  public function key() { return key($this->_items); }
  public function next() { next($this->_items); }
  // key() is null past the end, so an item whose value is false
  // does not stop the loop early
  public function valid() { return ( key($this->_items) !== null ); }
}
class DBConfiguration extends Configuration {
  ...
}
$c = new DBConfiguration();
foreach( $c as $k => $v ) { echo( $k." = ".$v."\n" ); }
?>
The Iterator interface lets any class look like an array to its consumers. As you can see at the end of the script, you can use the foreach operator to iterate over all configuration items in the Configuration object. PHP V4 does not have this functionality, and you can put it to use in many ways within your application.
The advantage of the interface mechanism is that contracts can be defined quickly without implementing any methods. Implementation comes last, and at that point you must implement all the specified methods. Another helpful new feature in PHP V5 is abstract classes, which make it easy to implement the core part of an interface in a base class and then use that base class to create concrete classes.
Another use of abstract classes is to create a base class for multiple derived classes in which the base class is never instantiated. For example, where both DBConfiguration and Configuration exist, only DBConfiguration should ever be used. The Configuration class is just a base class - an abstract class. Therefore, you can enforce this behavior using the abstract keyword, as shown below.
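As a rough sketch of that combination, the following code puts the shared part of the IConfiguration contract in an abstract base class and leaves the loading step to a concrete class. The names BaseConfiguration, ArrayConfiguration, and imagePath() are hypothetical and are not part of the listings in this article.

```php
<?php
interface IConfiguration
{
  function get( $key );
}

// Abstract base class: implements the shared part of the contract
// and leaves load() to concrete subclasses.
abstract class BaseConfiguration implements IConfiguration
{
  private $_items = array();
  public function __construct() {
    $this->load();
  }
  abstract protected function load();
  protected function add( $key, $value ) {
    $this->_items[ $key ] = $value;
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}

// Concrete class: only has to say where the items come from.
class ArrayConfiguration extends BaseConfiguration
{
  protected function load() {
    $this->add( 'imgpath', 'images' );
  }
}

// The type hint accepts any object that honors the contract.
function imagePath( IConfiguration $config )
{
  return $config->get( 'imgpath' );
}

echo( imagePath( new ArrayConfiguration() )."\n" );
```

Because imagePath() depends only on the interface, any other implementation of IConfiguration can be passed in without changing the function.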
Listing 9. abstract.php5
<?php
abstract class Configuration
{
  protected $_items = array();
  public function __construct() {
    $this->load();
  }
  abstract protected function load();
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}
class DBConfiguration extends Configuration
{
  protected function load() {
    $this->_items[ 'imgpath' ] = 'images';
  }
}
$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>
Now, any attempt to instantiate an object of type Configuration fails with an error because the system considers the class to be abstract and incomplete.
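One way to confirm this without stopping the script is to ask PHP's reflection API. The sketch below assumes the same two classes as Listing 9 and uses the built-in ReflectionClass methods isAbstract() and isInstantiable().

```php
<?php
abstract class Configuration
{
  protected $_items = array();
  public function __construct() {
    $this->load();
  }
  abstract protected function load();
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}
class DBConfiguration extends Configuration
{
  protected function load() {
    $this->_items[ 'imgpath' ] = 'images';
  }
}

$base = new ReflectionClass( 'Configuration' );
$concrete = new ReflectionClass( 'DBConfiguration' );

// The base class is abstract and therefore cannot be instantiated.
echo( ( $base->isAbstract() ? 'abstract' : 'concrete' )."\n" );
echo( ( $base->isInstantiable() ? 'instantiable' : 'not instantiable' )."\n" );

// The subclass supplies load(), so it is complete.
echo( ( $concrete->isInstantiable() ? 'instantiable' : 'not instantiable' )."\n" );

// Uncommenting the next line stops the script with a
// "Cannot instantiate abstract class Configuration" error.
// $c = new Configuration();
```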
Static Methods and Members
Another important new feature in PHP V5 is support for static members and methods on classes. With this functionality, you can implement the popular singleton pattern. This pattern is ideal for the Configuration class because the application should have only one configuration object.
Listing 10 shows the PHP V5 version of the Configuration class as a singleton.
Listing 10. static.php5
<?php
class Configuration
{
  private $_items = array();
  static private $_instance = null;
  static public function get() {
    if ( self::$_instance == null )
      self::$_instance = new Configuration();
    return self::$_instance;
  }
  private function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  public function __get( $key ) {
    return $this->_items[ $key ];
  }
}
echo( Configuration::get()->{ 'imgpath' }."\n" );
?>
The static keyword has many uses. Consider using this keyword when you need to access some global data for all objects of a single type.
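A simple case of data shared by all objects of one type is an instance counter. The following sketch is only an illustration; the Connection class and its members are hypothetical.

```php
<?php
// Hypothetical class that counts how many instances were created.
class Connection
{
  // One copy of this variable exists for the whole class.
  static private $_count = 0;

  public function __construct() {
    self::$_count++;
  }

  // Static method: callable without an object.
  static public function count() {
    return self::$_count;
  }
}

$first = new Connection();
$second = new Connection();
echo( Connection::count()."\n" ); // prints "2"
```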
Magic Methods
Another big new feature in PHP V5 is support for magic methods, which let an object change its interface on the fly - for example, by exposing a member variable for each configuration item in the Configuration object. There is no need to use the get() method; just look up a particular item as if it were a member variable, as shown below.
Listing 11. magic.php5
<?php
class Configuration
{
  private $_items = array();
  function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  function __get( $key ) {
    return $this->_items[ $key ];
  }
}
$c = new Configuration();
echo( $c->{ 'imgpath' }."\n" );
?>
In this example, I created a new __get() method that is called whenever the user looks for a member variable that is not accessible on the object. The code within the method uses the item array to find the value and returns it as if there were a member variable specifically for that key. At the end of the script, you can see that using the Configuration object is as simple as reading an imgpath member variable.
When migrating from PHP V4 to PHP V5, you must be aware of these language features, which are completely unavailable in PHP V4, and you should re-examine your classes to see where they can be used.
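The same idea extends to writing and testing items. The sketch below is one possible variation, assuming PHP 5.1 or later (where __isset() is available); returning null for an unknown key is a design choice made for this example, not something the original class does.

```php
<?php
class Configuration
{
  private $_items = array();
  public function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  // Called when an inaccessible member variable is read.
  public function __get( $key ) {
    return isset( $this->_items[ $key ] ) ? $this->_items[ $key ] : null;
  }
  // Called when an inaccessible member variable is written.
  public function __set( $key, $value ) {
    $this->_items[ $key ] = $value;
  }
  // Called by isset() on an inaccessible member variable.
  public function __isset( $key ) {
    return isset( $this->_items[ $key ] );
  }
}

$c = new Configuration();
echo( $c->imgpath."\n" );   // plain property syntax also works
$c->csspath = 'styles';     // stored in the private item array
echo( $c->csspath."\n" );
echo( ( isset( $c->missing ) ? 'set' : 'not set' )."\n" );
```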
Exceptions
Finally, I end this article by introducing the new exception mechanism in PHP V5. Exceptions provide a completely new way to think about error handling. All programs inevitably generate errors - file not found, out of memory, and so on. Without exceptions, a function must return an error code. Look at the PHP V4 code below.
Listing 12. file.php4
<?php
function parseLine( $l )
{
  // ...
  return array( 'error' => 0,
    'data' => array() // data here
  );
}
function readConfig( $path )
{
  if ( $path == null ) return -1;
  $fh = fopen( $path, 'r' );
  // fopen() returns false when the file cannot be opened
  if ( $fh === false ) return -2;
  while( !feof( $fh ) ) {
    $l = fgets( $fh );
    $ec = parseLine( $l );
    if ( $ec['error'] != 0 ) {
      fclose( $fh ); // close the file before reporting the error
      return $ec['error'];
    }
  }
  fclose( $fh );
  return 0;
}
$e = readConfig( 'myconfig.txt' );
if ( $e != 0 )
  echo( "There was an error (".$e.")\n" );
?>
This standard file I/O code reads a file, retrieves some data, and returns an error code if any errors are encountered. I have two problems with this script. The first is the error codes. What do they mean? To find out, you must create another system that maps these error codes to meaningful strings. The second problem is that the return value of parseLine is very complicated. I just need it to return data, but it actually has to return both an error code and data. Most engineers (myself included) often get lazy and just return data and ignore errors because errors are difficult to manage.
Listing 13 shows how much clearer the code is when using exceptions.
Listing 13. file.php5
<?php
function parseLine( $l )
{
  // Parses and throws an exception when invalid
  return array(); // data
}
function readConfig( $path )
{
  if ( $path == null )
    throw new Exception( 'bad argument' );
  $fh = fopen( $path, 'r' );
  // fopen() returns false when the file cannot be opened
  if ( $fh === false )
    throw new Exception( 'could not open file' );
  while( !feof( $fh ) ) {
    $l = fgets( $fh );
    $ec = parseLine( $l );
  }
  fclose( $fh );
}
try {
  readConfig( 'myconfig.txt' );
} catch(Exception $e) {
  echo( $e );
}
?>
I don't need to worry about error codes because the exception contains descriptive text for the error. I also don't have to think about how to track down the error code returned from parseLine because the function simply throws an exception if an error occurs. The stack unwinds to the nearest try/catch block, which is at the bottom of the script.
Exceptions will revolutionize the way you write code. Instead of managing the headache of error codes and mappings, you can focus on the errors you want to handle. Such code is easier to read and maintain, and I would say it even encourages you to add error handling, which usually pays dividends.
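Exception subclasses let the caller decide which errors to handle and how. The following sketch assumes PHP 5.1 or later for the built-in InvalidArgumentException; the ConfigException class, the readConfigLines() and describeFailure() functions, and the file path are hypothetical names used only for illustration.

```php
<?php
// Hypothetical exception type for configuration problems.
class ConfigException extends Exception { }

// Returns the lines of the file, or throws on failure.
function readConfigLines( $path )
{
  if ( $path == null )
    throw new InvalidArgumentException( 'bad argument' );
  if ( !is_readable( $path ) )
    throw new ConfigException( 'could not open file: '.$path );
  return file( $path );
}

// Catches the most specific exception type first.
function describeFailure( $path )
{
  try {
    readConfigLines( $path );
    return 'ok';
  } catch ( ConfigException $e ) {
    return 'config error: '.$e->getMessage();
  } catch ( Exception $e ) {
    return 'other error: '.$e->getMessage();
  }
}

echo( describeFailure( null )."\n" );
echo( describeFailure( '/no/such/file.txt' )."\n" );
```

The first call falls through to the generic handler, and the second is caught by the ConfigException handler, assuming that path does not exist on the machine running the script.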
Conclusion
The new object-oriented features and the addition of exception handling provide strong reasons for migrating code from PHP V4 to PHP V5. As you can see, the upgrade process is not difficult. The syntax added in PHP V5 still feels just like PHP. Yes, some of it comes from languages like Ruby, but I think the pieces work very well together. And these features expand the scope of PHP from a scripting language for small sites to a language that can be used to build enterprise-level applications.