Archive for the ‘PHP Programming’ Category

Use array_diff() with caution

Wednesday, April 4th, 2007

I was extending the Openads database abstraction layer last night, to ensure that creation of databases works correctly. I’m no PostgreSQL expert, but it seems that it won’t let you issue DDL/DML commands unless you have connected to a database first – so if you want to create a database, you have to connect to a different, already existing database first.

Luckily, the PEAR::MDB2 library will connect you to the “default” database for a database server if you specify no database name in the DSN. (For PostgreSQL, this is the “template1″ database.)

This was all well and good, until I tried to then connect to my newly created database, and all I’d get back was the previous connection to “template1″. It turns out that the MDB2::singleton() method compared the DSN array of the database you want to connect to with any DSN arrays of previous connections it has made by using the array_diff() function.

Alas, when creating a connection with no database name results in the DSN having the boolean false for the database name field – and because other items in the new DSN array were also false in my case (for example, the “mode” of the connection” was false), the array_diff() function returned no differences. Oops!

Changing to the array_diff_assoc() function solved the problem.

Custom Types in PEAR::MDB2

Tuesday, April 3rd, 2007

It’s been all systems go at Openads central this year. One of the big jobs we’ve been taking on is dealing with our database schema. Managing database schemata is actually a tough job, if you want to do it in a agile way. (No one actually designs their database schema up front any more, right?) There are several great books on the subject, though, and based on those ideas, we’ve produced a white paper outlining a system we’re writing for managing database schemata for PHP, in the same vein as Rail’s Migrations.

As part of this project, we’ve decided to use PEAR::MDB2 and PEAR::MDB2_Schema to manage the jobs of reading in our existing database schemata from existing databases, to then be stored in XML files, The same two PEAR packages will then be used to prepare the SQL DDL/DML commands required to create new databases when installing Openads, or to modify an existing Openads database when upgrading. Of course, the DDL/DML created will all be based on XML schemata files.

If you’ve ever used PEAR::MDB2, you may be aware that it works by storing database schemata information in an “internal” MDB2 datatype. This datatype needs to be converted into database-specific commands when talking to a database, obviously. But if you’re reverse engineering a schema from an existing database, you also need to be able to convert an existing database “nativetype” into the appropriate MDB2 datatype.

However, this is where we’ve run into a few issues.

The reason we’re writing a tool to manage database migrations is because we want to make it easy to refactor our database schemata. This means it needs to be easy to create PHP classes which will allow us (and our users) to migrate between different database versions, ensuring that the data in the database is migrated along with any schema changes. Of course, a central tenet of refactoring is that you improve the structure without changing functionality. (Normally, this refers to changing the structure of the code, but in this case, we’re talking about improving the structure of the database schema without changing the functionality of the database.)

This means that our tool, from the very outset, must be able to reverse engineer our database schema from an existing installed database into an XML file, and then be able to produce exactly the same database from the XML.

Alas, we found that PEAR::MDB2 couldn’t do that.

To understand why, take a look at the MDB2_Driver_Datatype_mysql class (for example). Specifically, look at the mapNativeDatatype() method. This is where PEAR::MDB2 converts a MySQL nativetype into an internal MDB2 datatype. Take a look at that case statement, and note the value(s) inserted into the $type array for the MySQL nativetypes "varchar", or "enum". PEAR::MDB2 only ever uses the first of any possible nativetype to datatype conversions, and both of these MySQL nativetypes have the first datatype option in PEAR::MDB2 as the datatype "text". As you can guess, this means if you convert your existing database schema into PEAR::MDB2 datatypes, you’re not going to get the same schema back again when you create the database from XML.

Of course, this isn’t to say that PEAR::MDB2 is rubbish. On the contrary, one might question why your database schema has an "enum" type in the first place – after all, it’s not the most portable native data type. So, converting these two MySQL nativetypes to the PEAR::MDB2 datatype "text" actually makes a lot of sense, under "normal" situations.

But it’s not what we need – we need to be able to get out exactly the same database as we put in.

So, what to do? Well, we’ve extended PEAR::MDB2, and a big thank you to both Lorenzo and Lukas for their help in getting this patch into the package.

As you may know, you can already define custom PEAR::MDB2 datatypes, and include in the datatype definition how it should be mapped back to a database nativetype. See Lukas’ blog post about this for more details. Our patch extends this feature to fix a couple of places where the datatype to nativetype was a little lacking, and you can now also define a custom mapping for a database nativetype back to your custom datatype (or any other existing PEAR::MDB2 datatype, if you want).

Here’s how it all fits together. Lets say that you want to actually support MySQL’s "varchar" nativetype. The first thing you need to do is to create a new custom PEAR::MDB2 datatype for "varchar" columns. To do that, follow Lukas’ guide, and create a callback function to deal with the conversion of the new PEAR::MDB2 datatype (let’s call it "openads_varchar") back to the MySQL nativetype:

/**
 * A callback function to map the MDB2 datatype "openads_varchar" into
 * the MySQL nativetype "VARCHAR".
 *
 * @param MDB2   $db         The MDB2 database reource object.
 * @param string $method     The name of the MDB2_Driver_Datatype_Common method
 *                           the callback function was called from. One of
 *                           "getValidTypes", "convertResult", "getDeclaration",
 *                           "compareDefinition", "quote" and "mapPrepareDatatype".
 *                           See {@link MDB2_Driver_Datatype_Common} for the
 *                           details of what each method does.
 * @param array $aParameters An array of parameters, being the parameters that
 *                           were passed to the method calling the callback
 *                           function.
 * @return mixed Returns the appropriate value depending on the method that
 *               called the function. See {@link MDB2_Driver_Datatype_Common}
 *               for details of the expected return values of the five possible
 *               calling methods.
 */
function datatype_openads_varchar_callback(&$db, $method, $aParameters)
{
    // Lowercase method names for PHP4/PHP5 compatibility
    $method = strtolower($method);
    switch($method) {
        case 'getvalidtypes':
            // Return the default value for this custom datatype
            return '';
        case 'convertresult':
            // Convert the nativetype value to a datatype value using the
            // built in "text" datatype
            return $db->datatype->convertResult($aParameters['value'], 'text', $aParameters['rtrim']);
        case 'getdeclaration':
            // Prepare and return the MySQL specific code needed to declare
            // a column of this custom datatype
            $name = $db->quoteIdentifier($aParameters['name'], true);
            $datatype = $db->datatype->mapPrepareDatatype($aParameters['type']);
            $declaration_options = $db->datatype->_getDeclarationOptions($aParameters['field']);
            $value = $name . ' ' . $datatype;
            if (isset($aParameters['field']['length']) && is_numeric($aParameters['field']['length'])) {
                $value .= '(' . $aParameters['field']['length'] . ')';
            }
            $value .= $declaration_options;
            return $value;
        case 'comparedefinition':
            // Return the same array of changes that would be used for
            // the built in "text" datatype
            return $db->datatype->_compareTextDefinition($aParameters['current'], $aParameters['previous']);
        case 'quote':
            // Convert the datatype value into a quoted nativetype value
            // suitable for inserting into MySQL using the built in
            // "text" datatype
            return $db->datatype->quote($aParameters['value'], 'text');
        case 'mappreparedatatype':
            // Return the MySQL nativetype declaration for this custom datatype
            return 'VARCHAR';
    }
}

Once that’s done, you need to create another callback function to ensure that when you dump out an existing MySQL database, the "varchar" nativetype is correctly converted into the newly created PEAR::MDB2 datatype "openads_varchar":

/**
 * A callback function to map the MySQL nativetype "VARCHAR" into
 * the extended MDB2 datatype "openads_varchar".
 *
 * @param MDB2 $db       The MDB2 database reource object.
 * @param array $aFields The standard array of fields produced from the
 *                       MySQL command "SHOW COLUMNS". See
 *                       {@link http://dev.mysql.com/doc/refman/5.0/en/describe.html}
 *                       for more details on the format of the fields.
 *                          "type"      The nativetype column type
 *                          "null"      "YES" or "NO"
 *                          "key"       "PRI", "UNI", "MUL", or null
 *                          "default"   The default value of the column
 *                          "extra"     "auto_increment", or null
 * @return array Returns an array of the following items:
 *                  0 => An array of possible MDB2 datatypes. As this is
 *                       a custom type, always has one entry, "openads_varchar".
 *                  1 => The length of the type, if defined by the nativetype,
 *                       otherwise null.
 *                  2 => A boolean value indicating the "unsigned" nature of numeric
 *                       fields. Always null in this case, as the type is not numeric.
 *                  3 => A boolean value indicating the "fixed" nature of text
 *                       fields. Always false in this case, as varchar is not
 *                       of fixed length.
 */
function nativetype_varchar_callback(&$db, $aFields)
{
    // Prepare the type array
    $aType = array();
    $aType[] = 'openads_varchar';
    // Can the length of the VARCHAR field be found?
    $length = null;
    $start = strpos($aFields['type'], '(');
    $end = strpos($aFields['type'], ')');
    if ($start && $end) {
        $start++;
        $chars = $end - $start;
        $length = substr($aFields['type'], $start, $chars);
    }
    // No unsigned value needed
    $unsigned = null;
    // Set fixed to false
    $fixed = false;
    return array($aType, $length, $unsigned, $fixed);
}

Obviously, both of these callback methods need to be registered with your PEAR::MDB2 database connection, to ensure that they will be used:

$aOptions['datatype_map'] = array('openads_varchar' => 'openads_varchar');
$aOptions['datatype_map_callback'] = array('openads_varchar' => 'datatype_openads_varchar_callback');
$aOptions['nativetype_map_callback'] = array('varchar' => 'nativetype_varchar_callback');
$oDbh = &MDB2::singleton($dsn, $aOptions);

That’s it! With PEAR::MDB2, you can now now only create your own custom datatypes to control how your XML-based schema is converted into SQL, you can now also create custom mappings to ensure that your existing database is converted back to an XML schema the way you want it.

Update: Demian has pointed out that he doesn’t think that the way the array is returned from the nativetype_varchar_callback() function is particularly great. I agree – but that’s the way PEAR::MDB2 expects it. Maybe we’ll get around to patching that too :)

PEAR::MDB2 Updated

Wednesday, March 14th, 2007

Oh yeah. A new version of PEAR::MDB2 is out.

I’m planning on updating my blog software “soon”, and I’ll put up a detailed run down on some of the new features in MDB2 as soon as I can get code syntax highlighting working…

Zend Studio and SVN

Tuesday, October 31st, 2006

One of the Max Media Manager users reported that s/he could not use Zend Studio’s built in SVN to checkout the trunk of our repository.

I’ve got a bee in my bonnet at the moment about Zend, and so I’m on their case these days about making sure Zend does what it says on the tin, so, I reported it as a bug.

I’ve posted a modified version of their workaround back on the Max forum.

(Yes, I’m back from the dead! Huzzah!)

Zend Problem Solved

Wednesday, August 16th, 2006

Rob points out, with regards to the problem with Zend Studio apparently ignoring your php.ini file when using the "internal" debugger, that Zend Studio comes with it’s own pre-compiled versions of PHP, which it uses when debugging.

So, if your php.ini file isn’t in /usr/local/lib (and mine isn’t, it’s in /usr/local/php4/lib, as a result of a dual PHP4/PHP5 setup), then the internal debugger won’t see your php.ini file at all; and that’s why it seems to ignore your settings.

Fun and Games with Zend

Thursday, July 20th, 2006

Did you know that Zend Studio, when you use "Internal" debugging (as opposed to using the Zend Platform), appears to completely ignore your php.ini file?

I can’t even begin to comprehend how it manages to do this, given that the Zend site indicates that internal debugging works by sitting on top of your local PHP install.

All I know is that this makes it incredibly hard to debug those big PHP scripts that use more than 8Mb of RAM, because you can’t adjust the memory_limit value in your php.ini file. (Well, you can, but it doesn’t seem to have any effect inside of Zend.)

You have to resort to configuring Apache and installing the Zend Platform to make it all work — which is a pain when you’re in a rush.

Planet PHP

Monday, June 19th, 2006

As a PHP developer, the Planet PHP blog is, to be fair, pretty useless. Many of the posts aren’t related to PHP, some of them are so basic they are not worth reading, and almost all of the rest are simply opinion pieces, or ill-conceived thoughts about the future of PHP.

However, the occasional interesting gem makes it worthwhile.

Patch!

Wednesday, June 14th, 2006

It’s almost two years since I raised this bug in PEAR::Date, described in more detail here – and at last, there’s been an official fix!

Zend & Command Line Debugging

Friday, February 17th, 2006

I was just asked how to debug PHP, using Zend Studio, but for a CLI PHP script, as opposed to a normal web-based PHP script. (Web based debugging is easy – fire up Zend Studio, load up Firefox with the Zend Toolbar installed, and hit "Debug".)

To be honest, debugging your CLI PHP scripts is almost as easy! All you need to do is launch your PHP script from a simple bash script, that does a little bit of simple magic for you:

#!/bin/sh
export QUERY_STRING="start_debug=1&debug_port=10000&debug_host=IP&debug_stop=1"
/path/to/php -q /path/to/script.php

Just replace IP with your IP address (be it 127.0.0.1, or whatever), launch Zend, and start debugging. Easy!

UPDATE: Zend have similar info on their site these days…

Timezones & PHP

Friday, September 16th, 2005

My day job involves working on an OSS project called Max Media Manager, written in PHP.

The project involves the collection and presentation of statistics, and thus, dates and times are rather important. Ideally, we want to record all dates and times in the database in UTC, and then simply adjust them according to user preference (ie. show them in UTC if they want, or change the times to be in the user’s time zone, so the statistics make sense to the user, wherever they are in the world). However, the project is a fork of a previous OSS project, and as a result, the UI doesn’t have separate data access and presentation layers yet, so this goal is a little way off.

In the mean time, then, there are some problems I have been having with dates and times. These basically come about because of a bug in PEAR::Date, which I reported nearly a year ago. The problem is that when PEAR::Date compares two dates, to find out which one is earlier and which one is later (or if they are equal), it needs to deal with the fact that the two dates might be in different timezones. So, it first converts both dates into UTC, so they can be directly compared. (If you are using PHP5, you might want to know about this patch that I submitted for PEAR::Date.)

However, in order to convert dates into UTC, PEAR::Date needs to know if the date it is converting is in a Daylight Savings Time (DST) zone, or not. As part of the method that does this (Date_TimeZone::inDaylightTime()), the environment variable TZ is set in PHP to alter the timezone that PHP is in, so that the date to be converted can be inspected with PHP’s localtime() function, and the "tm_isdst" value inspected.

Of course, this means that the TZ environment variable needs to be restored before the Date_TimeZone::inDaylightTime() method returns, otherwise future dates will be in the wrong timezone. Herein lies the problem I’m having.

On most Linux systems that I have come across, the TZ environment variable is not set. This means that a call to getenv("TZ") returns false. Thus, in the Date_TimeZone::inDaylightTime() method, it first tries to get the TZ data, which is "false". It then sets the TZ environment variable to the timezone the date it is testing is in, finds out if the date is in a DST timezone, and then tries to restore the old TZ value. However, because the previous value was found to be false, when a call of putenv("TZ", $env_tz) is made (where $env_tz = false), the end result is that the TZ environment variable is, in fact, set to an empty string (and not NULL, as it was before).

From here on, all new PEAR::Date objects are created in UTC. This is fine if you were in UTC to start with, but for me, when I’m in British Summer Time, it means that after the first date comparison, all new dates after that are 1 hour behind where they should be. I’m sure it’s even worse for people not in this timezone, as the difference between what they should be in, and UTC, is even greater. At least I have the timezone right for 6 months of the year (when GMT is the same as UTC).

As PHP lacks the ability to unset environment variables (at least in PHP4, anyway – maybe PHP5 can do this?), about the only solution I can see is to ensure that the TZ environment variable is set at the start of all scripts in the project. Fortunately, we have a nice initialisation system, shamelessly borrowed in concept from Seagull, so doing this is easy.

A couple of smart guys (thanks Matteo and Mark!) on the Max IRC channel suggested the following:

if (getenv('TZ') === false) {
    $diff = date('O') / 100;
    putenv('TZ=GMT'.($diff > 0 ? '-' : '+').abs($diff));
}

This way, the TZ environment variable is set to a timezone that indicates the offset from GMT, based on the user’s current timezone, so that we don’t have to add a configuration item to the project, specifying which timezone the user is in. Nifty! This has solved the problem, at least under PHP4. I’ve tested the code in BST, as well as a couple of other timezones, and it seems to work fine. However, there are still, apparently, issues under PHP5, which I want to tackle today.

If anyone has ideas about better ways to work around this issue in PEAR::Date (excluding the "right way" of doing everything in UTC and converting in the presentation layer, of course), I’m all ears…

Update: It turns out the PEAR::Date has a bug similar to the one I mentioned before, but in this case, in the Date_Span::setFromDateDiff() method. A similar patch would work here, or you can ensure that you clone your dates with Date::copy() before using them. As a result, the above solution does indeed work in PHP5, but, as one of the maintainers of PEAR::Date has mentioned, it isn’t a thread-safe solution – but who cares? The PEAR::Date package isn’t thread safe to begin with. At least this is backwards compatible with the current API.