It's been all systems go at Openads central this year. One of the big jobs we've been taking on is dealing with our database schema. Managing database schemata is actually a tough job, if you want to do it in a agile way. (No one actaully designs their database schema up front any more, right?) There are several great books on the subject, though, and based on those ideas, we've produced a white paper outlining a system we're writing for managing database schemata for PHP, in the same vein as Rail's Migrations.
As part of this project, we've decided to use PEAR::MDB2 and PEAR::MDB2_Schema to manage the jobs of reading in our existing database schemata from existing databases, to then be stored in XML files, The same two PEAR packages will then be used to prepare the SQL DDL/DML commands required to create new databases when installing Openads, or to modify an existing Openads database when upgrading. Of course, the DDL/DML created will all be based on XML schemata files.
If you've ever used PEAR::MDB2, you may be aware that it works by storing database schemata information in an "internal" MDB2 datatype. This datatype needs to be converted into database-specific commands when talking to a database, obviously. But if you're reverse engineering a schema from an existing database, you also need to be able to convert an existing database "nativetype" into the appropriate MDB2 datatype.
However, this is where we've run into a few issues.
The reason we're writing a tool to manage database migrations is because we want to make it easy to refactor our database schemata. This means it needs to be easy to create PHP classes which will allow us (and our users) to migrate between different database versions, ensuring that the data in the database is migrated along with any schema changes. Of course, a central tenet of refactoring is that you improve the structure without changing functionality. (Normally, this refers to changing the structure of the code, but in this case, we're talking about improving the structure of the database schema without changing the functionality of the database.)
This means that our tool, from the very outset, must be able to reverse engineer our database schema from an existing installed database into an XML file, and then be able to produce exactly the same database from the XML.
Alas, we found that PEAR::MDB2 couldn't do that.
To understand why, take a look at the MDB2_Driver_Datatype_mysql class (for example). Specifically, look at the mapNativeDatatype() method. This is where PEAR::MDB2 converts a MySQL nativetype into an internal MDB2 datatype. Take a look at that case statement, and note the value(s) inserted into the $type array for the MySQL nativetypes "varchar", or "enum". PEAR::MDB2 only ever uses the first of any possible nativetype to datatype conversions, and both of these MySQL nativetypes have the first datatype option in PEAR::MDB2 as the datatype "text". As you can guess, this means if you convert your existing database schema into PEAR::MDB2 datatypes, you're not going to get the same schema back again when you create the database from XML.
Of course, this isn't to say that PEAR::MDB2 is rubbish. On the contrary, one might question why your database schema has an "enum" type in the first place - after all, it's not the most portable native data type. So, converting these two MySQL nativetypes to the PEAR::MDB2 datatype "text" actually makes a lot of sense, under "normal" situations.
But it's not what we need - we need to be able to get out exactly the same database as we put in.
So, what to do? Well, we've extended PEAR::MDB2, and a big thank you to both Lorenzo and Lukas for their help in getting this patch into the package.
As you may know, you can already define custom PEAR::MDB2 datatypes, and include in the datatype definition how it should be mapped back to a database nativetype. See Lukas' blog post about this for more details. Our patch extends this feature to fix a couple of places where the datatype to nativetype was a little lacking, and you can now also define a custom mapping for a database nativetype back to your custom datatype (or any other existing PEAR::MDB2 datatype, if you want).
Here's how it all fits together. Lets say that you want to actually support MySQL's "varchar" nativetype. The first thing you need to do is to create a new custom PEAR::MDB2 datatype for "varchar" columns. To do that, follow Lukas' guide, and create a callback function to deal with the conversion of the new PEAR::MDB2 datatype (let's call it "openads_varchar") back to the MySQL nativetype:
/**
* A callback function to map the MDB2 datatype "openads_varchar" into
* the MySQL nativetype "VARCHAR".
*
* @param MDB2 $db The MDB2 database reource object.
* @param string $method The name of the MDB2_Driver_Datatype_Common method
* the callback function was called from. One of
* "getValidTypes", "convertResult", "getDeclaration",
* "compareDefinition", "quote" and "mapPrepareDatatype".
* See {@link MDB2_Driver_Datatype_Common} for the
* details of what each method does.
* @param array $aParameters An array of parameters, being the parameters that
* were passed to the method calling the callback
* function.
* @return mixed Returns the appropriate value depending on the method that
* called the function. See {@link MDB2_Driver_Datatype_Common}
* for details of the expected return values of the five possible
* calling methods.
*/function datatype_openads_varchar_callback
(&
$db,
$method,
$aParameters){ // Lowercase method names for PHP4/PHP5 compatibility $method =
strtolower($method);
switch($method) { case 'getvalidtypes':
// Return the default value for this custom datatype return '';
case 'convertresult':
// Convert the nativetype value to a datatype value using the // built in "text" datatype return $db->
datatype->
convertResult($aParameters['value'],
'text',
$aParameters['rtrim']);
case 'getdeclaration':
// Prepare and return the MySQL specific code needed to declare // a column of this custom datatype $name =
$db->
quoteIdentifier($aParameters['name'],
true);
$datatype =
$db->
datatype->
mapPrepareDatatype($aParameters['type']);
$declaration_options =
$db->
datatype->_getDeclarationOptions
($aParameters['field']);
$value =
$name .
' ' .
$datatype;
if (isset($aParameters['field']['length']) &&
is_numeric($aParameters['field']['length'])) { $value .=
'(' .
$aParameters['field']['length'] .
')';
} $value .=
$declaration_options;
return $value;
case 'comparedefinition':
// Return the same array of changes that would be used for // the built in "text" datatype return $db->
datatype->_compareTextDefinition
($aParameters['current'],
$aParameters['previous']);
case 'quote':
// Convert the datatype value into a quoted nativetype value // suitable for inserting into MySQL using the built in // "text" datatype return $db->
datatype->
quote($aParameters['value'],
'text');
case 'mappreparedatatype':
// Return the MySQL nativetype declaration for this custom datatype return 'VARCHAR';
}}
Once that's done, you need to create another callback function to ensure that when you dump out an existing MySQL database, the "varchar" nativetype is correctly converted into the newly created PEAR::MDB2 datatype "openads_varchar":
/**
* A callback function to map the MySQL nativetype "VARCHAR" into
* the extended MDB2 datatype "openads_varchar".
*
* @param MDB2 $db The MDB2 database reource object.
* @param array $aFields The standard array of fields produced from the
* MySQL command "SHOW COLUMNS". See
* {@link http://dev.mysql.com/doc/refman/5.0/en/describe.html}
* for more details on the format of the fields.
* "type" The nativetype column type
* "null" "YES" or "NO"
* "key" "PRI", "UNI", "MUL", or null
* "default" The default value of the column
* "extra" "auto_increment", or null
* @return array Returns an array of the following items:
* 0 => An array of possible MDB2 datatypes. As this is
* a custom type, always has one entry, "openads_varchar".
* 1 => The length of the type, if defined by the nativetype,
* otherwise null.
* 2 => A boolean value indicating the "unsigned" nature of numeric
* fields. Always null in this case, as the type is not numeric.
* 3 => A boolean value indicating the "fixed" nature of text
* fields. Always false in this case, as varchar is not
* of fixed length.
*/function nativetype_varchar_callback
(&
$db,
$aFields){ // Prepare the type array $aType =
array();
$aType[] =
'openads_varchar';
// Can the length of the VARCHAR field be found? $length =
null;
$start =
strpos($aFields['type'],
'(');
$end =
strpos($aFields['type'],
')');
if ($start &&
$end) { $start++;
$chars =
$end -
$start;
$length =
substr($aFields['type'],
$start,
$chars);
} // No unsigned value needed $unsigned =
null;
// Set fixed to false $fixed =
false;
return array($aType,
$length,
$unsigned,
$fixed);
}
Obviously, both of these callback methods need to be registered with your PEAR::MDB2 database connection, to ensure that they will be used:
$aOptions['datatype_map'] =
array('openads_varchar' =>
'openads_varchar');
$aOptions['datatype_map_callback'] =
array('openads_varchar' =>
'datatype_openads_varchar_callback');
$aOptions['nativetype_map_callback'] =
array('varchar' =>
'nativetype_varchar_callback');
$oDbh = &MDB2::
singleton($dsn,
$aOptions);
That's it! With PEAR::MDB2, you can now now only create your own custom datatypes to control how your XML-based schema is converted into SQL, you can now also create custom mappings to ensure that your existing database is converted back to an XML schema the way you want it.
Update: Demian has pointed out that he doesn't think that the way the array is returned from the nativetype_varchar_callback() function is particularly great. I agree - but that's the way PEAR::MDB2 expects it. Maybe we'll get around to patching that too 