Doc Blocks Rock!

April 14th, 2007

I can’t sleep, so I figured I’d wade into the inline documentation debate.

Travis states that programming languages are declarative, so code, along with its associated unit tests, is enough to define what the code does, as well as to provide examples of how to use it.

Well, I agree with that.

However, that doesn’t mean that inline documentation is therefore a “bad thing”. Tobias has pointed out a couple of reasons why that’s the case. I can think of a couple more, too.

Firstly, bugs.

Just because you’ve written your code, and you’ve written unit tests, doesn’t mean the code is bug free. It should be, if your tests are good, but how much do you want to bet on that? Sometimes the unit tests you’ve written (so far) haven’t made the code bug free. It does happen. So, you extend your unit tests to cover that bug, and then you fix the code. No big deal.

But have you ever tried doing that with someone else’s code? I know that on many occasions, I’ve been using an external library, and thought I’d found a bug. So, I’ve sent off a bug report and a patch, only to be told, "No, that’s how it’s supposed to work." How was I meant to know?

If you define "what the code does" as the same thing as "what the code should do", then there’s no such thing as a bug. Of course, that’s not really the case, is it? In fact, "what the code does" is only the definition of "what the code does", not "what the code should do".

That’s what inline documentation is for. A brief description of what the code should do, so that it’s possible to take a stab at working out if what the code does is the same as what it should be doing.
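As a rough illustration, here’s the kind of doc block I mean. The function is made up for this post (it isn’t from any real library), and the "at least 4" limit is just an assumption to keep the sketch short:

```php
<?php
/**
 * Shortens a label so that it fits in a fixed-width column.
 *
 * The returned string is never longer than $maxLength characters, and
 * that limit includes the three dots appended when the label is cut.
 * Labels that already fit are returned unchanged.
 *
 * @param string  $label     The label to shorten.
 * @param integer $maxLength The maximum length of the result. Assumed
 *                           to be at least 4 (hypothetical constraint).
 * @return string The original label, or the shortened label.
 */
function shorten_label($label, $maxLength)
{
    // Labels that already fit are returned untouched
    if (strlen($label) <= $maxLength) {
        return $label;
    }
    // Leave room for the three dots inside the limit
    return substr($label, 0, $maxLength - 3) . '...';
}
```

If someone later reports that the dots "should" be added on top of the limit, the doc block settles it: the code is doing what it was meant to do. Without those few sentences, the code alone can’t tell you whether that’s a bug or a decision.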

Secondly, understanding.

I’m not talking about helping other developers understand your code. As has been mentioned in the two articles above, tutorials, unit tests, decent method naming conventions, code layout, etc., as well as reading the code itself, are probably more useful in helping developers who are new to your code base come to terms with it. The inline documentation will only help those developers once they’ve come to understand the code a little, and they want a quick reference later on.

No, I’m talking about your understanding of the code you’re writing.

When I was at Uni, I found that one of my most valuable exam techniques when I was faced with a tough question to answer in a short amount of time was to ask myself, "How would I explain this to my Mum?" By boiling the problem down into the simplest bits I could think of, and by approaching the problem in a way that was different to how I would explain it to an expert in the field, I’d often find that I’d come to a realisation of the really important part of the problem much faster than by trying to deal with the complex whole.

I find this works really well with programming, too.

Sure, I can bash out a method that does what it needs to do pretty easily. But when I write doc blocks, I find that the process of having to come up with one or two sentences that explain what the code is supposed to do really helps me focus my attention on the point of the code, and ensure that the code is doing what it needs to. I can’t possibly remember how many times I’ve written a doc block, and suddenly gone, "Oh, wait, what about…?".

So, as Travis says, if your code is so complex that you have to document it in English, then yes, it’s probably time to refactor. But that doesn’t mean inline documentation is bad – there are some real benefits to documenting your code.

Brugge

April 14th, 2007

Cathy and I went to Brugge over the Easter long weekend. No complaints ;-)

Photos are up in the gallery.

teh funny

April 13th, 2007

Good Service

April 9th, 2007

Speaking of customer service, I returned some faulty headphones at Micro Anvika last week.

Normally, I’d do everything I could to not shop there – I can’t stand the stores, due to their appalling sales service. No one in the store is interested in helping you, and they certainly don’t know their products. (They are, however, better than Currys.)

However, I got some vouchers as a gift almost a year ago, and so had used them to buy some headphones. The right headphone wasn’t working, however, so I took them back.

I have to say, I was really impressed. Sure, it took 20 minutes for them to locate a new set, but they swapped them over without a word of complaint (or the usual expected accusations that I wouldn’t know if the thing was broken or not), and they issued me with a receipt for another 12 month warranty.

Now, that’s how it should be.

Temper, temper…

April 7th, 2007

I said recently that we were never rude to our customers back when I was working at Adelaide Uni.

However, I must confess that I did once lose my temper, and while I didn’t exactly shout, I did raise my voice and speak in a somewhat unkind way.

I think, however, it was justified. The person in question came in, and asked to use a computer. About five minutes later, they asked for some help, because their files had “disappeared” from their disk.

So, I went over, and took a look. The first thing I did was to check in the file manager that the disk really was empty – I saw plenty of cases where an “empty” disk was simply a symptom of the customer not knowing how to display the contents of the disk. That’s fair enough – not everyone knew how to do that on a computer that wasn’t set up the same way as the one they would normally use.

However, when I did this, the floppy disk drive made an odd noise. Not the kind of noise you expect when reading a disk, and not that kind of “chunking” noise it makes when a disk’s FAT is corrupted. It was a different noise. Kind of, well, slow sounding. I ejected the disk. It was sticky.

It turns out that they’d had their disk in their bag along with an icecream, which had melted. I suggested that perhaps they should have given more thought to how appropriate it might be to put a wet, sticky, icecream covered disk into an expensive computer.

Only, perhaps not as kindly as that.

Nostalgia

April 5th, 2007

I spent something like seven years at university, and for most of them, I worked for the Adelaide University Union’s Computer Resource Centre.

At the time, the University of Adelaide didn’t provide any kind of centralised, standardised computing facilities – it was up to each faculty to provide computers (as, and if, required) for their students. As a result, the Resource Centre was the one place where students who struggled with technology could turn to for help, and for specialist services like document binding, cheap photocopying onto overhead transparencies, and helping PhD students recover their one and only copy of their dissertation — due next week — that they have saved on a $1 floppy disk that’s now gone bad.

Apart from the satisfaction of helping other students, and the fact the job paid my rent, the best thing about working there was the camaraderie with the other staff. We had these little rituals that we’d all follow – like using certain phrases to relieve the stress of dealing with some of our more challenging clients (without ever being rude, of course!). One of these rituals was to always correct people who came with a stack of printed sheets, and would ask "Ummm, can I get this binded?" We would always reply "No, but you can get it bound."

As a result, seeing this in the Metro recently made me a little nostalgic…

Use array_diff() with caution

April 4th, 2007

I was extending the Openads database abstraction layer last night, to ensure that creation of databases works correctly. I’m no PostgreSQL expert, but it seems that it won’t let you issue DDL/DML commands unless you have connected to a database first – so if you want to create a database, you have to connect to a different, already existing database first.

Luckily, the PEAR::MDB2 library will connect you to the “default” database for a database server if you specify no database name in the DSN. (For PostgreSQL, this is the “template1” database.)

This was all well and good, until I tried to then connect to my newly created database, and all I’d get back was the previous connection to “template1”. It turns out that the MDB2::singleton() method compared the DSN array of the database you want to connect to with any DSN arrays of previous connections it has made by using the array_diff() function.

Alas, creating a connection with no database name results in the DSN having the boolean false for the database name field – and because other items in the new DSN array were also false in my case (for example, the “mode” of the connection was false), the array_diff() function returned no differences. Oops!

Changing to the array_diff_assoc() function solved the problem.
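To see the difference in isolation, here’s a minimal sketch. The two DSN arrays are cut-down, made-up stand-ins (real MDB2 DSN arrays have more fields, and the user and database names are invented), but the comparison should behave the same way:

```php
<?php
// Hypothetical, simplified DSN of the existing "default database" connection.
// No database name was given, so the field holds boolean false.
$existingDsn = array(
    'phptype'  => 'pgsql',
    'username' => 'openads',
    'database' => false,
    'mode'     => false,
);

// Hypothetical DSN for the newly created database
$requestedDsn = array(
    'phptype'  => 'pgsql',
    'username' => 'openads',
    'database' => 'openads_test',
    'mode'     => false,
);

// Compares values only: the false in "database" is matched by the
// false in "mode" of the other array, so nothing is reported
$byValue = array_diff($existingDsn, $requestedDsn);

// Compares key/value pairs: the "database" entries no longer match
$byKeyAndValue = array_diff_assoc($existingDsn, $requestedDsn);

var_dump(count($byValue));          // int(0)
var_dump(array_keys($byKeyAndValue)); // the only key reported is "database"
```

array_diff() only asks whether each value from the first array appears anywhere in the second, and it compares the values as strings, so any false will happily match any other false. array_diff_assoc() also checks the key, which is what you want when comparing two configuration arrays.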

Custom Types in PEAR::MDB2

April 3rd, 2007

It’s been all systems go at Openads central this year. One of the big jobs we’ve been taking on is dealing with our database schema. Managing database schemata is actually a tough job, if you want to do it in an agile way. (No one actually designs their database schema up front any more, right?) There are several great books on the subject, though, and based on those ideas, we’ve produced a white paper outlining a system we’re writing for managing database schemata for PHP, in the same vein as Rails’ Migrations.

As part of this project, we’ve decided to use PEAR::MDB2 and PEAR::MDB2_Schema to manage the jobs of reading in our existing database schemata from existing databases, to then be stored in XML files. The same two PEAR packages will then be used to prepare the SQL DDL/DML commands required to create new databases when installing Openads, or to modify an existing Openads database when upgrading. Of course, the DDL/DML created will all be based on the XML schema files.

If you’ve ever used PEAR::MDB2, you may be aware that it works by storing database schemata information in an “internal” MDB2 datatype. This datatype needs to be converted into database-specific commands when talking to a database, obviously. But if you’re reverse engineering a schema from an existing database, you also need to be able to convert an existing database “nativetype” into the appropriate MDB2 datatype.

However, this is where we’ve run into a few issues.

The reason we’re writing a tool to manage database migrations is because we want to make it easy to refactor our database schemata. This means it needs to be easy to create PHP classes which will allow us (and our users) to migrate between different database versions, ensuring that the data in the database is migrated along with any schema changes. Of course, a central tenet of refactoring is that you improve the structure without changing functionality. (Normally, this refers to changing the structure of the code, but in this case, we’re talking about improving the structure of the database schema without changing the functionality of the database.)

This means that our tool, from the very outset, must be able to reverse engineer our database schema from an existing installed database into an XML file, and then be able to produce exactly the same database from the XML.

Alas, we found that PEAR::MDB2 couldn’t do that.

To understand why, take a look at the MDB2_Driver_Datatype_mysql class (for example). Specifically, look at the mapNativeDatatype() method. This is where PEAR::MDB2 converts a MySQL nativetype into an internal MDB2 datatype. Take a look at that case statement, and note the value(s) inserted into the $type array for the MySQL nativetypes "varchar", or "enum". PEAR::MDB2 only ever uses the first of any possible nativetype to datatype conversions, and both of these MySQL nativetypes have the first datatype option in PEAR::MDB2 as the datatype "text". As you can guess, this means if you convert your existing database schema into PEAR::MDB2 datatypes, you’re not going to get the same schema back again when you create the database from XML.

Of course, this isn’t to say that PEAR::MDB2 is rubbish. On the contrary, one might question why your database schema has an "enum" type in the first place – after all, it’s not the most portable native data type. So, converting these two MySQL nativetypes to the PEAR::MDB2 datatype "text" actually makes a lot of sense, under "normal" situations.

But it’s not what we need – we need to be able to get out exactly the same database as we put in.

So, what to do? Well, we’ve extended PEAR::MDB2, and a big thank you to both Lorenzo and Lukas for their help in getting this patch into the package.

As you may know, you can already define custom PEAR::MDB2 datatypes, and include in the datatype definition how it should be mapped back to a database nativetype. See Lukas’ blog post about this for more details. Our patch extends this feature to fix a couple of places where the datatype to nativetype mapping was a little lacking, and you can now also define a custom mapping for a database nativetype back to your custom datatype (or any other existing PEAR::MDB2 datatype, if you want).

Here’s how it all fits together. Let’s say that you want to actually support MySQL’s "varchar" nativetype. The first thing you need to do is to create a new custom PEAR::MDB2 datatype for "varchar" columns. To do that, follow Lukas’ guide, and create a callback function to deal with the conversion of the new PEAR::MDB2 datatype (let’s call it "openads_varchar") back to the MySQL nativetype:

/**
 * A callback function to map the MDB2 datatype "openads_varchar" into
 * the MySQL nativetype "VARCHAR".
 *
 * @param MDB2   $db         The MDB2 database resource object.
 * @param string $method     The name of the MDB2_Driver_Datatype_Common method
 *                           the callback function was called from. One of
 *                           "getValidTypes", "convertResult", "getDeclaration",
 *                           "compareDefinition", "quote" and "mapPrepareDatatype".
 *                           See {@link MDB2_Driver_Datatype_Common} for the
 *                           details of what each method does.
 * @param array $aParameters An array of parameters, being the parameters that
 *                           were passed to the method calling the callback
 *                           function.
 * @return mixed Returns the appropriate value depending on the method that
 *               called the function. See {@link MDB2_Driver_Datatype_Common}
 *               for details of the expected return values of the six possible
 *               calling methods.
 */
function datatype_openads_varchar_callback(&$db, $method, $aParameters)
{
    // Lowercase method names for PHP4/PHP5 compatibility
    $method = strtolower($method);
    switch($method) {
        case 'getvalidtypes':
            // Return the default value for this custom datatype
            return '';
        case 'convertresult':
            // Convert the nativetype value to a datatype value using the
            // built in "text" datatype
            return $db->datatype->convertResult($aParameters['value'], 'text', $aParameters['rtrim']);
        case 'getdeclaration':
            // Prepare and return the MySQL specific code needed to declare
            // a column of this custom datatype
            $name = $db->quoteIdentifier($aParameters['name'], true);
            $datatype = $db->datatype->mapPrepareDatatype($aParameters['type']);
            $declaration_options = $db->datatype->_getDeclarationOptions($aParameters['field']);
            $value = $name . ' ' . $datatype;
            if (isset($aParameters['field']['length']) && is_numeric($aParameters['field']['length'])) {
                $value .= '(' . $aParameters['field']['length'] . ')';
            }
            $value .= $declaration_options;
            return $value;
        case 'comparedefinition':
            // Return the same array of changes that would be used for
            // the built in "text" datatype
            return $db->datatype->_compareTextDefinition($aParameters['current'], $aParameters['previous']);
        case 'quote':
            // Convert the datatype value into a quoted nativetype value
            // suitable for inserting into MySQL using the built in
            // "text" datatype
            return $db->datatype->quote($aParameters['value'], 'text');
        case 'mappreparedatatype':
            // Return the MySQL nativetype declaration for this custom datatype
            return 'VARCHAR';
    }
}

Once that’s done, you need to create another callback function to ensure that when you dump out an existing MySQL database, the "varchar" nativetype is correctly converted into the newly created PEAR::MDB2 datatype "openads_varchar":

/**
 * A callback function to map the MySQL nativetype "VARCHAR" into
 * the extended MDB2 datatype "openads_varchar".
 *
 * @param MDB2 $db       The MDB2 database resource object.
 * @param array $aFields The standard array of fields produced from the
 *                       MySQL command "SHOW COLUMNS". See
 *                       {@link http://dev.mysql.com/doc/refman/5.0/en/describe.html}
 *                       for more details on the format of the fields.
 *                          "type"      The nativetype column type
 *                          "null"      "YES" or "NO"
 *                          "key"       "PRI", "UNI", "MUL", or null
 *                          "default"   The default value of the column
 *                          "extra"     "auto_increment", or null
 * @return array Returns an array of the following items:
 *                  0 => An array of possible MDB2 datatypes. As this is
 *                       a custom type, always has one entry, "openads_varchar".
 *                  1 => The length of the type, if defined by the nativetype,
 *                       otherwise null.
 *                  2 => A boolean value indicating the "unsigned" nature of numeric
 *                       fields. Always null in this case, as the type is not numeric.
 *                  3 => A boolean value indicating the "fixed" nature of text
 *                       fields. Always false in this case, as varchar is not
 *                       of fixed length.
 */
function nativetype_varchar_callback(&$db, $aFields)
{
    // Prepare the type array
    $aType = array();
    $aType[] = 'openads_varchar';
    // Can the length of the VARCHAR field be found?
    $length = null;
    $start = strpos($aFields['type'], '(');
    $end = strpos($aFields['type'], ')');
    if ($start !== false && $end !== false && $end > $start) {
        $start++;
        $chars = $end - $start;
        $length = substr($aFields['type'], $start, $chars);
    }
    // No unsigned value needed
    $unsigned = null;
    // Set fixed to false
    $fixed = false;
    return array($aType, $length, $unsigned, $fixed);
}

Obviously, both of these callback methods need to be registered with your PEAR::MDB2 database connection, to ensure that they will be used:

$aOptions['datatype_map'] = array('openads_varchar' => 'openads_varchar');
$aOptions['datatype_map_callback'] = array('openads_varchar' => 'datatype_openads_varchar_callback');
$aOptions['nativetype_map_callback'] = array('varchar' => 'nativetype_varchar_callback');
$oDbh = &MDB2::singleton($dsn, $aOptions);

That’s it! With PEAR::MDB2, you can now not only create your own custom datatypes to control how your XML-based schema is converted into SQL, you can also create custom mappings to ensure that your existing database is converted back to an XML schema the way you want it.

Update: Demian has pointed out that he doesn’t think that the way the array is returned from the nativetype_varchar_callback() function is particularly great. I agree – but that’s the way PEAR::MDB2 expects it. Maybe we’ll get around to patching that too :)

PEAR::MDB2 Updated

March 14th, 2007

Oh yeah. A new version of PEAR::MDB2 is out.

I’m planning on updating my blog software “soon”, and I’ll put up a detailed run down on some of the new features in MDB2 as soon as I can get code syntax highlighting working…

FOWA London 07: Day 2

February 22nd, 2007

So, day two of the Future of Web Apps!

Mark Anders kicked things off with a demo of Flex Builder 2. That was actually pretty cool, I thought. I’m not much of a UI person (okay, I have no interest in UI design at all, beyond enjoying someone else’s nice UI) but it seemed like a pretty awesome way to create a Flash application. Mark also talked about how great ECMA will be in the future (nice) and also gave a quick demo of Apollo.

Next up Chris Wilson from Microsoft talked about IE7.

Khoi Vinh from NYTimes.com talked about site design. This was a great presentation, with lots of interesting things to note:

  • It’s Web 2.0, baby. Even the big boys at NYTimes.com don’t do it the old way, anymore. It used to be that news sites were based on “news delivery.” Now, it’s all about “news centric interactivity.”
  • Features are bad. If you look at users, they fit the bell curve. There’s a few beginners, there’s a few experts, but most people are the run-of-the-mill intermediate users. However, all of your fancy features are aimed at the experts. That means for most people, there’s a lot of “feature noise” that they have to tune out.
  • Should that be there? “For every single thing of importance, there should be multiple reasons.”
  • Settings and preferences? That’s just a “dumping ground” for design issues that you couldn’t solve, and that’s not good.
  • Do user testing. If you show it to your boss, that’s not user testing, that’s “executive testing”. (Or, usability testing vs. acceptance testing.)

Simon Willison talked about OpenID. Hopefully people will take his advice, now that the war is won, and soon we will be able to enjoy everyone’s sites supporting OpenID!

Jonathan Rochelle from Google talked about the lessons learned from creating Google Docs & Spreadsheets. His advice complemented Khoi’s nicely, I thought:

  • Get UI help early on, because UI innovation is key to success these days, and lots of your front end code will depend on what the UI does – so the sooner you know what your UI will do, the better.
  • Speed is critical. Kill those features, and make the important stuff fast. If it’s not fast, and you don’t have time to make it fast, dump it.
  • Get user feedback – not manager feedback.
  • Users cannot always see innovation – so sometimes, you have to innovate and create features that your users don’t know that they want (yet).
  • User data is sacred. Protect it at all costs.
  • Use test harnesses, automate your tests, and perform benchmark testing.

Rasmus Lerdorf talked about the history of PHP, and the new filter_input() option in PHP 5.2. Hopefully some good will come of this!

Finally, the boys from Moo got up and spoke. The product sounds, and looks, great. If only they didn’t have to go down the innocent path. There are few things that annoy me more than companies that think they are “cute”. Fun, I can deal with. Funny, I like. Energetic? Sure. Fresh and funky? Great. But “cute” is just annoying.