Invalid md5 sum calculation in liquibase 1.9.5

Today I noticed that a large number of changesets originally created by generateChangelog contained the MSSQL-specific uniqueidentifier type instead of the neutral UUID type (can that be considered a bug?). I set out to fix this with a script that would modify the changelog and put in validCheckSum tags for the modified changesets.

Here I ran into trouble: I got validation errors after the script ‘fixed’ the changelog. That was actually due to me misunderstanding validCheckSum. It should contain the new checksum and not the old one (I guess so that further modifications are still detected). It would be nice if the documentation could be a little more explicit about that (and about the fact that the only way to obtain the new checksum is from the output of liquibase).

Before I reached that conclusion I suspected the checksum calculation in liquibase, especially since the md5sum column contained values ranging from 26 to 32 characters in length, whereas I thought md5sums always had a fixed size (128 bits, so 32 hex characters). I had a look at the code and googled a bit, and it looks like the algorithm that liquibase uses strips the leading zeros of individual bytes, due to its use of Integer.toHexString().
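To illustrate the effect described above, here is a minimal sketch. It is not the actual liquibase code; the class and method names (Md5HexDemo, toHexUnpadded, toHexPadded) are made up for this example. It only shows how converting each digest byte with Integer.toHexString() loses a character for every byte below 0x10, and how padding each byte to two hex digits avoids that.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, not part of liquibase.
class Md5HexDemo {

    // Unpadded conversion: Integer.toHexString(0x0c) returns "c", not "0c",
    // so every byte below 0x10 shortens the result by one character.
    static String toHexUnpadded(byte[] digest) {
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(Integer.toHexString(b & 0xff));
        }
        return sb.toString();
    }

    // Padded conversion: always emits exactly two hex digits per byte.
    static String toHexPadded(byte[] digest) {
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Computes the raw 16-byte MD5 digest of a string (UTF-8 encoded).
    static byte[] md5(String input) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] digest = md5("a");
        System.out.println(toHexUnpadded(digest) + " (" + toHexUnpadded(digest).length() + " chars)");
        System.out.println(toHexPadded(digest) + " (" + toHexPadded(digest).length() + " chars)");
    }
}
```

The MD5 of "a" starts with the byte 0x0c, so the unpadded variant comes out one character short of the usual 32.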

More info at:

http://www.spiration.co.uk/post/1199/Java%20md5%20example%20with%20MessageDigest

So I guess that’s a bug? I modified the algorithm in liquibase and with a small fix it generates real md5sums.

What is the best way to deal with a large number of changesets that need to be fixed? I notice the error output cuts off at 25, so I can’t easily use the output of liquibase to feed my fix script with the new checksums (of course I can change the source :) but I guess more people might run into this).

Cheers,

Ric

It is a bug in 1.9 that has been fixed in 2.0. Since checksums are used to determine whether a changeset has been modified, I could not fix the bug that computes them incorrectly until 2.0, when I had made enough changes that I had to break compatibility anyway. 1.9 computes them wrong, but it was better to keep it consistent.

I’ll update the docs to try to explain the validCheckSum attribute better. If you find any documentation that could be improved, the site is a wiki, so feel free to update and improve whatever you can.

Another way to handle a lot of wrong checksums is to set the md5sum column in the database to null. If liquibase sees that it is null, it stores the correct md5sum in it for future reference, but does not fail. Hopefully that helps.
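As a rough sketch of the null-out approach over JDBC: the table and column names used here (DATABASECHANGELOG, MD5SUM, ID, AUTHOR, FILENAME) are assumptions that should be checked against your actual schema, and the class and method names are made up for this example. It resets the stored checksum of a single changeset rather than the whole table.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, not part of liquibase.
class ResetChecksums {

    // Assumed table and column names; verify them against your database first.
    static final String RESET_SQL =
            "UPDATE DATABASECHANGELOG SET MD5SUM = NULL "
            + "WHERE ID = ? AND AUTHOR = ? AND FILENAME = ?";

    // Clears the stored checksum for one changeset and returns the number
    // of rows updated (expected to be 0 or 1).
    static int resetChecksum(Connection conn, String id, String author, String filename)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(RESET_SQL)) {
            ps.setString(1, id);
            ps.setString(2, author);
            ps.setString(3, filename);
            return ps.executeUpdate();
        }
    }
}
```

Limiting the update to specific changesets keeps the checksum protection in place for everything that was not deliberately modified.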

Nathan

Ah, so we can get it fixed when we move to 2.0. We initially decided to move our project to 1.9.5 because it is the stable version.

Would it make sense to have a command option that just lists all (or specified) calculated checksums (path, author, id, checksum)? That would allow me to write some scripts for big refactorings (converting the MSSQL uniqueidentifier to UUID and things like that). I’d rather not use the set-to-null option, since we have quite a few databases around and the level of understanding of liquibase also differs a lot among our developers.

We used generateChangelog to convert our old ‘master’ database hell to liquibase, but as we learn more about liquibase we run into quite a few issues with the generated file.

If we decide to do bugfixes, I guess we can best do that on the 2.0 codebase? Or will there be maintenance releases on 1.9?

Cheers,

Ric

PS: While there are some issues with some parts of liquibase, we really like the concept behind it :) We see vast improvement in upgrading databases and in finally having real version control on the database schema.

You could query the databasechangelog table for the id, path, author, checksum values for now.  What I generally do with teams with mixed liquibase expertise is have the developers that understand it do the dangerous things like setting the md5sum to null and not tell the newer developers about it…
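A minimal JDBC sketch of such a query follows. The column names (FILENAME, AUTHOR, ID, MD5SUM, DATEEXECUTED) are assumptions to verify against your own databasechangelog table, and the class and method names are invented for this example. Note that this lists the checksums stored in the database, not freshly calculated ones.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of liquibase.
class ListChecksums {

    // Assumed table and column names; verify them against your database first.
    static final String LIST_SQL =
            "SELECT FILENAME, AUTHOR, ID, MD5SUM FROM DATABASECHANGELOG "
            + "ORDER BY DATEEXECUTED";

    // Formats one changeset as a tab-separated line: path, author, id, checksum.
    static String formatRow(String filename, String author, String id, String md5sum) {
        return String.join("\t", filename, author, id, md5sum == null ? "(null)" : md5sum);
    }

    // Reads all stored checksums and returns them as formatted lines.
    static List<String> listChecksums(Connection conn) throws SQLException {
        List<String> rows = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(LIST_SQL)) {
            while (rs.next()) {
                rows.add(formatRow(rs.getString("FILENAME"), rs.getString("AUTHOR"),
                        rs.getString("ID"), rs.getString("MD5SUM")));
            }
        }
        return rows;
    }
}
```

The tab-separated output is meant to be easy to feed into a fix script.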

There will be maintenance in the 1.9 branch as needed, although it will not be my primary focus.

Nathan