Strange checksum behaviour


We are getting different checksum validation behavior on different environments when running against the same database instance.

databasechangelog has a value of e9e9d79fd8154a72ab6aa65aae74021f for the changeset
checksum calculated from the xml change log is 7:85be67d6cb3df70dfa4228eddd2d1737

In environment W (Windows 7)
Java version: IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Windows 7 amd64-64 jvmwa6460sr9-20101124_69295 (JIT enabled, AOT enabled)
Liquibase fails checksum validation
This is the expected behaviour

In environment L (Red Hat LInux 5.9)
Java version: Java™ SE Runtime Environment (build 1.6.0_24-b07) Java HotSpot™ 64-Bit Server VM (build 19.1-b02, mixed mode)
Liquibase passes checksum validation
This behaviour is not expected

Liquibase 3.0.5 installed on both environments

It may be an OS difference, although liquibase should be correcting for that. Could you try doing an update with logLevel=debug and send the output. That includes information on how it is computing the checksums.


If you are unable to do that, can you post the changeSets that are causing the problems?


Nathan

\ufffd seems  to be a general unprintable char. If you could send me the actual changelog that would be good. If you want to send it directly to me you can email it to nathan.voxland@liquibase.org


Nathan

Hi,

I just wanted to add that I get different checksums depending on OS and Liquibase-Version.

We did the initial run with Linux and Liquibase 3.0.4.
When using the same Liquibase-Version with Solaris I get 4 different Changesets, for Win7 it’s even 6.

Changing nothing but Liquibase to 3.0.5 I get the following change-count:
Linux: 20
Solaris: 24
Win7: 24

Merging all the changes of all 6 runs overall 26 different changesets are considered to be different and even the 24 changesets from Solaris and Win7 with 3.0.5 differ in two changesets.

Regards,
Lars





Thanks Nathan,

After looking a bit closer at the change set. I noticed that it contains some unprintable characters in a “COMMENT ON …” command. This does not seem to cause any problems for Liquibase when running the update. However the checksum is calculated differently depending on the environment that it was run on.

Below is the output for a calculateCheckSum with -logLevel=debug. You will see the strange characters after the string “Company, Meter, etc” - highlighted in red below. The change log files used on Windows and Linux are identical. Let me know if you you are unable to reproduce this and I will send you the change set.

On Windows:
liquibase --logLevel=debug --username=xxxx --password=xxxxxx --changeLogFile=“CR3/CR3 Alter 346 Backend fixes from CR2 and CR3.xml” --url=jdbc:oracle:thin:@//xxxxxxxx:1521/CSID calculateCheckSum “CR3/CR3 Alter 346 Backend fixes from CR2 and CR3.xml”::901::Robin
DEBUG 02/10/13 1:00 PM:liquibase: null: null: Unable to load/access Apache Derby driver class to check version
DEBUG 02/10/13 1:00 PM:liquibase: null: null: Connected to CSID@jdbc:oracle:thin:@//xxxxxxxx:1521/CSID
DEBUG 02/10/13 1:00 PM:liquibase: null: null: Setting auto commit to false from true
INFO 02/10/13 1:00 PM:liquibase: null: null: Calculating checksum for changeset CR3/CR3 Alter 346 Backend fixes from CR2 and CR3.xml::901::Robin
INFO 02/10/13 1:00 PM:liquibase: null: null: CR3/CR3 Alter 346 Backend fixes from CR2 and CR3.xml is using schema version 2.0 rather than version 3.0
DEBUG 02/10/13 1:00 PM:liquibase: null: null: Computed checksum for null:true:true:COMMENT ON COLUMN TANK.TANKSHORTNAME IS 'This is an abbreviated mnemonic code for the Tank. It is in uppercase characters and is used for quick searches and lookups. Is the abbreviated name for the piece of equipment that contains both the equipment abbreviation and the Local Tank ID. For example, EP-304-TK-301 (Functional Tag) would have the Short Name (TK-301) [Darin] - There is no Short Name for Tank in CPD, but there is a Short Name found for Facility, Company, Meter, etc� '; as 799d26eb99991fd15c5e78a3434442e2
DEBUG 02/10/13 1:00 PM:liquibase: null: null: Computed checksum for 7:799d26eb99991fd15c5e78a3434442e2: as 85be67d6cb3df70dfa4228eddd2d1737
7:85be67d6cb3df70dfa4228eddd2d1737

On Linux:
/u02/liquibase/liquibase --logLevel=debug --username=xxxx --password=xxxxxx --changeLogFile=“CR3/CR3 Alter 346 Backend fixes from CR2 and CR3.xml” --url=jdbc:oracle:thin:@//xxxxxxxx:1521/CSID calculateCheckSum “CR3/CR3 Alter 346 Backend fixes from CR2 and CR3.xml”::901::Robin
Liquibase Home is not set.
Liquibase Home: /u02/liquibase
DEBUG 10/2/13 1:05 PM:liquibase: null: null: Unable to load/access Apache Derby driver class to check version
DEBUG 10/2/13 1:05 PM:liquibase: null: null: Connected to CSID@jdbc:oracle:thin:@//xxxxxxxx:1521/CSID
DEBUG 10/2/13 1:05 PM:liquibase: null: null: Setting auto commit to false from true
INFO 10/2/13 1:05 PM:liquibase: null: null: Calculating checksum for changeset CR3/CR3 Alter 346 Backend fixes from CR2 and CR3.xml::901::Robin
INFO 10/2/13 1:05 PM:liquibase: null: null: CR3/CR3 Alter 346 Backend fixes from CR2 and CR3.xml is using schema version 2.0 rather than version 3.0
DEBUG 10/2/13 1:05 PM:liquibase: null: null: Computed checksum for null:true:true:COMMENT ON COLUMN TANK.TANKSHORTNAME IS 'This is an abbreviated mnemonic code for the Tank. It is in uppercase characters and is used for quick searches and lookups. Is the abbreviated name for the piece of equipment that contains both the equipment abbreviation and the Local Tank ID. For example, EP-304-TK-301 (Functional Tag) would have the Short Name (TK-301) [Darin] - There is no Short Name for Tank in CPD, but there is a Short Name found for Facility, Company, Meter, etc\ufffd '; as 8678aab79b3bdf5506fb2d3b17bc1ee7
DEBUG 10/2/13 1:05 PM:liquibase: null: null: Computed checksum for 7:8678aab79b3bdf5506fb2d3b17bc1ee7: as e9e9d79fd8154a72ab6aa65aae74021f
7:e9e9d79fd8154a72ab6aa65aae74021f
Liquibase ‘calculateCheckSum’ Successful

I did add a checksum version (the portion before the ‘:’ char), but currently it is a global flag. I’m planning on moving it to be more change specific so we can support multiple versions, but for now I try to avoid any changes that affect checksum versions.


In 3.0.5/3.0.6 the only think that should have affected the checksum algorithm is handling of unprintable UTF8 chars. What type of changeSets are failing the checksum? Is it or changeSets? Or is there anything else they have in common?

Nathan

Would you be able to send me a sample changelog where you are seeing the problem? I’m not seeing it so far. If you want, you can send it to me directly at nathan.voxland@liquibase.org


Nathan

I see the problem now. There was a bugfix in 3.0.5 that introduced a change to the checksum behavior for and changeSets with a where clause. Since people have upgraded to 3.0.5+, there isn’t a nice way to handle the problem since reverting teh change will break checksums for people that have already upgraded. 


I’m going to leave the change in logic in, since that also preserves a bugfix. That means you will have to update your changelog file to have a null md5sum column in databasechangelog for the rows that would have changed. The easiest query to run is just 'update datatabasechangelog set md5sum=null" and the next time you run liquibase, it will update the checksums to be correct.


Nathan

Hi Nathan,

we also see differences in checksum calculation when upgrading from 3.0.4 to 3.0.5/3.0.6 staying on the same OS. Since sum scripts have already been run in production it is currently impossible to upgrade the version as we get the checksum violations. Do you have any idea what has been changes in the newer version that cause the violation?

Maybe it is possible to save the checksum calculation version side by side with the checksum itself so that calculation and comparison can be based on the same algorithm. That would allow for introducing new version without braking existing states.

It would be nice if you could provide a short feedback.

Regards,
Florian

Hi Nathan,
as far as I can see there are no or tags in the changeSets that violate the checksum comparison. From what I can see is that all these changeSets have constraints using a tag, whereas chageSets without tags seem to run smoothly. Concerning the constraints I see no characters that are likely UTF8 unprintable.

I hope this helps…

Florian