Change log theory

Hello; I’m circling around LiquiBase, looking at it for a very large project with a lot of visibility.  Consequently I’m nervous :) and want to make sure that I fully understand the general problem area before I start hacking.

I had a question related to change logs over time.

In the training videos and examples that I’ve watched so far (I’m about 2/3 of the way through the long one), there isn’t much discussion of what happens if a change log is “shortened”.  By that I mean obvious mistake entries being consolidated into correct entries, or six individual column renamings combined into one change set, and so on.

Is it expected that a change log will only ever be appended to, or can the change log itself be refactored?

If I have a change log entry that creates one table, and then another one that creates another table, and then seventeen others that affect the data in those tables, what happens if I consolidate those two entries into one after the fact, check it into version control, and run liquibase update?  Semantically it’s the same thing: I have created two tables and filled them with data, but I don’t see how LiquiBase could know that.

Or: if I have three entries in my changelog that are designed to back out some kind of boneheaded column misspelling mistake, I’d kind of like to make it so that I don’t have to air that dirty laundry in front of my customers.  It would be nice to be able to smush the column renaming operation together with the spelling correction after the fact.  Does LiquiBase do this sort of thing already?

Are there general rules for changelog maintenance on a large project?

Thanks very much for any pointers in this regard, and thanks for what looks like a really interesting project.

Best,
Laird

The short answer is: “only append to changelogs; don’t modify changeSets that have already been run.  You don’t know which changeSets have been applied to which databases.  Plus, you did your earlier testing on the original versions of the changeSets, and you don’t want to introduce possible migration errors by creating a completely separate database update path”.

The longer version is: there are times when it is nice and/or necessary to modify existing changeSets.  Besides the examples you gave, there are also times when there was a bug in the earlier version and you need to fix it, but cannot recover databases that have already had the changes applied to them.  An example of this is an <update> statement where you forgot the where clause.

For cases like these, there is support in LiquiBase to allow you to modify your changeSets if you know what you are doing.  The trouble with documenting how is that what you do depends on what you are changing.  Some options you have:

  1. Just modify the existing changeSet.  Any databases that have already run the changeSet will complain about an invalid checksum, but this can be suppressed with the <validCheckSum> tag.  With the validCheckSum, databases that already ran this changeSet will just continue on and not get the new version, but future database updates will get the new version.

  2. Delete an old changeSet.  Databases that already ran the changeSet will keep the change, and they do not care that it is no longer in the file.  It will not be applied to future database updates.

  3. Use <preConditions> based on the existence of tables, execution of changeSets by id, etc. to control whether a changeSet is run.
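
As a rough sketch of option 1, a corrected changeSet carrying a validCheckSum might look something like the following.  The table, column, id and author names are made up for illustration, and the checksum is a placeholder: you would use the value stored for the original changeSet (for example, the MD5SUM column of the DATABASECHANGELOG table).  The fragment is meant to sit inside your existing databaseChangeLog element.

```xml
<!-- Hypothetical changeSet: the original version forgot the where clause. -->
<changeSet id="deactivate-person" author="example">
    <!-- Placeholder: checksum of the ORIGINAL version of this changeSet.
         Databases that recorded this checksum skip validation errors
         and do not rerun the changeSet. -->
    <validCheckSum>CHECKSUM_OF_ORIGINAL_VERSION</validCheckSum>

    <!-- Corrected change: only databases that have not yet run this
         changeSet will execute this version. -->
    <update tableName="person">
        <column name="status" value="inactive"/>
        <where>id = 42</where>
    </update>
</changeSet>
```

Note that this does nothing to repair databases that already ran the broken version; it only stops them from complaining.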

#3 is the most flexible and what I usually use.  If you want to combine a createTable and 3 addColumn changes into one larger createTable change, you delete the original createTable change and the addColumn changes, then create a new createTable change with a <not><tableExists> precondition with onFail=“MARK_RAN”.  This way, if you have a database that was updated with the original changes, it will fail the precondition, be marked as ran, and continue on.  If you have a database that has not been updated with the old changes, it will pass the precondition and run the new changeSet.
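
To make that concrete, here is a minimal sketch of such a consolidated changeSet.  The table and column names, the id and the author are all invented for the example; the point is only the shape of the precondition.  It belongs inside your existing databaseChangeLog element, in place of the deleted changeSets.

```xml
<!-- Hypothetical replacement for one createTable plus three addColumn changeSets. -->
<changeSet id="create-person-consolidated" author="example">
    <!-- Passes only when the table is missing. If the table already exists
         (database built from the old changeSets), the changeSet is recorded
         as ran without being executed. -->
    <preConditions onFail="MARK_RAN">
        <not>
            <tableExists tableName="person"/>
        </not>
    </preConditions>

    <createTable tableName="person">
        <column name="id" type="int">
            <constraints primaryKey="true" nullable="false"/>
        </column>
        <!-- Columns that used to be added by separate addColumn changeSets -->
        <column name="first_name" type="varchar(50)"/>
        <column name="last_name" type="varchar(50)"/>
        <column name="email" type="varchar(100)"/>
    </createTable>
</changeSet>
```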

Like I said above, though, you do have to be very careful when doing this.  You will run into problems if you have a database that had the original createTable change, but not all of the deleted addColumn changes.  You will also run into problems if you have changes between the original createTable and the addColumns, such as an <insert>, that will now fail because the table either doesn’t exist or has more columns than expected (depending on where you put the new createTable change).
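
One possible way to cover the partially updated case, and this is only a sketch with invented names rather than a general recipe, is to follow the consolidated createTable with a guarded addColumn for each column that might be missing.  It assumes the consolidated changeSet above it checks for the table with a tableExists precondition.

```xml
<!-- Hypothetical safety net for databases that have the original table
     but never ran the addColumn for "email". -->
<changeSet id="add-person-email-if-missing" author="example">
    <!-- Runs only when the column is absent. On fresh databases the
         consolidated createTable already made it, so this is marked as ran. -->
    <preConditions onFail="MARK_RAN">
        <not>
            <columnExists tableName="person" columnName="email"/>
        </not>
    </preConditions>

    <addColumn tableName="person">
        <column name="email" type="varchar(100)"/>
    </addColumn>
</changeSet>
```

Whether this is worth the extra changeSets depends on how many partially updated databases you actually have out there.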

It’s really up to you on each changeSet to weigh the cost and danger of the modification you want to make against the improvement it will give you.

Nathan

Originally posted by: Nathan
3. Use <preConditions> based on the existence of tables, execution of changeSets by id, etc. to control whether a changeSet is run.

#3 is the most flexible and what I usually use.  If you want to combine a createTable and 3 addColumn changes into one larger createTable change, you delete the original createTable change and the addColumn changes, then create a new createTable change with a <not><tableExists> precondition with onFail=“MARK_RAN”.  This way, if you have a database that was updated with the original changes, it will fail the precondition, be marked as ran, and continue on.  If you have a database that has not been updated with the old changes, it will pass the precondition and run the new changeSet.

I like this.  Thanks for the detailed answer.

Best,
Laird