Is there an Update option to minimize reading DATABASECHANGELOG?

Hello!

I’m still trying to guess at how Liquibase works, but what I’m seeing is:

A) when I NULL MD5SUMs (with SQL, since I only want to clear some of them), the next migration recalculates ~85% of them (at the start) but not the rest.
B) during liquibase:update when I requery the DB separately I’m seeing little change (is it all balled into a transaction?)
C) this doesn’t seem to be for every “[INFO] ChangeSet…” in the console output, but after MANY operations I’m seeing:
[INFO] Reading from DATABASECHANGELOG
This takes 46 seconds to run every time on my machine.

We are attempting to upgrade from LB v3 to v4, and I’m on a project with over 10,000 change sets, so this is nigh impossible.

The questions I can think of are:

  1. What triggers Liquibase v4 to reread from DATABASECHANGELOG?
  2. What options are there in mvn liquibase:update to reduce or eliminate those operations?
  3. Is there a separate LB operation to ONLY recalculate NULL checksums? (i.e., to separate the checksum run from the migration run)
  4. If I have to fork LB to get this to work in a reasonable manner, where should I start my search?

Thanks!

Hi @NathanVoxland and @MikeOlivas, would you mind taking a look at this one?
I have also asked other developers in community-channel regarding this one.

The general process should be:

  1. A one-time select * from databasechangelog at the beginning of the command to pull down what had been run before
  2. As liquibase runs, it uses that pre-fetched copy of the changelog history and shouldn’t generally be going back to re-check it or anything. Any updates to the history should be applied both to the in-memory version and back to disk
  3. As you perform an update, liquibase iterates over all the changesets to apply and compares the stored checksum with the current version. If the stored checksum is null, it will save the current checksum back to the databasechangelog table.
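The three steps above can be sketched in plain Java. This is only an illustrative model, not Liquibase's implementation: `HistoryTable`, `update`, and the string checksums are hypothetical stand-ins, and a map plays the role of the DATABASECHANGELOG table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; none of these are Liquibase classes.
class ChecksumBackfillSketch {

    /** Stand-in for DATABASECHANGELOG: changeset id -> stored checksum (may be null). */
    static final class HistoryTable {
        final Map<String, String> rows = new LinkedHashMap<>();
        int selectCount = 0;

        Map<String, String> selectAll() {
            selectCount++;                        // count every full-table read
            return new LinkedHashMap<>(rows);     // hand back an in-memory copy
        }

        void saveChecksum(String id, String checksum) {
            rows.put(id, checksum);
        }
    }

    /**
     * changeLog maps changeset id -> checksum computed from the current changelog files.
     * Returns the ids whose NULL stored checksum was filled in.
     */
    static List<String> update(HistoryTable table, Map<String, String> changeLog) {
        Map<String, String> ran = table.selectAll();   // step 1: one read up front
        List<String> backfilled = new ArrayList<>();

        for (Map.Entry<String, String> changeSet : changeLog.entrySet()) {
            String id = changeSet.getKey();
            String current = changeSet.getValue();

            if (!ran.containsKey(id)) {
                // Not run before: "execute" it, then record it in both copies (step 2).
                ran.put(id, current);
                table.saveChecksum(id, current);
            } else if (ran.get(id) == null) {
                // Ran before but checksum is NULL: save the current one back (step 3).
                ran.put(id, current);
                table.saveChecksum(id, current);
                backfilled.add(id);
            } else if (!ran.get(id).equals(current)) {
                throw new IllegalStateException("Checksum mismatch for " + id);
            }
        }
        return backfilled;
    }

    public static void main(String[] args) {
        HistoryTable table = new HistoryTable();
        table.rows.put("1", "abc");
        table.rows.put("2", null);
        table.rows.put("orphan", null);   // no longer present in the changelog

        Map<String, String> changeLog = new LinkedHashMap<>();
        changeLog.put("1", "abc");
        changeLog.put("2", "def");
        changeLog.put("3", "ghi");

        System.out.println("Backfilled: " + update(table, changeLog));
        System.out.println("Table now:  " + table.rows);
        System.out.println("Reads:      " + table.selectCount);
    }
}
```

In this model the loop is driven by the changelog, so a row whose changeset is not in the changelog is never visited and keeps its NULL checksum.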

So for your question on updating null checksums, it should be updating all the changesets that did execute, not a subset of them. If you’re not seeing all of them updating, are some no longer in your changelog file or being filtered for context/label/dbms reasons?

Liquibase attempts to run each changeset in a transaction, starting the transaction at the beginning of the changeset and then committing it after inserting into the databasechangelog table. Many databases will auto-commit DDL statements, which makes the actual transactions commit more often, though.

You shouldn’t be seeing so many “reading from databasechangelog” statements, so I’m curious why that is happening. That log message is inside an “if there are no previously cached ran changesets” block, so I’m not sure offhand what could be causing you to keep hitting that code. Do you notice any patterns with the changesets or times it does re-run that query?

There isn’t a command to just update the checksums, since normally that just happens in an update operation. If you can do Java programming, you could write an extension that adds a command to do it. Or we’re always up for pull requests to add features too.

Nathan

A) further investigation shows the ones not being recalculated are orphaned changesets, e.g. renamed, defunct, or no longer used.

C) I haven’t been able to deduce a pattern so far wrt the DATABASECHANGELOG rereads, though oddly it seems to reread the table more often after short changesets. Could it be a factor of how the changesets are organized in files/subfiles/changelogs?

A) That makes sense and is fine

C) The changeset file/subfile setup shouldn’t matter. Liquibase first parses your changelog files into a flat list regardless of where they came from. The actual file breakdown doesn’t survive to the point where it’s deciding what has run and what has not.
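As a rough illustration of what "parsed into a flat list" means, here is a small Java sketch. It is not Liquibase's parser: `ChangeLogFile`, `ChangeSet`, and `flatten` are hypothetical names, and the only point is that nested includes collapse into a single ordered list before any "has this run?" decision is made.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of these are Liquibase classes.
class ChangeLogFlattenSketch {

    /** Either a changeset or an included changelog file. */
    interface Node {}

    static final class ChangeSet implements Node {
        final String id;
        ChangeSet(String id) { this.id = id; }
    }

    static final class ChangeLogFile implements Node {
        final String path;
        final List<Node> children = new ArrayList<>();
        ChangeLogFile(String path) { this.path = path; }

        ChangeLogFile add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Depth-first walk that keeps declaration order and drops the file structure. */
    static List<String> flatten(ChangeLogFile root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node node, List<String> out) {
        if (node instanceof ChangeSet) {
            out.add(((ChangeSet) node).id);
        } else {
            for (Node child : ((ChangeLogFile) node).children) {
                collect(child, out);
            }
        }
    }

    public static void main(String[] args) {
        ChangeLogFile sub = new ChangeLogFile("sub/changelog.xml")
                .add(new ChangeSet("2"))
                .add(new ChangeSet("3"));
        ChangeLogFile master = new ChangeLogFile("master.xml")
                .add(new ChangeSet("1"))
                .add(sub)
                .add(new ChangeSet("4"));

        System.out.println(flatten(master));
    }
}
```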

You wouldn’t happen to know Java well enough to run Liquibase in a debugger if I pointed you to where to set a breakpoint, would you?

Nathan

Well, I have before within a Maven command (from IntelliJ), so I probably could again: yes.

The spot to set a breakpoint is the getRanChangeSets() method of liquibase.changelog.StandardChangeLogHistoryService.

That method starts with an “if” statement whose body should only run once. A breakpoint inside that method would let you see when it keeps getting called. Is the reset() method ever called to clear the ranChangeSetList field?
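For anyone following along, the caching pattern described here looks roughly like the sketch below. The names `getRanChangeSets`, `reset`, and `ranChangeSetList` come from the description above, but the bodies are assumptions written for illustration, not a copy of the Liquibase source; a list stands in for the DATABASECHANGELOG table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; method bodies are assumptions, not Liquibase source.
class RanChangeSetCacheSketch {
    private final List<String> table;        // stand-in for DATABASECHANGELOG rows
    private List<String> ranChangeSetList;   // null means "nothing cached yet"
    private int tableReads = 0;

    RanChangeSetCacheSketch(List<String> table) {
        this.table = table;
    }

    List<String> getRanChangeSets() {
        if (ranChangeSetList == null) {
            // Only reached when the cache is empty: the expensive full-table read.
            tableReads++;
            System.out.println("Reading from DATABASECHANGELOG");
            ranChangeSetList = new ArrayList<>(table);
        }
        return ranChangeSetList;
    }

    /** Clearing the cache forces the next getRanChangeSets() call to re-read the table. */
    void reset() {
        ranChangeSetList = null;
    }

    int tableReads() {
        return tableReads;
    }

    public static void main(String[] args) {
        RanChangeSetCacheSketch history =
                new RanChangeSetCacheSketch(Arrays.asList("1", "2", "3"));

        history.getRanChangeSets();   // first call reads the table
        history.getRanChangeSets();   // served from the cache
        history.reset();              // something clears the cache...
        history.getRanChangeSets();   // ...so the table is read again

        System.out.println("Table reads: " + history.tableReads());
    }
}
```

If the real code follows this shape, each extra "Reading from DATABASECHANGELOG" line would correspond to the cached list having been cleared since the previous call, which is why a breakpoint in `reset()` is a useful second place to look.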

Nathan
