Error trying to create a lock in Azure CosmosDB

Hello.

I’m having a strange issue when applying changelog updates to an Azure CosmosDB instance that emulates the Cassandra API. During an update, Liquibase attempts to acquire a lock in the DATABASECHANGELOGLOCK table. I have verified that no lock exists beforehand, but instead of acquiring it, Liquibase gets stuck in a loop waiting for the changelog lock. Meanwhile, I can see that a lock was granted to this very update process at the exact timestamp of the first “Waiting for changelog lock…” message.

Here’s the lock:

 id | locked | lockgranted                     | lockedby
----+--------+---------------------------------+---------------------------
  1 |   True | 2023-07-21 16:04:12.196000+0000 | 58ee50fd8429 (x.x.x.x)
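
(That’s the output of a plain CQL query in cqlsh; my_keyspace below is just a placeholder for my actual keyspace:)

SELECT id, locked, lockgranted, lockedby FROM my_keyspace.databasechangeloglock;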

And here is the output from the update operation with log-level=INFO:

clm-dev-liquibase-cassandra  | Starting Liquibase at 16:04:09 (version 4.21.1 #9070 built at 2023-04-13 20:56+0000)
clm-dev-liquibase-cassandra  | [2023-07-21 16:04:09] INFO [liquibase.ui] Starting Liquibase at 16:04:09 (version 4.21.1 #9070 built at 2023-04-13 20:56+0000)
clm-dev-liquibase-cassandra  | Liquibase Version: 4.21.1
clm-dev-liquibase-cassandra  | [2023-07-21 16:04:09] INFO [liquibase.ui] Liquibase Version: 4.21.1
clm-dev-liquibase-cassandra  | Liquibase Open Source 4.21.1 by Liquibase
clm-dev-liquibase-cassandra  | [2023-07-21 16:04:09] INFO [liquibase.ui] Liquibase Open Source 4.21.1 by Liquibase
clm-dev-liquibase-cassandra  | [2023-07-21 16:04:09] INFO [liquibase.integration] Starting command execution.
clm-dev-liquibase-cassandra  | SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
clm-dev-liquibase-cassandra  | SLF4J: Defaulting to no-operation (NOP) logger implementation
clm-dev-liquibase-cassandra  | SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
clm-dev-liquibase-cassandra  | [2023-07-21 16:04:12] INFO [liquibase.ext] Waiting for changelog lock....
clm-dev-liquibase-cassandra  | [2023-07-21 16:04:22] INFO [liquibase.ext] Waiting for changelog lock....
clm-dev-liquibase-cassandra  | [2023-07-21 16:04:32] INFO [liquibase.ext] Waiting for changelog lock....
clm-dev-liquibase-cassandra  | [2023-07-21 16:04:42] INFO [liquibase.ext] Waiting for changelog lock....
clm-dev-liquibase-cassandra  | [2023-07-21 16:04:53] INFO [liquibase.ext] Waiting for changelog lock....
clm-dev-liquibase-cassandra  | [2023-07-21 16:05:03] INFO [liquibase.ext] Waiting for changelog lock....
clm-dev-liquibase-cassandra  | [2023-07-21 16:05:13] INFO [liquibase.ext] Waiting for changelog lock....
clm-dev-liquibase-cassandra  | [2023-07-21 16:05:23] INFO [liquibase.ext] Waiting for changelog lock....

Notice that the lockgranted timestamp in the DATABASECHANGELOGLOCK table matches the first “Waiting for changelog lock…” message. The issue doesn’t happen every time. Running the “release-locks” command clears the lock fine, but the problem recurs on the next update.
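
(For reference, that’s just the standard CLI command, pointed at the same connection settings as the update:)

liquibase --defaults-file=liquibase.properties release-locks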

Because this is CosmosDB exposing the Cassandra API, I’m using the Simba JDBC driver for Cassandra, not the CosmosDB driver. The same setup works fine against true Cassandra databases; I only see the issue against Azure CosmosDB.
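
For context, my connection setup looks roughly like this (account, keyspace, and credentials are placeholders; the CosmosDB Cassandra endpoint listens on port 10350, and the exact SSL/auth options for the Simba driver are in its docs):

changeLogFile: changelog.xml
driver: com.simba.cassandra.jdbc42.Driver
url: jdbc:cassandra://<account>.cassandra.cosmos.azure.com:10350;DefaultKeyspace=<keyspace>
username: <account-name>
password: <account-key>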

Anyone else seen this issue and perhaps know what to do about it?

Thanks in advance,
Marshall

I found that version 4.23.1 of the liquibase-cassandra extension contains logic that checks whether the lock was just obtained by the current process (matching on hostname and IP address). This appears to be a defensive guard against exactly what I’m seeing: the query that follows the write of the lock does not return a valid result. On CosmosDB emulating the Cassandra API, though, the behavior is more severe; it takes at least one more iteration of “Waiting for changelog lock…” before the query returns a valid value and the lock is attributed to the current process. So I added the check for local-process ownership of the lock to every isLocked() call, and that fixed my problem.
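
In case it helps anyone, here’s a standalone sketch of the ownership check (my paraphrase only, not the extension’s actual code; the real change lives in the liquibase-cassandra lock service, and the class/method names here are mine):

import java.net.InetAddress;
import java.net.UnknownHostException;

public class ChangelogLockOwnershipCheck {

    // Liquibase records the lock holder as "hostname (ip)" in the LOCKEDBY
    // column, e.g. "58ee50fd8429 (x.x.x.x)" in the table above.
    static String currentHostDescription() throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        return local.getHostName() + " (" + local.getHostAddress() + ")";
    }

    // Treat the lock as absent when this very process wrote it, so a stale
    // CosmosDB read-after-write doesn't make the process wait on itself.
    static boolean isLockedByAnotherProcess(String lockedByColumn) throws UnknownHostException {
        if (lockedByColumn == null || lockedByColumn.isEmpty()) {
            return false; // no lock row at all
        }
        return !lockedByColumn.equals(currentHostDescription());
    }
}

Wiring a check like this into every isLocked() evaluation, rather than only at the single point where 4.23.1 applies it, is what resolved the loop for me.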