I am looking for a database change management tool for Cassandra. I am curious to know if anyone has experience extending Liquibase to support NoSQL databases, including Cassandra. I realize that for some NoSQL databases it might not be applicable at all. Cassandra, though, has CQL, which is SQL-like (if not strictly a subset of SQL), and there is a JDBC driver for CQL. It does not work out of the box with Liquibase. It looks like some extensions are needed, such as a SQLGenerator for creating tables, and maybe more. Anyway, I am just wondering if anyone has looked into anything like this and has tips, words of caution, etc.
There is a wide variety of NoSQL technologies. Many of them are schema-less in that they do not require any sort of static schema definition; however, for some NoSQL technologies, including some of the schema-less ones, there is still a need for a tool such as Liquibase. Consider MongoDB, for example. There is no fixed schema. In fact, the database and collection (the analog of a table in an RDBMS) do not even get created until you insert data. But for most applications, documents in a collection will have a similar structure. The schema just happens to be defined in application code. As with a lot of databases, there is often a need for reference data, and that data is commonly created and managed with a tool such as Liquibase. So even for something like MongoDB, which has no fixed schema, I could see reference data as a possible use case for Liquibase.
Other NoSQL technologies, like Cassandra, are not schema-less. Cassandra is a column-oriented key/value store. You do have to define tables (i.e., column families) up front. Declaring the columns in a table is optional; however, even if columns are not defined up front, they must all be of the same type. Liquibase is absolutely applicable to Cassandra, in large part because Cassandra provides CQL and there is a JDBC driver for it. I have already begun working on Liquibase extensions for Cassandra. I ran into some problems with the lock table that Liquibase uses. See this thread for more details. In short, Liquibase’s locking needs to be turned off in order to get things working for Cassandra. I have already submitted a pull request for these changes.
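To make the "SQLGenerator for creating tables" idea a bit more concrete, here is a minimal sketch of the kind of statement such an extension would have to emit. It deliberately does not use the Liquibase extension API; `CqlCreateTable` and `generate` are hypothetical names made up for illustration, and the output assumes CQL 3 style `CREATE TABLE` syntax with a single-column primary key.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, NOT part of Liquibase or any Cassandra driver.
// It only shows the string a CQL-aware generator would need to produce.
final class CqlCreateTable {

    // columns maps column name to CQL type; iteration order is the column order,
    // so pass a LinkedHashMap if the order matters.
    static String generate(String keyspace, String table,
                           Map<String, String> columns, String primaryKey) {
        if (!columns.containsKey(primaryKey)) {
            throw new IllegalArgumentException(
                "Primary key column is not declared: " + primaryKey);
        }
        StringJoiner body = new StringJoiner(", ", "(", ")");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            body.add(column.getKey() + " " + column.getValue());
        }
        // Assumption: a simple, single-column primary key.
        body.add("PRIMARY KEY (" + primaryKey + ")");
        return "CREATE TABLE " + keyspace + "." + table + " " + body + ";";
    }
}
```

A real extension would plug this kind of logic into whatever generator hook Liquibase exposes and would also need to handle compound keys and table options, which this sketch ignores.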
To me, not having very much experience with NoSQL, this sounds like a contradiction in terms.
My understanding was that the whole NoSQL “thing” is about not having a fixed database schema (that’s why they are also called “schemaless” solutions).
I don’t understand why you would want to use a tool that requires you to impose a fixed schema definition on a system that is intended to be used without one.
Even though MongoDB is “schemaless”, that doesn’t mean you don’t have homogeneous collections. Also, there are many times when you want to run migrations on your data.
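As a rough illustration of what "running migrations on your data" can mean without a fixed schema, here is a small sketch that upgrades documents step by step. Documents are modeled as plain in-memory maps rather than through a MongoDB driver, and `DocumentMigrator`, the `schemaVersion` field, and the rename migration are all hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: versioned data migrations over schemaless documents.
final class DocumentMigrator {

    // Assumed bookkeeping field recording how many migrations a document has seen.
    static final String VERSION_FIELD = "schemaVersion";

    // migrations.get(i) upgrades a document from version i to version i + 1.
    // Documents already at the latest version are returned unchanged (as a copy).
    static Map<String, Object> migrate(
            Map<String, Object> doc,
            List<UnaryOperator<Map<String, Object>>> migrations) {
        Map<String, Object> current = new HashMap<>(doc);
        int version = (Integer) current.getOrDefault(VERSION_FIELD, 0);
        for (int i = version; i < migrations.size(); i++) {
            // Copy the result so a migration may return an immutable map.
            current = new HashMap<>(migrations.get(i).apply(current));
            current.put(VERSION_FIELD, i + 1);
        }
        return current;
    }

    // Example migration: rename the "name" field to "fullName".
    static Map<String, Object> renameNameToFullName(Map<String, Object> doc) {
        Map<String, Object> out = new HashMap<>(doc);
        if (out.containsKey("name")) {
            out.put("fullName", out.remove("name"));
        }
        return out;
    }
}
```

Because each document carries its own version, running the migrator twice is harmless, which is roughly the guarantee a change management tool gives you by tracking which changesets have already been applied.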