Our company is selecting the Databricks product Delta Lake as a lakehouse tool. We would like to manage schema evolution using Liquibase if possible. Can anyone advise whether Liquibase supports Delta Lake/Databricks?
Background:
Our raw ingest will be Parquet files on S3.
The Delta Lake format would be used for the next layers in the data lake: the ‘silver’ (trusted, de-duplicated events) and ‘gold’ (transformed) layers.
Delta Lake is an open-source format which, under the hood, is similar to Parquet but with additional metadata that allows more efficient queries than against plain Parquet and supports ACID-compliant operations.
We do not need to manage schemas in Parquet format.
We do wish to manage schemas in Delta Lake format.
Delta Lake does support some SQL operations such as ALTER TABLE (ALTER TABLE | Databricks on AWS).
If we cannot use Liquibase directly to modify Delta Lake schemas, could we reverse-engineer the schema from Delta Lake on, say, a daily basis for version control? I know this might seem an odd requirement.
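To make the daily reverse-engineering idea concrete, here is a minimal sketch of the version-control half only. It assumes you have already fetched the column list for a table as (name, type) pairs, for example from a DESCRIBE TABLE query; how those rows are fetched is left out, and the function names and file layout are hypothetical. The schema is written as stable JSON so that a daily run only produces a diff in version control when something actually changed.

```python
import json
from pathlib import Path


def schema_to_json(table_name, describe_rows):
    """Serialise (column_name, data_type) pairs to stable, diff-friendly JSON.

    `describe_rows` is assumed to be an ordered list of (name, type) pairs;
    column order is preserved because it is part of the table definition.
    """
    columns = [{"name": name, "type": dtype} for name, dtype in describe_rows]
    return json.dumps({"table": table_name, "columns": columns}, indent=2) + "\n"


def write_if_changed(path, text):
    """Write the snapshot only when it differs from the file already on disk.

    Returns True if the file was created or updated, False if it was unchanged.
    """
    path = Path(path)
    if path.exists() and path.read_text() == text:
        return False
    path.write_text(text)
    return True
```

A scheduled job could call `write_if_changed("schemas/silver.events.json", schema_to_json("silver.events", rows))` for each table and commit the directory whenever any call returns True.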
Many thanks, and sorry for asking such a basic question.
I can see that none of these technologies is currently listed as a supported database.
But you can actually extend Liquibase to add support for more databases. In fact, many of our supported databases have been contributed by the community.
Or, if you can’t or won’t do that, we would still love to hear the use case, either as an issue or right here, where members of the Product team are listening (like @mariochampion).
Hi @ronak, @r2liquibase,
Very sorry for not replying earlier; I thought I had. Yes, Ronak’s reply was very useful. We have discussed extending Liquibase for our Databricks needs; however, the case for doing so is not yet a priority and does not carry much weight.
We are currently using SQL for database and table definitions, and plan to patch manually when we first encounter a breaking schema change.
Databricks, like other ‘big data’ databases, does support ragged files/tables (appending columns) and other non-breaking changes, which in turn reduces our need for schema evolution tools.
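As a rough illustration of the breaking versus non-breaking distinction, the sketch below compares two schema snapshots, each a plain dict of column name to type. The function name and return shape are hypothetical. It treats appended columns as non-breaking and, as a deliberately conservative assumption, treats every removed column or type change as breaking, even though some type changes may be safe in practice.

```python
def classify_schema_change(old, new):
    """Compare two {column_name: data_type} dicts.

    Appended columns are reported as non-breaking. Removed columns and any
    type change are flagged as breaking (a conservative assumption).
    """
    old_cols, new_cols = set(old), set(new)
    added = sorted(new_cols - old_cols)
    removed = sorted(old_cols - new_cols)
    retyped = sorted(c for c in old_cols & new_cols if old[c] != new[c])
    return {
        "added": added,
        "removed": removed,
        "retyped": retyped,
        "breaking": bool(removed or retyped),
    }
```

With a check like this, only the changes where `breaking` is true would need a hand-written patch; purely additive changes could be left alone.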
So, in summary, we have not yet decided whether to extend Liquibase for Databricks or use a DDL patch method instead. I have a feeling this will not be decided until early next year, given our implementation timelines.
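For what it is worth, the DDL patch method could start out very small. The sketch below is only an illustration: it uses Python's built-in sqlite3 as a stand-in for a real Databricks SQL connection, and the `schema_patch_log` table and `apply_patches` function are hypothetical names. It applies patches in order and records each one so that re-running the script skips anything already applied.

```python
import sqlite3


def apply_patches(conn, patches):
    """Apply (patch_id, sql) pairs in order, skipping ones already recorded.

    `conn` is a sqlite3 connection used here as a stand-in for the real
    warehouse connection. Returns the list of patch ids applied in this run.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_patch_log (patch_id TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT patch_id FROM schema_patch_log")}
    newly_applied = []
    for patch_id, sql in patches:
        if patch_id in applied:
            continue  # already applied in an earlier run
        conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_patch_log (patch_id) VALUES (?)", (patch_id,)
        )
        newly_applied.append(patch_id)
    conn.commit()
    return newly_applied
```

The patch list itself could live in version control as numbered SQL files, which keeps the history of manual patches reviewable even without a dedicated tool.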
Many thanks, Richard