Redshift support

We are trying to use Liquibase with Redshift, and the first issue we have encountered are the Liquibase changelog tables.

The databasechangelog table has a column named TAG, and it seems it’s a reserved word in Redshift. We got around that issue on table creation (we create the Liquibase tables manually at the beginning).

However, on each Liquibase run, it does a query against that table to check if changelog already ran or not, and that query fails. What’s the best way to get this to work?

Is there any update to this issue? We’re having the same one. Did you make it work or just switched to using something else?

I haven’t gotten a chance to try liquibase on Redshift yet, you may want to try liquibase 3.0 as it made some improvements in the column escaping logic that may have fixed it. Let me know if it is still broken.


Nathan

I do not think “TIMESTAMP WITH TIME ZONE” is a valid Redshift data type.   I’ve not looked deeply in the issue.  While Liquibase connects properly (I assume) the transaction fails because of this databasechangeloglock table liquidbase uses.

liquibase.bat --driver=org.postgresql.Driver --changeLogFile=redbase.xml … generateChangeLog

Liquibase generateChangeLog Failed: org.postgresql.util.PSQLException: Unable to determine a value for MaxIndexKeys due to missing system catalog data.


liquibase.bat --driver=org.postgresql.Driver --changeLogFile=redbase.xml … DBDoc /docs

Liquibase DBDoc Failed: liquibase.exception.DatabaseException: Error executing SQL CREATE TABLE public.databasechangeloglock (ID INT NOT NULL, LOCKED BOOLEAN NOT NULL, LOCKGRANTED TIMESTAMP WITH TIME ZONE, LOCKEDBY VARCHAR(255), CONSTRAINT PK_DATABASECHANGELOGLOCK PRIMARY KEY (ID)): ERROR: Column “databasechangeloglock.lockgranted” has unsupported type “timestamp with time zone”.



By manually running the output from updateSQL, it seems to produce valid SQL

liquibase.bat --driver=org.postgresql.Driver --changeLogFile=redbase.xml … updateSQL

– *********************************************************************
– Update Database Script
– *************************************************************
– Change Log: redbase.xml
– Ran at: 9/16/13 4:43 PM
– Against: @jdbc:postgresql://
.redshift.amazonaws.com:
/*****db
– Liquibase version: 3.0.4
– *********************************************************************

– Create Database Lock Table
CREATE TABLE public.databasechangeloglock (ID INT NOT NULL, LOCKED BOOLEAN NOT NULL, LOCKGRANTED TIMESTAMP WITH TIME ZONE, LOCKEDBY VARCHAR(255), CONSTRAINT PK_DATABASECHANGELOGLOCK PRIMARY KEY (ID));

– Initialize Database Lock Table
DELETE FROM public.databasechangeloglock;

INSERT INTO public.databasechangeloglock (ID, LOCKED) VALUES (1, FALSE);

– Lock Database
– Create Database Change Log Table
CREATE TABLE public.databasechangelog (ID VARCHAR(63) NOT NULL, AUTHOR VARCHAR(63) NOT NULL, FILENAME VARCHAR(200) NOT NULL, DATEEXECUTED TIMESTAMP WITH TIME ZONE NOT NULL, ORDEREXECUTED INT NOT NULL, EXECTYPE VARCHAR(10) NOT NULL, MD5SUM VARCHAR(35), DESCRIPTION VARCHAR(255), COMMENTS VARCHAR(255), TAG VARCHAR(255), LIQUIBASE VARCHAR(20), CONSTRAINT PK_DATABASECHANGELOG PRIMARY KEY (ID, AUTHOR, FILENAME));

– Changeset redbase.xml::1::bob::(Checksum: 7:17a94fcdd73aa8852c1f53e4d06ee651)

CREATE TABLE public.department (id INT NOT NULL, name VARCHAR(50) NOT NULL, active BOOLEAN DEFAULT TRUE, CONSTRAINT PK_DEPARTMENT PRIMARY KEY (id));

INSERT INTO public.databasechangelog (ID, AUTHOR, FILENAME, DATEEXECUTED, ORDER EXECUTED, MD5SUM, DESCRIPTION, COMMENTS, EXECTYPE, LIQUIBASE) VALUES (‘1’, ‘bob’, ‘redbase.xml’, NOW(), 1, ‘7:17a94fcdd73aa8852c1f53e4d06ee651’, ‘Create Table’,’’, ‘EXECUTED’, ‘3.0.4’);



I tried to run a DIFF between my redshift cluster to a local postgres instance but a lot more work would be needed.

I tried it with liquibase 3.0.7 and it still doesn’t work. Even when I manually create the changelog table (by fixing the column name TAG by adding double-quotes: “TAG” and by removing “WITH TIME ZONE”) it doesn’t work, because in each select liquibase makes it uses unescaped column names, and it uses TAG.


If I have to run manually the updateSQL output in redshift, then I don’t need liquibase, in fact it only creates problems, as it’s not very user friendly to write the changelog xml (not documented well)


In other words the whole purpose of liquibase is that it does everything for you, and if you use it with redshift it’s just not the case, on the contrary, you do everything for liquibase…

I created https://github.com/liquibase/liquibase-redshift/releases/tag/snapshot-1 as a Liquibase extension to support redshift. It should correct the datetime problem and uses the redshift keywords. 


To use it, download the jar file from the github releases page and add it to your classpath.


Let me know how it works, I don’t have a redshift environment to test with at the moment.


Nathan

I fixed an issue so you will want at least snapshot #2 https://github.com/liquibase/liquibase-redshift/releases


Nathan

I missed overriding a method in the DateTimeType portion of the extension. I pushed a fix for it but haven’t done an official build yet.


You should be able to use dbms=“postgres” or dbms=“redshift” now. My logic for whether a connection is to redhsift or postgres is based on whether the url contains “.redshift.”. That is based on a quick googling that made it look like you used a jdbc:postgres URL when connecting to your redshift server but that the hostname is SOMETHING.redshift.SOMETHING.


Is there a better way to differentiate between redshift and postgres?

Let me know how to check the db URL and then I’ll do a new build with the date type fix.

Nathan

Hi Nathan,


the fix for the “TAG” works, but there’s still a problem with the “WITH TIME ZONE”:

ERROR: Column “databasechangeloglock.lockgranted” has unsupported type “timestamp with time zone”.


Also, is there a way to be able to differentiate between dbms jdbc:postgres (running for example on my local machine for development) and dbms: jdbc:redshift (in real Amazon Redshift). The reason is that in redshift we need to add things like:


       

           

       


Currently it only works if I use the context, but obviously it would be better to have the dbms.

Unfortunatelly there is no way I know of. Using the url might work for some - people that use the original bla-bla.us-east-1.redshift.amazonaws.com hostname. However I can imagine that it is not always the case. Many will use their own hostnames in their own domain I guess. It might be a better guess to look at the port. As I see it’s 5439. I’m not sure if it’s 100%, but I guess it’s more likely to stay like that (unless someone does some port forwarding magic) so maybe: “jdbc:postgresql:” and port:5439 might be a good guess.


Thanks for the info. I uploaded a new snapshot #3 with the date time fix, plus the isCorrectDatabaseImplementation logic is now:


StringUtils.trimToEmpty(System.getProperty(“liquibase.ext.redshift.force”)).equalsIgnoreCase(“true”)

                || conn.getURL().contains(".redshift.")

                || conn.getURL().contains(":5439");


so something with Nathan

Downloaded snapshot#3, and at first I thought it doesn’t work at all, but then I found out that it INDEED WORKS and both features I tested work (dbms=“redshift”, and the create databasechange* without the “WITH TIME ZONE”). At least when I run “updateSQL”.


However when I try to run it with “update”, this is what I get:


Liquibase update Failed: Error executing SQL SELECT FILENAME,AUTHOR,ID,MD5SUM,DATEEXECUTED,ORDEREXECUTED,TAG,EXECTYPE,DESCRIPTION,COMMENTS FROM public.databasechangelog ORDER BY DATEEXECUTED ASC, ORDEREXECUTED ASC: ERROR: syntax error at or near “TAG”

  Position: 61


I guess, this is a select in some other part of the code that doesn’t take in consideration that TAG is a reserved keyword in redshift.


There is one more thing I miss: the 1st time it didn’t work because I ran it from gradle. How can I make it work from there as well? Now it only worked when I copied the postgres driver next to liquibase.jar and liquibase-redshift-1.0-SNAPSHOT.jar and used it from the command line.


Gavriel

Yes, that select is from a separate spot. I can fix that for the upcoming 3.0.8 release since it is in the liquibase code, not in the extension.


I haven’t used gradle enough to know how it handles classpaths. You need the postgres driver, liquibase.jar and liquibase-redshift.jar in the same classpath for liquibase to find the driver and the extension.


Nathan

Unfortunatelly I’m back with the “TIMESTAMP WITH TIME ZONE” problem. Although it doesn’t happen on the changelog tables, but if I use type=“TIMESTAMP” in any of my changesets, it still adds the “WITH TIME ZONE”. Do you know how could it be removed?


Another problem is NOW(). When liquibase updates the changelog it uses NOW(), but it’s not supported in RedShift, we should use GETDATE() instead, see: http://docs.aws.amazon.com/redshift/latest/dg/r_NOW.html, http://docs.aws.amazon.com/redshift/latest/dg/r_GETDATE.html


UPDATE:

Nathan, snapshot #4 works! (but for some strange reason the forum doesn’t let me response to tour post or the thread…)

I pushed up a change to fix the timestamp issue (I fixed it for the ‘datetime’ type, but liquibase manages ‘timestamp’ separately). The fix should also use getdate() instead of now().


I published a new 1.0 snapshot release at https://github.com/liquibase/liquibase-redshift/releases/tag/snapshot-4 with the fixes.


Let me know how it works and if it seems ready for the official 1.0 release since I don’t have an actual redshift database to test against.


Nathan

I know this is a really old thread, but is liquibase-redshift still under development? It seems like it stalled in 2015. I’m using the 1.0 release and it’s still failing on TAG being a reserved word in Redshift.

I have fixed several bugs with liquibase redshift support. Could one of you review my changes?

https://github.com/liquibase/liquibase-redshift/pull/7

Thanks,

-Michael