Incorrect default charset for formatted SQL files via Maven plugin


I have been trying to solve a problem with UTF-8 encoded SQL scripts run via the Maven plugin. Data is being inserted into the database with encoding “windows-1252”, even though the file.encoding system property is set to “UTF-8”.

I tried various workarounds (Maven settings, plugins, etc.), but to no avail. Working with the code, I see that even though the file.encoding system property is UTF-8, Charset.defaultCharset() still returns “windows-1252”.
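One possible explanation, offered as a sketch rather than a diagnosis: the Javadoc for Charset.defaultCharset() says the default charset is determined during virtual-machine startup, so setting file.encoding once the JVM is already running (for example from inside a Maven build) would not be expected to change it. The class and method names below are made up for illustration and are not part of Liquibase.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not Liquibase code: compares the file.encoding
// property with the charset the JVM actually uses as its default.
class CharsetCheck {

    /** One-line report of the property value and the effective default charset. */
    static String report() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    /**
     * Sets file.encoding at runtime and reports whether the default charset
     * changed as a result. The original property value is restored afterwards.
     */
    static boolean defaultChangesAtRuntime(String newEncoding) {
        Charset before = Charset.defaultCharset();
        String oldValue = System.getProperty("file.encoding");
        System.setProperty("file.encoding", newEncoding);
        try {
            return !before.equals(Charset.defaultCharset());
        } finally {
            if (oldValue != null) {
                System.setProperty("file.encoding", oldValue);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(report());
        System.out.println("changed at runtime: " + defaultChangesAtRuntime("ISO-8859-1"));
    }
}
```

Running this inside the same JVM that executes the plugin should show whether the property and the effective default actually disagree there.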

Further testing (via Groovy scripts) seems to confirm that file.encoding has no effect on the default charset, contrary to the information I found online:

    public UtfBomAwareReader(InputStream in, String defaultCharsetName) {

A little more research led me to a Stack Overflow page which mentions that Charset.defaultCharset() is not influenced by -Dfile.encoding, and which also says: “The preferred way to change the default encoding used by the VM and the runtime system is to change the locale of the underlying platform before starting your Java program.”

It seems to me that relying on Charset.defaultCharset() leads to portability problems; I would prefer not to have to ask each developer to adjust their environment just to run Liquibase. I suspect the most portable solution would be to introduce a “defaultCharset” parameter to the Maven plugin, as relying on the file.encoding system property is also not a great solution (although I can set this via a Maven plugin).
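To illustrate the idea, here is a minimal sketch of reading a script with an explicitly supplied charset name and falling back to the platform default only when none is given. The class, the method, and the charsetName argument (standing in for the proposed “defaultCharset” parameter) are assumptions for illustration, not existing Liquibase or Maven plugin API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Liquibase code.
class ExplicitCharsetRead {

    /**
     * Decodes the whole stream using charsetName if it is given, otherwise
     * using the platform default charset.
     */
    static String readAll(InputStream in, String charsetName) {
        Charset charset = (charsetName != null)
                ? Charset.forName(charsetName)
                : Charset.defaultCharset();
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // \u00e9 is "e" with an acute accent, stored here as UTF-8 bytes.
        String sql = "INSERT INTO t VALUES ('caf\u00e9');";
        byte[] utf8 = sql.getBytes(StandardCharsets.UTF_8);

        String good = readAll(new ByteArrayInputStream(utf8), "UTF-8");
        String bad = readAll(new ByteArrayInputStream(utf8), "windows-1252");

        System.out.println("UTF-8 round trip intact:        " + good.equals(sql));
        System.out.println("windows-1252 round trip intact: " + bad.equals(sql));
    }
}
```

With an explicit charset, the result no longer depends on each developer's locale or JVM flags, which is the portability concern raised above.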

Charset handling is something high on my list of things to fix up, but also something I’ve not really dealt with a lot so I need to do some research.

I am working on adding a standard configuration system within Liquibase so you can set configs via system properties, Maven, etc., which will help with managing the charset.

I added your comments to so I don’t lose them. Thanks for the patch and the research.


Nathan, it would be awesome if this could be done through Liquibase. Thank you so much.