Incorrect default charset for formatted sql files via maven plugin

GaryHodgson · May 8, 2014, 5:38pm

Hi,

I have been trying to solve a problem with UTF-8 encoded SQL scripts run via the Maven plugin. Data is being inserted into the database with encoding “windows-1252”, even though the file.encoding system property is set to “UTF-8”.

I tried various ways to work around this (Maven settings, plugins, etc.), but to no avail. Working with the code, I see that even though the system property file.encoding is UTF-8, Charset.defaultCharset() still returns “windows-1252”.

Further testing (via Groovy scripts) seems to confirm that file.encoding has no effect on the default charset, contrary to the information I found online:

public UtfBomAwareReader(InputStream in, String defaultCharsetName) {

un1399545376150r55id · May 8, 2014, 5:38pm

A little more research led me to the following Stack Overflow page (http://stackoverflow.com/questions/8177089/charset-defaultcharset-get-different-result-under-jdk1-7-and-jdk-1-6), which mentions that Charset.defaultCharset() is not influenced by -Dfile.encoding. It also says: “The preferred way to change the default encoding used by the VM and the runtime system is to change the locale of the underlying platform before starting your Java program.”
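For anyone who wants to check this on their own machine, here is a minimal sketch. It only covers the case where file.encoding is changed after the JVM has started (which is what setting the property from inside a build effectively does); as far as I know the JVM caches the default charset, so the change should be ignored. The class and method names are made up for illustration, and the effect of passing -Dfile.encoding on the command line may differ between JDK versions.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {

    /**
     * Changes file.encoding at runtime and reports whether
     * Charset.defaultCharset() stayed the same. The property is
     * restored afterwards so the check has no lasting side effect.
     */
    static boolean defaultCharsetIgnoresRuntimeProperty() {
        Charset before = Charset.defaultCharset();
        String original = System.getProperty("file.encoding");
        // Pick a value that differs from the current default.
        String other = before.name().equalsIgnoreCase("UTF-16") ? "ISO-8859-1" : "UTF-16";
        try {
            System.setProperty("file.encoding", other);
            return Charset.defaultCharset().equals(before);
        } finally {
            if (original != null) {
                System.setProperty("file.encoding", original);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());
        System.out.println("unchanged after setProperty: "
                + defaultCharsetIgnoresRuntimeProperty());
    }
}
```

If the last line prints true, the property and the default charset can disagree, which matches what is described above.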

It seems to me that relying on Charset.defaultCharset() leads to portability problems, i.e. I would prefer not to have to ask each developer to adjust their environment just to run Liquibase. I suspect the most portable solution would be to introduce a “defaultCharset” parameter to the Maven plugin, as relying on the file.encoding system property is also not a great solution (although I can set this via a Maven plugin).
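To illustrate why an explicit parameter would help, here is a rough sketch of decoding the same script bytes with a named charset instead of the platform default. This is not Liquibase code: the class, the method, and the charsetName argument (standing in for the proposed “defaultCharset” parameter) are all hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDecode {

    /**
     * Decodes script bytes with the given charset name.
     * charsetName stands in for a hypothetical plugin parameter;
     * null falls back to the platform default charset.
     */
    static String decode(byte[] bytes, String charsetName) {
        Charset cs = (charsetName != null)
                ? Charset.forName(charsetName)
                : Charset.defaultCharset();
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // "café" as it would be stored in a UTF-8 encoded SQL file.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded with the charset the file was written in.
        System.out.println(decode(utf8, "UTF-8"));

        // Decoded as windows-1252: the two UTF-8 bytes of "é"
        // turn into two separate characters.
        System.out.println(decode(utf8, "windows-1252"));
    }
}
```

With an explicit charset name the result no longer depends on each developer's locale or JVM settings, which is the portability point made above.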

nvoxland · May 8, 2014, 5:38pm

Charset handling is high on my list of things to fix up, but it is also something I’ve not really dealt with a lot, so I need to do some research.

I am working on adding a standard configuration system within Liquibase so you can set configs via system properties, Maven, etc., which will help with managing the charset.

I added your comments to https://liquibase.jira.com/browse/CORE-1503 so I don’t lose them. Thanks for the patch and the research.

Nathan

un1387815775343r70id · May 8, 2014, 5:38pm

Nathan, it would be awesome if this could be done through Liquibase. Thank you so much.
