XML parsing error?

I’ve just tried to create a stored procedure in a liquibase changelog file using the tag.

The migration fails with (among other things) this weird error:

Caused By: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘}" tableName=“Shelter”><colLT FALSE;
     DECLARE c CURSOR FOR
       SELECT b’ at line 5

Obviously I would not write such invalid SQL.  I looked in the log to see what Liquibase was trying to execute, and lo and behold.  Here’s a snippet.  See line 5:

Error executing SQL CREATE PROCEDURE fixBreadcrumbs() MODIFIES SQL DATA
   BEGIN
     DECLARE newCurrentQuestionID VARCHAR(30);
     DECLARE newSessionID BIGINT(20);
     DECLARE maxId BIGINT(20);maName}" tableName=“Shelter”><colLT FALSE;
     DECLARE c CURSOR FOR
       SELECT bc.priorQuestionID, bc.sessionID
       FROM sessionbreadcrumb bc

That bizarre fragment–see the maName}" tableName=“Shelter”><col part?–is from an element waaaay up earlier in the changelog.xml.  Needless to say, I didn’t incorporate this piece of text in my stored procedure SQL.

Is there some weird parameter substitution going on here that shouldn’t be going on?

In case it helps, here’s the source of that snippet:

CREATE PROCEDURE fixBreadcrumbs() MODIFIES SQL DATA
    BEGIN
      DECLARE newCurrentQuestionID VARCHAR(30);
      DECLARE newSessionID BIGINT(20);
      DECLARE maxId BIGINT(20);
      DECLARE done BOOLEAN DEFAULT FALSE;
      DECLARE c CURSOR FOR
        SELECT bc.priorQuestionID, bc.sessionID
        FROM sessionbreadcrumb bc

Are you by any chance reusing StringBuffers or StringBuilders and forgetting to setLength(0); someplace?

Or maybe line 616 of XMLChangeLogSAXHandler.java?  Don’t have time to debug at the moment.

Hard to tell if I’ve definitively “fixed” it, but moral of the story so far: always enclose your tag body in a section!

I think that what might be going on here is a combination of a sneaky bug in Liquibase and good old fashioned dumb and careless behavior on my part.

The sneaky bug might have to do with the way that SAX reads and deals with CHARACTERS events from an XML stream.  I know that in StAX you can tell your parser to coalesce multiple back-to-back occurrences of such events, but when you do, you lose CDATA information.  Anyhow, it appears that something about adding CDATA specifiers back into the mix corrected the weird substitution Liquibase appeared to be doing.

I will still put my money on the fact that there’s a StringBuffer being recycled badly somewhere.

Best,
Laird

That is very odd.  The XMLChangeLogSAXHandler seems like the class that woudl have the issue in it.  There is a StringBuffer text attribute that is built up by the SAX characters(ch, start, length) callback handler, but that code looks like it is taking the start and length into account correctly.  The text StringBuffer gets cleared out in endElement and startElement as well, but I if that was a problem, I would expect your bad string to have additional text at the beginning or end, not in the middle, especially since we only update it with text.append().

Without being able to reproduce it, it almost looks like the SAX parser is passing us bad data in the characters callback.  What java and/or xml parsers are you using?   

Good to hear you have a work around at least.

Nathan

Short answer: Java 6u22 on a Mac; the parser is whatever Glassfish 3.1 uses.  (I’m running the Liquibase servlet listener.)

Now, before the changelog ever gets to Liquibase, it has been cobbled together out of other changelogs.  I use StAX to grab a bunch of changelogs in dependency order in my project’s Maven build, and stitch them together (obviously everything after the element, so that I spit out valid XML).  For reasons that I can go into if you wish, I can’t use the tag since among other things I’m in an .ear file.  In case it matters, I use the Woodstox StAX library to assemble this (valid) composite changelog.xml file.

Also, the size of the file is really quite large, in case that makes a difference.  I would be more than happy to send it to you.

Best,
Laird

Did you check to see if the generated changelog has the bad sql text in it, even if your original is good?

I would be interested to know why you can’t use the tag.  If there are things we can do to have it more usable, I would like to know.  If you use with a path to a classpath accessable file, it should find it, even if it is in an ear.

Nathan

Generated changelog.xml SQL was fine.

The include problem goes like this:

In unit tests, I have a JUnit test case that does a ClassLoader.getResources("/META-INF/liquibase/changelog.xml").  The changelog.xmls that are present on the classpath are thus handed back in guaranteed dependency order.  So each Maven module defines a changelog.xml “nugget” that doesn’t have to say anything about its dependencies (short of foreign keys–i.e. references need to refer to table names that are defined “upstream”).

This works great for development and unit testing.  Everything Just Works ™.

When I build the .ear file, the jars that are included as dependencies are of course located in the lib directory.  Or they might not be (in Java EE 6).  I can’t make any assumption about where they are located.  So I don’t know their path.  And of course the path at this point is the only thing that can be used to distinguish all those otherwise-identically-named changelog.xml files (recall that all changelog.xml nuggets are in /META-INF/liquibase).  So I’m in a place where I can’t rely on dependency order any more, and can’t rely, obviously, on Maven any more, so I’m left with having to generate the One True Changelog right before I leave Maven-land.

Best,
Laird

About the tag: one thing you could have done is use the logicalFilePath parameter on to give each changeSet a unique name as far as liquibase is concerned, even if their file paths are the same.  If you have multiple files all in the /meta-inf/liquibase/changelog.xml location, however, the tag may be confused by getting multiple files for the same file name.  I generally give each changelog file a unique name and/or package based on it’s feature which would allow you to have a master changelog that imports all the sub changelogs, although you can still run one at a time for testing purposes. If your setup is working, I’m not suggesting changing it, but figured I would point it out for anyone else following along.

If you could send your generated changelog to me to debug, that would help.  It seems like it should work, and I haven’t been seeing anything similar in my testing.  If you’d prefer, you can email it to me at nathan [at] voxland.net

Nathan

I do give all the changelog.xml “nuggets” unique logical file paths.  They do not in fact resemble paths; they are made up of the Maven groupId and artifactId, subbed in by Maven filtering mechanisms.  Are you saying I could include them using logicalFilePaths, not physical?  That would be great.

I’ll send you the composite changelog to your email address.

Best,
Laird