First off, thanks for spending the time on a great product. I recently started using it for a project.
I’m guessing this has been discussed before, but a quick forum search didn’t turn anything up. I have a suggestion that would greatly improve the usability from my perspective, and I’m hoping you would agree.
Currently the documentation states:
Each changeSet tag is uniquely identified by the combination of the “id” tag, the “author” tag, and the changelog file classpath name.
I’ve confirmed that this is the case. I can see the rationale behind it, but I’d like to make a suggestion.
Instead of a file path (which really has little to do with what the change is), the tool could take the changeset XML, normalize or canonicalize it so that the commands are what matter rather than things like whitespace, and then calculate an MD5 or something similar from the result. That could accomplish the same objective that I think the file path was serving.
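To make the normalize-then-hash idea concrete, here is a rough sketch in Python. It is only an illustration of the suggestion, not how the tool computes anything today. The function name `changeset_fingerprint` is made up, and it assumes Python 3.8+ for `xml.etree.ElementTree.canonicalize`.

```python
import hashlib
import xml.etree.ElementTree as ET


def changeset_fingerprint(changeset_xml: str) -> str:
    """Hypothetical helper: hash a changeset by its canonical XML form."""
    # C14N sorts attributes and expands empty elements; strip_text=True
    # additionally trims leading/trailing whitespace in text content,
    # so indentation and blank lines between tags do not affect the hash.
    canonical = ET.canonicalize(changeset_xml, strip_text=True)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


pretty = """
<changeSet id="1" author="bob">

    <createTable tableName="person"/>
</changeSet>
"""
compact = '<changeSet author="bob" id="1"><createTable tableName="person"></createTable></changeSet>'
renamed = '<changeSet author="bob" id="1"><createTable tableName="people"/></changeSet>'

same = changeset_fingerprint(pretty) == changeset_fingerprint(compact)
different = changeset_fingerprint(pretty) != changeset_fingerprint(renamed)
```

With this approach, reformatting the file leaves the fingerprint alone, while an actual change to a command produces a new one. Note that `strip_text` also trims whitespace at the edges of embedded SQL text, which may or may not be what you want.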
Also, as another suggestion: instead of treating a changed file path (or, fingers crossed, a changed normalized MD5 over the XML) as a new changeset, it would be nice if the tool simply raised an error saying that the changeset differs from the one previously identified by id + user.
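A minimal sketch of the "error instead of re-run" behavior I have in mind, again in Python and purely hypothetical. `check_changeset`, `ChangesetMismatchError`, and the in-memory `applied` dict are invented for illustration (a real tool would keep this record in a database table), and the content passed in is assumed to be normalized already.

```python
import hashlib


class ChangesetMismatchError(Exception):
    """Raised when a known id + author pair shows up with different content."""


def check_changeset(applied: dict, cs_id: str, author: str, content: str) -> bool:
    """Return True if the changeset still needs to run, False if already applied.

    Identity is id + author only; the content hash is used for verification.
    """
    key = (cs_id, author)
    digest = hashlib.md5(content.encode("utf-8")).hexdigest()
    if key not in applied:
        applied[key] = digest  # first time seen: record it and run it
        return True
    if applied[key] != digest:
        # Same identity, different content: fail loudly instead of
        # silently treating it as a brand-new changeset.
        raise ChangesetMismatchError(
            f"changeset {cs_id} by {author} differs from what was applied before"
        )
    return False


applied = {}
first_run = check_changeset(applied, "1", "bob", "<createTable tableName='t'/>")
second_run = check_changeset(applied, "1", "bob", "<createTable tableName='t'/>")
```

Here the first call reports that the changeset needs to run, the second reports that it is already applied, and a third call with altered content would raise rather than reapply.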
The filename is used so that you only have to worry about uniqueness of id/author combinations within a single changelog file. There are use cases of liquibase where you will get duplicate identifiers if you don’t use the filename as well.
We do have the ability to list a logical filepath for each changeLog in case you want to move your changelog file, and you can specify an original filename for a changeset that you moved from another file.
I’m not sure I understand what you are asking about the md5sum check, sorry. Are you thinking about generating an id by hashing several values together like git does?
I’m at a different organization now, and asked my team to use liquibase. Without me prompting them on use, they have experienced the same issue.
What I was asking for is similar to how git works.
Whitespace is normalized when XML parsing happens. It doesn’t matter if I have two spaces after the closing brace or one. It doesn’t matter if I have two carriage returns or one on an empty line. The current behavior is that those changes would cause it to reapply the changeset. This is why I’m suggesting that the CRC be calculated off a parse tree instead of a file. And if it is calculated off a file, the file should be “tidy’d” first so whitespace changes don’t cause problems.
The filename may be a way to identify a changeset/file combination, but the problem is that the path is taken into account. That makes sense as a way to allow similarly named files, but it breaks other things. A better implementation would have been to operate the way git does: changeset + file hash (based on normalized content), with nothing to do with the file path.