Thanks for the summary, Laird. This is as good of a thread as any to discuss how we want the type conversion logic to work for 2.1.
Another use case I would throw in is database diff/changelog generation. When you generate a changelog from a database, should the changelog include the liquibase types or the database-specific types? Similarly, if you are comparing two databases (including the “hibernate configuration” database), how do you determine if datatypes are equal? Do you do database-specific comparisons or attempt to convert to a liquibase standard type?
There is also the fact that liquibase doesn’t currently explicitly handle “standard” datatypes that the underlying database supports, so you can have type=“int” and it will work on oracle even though we don’t check for an “int” type and convert it to a “number(10)”. This currently confuses people when they do a generateChangeLog and see different types.
I do like the idea of more explicit syntax in the changelog file to control what is meant. I think that is a necessary fallback option (like the tag). We do need to be careful about backwards compatibility too, which I think means that type=“XXX” will need to use the liquibase types.
The java.sql.Types mappings have always been a bit weird in my mind. They were suggested and included by someone else, and I think a standard liquibase type system should be enough. We can’t get the database types directly from the JDBC driver for a given java.sql.Types constant, so in the end it is just an alias for a standard liquibase type. I think this is a fine standard to continue with, but with better documentation on what the java.type->liquibase type is per database.
I like the idea of thinking of types as more descriptive than just standard datatypes. At the same time, database standard types bring with them a lot of built-in meaning with less to type. For example, varchar(20) vs char(20) give the same meaning as string(size=20, padded=false) vs string(size=20, padded=true). Using the SQL99 datatypes as a basis I think makes sense, but there are other types such as CLOB, BLOB, TEXT, etc. that aren’t well defined.
What about this as a suggestion:
- When liquibase sees a string in a type attribute, it assumes it is a liquibase type, i.e. something that should be converted to a database type.
- If the type attribute starts with a “db:”, that means the type is just to be passed directly along to the database with no conversion.
- Liquibase types use the following syntax: TYPE(STANDARD_TYPE_PARAM(s), [LIQUIBASE_TYPE_PARAMS])
For example, “varchar(30)”, “varchar(30, [fullTextIndexed=true,clob=false])”
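To make the proposed syntax concrete, here is a rough sketch of how such a type attribute could be split into its parts. ParsedType and its fields are hypothetical names for illustration only, not existing liquibase classes, and the handling of malformed input is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for a parsed type attribute; not an existing liquibase class. */
final class ParsedType {
    final boolean passThrough;   // true when the attribute started with "db:"
    final String name;           // e.g. "varchar", or the raw database type for pass-through
    final List<String> standardParams = new ArrayList<>();
    final Map<String, String> liquibaseParams = new LinkedHashMap<>();

    private ParsedType(boolean passThrough, String name) {
        this.passThrough = passThrough;
        this.name = name;
    }

    static ParsedType parse(String attribute) {
        String text = attribute.trim();
        if (text.startsWith("db:")) {
            // Everything after the prefix goes to the database untouched.
            return new ParsedType(true, text.substring(3).trim());
        }
        int open = text.indexOf('(');
        if (open < 0) {
            return new ParsedType(false, text); // bare type such as "int"
        }
        if (!text.endsWith(")")) {
            throw new IllegalArgumentException("Missing ')' in type: " + attribute);
        }
        ParsedType parsed = new ParsedType(false, text.substring(0, open).trim());
        String inner = text.substring(open + 1, text.length() - 1);
        String standard = inner;
        int bracket = inner.indexOf('[');
        if (bracket >= 0) {
            int close = inner.lastIndexOf(']');
            if (close < bracket) {
                throw new IllegalArgumentException("Missing ']' in type: " + attribute);
            }
            standard = inner.substring(0, bracket);
            // key=value pairs inside the square brackets
            for (String pair : inner.substring(bracket + 1, close).split(",")) {
                if (pair.trim().isEmpty()) {
                    continue;
                }
                String[] keyValue = pair.split("=", 2);
                if (keyValue.length != 2) {
                    throw new IllegalArgumentException("Expected key=value but got: " + pair);
                }
                parsed.liquibaseParams.put(keyValue[0].trim(), keyValue[1].trim());
            }
        }
        // Positional parameters before the square brackets
        for (String param : standard.split(",")) {
            if (!param.trim().isEmpty()) {
                parsed.standardParams.add(param.trim());
            }
        }
        return parsed;
    }
}
```

With this sketch, “varchar(30, [fullTextIndexed=true,clob=false])” yields the name varchar, one standard parameter and two liquibase parameters, while “db:number(10)” is kept as a single raw string.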
Implementation Details/Rationale:
The above syntax gives a more natural progression from database-specific types, to standard datatypes that are mapped by the database, to datatypes managed by liquibase. We would also be able to move how a particular string is handled (like “boolean”) from release to release, from being a database pass-through to a liquibase type, without changing the changelog file.
In 2.0, we have the start of standard classes per liquibase datatype, but they do not really work like the rest of the extension system does. We should have a “name” column and a DataTypeFactory, like we have with the ChangeFactory, where we look up the correct datatype by name (with the ability to add datatypes and/or override the core DataTypes with extensions) and instantiate a new instance of it. There would be a fallback RawDataType class if no registered type is found.
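A minimal sketch of that lookup, assuming a simple name-to-constructor map. The class shapes here (DataType, VarcharDataType, the register and create methods) are made up for illustration and are not the 2.0 code; only the factory-plus-fallback idea comes from the paragraph above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical minimal datatype contract for this sketch. */
interface DataType {
    String getName();
}

/** Example of a core type that an extension could override. */
final class VarcharDataType implements DataType {
    public String getName() {
        return "varchar";
    }
}

/** Fallback used when no registered type matches the name. */
final class RawDataType implements DataType {
    private final String raw;

    RawDataType(String raw) {
        this.raw = raw;
    }

    public String getName() {
        return raw;
    }
}

/** Hypothetical factory: looks up a type by name and returns a fresh instance. */
final class DataTypeFactory {
    private final Map<String, Supplier<DataType>> registry = new HashMap<>();

    /** A later registration for the same name replaces the earlier one. */
    void register(String name, Supplier<DataType> constructor) {
        registry.put(name.toLowerCase(Locale.ROOT), constructor);
    }

    DataType create(String name) {
        Supplier<DataType> constructor = registry.get(name.toLowerCase(Locale.ROOT));
        return constructor != null ? constructor.get() : new RawDataType(name);
    }
}
```

Registering with `factory.register("varchar", VarcharDataType::new)` and then calling `factory.create("VARCHAR")` returns a new VarcharDataType, while an unregistered name such as "geometry" comes back as a RawDataType carrying the original string.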
We would then have a setParameters(String[]) method for passing in the STANDARD_TYPE_PARAM(s). For each of the LIQUIBASE_TYPE_PARAMS key/value sets, we would use get/set methods on the instance. This would allow us to have some properties in superclasses and some in subclasses and extensions. If a property is not supported, an exception is thrown.
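One way the get/set idea could look, sketched with plain reflection. BaseType, StringType, applyProperties and the two example properties are hypothetical names; the only parts taken from the proposal are setParameters(String[]), setter-based properties, and throwing when a property is not supported.

```java
import java.lang.reflect.Method;
import java.util.Map;

/** Hypothetical base class: positional parameters plus setter-based properties. */
abstract class BaseType {
    protected String[] parameters = new String[0];

    void setParameters(String[] parameters) {
        this.parameters = parameters.clone();
    }

    /** Applies each key/value pair through a matching public setter, or fails. */
    void applyProperties(Map<String, String> properties) {
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            String key = entry.getKey();
            if (key.isEmpty()) {
                throw new IllegalArgumentException("Empty property name");
            }
            String setterName = "set" + Character.toUpperCase(key.charAt(0)) + key.substring(1);
            Method setter = findSetter(setterName);
            if (setter == null) {
                throw new IllegalArgumentException(
                        "Property '" + key + "' is not supported by " + getClass().getSimpleName());
            }
            try {
                setter.invoke(this, convert(entry.getValue(), setter.getParameterTypes()[0]));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not set property '" + key + "'", e);
            }
        }
    }

    // Searches public methods, so setters from superclasses and subclasses are both found.
    private Method findSetter(String name) {
        for (Method method : getClass().getMethods()) {
            if (method.getName().equals(name) && method.getParameterCount() == 1) {
                return method;
            }
        }
        return null;
    }

    // Only boolean, int and String setters are handled in this sketch.
    private static Object convert(String value, Class<?> target) {
        if (target == boolean.class || target == Boolean.class) {
            return Boolean.parseBoolean(value);
        }
        if (target == int.class || target == Integer.class) {
            return Integer.parseInt(value);
        }
        return value;
    }
}

/** Example type with two illustrative properties. */
class StringType extends BaseType {
    private boolean fullTextIndexed;
    private boolean clob;

    public void setFullTextIndexed(boolean fullTextIndexed) {
        this.fullTextIndexed = fullTextIndexed;
    }

    public boolean isFullTextIndexed() {
        return fullTextIndexed;
    }

    public void setClob(boolean clob) {
        this.clob = clob;
    }

    public boolean isClob() {
        return clob;
    }
}
```

Because the setter lookup walks the public methods of the runtime class, an extension subclass can add its own properties without the base class knowing about them.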
The DataType interface would have a toDatabaseType(database) [or the Database interface would have a toDatabaseType(datatype)] method which would take the object and all its parameters and make a string to pass to the underlying database. If a type’s configuration is not valid for the given database (like too long of a varchar or an unsupported type) then an exception is thrown from the toDatabaseType() method.
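As a sketch of the first variant, with the conversion living on the datatype: the Database interface below is a stripped-down stand-in, not the real liquibase interface, and the length limits are illustrative values supplied by the caller. The int to number(10) mapping on oracle is the one mentioned earlier in this post.

```java
/** Hypothetical, stripped-down view of a target database for this sketch. */
interface Database {
    String getShortName();

    int getMaxVarcharLength(); // illustrative limit, supplied by the caller
}

final class SimpleDatabase implements Database {
    private final String shortName;
    private final int maxVarcharLength;

    SimpleDatabase(String shortName, int maxVarcharLength) {
        this.shortName = shortName;
        this.maxVarcharLength = maxVarcharLength;
    }

    public String getShortName() {
        return shortName;
    }

    public int getMaxVarcharLength() {
        return maxVarcharLength;
    }
}

/** The datatype owns the conversion and rejects configurations the database cannot hold. */
final class VarcharType {
    private final int size;

    VarcharType(int size) {
        this.size = size;
    }

    String toDatabaseType(Database database) {
        if (size > database.getMaxVarcharLength()) {
            throw new IllegalArgumentException("varchar(" + size + ") is too long for "
                    + database.getShortName());
        }
        String keyword = "oracle".equals(database.getShortName()) ? "VARCHAR2" : "VARCHAR";
        return keyword + "(" + size + ")";
    }
}

final class IntType {
    String toDatabaseType(Database database) {
        // "int" becomes number(10) on oracle, as described earlier in the post.
        return "oracle".equals(database.getShortName()) ? "NUMBER(10)" : "INT";
    }
}
```

The other variant would move the same branching into each Database implementation, which keeps database quirks in one place but means every new datatype touches every database class.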
Going from a database diff, the DatabaseType objects can register the JDBC types they represent and the diff tool would use the DataTypeRegistry to instantiate the correct DataType object, then call a method on the DataType passing in its metadata and it would be up to the DataType instance to populate its own parameters.
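The reverse direction could be sketched roughly like this. ColumnMetadata, DiffDataType and the method names are hypothetical, and the metadata is reduced to the three values a JDBC driver typically reports for a column (type code, type name, size). java.sql.Types is the only real API used.

```java
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical subset of the column metadata a diff would read from the driver. */
final class ColumnMetadata {
    final int jdbcType;     // a java.sql.Types constant
    final String typeName;  // database-specific type name
    final int columnSize;

    ColumnMetadata(int jdbcType, String typeName, int columnSize) {
        this.jdbcType = jdbcType;
        this.typeName = typeName;
        this.columnSize = columnSize;
    }
}

/** Hypothetical contract: each type fills in its own parameters from metadata. */
interface DiffDataType {
    void populateFrom(ColumnMetadata metadata);

    String describe();
}

final class VarcharDiffType implements DiffDataType {
    private int size;

    public void populateFrom(ColumnMetadata metadata) {
        size = metadata.columnSize;
    }

    public String describe() {
        return "varchar(" + size + ")";
    }
}

/** Fallback that keeps the database-specific name as a pass-through type. */
final class RawDiffType implements DiffDataType {
    private String typeName;

    public void populateFrom(ColumnMetadata metadata) {
        typeName = metadata.typeName;
    }

    public String describe() {
        return "db:" + typeName;
    }
}

final class DataTypeRegistry {
    private final Map<Integer, Supplier<DiffDataType>> byJdbcType = new HashMap<>();

    /** A type registers every JDBC type code it can represent. */
    void register(Supplier<DiffDataType> constructor, int... jdbcTypes) {
        for (int jdbcType : jdbcTypes) {
            byJdbcType.put(jdbcType, constructor);
        }
    }

    DiffDataType fromMetadata(ColumnMetadata metadata) {
        Supplier<DiffDataType> constructor = byJdbcType.get(metadata.jdbcType);
        DiffDataType type = constructor != null ? constructor.get() : new RawDiffType();
        type.populateFrom(metadata);
        return type;
    }
}
```

Here a column reported as Types.VARCHAR with size 30 would come back as varchar(30) regardless of the database-specific name, and anything unregistered falls back to a “db:” pass-through string.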
The general idea is to encapsulate all the to/from liquibase types into a single extensible class. I go back and forth on whether the logic of creating the database string for a given type belongs in the database instance or the datatype instance.
Thoughts?
Nathan