Sounds like a great idea.
Steve Donie
Principal Software Engineer
Datical, Inc. http://www.datical.com/
Hi,
I’m not complaining about the checksumming part. I know what it is good for and why it is needed.
My problem is the loading of the data itself.
If I understand the code correctly, Liquibase loads the data in three stages: it reads the whole file and creates SqlStatement objects, generates the INSERT statements from them, and then executes those statements.
But there are two problems with this:
Memory consumption:
Step 1 always loads all rows into memory: a table with 100 million rows and 3 GB on disk will need at least 12-16 GB of heap space. It would be far better to have a multi-stage, multi-threaded system: one thread reads the file, creates the SqlStatement objects, and pushes them onto a queue; a second thread takes the SqlStatement objects from the queue, generates the INSERT statements, and pushes them onto a second queue; a third thread takes the INSERT statements from that queue and executes them.
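Something along these lines would cap the heap at the queue capacities instead of the whole file. A minimal sketch only (the row source and the generated SQL are stand-ins, not the real Liquibase classes, and the queue capacity of 10,000 is an arbitrary choice):

    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Sketch of the proposed three-stage pipeline. The bounded queues
    // cap memory use: the reader blocks once 10,000 rows are in flight
    // instead of holding the whole file on the heap.
    public class LoadDataPipeline {

        private static final String EOF = "\u0000EOF"; // poison pill

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> rows = new ArrayBlockingQueue<>(10_000);
            BlockingQueue<String> statements = new ArrayBlockingQueue<>(10_000);

            // Stage 1: read rows from the file (simulated here).
            Thread reader = new Thread(() -> {
                try {
                    for (String row : List.of("1,alice", "2,bob")) {
                        rows.put(row);
                    }
                    rows.put(EOF);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            // Stage 2: turn each row into an INSERT statement.
            Thread generator = new Thread(() -> {
                try {
                    String row;
                    while (!(row = rows.take()).equals(EOF)) {
                        String[] cols = row.split(",");
                        statements.put("INSERT INTO person (id, name) VALUES ("
                                + cols[0] + ", '" + cols[1] + "')");
                    }
                    statements.put(EOF);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            // Stage 3: execute the statements (printed here instead of JDBC).
            Thread executor = new Thread(() -> {
                try {
                    String sql;
                    while (!(sql = statements.take()).equals(EOF)) {
                        System.out.println(sql);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            reader.start(); generator.start(); executor.start();
            reader.join(); generator.join(); executor.join();
        }
    }

With bounded queues the reader simply blocks whenever the downstream stages fall behind, so memory stays roughly constant regardless of the file size.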
Batched PreparedStatements:
At the moment, regular INSERT statements are used, sometimes with multiple rows. Please correct me if I’m wrong. It would be much more efficient to always use single-row prepared INSERT statements and batch some 10,000 rows together via the standard JDBC PreparedStatement.addBatch().
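For illustration, a sketch of what that would look like in plain JDBC (the H2 in-memory URL, the person table, and the generated rows are placeholders; assumes some JDBC driver is on the classpath):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Single-row prepared INSERTs, sent to the server 10,000 at a time.
    public class BatchInsertSketch {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo")) {
                try (Statement st = conn.createStatement()) {
                    st.execute("CREATE TABLE person (id INT PRIMARY KEY, name VARCHAR(64))");
                }
                conn.setAutoCommit(false);
                try (PreparedStatement ps = conn.prepareStatement(
                        "INSERT INTO person (id, name) VALUES (?, ?)")) {
                    int pending = 0;
                    for (int id = 1; id <= 100_000; id++) { // stand-in for rows parsed from the CSV
                        ps.setInt(1, id);
                        ps.setString(2, "name-" + id);
                        ps.addBatch();
                        if (++pending == 10_000) { // send a batch every 10,000 rows
                            ps.executeBatch();
                            pending = 0;
                        }
                    }
                    if (pending > 0) {
                        ps.executeBatch(); // flush the remainder
                    }
                    conn.commit();
                }
            }
        }
    }

Disabling auto-commit and committing once per load (or every few batches) also avoids a round trip per row.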
Any opinions on that?
my 2 cents,
markus