Liquibase bulk upload

It should be possible to speed up Liquibase's loadData CSV import significantly with a few changes.  Currently it loads the entire CSV file into memory before processing it, which is not necessary.  Also, most JDBC drivers support bulk inserts, which we are not taking advantage of either.
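To sketch what that could look like: the loop below streams a CSV one line at a time and hands off fixed-size batches to a sink, rather than reading the whole file first. This is an illustration of the general streaming/batching idea, not Liquibase's actual code; the class name, batch size, and the naive comma split are all assumptions, and in real use the sink would wrap a JDBC `PreparedStatement` with `addBatch()`/`executeBatch()`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class StreamingCsvLoader {

    /**
     * Stream a CSV one row at a time and hand fixed-size batches to a sink.
     * The sink is abstracted so the batching logic can run without a database;
     * a real sink would call PreparedStatement.addBatch() per row and
     * executeBatch() per batch. Returns the number of data rows read.
     */
    public static int load(Reader csv, int batchSize, Consumer<List<String[]>> sink)
            throws IOException {
        BufferedReader reader = new BufferedReader(csv);
        reader.readLine(); // skip the header row
        List<String[]> batch = new ArrayList<>(batchSize);
        String line;
        int rows = 0;
        while ((line = reader.readLine()) != null) {
            // Naive split for illustration; quoted fields need a real CSV parser.
            batch.add(line.split(",", -1));
            rows++;
            if (batch.size() == batchSize) {
                sink.accept(batch);            // flush one bounded batch
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch);                // flush the final partial batch
        }
        return rows;
    }
}
```

Because only one batch is held in memory at a time, peak memory stays bounded regardless of file size.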


I’m planning to implement those changes, along with anything else profiling turns up, in 2.1, but if anyone wants to submit a patch I would appreciate it.


Nathan

Hi,


We would like to use Liquibase to handle large data inserts. We have dozens of changeset files averaging around 75 MB each. What is the best way to have Liquibase run these changesets? We currently get out-of-memory errors. Performance is also quite slow: it takes over an hour to load a single 75 MB file into a local MySQL database.


As to why we are using Liquibase: its validation and verification are nice features. Since this is static data, it’s convenient to have Liquibase manage all of it.

What is the proper way to release these resources?


So far I’ve tried the following, though the memory is still being held somewhere:


    ExecutorService.getInstance().clearExecutor(database);
    ChangeLogParserFactory.getInstance().getParsers().clear();
    ChangeLogParserFactory.reset();
    cleanup(database);
    database = null;
    liquibase = null;
    resourceAccessor = null;

There isn’t any work in progress so far.  I don’t have a release target for 2.1 yet.  Still making sure 2.0.1 is nice and stable before starting the next round of features.  


I think CSV import can be made considerably faster than XML. It also has the advantage of being more readable (IMHO).  The reason CSV can be better optimized is that Liquibase parses the entire changelog into a single in-memory representation, which is then handed to the changelog executor.  If all the data is in changesets, it all has to be included in that in-memory changelog.  With CSV files, all the changelog itself contains is a reference to the .csv file, which is read and inserted at run time.  It is that reading and inserting of the CSV file that can be optimized.
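For context, this is roughly what such a changelog entry looks like: the changeset holds only a pointer to the CSV file, not the data itself. The table name, file path, and column names below are illustrative, not from the original thread.

```xml
<!-- Hypothetical changeset: only a reference to the CSV lives in the changelog,
     so the bulk data never has to sit in the parsed in-memory changelog. -->
<changeSet id="load-static-data" author="example">
    <loadData tableName="country" file="data/country.csv">
        <column name="code" type="STRING"/>
        <column name="name" type="STRING"/>
    </loadData>
</changeSet>
```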


That being said, it would be worth running Liquibase and your changelog through a profiler, such as the one that comes with NetBeans.  I have found that performance problems rarely come from where you expect.  There could be other parts of the Liquibase code causing the bulk of the problem in your case.  Without profiling, you really don’t know for sure.


Nathan



Thank you. Is there a patch in progress? Perhaps I can take a look and possibly help. 


Also, when is the target for 2.1?

Also wanted to point out: we aren’t using CSV. We are actually using Liquibase XML changeset files. Is there a way this mechanism can be sped up or improved?