Hi all,
I face slowness on loadUpdateData with a file of 100’000 lines.
I have an index on the columns use in the where clause.
I see that liquibase is sending a request per row in the DB:
BEGIN
UPDATE INSTRUMENT_PRICE_HISTORY SET CREATED_AT = ‘2020-11-03 00:00:00’, ID = -1560221, PRICE_CURRENCY_ISO_CODE = ‘CHF’, PRICE_QUANTITY = 107.16, PUBLICATION_DATE = ‘2020-11-03 00:00:00’, UPDATED_AT = ‘2020-11-03 00:00:00’ WHERE INSTRUMENT_ID = -156 AND DATE = ‘2020-11-03’;
IF not found THEN
INSERT INTO INSTRUMENT_PRICE_HISTORY (ID, CREATED_AT, UPDATED_AT, INSTRUMENT_ID, DATE, PRICE_QUANTITY, PRICE_CURRENCY_ISO_CODE, PUBLICATION_DATE) VALUES (-1560221, ‘2020-11-03 00:00:00’, ‘2020-11-03 00:00:00’, -156, ‘2020-11-03’, 107.16, ‘CHF’, ‘2020-11-03 00:00:00’);
END IF;
END;
I try to understand why is not grouping the requests and if it’s possible to use an SQL MERGE instead of this statement.
I found this comment in the code:
// we don't do batch updates for Postgres but we still send as a prepared statement, see LB-744
// mysql supports batch updates, but the performance vs. the big insert is worse
But I can’t find the issue or information about ‘LB-744’.
There is also a comment that is not aligned with the code here:
if (rows.stream().anyMatch(LoadDataRowConfig::needsPreparedStatement)) {
// If we have only prepared statements and the database supports batching, let's roll
The comment should be If we have at least one prepared statements
But instead of going through all the rows, I don’t understand why !preparedStatements.isEmpty()
is not sufficient.
I have several questions:
- Where can I found the ‘LB-744’ ?
- Why not using PreparedStatement with Liquibase ?
- Why Liquibase is not using a MERGE Statement ?
- Is there a reason to not group requests ?
Like
BEGIN
UPDATE INSTRUMENT_PRICE_HISTORY SET CREATED_AT = '2020-11-03 00:00:00', ID = -1560221, PRICE_CURRENCY_ISO_CODE = 'CHF', PRICE_QUANTITY = 107.16, PUBLICATION_DATE = '2020-11-03 00:00:00', UPDATED_AT = '2020-11-03 00:00:00' WHERE INSTRUMENT_ID = -156 AND DATE = '2020-11-03';
IF not found THEN
INSERT INTO INSTRUMENT_PRICE_HISTORY (ID, CREATED_AT, UPDATED_AT, INSTRUMENT_ID, DATE, PRICE_QUANTITY, PRICE_CURRENCY_ISO_CODE, PUBLICATION_DATE) VALUES (-1560221, '2020-11-03 00:00:00', '2020-11-03 00:00:00', -156, '2020-11-03', 107.16, 'CHF', '2020-11-03 00:00:00');
END IF;
UPDATE INSTRUMENT_PRICE_HISTORY SET CREATED_AT = '2020-11-03 00:00:00', ID = -1560221, PRICE_CURRENCY_ISO_CODE = 'CHF', PRICE_QUANTITY = 107.16, PUBLICATION_DATE = '2020-11-03 00:00:00', UPDATED_AT = '2020-11-03 00:00:00' WHERE INSTRUMENT_ID = -157 AND DATE = '2020-11-03';
IF not found THEN
INSERT INTO INSTRUMENT_PRICE_HISTORY (ID, CREATED_AT, UPDATED_AT, INSTRUMENT_ID, DATE, PRICE_QUANTITY, PRICE_CURRENCY_ISO_CODE, PUBLICATION_DATE) VALUES (-1560221, '2020-11-03 00:00:00', '2020-11-03 00:00:00', -157, '2020-11-03', 107.16, 'CHF', '2020-11-03 00:00:00');
/* Other statements here too */
END;
Thanks for you help,
Philippe