Organization of production and testing environments

Hello, Nathan!
Could you help me with some advice?
I have successfully set up Liquibase on MySQL, with a structure similar to the Oracle tutorial article.
The problem is that I'm now working on a script that imports data from an old Paradox database into the new MySQL one.
The data is quite big: the SQL dump file is about 100 MB.

Is it a best practice to store production data in the repository?

I don’t like this idea because:

  1. it's very big, and it would be a pain for other programmers with a bad internet connection to fetch this data
  2. the repository becomes about 100 MB bigger
  3. it risks leaking users' information

On the other hand:

  4. some modules in our system need large amounts of varied data (for example, reports)
  5. it's hard to test the system without real data
  6. generating test data can become a challenge

How do you solve these problems?

UPD: I can load the data using Liquibase; the question is how to store this data effectively: inside the repository or not? If not, how do you solve the issues above?

We load the data with the bulk copy utility (bcp) supplied with SQL Server.
Liquibase can start the process inside a changeSet using executeCommand, like this:
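A minimal sketch of such a changeSet; the table name, data file, and connection details below are all placeholders:

    <changeSet id="bulk-import-example" author="taranenko">
        <!-- launch SQL Server's bcp utility to bulk-load a data file;
             mydb.dbo.mytable and data.dat are hypothetical -->
        <executeCommand executable="bcp">
            <arg value="mydb.dbo.mytable"/>
            <arg value="in"/>
            <arg value="data.dat"/>
            <arg value="-c"/>
            <arg value="-S"/>
            <arg value="localhost"/>
            <arg value="-U"/>
            <arg value="user"/>
            <arg value="-P"/>
            <arg value="password"/>
        </executeCommand>
    </changeSet>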
Sorry, I overlooked that you need MySQL, not MSSQL :) Does MySQL have such a utility? It normally should...
taranenko, thanks for the answer. But my problem is not how to import the data with Liquibase, but how to store it effectively. There is indeed a similar utility; it can be done like this:
    $ mysql -uroot -ppassword db_name < dump.sql
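If the import should run as part of a Liquibase update rather than by hand, one option is to wrap the same command in an executeCommand changeSet. A sketch, assuming a Unix shell is available; the credentials, database name, and dump file are placeholders:

    <changeSet id="import-paradox-dump" author="confik">
        <!-- a shell wrapper is used because executeCommand itself
             does not perform stdin redirection -->
        <executeCommand executable="/bin/sh">
            <arg value="-c"/>
            <arg value="mysql -uroot -ppassword db_name &lt; dump.sql"/>
        </executeCommand>
    </changeSet>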

Hi confik, this is my opinion on that:

1 - Put in source control only things that make sense to version: things that evolve through time and whose changes you can track.
2 - Big files like this produce more headaches than solutions.
3 - In these cases a diff between versions is useless.

So my suggestion is to keep these files on a file server and make loading them part of a post-deployment or import process that runs once after the database initialization. After that, keep things updated with upgrade scripts.

Put in source control only static data and the things you need to set up a database. Then, for upgrades, data modifications, etc., put the inserts, updates, or deletes in changesets, as in the sketch below.
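For instance, an upgrade that adds a row of static reference data might look like this; a minimal sketch with a hypothetical table and values:

    <changeSet id="add-defer-action" author="alexis">
        <!-- static reference data added as part of an upgrade;
             action_item and its columns are hypothetical -->
        <insert tableName="action_item">
            <column name="code" value="DEFER"/>
            <column name="label" value="Defer"/>
        </insert>
    </changeSet>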

Hope this helps,
Alexis.


Alexis, thank you very much for your answer. It seems that it’s the most reasonable solution.

To frame this further: storing the schema/definition is a given.

As to which data should be stored in a schema/definition/changeSet, this is how I go about it:

*Environmental configurations for things like the email server, the application title (if you prefix/postfix the environment name), and the webservice server (i.e. dev->test->live) are stored in source control, but outside the scope of Liquibase. These should be very slow-changing. (Someone correct me if there is a way to do this inside Liquibase; one possibility is sketched after this list.)

*Control/reference data that is needed by the application or its processes should, and must, be stored in Liquibase changesets (see the loadData sketch after this list). Examples would be the 50 US states, pulldown action items (approve, deny), slow-changing reference data (i.e. data that isn't updated by ETL jobs), or other control data added for application enhancements (e.g. approve, deny, plus a new 'defer' option that has related application changes). If you have roles/permissions, these are also control data (unless they can be changed on live through the application, in which case they are 'content managed').

*Content-managed data should never be stored anywhere except in tape backups, and should be manually migrated/moved for testing purposes. Content-managed data is usually reference data that can be changed through an application directly on a live environment. Examples would be pre-defined auto-comments, with new comments added on live, or security groups created directly on live.

*Working data, user data, fast-moving data, or in other terms data that is likely different on live than on dev or test while the application remains functional: none of this working data should be managed by Liquibase.
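A rough sketch of how the first two categories can look in a changelog. The context attribute is only one possible way to handle environment-specific configuration, and the table names, columns, and values here are all hypothetical:

    <!-- environment-specific configuration, applied only when the
         matching context (dev/test/live) is passed at update time -->
    <changeSet id="email-server-dev" author="dhartford" context="dev">
        <update tableName="app_config">
            <column name="value" value="smtp.dev.example.com"/>
            <where>config_key='email_server'</where>
        </update>
    </changeSet>

    <!-- control/reference data loaded from a CSV file kept in source control -->
    <changeSet id="load-us-states" author="dhartford">
        <loadData tableName="us_state" file="data/us_states.csv">
            <column name="code" type="STRING"/>
            <column name="name" type="STRING"/>
        </loadData>
    </changeSet>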

just my opinion/perspective, hope that helps.

-D

dhartford, thanks, you helped me a lot.