Organization of production and testing environments

Hello, Nathan!
Could you help me with some advice?
I have successfully set up Liquibase on MySQL, with a structure similar to the Oracle tutorial article.
The problem is that I'm now working on a script that imports data from an old Paradox database into the new MySQL one.
The data is quite big: the SQL dump file is about 100 MB.

Is it a best practice to store production data in the repository?

I don’t like this idea because:

  1. it's very big, and it would be a pain for other programmers with a bad internet connection to fetch this data
  2. the repository becomes about 100 MB bigger
  3. it risks leaking users' information

On the other hand:

  4. some modules in our system need large amounts of varied data (for example, reports)
  5. it's hard to test the system without real data
  6. generating test data can become a challenge

How do you solve these problems?

UPD: I can load the data using Liquibase; the question is how to store this data effectively: inside the repository or not? If not, how do you solve the issues above?

We load the data with the bulk copy utility (bcp) supplied with SQL Server.
Liquibase can start the process inside a changeSet using executeCommand, like this:
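A minimal sketch of such a changeSet; the table name, data file, and connection details below are all placeholders:

    <changeSet id="bulk-import-example" author="taranenko">
        <!-- launch SQL Server's bcp utility to bulk-load a data file;
             mydb.dbo.mytable and data.dat are hypothetical -->
        <executeCommand executable="bcp">
            <arg value="mydb.dbo.mytable"/>
            <arg value="in"/>
            <arg value="data.dat"/>
            <arg value="-c"/>
            <arg value="-S"/>
            <arg value="localhost"/>
            <arg value="-U"/>
            <arg value="user"/>
            <arg value="-P"/>
            <arg value="password"/>
        </executeCommand>
    </changeSet>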
Sorry, I overlooked that you need MySQL, not MSSQL :) Does MySQL have such a utility? It normally should...
taranenko, thanks for the answer. But my problem is not how to import the data with Liquibase, but how to store it effectively. There is indeed a similar utility; it can be done like this:
    $ mysql -uroot -ppassword db_name < dump.sql
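If the import should run as part of a Liquibase update rather than by hand, one option is to wrap the same command in an executeCommand changeSet. A sketch, assuming a Unix shell is available; the credentials, database name, and dump file are placeholders:

    <changeSet id="import-paradox-dump" author="confik">
        <!-- a shell wrapper is used because executeCommand itself
             does not perform stdin redirection -->
        <executeCommand executable="/bin/sh">
            <arg value="-c"/>
            <arg value="mysql -uroot -ppassword db_name &lt; dump.sql"/>
        </executeCommand>
    </changeSet>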

Hi confik, this is my opinion on that:

1 - Put in source control only things that make sense to version: things that evolve through time and whose changes you can track.
2 - Big files like this produce more headaches than solutions.
3 - In these cases a diff between versions is useless.

So my suggestion is to keep these files on a file server and make loading them part of a post-deployment or import process that runs once after the database initialization. After that, keep things updated with upgrade scripts.

Put in source control only static data and the things you need to set up a database. Then, for upgrades, data modifications, etc., put the inserts, updates, or deletes in changesets, as in the sketch below.
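For instance, an upgrade that adds a row of static reference data might look like this; a minimal sketch with a hypothetical table and values:

    <changeSet id="add-defer-action" author="alexis">
        <!-- static reference data added as part of an upgrade;
             action_item and its columns are hypothetical -->
        <insert tableName="action_item">
            <column name="code" value="DEFER"/>
            <column name="label" value="Defer"/>
        </insert>
    </changeSet>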

Hope this helps,
Alexis.


Alexis, thank you very much for your answer. It seems that it’s the most reasonable solution.

To frame this further: storing the schema/definition is a given.

As to which data should be stored in a schema/definition/changeSet, this is how I go about it:

*Environmental configurations for things like the email server, the application title (if you prefix/postfix the environment name), and the webservice server (i.e. dev->test->live) are stored in source control, but outside the scope of Liquibase. These should be very slow-changing. (Someone correct me if there is a way to do this inside Liquibase; one possibility is sketched after this list.)

*Control/reference data that is needed by the application or its processes should, and must, be stored in Liquibase changesets (see the loadData sketch after this list). Examples would be the 50 US states, pulldown action items (approve, deny), slow-changing reference data (i.e. data that isn't updated by ETL jobs), or other control data added for application enhancements (e.g. approve, deny, plus a new 'defer' option that has related application changes). If you have roles/permissions, these are also control data (unless they can be changed on live through the application, in which case they are 'content managed').

*Content-managed data should never be stored anywhere except in tape backups, and should be manually migrated/moved for testing purposes. Content-managed data is usually reference data that can be changed through an application directly on a live environment. Examples would be pre-defined auto-comments, with new comments added on live, or security groups created directly on live.

*Working data, user data, fast-moving data, or in other terms data that is likely different on live than on dev or test while the application remains functional: none of this working data should be managed by Liquibase.
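A rough sketch of how the first two categories can look in a changelog. The context attribute is only one possible way to handle environment-specific configuration, and the table names, columns, and values here are all hypothetical:

    <!-- environment-specific configuration, applied only when the
         matching context (dev/test/live) is passed at update time -->
    <changeSet id="email-server-dev" author="dhartford" context="dev">
        <update tableName="app_config">
            <column name="value" value="smtp.dev.example.com"/>
            <where>config_key='email_server'</where>
        </update>
    </changeSet>

    <!-- control/reference data loaded from a CSV file kept in source control -->
    <changeSet id="load-us-states" author="dhartford">
        <loadData tableName="us_state" file="data/us_states.csv">
            <column name="code" type="STRING"/>
            <column name="name" type="STRING"/>
        </loadData>
    </changeSet>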

just my opinion/perspective, hope that helps.

-D

dhartford, thanks, you helped me a lot.