Migrating the User Group Data – Part 3: Loading

After nearly 10 years it was time to migrate the website of the .Net User Group Bern. In the last part I explained my work on the transformation of the data and how we extended it to support new requirements. With all the data in the new structure, it is now time to load it into the new application.

This post is part three of a small series on the data migration task, covering the three phases of ETL: extraction, transformation, and loading.

Step 1: Script to clean the database

While we all hope to get it right the first time, it seldom works that way. A simple script with DELETE statements in the right order helps you get your local database back to a clean state so you can start the next round.

The challenge is to identify the order in which you need to delete the data. The more foreign keys you have, the more work it is to find the right order. You can postpone this step until you have data in your database and then use the error messages about violated referential integrity to guide you through the various dependencies. Whatever approach you choose, do it early on so that you and your teammates don’t have to repeat the work over and over again.
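
Such a clean-up script can stay very simple. The following C# sketch runs the DELETE statements in child-to-parent order; the table names, their dependency order, and the connection string are assumptions and have to be adapted to the actual schema:

```csharp
// Minimal sketch of a database clean-up script.
// Table names and their order are assumptions for illustration.
using Microsoft.Data.SqlClient;

var tablesInDeleteOrder = new[]
{
    "TalkSpeakers", // junction table: references Talks and Speakers
    "Talks",        // references Events
    "Events",
    "Speakers"
};

using var connection = new SqlConnection(
    "Server=(localdb)\\MSSQLLocalDB;Database=UserGroup;Integrated Security=true");
connection.Open();

foreach (var table in tablesInDeleteOrder)
{
    using var command = new SqlCommand($"DELETE FROM [{table}];", connection);
    Console.WriteLine($"{table}: {command.ExecuteNonQuery()} rows deleted");
}
```

Because children are deleted before their parents, none of the statements violates a foreign key constraint.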


Step 2: Load the data

In the new application we use commands to create and modify our business objects. I used those commands for the import instead of writing directly to the database. Writing to the database would have worked as well, but since we intend to use the commands throughout the application, I saw the import as a good test of whether they are up to the task of batch processing.

For the import project I reused the transfer objects and deserialized the JSON files back into objects. The commands created the business objects and assigned GUIDs as their Id property. To map my migration integers to those GUIDs, I used a simple dictionary with the int as the key and the GUID as the value.

As in the clean-up script, the order in which objects are created is important. As a rule of thumb, it is the reverse of the deletion order: parents first, children last.
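
Here is a minimal sketch of such an import loop for the speakers, assuming hypothetical names (SpeakerDto, CreateSpeakerCommand, Execute) for the transfer object, the command, and its handler; the real types in the project differ:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Map from the integer ids used during migration to the new GUIDs.
var speakerIdMap = new Dictionary<int, Guid>();

// Deserialize the transformed JSON back into the transfer objects.
var speakers = JsonSerializer.Deserialize<List<SpeakerDto>>(
    File.ReadAllText("speakers.json"))!;

foreach (var dto in speakers)
{
    // The command creates the business object and assigns a GUID Id.
    var newId = Execute(new CreateSpeakerCommand(dto.Name, dto.Bio));

    // Remember which migration integer maps to which new GUID.
    speakerIdMap[dto.Id] = newId;
}

// Later, when creating talks, the map resolves the references:
// var speakerGuid = speakerIdMap[talkDto.SpeakerId];

// Stand-in for the real command handler, which would create and
// persist the business object and return its newly assigned Id.
static Guid Execute(CreateSpeakerCommand command) => Guid.NewGuid();

// Hypothetical transfer object and command; the real names differ.
record SpeakerDto(int Id, string Name, string Bio);
record CreateSpeakerCommand(string Name, string Bio);
```

The same pattern repeats for events and talks; the talks come last because they reference both of the other object types through the dictionary lookups.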


Step 3: Fix the errors

As expected there were a few problems: missing data, too-long values, and non-unique keys. The database clean-up script helped me get back to a clean start, and then I could fix one problem after the other.

Every error needed its own approach. To fix the missing unique values I had to modify the transformed JSON file; for too-long values I even went back to fix the big JSON file and reran the transformation. Field lengths in particular hold bad surprises, and it is always at the very end that those problems pop up.
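
One way to take the surprise out of field lengths is a small pre-flight check over the transformed JSON before loading anything. This is a minimal sketch; the SpeakerDto type is the same hypothetical transfer object as above, and the 200-character limit is an assumption that should be replaced with the actual column sizes of the target schema:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Assumed column limit; use the real sizes of the target schema.
const int MaxNameLength = 200;

var speakers = JsonSerializer.Deserialize<List<SpeakerDto>>(
    File.ReadAllText("speakers.json"))!;

foreach (var dto in speakers)
{
    if (dto.Name.Length > MaxNameLength)
    {
        Console.WriteLine(
            $"Speaker {dto.Id}: name has {dto.Name.Length} characters, " +
            $"limit is {MaxNameLength}; fix it upstream and rerun.");
    }
}

// Same hypothetical transfer object as in the import sketch above.
record SpeakerDto(int Id, string Name, string Bio);
```

Running such checks right after the transformation surfaces the problems early, instead of as database errors at the very end of the loading step.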


The result

By migrating all the events, talks, and speakers to our new application, we got a nice (and long) list of all the events of the user group on our new website. Visitors no longer get the impression that the group is inactive; instead, the site shows how much work the group has done for the .Net community in the region over the last 10 years.

New website with all the events


Lessons Learned

These are the five main points I take away from this data migration task:

  1. An estimate of the size of the data helps you find out which approaches to the migration are possible and which are unlikely to work. Knowing the order of magnitude is enough; you don’t need to spend much time on the estimate itself.
  2. A semi-automated approach can be the fastest way to get a task done, especially when there are many edge cases that require manual intervention. It may be simpler to do the hard parts completely manually and focus the automation on the simpler parts.
  3. Changes occur well into the project. Don’t try to anticipate every possibility. Instead, divide the work into small parts that can be adapted when necessary.
  4. Clean up the data as far upstream as you can. The later in your workflow you apply a fix, the more likely it is that a rerun flushes it away and you have to do it again.
  5. Until you successfully load the data into the new system, you have no idea whether everything works. Therefore, get to the loading part as fast as possible.

For my next data migration project, I would split the work not along the three phases of ETL but along the data, so that each piece can be loaded into the new system as early as possible. With this small data set it would not have made a big difference, but on a regular project that could bring the integration time down significantly.
