Migrating the User Group Data – Part 2: Transformation

After nearly 10 years it was time to migrate the website of the .Net User Group Bern. The previous part was all about consolidating the data into a single JSON file. The next step is to move that data into the new structure.

This post is part two of a small series on the data migration task.

 

Step 1: What do we need?

As the name suggests, the transformation step is a task in which one format is turned into another. The starting point is our JSON file with all the events, talks and speakers. For the target we could create whatever we wanted – as long as all the new fields were optional.

We had a few meetings in which we discussed various options and how we plan to use the new application. We settled on a structure that resembled the one we had but included a layer of abstraction to support planning events with multiple talks.

The main change we made was the switch to German as the project and domain language. That turned our events into Veranstaltungen and affected all the property and class names.

 

Step 2: Create the transfer objects

We expected some manual clean-up and wanted to do that in smaller files without all the noise we had in our big JSON file. Therefore, we created data transfer objects (DTOs) that had only properties named like the fields in the new structure. They differed little from our new entities, except that the DTOs use integers instead of GUIDs for the Id property to make comparing them easy.
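A minimal sketch of what such transfer objects could look like. The original project is a .NET application, so this Python version only illustrates the idea. All class and field names (VeranstaltungDto, VortragDto, SprecherDto and their properties) are assumptions made for this example, not the names used in the real project.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SprecherDto:
    """Hypothetical transfer object for a speaker."""
    id: int  # plain integer instead of a GUID, so entries are easy to compare
    name: str


@dataclass
class VortragDto:
    """Hypothetical transfer object for a talk."""
    id: int
    titel: str
    sprecher_ids: List[int] = field(default_factory=list)


@dataclass
class VeranstaltungDto:
    """Hypothetical transfer object for an event (Veranstaltung)."""
    id: int
    titel: str
    datum: str  # date kept as text, which keeps the JSON easy to read and diff
    vortrag_ids: List[int] = field(default_factory=list)
    beschreibung: Optional[str] = None  # fields without old data stay optional
```

Integer Ids are short enough to compare by eye when two files sit side by side, which is much harder with GUIDs.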

 

Step 3: Transform the data

The transformation now came down to running through all the entries in the big JSON file and creating the transfer objects instead of printing them out as I did in my prototype. The newly created instances were collected in a list and then serialized to JSON.
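The loop described above might be sketched roughly like this, again in Python and not in the .NET code of the real project. The source property names ("events", "title", "date", "talks", "legacyHtml") and the sample entry are invented for the example, since the layout of the big JSON file is not shown here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class VeranstaltungDto:
    """Hypothetical transfer object; field names are assumptions."""
    id: int
    titel: str
    datum: str
    vortrag_ids: List[int] = field(default_factory=list)


def transform_events(big_json: str) -> List[VeranstaltungDto]:
    """Walk all events of the big JSON document and build one DTO per entry."""
    source = json.loads(big_json)
    dtos = []
    for event in source["events"]:
        dtos.append(
            VeranstaltungDto(
                id=event["id"],
                titel=event["title"],
                datum=event["date"],
                vortrag_ids=[talk["id"] for talk in event.get("talks", [])],
            )
        )
    return dtos


def to_json(dtos: List[VeranstaltungDto]) -> str:
    """Serialize the collected DTOs; properties not copied over are dropped."""
    return json.dumps([asdict(d) for d in dtos], indent=2, ensure_ascii=False)


# Invented sample input with one noisy property that should not survive.
SAMPLE = json.dumps({
    "events": [
        {
            "id": 1,
            "title": "Sample event",
            "date": "2015-03-10",
            "legacyHtml": "<p>old markup</p>",
            "talks": [{"id": 7, "title": "Sample talk"}],
        }
    ]
})

if __name__ == "__main__":
    print(to_json(transform_events(SAMPLE)))
```

Because only the listed properties are copied, everything else in the source file is filtered out without any extra code.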

 

Step 4: Generate the data for the new parts

As with the projects we do for work, some new requirements came up late in the migration. In the old structure the location was part of the description of an event. Now we wanted to have a location object to keep additional data (like the room size), and this meant we needed to go back a few steps and extend the big JSON file.

Most of the events (71) were held at the same main location, and we could handle those entries with a default value. For the other 9 events I had to add a location property with an Id for that place. The classes generated from the big JSON file needed a little update so that this property was correctly deserialized.

In combination with a small extension to my transfer project I could run the transformation again and all the events now had their correct location Id.
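The default-value handling could look roughly like the following sketch. The property name "locationId", the constant DEFAULT_ORT_ID and its value are assumptions for illustration; the real project uses .NET classes generated from the JSON file.

```python
from typing import Dict, List

# Hypothetical Id of the main location used by most events.
DEFAULT_ORT_ID = 1


def resolve_ort_id(event: Dict) -> int:
    """Return the explicit location Id if present, else the default."""
    location_id = event.get("locationId")
    # A missing property and an explicit null are both treated as "default".
    return DEFAULT_ORT_ID if location_id is None else location_id


def assign_ort_ids(events: List[Dict]) -> Dict[int, int]:
    """Map each event Id to its resolved location Id."""
    return {event["id"]: resolve_ort_id(event) for event in events}
```

With this approach only the events held somewhere else need an extra property in the source file; all other entries stay untouched.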

The list of locations required a new location transfer object and some hard-coded values, which I then serialized into their own JSON file. This is another preparation for handling further changes to this object, since such changes tend to come in groups.
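As a rough illustration, hard-coded locations written to their own file might look like this. OrtDto, its fields, the file name and the two sample entries are placeholders invented for this sketch and do not reflect the real locations or room sizes.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class OrtDto:
    """Hypothetical transfer object for a location (Ort)."""
    id: int
    name: str
    raumgroesse: Optional[int] = None  # room size, optional


# Placeholder values; the real list would be typed in by hand once.
ORTE: List[OrtDto] = [
    OrtDto(id=1, name="Main location"),
    OrtDto(id=2, name="Other location", raumgroesse=40),
]


def orte_to_json(orte: List[OrtDto]) -> str:
    """Serialize the locations on their own, separate from the events."""
    return json.dumps([asdict(o) for o in orte], indent=2, ensure_ascii=False)


def write_orte(path: str, orte: List[OrtDto]) -> None:
    """Write the locations into their own JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(orte_to_json(orte))


if __name__ == "__main__":
    write_orte("orte.json", ORTE)
```

Keeping the locations in a separate file means later additions to the location object only touch this one small file.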

 

Step 5: Manual clean-up

Without the noise of all the unnecessary properties, finding problems is easy. The big question that remains is where to fix them: in the big JSON file or in the small one created in the transformation task?

The answer to this question depends on how likely it is that you need to run the transformation again. Unless it is literally the last minute before the migration runs in production, I suggest fixing the problem at the source. With this approach it doesn’t matter what happens next, and you simply rerun the transformation when necessary – without losing any changes.

I couldn’t resist the temptation to change the speakers in the smaller file and regretted this decision a few times.

 

Next

At the end of the transformation phase we have multiple small JSON files with a clean structure. The next part in this series explains how we load the existing data into the new application.
