Designing Documents for RavenDB

Documents are the heart of a document-oriented database like RavenDB. Without them you can’t do anything, so knowing how to design them is a skill you must learn to work effectively with RavenDB.

The good news first: most of what you know about building software and modeling your classes can be reused. It even gets easier once you no longer try to represent the world as a series of rows.

This post is part of the RavenDB series.

Documents aren’t Tables

But here is the bad news: you are no longer in the relational database world. The helper objects, flat rows, normalization and redundancy-free organization of data you were used to are gone. Your documents will vary, even if they represent the same kind of object. Some may have a field that others lack. Even if you store only one type of object, every single document could be different.
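
To get a feeling for what that means in code, here is a minimal sketch in plain Python. It uses ordinary dictionaries as stand-ins for documents and does not talk to RavenDB at all; the collection name, the field names and the `tags_of` helper are made up for illustration.

```python
# Three documents of the same hypothetical "posts" collection, as plain dicts.
# Field names are illustrative assumptions; no RavenDB client is involved.
posts = [
    {"Id": "posts/1", "Title": "Hello RavenDB", "Tags": ["ravendb", "nosql"]},
    {"Id": "posts/2", "Title": "No tags yet"},
    {"Id": "posts/3", "Title": "With subtitle", "Subtitle": "An extra field"},
]


def tags_of(post: dict) -> list:
    """Return the tags of a post, or an empty list if the field is absent."""
    return post.get("Tags", [])


for post in posts:
    # Code that reads documents has to cope with fields that may be missing.
    print(post["Id"], tags_of(post))
```

The point is only that readers of your documents cannot assume every field is present, so a sensible default is part of the model.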

That flexibility is an enormous difference, topped only by the redundant storage of information. For years you learned that storing the same data multiple times is bad practice. Now all the data you need should be retrievable as one independent document.

This concept is best explained with a small example. Given two classes for a blog post and comments:

To store a post with two comments you can write this code:

RavenDB stores the result as a single document that contains all the information:
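
The idea of a post that carries its comments inside one document can be sketched in a few lines of Python. This is only an illustration under assumptions: the class and field names (`BlogPost`, `Comment`, `to_document`) are hypothetical, and the JSON is produced with the standard library rather than by a RavenDB session.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Comment:
    author: str
    text: str


@dataclass
class BlogPost:
    title: str
    content: str
    # Comments live inside the post instead of in a separate table.
    comments: List[Comment] = field(default_factory=list)


def to_document(post: BlogPost) -> str:
    """Serialize the whole object graph into one JSON document."""
    return json.dumps(asdict(post), indent=2)


post = BlogPost(title="Designing Documents", content="Documents aren't tables.")
post.comments.append(Comment(author="Alice", text="Nice post!"))
post.comments.append(Comment(author="Bob", text="Thanks for sharing."))

document = to_document(post)
print(document)
```

The printed JSON has the two comments nested under the post, so loading the post gives you everything you need to display it without a join.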

When you see this for the first time, it seems completely different from what you were used to in the relational world. But don’t despair. You can change the way you model your objects in small steps and follow patterns that have been around for years, like Aggregates in Domain-Driven Design. And just because every document can be different does not mean it must be, nor is that advisable.

 

Documents and Relations

Relations between objects don’t go away when you use RavenDB. Even if all your documents are self-contained, you can still reference other documents. A little redundancy keeps your documents independent: when an order references a product, you can store not only its ID but also the name you use to display it and its price.

As an example, we can look at the sample data that RavenDB creates for us. The order documents all look like this and show exactly which products were ordered:

If you build your documents this way they are meaningful on their own.
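
Here is a small sketch of an order line that keeps the product reference together with a copy of the name and price. The names (`OrderLine`, `line_for`, `price_per_unit` and so on) are assumptions for this illustration and are not taken from the RavenDB sample data or client API.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass
class Product:
    id: str
    name: str
    price_per_unit: Decimal


@dataclass
class OrderLine:
    product: str              # reference to the product document by ID
    product_name: str         # redundant copy, used for display
    price_per_unit: Decimal   # redundant copy, the price at order time
    quantity: int


@dataclass
class Order:
    id: str
    lines: List[OrderLine]

    def total(self) -> Decimal:
        # Computed from the order alone; no product lookup is needed.
        return sum(line.price_per_unit * line.quantity for line in self.lines)


def line_for(product: Product, quantity: int) -> OrderLine:
    """Copy the display name and current price into the order line."""
    return OrderLine(product.id, product.name, product.price_per_unit, quantity)


chai = Product("products/1", "Chai", Decimal("18.00"))
syrup = Product("products/3", "Aniseed Syrup", Decimal("10.00"))
order = Order("orders/1", [line_for(chai, 2), line_for(syrup, 3)])

# A later price change does not touch the order that was already placed.
chai.price_per_unit = Decimal("20.00")
print(order.total())
```

Because each line carries its own name and price, the order can be displayed and totalled without loading a single product document.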

 

Redundant Data: Problem or Solution?

If a piece of information is stored only once, it is very easy to update. You change it in one place and you’re done. That is, until you need to know what value was stored at a given point in the past.

If you have ever built an order system, you know how often addresses change, and how often it matters which address an item was shipped to. And for how much did you sell that now-defective item 11 months ago? How do you answer those questions in a relational database? You either don’t, or you start to build a highly complex temporal storage around your data model.

In RavenDB you are advised to add enough redundant data to answer those questions with the document itself. You store not only the customer number but all the data you need to find out what was sold, when, where and for how much.

As it turns out (and as the previous example showed), you can trade a little of your plentiful disk space for fast and reliable answers. You almost never want to change the shipping address on an order after it has shipped, even though you still want to know your customers’ current addresses. Having a copy of the data as it was when the order was created is a simple solution and saves you from a complex temporal adventure.
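
The address case can be sketched the same way. The following Python is a hedged illustration with made-up names (`Customer`, `Order`, `place_order`, `ship_to`); it only demonstrates the snapshot idea and does not use a RavenDB client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    # Frozen: an address value never changes, a customer just gets a new one.
    street: str
    city: str


@dataclass
class Customer:
    id: str
    name: str
    address: Address


@dataclass
class Order:
    id: str
    customer: str     # reference to the customer document by ID
    ship_to: Address  # snapshot of the address when the order was created


def place_order(order_id: str, customer: Customer) -> Order:
    """Create an order that remembers where it is shipped to."""
    return Order(id=order_id, customer=customer.id, ship_to=customer.address)


alice = Customer("customers/1", "Alice", Address("1 Old Street", "Vienna"))
order = place_order("orders/1", alice)

# Months later the customer moves; only the customer document changes.
alice.address = Address("2 New Road", "Berlin")

print(order.ship_to.city)   # where the order went
print(alice.address.city)   # where the customer lives now
```

The order still answers "where did we ship this?" on its own, while the customer document answers "where does this customer live today?".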

 

Use your Freedom

RavenDB gives you many new possibilities. Not all of them are helpful or easy to maintain. But you can build a solution that is focused on your problem and not on your storage system.

Use this freedom to experiment inside a pet project. Please don’t take your first steps in production code. Tempting as it is, it can go very wrong very fast…

 

Next

Once we can build complex document structures, we need a way to query them. In RavenDB the way to go is Map/Reduce. The next post explains what it is and how we can use it.
