Create Realistic Test Data With Bogus

For data-intensive applications, ensuring that the code is fast is not enough; we need realistic data too. Without it, optimisations may fail as soon as we use our application with real-world data volumes. Let us explore our options for creating realistic test data.

Why Bogus?

Bogus is an open-source library written in C#. It allows us to generate fake data quickly and easily for our projects. Whether we are creating test cases or populating a database with sample records, Bogus provides a comprehensive toolkit to simplify these tasks.

For Python there is Faker, and on NuGet we find a few packages with the same name. Unfortunately, they are no longer actively maintained. That is why I use Bogus.

Getting Started with Bogus

To start with Bogus, we need to install the NuGet package in the Package Manager Console:

Install-Package Bogus

Once installed, we can start to generate data. Let us look at a basic example of how we can use Bogus to create realistic data for a user object:

public class User
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
    public DateTime DateOfBirth { get; set; }
    public string FullName { get; set; }
}

...

var faker = new Faker<User>()
    .RuleFor(u => u.FirstName, f => f.Name.FirstName())
    .RuleFor(u => u.LastName, f => f.Name.LastName())
    .RuleFor(u => u.Email, (f, u) => f.Internet.Email(u.FirstName, u.LastName))
    .RuleFor(u => u.FullName, (f, u) => $"{u.FirstName} {u.LastName} ({u.Email})")
    .RuleFor(u => u.DateOfBirth, f => f.Date.Past(30, new DateTime(2000, 1, 1)));

var users = faker.Generate(5); // Generate a list of 5 users

foreach (var user in users)
{
    Console.WriteLine($"{user.FirstName} {user.LastName}, {user.Email}, {user.DateOfBirth:yyyy-MM-dd}");
    Console.WriteLine($"{user.FullName}\n");
}

If we run this code, we should get 5 users that look like this:

Russell Crooks, [email protected], 1976-01-22
Russell Crooks ([email protected])

Evelyn Kassulke, [email protected], 1995-08-09
Evelyn Kassulke ([email protected])

Lelia Bode, [email protected], 1972-06-25
Lelia Bode ([email protected])

Kip Rowe, [email protected], 1971-04-17
Kip Rowe ([email protected])

Destinee Muller, [email protected], 1998-03-23
Destinee Muller ([email protected])

The slightly more verbose syntax for Email, a lambda that receives both the faker and the user being built, allows us to create an email address that reuses the first and last name of the generated user. This feature, together with compound properties like FullName, allows us to create realistic data.

There are built-in generators for commonly used data types, such as names and email addresses. These generators enable us to write significantly less code than specifying every option by hand with a random generator. We can find a list of the available generators in the README.md file on GitHub.
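To give a feel for the built-in generators, here is a minimal sketch that calls a handful of them directly on a non-generic Faker. It assumes a console project with the Bogus package installed and C# 9 or later for top-level statements; the selection of generators is only an illustration, not a complete list.

```csharp
using System;
using Bogus;

// A non-generic Faker exposes the built-in data sets directly.
var f = new Faker();

Console.WriteLine(f.Name.FullName());        // a random full name
Console.WriteLine(f.Address.City());         // a random city name
Console.WriteLine(f.Company.CompanyName());  // a random company name
Console.WriteLine(f.Phone.PhoneNumber());    // a random phone number
Console.WriteLine(f.Commerce.Price());       // a random price, returned as a string
Console.WriteLine(f.Internet.Url());         // a random URL
```

The output differs on every run unless a seed is set.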

Advanced examples

Bogus does not stop at generating simple objects. It includes several advanced features that enhance its capabilities. Let us dive into a few more practical examples:

  1. Generating nested objects by reusing the faker instances:

     var orderFaker = new Faker<Order>()
         .RuleFor(o => o.Id, f => f.IndexFaker + 1)
         .RuleFor(o => o.ProductName, f => f.Commerce.ProductName())
         .RuleFor(o => o.Price, f => f.Finance.Amount(10, 500))
         .RuleFor(o => o.Customer, f => faker.Generate());
    
     var orders = orderFaker.Generate(5);
    
     foreach (var order in orders)
     {
         Console.WriteLine($"{order.Id} - {order.ProductName} @ {order.Price} for {order.Customer.FullName}");
     }
    
     // Output:
     // 1 - Generic Soft Mouse @ 143.92 for Ada McLaughlin (Ada.McLaughlin63@yahoo.com)
     // 2 - Fantastic Wooden Cheese @ 452.46 for Annabelle Weber (Annabelle_Weber@gmail.com)
     // 3 - Tasty Metal Car @ 33.21 for Autumn Bayer (Autumn.Bayer80@gmail.com)
     // 4 - Intelligent Soft Cheese @ 56.50 for Reymundo Berge (Reymundo14@gmail.com)
     // 5 - Handcrafted Concrete Pants @ 378.67 for Bernadette Hauck (Bernadette28@yahoo.com)
    

  2. Generate data using the underlying random generator:

     var customFaker = new Faker<CustomData>()
         .RuleFor(c => c.Code, f => f.Random.AlphaNumeric(8))
         .RuleFor(c => c.Description, f => f.Lorem.Sentence())
         .RuleFor(c => c.IsVerified, f => f.Random.Bool())
         .RuleFor(c => c.Value, f => f.Random.Even(max: 100));
    
     var customDataList = customFaker.Generate(3);
     foreach (var data in customDataList)
     {
         Console.WriteLine($"{data.Code} - {data.Description} - {data.IsVerified} - {data.Value}");
     }
    
     // Output:
     // os297tks - Odit distinctio rerum earum tempora et cumque itaque vero voluptates. - True - 34
     // ub7pv4wy - Corrupti non voluptatem facere sed accusamus consequatur. - False - 72
     // 6sihinu4 - Voluptate reiciendis ex beatae totam. - True - 70
    

  3. Seeded Data for Reproducibility:

     var seededFaker = new Faker<User>()
         .UseSeed(12345)
         .RuleFor(u => u.FirstName, f => f.Name.FirstName())
         .RuleFor(u => u.LastName, f => f.Name.LastName())
         .RuleFor(u => u.Email, f => f.Internet.Email());
    

  4. Verify that everything has a rule:

     var customFaker = new Faker<CustomData>()
         .StrictMode(true)
         .RuleFor(c => c.Code, f => f.Random.AlphaNumeric(8));
    
     var customDataList = customFaker.Generate(3);
    
     // Output:
     // Unhandled exception. Bogus.ValidationException: Validation was
     // called to ensure all properties / fields have rules.
     // There are missing rules for Faker<T> 'CustomData'.
     // =========== Missing Rules ===========
     // Description
     // IsVerified
     // Value
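To make the benefit of seeding from example 3 more concrete, the following sketch builds two identically configured fakers with the same seed and compares what they generate. CreateFaker is a hypothetical helper introduced only for this illustration, and the User class is a trimmed-down version of the one defined earlier. The sketch assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using Bogus;

// Hypothetical helper: returns a fresh faker with the same rules for a given seed.
Faker<User> CreateFaker(int seed) => new Faker<User>()
    .UseSeed(seed)
    .RuleFor(u => u.FirstName, f => f.Name.FirstName())
    .RuleFor(u => u.LastName, f => f.Name.LastName())
    .RuleFor(u => u.Email, f => f.Internet.Email());

var first = CreateFaker(12345).Generate(3);
var second = CreateFaker(12345).Generate(3);

// Compare the two runs by a composite key built from every property.
bool identical = first
    .Select(u => $"{u.FirstName}|{u.LastName}|{u.Email}")
    .SequenceEqual(second.Select(u => $"{u.FirstName}|{u.LastName}|{u.Email}"));

Console.WriteLine($"Same data for the same seed: {identical}");

// Trimmed-down version of the User class from the first example.
public class User
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public string Email { get; set; } = "";
}
```

With equal seeds and equal rules the two lists are expected to match, which is what makes a failing test reproducible on another machine. Changing the rules, or the order in which they are declared, can change the generated values even when the seed stays the same.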
    

When to use Bogus

Bogus is particularly useful in scenarios such as:

  • Writing unit and integration tests.
  • Seeding development and staging environments with sample data.
  • Creating mock data for API documentation or demos.
  • Simulating edge cases by generating large, randomized datasets.

The last point is especially helpful for load testing. As long as our database holds only a few hundred rows, optimisations like caching kick in all the time, so we cannot get a realistic measurement of our data retrieval. That changes once we generate a million rows with Bogus and load them into the database.
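One possible way to produce that many rows without holding them all in memory is to generate them lazily and hand them over in batches. The sketch below is an assumption about how this could look, not a recommendation for a specific database: SaveBatch is a hypothetical placeholder for whatever bulk insert the database offers, the batch size is arbitrary, and Chunk requires .NET 6 or later.

```csharp
using System;
using System.Linq;
using Bogus;

var userFaker = new Faker<User>()
    .RuleFor(u => u.FirstName, f => f.Name.FirstName())
    .RuleFor(u => u.LastName, f => f.Name.LastName())
    .RuleFor(u => u.Email, (f, u) => f.Internet.Email(u.FirstName, u.LastName));

const int totalRows = 1_000_000;
const int batchSize = 10_000; // arbitrary value chosen for this sketch

// GenerateLazy yields users one at a time; Chunk groups them into arrays.
foreach (var batch in userFaker.GenerateLazy(totalRows).Chunk(batchSize))
{
    SaveBatch(batch);
}

// Hypothetical placeholder: replace with a bulk insert for the database in use.
void SaveBatch(User[] batch) =>
    Console.WriteLine($"Would insert {batch.Length} rows");

// Trimmed-down version of the User class from the first example.
public class User
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public string Email { get; set; } = "";
}
```

Batching keeps memory usage roughly constant, since only one batch of generated users exists at a time.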

Next

Bogus is a powerful library for generating realistic test data in .NET applications. Its user-friendly API and extensive features are an immense help in creating working software faster.

Next week we explore another tool to create more realistic load tests.