Comparing Lists with LINQ & LINQPad

Many of my data quality control tasks involve lists and the question “Which Ids are in list A but not in list B?”. I tried to use Excel for this task but failed more often than I can count. When the next batch of checks rolled in, I had had enough and tried a new approach using LINQ and the little tool LINQPad. This time it took only 20 lines of code, and the solution looks similar to what I would write in SQL.

 

A little bit of set theory

Set theory is a branch of mathematical logic that studies sets, which are (put simply) collections of distinct objects. Since I want a simple solution, I do not need to know much more than these two ideas: intersection and complement.

The intersection of two lists consists of the objects that are present in both lists:

Intersection (Image from Wikimedia Commons)

The (relative) complement consists of the elements of list A that are not present in list B:

Relative Complement (Image from Wikimedia Commons)

This is all the theory you need to use the built-in methods of LINQ to split your lists up. Intersections can be created using the method Intersect(), while the complement can be created using Except().
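As a minimal sketch of the two methods, assuming two small integer arrays (the names listA and listB and their values are made up for illustration), the calls could look like this:

```csharp
using System;
using System.Linq;

// Hypothetical sample data; the names and values are illustrative only.
int[] listA = { 1, 2, 3, 4, 5 };
int[] listB = { 4, 5, 6, 7 };

// Elements present in both lists (intersection).
int[] inBoth = listA.Intersect(listB).ToArray();

// Elements of A that are missing from B (relative complement).
int[] onlyInA = listA.Except(listB).ToArray();

// Swapping the operands answers the opposite question.
int[] onlyInB = listB.Except(listA).ToArray();

Console.WriteLine(string.Join(", ", inBoth));   // 4, 5
Console.WriteLine(string.Join(", ", onlyInA));  // 1, 2, 3
Console.WriteLine(string.Join(", ", onlyInB));  // 6, 7
```

Note that Except() is not symmetric: which list comes first decides which question gets answered.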

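Both methods compare elements with the default equality comparer, so they work out of the box with built-in types such as int and string. For your own classes, implement the IEquatable<T> interface together with a matching GetHashCode(), or pass an IEqualityComparer<T> as an additional argument.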
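How elements are compared matters as soon as the Ids are strings. The following sketch assumes a few made-up Ids that differ only in casing and shows the difference between the default comparison and the overload of Except() that takes an IEqualityComparer<T>:

```csharp
using System;
using System.Linq;

// Hypothetical Ids that differ only in casing.
string[] listA = { "ab-1", "AB-2", "ab-3" };
string[] listB = { "AB-1", "ab-2" };

// Default comparer: string equality is case-sensitive, so nothing in A matches B.
string[] strict = listA.Except(listB).ToArray();

// Explicit comparer: treat "ab-1" and "AB-1" as the same Id.
string[] relaxed = listA.Except(listB, StringComparer.OrdinalIgnoreCase).ToArray();

Console.WriteLine(string.Join(", ", strict));   // ab-1, AB-2, ab-3
Console.WriteLine(string.Join(", ", relaxed));  // ab-3
```

Whether a case-insensitive comparison is appropriate depends on the data; for purely numeric Ids the default comparison is usually enough.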

 

A solution using LINQ inside LINQPad

For my purpose I use LINQPad with C# Statement(s) as the language setting. This allows me to focus on the commands and ignore all the boilerplate code. Using LINQ to split up the lists gives me this little snippet of code:
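A minimal sketch of such a script could look like the following. The file names, the sample Ids and all variable names are assumptions made for illustration, not the original ones. It is written as plain C# top-level statements so that it also runs outside LINQPad; inside LINQPad the using lines are typically not needed because the common namespaces are imported by default.

```csharp
using System;
using System.IO;
using System.Linq;

// Placeholder paths: point these at the two files to compare (one Id per line).
string pathA = Path.Combine(Path.GetTempPath(), "listA.txt");
string pathB = Path.Combine(Path.GetTempPath(), "listB.txt");

// Sample data so the sketch runs on its own; remove these two lines for real files.
File.WriteAllLines(pathA, new[] { "1001", "1002", "1003", "1004" });
File.WriteAllLines(pathB, new[] { "1003", "1004", "1005" });

// Trim whitespace and skip empty lines so stray blanks do not count as Ids.
string[] idsA = File.ReadAllLines(pathA)
    .Select(line => line.Trim())
    .Where(line => line.Length > 0)
    .ToArray();
string[] idsB = File.ReadAllLines(pathB)
    .Select(line => line.Trim())
    .Where(line => line.Length > 0)
    .ToArray();

// Split the Ids into the three groups of interest.
string[] inBoth = idsA.Intersect(idsB).ToArray();
string[] onlyInA = idsA.Except(idsB).ToArray();
string[] onlyInB = idsB.Except(idsA).ToArray();

// In LINQPad, inBoth.Dump("In both") would pretty-print the result instead.
Console.WriteLine("In both:   " + string.Join(", ", inBoth));
Console.WriteLine("Only in A: " + string.Join(", ", onlyInA));
Console.WriteLine("Only in B: " + string.Join(", ", onlyInB));
```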

The .Dump() method is an extension method provided by LINQPad to pretty-print objects. Apart from that, it is pure C# code.

The files I reference at the top contain the Ids I need to check (one per line). If I need to check a new pair of files, I simply change the paths; no other modifications are needed.

 

Why LINQPad?

I could use a command line application or a unit test and get the same result. However, LINQPad is great for such simple scripts where I have the source code at hand and can modify the script at will should a slightly different requirement pop up.

The basic version is free, but for any serious coding you will want the Pro edition, which adds autocompletion and tooltips for C#, VB.NET and F#.

 

Conclusion

These checks are not complicated, but they require concentration. That makes them ideal candidates for automation, and as the example above shows, it does not take a lot of code. Try LINQPad when you have similar tasks; it could save you a lot of time and reduce your errors significantly.

2 thoughts on “Comparing Lists with LINQ & LINQPad”

  1. Hi Johnny, great article as always 🙂 I was using LinqPad in the free version but then I found https://roslynpad.net/. It has all the functionality I need to quickly test C# code and it’s open source. I wouldn’t say it’s better than LinqPad but I would recommend it as an alternative.

    • Hi Tomáš,
      Roslynpad looks interesting, especially the code generation for methods in an interface. I will give it a try when I need to write the next little script.

      Thanks,
      Johnny
