Python Friday #109: Set Operations on Lists With NumPy

Whenever I need to find values that are part of one list but not another, I like to work with set operations. I find them more elegant than looping through the lists. Let’s look at what Python offers to solve this problem.

This post is part of my journey to learn Python. You can find the other parts of this series here. You can find the code for this post in my PythonFriday repository on GitHub.

 

NumPy?

NumPy is the fundamental package for scientific computing with Python and a great addition to pandas. As with pandas, NumPy offers a large set of features, all of which are described in the official documentation.

We can install NumPy with pip by running pip install numpy in a terminal.

 

Preparation

With set operations we can get parts of two lists (or arrays) without iterating through them. This gives a more elegant solution and may reveal the goal of your code more clearly. For this post we need the two lists a and b:
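A minimal sketch of the two lists, with the values taken from the description that follows (1 to 4 in a, 3 to 6 in b). The import alias np is the usual convention and is what the function names in this post assume:

```python
import numpy as np

# Values assumed from the description of the two lists
a = [1, 2, 3, 4]
b = [3, 4, 5, 6]
```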

I find it helpful to have a graphical representation of the two “sets” we work on. The numbers 1-4 are part of list a, while the numbers 3-6 are part of list b. The numbers 3 and 4 are in both lists:

The two lists in a graphical representation

 

Set difference: Elements in the first but not the second list

We can use the set difference when we want the elements in list a that are not part of list b. In NumPy this function is called np.setdiff1d():
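A minimal sketch of that call, assuming the lists a = [1, 2, 3, 4] and b = [3, 4, 5, 6] described above (the variable name only_in_a is just for illustration):

```python
import numpy as np

a = [1, 2, 3, 4]  # assumed values for list a
b = [3, 4, 5, 6]  # assumed values for list b

# Elements of a that do not occur in b
only_in_a = np.setdiff1d(a, b)
print("in a but not in b:", only_in_a)
```

Note that the result is a sorted NumPy array of unique values and not a Python list. If a plain list is needed, only_in_a.tolist() converts it.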

This gives us the numbers 1 and 2:

in a but not in b: [1 2]

If we want to know what elements are in b but not in a, we need to switch the two lists:
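As a sketch with the same assumed lists, only the order of the arguments changes (only_in_b is an illustrative name):

```python
import numpy as np

a = [1, 2, 3, 4]  # assumed values for list a
b = [3, 4, 5, 6]  # assumed values for list b

# Swapping the arguments gives the elements of b that do not occur in a
only_in_b = np.setdiff1d(b, a)
print("in b but not in a:", only_in_b)
```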

This gives us the numbers 5 and 6:

in b but not in a: [5 6]

 

Intersection: Get the elements that are in both lists

If we want the elements that are in both lists, we can use the function np.intersect1d():
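A minimal sketch, again assuming the lists a and b from the preparation step (in_both is an illustrative name):

```python
import numpy as np

a = [1, 2, 3, 4]  # assumed values for list a
b = [3, 4, 5, 6]  # assumed values for list b

# Elements that occur in a as well as in b
in_both = np.intersect1d(a, b)
print("both in a AND b:", in_both)
```

For the intersection the order of the arguments does not change the result.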

This gives us the numbers 3 and 4:

both in a AND b: [3 4]

 

Union: Get a set of all elements

With the function np.union1d() we get all elements of both lists, with each element appearing only once:
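A minimal sketch with the same assumed lists (everything is an illustrative name):

```python
import numpy as np

a = [1, 2, 3, 4]  # assumed values for list a
b = [3, 4, 5, 6]  # assumed values for list b

# All elements from a and b, sorted and without duplicates
everything = np.union1d(a, b)
print("everything from a and b:", everything)
```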

This gives us the numbers 1 through 6:

everything from a and b: [1 2 3 4 5 6]

 

Conclusion

NumPy is a powerful library, and the set operations are only a tiny part of what it offers. For certain problems I like the set operations a lot, and it is nice that Python, through NumPy, offers support for them.
