Python Friday #109: Set Operations on Lists With NumPy

Whenever I need to find values that are part of one list but not another, I like to work with set operations. I find them more elegant than looping through the lists. Let’s look at what Python offers to solve this problem.

This post is part of my journey to learn Python. You can find the other parts of this series here. The code for this post is in my PythonFriday repository on GitHub.

NumPy?

NumPy is the fundamental package for scientific computing with Python and a great addition to pandas. As with pandas, NumPy offers a large set of features, all of which are described in the official documentation.

We can install NumPy with pip:

pip install numpy

Preparation

With set operations we can get parts of two lists (or arrays) without writing the loops ourselves. This gives a more elegant solution and may reveal the goal of your code more clearly. For this post we need the two lists a and b:

import numpy as np

a = [1,2,3,4]
b = [3,4,5,6]

I find it helpful to have a graphical representation of the two “sets” we work on. The numbers 1-4 are part of list a, while the numbers 3-6 are part of list b. The numbers 3 and 4 are in both lists:

The two lists in a graphical representation
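For comparison, here is a minimal sketch of the loop-based approach that the set operations replace. The variable names ending in _loop are made up for this illustration and are not part of the original code:

```python
a = [1, 2, 3, 4]
b = [3, 4, 5, 6]

# Loop-based difference: keep every element of a that is missing from b.
in_a_not_in_b_loop = []
for value in a:
    if value not in b:
        in_a_not_in_b_loop.append(value)

# Loop-based intersection: keep every element of a that is also in b.
in_both_loop = []
for value in a:
    if value in b:
        in_both_loop.append(value)

print(in_a_not_in_b_loop)  # [1, 2]
print(in_both_loop)        # [3, 4]
```

The loops work, but the reader has to trace the condition to see what each one is for. The NumPy functions in the next sections name the operation directly.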

Set difference: Elements in the first but not the second list

We can use the set difference when we want the elements in list a that are not part of list b. In NumPy this function is called np.setdiff1d():

in_a_not_in_b = np.setdiff1d(a,b)
print(f"in a but not in b: {in_a_not_in_b}")

This gives us the numbers 1 and 2:

in a but not in b: [1 2]
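One detail to keep in mind: np.setdiff1d() returns the unique values in sorted order, so repeated values and the original ordering are lost. The sketch below uses a hypothetical input with repeats (a_with_repeats is not part of the original example) and shows a boolean mask built with np.isin() as one possible way to keep order and duplicates:

```python
import numpy as np

# Hypothetical input: unsorted and with a repeated value.
a_with_repeats = np.array([2, 1, 2, 4, 3])
b = [3, 4, 5, 6]

# Set difference: unique values, sorted.
unique_sorted = np.setdiff1d(a_with_repeats, b)

# Boolean mask: keeps the original order and the repeated 2.
keep_order = a_with_repeats[~np.isin(a_with_repeats, b)]

print(unique_sorted)  # [1 2]
print(keep_order)     # [2 1 2]
```

Which of the two is right depends on whether you think of your data as a set or as a sequence.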

If we want to know what elements are in b but not in a, we need to switch the two lists:

in_b_not_in_a = np.setdiff1d(b,a)
print(f"in b but not in a: {in_b_not_in_a}")

This gives us the numbers 5 and 6:

in b but not in a: [5 6]
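If you need both directions at once, meaning the elements that are in exactly one of the two lists, NumPy also has np.setxor1d() for the symmetric difference. This goes slightly beyond the two calls above, so take it as a small sketch. The names only_in_one and combined are chosen for this illustration:

```python
import numpy as np

a = [1, 2, 3, 4]
b = [3, 4, 5, 6]

# Symmetric difference: values that appear in exactly one of the lists.
only_in_one = np.setxor1d(a, b)

# The same result, built from the two one-directional differences.
combined = np.union1d(np.setdiff1d(a, b), np.setdiff1d(b, a))

print(f"in a or b, but not both: {only_in_one}")  # [1 2 5 6]
print(f"built from two differences: {combined}")  # [1 2 5 6]
```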

Intersection: Get the elements that are in both lists

If we want the elements that are in both lists, we can use the function np.intersect1d():

intersect = np.intersect1d(a,b)
print(f"both in a AND b: {intersect}")

This gives us the numbers 3 and 4:

both in a AND b: [3 4]
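Sometimes it is useful to know where the shared values sit in each list. As a sketch, and assuming NumPy 1.15 or newer, np.intersect1d() accepts return_indices=True and then also returns the positions of the first occurrence of each common value. The names common, idx_a and idx_b are chosen for this illustration:

```python
import numpy as np

a = [1, 2, 3, 4]
b = [3, 4, 5, 6]

# With return_indices=True we get the common values plus their
# positions in a and in b.
common, idx_a, idx_b = np.intersect1d(a, b, return_indices=True)

print(f"common values: {common}")   # [3 4]
print(f"positions in a: {idx_a}")   # [2 3]
print(f"positions in b: {idx_b}")   # [0 1]
```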

Union: Get a set of all elements

With the function np.union1d() we get all elements that appear in either list, with each element included only once:

union = np.union1d(a,b)
print(f"everything from a and b: {union}")

This gives us the numbers 1 through 6:

everything from a and b: [1 2 3 4 5 6]
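np.union1d() takes exactly two inputs. If you have more than two lists, one option is to chain the calls with functools.reduce(). The third list c below is a hypothetical addition for this sketch and is not part of the original example:

```python
from functools import reduce

import numpy as np

a = [1, 2, 3, 4]
b = [3, 4, 5, 6]
c = [6, 7, 8, 8]  # hypothetical third list, with a repeated value

# reduce applies np.union1d pairwise: union1d(union1d(a, b), c)
all_values = reduce(np.union1d, (a, b, c))

print(f"everything from a, b and c: {all_values}")  # [1 2 3 4 5 6 7 8]
```

As with the other set functions, the result is sorted and the repeated 8 appears only once.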

Conclusion

NumPy is a powerful library and the set operations are only a tiny bit of all the things it offers. For certain problems I like the set operations a lot and it is nice that Python offers support for them.
