Finding Needles in Haystacks with Go
I'm working on a project where we're using a fake implementation
of our db
package: the package that does all the calls
to the database. The details of why and how aren't that important
here, but say we have a function like this one:
func (db *DB) GetBooksFromStatus(status []string) ([]Book, error) { // Query db and select all the books with a // status in the list of provided statuses return books, nil }
So in a fake database implementation, we might do something like
func (db *FakeDB) GetBooksFromStatus(status []string) ([]Book, error) { books := []Book{} for _, b := range db.Books { for _, s := range status { if b.Status == s { books = append(books, b) } } } return books, nil }
That's a nested loop. Or O(n2) running time.
That seems inefficient. Would it be faster to use a
map
?
func (db *FakeDB) GetBooksFromStatus(status []string) ([]Book, error) { statuses := map[string]bool{} for _, s := range status { statuses[s] = true } books := []Book{} for _, b := range db.Books { if statuses[b.Status] { books = append(books, b) } } return books, nil }
That's O(n). So it should be faster, right? Well maybe. Certainly not in our tests, since the fake database would rarely hold more than about 10 items of anything, but it's a nice litte thing to think about: at what point does it make sense to optimize something like this?
A quick disclaimer
The above does not make sense to optimize before there are any problems. Don't optimize prematurely and all that.
Second, this is not a comprehensive benchmark. I am leaving out a lot of details, and it's entirely on purpose. The idea here is not to be precise, but rather to build some intuition about when and when not to try to optimize code like the above.
Going in, I suspected that I'd be surprised about how much hay and needles were needed to make the O(n2) running time slower.
Generalizing the problem
The above code can be generalized to finding needles in a haystack. If there is only one needle, then it makes no sense to do anything other than loop through the haystack, but if you have to do it more than once, then surely it will at some point.
Let's write a more general function. It takes two lists of integers, and returns a list of booleans.
func WithBrute(hay, needles []int) []bool { result := make([]bool, len(needles)) for i, n := range needles { for _, h := range hay { if n == h { result[i] = true break } } } return result }
First we create a list with the same length as the needles, of
booleans set to false
. We then go through each needle,
and if we find it in our hay, then we change it to be
true
.
We would call it like this.
hay := []int{4, 2, 5, 8, 7, 1, 5} needles := []int{8, 0, 4} result := WithBrute(hay, needles) fmt.Println(result) // [true false true]
That's the brute force version. Next one, we use a
map
to make looking up faster.
func WithMap(hay, needles []int) []bool { dict := map[int]bool{} result := make([]bool, len(needles)) for _, h := range hay { dict[h] = true } for i, n := range needles { result[i] = dict[n] } return result }
Lastly, we can do something that will probably turn out very similar: sort the hay, then use binary search to find the needles.
func WithBinarySearch(hay, needles []int) []bool { result := make([]bool, len(needles)) sort.Ints(hay) for i, needle := range needles { l, r := 0, len(hay) for l < r { m := (l + r) >> 1 if hay[m] == needle { result[i] = true break } if hay[m] < needle { l = m + 1 } else { r = m } } } return result }
So we have three functions that do essentially the same. Which is the faster? When?
Benchmarking
Go has really good support for benchmarking, so I ran each of
them with different sizes of hays and needles to see how they
matched up. The sizes follow 2n for both, with
n
ranging from 3 to 15, inclusive. That's 8 to 32768
doubling at every step.
The code to run a benchmark would look something like this.
var result = []bool{} func BenchmarkBrute(b *testing.B) { r := []bool{} for n := 0; n < b.N; n++ { r = WithBrute(ints(512), ints(16384)) } result = r }
The call to ints(n)
just generates a list of random
ints (between 0 and 10*n) with the length of
n
. Assigning the result to result
is to
avoid the compiler being too smart on me on removing the call
entirely since it has no side effects and I don't use the
value.
Results
Displaying the results actually isn't that easy. How do you compare three results, with two dependent variables? I decided to plot the two variables as x and y coordinates, and then assign a color, red, green, or blue, to each function: WithBrute, WithMap, and WithBinarySearch.
The result is what you see below. Move your mouse over the plot, to get the exact data on the right.
What is important here is that smaller is better. It is miliseconds per operation. So, perhaps unintuitively, the more red the color is, the worse it is for that function. This is easily seen in the top right corner that is almost perfectly red and where we are searching through 32768 hays to find just as many needles.
Hay on y-axis and Needles on x-axis.