Benchmark with examples in Go

Benchmarking English keywords, mainly used to test CPU and memory efficiency. With certain tools, it can also visually demonstrate the effect of code optimization. Sometimes it is also called performance testing because it is related to performance. The Go language has a built-in benchmark testing package, which is very convenient to use. It is recommended that all Go developers become proficient in using it.

Let’s use the prime number search problem as an example to demonstrate the usage of benchmark testing. Given a number N, find all prime numbers less than N. The algorithm for determining whether a number is prime is also intuitive. Starting from 2, iterate through all numbers i up to the square root of N. If N is divisible by i, then it is not a prime number. So the first version of the code is as follows:

func primeNumbers(max int) []int {
  var primes []int

  for i := 2; i < max; i++ {
    isPrime := true

    for j := 2; j <= int(math.Sqrt(float64(i))); j++ {
      if i%j == 0 {
        isPrime = false
        break
      }
    }

    if isPrime {
      primes = append(primes, i)
    }
  }

  return primes
}

This is our test subject. Now let’s start writing benchmark code.

func BenchmarkPrimeNumbers(b *testing.B) {
  for i := 0; i < b.N; i++ {
    primeNumbers(1000)
  }
}

Ordinary test cases start with “Test”, while benchmark test code starts with “Benchmark”. The input parameter for ordinary test cases is of type *testing.T, while for benchmark tests it is of type testing.B. Everyone should pay attention to the distinction.

Benchmark tests need to call the tested function in a loop, a total of b.N times. The number of iterations is determined by the benchmark framework. This is probably the most significant difference between benchmark tests and ordinary tests.

Then, execute the benchmark test:

go test -run '^$' -bench=.
goos: darwin
goarch: amd64
pkg: ben
cpu: Intel(R) Core(TM) i7-8557U CPU @ 1.70GHz
BenchmarkPrimeNumbers-8           421335              2384 ns/op
PASS
ok      ben     1.042s

Here, -bench=. means executing all the benchmark test functions in the current context. The -run '^$' parameter is a bit confusing. Sometimes you may see others write it as -run '^#'. This parameter is basically used to filter which regular tests to execute. go test will execute both regular tests and benchmark tests. If you only want to perform benchmark tests, you need to find a way to filter out the regular tests. All regular tests start with Test, and none of them match the ^$ and ^# regular expressions, so they will not be executed.

The default output consists of three columns, which are the test name, the number of executions, and the time taken for each operation. If we want to test memory efficiency, we need to enable the -benchmem parameter.

go test -run '^$' -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: ben
cpu: Intel(R) Core(TM) i7-8557U CPU @ 1.70GHz
BenchmarkPrimeNumbers-8           501062              2471 ns/op            5136 B/op         11 allocs/op
PASS
ok      ben     2.274s

The number and frequency of memory allocations for each operation will be outputted additionally.

With the benchmark test, we can verify the optimization effect. The previous prime number determination algorithm used square root, which was very inefficient. We can replace the square root operation with multiplication. Additionally, we can save the square results to further reduce redundant calculations. The complete code is as follows:

func primeNumbers(max int) []int {
  b := make([]bool, max)

  var primes []int

  for i := 2; i < max; i++ {
    if b[i] {
      continue
    }

    primes = append(primes, i)

    for k := i * i; k < max; k += i {
      b[k] = true
    }
  }

  return primes
}

Theoretically, the latter implementation is more efficient. However, actions speak louder than words.

Let’s first save the benchmark results of the first version:

go test -run '^$' -bench=. -benchmem > /tmp/old.txt

Then modify the code to the new version, run the benchmark test again, and save the results:

go test -run '^$' -bench=. -benchmem > /tmp/new.txt

Finally, compare performance differences. To do this, you will need to use the benchstate tool:

go install golang.org/x/perf/cmd/benchstat@latest

After installation, compare performance differences:

benchstat /tmp/old.txt /tmp/new.txt
goos: darwin
goarch: amd64
pkg: ben
cpu: Intel(R) Core(TM) i7-8557U CPU @ 1.70GHz
               │ /tmp/old.txt  │          /tmp/new.txt           │
               │    sec/op     │    sec/op     vs base           │
PrimeNumbers-8   44.244µ ± ∞ ¹   2.275µ ± ∞ ¹  ~ (p=1.000 n=1) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

               │ /tmp/old.txt  │           /tmp/new.txt           │
               │     B/op      │     B/op       vs base           │
PrimeNumbers-8   3.992Ki ± ∞ ¹   4.992Ki ± ∞ ¹  ~ (p=1.000 n=1) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

               │ /tmp/old.txt │          /tmp/new.txt           │
               │  allocs/op   │  allocs/op    vs base           │
PrimeNumbers-8    9.000 ± ∞ ¹   10.000 ± ∞ ¹  ~ (p=1.000 n=1) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

You can see that benchstat prompts that there are too few test specimens. The confidence interval level here should be the confidence probability in mathematical statistics, and the alpha level is the significance level. I have already returned the specific meanings and usage to my university teacher. Here, everyone just needs to understand that benchstat will statistically compare the differences between the benchmark data in the two tests. If there are too few samples, the statistical results are not reliable. The above prompt suggests that there should be at least 6 sets of samples, and it is recommended to test more than 10 sets.

We can use count 10 to specify the number of benchmark test rounds:

go test -run '^$' -bench=. -benchmem -count 10 > /tmp/new.txt

The new test results are as follows:

benchstat /tmp/old.txt /tmp/new.txt
goos: darwin
goarch: amd64
pkg: ben
cpu: Intel(R) Core(TM) i7-8557U CPU @ 1.70GHz
               │ /tmp/old.txt │            /tmp/new.txt             │
               │    sec/op    │   sec/op     vs base                │
PrimeNumbers-8   42.869µ ± 2%   2.272µ ± 1%  -94.70% (p=0.000 n=10)

               │ /tmp/old.txt │             /tmp/new.txt             │
               │     B/op     │     B/op      vs base                │
PrimeNumbers-8   3.992Ki ± 0%   4.992Ki ± 0%  +25.05% (p=0.000 n=10)

               │ /tmp/old.txt │            /tmp/new.txt             │
               │  allocs/op   │  allocs/op   vs base                │
PrimeNumbers-8     9.000 ± 0%   10.000 ± 0%  +11.11% (p=0.000 n=10)

We look at the third column, where CPU time decreased by 94.7%, but memory consumption increased by 25.05%, and memory allocation also increased by 11.11%. This aligns with our strategy of trading space for time, and the optimization is effective.

The above demonstrates a scenario of finding prime numbers within 1000. However, in actual development process, we need to test multiple scenarios simultaneously, and this is where the b.Run() function comes into play.

func BenchmarkPrimeNumbers(b *testing.B) {
    table := []int{100, 1000, 3000}
    b.ResetTimer()
    for _, v := range table {
        b.Run(fmt.Sprintf("input_size_%d", v), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                primeNumbers(v)
            }
        })
    }
}

Here b.ResetTimer() is used to reset the timer of the testing framework, avoiding the influence of previous operations on the test results. Of course, assigning a value to table will not have a substantial impact, it is just shown here to demonstrate the usage. If you are preparing for a very time-consuming operation, it is necessary to reset the timing state.

The testing framework will output three sets of results in sequence:

go test -run '^$' -bench=.
goos: darwin
goarch: amd64
pkg: ben
cpu: Intel(R) Core(TM) i7-8557U CPU @ 1.70GHz
BenchmarkPrimeNumbers/input_size_100-8    3407444   349.0 ns/op
BenchmarkPrimeNumbers/input_size_1000-8    483118  2124 ns/op
BenchmarkPrimeNumbers/input_size_3000-8    173293  6402 ns/op
PASS
ok      ben     4.834s

Reference Document:

  • https://blog.logrocket.com/benchmarking-golang-improve-function-performance/
  • https://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go
  • https://taoshu.in/go/bench.html
Feedback