Code scanning in Go

December 2017

Reading Max Kanat-Alexander’s 2009 post entitled Why programmers suck, I found myself pondering the first item on his list:

Do you know as much as possible about every single word and symbol on every page of code you’re writing?

Great question. No, I don’t. Especially not in Go. It’s a great question because it points to something very specific that one can do in order to level up: Go read the source code of the libraries (or packages) that you are using.

With that in mind, I decided to parse all the Go code that I have written in order to find out which functions I am calling, so I know which documentation to go read… I could also have used my intuition and aimed for the standard http package, but that would be too easy.

With a lot of help from Francesc Campoy I hacked this together:

package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"io/ioutil"
	"log"
	"os"
	"sort"
)

type name struct {
	n string
	c int
}

func main() {
	fs := token.NewFileSet()
	namesMap := map[string]int{}

	for _, arg := range os.Args[1:] {
		b, err := ioutil.ReadFile(arg)
		if err != nil {
			log.Fatal(err)
		}
		f := fs.AddFile(arg, fs.Base(), len(b))
		var s scanner.Scanner
		s.Init(f, b, nil, scanner.ScanComments)

		name := ""
		var prevTok token.Token

		for {
			_, tok, lit := s.Scan()

			if tok == token.EOF {
				break
			}

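			switch tok {
			case token.LPAREN:
				// An opening parenthesis right after a name is treated as a call
				// (this also counts conversions such as string(b)).
				if name != "" {
					namesMap[name]++
					name = ""
				}

			case token.IDENT:
				// Skip the name in a declaration such as "func foo(".
				if prevTok == token.FUNC {
					name = ""
					break
				}
				// Two identifiers in a row: start over with the new one.
				if prevTok == token.IDENT {
					name = ""
				}
				name += lit

			case token.PERIOD:
				// Join qualified names such as fmt.Println.
				name += "."

			default:
				// Any other token ends the current name.
				name = ""
			}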

			prevTok = tok
		}
	}

	names := []name{}
	for n, c := range namesMap {
		names = append(names, name{n, c})
	}

	sort.Slice(names, func(i, j int) bool {
		return names[i].c > names[j].c
	})

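	// Print at most the top 30 names; the bound on len(names) avoids an
	// index-out-of-range panic when fewer than 30 names were found.
	for i := 0; i < len(names) && i < 30; i++ {
		fmt.Printf("%2d %6d: %s\n", i+1, names[i].c, names[i].n)
	}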
}
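
To see what the program has to work with, it helps to look at the raw token stream. The following is a minimal sketch, not part of the original program: the helper name tokens and the file name snippet.go are placeholders of mine. It prints one token per line for a single expression.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokens scans src and returns each token as "KIND" or "KIND(literal)".
// The file name "snippet.go" is a placeholder used only for positions.
func tokens(src []byte) []string {
	fs := token.NewFileSet()
	f := fs.AddFile("snippet.go", fs.Base(), len(src))

	var s scanner.Scanner
	s.Init(f, src, nil, scanner.ScanComments)

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if tok.IsLiteral() {
			// Identifiers and basic literals carry their source text.
			out = append(out, fmt.Sprintf("%s(%s)", tok, lit))
		} else {
			// Operators and keywords print as themselves.
			out = append(out, tok.String())
		}
	}
	return out
}

func main() {
	for _, t := range tokens([]byte(`fmt.Println(len(x))`)) {
		fmt.Println(t)
	}
}
```

The stream starts with IDENT(fmt), a period, IDENT(Println) and an opening parenthesis, which is exactly the pattern the main program joins into fmt.Println and counts. The scanner also emits an automatically inserted semicolon at the end of the input, which the main program's default case simply treats as the end of a name.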

And running it over all my .go files gave me this (with my annotations):

 1  86: len                         # Not too surprising
 2  62: t.Errorf                    # From tests!
 3  54: r.HandleFunc                # gorilla/mux
 4  51: .Methods                    # gorilla/mux
 5  46: byte                        # []byte(someString)
 6  36: fmt.Errorf                  # Returning errors
 7  33: t.Fatalf                    # From tests
 8  26: append
 9  25: make
10  24: string                      # string(someBytes)
11  19: http.NewRequest             # More tests
12  18: middleware.WithUser         # My own middleware!
13  17: fmt.Println                 # I actually thought this would be higher
14  16: toWord                      # Some helper function of mine
15  14: parseBytes                  # Same
16  13: session.LoggedInMiddleware  # Another middleware
17  13: fmt.Printf                  # Again, thought this would be higher
18  12: fmt.Sprintf
19  12: session.NoUserMiddleware    # Middleware...
20  11: request.Header.Set          # From tests
21  11: mix                         # Single function program, I guess being tested
22  10: logg                        # Very simple logging I created the other day
23  10: strings.Replace             # Strings...
24  10: strings.Split
25  10: httptest.NewRecorder        # One would think I write a lot of tests...
26  10: NewLimiter                  # Same rate limiting stuff
27  10: ioutil.WriteFile            # IO!
28   9: hf                          # Helper in tests
29   9: regexp.MustCompile          # Regex!
30   9: BuildKeys                   # My own helper somewhere
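
Entry 4, .Methods, has a leading period because of how the scanning loop treats chained calls: the closing parenthesis of the previous call resets the name, so only the period and the method name survive. Here is a minimal sketch that reproduces this, with the same heuristic wrapped in a helper I am calling countCalls (a name of my choosing) and a made-up route registration in the style of gorilla/mux as input.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// countCalls applies the same heuristic as the program above to a single
// source snippet: join identifiers and periods, count the name at each "(".
func countCalls(src []byte) map[string]int {
	fs := token.NewFileSet()
	f := fs.AddFile("snippet.go", fs.Base(), len(src))

	var s scanner.Scanner
	s.Init(f, src, nil, scanner.ScanComments)

	counts := map[string]int{}
	name := ""
	var prevTok token.Token
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		switch tok {
		case token.LPAREN:
			if name != "" {
				counts[name]++
				name = ""
			}
		case token.IDENT:
			if prevTok == token.FUNC {
				name = ""
				break
			}
			if prevTok == token.IDENT {
				name = ""
			}
			name += lit
		case token.PERIOD:
			name += "."
		default:
			// ")" lands here, so the receiver of a chained call is lost.
			name = ""
		}
		prevTok = tok
	}
	return counts
}

func main() {
	// Hypothetical snippet, not taken from my actual code.
	src := []byte(`r.HandleFunc("/", index).Methods("GET")`)
	for name, n := range countCalls(src) {
		fmt.Println(n, name)
	}
}
```

The snippet yields one count for r.HandleFunc and one for .Methods. Attributing the second call to its real receiver would need type information, which a token scanner does not have.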

A few things stand out to me:

  1. How few calls I make to stuff. I am relatively new to Go (cloc tells me I have a bit more than 3000 lines of code across GitHub and Bitbucket). I didn’t think too hard about it before running the analysis, but I probably would have guessed calls to len and fmt.Println in the hundreds. I guess the latter is misleading, as I tend to fmt.Println stuff a lot during development but then remove the calls once a function takes shape and works as expected. I feel fairly good in Go (I felt that way after only a few minutes with the language), but I need to write (and, ahem, read) more code. A lot more.
  2. The first actual function on the list is one concerning tests, yet I feel like I don’t write enough tests. In fact, there are a lot of calls to functions related to testing. I’m intrigued.
  3. I call a lot of my own stuff. This is the most surprising to me. A lot on that list is my own stuff. Internal calls to small helpers. I am not sure what that says about the language (if anything). Perhaps that you can get a lot done without too many helpers and abstractions. I definitely tend to have fewer abstractions when writing Go as opposed to JavaScript.
  4. I guess I will be reading the source of gorilla/mux. It is a really nice piece of software and very well documented. My hat off to them for creating the only outside library (apart from the standard library, of course) to figure in my top 30 at all, and in a prime spot too!
  5. Other libraries to read: