Marek Cermak · 15 min

One More Language to Go

Engineering · Aug 30, 2022

Marek Cermak, Go Platform Engineering Manager

Heard of this new be-all and end-all programming language? The one that’s easy to learn, fast-to-compile, performant and multi-platform, and the last language we’ll ever need? Well, then please, let me know which one it is, because I haven’t heard of it. I’m just here to talk about Go.  

Even though it isn’t an omnipotent programming language that will replace all others (and I’m pretty sure no such language will ever exist), Go may at least solve that eternal dilemma of yours: “Which language should I learn next?”

And perhaps, once you learn it, you won’t need to ask yourself that question again.

A Bit of Go History

Making Go happen took roughly five years — from the first designs in 2007 through the first public announcement made by Google in 2009 to the first release of version 1.0 in 2012.

Go was thoughtfully designed by people like Robert Griesemer, Rob Pike and Ken Thompson, engineers with more than 50 years of experience with programming language design. I don’t want this to sound cheeky, but compare that to JavaScript, which was developed in 10 days.

Go or Golang?

You might be wondering why I keep calling the language Go instead of Golang, which is the moniker you might be familiar with. Let this be stated at the very beginning: the language is called Go. The alternative moniker might be credited to an unfortunate naming decision of the official Go website, which was golang.org — because go.org was already taken and, back then, there was no .dev domain.

I like to think about the name this way: The name Go makes perfect sense and is a beautiful play on words. Not only does the word go mean language in Japanese but you can also split the word Google (which designed Go) into go and ogle — the look you have when you find out how easy it is to write code in Go. Brilliant, isn’t it?

Nevertheless, the golang label is quite handy; the extensive use of this misnomer now allows for easier Google searches, tagging and referencing without conflicting with the verb “go.”

Born Out of Frustration

What’s rather funny is that Go was created out of frustration with existing languages and environments. Programming had become too difficult and one had to choose either efficient compilation, efficient execution or ease of programming; all three were not available in the same mainstream language.

Programmers who could were choosing ease over safety and efficiency by moving to dynamically typed languages such as Python and JavaScript rather than C++ or, to a lesser extent, Java (Google, 2022a).

But the above-mentioned issues couldn’t be addressed well by libraries or tools. It was time for a new language, and Go wasn’t the only one that came as a response. Among others were Rust, Kotlin and, later, Swift. Programming language development became a mainstream field.

Google encountered the issue of slow and inefficient compilation firsthand. One of the issues was excessive I/O during compilation, when the compiler might be instructed to process the same header file hundreds or even thousands of times.

For example, in 2007, when Google instrumented the compilation of a major Google binary, it consisted of thousands of files that, if simply concatenated together, totaled 4.2 MB. By the time the includes (the dependencies of the current source file) had been expanded, over 8 GB were being delivered to the input of the compiler. That’s 2,000 times the size of the source code!

This led to the need for a distributed build system involving many machines, a lot of caching and much complexity just to build a single binary. Even then, the compilation of the binary took 45 minutes (Pike, 2012). One can imagine that with such prolonged builds, there’s quite a lot of time to think — and the requirements for a new programming language arose.

Defining the Design

The new language had to be fast to compile and scalable for large programs with lots of dependencies and large teams working on them. It had to feel familiar and easy to learn so that the engineers could quickly adapt. And it had to be modern — specifically, suitable for modern approaches such as concurrency and web development.

And just like that, based on these requirements and during 45 minutes of built-up frustration (pun intended) and on the brink of coffee overdose, the idea of Go was born.

Being a Gopher

As mentioned, I wanted to convince you that Go is the language you should learn next. Let’s expand on that. I actually think that Go is the language you want to learn next, you just might not know it yet.

Before we get to the tougher topics, let’s dive a bit more into the nature of the language to give you a taste of what being a Gopher is all about.

Collaboration & Community

From the very beginning, Go was arguably a collaborative endeavor. Google led the development efforts, but Go is open-source in nature, something that’s very apparent — just open the GitHub repository. As of July 2022, it has over 100,000 stars, over 15,000 forks and more than 1,000 proposals (out of those, over 250 have already been accepted).

Go is still under active development. With releases happening roughly every six months, the language evolves, changes and adapts to modern needs — and the Go community plays a huge role in this. Being a Go developer (a.k.a. Gopher) also means being part of that community.

Go was designed to be familiar to engineers (roughly C-like). At the time, programmers working at Google were early in their careers and were most familiar with procedural languages from the C family (Pike, 2012). For obvious reasons, the goal was to get the engineers productive as quickly as possible, meaning that the design could not be too radical.

Go’s syntax feels familiar to anyone who’s ever written code in languages like C/C++, Java or TypeScript and even Pascal. There are, of course, differences. Go is not an object-oriented language and doesn’t strictly conform to the OOP paradigm. That being said, the language is relatively easy to learn. I do say relatively, though…

The syntax and semantics are rather simple to grasp (Go only has 25 keywords). However, the difference between a person who can write code in Go and a Gopher lies in how the code is written and structured. Go’s tooling enforces a consistent code style, which is one of the cool features of Go (no need for separate code style checks and prettifiers), but the code style is just the tip of the iceberg.
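To give you a taste of the syntax, and of how Go departs from classical OOP, here is a minimal sketch. The names Shape, Rect and TotalArea are made up for illustration. The point is that there are no classes and no "implements" keyword: a type satisfies an interface simply by having the right methods.

```go
package main

import "fmt"

// Shape is satisfied implicitly by any type that has an Area method.
type Shape interface {
	Area() float64
}

// Rect is a plain struct; Go has no classes and no inheritance.
type Rect struct {
	Width, Height float64
}

// Area is a method attached to Rect. Rect never declares that it
// implements Shape; having the matching method set is enough.
func (r Rect) Area() float64 {
	return r.Width * r.Height
}

// TotalArea accepts any mix of values whose types satisfy Shape.
func TotalArea(shapes ...Shape) float64 {
	total := 0.0
	for _, s := range shapes {
		total += s.Area()
	}
	return total
}

func main() {
	fmt.Println(TotalArea(Rect{Width: 2, Height: 3}, Rect{Width: 1, Height: 4})) // prints 10
}
```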

The “Symptoms”

For some reason, becoming a Gopher seemingly comes with a certain degree of mental health issues. The symptoms include obsession with code design; sudden emotional attachment to interfaces; occasional rage outbursts due to disrespect for common patterns and guidelines; and an impudent lack of respect for other programming languages.

Gophers also tend to follow common patterns and best practices like repository structure, naming semantics and error handling. Knowing these patterns and following them makes Go engineers more effective, and it’s what makes the difference between an engineer writing code in Go and a Gopher.

Being a Gopher means being an embodiment of clarity, curiosity and hunger for improvement.

Features

In the early 2000s, computers had become enormously faster — but programming itself hadn’t advanced nearly as much. Multiprocessors were becoming mainstream but there was little support for multiprocessing in languages such as C/C++ or Java.

Go addressed the shortcomings of the existing programming languages (slow compilation, inefficient execution, difficult programming and a lack of first-class support for concurrency or parallelism) by attempting to combine the ease of programming of an interpreted, dynamically typed language with the efficiency and safety of a statically typed, compiled language. It also aimed to be modern, with support for networked and multicore computing.

Finally, working with Go is intended to be fast; it should take at most a few seconds to build a large executable on a single computer. Meeting these goals required addressing several linguistic issues: an expressive but lightweight type system; concurrency and garbage collection; rigid dependency specification; and so on.

Let’s discuss which Go features solve some of these issues.

Compilation

Did I mention that the idea of creating Go was a result of a coffee overdose? While this may or may not be true, what certainly is true is that compilation was demanding, to say the least.

By 2015, Google’s codebase had grown to over two billion lines of code, most of it kept in a single monolithic repository (Metz, 2015).

Go compiles significantly faster than C/C++. We’ve already mentioned the inefficiency of the header includes. Go, on the other hand, has no dependency cycles and no unused dependencies. Unused dependencies are explicitly prohibited by the compiler; Go tooling even automatically removes them during development (assuming you use one of the modern IDEs or you’re a proper Vim maximalist with a boatload of plugins).

There are also other reasons for the speed: a small language (only 25 keywords), a grammar that can be parsed without a symbol table (unlike C/C++) and optimization features like imports listed at the beginning of a file and inlining of function calls, to name a few.

Go uses a compiler called gc. The compiler was originally written in C to avoid bootstrapping difficulties — i.e., having the need for a Go compiler to set up a Go environment. However, with the release of Go 1.5, the compiler was converted to Go, effectively making Go self-hosting (Google, 2022a).

Concurrency

The modern computing environment enables applications to improve their efficiency by processing instructions asynchronously. Specifically for web servers that serve multiple clients at the same time — think thousands or even millions of clients — being able to process the clients asynchronously instead of one by one is vital in terms of not only application performance but also stability.

Not many languages provide first-class support for concurrency or parallelism at the language level. Go embodies a variant of communicating sequential processes (CSP) which is a formal language to describe patterns of interaction in concurrent systems. Go implements this pattern using channels and goroutines. The approach involves the composition of independently executing functions of otherwise regular procedural code (Pike, 2012).

Go uses a very different approach to concurrency compared to languages such as C++. Gophers live by a mantra attributed to Rob Pike: “Don’t communicate by sharing memory; share memory by communicating.”

To expand on this, in Go, state is shared over channels. Channels allow goroutines to communicate and to synchronize their execution. Contexts, which are often used alongside channels to signal cancellation, are out of the scope of this post. This is concurrency on steroids.

Goroutines are lightweight threads of asynchronous execution. They are native constructs that can be spawned using a single keyword and are so lightweight that Go can run tens of thousands of them at a time.

On the other hand, in C++ the state is shared through the use of shared memory. Threads in C++ are also rather resource-intensive; both creating threads and switching context between them carry overhead.
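Here is a minimal sketch of goroutines and channels working together. The function names worker and SumOfSquares are my own, and the worker count is arbitrary. A few goroutines receive numbers over one channel and send squares back over another, so no memory is shared directly.

```go
package main

import "fmt"

// worker receives numbers from in, squares them and sends the results to out.
// It returns once in has been closed and drained.
func worker(in <-chan int, out chan<- int) {
	for n := range in {
		out <- n * n
	}
}

// SumOfSquares distributes nums across the given number of goroutines
// (which must be at least 1) and adds up the results as they arrive.
func SumOfSquares(nums []int, workers int) int {
	in := make(chan int)
	out := make(chan int)

	// The go keyword is all it takes to spawn a goroutine.
	for i := 0; i < workers; i++ {
		go worker(in, out)
	}

	// Feed the inputs from a separate goroutine so that this function
	// can collect results at the same time.
	go func() {
		for _, n := range nums {
			in <- n
		}
		close(in) // lets the workers finish their range loops
	}()

	// Exactly one result arrives per input.
	sum := 0
	for range nums {
		sum += <-out
	}
	return sum
}

func main() {
	fmt.Println(SumOfSquares([]int{1, 2, 3, 4}, 3)) // prints 30
}
```

The channels here are unbuffered, so every send waits for a matching receive. That is how the goroutines synchronize without a single lock in sight.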

Consistency

Let me ask you a few questions. Ever forgotten to put a semicolon in the code? How about forgetting about an unused variable? Have you refactored code and left an unused import? Struggled with a different code style when opening a new repository?

If you answered “yes” to any of these questions, you might be the right person to appreciate that programming in Go will relieve you of all of these concerns.

Go ships with a set of tools that includes a formatter, gofmt, which enforces a code style; editor tooling can also remove unused imports automatically. Unused local variables and unused imports are rejected by the compiler, so the code doesn’t even compile.
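Both behaviors can be observed from within Go itself, because the formatter and the type checker are available as standard library packages (go/format and go/types). The following is a sketch rather than how you would normally use these tools; the helper names Format and Check and the snippets are made up for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
	"go/types"
)

// Format applies the standard gofmt style to a source snippet.
func Format(src string) (string, error) {
	out, err := format.Source([]byte(src))
	return string(out), err
}

// Check parses and type-checks a single import-free snippet and returns
// the first error found, or nil if the snippet is valid.
func Check(src string) error {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "snippet.go", src, 0)
	if err != nil {
		return err
	}
	_, err = new(types.Config).Check("snippet", fset, []*ast.File{file}, nil)
	return err
}

func main() {
	// Sloppy spacing is normalized to the one canonical style.
	formatted, err := Format("package p\nfunc  f( ) int {return 1 }\n")
	fmt.Print(formatted)
	fmt.Println("format error:", err)

	// An unused local variable is an error, not a warning.
	fmt.Println("check error:", Check("package p\nfunc f() { x := 1 }\n"))
}
```

In day-to-day work you would simply run gofmt (or let your editor do it on save) and let the compiler complain.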

Even though the enforced code style might sound controversial to some, it makes it easy to navigate any code, enforces best practices and reduces ambiguity.

Dependencies

We’ve already talked about the huge overhead that the include statements — and therefore dependencies — present in C/C++. It’s worth mentioning how Go improves this process.

To make Go scale dependency-wise, the language defines unused dependencies as a compile-time error (not a warning). If the source file imports a package it does not use, the program will not compile. This guarantees that the dependency tree for any Go program is precise and has no extraneous edges. And that, in turn, guarantees that no extra code will be compiled when building the program — which minimizes compilation time (Pike, 2012).

The single biggest reason why the Go compiler is so efficient and fast is the design and implementation of imports. Pike (2012) demonstrates that in the following example.

Consider three packages: A, B and C. Package A imports B and B imports C. A does not import C directly, but it still uses package C transitively through package B. (Import cycles, such as C importing A, are prohibited by the compiler.) To build the program, C is compiled first, then B and then A.
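The build order in this example can be derived with a depth-first walk of the import graph. The sketch below is a toy model and not the actual logic of the Go toolchain; BuildOrder and the map-based graph are hypothetical names of my own.

```go
package main

import "fmt"

// BuildOrder returns the packages reachable from root in an order where every
// package comes after all the packages it imports. An import cycle is an error.
func BuildOrder(imports map[string][]string, root string) ([]string, error) {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := map[string]int{}
	var order []string

	var visit func(pkg string) error
	visit = func(pkg string) error {
		switch state[pkg] {
		case done:
			return nil
		case inProgress:
			// We came back to a package whose imports are still being walked.
			return fmt.Errorf("import cycle through %q", pkg)
		}
		state[pkg] = inProgress
		for _, dep := range imports[pkg] {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[pkg] = done
		order = append(order, pkg) // all dependencies are already in the list
		return nil
	}

	if err := visit(root); err != nil {
		return nil, err
	}
	return order, nil
}

func main() {
	imports := map[string][]string{
		"A": {"B"},
		"B": {"C"},
		"C": nil,
	}
	order, err := BuildOrder(imports, "A")
	fmt.Println(order, err) // prints [C B A] <nil>
}
```

If C imported A, the walk would hit a package that is still in progress and report a cycle, which mirrors the compiler refusing to build such a program.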

During the compilation of individual packages, object files are produced (used during the linking and relocation process, and composed of dependencies, debug information, a list of indexed symbols, etc). These object files contain all the type information necessary for the compiler to execute the import statement. This means that when B is compiled, the generated object file includes all type information for all dependencies of B that affect the public interface of B (Pike, 2012). When A is compiled, the compiler then reads the object file for B and not its source code!

This compiler design means that when the compiler executes an import clause, it opens exactly one file (compare that to up to hundreds of files opened by the C/C++ compiler). Moreover, given that imports are listed at the beginning of each source file, the compiler can really optimize the I/O required to read the dependency graph.

As a sweet cherry on top for those who stick around until the end of this heavy topic, let me share some data. In 2012, at the very beginnings of Go (and I can guarantee you that Go has evolved greatly since then), Google measured the compilation of a large Google program written in Go and compared its source code fanout to the C++ analysis that was done earlier. The fanout was about 40x, which is fifty times better than the 2,000x measured for C++ (Pike, 2012).

That’s the difference between getting a coffee overdose during the build and having a pleasant cup of coffee, perhaps with a piece of cake.

Memory Management

So far, we’ve covered the preparation of binaries (compilation and builds). The runtime, however, is where the magic happens, especially when it comes to memory management and garbage collection.

Go’s memory management is, in my opinion — careful though, I am a geeky person — one of the language’s most impressive features. For those of you geeks out there, this section is for you.

A Go runtime has to store the objects somewhere and, of course, that somewhere is the memory. To be more precise, there are two memory locations: the heap and the stack. Memory is preferably allocated on the stack (a LIFO data structure). This kind of memory allocation is preferred for obvious reasons – it’s simple and effective, the objects are allocated in LIFO fashion and the memory is immediately freed once they run out of scope.

You might be asking what this “scope” is and how the program knows when an object runs out of scope. It’s the Go compiler that does the heavy lifting by performing escape analysis. The logic is surprisingly simple: If the compiler can determine the lexical scope of an object — i.e., its lifetime — then the object will be allocated on the stack.

Go has a stack per goroutine and it will allocate objects on the stack whenever possible. However, the stack can only be used for objects that are referenced solely within a given scope (i.e., within a function). In contrast, the heap is used to allocate memory for objects referenced outside of a scope or for which a scope can’t be determined. Typical examples are values whose pointers are returned from a function or stored in a longer-lived structure.

Generally, if a pointer to an object outlives the scope that created it, the object is stored on the heap, where it has to be taken care of by the garbage collector. The standard Go toolchain provides a runtime library that ships with every application, and this runtime library includes a garbage collector.
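A minimal sketch of the difference, with made-up names (Point, sumOnStack, newOnHeap). Where each value actually ends up is the compiler's decision and may change with inlining and compiler version, so treat the comments as the typical outcome rather than a guarantee.

```go
package main

import "fmt"

// Point is a small value type used to illustrate escape analysis.
type Point struct{ X, Y int }

// sumOnStack uses p only inside the function, so the compiler is free
// to keep it on the goroutine's stack.
func sumOnStack() int {
	p := Point{X: 1, Y: 2}
	return p.X + p.Y
}

// newOnHeap returns a pointer that outlives the call, so p typically
// escapes to the heap and becomes the garbage collector's responsibility.
func newOnHeap() *Point {
	p := Point{X: 3, Y: 4}
	return &p
}

func main() {
	fmt.Println(sumOnStack())   // prints 3
	fmt.Println(newOnHeap().X)  // prints 3
}
```

You can ask the compiler to print its escape analysis decisions by building with go build -gcflags=-m.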

Garbage Collection

Garbage collection (shortened as GC) refers to tracing garbage collection, which identifies live (in other words, in-use) objects by following pointers transitively. An object, to be precise and follow the terminology, is a dynamically allocated piece of memory that contains Go value(s). The pointer is a memory address that references any value within an object (Google, 2022b).

This is where it starts to get geekier and geekier, but please, bear with me for a bit. It’s quite interesting, I promise!

The objects that I mentioned, together with pointers to other objects, form an object graph. To identify live memory, the GC walks the object graph starting at the program's roots — pointers that identify objects that are definitely in use by the program. Two examples of roots are local variables and global variables. The process of walking the object graph is referred to as scanning.

Go uses a non-generational concurrent tri-color mark and sweep garbage collector (try to say that five times fast!). Based on the generational hypothesis, short-lived objects are reclaimed most often and thus generational garbage collectors — used by languages like Java or Python — collect recently allocated objects first. Since Go assumes that the short-lived objects are tied to a lexical scope and therefore likely allocated on the stack, Go uses non-generational GC.

There are two components of the GC: the mutator and the collector. The mutator executes application code, allocates new objects on the heap and updates existing ones (making some objects no longer reachable as a result). The collector detects objects that are no longer reachable and frees their memory. Both of them run concurrently.

The GC uses the mark-sweep technique, which means that to keep track of its progress, the GC also marks the values it encounters as live. Once tracing is complete, the GC then walks over all memory in the heap and makes all memory that is not marked available for allocation. This process is called sweeping (Google, 2022b). The full process is quite involved and I won’t go any further here, but I encourage you to study this on your own; it’s certainly worth it!
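To make marking and sweeping more tangible, here is a toy model of one collection cycle over a hand-built object graph. It is a simplified, single-threaded sketch and has nothing to do with the real runtime implementation; Object, Collect and demoHeap are hypothetical names. In tri-color terms, unmarked objects are white, objects waiting on the worklist are grey and marked objects are black.

```go
package main

import (
	"fmt"
	"sort"
)

// Object is a toy heap object: a name plus pointers to other objects.
type Object struct {
	Name   string
	Refs   []*Object
	marked bool
}

// mark walks the object graph from the roots and marks everything reachable.
func mark(roots []*Object) {
	worklist := append([]*Object(nil), roots...) // grey objects
	for len(worklist) > 0 {
		obj := worklist[len(worklist)-1]
		worklist = worklist[:len(worklist)-1]
		if obj.marked {
			continue // already black
		}
		obj.marked = true
		worklist = append(worklist, obj.Refs...)
	}
}

// sweep reports the unmarked (unreachable) objects, sorted by name,
// and clears all marks for the next cycle.
func sweep(heap []*Object) []string {
	var freed []string
	for _, obj := range heap {
		if !obj.marked {
			freed = append(freed, obj.Name)
		}
		obj.marked = false
	}
	sort.Strings(freed)
	return freed
}

// Collect runs one mark-and-sweep cycle and returns the names of freed objects.
func Collect(heap, roots []*Object) []string {
	mark(roots)
	return sweep(heap)
}

// demoHeap builds a tiny heap: a -> b is reachable from the root a,
// while c and d only point at each other (and at a).
func demoHeap() (heap, roots []*Object) {
	a, b := &Object{Name: "a"}, &Object{Name: "b"}
	c, d := &Object{Name: "c"}, &Object{Name: "d"}
	a.Refs = []*Object{b}
	c.Refs = []*Object{d, a}
	d.Refs = []*Object{c}
	return []*Object{a, b, c, d}, []*Object{a}
}

func main() {
	fmt.Println(Collect(demoHeap())) // prints [c d]
}
```

Note that c and d reference each other, yet they are still collected. A tracing collector only cares about reachability from the roots, so cycles of garbage are not a problem.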

The point of this explanation is to emphasize that there are costs worth understanding. GC is inherently quite expensive and it does impact application runtime: for example, the program is briefly paused at certain points of each GC cycle, and the concurrent phases compete with the application for CPU time. It is quite intuitive if you think about it: the more often garbage collection cycles happen, the more of this overhead the program pays, which hinders performance.

As a side note, the GC’s own costs come down to CPU time and physical memory. The memory it needs for its bookkeeping is rather small in comparison with the heap itself, so we can omit it from consideration with a clean conscience.
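If you want to see the collector at work, the runtime package exposes a few counters. The sketch below is an illustration under my own assumptions: the helper names and the allocation sizes are arbitrary, and the exact number of cycles depends on the Go version and GC settings.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps the most recent allocation reachable so that the buffers
// are allocated on the heap rather than on the stack.
var sink []byte

// churn allocates n short-lived 1 MiB buffers to give the collector work.
func churn(n int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, 1<<20)
	}
}

// gcCycles reports how many GC cycles have completed so far.
func gcCycles() uint32 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC
}

// CyclesDuring runs work, forces one final collection and returns the
// number of GC cycles that completed in the meantime.
func CyclesDuring(work func()) uint32 {
	before := gcCycles()
	work()
	runtime.GC() // blocks until a full collection has finished
	return gcCycles() - before
}

func main() {
	n := CyclesDuring(func() { churn(200) })
	fmt.Println("GC cycles during the workload:", n)
}
```

The more garbage a program produces, the more cycles you will see. How eagerly the collector runs can be tuned with the GOGC environment variable.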

Performance

One of the first features often associated with Go is performance. While it is true that Go is close to high-level languages in terms of syntax, it is also close to low-level languages in terms of performance.

Go is often compared to languages like C++ and Rust in terms of benchmarks. Sounds almost too good to be true, doesn’t it? There are various reasons behind Go being quite performant: strong, static typing; efficient memory management; compilation to machine code (i.e., no virtual machines); and compiler optimizations (see the next section).

Instead of convincing words, let’s go with numbers. The following figures show data from benchmarks generated as of August 9, 2022. The main goal of these benchmarks is to compare performance differences between various programming languages.

It’s important to note that various implementations might use different optimizations, but the benchmarks tend to be facilitated in a controlled environment, making results as reliable as possible. You can find more details here.

Figures 1 and 2 show the results of the two selected benchmarks: spectral norm and HTTP server. These benchmarks were chosen to represent the actual difference in Go’s performance in various contexts.

As we can see in the spectral norm benchmark, the performance of Go in this setup is nothing to be excited about. In fact, it’s about 2.5x slower than the fastest C++ benchmark. Clearly, for this benchmark (based on matrix multiplication), Go isn’t the best tool for the job. Even though it’s quite efficient in terms of resources — the size of the points represents memory usage — the computation time is still rather high even compared to Go’s direct competitor, Rust.

On the other hand, Go and Rust dominate the HTTP serving benchmark, where they strongly outperform the rest of the programming languages. Interestingly enough, Go’s performance is much more reliable than Rust’s. In terms of numbers, Go’s standard deviation on the benchmark is 4.1ms whereas Rust exhibits roughly 10x the deviation — 42ms. Given the difference of 14ms in performance between Go and Rust, Go will probably outperform Rust in the long run.

Fig. 1: Benchmark: Spectral Norm, August 09 2022 (Data: Benchmarks)

Fig. 2: Benchmark: HTTP Server, August 09 2022 (Data: Benchmarks)

Go in a Nutshell (Worth Cracking)

I am a strong believer in the concept of picking the right tools for the job. Go is not suitable for everything. It is not the be-all and end-all of programming languages. That being said, it is designed to build highly performant, fast-to-compile, lightweight services with minimal development overhead.

It came into existence to make things better, faster, more robust and, in general, to make our lives easier and the engineers happier. And I think it’s doing a solid job at all of that.

I see a parallel with other languages, such as JavaScript, Swift or Kotlin. Just like these languages were supposed to be an alternative (or even a replacement) for PHP, Objective-C and Java, Go is supposed to be a better alternative to C++ when it comes to web development.

Close to high-level languages in terms of syntax but competing with low-level languages in terms of performance, Go provides a specific set of features that enable engineers to write lightweight services with native support for concurrency and enforce consistent and explicit code.

It’s kind of difficult to describe a programming language in words. Just like it’s difficult to describe a song without singing it or food without tasting it. However, I encourage you to have a taste of Go.
