A practical guide to protocol buffers (Protobuf) in Go (Golang)


This article targets readers with an interest in software design, protocol buffers or computer programming in general. It is fairly technical and assumes some knowledge of programming concepts.

Protocol buffers and Golang have been two recent fascinations of mine ever since I saw them in action. As an engineer, I can see a lot of power that can be harvested by leveraging either or both technologies, which is probably why they are all over the place in Google’s infrastructure. This article serves as a practical tutorial on how to use protocol buffers in Go, with some diving into the language features. Because this article is rather long, I decided to divide it into different sections:


Why Protocol Buffers?

Put very simply, protocol buffers (protobuf) encode and decode data so that multiple applications written in different programming languages can exchange a large number of messages quickly and reliably without overloading the network. In my experience, the beautiful thing about protocol buffers is that the more data you send, the more you gain compared with other methods of sending the same data load, which is really helpful if you have a lot of data to send across the network between two nodes. In practice, protocol buffer libraries serialize your messages into a compact binary format: they give you the tools to encode the messages at the source and decode them at the destination. Note that this is compact encoding rather than general-purpose compression. Protocol buffers are currently supported in multiple programming languages, as outlined in https://code.google.com/p/protobuf/wiki/ThirdPartyAddOns, although the level of support varies.
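To make the “compact binary format” a bit more concrete, here is a minimal hand-rolled sketch of how a single integer field ends up on the wire. This is only an illustration: in a real program the generated code and the protobuf library do this for you, and the helper name encodeVarintField is made up for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeVarintField hand-encodes one integer field the way the Protobuf wire
// format lays it out: a key (field number and wire type), then the value as a
// base-128 varint. Hypothetical helper, for illustration only.
func encodeVarintField(fieldNumber int, value uint64) []byte {
	const wireTypeVarint = 0
	buf := make([]byte, 2*binary.MaxVarintLen64)
	// The key is (field number << 3) | wire type, itself stored as a varint.
	n := binary.PutUvarint(buf, uint64(fieldNumber)<<3|wireTypeVarint)
	n += binary.PutUvarint(buf[n:], value)
	return buf[:n]
}

func main() {
	// Field number 1 holding the value 150 takes only three bytes.
	fmt.Printf("% x\n", encodeVarintField(1, 150)) // prints "08 96 01"
}
```

Small numbers take fewer bytes than large ones, and field names never travel over the network, which is a big part of why the messages stay small.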

Since protobuf is only concerned with encoding and decoding the data, you need to take care of how to send it. The most obvious choice is sockets, which I will cover here; however, you can get more creative and try a fancier transport method such as zeromq.

Why Go?

Go (or Golang) is a relatively new programming language developed by Google to replace some complex C/C++ components utilized in Google’s application layer. What sets Go apart is that it was built from the ground up to provide performance that is meant to compete with very powerful languages like C/C++, while supporting a relatively simple syntax that resembles dynamic languages and is not as confusing as languages like Haskell or OCaml. Go is garbage collected, but it doesn’t rely on a virtual machine to achieve that. It compiles everything down to machine code: you simply choose the target platform (Windows, Mac, etc.) when you build, and the compiler produces a single binary that runs on that platform, which makes Go native and cross-platform at the same time. The way Go approaches common topics like OOP and threading is different from its predecessors, which makes it a testimony to what future programming languages should look like. Personally, I like to view Go as a magic tool to obtain super powers, and I am sure we all love super powers. The fact that I can write a piece of code that is simple in syntax, and deploy it anywhere without having to install any libraries, virtual machines or frameworks on the target machine, is like how the Green Lantern can do anything anywhere whenever he feels like it using a mere green ring.

Go, in my opinion, is perfect for microservice architectures, which we will be seeing a lot of in the future. A microservice architecture is one in which you divide the responsibilities of your application among smaller services that each focus on a specific task. These services can then communicate with each other to obtain the information they need to produce results.

Source Code


Tutorial Overview

This tutorial is my effort to learn and share how to use protocol buffers in Go. I will try to cover some of Go’s interesting language features in the process; however, an awesome place to start learning Go is here. We’ll do that by building the following pieces:

1- A Go TCP client that reads data from a CSV file and then sends it to a TCP server using protocol buffers

2- A Go TCP server that accepts messages from multiple clients and uses the Protobuf libraries to decode the messages before writing them to a CSV file

In order to demo protocol buffer support for another programming language, I included a Python client that uses protocol buffers which can also send data to the Go server.

Let’s start.

Preparing your development environment

The first step is to set up your development environment. In order to do that, we need to:

1- Install Go =>

a. Go to https://code.google.com/p/go/wiki/Downloads?tm=2

b. Pick the file that corresponds to your development environment operating system

c. Ensure that a GOPATH environment variable is defined and points to your Go workspace (the folder in which you will save your Go code)

d. Ensure that your GOPATH root has three main folders: bin, pkg and src.

e. If you are confused about how to set up your Go environment, check the following article: http://golang.org/doc/code.html

2- Install Python 2.7.6 (For Windows users only) =>

a. Go to https://www.python.org/download/releases/2.7.6/

b. Pick the file that corresponds to your operating system

c. Ensure that your PATH environmental variable points to the Python.exe file.

3- Get Protobuf =>

a. Install both protobuf compiler and source files from https://code.google.com/p/protobuf/downloads/list

b. Install hg Mercurial at http://mercurial.selenic.com/downloads

c. To get the Go protocol buffer libraries, type go get -u code.google.com/p/goprotobuf/… from your command console.

d. To get and build the Python Protobuf libraries, check the README.txt file in the Python folder included in the Protobuf download for build and install instructions. Typically it involves typing python setup.py install

Preparing your Protobuf message

Before you write any code, you need to first define the Protobuf message that will include the data you need to be serialized and sent. Here is what you need to do:

1- Design the shape and structure of the message based on the data that you need serialized

2- Write a proto file that describes your message

3- Use the Protocol Buffer compiler file (Protoc) to convert the Proto file to a library specific to your programming language of choice so that you can include it in your code

4- For a practical example about how to go about Preparing your Protobuf message, check the section below. Otherwise for detailed information about how to write the Proto file, check https://developers.google.com/protocol-buffers/docs/proto

Creating and using the Proto file for our project

Let’s illustrate preparing a Protobuf message by preparing a message for our project:

Imagine that I need to serialize and send a message called “TestMessage”, which includes information identifying the client that sent it. The client information needed is the client name, the client ID and the client type. I also want my message to include a list of items, such that each item has an ID, a name, a value and a type. Now for the purpose of this tutorial, I want the package or library name that includes all of this to be called “ProtobufTest”.

Here’s a sketch of how the TestMessage looked inside my head; let’s assume 5 items are included in the message for now:


1- To define a message called “TestMessage” in a proto file, it’s as simple as typing

2- We agreed that the package or library name is to be called “ProtobufTest”, this can be defined by using the “package” keyword in the Proto file, and don’t forget the semicolon:


3- Let’s say that the item type can only be one of three types: TypeX, TypeY or TypeZ. This can be achieved by using an enum type as follows:

4- Now let’s assume that the only fields that must be filled in the message are the client ID, the client name, and each item ID. The Proto file will end up looking like this:

If you haven’t guessed it yet, the keyword “repeated” is used to define a list of objects; in this case it’s a list of objects of type MsgItem.

5- After creating the .Proto file, we now need to use protoc in order to get a friendly library for our language of choice to use in our project. For that you open a terminal and start protoc with the correct parameters to get a library for your language. A typical protoc syntax would look like this (assuming the language is C++):

6- For our project, I named the .proto file ProtoTest.proto. If we assume that protoc is in the same folder as the .proto file, the syntax will look like this for Python:

This will produce a ProtoTest_pb2.py file that we’ll use with the Python code.

In the case of Go, when building on Windows, I found that you need to ensure that the protoc-gen-go.exe file is in the same folder as the proto file. Again, if we also assume that protoc exists in the same folder, the syntax will look like this:

This will produce a ProtoTest.pb.go file which we’ll use in our Go code.

Now that we have our auto generated libraries that include our message definition ready, it’s time to jump into writing the applications.

Preparing a CSV file for our project

In order for our CSV file to hold the message item information, it needs to include the itemid, itemname, itemvalue and itemType columns. With that, the CSV file should look like this:


The Go TCP client

Let’s start by covering what the Go TCP client application should do:

1- The application needs to be able to take two command line arguments: the CSV filename from which we need to read the data and the socket address (IP address and port number) of the TCP server to which we need to send the data

2- Retrieve the data to send from the CSV file

3- Send the data to the destination TCP server

In order to take and use command line arguments in Go, we’ll use the flag package; I found this link to be extremely helpful in explaining how to use it. Let’s assign the flag ‘-f’ to the file name and the flag ‘-d’ to the destination address:
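As a rough idea of what the flag handling can look like, here is a minimal sketch. It uses a flag.FlagSet inside a small function so the parsing is easy to try out on its own, whereas the article’s client calls the flag package directly and keeps the returned pointers. The default address below is an assumption taken from the server address used later in this article, and the function name parseArgs is made up.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseArgs reads the -f (CSV file name) and -d (destination address) flags.
// The default address is an assumption based on the server section below.
func parseArgs(args []string) (filename, dest string, err error) {
	fs := flag.NewFlagSet("GoProtoClient", flag.ContinueOnError)
	f := fs.String("f", "CSVValue.csv", "CSV file to read the data from")
	d := fs.String("d", "127.0.0.1:2110", "destination address as host:port")
	if err = fs.Parse(args); err != nil {
		return "", "", err
	}
	// The flag package hands back pointers, so dereference them here.
	return *f, *d, nil
}

func main() {
	filename, dest, err := parseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("file:", filename, "destination:", dest)
}
```

Running the program with -f data.csv overrides the file name and leaves the destination at its default.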

From the above snippet, I assigned a default value of CSVValue.csv to the CSV file name and a default destination socket address of

We’ll take the file name and pass it to the retrieveDataFromFile function, which takes the file name and returns a stream of bytes serialized by Protobuf, representing the data retrieved from that file. We’ll then take those serialized bytes, which happen to be a byte slice, and send them to the destination server address using sendDataToDest. I use the checkError function to check for errors instead of showering the code with if statements.

Now let’s add all that together and look at the full code for the main function in our Go Protobuf client:

The difference between “:=” and “=” in Go is that “:=” is a short variable declaration: it declares a new variable, infers its type from the value on the right, and assigns that value, all in one step. “=” only assigns to a variable that has already been declared. That way you don’t always have to add a separate line declaring the variable, as in most other compiled languages.
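A tiny sketch of the two forms side by side (the function name describe is made up for this example):

```go
package main

import "fmt"

// describe shows a short variable declaration next to a plain assignment.
func describe() string {
	name := "client"   // declares name and infers the type string
	var count int      // plain declaration, count starts at zero
	count = 3          // assigns to the already declared variable
	name = name + "-1" // "=" works because name already exists
	return fmt.Sprintf("%s %d", name, count)
}

func main() {
	fmt.Println(describe()) // prints "client-1 3"
}
```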

Now let’s dive into the three functions in our code: checkError, retrieveDataFromFile, and sendDataToDest

checkError(err error) takes an argument of type error and checks it. If there is an actual error, the error message is printed and the application exits. Otherwise nothing happens. Here’s how that looks in Go syntax:
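A minimal sketch of such a helper, assuming we print to standard error and exit with status 1 (the exact message and exit code are my choice here):

```go
package main

import (
	"fmt"
	"os"
)

// checkError prints the error and stops the program if err is not nil.
func checkError(err error) {
	if err != nil {
		fmt.Fprintln(os.Stderr, "Fatal error:", err.Error())
		os.Exit(1)
	}
}

func main() {
	checkError(nil) // nil means no error, so nothing happens
	fmt.Println("no error, still running")
}
```

This is convenient in a small tutorial program. In larger programs it is more common to return the error to the caller and let it decide what to do.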

retrieveDataFromFile(fname *string) ([]byte, error) simply takes a pointer to a string holding the name of the file from which we will obtain our data. It returns a byte slice holding the protobuf-serialized bytes built from the CSV file, as well as an error value, which is nil if the serialization succeeded or otherwise describes whatever went wrong.

It starts by opening the CSV file in question and then it uses a Go CSV reader to go through the file contents and retrieve the data. This requires both the os and the encoding/csv  Go packages. The code is quite simple:

The keyword defer in Go is quite similar to the finally block in C# or Java; it simply means that whenever the function exits, file.Close() will be called.
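The opening step can be sketched as follows. The function name readFirstRecord is made up, and it returns the error to the caller rather than calling checkError, so that the sketch stands on its own.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
)

// readFirstRecord opens the named CSV file and returns its first record.
func readFirstRecord(fname string) ([]string, error) {
	file, err := os.Open(fname)
	if err != nil {
		return nil, err
	}
	// Runs when readFirstRecord returns, whichever return path is taken.
	defer file.Close()

	csvreader := csv.NewReader(file)
	return csvreader.Read()
}

func main() {
	record, err := readFirstRecord("CSVValue.csv")
	if err != nil {
		fmt.Println("could not read the file:", err)
		return
	}
	fmt.Println(record)
}
```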

Next, I want my client program to read the header row from the CSV file and map each header name to a column number. I will use a very interesting feature of Go: I will create a new “type” and simply call it “Headers”, which in reality is just a slice of strings. Then I will write my own function (a method) that belongs to this type to get the column number or index for each header name. It’s easier to explain by an example, so in our case our headers look like this:


For our program, a variable of “Headers” type will contain four values: itemid, itemname, itemvalue and itemType. I want to write a function that will return 1 if I give it “itemname” as an argument, 0 for itemid, etc. This will make my program capable of handling the case in which a user supplies a CSV file where the headers are not placed in the default order above. First we define the “Headers” type as a slice of strings:

Here’s how the getHeaderIndex() function looks:

The “for .. range” combination is Go’s way to quickly loop through a slice or array, similar to “foreach” in C# or the “for .. in” combination in Python. Basically, the function returns the index of the provided string “headername”; if it is not found, it returns –1. Just in case you are wondering: yes, we could have written the function so that it took both the headers slice and the headername string as arguments. However, creating a function that belongs to the “Headers” type makes for a better design, since this function conceptually only applies to the Headers slice.
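Put together, the type and its method can be sketched like this, following the description above:

```go
package main

import "fmt"

// Headers holds the column names read from the first line of the CSV file.
type Headers []string

// getHeaderIndex returns the position of headername, or -1 if it is missing.
func (h Headers) getHeaderIndex(headername string) int {
	for index, name := range h {
		if name == headername {
			return index
		}
	}
	return -1
}

func main() {
	h := Headers{"itemid", "itemname", "itemvalue", "itemType"}
	fmt.Println(h.getHeaderIndex("itemname")) // prints 1
	fmt.Println(h.getHeaderIndex("missing"))  // prints -1
}
```

The “(h Headers)” part before the function name is what attaches the function to the type.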

Now let’s use Go’s CSV library to read the first line of the CSV file which will be the headers and retrieve the index of each Header:
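Here is a small sketch of that step. To keep it self-contained it reads the CSV text from a string instead of a file, and the helper name nameColumn is made up. Note the explicit conversion from the []string returned by the reader to our Headers type.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// Headers holds the column names read from the first line of the CSV file.
type Headers []string

// getHeaderIndex returns the position of headername, or -1 if it is missing.
func (h Headers) getHeaderIndex(headername string) int {
	for index, name := range h {
		if name == headername {
			return index
		}
	}
	return -1
}

// nameColumn reads the header line of csvText and reports which column
// holds "itemname".
func nameColumn(csvText string) (int, error) {
	csvreader := csv.NewReader(strings.NewReader(csvText))
	record, err := csvreader.Read() // the first line is the header row
	if err != nil {
		return -1, err
	}
	headers := Headers(record) // convert []string to our own type
	return headers.getHeaderIndex("itemname"), nil
}

func main() {
	// The columns are deliberately not in the default order.
	idx, err := nameColumn("itemvalue,itemname,itemid,itemType\n5,widget,1,TypeX\n")
	fmt.Println(idx, err) // prints "1 <nil>"
}
```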

The code gives you an idea on how Go’s syntax feels natural when you design your program the Go way.

Now it’s time to start preparing our Protocol buffer object in which we will place the data we read from the CSV file.  My protoc compiler auto generated a ProtoTest.pb.go file from my Protobuf file which I will need to include in my project alongside the google Protobuf package. As mentioned before, I named my package ProtobufTest. The import code should look like this after retrieving Google’s Protobuf libraries:

It’s finally the time to initialize the protobuf object that contains our message then populate it with our client information

The built-in function “new” allocates a ProtobufTest.TestMessage and returns a pointer to it. Here, ProtobufTest is the Go package name that the protoc compiler created from my ProtoTest.proto file, and TestMessage is a Go struct type which contains the body of my message. You can think of a struct type in Go as an object type in other languages. Even though “ProtoMessage” is a pointer, I can still just use ‘.’ to access the fields inside TestMessage, because Go automatically dereferences a pointer to a struct when you access its fields.

Another important remark: when you assign values to the Go struct of a protobuf message, the usual way is to wrap them with the proto helper for the value type (proto.String, proto.Int32 and so on) before assigning them to the appropriate field. These helpers simply return a pointer to the given value, which is how the Go protobuf library is designed to store these fields. If you are wondering what CLIENT_NAME, CLIENT_ID and CLIENT_DESCRIPTION are, they are simply three constants that represent this particular Go client; here are their definitions
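To show the pointer-field idea without pulling in the generated code, here is a sketch that uses a hand-written stand-in struct. Everything in it is an assumption made for illustration: the field names, the constant values, and the local String and Int32 helpers, which only mimic what proto.String and proto.Int32 do.

```go
package main

import "fmt"

// TestMessage is a hand-written stand-in for the generated struct; the real
// one comes from protoc and its field names may differ.
type TestMessage struct {
	ClientName  *string
	ClientId    *int32
	Description *string
}

// String and Int32 mimic the proto.String and proto.Int32 helpers: each one
// copies its argument and returns a pointer to the copy.
func String(s string) *string { return &s }
func Int32(i int32) *int32    { return &i }

// Made-up values that identify this client.
const (
	CLIENT_NAME        = "GoClient"
	CLIENT_ID          = 2
	CLIENT_DESCRIPTION = "This is a Go Protobuf client"
)

// newMessage allocates a message and fills in the client information.
func newMessage() *TestMessage {
	ProtoMessage := new(TestMessage) // a pointer to a zeroed struct
	ProtoMessage.ClientName = String(CLIENT_NAME)
	ProtoMessage.ClientId = Int32(CLIENT_ID)
	ProtoMessage.Description = String(CLIENT_DESCRIPTION)
	return ProtoMessage
}

func main() {
	m := newMessage()
	// m is a pointer, yet "." reaches the fields directly.
	fmt.Println(*m.ClientName, *m.ClientId)
}
```

Because the fields are pointers, a nil field can mean “not set”, which is different from a field that was set to zero or to an empty string.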

Now, I will write a loop to go through the rest of the CSV file, extract the data and pass it to the ProtoMessage pointer that I created. By the time this for loop is done, ProtoMessage will point to the entire message that I want to send to my TCP server. As a reminder, the message we are trying to send contains a list of “messageitems” in which we store the item id, item name, item value and item type.

Let’s dissect the main parts of this for loop:

  1. The for loop breaks either once we get to the end of the file (EOF) or if an error occurs while reading the CSV file. Every record is a slice of strings representing the line that was read by csvreader.Read()
  2. We use the strconv package to convert the strings we obtain from the CSV file to ints before we assign them to the appropriate protobuf message item field. strconv.Atoi() converts a string to an int directly (strconv.Itoa() does the reverse). Go’s int type is 32 or 64 bits wide depending on the platform and is a distinct type from both int32 and int64. Since I defined the int values as int32 in my proto file, I had to do another conversion to int32 before passing the value to the protobuf struct
  3. In the case of the item type, I used ‘&’ to get a pointer directly before assigning it to testMessageItem.ItemType. This is possible because the ‘&’ operator in Go acts similarly to the ‘&’ (address-of) operator in C++: it gets me the memory address of the variable that follows it, which I can then simply assign to a pointer, since a pointer is simply the memory address of a variable
  4. Whenever we finish assigning all the needed values to a message item, we append it to the end of the ProtoMessage.Messageitems slice
  5. I print the converted data at the end of each iteration so that the program can indicate its progress
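The steps above can be sketched roughly as follows. MsgItem here is a hand-written stand-in for the generated item struct with assumed field names, the item type is left out, and the column order is fixed at id, name, value to keep the example short.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// MsgItem is a stand-in for the generated item struct (assumed field names).
type MsgItem struct {
	Id    *int32
	Name  *string
	Value *int32
}

// parseItems reads every remaining record and converts it to a MsgItem.
func parseItems(csvreader *csv.Reader) ([]*MsgItem, error) {
	var items []*MsgItem
	for {
		record, err := csvreader.Read()
		if err == io.EOF {
			break // no more lines to read
		}
		if err != nil {
			return nil, err
		}
		if len(record) < 3 {
			return nil, fmt.Errorf("expected 3 columns, got %d", len(record))
		}
		id, err := strconv.Atoi(record[0]) // string to int
		if err != nil {
			return nil, err
		}
		value, err := strconv.Atoi(record[2])
		if err != nil {
			return nil, err
		}
		id32, value32 := int32(id), int32(value) // int to int32
		name := record[1]
		// "&" takes the address of each fresh variable.
		items = append(items, &MsgItem{Id: &id32, Name: &name, Value: &value32})
	}
	return items, nil
}

// parseItemsFromText is a convenience wrapper so the sketch needs no file.
func parseItemsFromText(text string) ([]*MsgItem, error) {
	return parseItems(csv.NewReader(strings.NewReader(text)))
}

func main() {
	items, err := parseItemsFromText("1,item1,10\n2,item2,20\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, it := range items {
		fmt.Println(*it.Id, *it.Name, *it.Value)
	}
}
```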

Once the for loop is done, it is time to convert our ProtoMessage to a slice of serialized bytes suitable for sending across the network. This is simply accomplished by the proto.Marshal() function as follows

Now let’s put all that code together to show how the retrieveDataFromFile function will look like:

With this out of the way, all that is left is to send the serialized bytes across the network to our destination server. If you remember, earlier in our main function we obtained a pointer to the destination address and assigned it to a variable called dest. Here is the function that takes dest as well as the serialized bytes and sends them to our Go TCP protobuf server:


We use Go’s net package for TCP communication. As you can see, socket communication in general is very straightforward in Go. “conn” is the connection object (a net.Conn) returned when we dial the server; whatever we write to it is sent through the newly created TCP connection. So by simply calling conn.Write(data), we send our byte slice to the Go TCP server.
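A minimal sketch of the sending side, assuming a version of sendDataToDest that takes the address as a plain string and returns an error instead of calling checkError. The roundTrip helper is made up for this example: it starts a throwaway listener on a free local port so the function can be tried without a separate server.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// sendDataToDest opens a TCP connection to dest, writes data and closes it.
func sendDataToDest(data []byte, dest string) error {
	conn, err := net.Dial("tcp", dest)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write(data)
	return err
}

// roundTrip sends data to a temporary local listener and returns whatever
// arrived on the other side.
func roundTrip(data []byte) ([]byte, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0 picks a free port
	if err != nil {
		return nil, err
	}
	defer ln.Close()

	type result struct {
		data []byte
		err  error
	}
	received := make(chan result, 1)
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			received <- result{nil, err}
			return
		}
		defer conn.Close()
		b, err := io.ReadAll(conn) // reads until the client closes
		received <- result{b, err}
	}()

	if err := sendDataToDest(data, ln.Addr().String()); err != nil {
		return nil, err
	}
	r := <-received
	return r.data, r.err
}

func main() {
	got, err := roundTrip([]byte("hello"))
	fmt.Println(string(got), err)
}
```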

With that, our Go Client is complete, now let’s start working on our Go Protobuf server. For our Go server to be efficient, it needs some threads. So before we dig deeper into the code, how about we cover a bit of a practical overview regarding Go’s threads?


Concurrency in Go

Concurrency is one of Go’s most powerful features. The language was built in a way that allows users to very quickly and efficiently spin up special threads that are very lightweight. These special threads are called “goroutines”, and you are encouraged not to be shy about creating them: programs can have hundreds of thousands of goroutines with no noticeable performance issues. In reality, goroutines are not always parallel threads; how they run depends on your program and your hardware architecture (like the number of cores, for example). However, what you need to know is that they are designed to stay out of the way of the rest of your program as much as possible, which makes them perfect for a high-performance application. In order to create a new goroutine, you simply use the keyword “go” before calling the function to be run concurrently.
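As a small sketch of the “go” keyword in action, the made-up function below squares each number in its own goroutine and uses a sync.WaitGroup to wait for all of them:

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll squares each number in its own goroutine.
func squareAll(nums []int) []int {
	out := make([]int, len(nums))
	var wg sync.WaitGroup
	for i, n := range nums {
		wg.Add(1)
		go func(i, n int) { // "go" starts this call as a goroutine
			defer wg.Done()
			out[i] = n * n // each goroutine writes to its own element
		}(i, n)
	}
	wg.Wait() // block until every goroutine has finished
	return out
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // prints [1 4 9]
}
```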

See how simple that was?! No more need for complex syntax whenever you need to create a new thread.

The other thing that typically comes to mind whenever you are about to write a concurrent application is synchronization and shared objects between threads. Go provides locking capabilities when needed via its standard library; however, the most recommended way to achieve harmony between goroutines is Go channels. A channel is basically a way for different goroutines to share data: one goroutine pushes data into it, then another goroutine retrieves this data.


To create a channel, the built-in “make” function is used. You can then use “<-” to either push values into the channel or retrieve values from it
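A short sketch of both directions of “<-” (the function name relay is made up):

```go
package main

import "fmt"

// relay sends each word through a channel from one goroutine and collects
// the words on the receiving side.
func relay(words []string) []string {
	ch := make(chan string) // create a channel of strings
	go func() {
		for _, w := range words {
			ch <- w // push a value into the channel
		}
		close(ch) // tell the receiver that nothing more is coming
	}()

	var got []string
	for {
		w, ok := <-ch // waits until a value arrives or the channel is closed
		if !ok {
			break
		}
		got = append(got, w)
	}
	return got
}

func main() {
	fmt.Println(relay([]string{"hello", "from", "a", "goroutine"}))
}
```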

Again very simple. An important remark here is that whenever you try to receive a value from the channel, the goroutine will block till a value is received. For more detailed information on goroutines and channels, an awesome video on concurrency in Go can be found here.

Now that we got that out of the way, it’s time to get to our Go server’s code.


The Go TCP Server

So let’s jump right into it. Here is what we need our Go TCP Protobuf server to do:

  1. Listen on the TCP socket expected to receive the data; in our case it’s IP address 127.0.0.1 and port number 2110
  2. Read the protobuf message and extract the values from it
  3. Write the extracted values to a CSV file
  4. Ensure that no issues will occur if multiple clients send data at the same time to the port

So in order to achieve the best performance and to write the program in a way that showcases Go’s capabilities, I will use different goroutines for the main tasks in the program. That means we will have a goroutine responsible for extracting values from the protobuf message, as well as another goroutine to write the extracted values to a CSV file. Anything else can be included in the main routine of the application.

Let’s start with the goroutine that will handle the writes to the CSV file. As mentioned earlier, this goroutine is expected to take the values retrieved by another goroutine from the protobuf message and then write them out to a CSV file. In order for it to only act whenever we retrieve a protobuf message, it will need to rely on a channel that carries the protobuf messages. In other words, the goroutine will block as long as there are no protobuf messages in the channel; otherwise it will take the values from the channel and proceed to write them to a CSV file.

We can create that goroutine by using a Go function literal. A function literal is basically a function body without a function name, like an anonymous function

So obviously writeValuesTofile(message) is responsible for writing the values included in the message to a CSV file. We will utilize the CSV package we used in the client program.
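Here is a rough sketch of the writer goroutine idea. To stay self-contained it sends plain rows of strings through the channel instead of protobuf messages, and it writes to an in-memory buffer instead of a file; the function name writeRows is made up.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// writeRows starts a goroutine, defined as a function literal, that waits on
// a channel and writes each row it receives as one CSV line.
func writeRows(rows [][]string) (string, error) {
	var buf bytes.Buffer // stands in for the CSV file
	c := make(chan []string)
	done := make(chan error)

	go func() {
		w := csv.NewWriter(&buf)
		var firstErr error
		for row := range c { // waits here while the channel is empty
			if err := w.Write(row); err != nil && firstErr == nil {
				firstErr = err
			}
		}
		w.Flush()
		if firstErr == nil {
			firstErr = w.Error()
		}
		done <- firstErr
	}()

	for _, row := range rows {
		c <- row // hand one row to the writer goroutine
	}
	close(c) // no more rows, so the range loop above ends
	err := <-done
	return buf.String(), err
}

func main() {
	out, err := writeRows([][]string{{"1", "item1"}, {"2", "item2"}})
	fmt.Print(out)
	fmt.Println("error:", err)
}
```

Since only this one goroutine touches the output, several client handlers can feed the channel at the same time without needing a lock around the file.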

We will use the same checkError() function we used for the client

Now let’s go back to the main function, we already created a goroutine that waits for a protobuf struct to be fed to it then it writes the data contained in this protobuf data struct to the CSV file. Now it is time to listen to the TCP port then whenever data is received on that port, a new goroutine converts it to a protobuf struct of the desired type.

All in all, our main function should look like this:

Now let’s dig deeper into the handleProtoClient() routine which will simply extract the data received via TCP then convert it to the protobuf message type which we will pass on to our go channel so that the writeValuesTofile() routine can get it and use it
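The overall shape of the listening side can be sketched as below. It passes the raw bytes to the channel and leaves out the Protobuf decoding step so that it runs with the standard library alone. The names handleClient, serve and demo are made up, and reading with io.ReadAll is my choice for this sketch, not necessarily how the original server reads the data.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// handleClient reads everything one client sends and passes the bytes on.
// The real server would unmarshal them into the Protobuf message first.
func handleClient(conn net.Conn, c chan<- []byte) {
	defer conn.Close()
	data, err := io.ReadAll(conn) // reads until the client closes
	if err != nil {
		return
	}
	c <- data
}

// serve accepts connections until the listener is closed and starts one
// goroutine per client, so a slow client never holds up the others.
func serve(ln net.Listener, c chan<- []byte) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		go handleClient(conn, c)
	}
}

// demo starts the server on a free local port, connects once as a client
// and returns the bytes that reached the channel.
func demo(payload []byte) ([]byte, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	defer ln.Close()
	c := make(chan []byte)
	go serve(ln, c)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return nil, err
	}
	_, err = conn.Write(payload)
	conn.Close()
	if err != nil {
		return nil, err
	}
	select {
	case data := <-c:
		return data, nil
	case <-time.After(5 * time.Second):
		return nil, errors.New("timed out waiting for the server")
	}
}

func main() {
	got, err := demo([]byte("ping"))
	fmt.Println(string(got), err)
}
```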

One concept worth mentioning from the above is Go slices. A slice is the closest thing to a “flexible array” in Go. Slices are fast and efficient; however, sometimes they are not as straightforward as the fancy containers in other programming languages. A common way to create a slice is the built-in make function, where you give the initial length and, optionally, the capacity. You can then use functions like append() or copy() to operate on it. A really good article that covers tricks when using slices can be found here.
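A quick sketch of make, append and copy on a byte slice (the function name demoSlices is made up):

```go
package main

import "fmt"

// demoSlices builds a slice with make and append, then copies it.
func demoSlices() ([]byte, []byte) {
	buf := make([]byte, 0, 8)        // length 0, capacity 8
	buf = append(buf, 'a', 'b', 'c') // append grows the length as needed
	dup := make([]byte, len(buf))    // the destination must have room
	copy(dup, buf)
	dup[0] = 'z' // changing the copy leaves buf untouched
	return buf, dup
}

func main() {
	buf, dup := demoSlices()
	fmt.Println(string(buf), string(dup)) // prints "abc zbc"
}
```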

And… we’re done

Build & compile your code

As with any other programming language, there are multiple ways to build your Go code. However, one of the quickest and most practical ways to build, in my opinion, is to use the “go install” command, courtesy of the go tool. Here is how simple it is:

  • From the command prompt or terminal, browse to the folder where your source code resides, then type “go install”. Here’s how that looks for the GoProtoClient

  • An executable file will magically show up in your GOPATH\bin folder.

If you are not familiar with what’s meant by the bin folder, go back to the “Preparing your development environment” section.

Running the program

1- Run the server so that it is ready  to accept connections and data


2- Run the client and point it to the csv file and destination address


3- Observe output from the server


4- Observe the CSVValues.csv file


One last word

I really hope this tutorial was useful for you. If you have any comments or feedback, feel free to let me know. Again, all the code is available here; you can play and experiment with it as much as you want, but remember it’s for educational purposes, so use it at your own risk.

Interested to learn more about Go? Please take a moment to check my Mastering Go Programming course. It provides a unique combination of covering deep internal aspects of the language, while also diving into very practical topics about using the language in production environments.
