High performing APIs with gRPC

Cristian Toader
9 min read · Feb 17, 2020


Before reactive streams were a thing

Back in 2015, when looking at creating a new API, your best bet would probably have been to go with Spring Boot Web and REST services. It offers many out-of-the-box functionalities to get you started, as well as the whole Spring Cloud Netflix stack, which allows you to properly scale your application in a microservices-based architecture.

However, horizontal scaling and high-volume transactions only partially cover your API use cases. Situations where you have a high volume of data that you want to transfer or process in a transactional, REST-like manner are not easily achievable. Yes, of course you could stream data and rely on solutions such as JMS or Kafka, but my personal opinion is that this leaves services too loosely coupled to each other, and too tightly coupled to JVM-based programming languages (e.g. Java).

Meanwhile, Google was dealing with a different issue regarding communication between its micro-services. They had traditionally opted for an RPC (remote procedure call) approach called Stubby, and were looking to replace it with a new standard that is less interconnected with their infrastructure and more open to the outside.

This is the context in which gRPC was born: an intercommunication solution across micro-services written in different programming languages, meant to handle high volumes of data. More details about the framework and its motivation can be found on their website: https://grpc.io/about/.

To summarize a few performance details which I think qualify it for the list of high-performance APIs, gRPC offers the following features:

  • HTTP/2 network protocol via Netty which translates to faster transfer speed compared to HTTP/1
  • Google protobuf (GPB) data types which offer a high compression rate on serialization, meaning less data transferred compared to JSON
  • API definition first, implementation second
  • Allows asynchronous data streaming responses
  • Allows non-JVM APIs as well as clients
  • Easy(er) API versioning and backwards compatibility

On its own, gRPC is not a traditional framework, but rather a set of common definitions (protos) which can be compiled into any of its supported languages (12 currently) and implemented as you choose based on a generated stub.

Example gRPC API — A SWAPI story

To start experimenting with gRPC, we need some data and data models. As I’ve mentioned before, gRPC is API definition first and implementation second.

For this purpose I’ve found SWAPI ( https://swapi.co/) which is a fun REST API that serves data from the Star Wars universe. I managed to get blocked quite quickly when trying to pull some of their data, but found an existing CSV export of the same thing (https://www.kaggle.com/jsphyg/star-wars) which we’ll be using from now on.

If you want to follow the example project we’ll be using in this section, you can find it here: https://github.com/cristiantoader/grpc-gpb-medium-swapi

Based on the SWAPI REST documentation and the downloaded CSV files, I've decided to make a couple of gRPC services around the "character" and "planet" data sets. To easily infer the data types for our new models, I've used the protobuf proto3 documentation: https://developers.google.com/protocol-buffers/docs/proto3.

The end result is quite straightforward: two services which return either the entire data sets or a filtered version of them.

You might be tempted not to use a wrapper object for the findByFilter method in CharacterService. Protocol buffers, however, have the benefit of easy backwards compatibility, so using the wrapper allows your API design to remain backwards compatible without making any changes in the clients; the alternative is to keep overloading those RPC definitions. This is something to keep in mind when creating your API definitions.

You can add new fields to your message formats without breaking backwards-compatibility; old binaries simply ignore the new field when parsing. So if you have a communications protocol that uses protocol buffers as its data format, you can extend your protocol without having to worry about breaking existing code. — https://developers.google.com/protocol-buffers/docs/overview
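To make this concrete, here is a tiny, hypothetical sketch (in Java, which we will use for the rest of the article) of what calling such a wrapper-based RPC could look like; CharacterFilterRequest and characterStub are assumptions standing in for the project's actual generated code:

// Hypothetical generated wrapper message; the field names are made up for illustration.
CharacterFilterRequest filter = CharacterFilterRequest.newBuilder()
        .setName("Luke Skywalker")   // the only filter criterion in "v1" of the API
        .build();

// If a later version of the proto adds, say, a homeworld field to the wrapper, this call
// keeps compiling and working unchanged: the server simply sees the new field as unset,
// and older binaries parsing newer payloads ignore the unknown field entirely.
characterStub.findByFilter(filter);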

Compiling these proto definitions into usable code can be done for multiple programming languages. We will be using Java and a Maven build, which will generate stub classes that we'll need to extend and implement in order to add the actual functionality, as can be seen below.

Example gRPC API service
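A minimal sketch of what such a service implementation could look like, assuming the protos generate a SwapiPlanetServiceGrpc stub with a server-streaming findAll RPC (the class and method names here are assumptions based on the article's logs, not the sample project's actual code):

import io.grpc.stub.StreamObserver;

public class SwapiPlanetGrpcApi extends SwapiPlanetServiceGrpc.SwapiPlanetServiceImplBase {

    private final SwapiPlanetService planetService;   // reads the planets from the CSV data set

    public SwapiPlanetGrpcApi(SwapiPlanetService planetService) {
        this.planetService = planetService;
    }

    @Override
    public void findAll(FindAllPlanetsRequest request, StreamObserver<Planet> responseObserver) {
        // Stream each planet back to the client as it is read, then complete the call.
        planetService.findAllPlanets().forEach(responseObserver::onNext);
        responseObserver.onCompleted();
    }
}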

I simply could not let go of Spring's dependency injection. Integrating it with gRPC is quite easy if this is all you're looking for: when creating your Netty server, you simply need to register your Spring bean instances as gRPC services. This is more of a quick hack, and integrating anything else such as spring-security or actuator won't work out of the box, because you cannot rely on the spring-boot-web(flux) features built around the Spring embedded server and its autoconfigurations. A very basic setup of spring-boot together with Netty and gRPC can be seen below.

Example of using Spring within the gRPC Netty server
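A minimal sketch of that wiring, assuming the two service beans extend the generated ImplBase classes; the port number and class names are arbitrary choices for illustration:

import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class SwapiGrpcApplication {

    public static void main(String[] args) {
        SpringApplication.run(SwapiGrpcApplication.class, args);
    }

    @Bean
    CommandLineRunner grpcServer(SwapiPlanetGrpcApi planetApi, SwapiCharacterGrpcApi characterApi) {
        return args -> {
            // Register the Spring-managed service beans with the Netty-based gRPC server.
            Server server = NettyServerBuilder.forPort(6565)
                    .addService(planetApi)
                    .addService(characterApi)
                    .build()
                    .start();
            // Quick hack: block here so the JVM keeps serving gRPC requests.
            server.awaitTermination();
        };
    }
}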

If you're seeking additional features from the Spring suite, others have ventured to create spring-boot-starter Maven modules for gRPC, which are worth looking into.

Coming back to our SWAPI example, a very similar setup is done on the client side. Importing the compiled proto definitions into your project lets you create one of the following three types of stubs for interacting with the gRPC service:

  • Asynchronous (ServiceNameStub)
  • Blocking call (ServiceNameBlockingStub)
  • Future (ServiceNameFutureStub)

Note that a blocking stub doesn't block until the whole request is serviced and all data is retrieved; it relies on data being streamed from the server to the client. I've added an example of some calls being made to our SWAPI gRPC service below.
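The following is a minimal client sketch along those lines, assuming hypothetical generated SwapiCharacterServiceGrpc stubs and a local server listening on port 6565 without TLS:

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class SwapiGrpcClient {

    public static void main(String[] args) {
        ManagedChannel channel = ManagedChannelBuilder.forAddress("localhost", 6565)
                .usePlaintext()   // no TLS, for local experimentation only
                .build();

        // Blocking stub: the returned Iterator is fed as the server streams results back.
        SwapiCharacterServiceGrpc.SwapiCharacterServiceBlockingStub blockingStub =
                SwapiCharacterServiceGrpc.newBlockingStub(channel);
        blockingStub.findAll(FindAllCharactersRequest.getDefaultInstance())
                .forEachRemaining(character -> System.out.println(character.getName()));

        // The asynchronous and future-based stubs are created in the same way:
        // SwapiCharacterServiceGrpc.newStub(channel) and SwapiCharacterServiceGrpc.newFutureStub(channel).
        channel.shutdown();
    }
}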

Google Protobuf compression in practice

Reading about Google protobuf objects offering better performance thanks to their compact serialization is reassuring that performance would improve. I was surprised to see by how much, though.

I ran a test scenario with our SWAPI gRPC API, and the results from the application server logs are the following:

2020-02-15 16:00:39.027  INFO 19008 --- [ault-executor-0] o.c.medium.planet.SwapiPlanetGrpcApi     : Received request to find all planets.
2020-02-15 16:00:39.308  INFO 19008 --- [ault-executor-0] o.c.medium.planet.SwapiPlanetService     : GPB 3778 vs JSON 11243 response size.
2020-02-15 16:00:39.308  INFO 19008 --- [ault-executor-0] o.c.medium.planet.SwapiPlanetGrpcApi     : Successfully completed request to find all planets.
2020-02-15 16:00:39.433  INFO 19008 --- [ault-executor-0] o.c.m.character.SwapiCharacterGrpcApi    : Receiver request to find all characters.
2020-02-15 16:00:39.511  INFO 19008 --- [ault-executor-0] o.c.m.character.SwapiCharacterService    : GPB 5930 vs JSON 15592 response size.

That is a remarkable difference! The two responses are roughly 66% and 62% smaller than their JSON alternatives. The data sets we're using are tiny, but when transferring gigabytes of data over the network the impact is huge. This made me curious to run a comparison with Apache Avro as well, but we won't talk about that in this post.

To explain a little of what is happening in the test and to better understand the results (a code sketch follows these lists), I am first computing the total JSON size of the response by:

  • serializing to JSON the collection of Java objects that I'm reading from the SWAPI dataset
  • converting the result to a byte array and counting the total number of bytes

Afterwards, similarly for the GPBs, I am:

  • taking the same fields and data and converting them to GPB objects
  • serializing each GPB to a byte array and counting the total number of bytes
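Roughly, that measurement could look like the following sketch, assuming Jackson on the JSON side and the generated protobuf classes on the GPB side (the actual project may differ in the details):

import java.util.List;
import com.fasterxml.jackson.databind.ObjectMapper;

static void logResponseSizes(List<PlanetDto> planetDtos, List<Planet> planetProtos) throws Exception {
    // JSON side: serialize the whole collection of Java objects read from the CSV and count the bytes.
    long jsonSize = new ObjectMapper().writeValueAsBytes(planetDtos).length;

    // GPB side: serialize each protobuf message to its byte array and sum the sizes.
    long gpbSize = planetProtos.stream()
            .mapToLong(planet -> planet.toByteArray().length)
            .sum();

    System.out.printf("GPB %d vs JSON %d response size.%n", gpbSize, jsonSize);
}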

Challenges Integrating Spring Security

On its own, gRPC does one thing and does it well: it serves large amounts of data through a relatively easy-to-maintain, backwards-compatible API.

The real challenges appear if you want more than that. For example, maybe you want OAuth or WebSSO, or, even better, Spring Security support for authentication/authorization. This is a very painful exercise; it can be achieved, but not easily.

You can get started by reading this article; the basic idea is quite good. In order to hook Spring Security into grpc-netty, you need to create the equivalent of Spring's HTTP filters as gRPC ServerInterceptors, which would allow you to add session information and populate the Spring SecurityContextHolder.
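As a rough sketch of that idea (not the referenced article's code, and certainly not production ready), such an interceptor could look like the following; the header name, the token handling, and the context key are all assumptions:

import java.util.Collections;

import io.grpc.Context;
import io.grpc.Contexts;
import io.grpc.Metadata;
import io.grpc.ServerCall;
import io.grpc.ServerCallHandler;
import io.grpc.ServerInterceptor;
import io.grpc.Status;
import org.springframework.security.authentication.UsernamePasswordAuthenticationToken;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContextHolder;

public class AuthServerInterceptor implements ServerInterceptor {

    static final Metadata.Key<String> AUTHORIZATION =
            Metadata.Key.of("authorization", Metadata.ASCII_STRING_MARSHALLER);
    static final Context.Key<Authentication> AUTH_CONTEXT_KEY = Context.key("authentication");

    @Override
    public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(
            ServerCall<ReqT, RespT> call, Metadata headers, ServerCallHandler<ReqT, RespT> next) {

        String token = headers.get(AUTHORIZATION);
        if (token == null) {
            call.close(Status.UNAUTHENTICATED.withDescription("Missing credentials"), new Metadata());
            return new ServerCall.Listener<ReqT>() { };
        }

        // Validate the token and build an Authentication however your SSO/OAuth setup requires
        // (omitted here); this sketch simply wraps the raw token.
        Authentication authentication =
                new UsernamePasswordAuthenticationToken(token, null, Collections.emptyList());
        SecurityContextHolder.getContext().setAuthentication(authentication);

        // Also store the principal in a gRPC Context, so it survives thread hops (see below).
        Context context = Context.current().withValue(AUTH_CONTEXT_KEY, authentication);
        return Contexts.interceptCall(context, call, headers, next);
    }
}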

After fixing dependency issues where Spring Security requires classes from spring-web, and disabling the embedded Tomcat, you might start thinking that you've made it. It's all finally starting to look good. But you haven't… Netty handles requests asynchronously, meaning that the thread that receives the request is not necessarily the one which will produce the response (so that it does not block). This issue becomes obvious when querying your API with heavy concurrent requests: the asynchronicity completely breaks your Spring security context, which relies on a thread-local, and you basically have no reliable security.

This is fixable by creating a gRPC Context for each incoming request. When you are processing the response, you need to make sure you are bound to your original gRPC Context. Because the context travels with the call, the values you stored in it (such as the authenticated principal) are available on whichever thread ends up doing the work, so you can re-populate your security context holder with the request's authenticated principal and make the thread-local usable again.
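Concretely, the service method (or a downstream interceptor) can then read the principal back out of the gRPC Context and re-bind it to whichever thread is doing the work; this sketch reuses the hypothetical AUTH_CONTEXT_KEY and message types from the snippets above:

@Override
public void findAll(FindAllCharactersRequest request, StreamObserver<SwapiCharacter> responseObserver) {
    // The gRPC Context travels with the call, regardless of which thread runs this code.
    Authentication authentication = AuthServerInterceptor.AUTH_CONTEXT_KEY.get();

    // Re-populate the thread-local security context before doing any authorization checks.
    SecurityContextHolder.getContext().setAuthentication(authentication);

    // ... authorization checks and the actual data streaming go here ...
    responseObserver.onCompleted();
}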

Some shortcomings

In my opinion, one of the biggest shortcomings of gRPC is that it does not scale horizontally in a very modern way. Looking at the gRPC website, they recommend a number of options, but predominantly you're stuck with either using a proxy load balancer such as Nginx or essentially relying on DNS load balancing. If you decide to go the traditional route using a proxy, be careful to choose one which supports the HTTP/2 protocol; otherwise you're looking at a performance drop.

From official documentation https://github.com/grpc/grpc/blob/master/doc/load-balancing.md

What I experimented with was client-side load balancing. I couldn't fully integrate with the Spring Cloud Netflix stack, because gRPC is not designed for that. A description of the gRPC load-balancing vision can be found here.

Again, the lessons learned here came through trial and error and through debugging inside the grpc-java and Netty modules. I've tried to use Eureka, similarly to the following article. And again, yes, you're tempted to think you've made it: you just create a NameResolver based on Eureka service discovery on the client side and that's it. But there's a catch, and it's called "usePlaintextOnly", which is something you would never do in a production environment.

To expand a bit on the service-discovery (e.g. Eureka, Zookeeper) approach to gRPC name resolvers: these registries are NOT really name resolvers. gRPC in reality expects a DNS-style name resolver to provide the available hosts that can service your requests within a managed channel. It creates a sub-channel for each of those machines, but the API is a bit inflexible (a missed opportunity really): whenever it spawns a sub-channel, it calls the NameResolver's "getServiceAuthority()" method, and that's a major constraint from a certificate perspective.

You are forced to have one single authority, which will be matched against the server's certificate subject alternative names during the handshake. This means all the gRPC servers behind a given serviceId need to be registered either with a wildcard certificate, or with certificates that include each other in their alternative names.
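To illustrate the constraint, here is a rough sketch of a discovery-backed NameResolver, with the registry lookup hidden behind a Supplier so that no particular Eureka or Zookeeper client API is assumed; note the single getServiceAuthority() value:

import java.net.InetSocketAddress;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

import io.grpc.EquivalentAddressGroup;
import io.grpc.NameResolver;

public class DiscoveryNameResolver extends NameResolver {

    private final String serviceId;
    private final Supplier<List<InetSocketAddress>> discovery;   // e.g. backed by a Eureka lookup

    public DiscoveryNameResolver(String serviceId, Supplier<List<InetSocketAddress>> discovery) {
        this.serviceId = serviceId;
        this.discovery = discovery;
    }

    @Override
    public String getServiceAuthority() {
        // One authority for the whole channel: every backend's certificate has to match it.
        return serviceId;
    }

    @Override
    public void start(Listener2 listener) {
        List<EquivalentAddressGroup> servers = discovery.get().stream()
                .map(EquivalentAddressGroup::new)
                .collect(Collectors.toList());
        listener.onResult(ResolutionResult.newBuilder().setAddresses(servers).build());
    }

    @Override
    public void shutdown() {
        // nothing to clean up in this sketch
    }
}

A resolver like this still has to be plugged into the channel through a NameResolverProvider, and whatever getServiceAuthority() returns is the one authority that every backend certificate has to satisfy.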

An alternative API design here would have been to let the NameResolver provide the service authority for a specific InetSocketAddress, or to have mutual TLS authentication between the service-discovery server and the gRPC client and trust the addresses it provides.

The reality today

The reality today is that things have evolved since 2015. In September 2017, Spring started supporting reactive programming and introduced spring-webflux, a non-blocking web stack based on reactive streams that runs on Netty by default and supports HTTP/2 (so you can stream responses asynchronously).

At first sight, the only thing holding back spring-webflux is that great ~66% size reduction that JSON simply doesn't offer. You might even be tempted to write a Spring HTTP message converter for protobuf, only to find that it has already existed as part of the spring-web module since 2014, under the "application/x-protobuf" MIME type.
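For Spring MVC, using that converter is a matter of registering it as a bean; WebFlux goes the codec route instead and, in recent Spring versions, picks up its protobuf encoder/decoder automatically when the protobuf classes are on the classpath. A minimal sketch:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.converter.protobuf.ProtobufHttpMessageConverter;

@Configuration
public class ProtobufConfig {

    // Lets Spring MVC read and write "application/x-protobuf" request/response bodies.
    @Bean
    public ProtobufHttpMessageConverter protobufHttpMessageConverter() {
        return new ProtobufHttpMessageConverter();
    }
}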

I am still in the process of investigating some performance benchmarks between gRPC and spring-webflux using protobufs. I don’t expect any significant performance differences, but let’s see in a future article.

I haven't tried it myself, but I am guessing Spring Cloud Netflix support is much easier to integrate with Spring WebFlux (if it's not provided out of the box), with articles showing it's as simple as adding a few dependencies and annotations.

However, as a personal preference, I am still a fan of gRPC. I really like its client/server communication API. Using proto definitions that get compiled into template servers and clients is much more type-safe than deserializing into data types that are not enforced by the API. It simply feels safer to use the gRPC stubs than the reactive WebClient or RestTemplate.

If you want to play with gRPC on your own and form your own opinion and experience, please feel free to use the template project I've created on GitHub for this article as a starting point.

I hope you’ve found this article useful, and please feel free to share a comment below!


Cristian Toader

I am an Engineering Manager at LSEG. I am passionate about technology as well as leadership, psychology, and coaching.