Structuring Golang for AWS Lambda and API Gateway: Modular Monolith
July 14, 2022
(updated July 24, 2022)
TLDR: In this article, we go over how we can build a bunch of serverless functions from a unified modular monolithic codebase in Go / Golang, as an alternative to having one project per Lambda Function per endpoint. This approach is suitable for moderately large projects.
The basic idea is to write a monolith and then package different entry points () as separate deployments. This is an alternative to serverless hell, where you have a billion different functions with copypasta all over the place.
Monolith, microservices and serverless
If you are reading this blog, this can only mean one of two things:
- you are my mom
- you have 5,000+ IQ
I'll assume your IQ is above 5,000 and you already have an Einstein-level understanding of microservices and monoliths.
Let's roughly define these two words still because I will be using them in a loose sense here:
- a monolith is a project that does everything. A monolith would be responsible for the entire domain of the application and is deployed as a discrete piece of software.
- microservices are smaller services that are responsible for smaller parts of the domain. For example, you could have a service responsible for managing Orders, another one for managing Users and so on. They may interact with each other via HTTP, gRPC, some sort of async queue or whatever else but should be fully decoupled and only interact via some public API.
Serverless stacks usually call for a microservice-like architecture. Some people deploy larger apps like Express.js or some other framework as a single Lambda Function, but that is not so common. Having giant Lambda functions with a lot of bootstrapping and such may also have performance implications (bad ones). Some people write serverless functions that handle a bunch of different messages and then they write a homemade routing sort of layer in the Lambda handler to parse the event being received and then branch based on that. The latter can get quite unwieldy.
The most common approach is to deploy smaller functions that perform a single task. In the case of an API, that would mean a 1:1 mapping between API endpoint and function. This is also what the serverless cult advocates (try telling them you are putting Express.js into a single Lambda and see what happens).
If you are on AWS, writing a serverless HTTP API would involve setting up Amazon API Gateway and defining routes that proxy incoming HTTP requests on to different Lambda functions. You would have one route for each Lambda function.
If you are doing it in the AWS Console and not with some sort of Infrastructure as Code (IaC) framework, then you will be in a world of pain when the time comes to replicate your stack across multiple environments, as the configuration of those serverless services can be quite the jungle of knobs and buttons to click and tweak.
As the ancient saying goes:
The AWS Console is for n00bz. Infrastructure as Code is for hackermanz and gigachads alike.
Serverless Hell: spaghetti explosion
In any API, some endpoints are bound to share a bunch of underlying logic. If you have some CRUD endpoints for example, chances are there is going to be a lot of code reuse you can do between the endpoint and the endpoint for example.
Think things like defining types, defining models if you are using an ORM, bootstrapping code, connection pooling, error handling wrappers, logger, etc. If you have an endpoint for creating a for example, you would probably like to re-use some form of validation logic when updating a .
If you are jumping head first into the fine Amazon API Gateway and AWS Lambda cuisine without putting much thought into what you are cooking up, chances are you are going to end up with a bunch of disjointed Lambda functions that only share code through copypasta from hell.
If you are not doing it spaghetti copypasta style, you might end up writing the same code over and over again, forgetting that it is already implemented in some other Lambda Function you forgot existed. You might even end up running 50 different versions of the same dependencies from one Lambda Function to the next.
In other words, your little serverless adventure might leave you with a giant pile of dodgy code scattered across a bunch of independent tiny projects without clear indications of what is shared copypasta code and what is Function-specific code.
Here's what your directory structure might look like but more likely in and with more unhinged function names:
Now imagine this. You have been at it for a year. You are out of control. You create anywhere from 10 to 5,000,000 new Lambda Functions a day. You are too far gone, no one can stop you anymore. You are spinning up 'em SQS Queues all over the place, EventBridge into SNS into SQS into Kinesis out to S3 and back.
Your Lambda function names are becoming more and more obscure. In fact, you have just made one named . At this point, you have exhausted almost every possible sequence of ASCII characters and now need to brute force through to find new names.
It has become impossible to locate which functions use what code. You are now lost in a forest of your own making. Chances are that if a tree were to fall in this forest of yours, you would not be hearing a sound because you never got around to setting up unified monitoring.
Good job! 👏👏👏
Monolith(-esque) structure to the rescue
The best solution at this point would be for you to quit your job, change your name and rebuild your life somewhere else where no one knows you and more importantly, where no one knows what you did (which we described in the previous section).
If you can't do that though, what you can do is structure your forest of Lambda functions into more of a monolithic codebase.
While monoliths have their shortcomings, the limitations mostly relate to scaling, both in terms of handling contributions from a bunch of different people at the same time and in terms of infrastructure scaling - a monolith is typically hard to scale horizontally.
Let's recap and see what we demand from life and our project:
- We want for it to be easier to make changes that span multiple (related) endpoints. For example, if I want to add some metadata in every response, I do not want to have to go through every function and copy-paste chunks of code all over the place. Or if the password hashing / salting and whatever encryption changes, I do not want to redefine that logic in the function and in the function.
- We want for it to be easier to step through the code. For example, using code editor features such as or . Those features can change your life and help people onboard more easily.
- We want for every endpoint to be deployed as a separate piece of infrastructure / compute, such that every endpoint gets its own monitoring goodies (e.g. CloudWatch Logs group, metrics, and so on) and can scale independently from the other ones. In our case here, we want every endpoint to be deployed to its own AWS Lambda function.
- We want to have consistent dependency versions across all functions. This one I guess might be a bit controversial, because this introduces some unnecessary coupling between the different packages. But in most scenarios, this would not be a problem.
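To make the first wish on that list a bit more concrete, here is a minimal sketch of a shared response helper. Everything in it is an assumption for illustration: the `Response` struct is a simplified stand-in for the API Gateway proxy response type, and the `JSON` helper, the `envelope` struct and the `apiVersion` metadata key are hypothetical names. The point is only that when every handler builds its response through one shared function, adding metadata to every response is a one-line change.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Response is a simplified, hypothetical stand-in for the API Gateway
// proxy response type a real handler would return.
type Response struct {
	StatusCode int
	Headers    map[string]string
	Body       string
}

// envelope wraps every payload with metadata shared by all endpoints.
type envelope struct {
	Data interface{}       `json:"data"`
	Meta map[string]string `json:"meta"`
}

// JSON builds a response with the shared envelope. Every handler would call
// this helper, so changing the metadata here changes it for every endpoint.
func JSON(status int, data interface{}) (Response, error) {
	body, err := json.Marshal(envelope{
		Data: data,
		Meta: map[string]string{"apiVersion": "v1"}, // hypothetical metadata
	})
	if err != nil {
		return Response{}, fmt.Errorf("encode response: %w", err)
	}
	return Response{
		StatusCode: status,
		Headers:    map[string]string{"Content-Type": "application/json"},
		Body:       string(body),
	}, nil
}

func main() {
	res, err := JSON(200, map[string]string{"id": "42"})
	if err != nil {
		panic(err)
	}
	fmt.Println(res.StatusCode, res.Body)
}
```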
The idea that immediately comes to mind is to build some sort of library / SDK / package that implements all of the underlying logic and then import it in every function. However, doing this well adds significant complexity in terms of workflows and such.
We can think of it in reverse. We define a monolith-esque project that contains everything we need. From database code to everything else. Then we simply define different entry points as different packages under and we build each entry point as its own deployment package which we then deploy as independent AWS Lambda Functions using AWS SAM.
This is the idea we are going to be working with here. Let's first define our serverless infrastructure a little bit just so we have something to work with. I'll be using AWS SAM in this article. We will be building up our AWS SAM template as we go along, adding endpoints to the Amazon API Gateway and all that.
AWS SAM is simple enough that you can get the gist of it even if you are not familiar with it. I would not call AWS SAM particularly wonderful to work with, but it is pretty good to define serverless APIs.
Defining the API Gateway
Let's start by defining a basic API in there. We won't be defining routes for now. We will do that later once we have covered the directory structure and what the code should look like.
Alright we are done with this AWS SAM template API Gateway skeleton for now. We will come back to it to add the API Gateway endpoints and the Lambda Functions.
Structuring our project
One thing many find frustrating when starting out with Go is the number of times people in the Go communities say something along the following lines when asked about project structure best practices:
Structure your project however you want, there is no right or wrong in Go for anything goes. MVC bad. Layered bad. and directories good. and directories bad, actually. Everything in one giant file good.
In fact, I even listened to an entire podcast where 3 grown adults were all agreeing with one another that design patterns are for boomers and that all the Go code in the world should be merged into a single file. The podcast was one hour long. I listened to it in its entirety.
Anyway, you are on my blog here (I have an article about how I built it).
You won't be getting any of that wise guy wishy-washy attitude over here. In fact, I'll even go full project structure bigot on you if I have to.
So here's what our bigoted project structure looks like:
If your face is turning red at the idea of using , I empathize (not really). The controversial choice of using maybe warrants a quick note.
I like using just as a way to separate the content of the app from all the surrounding files. Things like Makefiles, GitHub Actions configurations, , and all such tooling goodies are better kept outside of the application logic, in my view.
Feel free to use some other directory structure - if you want to lift all the packages from out into the root directory, nothing is stopping you, except for perhaps wisdom, righteousness and the 5,000+ IQ points we mentioned at the beginning of this article.
The directory structure
Let's go over the main directories.
The directory is pretty standard in Go projects. Typically, that directory contains your package and function.
The code you have in there usually does bootstrapping sort of stuff like parsing command-line arguments and such and then passing that config on to the actual application code.
Many projects have different packages in there - for example, you could have a CLI in there.
In our case, we are creating a directory in there and we are creating a new directory for each function.
For example would be a function in our example.
Here's what your code for could look like:
This is your Lambda Function handler. It contains almost no logic besides just parsing the event and extracting parameters and handling the top-level error flow.
It extracts the parameters from the event (URL path parameter passed on by API Gateway in this case), creates a service, calls the service and passes it some arguments and returns a response to the user.
It is somewhat akin to a in something like the MVC structure (which some of you might think is something for boomers).
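As a rough sketch of the handler shape described above, and not necessarily the exact code this project uses: the `Request` and `Response` structs below are simplified stand-ins for the API Gateway proxy event types from `aws-lambda-go`, and `UserService`, `GetUser`, `ErrNotFound` and `newHandler` are hypothetical names chosen for illustration. The service is an in-memory fake so the example runs on its own.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Request and Response are simplified stand-ins for the API Gateway proxy
// event types; a real handler would use the ones from aws-lambda-go.
type Request struct {
	PathParameters map[string]string
}

type Response struct {
	StatusCode int
	Body       string
}

// ErrNotFound is a hypothetical error returned by the service layer.
var ErrNotFound = errors.New("user not found")

// UserService is a hypothetical service that would live in the shared
// application code. Here it is backed by a map instead of a database.
type UserService struct {
	users map[string]string // user ID -> name
}

// GetUser looks up a user by ID.
func (s *UserService) GetUser(ctx context.Context, id string) (string, error) {
	name, ok := s.users[id]
	if !ok {
		return "", ErrNotFound
	}
	return name, nil
}

// newHandler wires the service into a thin handler: extract parameters,
// call the service, map errors to status codes, and nothing else.
func newHandler(svc *UserService) func(context.Context, Request) (Response, error) {
	return func(ctx context.Context, req Request) (Response, error) {
		id := req.PathParameters["id"]
		if id == "" {
			return Response{StatusCode: 400, Body: "missing id"}, nil
		}
		name, err := svc.GetUser(ctx, id)
		switch {
		case errors.Is(err, ErrNotFound):
			return Response{StatusCode: 404, Body: "user not found"}, nil
		case err != nil:
			return Response{StatusCode: 500, Body: "internal error"}, nil
		}
		return Response{StatusCode: 200, Body: name}, nil
	}
}

func main() {
	handler := newHandler(&UserService{users: map[string]string{"42": "Ada"}})
	// In a deployed function, main would pass the handler to the Lambda
	// runtime (lambda.Start in aws-lambda-go) instead of calling it directly.
	res, err := handler(context.Background(), Request{
		PathParameters: map[string]string{"id": "42"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(res.StatusCode, res.Body)
}
```

Building the handler as a closure over the service keeps the wiring in one place, which also makes it easy to call the handler from a test with a fake service.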
In this file alone, we reuse a bunch of internal pieces of code from different packages we wrote under :
We reuse the database package to inject it into our service. In that package we could have connection pooling logic (though we probably want to do that outside of the Lambdas altogether, but you get the idea). If we were using sharded databases, the logic for that routing could also be in there.
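One possible shape for that shared database package, as a minimal sketch under stated assumptions: `Conn` is a hypothetical stand-in for a real handle such as `*sql.DB`, and `dial` and `Get` are made-up names. The idea is that package-level state survives between warm invocations of the same Lambda execution environment, so connecting lazily with `sync.Once` means each environment connects once rather than on every request.

```go
package main

import (
	"fmt"
	"sync"
)

// Conn is a hypothetical stand-in for a real database handle.
type Conn struct {
	DSN string
}

var (
	once      sync.Once
	conn      *Conn
	connErr   error
	dialCount int // counts how many times we actually connected
)

// dial pretends to open a connection. A real implementation would open
// and verify a database connection here.
func dial(dsn string) (*Conn, error) {
	dialCount++
	return &Conn{DSN: dsn}, nil
}

// Get returns the shared connection, dialing only on the first call.
// Later calls reuse the same handle (and the same error, if dialing failed).
func Get(dsn string) (*Conn, error) {
	once.Do(func() {
		conn, connErr = dial(dsn)
	})
	return conn, connErr
}

func main() {
	a, err := Get("postgres://example")
	if err != nil {
		panic(err)
	}
	b, err := Get("postgres://example")
	if err != nil {
		panic(err)
	}
	fmt.Println(a == b, dialCount)
}
```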
All the logic happens in a call to some method on the :
The code for that service is under and has access to all the other monolith-esque bits and pieces of code.
If we had another API endpoint to a user, we could just duplicate this Lambda and instead of that call, we would call a different method on the service, like:
Of course you have to actually implement your packages in . We are not covering this here because that would make the article too long. But we can roughly go over what that might look like at a high-level.
A common way of structuring an app (in our case that would be everything we put into ) is to layer the logic. We write some sort of interface that implements business logic and makes calls to a that implements some variant of the Repository Pattern to interact with the data store (database of any sort).
The Repository Pattern abstracts away the database interfacing logic. The basic schtick is you make some package or struct called (or , or whatever you want) which exposes functions such as , , , and so on.
Then in your service layer, you implement actual business logic, and perhaps things like transaction logic that span multiple store operations. Your service would expose higher-level methods like , and then in there, you would call methods on the repository (ideally via an to make it easier to test), like , or whatever else.
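Here is a minimal sketch of that layering. All of the names (`User`, `UserStore`, `memoryStore`, `UserService`, `Register`) are hypothetical and only there for illustration, and the store is an in-memory map rather than a real database. The service depends on the store through an interface, which is what lets a test swap in a fake.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// User is a hypothetical domain type.
type User struct {
	ID    string
	Email string
}

// UserStore is the repository interface: it hides how users are persisted.
type UserStore interface {
	GetByEmail(email string) (User, bool)
	Insert(u User) error
}

// memoryStore is an in-memory UserStore, standing in for a real database.
type memoryStore struct {
	byEmail map[string]User
}

func (m *memoryStore) GetByEmail(email string) (User, bool) {
	u, ok := m.byEmail[email]
	return u, ok
}

func (m *memoryStore) Insert(u User) error {
	m.byEmail[u.Email] = u
	return nil
}

var (
	ErrInvalidEmail = errors.New("invalid email")
	ErrEmailTaken   = errors.New("email already registered")
)

// UserService holds the business logic and talks to the store only
// through the UserStore interface.
type UserService struct {
	store UserStore
}

// Register normalizes and validates the email, rejects duplicates, and
// then persists the user through the repository.
func (s *UserService) Register(id, email string) (User, error) {
	email = strings.ToLower(strings.TrimSpace(email))
	if !strings.Contains(email, "@") {
		return User{}, ErrInvalidEmail
	}
	if _, exists := s.store.GetByEmail(email); exists {
		return User{}, ErrEmailTaken
	}
	u := User{ID: id, Email: email}
	if err := s.store.Insert(u); err != nil {
		return User{}, fmt.Errorf("insert user: %w", err)
	}
	return u, nil
}

func main() {
	svc := &UserService{store: &memoryStore{byEmail: map[string]User{}}}
	u, err := svc.Register("1", " Ada@Example.com ")
	fmt.Println(u.Email, err)
	_, err = svc.Register("2", "ada@example.com")
	fmt.Println(err)
}
```

A Lambda handler would only ever call `Register`; whether the store is a map, Postgres or DynamoDB is invisible to it.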
Or then again, of course, you could also just put everything into a single file and tell everyone you know that files that are 100M LoC long are the best to work with, like the people from the podcast I mentioned earlier.
Adding API Gateway endpoints in AWS SAM
Let's say we have the following 3 Lambda Functions in our directory:
We would need to update our AWS SAM template ( file from the previous section) like so:
Deployment can sometimes be a bit of a headache to set up. I would recommend to run to spin up all the related stuff - S3 bucket for deployment assets, CI user, IAM policies for it and so on. Just running that SAM CLI command will guide you through the process.
The Lambda Function resources for your project might also have additional attributes - e.g. for , Environment variables, CORS, IAM policies. So your final AWS SAM template will look a bit different, and your (covered below) will be different too so that will require a bit of tweaking on your end.
Regardless, here's the basic gist of what your might look like:
You can re-use some of these commands in whichever CI/CD platform you are using. For example, setting up GitHub Actions for AWS SAM is pretty straightforward, it is essentially just about running these few commands you have in the and passing in some secrets and you are good to go.
We have set up a Go project that is roughly structured as follows:
- directory contains AWS Lambda handlers which are all different entry points into the application. Each package in there does very little, pulling parameters out of the Lambda Event the function receives and then calling the actual business logic stuff in .
- contains all of our business logic and is accessible to all of our different Lambda Function handlers located in . We named that directory / package but could have named it something else and we could have put its content as top-level directories instead.
We have set up an AWS SAM template () file that defines our serverless stack, which consists of:
- Amazon API Gateway - each endpoint routes to a Lambda Function (one of the packages in the directory)
- Lambda Functions
We went over a minimal that contains the main AWS SAM CLI commands to build, package and deploy the application stack.
We now have a monolith-esque structure for our Go project with Lambda Functions as API controller-like handlers. We can step through the code using our code editor features because everything is connected. But each function is deployed as its own AWS Lambda.
There you go. That was it. There are more things I wanted to mention like setting up GitHub Actions for CI/CD, or how to bridge the gap between say Terraform and that serverless configuration. Maybe in the next articles.
I'd be interested to hear how you all manage large Amazon API Gateway projects backed by Lambda Functions in Go.