Let Your AWS Lambdas Survive Thousands of Connections
Aurora Serverless v2 might not be up to the task
AWS Lambda is pretty awesome. Serverless “Functions as a Service” where you can deploy code without worrying about servers, scaling, or infra.
Well, that’s the dream. The reality can be a bit closer to earth, especially when Lambda bumps up against other services, such as databases. This isn’t Lambda’s fault (Lambda really is awesome); it is a consequence of how AWS services work (or don’t) together.
So, what’s the answer? Pooled connections are the way to go in this scenario, but AWS seems to be struggling even here. Let’s look into why.
When AWS Lambda and Aurora Serverless v2 Don’t Mix
At its core, the issue stems from how Lambda functions handle database connections. Each Lambda instance typically creates its own database connection, and within the AWS ecosystem that connection often goes to Aurora Serverless v2. The problem is that Lambda’s autoscaling can spawn hundreds or thousands of concurrent instances within seconds. This fundamental impedance mismatch between Lambda’s scaling model and Aurora’s connection management creates a perfect storm for production incidents.
Consider this typical Node.js Lambda code:
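As a minimal sketch of the pattern being described, assuming the node-postgres (`pg`) `Client` class and a `DATABASE_URL` environment variable: the `createHandler` wrapper is a hypothetical helper introduced here so the sketch carries no hard dependency, and the query is a placeholder.

```javascript
// Sketch of the "one connection per Lambda instance" pattern.
// `Client` is expected to be the Client class from node-postgres ("pg");
// it is passed in by the caller (an assumption made for this sketch).
function createHandler(Client, connectionString = process.env.DATABASE_URL) {
  let client = null; // lives for as long as this Lambda instance stays warm

  return async function handler(event) {
    if (client === null) {
      // First invocation on this instance (cold start): open a dedicated
      // connection to the database and keep it for later invocations.
      const fresh = new Client({ connectionString });
      await fresh.connect();
      client = fresh;
    }
    // The connection is never closed, so it lingers while the instance idles.
    const { rows } = await client.query('SELECT now() AS server_time');
    return { statusCode: 200, body: JSON.stringify(rows[0]) };
  };
}

// In a real function module (assumes `npm install pg`):
//   const { Client } = require('pg');
//   exports.handler = createHandler(Client);
```

Every Lambda instance that runs this handler holds exactly one database connection, so the number of open connections tracks the number of warm instances rather than the number of queries in flight.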
This code appears reasonable, and you’ve probably written something similar dozens of times. But in a serverless context, it can lead to severe issues. For example:
- Connection pool exhaustion. In Amazon Aurora Postgres, the maximum number of database connections is determined by the max_connections parameter. In Aurora Serverless v2, the default maximum connections for an instance with 1 Aurora Capacity Unit (ACU) is 90; for an instance with 64 ACUs, it is 5,000. With each Lambda instance maintaining its own connection, a sudden traffic spike can exhaust this pool within seconds. It’s also important to note that each active connection consumes memory, so manually setting a high max_connections value without enough ACUs allocated also leads to performance issues.
- Connection zombie state. When AWS Lambda functions establish database connections, each function instance typically creates its own connection. When Lambda instances go idle, their connections don’t immediately terminate. Instead, they linger in a “zombie” state, consuming resources without providing value. This is a common issue in serverless environments where functions scale rapidly.
- Cold start penalties. New Lambda instances must establish fresh database connections, adding latency (often 100-300ms) to cold starts in Aurora Serverless v2.
Is Amazon RDS Proxy the Solution? Often, It’s Not
To mitigate these issues, AWS recommends using Amazon RDS Proxy, a service that establishes a connection pool and reuses connections within this pool. The idea is to let RDS Proxy reduce the memory and CPU overhead associated with opening new database connections for each function invocation. RDS Proxy controls the number of database connections to help prevent oversubscription and manages connections that can’t be immediately served by queuing or throttling them.
But RDS Proxy has its limitations. It imposes hard limits on concurrent connections, which can lead to increased query latency and higher DatabaseConnectionsBorrowLatency metrics. RDS Proxy can also struggle with long-running transactions, as certain SQL operations can cause all subsequent statements in a session to be pinned to the same underlying database connection, reducing the efficiency of connection reuse. Setting up and managing RDS Proxy is far from straightforward, and it’s not free either.
A Real-World Example
“Neon worked out of the box, handling hundreds of Lambdas without any of the connection issues we saw in Aurora Serverless v2. On top of that, Neon costs us 1/6 of what we were paying with AWS” (Cody Jenkins, Head of Engineering at Invenco)
Invenco, an e-commerce logistics company, suffered these problems. Their architecture involved Lambda functions processing payment transactions against Aurora Serverless v2. In theory, AWS Lambda and Aurora Serverless v2 should be a match made in cloud heaven. But Aurora Serverless v2 struggled to handle the concurrent connections from their Lambda functions during traffic spikes, and adding RDS Proxy didn’t solve the issues.
Why? Let’s take a look at a typical Aurora Serverless v2 connection pattern during a traffic spike:
- t=0s: Normal traffic: 100 requests/sec, 20 active Lambda instances with 20 DB connections
- t=1s: Traffic spike begins: 2000 requests/sec hit the API Gateway
- t=1.2s: Lambda auto-scaling triggers, spinning up 200 new instances
- t=1.3s: 220 concurrent connection attempts to Aurora (20 existing + 200 new)
- t=1.4s: Aurora connection queue begins backing up
- t=1.5s: New connections start getting refused, only 100 total connections accepted
- t=1.6s: Application errors begin cascading, requests start failing
This pattern shows how a sudden 20x increase in incoming traffic creates a cascade effect. The traffic spike triggers Lambda’s autoscaling, and each new Lambda instance attempts to create its own database connection.
Aurora Serverless v2, despite being “serverless,” can’t scale its connection capacity as rapidly as Lambda scales compute, and the mismatch between Lambda’s scaling speed and Aurora’s connection capacity leads to failures.
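The arithmetic behind that timeline can be sketched in a few lines. The function name and parameter names below are made up for illustration; the numbers are the ones from the timeline above (20 existing connections, 200 new instances, 100 connections accepted).

```javascript
// Back-of-the-envelope model of the spike: every Lambda instance wants one
// connection, and the database accepts at most `maxConnections` of them.
function simulateSpike({ existingConnections, newInstances, maxConnections }) {
  const attempted = existingConnections + newInstances;
  const accepted = Math.min(attempted, maxConnections);
  return { attempted, accepted, refused: attempted - accepted };
}

const spike = simulateSpike({
  existingConnections: 20,
  newInstances: 200,
  maxConnections: 100,
});
console.log(spike); // { attempted: 220, accepted: 100, refused: 120 }
```

In this simplified model, more than half of the connection attempts are refused within half a second of the spike starting, and each refusal surfaces as a failed request unless the caller retries.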
The Solution: PgBouncer
Neon, a serverless Postgres service that can be an alternative to Aurora Serverless v2, takes a fundamentally different approach to the connection management problem by integrating PgBouncer directly into its architecture.
Rather than requiring a separate proxy service like RDS Proxy, Neon connection pooling is built into every Neon endpoint. Here’s how it works:
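From the client’s side, the difference comes down to a “-pooler” suffix in the connection string. The sketch below derives a pooled string from a direct one, assuming the suffix is appended to the first label of the host name (the endpoint ID); the helper name, host name, and credentials are placeholders, and in practice you would simply copy the pooled string from the Neon console.

```javascript
// Turn a direct connection string into its pooled counterpart by appending
// "-pooler" to the endpoint ID (the first label of the host name).
function toPooledConnectionString(connectionString) {
  const { hostname } = new URL(connectionString);
  const [endpointId, ...rest] = hostname.split('.');
  if (endpointId.endsWith('-pooler')) return connectionString; // already pooled

  const pooledHost = [`${endpointId}-pooler`, ...rest].join('.');
  // The host follows "@" when credentials are present, "//" otherwise.
  const marker = connectionString.includes(`@${hostname}`) ? '@' : '//';
  return connectionString.replace(`${marker}${hostname}`, `${marker}${pooledHost}`);
}

// Placeholder credentials and endpoint, for illustration only.
const direct =
  'postgresql://alex:secret@ep-example-123456.us-east-2.aws.neon.tech/neondb?sslmode=require';
console.log(toPooledConnectionString(direct));
// postgresql://alex:secret@ep-example-123456-pooler.us-east-2.aws.neon.tech/neondb?sslmode=require
```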
This seemingly minor change routes your connections through PgBouncer in transaction pooling mode, which fundamentally alters how connections are managed:
- Connection scaling. While a standard Postgres instance might support only 450 connections (with 1 CPU/4GB RAM), Neon’s pooler supports up to 10,000 concurrent connections.
- Resource management. Instead of each Lambda creating a persistent connection, the pooler maintains a shared pool of 64 database connections per user/database pair. These connections are recycled efficiently across your Lambda invocations.
- Queue instead of fail. New requests are queued rather than rejected when all connections are in use. This is particularly crucial for serverless architectures where traffic can spike unexpectedly.
Remember: connection pooling isn’t magic. Those 10,000 concurrent connections still share a limited pool of actual database connections. However, for serverless architectures with many concurrent but infrequent database operations, this pattern provides the scalability needed without the operational complexity of managing your own connection pooling infrastructure.
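To make the “queue instead of fail” idea concrete, here is a toy model of transaction pooling in which many callers share a small, fixed number of connections. It is an illustration only, not how PgBouncer is implemented; `TinyPool`, `demo`, and the sizes used (2 connections, 10 callers) are invented for the example.

```javascript
// Toy model of transaction pooling: callers borrow a connection for the
// duration of one transaction and wait in a queue when none is free.
class TinyPool {
  constructor(size) {
    this.size = size;
    this.free = size;    // connections currently available
    this.waiting = [];   // callers queued for a connection
    this.peakInUse = 0;  // highest number of connections in use at once
  }

  async acquire() {
    if (this.free > 0) {
      this.free -= 1;
    } else {
      // Queue instead of fail: wait until release() hands over a connection.
      await new Promise((resolve) => this.waiting.push(resolve));
    }
    this.peakInUse = Math.max(this.peakInUse, this.size - this.free);
  }

  release() {
    const next = this.waiting.shift();
    if (next) next();      // pass the connection directly to the next waiter
    else this.free += 1;   // nobody waiting: return it to the pool
  }

  async runTransaction(work) {
    await this.acquire();
    try {
      return await work();
    } finally {
      this.release();
    }
  }
}

async function demo() {
  const pool = new TinyPool(2);
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  const results = await Promise.all(
    Array.from({ length: 10 }, (_, i) =>
      pool.runTransaction(async () => {
        await sleep(5); // stand-in for a short query
        return i;
      })
    )
  );
  return { completed: results.length, peakInUse: pool.peakInUse };
}

demo().then((summary) => console.log(summary)); // logs { completed: 10, peakInUse: 2 }
```

All ten callers finish even though only two connections ever exist. The cost is waiting time, which is why the pattern suits short, infrequent queries better than long-running transactions.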
How to Use Neon Connection Pooling With AWS Lambda
Let’s build an app that might need to take advantage of this type of pooling. We will mimic a service like Invenco with fulfillment, sales, and inventory management endpoints.
We’ll start with Neon. We’ll create a new project, our schema, and some mock data:
What goes here isn’t super important; we just want to ensure we have some somewhat realistic API calls for our Lambda functions. At this point, we also want to grab our database URL. Importantly, we want the “Pooled connection” option:
The only difference from a user’s point of view is the “-pooler” addition to the connection string. But as we’ll see, this makes a big difference.
With our DB string, we’ll go ahead and set up our Lambda functions (here are the full AWS Lambda <> Neon setup details). First, we want to install Serverless. This framework will abstract away much of the AWS infrastructure configuration, handling everything from function deployment to API Gateway setup through declarative YAML files:
This will take you through the setup for a new Lambda function on AWS. When complete, navigate to the new directory created by the serverless step and install the node-postgres package, which you will use to connect to the database.
Now, we need to create our actual functions. Here are the functions we’re going to use:
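As a rough sketch of what such functions could look like, assuming node-postgres and a `DATABASE_URL` environment variable holding the pooled connection string: the `createQueryHandler` helper, the export names, and the table names in the usage comment are hypothetical, chosen only to mirror the fulfillment, sales, and inventory endpoints.

```javascript
// Builds a Lambda handler that connects through the pooled endpoint, runs a
// single query, and closes its client connection before returning.
// `Client` is the node-postgres Client class, injected by the caller.
function createQueryHandler(Client, sql) {
  return async function handler() {
    const client = new Client({ connectionString: process.env.DATABASE_URL });
    try {
      await client.connect();
    } catch (err) {
      // Could not reach the database at all.
      return { statusCode: 503, body: JSON.stringify({ error: err.message }) };
    }
    try {
      const { rows } = await client.query(sql);
      return { statusCode: 200, body: JSON.stringify(rows) };
    } catch (err) {
      return { statusCode: 500, body: JSON.stringify({ error: err.message }) };
    } finally {
      await client.end(); // always hand the connection back
    }
  };
}

// In handler.js (assumes `npm install pg`; table names are hypothetical):
//   const { Client } = require('pg');
//   module.exports.getFulfillment = createQueryHandler(Client, 'SELECT * FROM orders');
//   module.exports.getSales = createQueryHandler(Client, 'SELECT * FROM sales');
//   module.exports.getInventory = createQueryHandler(Client, 'SELECT * FROM inventory');
```

Opening and closing a client connection per invocation is a design choice that leans on the pooler: the client-side connection is short-lived, while the pooler keeps the actual database connections open and shared.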
This seems like a lot, but we want to show what happens when you call multiple Lambda functions with DB connections. Lastly, we need to update the autogenerated serverless.yml file to add these endpoints:
Notice that this is where we’re adding our Neon connection string. This serverless.yml file acts as the infrastructure-as-code definition, declaring how our Lambda functions should be configured, what triggers them (HTTP endpoints), and what environment variables they need access to. It’s the single source of truth for our serverless architecture.
Now, all we have to do is deploy this:
You will get back something like this:
This means all your functions are now deployed to those endpoints, and you’re ready to go. You have serverless Lambda functions and a pooled connection to your Neon database.
Testing Our Setup
Let’s test how this works. We won’t bore you with the details. If you want them, the code is in this repo:
https://github.com/argotdev/neon-lambda
We will run two load tests that simulate multiple users making concurrent requests to different endpoints. Test 1 will do this using a pooled connection to Neon, while test 2 will do this using a regular connection to Neon.
Here are the results for test 1:
Here are the results for test 2:
As we can see, the pooled connection outperformed the regular connection. The regular connection was still OK (and with good error catching, it would probably have worked properly), but it did not reach the 100% pass rate of the pooled connection.
We can see why in our Neon dashboard:
That first connection set is from test 1 with the pooled connection. Those 7,799 requests used just eight connections in total. The second spike is from test 2, which hit a max of 93 total connections. Now you can see why some might have failed and why pooled connections are vital when you have any load on serverless functions with Lambda. Imagine what happens when you have thousands of requests, like with Invenco!
There is a bonus here to using Neon. Here’s what happened when the connections started to rise during that second test:
You can see that once Neon detected more usage for both memory and compute, it autoscaled the available resources to match. This is why, with better error logic, we’d probably have seen closer to 100% success on the second test, even with more connections, as we’d have retried the API calls and found more resources available.
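The “better error logic” mentioned here could be as simple as retrying a failed call with exponential backoff. This is a minimal sketch, not the code used in the tests; `withRetry` and its option names are made up for illustration.

```javascript
// Retry an async operation with exponential backoff:
// waits baseDelayMs, then 2x, then 4x, ... between attempts.
async function withRetry(operation, { retries = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries) break; // out of attempts
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Usage sketch: wrap whatever call may fail while the database is scaling up.
//   const rows = await withRetry(() => runQuery('SELECT 1'), { retries: 4 });
// (`runQuery` is a placeholder for your own query function.)
```

With a few hundred milliseconds of backoff, a request that is refused at the start of a spike gets another chance after more resources have become available.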
Rethinking Database Scaling for Serverless Applications
We’ve seen here that connection management isn’t just a technical detail—it’s a fundamental architectural concern when building serverless applications. The traditional approach of “one Lambda, one connection” breaks down at scale, and even AWS’s solutions, like RDS Proxy, introduce complexity without fully solving the problem.
Neon’s approach stands out for two key reasons:
- It makes connection pooling a first-class citizen. Adding “-pooler” to your connection string is all it takes—no additional services, no complex configuration.
- The autoscaling capabilities work in concert with the connection pooling. When our non-pooled test hit limits, Neon responded by scaling up resources rather than just failing connections.
This matters for teams building serverless applications. It means you can focus on building features rather than wrestling with infrastructure. The future of serverless isn’t just about scaling compute—it’s about all parts of your stack working together when that scaling happens.
Neon has a Free Plan. Create an account here and try it yourself (no credit card required).