To explain a little bit of the use case…
I’m using Lambda to do post-processing, which means receiving data from different servers/services, packaging it to send to a third party, and dealing with possible errors. The result is sent asynchronously to the origin server as a callback. I don’t need to show the user anything. I just need this done by a certain time of day (“closing time”), which varies. But the sooner it is done the better, so that my client’s back office can deal with certain kinds of errors manually before closing time.
Until now I was only considering problems with sending the data to the third party. To deal with third-party timeouts and other temporary errors, I have Lambda set to 2 retries, plus a 10-minute retry as a guarantee.
I wasn’t considering the possibility of reaching the RDS connection limit.
The processing is done in multiple steps:
First, when the origin server sends the data, minor validation is done, and the request is marked as accepted and sent to another Lambda (the application job). It’s not a problem if I hit a connection limit error while receiving the origin server’s request, because the origin server is responsible for retrying it.
In the application job that packages and sends the data, if any temporary error happens, I mark the request for retry. Here is the problem: if I don’t have a database connection, I cannot even mark it for retry, which means my request marked as accepted will stay in limbo.
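To make the failure mode concrete, here is a minimal in-memory sketch of the two steps. It is not my real code and makes no AWS or RDS calls; every name in it (`FakeDatabase`, `accept_request`, `application_job`, `ConnectionLimitError`, and so on) is made up for illustration, and the `available` flag only simulates the connection limit being reached.

```python
# Minimal in-memory simulation of the two steps; no real AWS or RDS calls.
# All names here are hypothetical and exist only for this illustration.


class ConnectionLimitError(Exception):
    """Stands in for the database refusing a new connection."""


class TemporaryThirdPartyError(Exception):
    """Stands in for a timeout or other transient third-party failure."""


class FakeDatabase:
    def __init__(self):
        self.status = {}       # request_id -> "accepted" | "retry" | "done"
        self.available = True  # False simulates the connection limit being hit

    def set_status(self, request_id, status):
        if not self.available:
            raise ConnectionLimitError("too many connections")
        self.status[request_id] = status


def accept_request(db, request_id):
    # Step 1: if this raises, the origin server is responsible for retrying.
    db.set_status(request_id, "accepted")


def application_job(db, request_id, send):
    # Step 2: package and send; on a temporary error, mark the request for retry.
    try:
        send(request_id)
        db.set_status(request_id, "done")
    except TemporaryThirdPartyError:
        # This write also needs a connection, so it raises if none is available.
        db.set_status(request_id, "retry")


def failing_send(request_id):
    raise TemporaryThirdPartyError("third party timed out")


db = FakeDatabase()
accept_request(db, "req-1")

db.available = False  # connection limit reached before the job runs
try:
    application_job(db, "req-1", failing_send)
except ConnectionLimitError:
    pass  # the retry marker was never written

print(db.status["req-1"])  # still "accepted"
```

In this sketch the request ends up stuck as accepted: the third-party call failed, and the write that would have marked it for retry failed too.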
I know this retry is good to have anyway as a guarantee, but I would like to solve the connection limit problem if possible.
Is it clear now? 