Flooding DynamoDB

There are three DynamoDB limits you must know:

  1. An individual record in DynamoDB is called an item, and a single DynamoDB item cannot exceed 400KB.
  2. A single Scan or Query call returns at most 1 MB of data; to read more, you have to paginate.
  3. You can use up to 3,000 Read Capacity Units (RCUs) and up to 1,000 Write Capacity Units (WCUs) on a single partition per second.
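To make limit #3 concrete, here is a rough sketch of the arithmetic. It relies on the documented DynamoDB billing rule that one RCU covers one strongly consistent read per second of an item up to 4 KB, and that an eventually consistent read costs half as much. The function name and the burst numbers are hypothetical, chosen only for illustration.

```python
import math

PARTITION_RCU_LIMIT = 3000  # per-partition read ceiling from limit #3


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the RCUs consumed per second by single-item reads."""
    # Reads are billed in 4 KB blocks, rounded up, with a minimum of one block.
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    # Eventually consistent reads cost half of a strongly consistent read.
    cost_per_read = blocks if strongly_consistent else blocks / 2
    return cost_per_read * reads_per_second


# Hypothetical burst: 5,000 reads/s of a 1 KB item, all on one partition.
strong = read_capacity_units(1024, 5000)
eventual = read_capacity_units(1024, 5000, strongly_consistent=False)
print(strong > PARTITION_RCU_LIMIT)    # over the per-partition limit
print(eventual > PARTITION_RCU_LIMIT)  # under it, at half the cost per read
```

The point of the sketch is that the ceiling is per partition, so a small, hot set of items can hit it no matter how much capacity the table as a whole has.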

During a high-velocity event one afternoon, we learned the important lesson of #3.

DynamoDB has a read limit of 3,000 RCUs per second on a single partition

Our user traffic is sporadic and comes in bursts. We can receive a lot of requests in just 10 seconds!

We got through the rest of the day, painfully, but we made it. Now we had to solve this before the next day.

In 12 hours!

The data stored is small and lives in only a handful of items, so partitioning and sharding would not help. We didn’t have time to incorporate DAX (DynamoDB Accelerator).

The team came up with two options:

  1. Set up a Redis cache
  2. Cache the data at the edge in our CDN

I took ownership of setting up the CDN proof of concept. I found the following header:

The Cache-Control max-age directive lets you specify how long (in seconds) that you want an object to remain in the cache before the CDN gets the object again from the origin server. The minimum expiration time our CDN supports is 0 seconds. The maximum value is 100 years.

Hmmm... how would 1 second work?

What would our ratio of cache hits to cache misses be?
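Trying this out only requires the origin to send the header. Below is a minimal sketch, assuming a Python WSGI origin (the origin stack is not described here, so the app, the payload, and the `CACHE_SECONDS` name are all hypothetical). Whether the CDN honors a 1-second value also depends on its own minimum TTL settings.

```python
CACHE_SECONDS = 1  # the max-age value under test


def app(environ, start_response):
    """Hypothetical origin endpoint that lets the CDN cache for one second."""
    body = b'{"status": "ok"}'  # placeholder payload
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        # Tell caches the response stays fresh for CACHE_SECONDS seconds.
        ("Cache-Control", f"max-age={CACHE_SECONDS}"),
    ]
    start_response("200 OK", headers)
    return [body]


# To serve it locally with the standard library:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, app).serve_forever()
```

With this in place, repeated requests within the same second can be answered by the edge without touching the origin or DynamoDB.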

We ran some tests and saw favorable results:

1.6 cache misses per second = 84% cache hits

This exceeded our expectations.

We ran more tests and averaged 81% cache hits.
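The link between a miss rate and a hit percentage is simple arithmetic: the hit ratio is one minus misses divided by total requests. The test request rate is not stated above, so the 10 requests per second used below is an assumption; it is simply the rate at which 1.6 misses per second works out to 84% hits. The function name is hypothetical.

```python
def cache_hit_ratio(misses_per_second, requests_per_second):
    """Fraction of requests served from the cache instead of the origin."""
    return 1 - misses_per_second / requests_per_second


# Assumed test load of 10 requests/s (not stated in the article).
print(f"{cache_hit_ratio(1.6, 10):.0%}")

# With a short max-age the miss rate stays roughly bounded, so the same
# 1.6 misses/s at a hypothetical 100 requests/s gives a much higher ratio.
print(f"{cache_hit_ratio(1.6, 100):.1%}")
```

This is why even a 1-second cache lifetime can pay off under bursty traffic: the busier the burst, the larger the share of requests that never reach the origin.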

Ultimately the Redis cache prevailed and saved the day. It remains in place and will keep running until we can architect the right solution. Still, it was good to know that we could use the CDN for this challenge if and when needed, as well as for similar use cases in the future.
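For anyone who has not put a cache in front of DynamoDB before, the general pattern is cache-aside: check Redis first and only read the table on a miss. The sketch below shows that pattern in general, not our actual implementation. The function name, the loader callback, and the 1-second TTL are assumptions for illustration; the `cache` object only needs the `get` and `setex` methods that the redis-py client provides.

```python
import json


def get_with_cache(cache, key, load_from_dynamodb, ttl_seconds=1):
    """Return the item for key, reading DynamoDB only on a cache miss."""
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: no DynamoDB read is consumed.
        return json.loads(cached)
    # Cache miss: fall back to the table, then store the result with a TTL.
    item = load_from_dynamodb(key)
    cache.setex(key, ttl_seconds, json.dumps(item))
    return item


# Hypothetical wiring with redis-py and a loader of your own:
#   import redis
#   cache = redis.Redis(host="localhost", port=6379)
#   item = get_with_cache(cache, "config", my_dynamodb_loader)
```

Passing the loader in as a function keeps the sketch independent of how the table is actually read, and makes it easy to test without AWS or a Redis server.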

Looking ahead, we plan to use edge computing with MQTT to solve this challenge, reducing calls to the origin as well as response time to the client.

Hope this helps if you are stuck in the same predicament with time ticking…

Principal Site Reliability Engineer. Cyber Security Professional. Technologist. Leader.

Dale Frohman
