DynamoDB Single Table Design: A Brilliant Hack That's Probably Torturing Your Team

I'll be the brave one and say it out loud - Single Table Design in DynamoDB is premature optimization for 90% of applications.

Now before you close this tab in righteous indignation, hear me out. I've seen this pattern implemented beautifully at hyper scale ...and I've also seen it turn a $200/month database into a $50,000/year developer-productivity black hole. The difference isn't the pattern itself. It's whether your application actually needs it. And in most cases - it doesn't.


The Filing Cabinet Thought Experiment

Imagine you've just been hired at a company. Your first task: find Invoice #4521 from March.

In Office A, there are three filing cabinets: "Invoices," "Customer Records," and "Orders." You walk to the Invoices cabinet, flip to March, pull the file. Done. Your predecessor labeled everything clearly. Why? Because they weren't a psychopath.

In Office B, there's one gigantic filing cabinet. Everything is in there - labeled with Post-its - but all of it in the one cabinet. Invoices, customer records, orders, important notes from 2019, someone's lunch receipt. Each file is labeled with a code: INVOICE#4521, CUSTOMER#789, ORDER#2024-03-15#001. The robot filing clerk who designed this system can retrieve any document in 3 milliseconds. But not you - you don't know the invoice number for the March receipt, so you have to look through every single file, one by one.

This is Single Table Design versus Multi-Table Design. And that 3-millisecond advantage? In my experience, it's rarely worth the tears.


A Brief History of a Brilliant Hack

This was a shocker for me - Single Table Design wasn't born from best practices. It was born from desperation.

In the early 2010s, DynamoDB was a far more constrained beast. It was, and has always been, a specialized tool for a specific job. Back then you got five Global Secondary Indexes (GSIs) per table. A hard limit of five. If you had six different query patterns, too bad: spin up another table, provision separate capacity units, and watch your AWS bill and your operational overhead multiply.

A small group of architects at Amazon (led by Rick Houlihan) found a clever workaround. Instead of fighting the limitations, they embraced them. They created "Index Overloading": generic partition keys (PK) and sort keys (SK) that could store any entity type. A single GSI could now support users, orders, invoices, and product catalogs. The robot filing clerk was happy.

A bug was now officially a feature. This was genuinely brilliant. In 2012.

Here's what changed:

  • GSI limits increased from 5 to 20 per table
  • On-Demand pricing eliminated the "table-per-entity" cost penalty
  • DynamoDB got faster, and networks got faster too

The original constraints that justified Single Table Design? Largely gone. Yet the pattern persists - not because it is still optimal, but because it has become dogma. Conference talks, blog posts, and certification exams continued to preach it as "the way." I'm literally adding to that pile with this post!


The Trade-Off Nobody Talks About

Let's get mechanical for a moment. Single Table Design offers one undeniable advantage: pre-joined data retrieval.

By cramming a user's details, their recent orders, their shipping addresses, and more into the same partition, you can fetch everything in a single query. One network round-trip instead of potentially dozens. In theory, this is fantastic. The database is doing the work.

In practice, you've just shifted that work (and then some) to your application code.
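
To make that shift concrete, here's a minimal boto3 sketch of the "pre-joined" read - assuming a hypothetical table named AppTable with generic PK/SK attributes. One Query returns every entity type in the partition, and your application code gets to sort out what's what:

```python
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("AppTable")  # hypothetical single-table name

# One network round-trip fetches the user, their orders, and their addresses.
response = table.query(KeyConditionExpression=Key("PK").eq("USER#123"))

# ...but now the application has to "un-mix" the entity types itself.
user, orders, addresses = None, [], []
for item in response["Items"]:
    sk = item["SK"]
    if sk.startswith("METADATA#"):
        user = item
    elif sk.startswith("ORDER#"):
        orders.append(item)
    elif sk.startswith("ADDRESS#"):
        addresses.append(item)
```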

Aspect               Single Table                            Multi-Table
Query Speed          Marginally faster (single round-trip)   Slightly slower (parallel queries)
Code Complexity      High (composite keys, prefixes)         Low (standard CRUD operations)
Debugging            Nightmare (generic PK, SK everywhere)   Straightforward (descriptive names)
Schema Changes       Super expensive ETL migrations          Add a table, add an index
Analytics Readiness  Painful (everything's mixed)            Natural (one table = one concept)

The database is slightly happier. Your developers are significantly less happy. And here's the thing about developer happiness: it directly correlates with how fast your team ships features and how often they introduce bugs.


The Cognitive Load Crisis

Forrest Brazeal once described a well-optimized Single Table Design as looking "more like machine code than a simple spreadsheet." He wasn't being hyperbolic.

Here's what debugging looks like in a Single Table:

PK                    SK                          Data
USER#123              METADATA#                   { "name": "Alice", "email": "..." }
USER#123              ORDER#2024-03-15#001        { "total": 59.99, ... }
USER#123              ORDER#2024-03-14#002        { "total": 124.50, ... }
USER#123              ADDRESS#HOME                { "street": "123 Main St", ... }
USER#123              ACTIVITYLOG#2024-03-15T10:  { "action": "login", ... }

Quick: find all the orders ...and filter by date. Now explain to the new hire why the sort key starts with ORDER# but the address one starts with ADDRESS#. Now explain the tilde hack (appending ~ to sort keys so parent items sort correctly)...
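
For illustration, here's roughly what "find the March orders" looks like against that layout - a boto3 sketch assuming the same hypothetical AppTable. Because the date lives inside the sort-key string, date filtering becomes string-prefix matching:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("AppTable")  # hypothetical

# "All orders for user 123 in March 2024": the ORDER# prefix selects the
# entity type, and the embedded date makes this a lexicographic prefix match.
response = table.query(
    KeyConditionExpression=Key("PK").eq("USER#123")
    & Key("SK").begins_with("ORDER#2024-03")
)
march_orders = response["Items"]
```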

This is what I call ASCII sorcery: clever tricks that work beautifully for the database but create tribal-knowledge dependencies. It is rarely documented, and even when it is, the documentation rarely keeps up with how the table grows. New team members can't be productive until they've absorbed months of context. Simple debugging sessions become archaeology expeditions. You can't run a DISTINCT-style query to discover all the entity variations. Above all, every exploration is very, very costly. Single tables are not self-documenting.

"Don't torture your developers for $5 a month."

That's not my quote. It's the emerging consensus from engineers who've lived through the Single Table experience.


The $5 vs. $5,000 Math

Let me make this concrete with some napkin math.

The Single Table pitch: By pooling all entities into one table, you share provisioned capacity. Economies of scale - less waste, better utilization, a lower bill.

The reality: On-Demand pricing charges per request, not per table. Whether your reads hit one table or twenty, the cost per read request unit (RRU) is identical.

But let's say you're on Provisioned Capacity, and Single Table genuinely saves you $100/month in capacity pooling efficiency. Sounds great, right?

Now consider:

  • 40 hours of architect time designing the Single Table schema
  • 20 hours of documentation (because nobody can understand it otherwise)
  • 8 hours per bug debugging through cryptic partition keys
  • 16 hours for every schema change requiring an ETL migration

At even $75/hour (conservative for senior engineers), that's $6,300 before you've saved your first dollar. Break-even is more than five years away. And that assumes just one bug and one schema change in all that time - no churn, no pivots. (Spoiler: these assumptions have never been true for any project, ever.) You may get a few blog posts and conference talks out of the design, but the original goal remains a goal for the future.
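
Here's that napkin math in runnable form. Every number below is an illustrative assumption from the list above, not a measurement:

```python
# Illustrative assumptions from the article, not measurements.
HOURLY_RATE = 75       # $/hour, conservative for senior engineers
MONTHLY_SAVINGS = 100  # assumed capacity-pooling benefit, $/month

design_hours = 40      # architect time on the single-table schema
docs_hours = 20        # documentation nobody can skip
bug_hours = 8          # one debugging session through cryptic keys
migration_hours = 16   # one schema change requiring an ETL migration

total_hours = design_hours + docs_hours + bug_hours + migration_hours
complexity_cost = total_hours * HOURLY_RATE

print(f"Complexity cost: ${complexity_cost:,}")                       # $6,300
print(f"Break-even: {complexity_cost / MONTHLY_SAVINGS:.0f} months")  # 63 months
```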


The Analytics Wall

Every application eventually needs to answer ad-hoc questions like:

  • "What was our year-over-year growth by product category?"
  • "Which users are most likely to churn?"
  • "What's the average order value by region?"
  • "Which user caused the spike in April?"

DynamoDB is exceptional at transactional workloads ...and equally poor at analytics. Asking DynamoDB to do analytics is like asking a sprinter to run a marathon: technically possible, deeply uncomfortable for everyone involved. When you inevitably need to stream your data into Redshift or a BI tool, Single Table Design becomes your worst enemy.

Remember that beautiful mixed table of users, orders, and addresses? Your analytics team now sees:

pk        sk                    value (SUPER type)
USER#123  METADATA#             {"name": "Alice", ...}
USER#123  ORDER#2024-03-15#001  {"total": 59.99, ...}

They'll need to write PartiQL queries to "shred" this semi-structured data back into relational format. The performance gains you realized in your transactional database? You give them back, with interest, in the analytics layer.

Multi-Table Design, by contrast, maps naturally: Users table → users in Redshift. Orders table → orders in Redshift. Your analysts can write normal SQL on day one.


The Microservices Trap

Single Table Design and microservices are philosophical opposites pretending to be friends.

Microservices are built on each service owning its own data: independent deployment, independent scaling, clear separation boundaries. Single Table instead forces multiple services to share one table. A schema change for Service A can break Service B. A "noisy neighbor" service can end up throttling everyone. IAM permissions become a nightmare (good luck granting access to only specific entity types in a shared table).

This is the Shared Database Anti-Pattern wearing a NoSQL costume. The whole point of separate services is reducing blast radius; a single shared table turns it into a ticking time bomb.


When Single Table Actually Makes Sense (The 10%)

I'm not arguing Single Table Design is always wrong. In specific contexts, it's the right tool:

1. Hyper-scale core services: If you're processing millions of queries per second and sub-millisecond latency directly impacts revenue, the eliminated network round-trip matters. Think identity services, high-frequency trading systems, or core infrastructure at Netflix scale.

2. Stable, immutable access patterns: URL shorteners, logging pipelines, or event stores where access patterns are genuinely set in stone. If your queries haven't changed in three years and won't change in the next three, Single Table's rigidity is a feature to embrace.

3. Global Database: DynamoDB was one of the first truly global databases I worked with. Regional replicas across the globe, all synced with each other - it was magical. Understanding the consistency model was the key to making it work.

4. Expert teams with DynamoDB depth: If your team has shipped multiple Single Table implementations and genuinely understands the trade-offs, you're equipped to make an informed choice.

For everyone else (startups iterating on product-market fit, teams with normal DynamoDB experience, applications that will need analytics, systems where requirements change quarterly), Multi-Table Design is almost certainly the better bet.


The Pragmatic Path Forward

Whenever I hear DynamoDB or see a Single Table design doc, the following goes through my mind:

Can we start with Multi-Table Design? One table per entity type, with descriptive attribute names. Standard GSIs for your query patterns. This gives us readable data in the AWS Console and straightforward debugging. At an early stage things will change, access patterns will be refined, and data-driven decisions will need an analytics layer with flexible querying.
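
As a minimal sketch of what that looks like (all table and attribute names here are hypothetical):

```python
import boto3
from boto3.dynamodb.conditions import Key

# One table per entity, descriptive keys, nothing to decode.
dynamodb = boto3.resource("dynamodb")
users = dynamodb.Table("Users")    # partition key: user_id
orders = dynamodb.Table("Orders")  # partition key: user_id, sort key: order_date

user = users.get_item(Key={"user_id": "123"})["Item"]
march_orders = orders.query(
    KeyConditionExpression=Key("user_id").eq("123")
    & Key("order_date").begins_with("2024-03")
)["Items"]
```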

Can we use parallel queries? DynamoDB is fast. Two parallel queries at 15ms each often complete faster than developers expect, and the scaffolding is dramatically simpler. When you don't need to shave off milliseconds - don't try to.
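
And a sketch of the parallel-query version, reusing the hypothetical Users and Orders tables from above. Two independent queries in flight at once means the wall-clock cost is roughly the slower of the two, not their sum:

```python
import boto3
from concurrent.futures import ThreadPoolExecutor
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
users = dynamodb.Table("Users")    # hypothetical, as above
orders = dynamodb.Table("Orders")

def fetch_user(user_id):
    return users.get_item(Key={"user_id": user_id})["Item"]

def fetch_orders(user_id):
    return orders.query(
        KeyConditionExpression=Key("user_id").eq(user_id)
    )["Items"]

# Both queries run concurrently; total latency ~ max(a, b), not a + b.
with ThreadPoolExecutor(max_workers=2) as pool:
    user_future = pool.submit(fetch_user, "123")
    orders_future = pool.submit(fetch_orders, "123")
    user, user_orders = user_future.result(), orders_future.result()
```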

I'd consider Single Table only when:

  • We've profiled the application and network latency is definitely the bottleneck
  • The access patterns are stable and well-understood
  • The team has the expertise to maintain it, or we've experimented with a non-critical use case first
  • The math actually works (savings > complexity cost)

Make use of tooling if you must. There are now libraries like ElectroDB, DynamoDB-Toolbox, and OneTable that can reduce Single Table complexity. They're genuinely good tools. But they're treating a symptom rather than questioning whether you need the disease.


The Bottom Line

Single Table Design was a brilliant architectural hack for a database service that had real limitations. Today those limitations have largely been removed. The pattern remains useful in specific large-scale scenarios, but for the vast majority of applications it's a sign of premature optimization of the most expensive kind: trading marginal database efficiency for massive developer inefficiency.

The most valuable resource in your organization isn't the RCU or the WCU - it's the focused, productive time of your engineering team. Build for agility first. Optimize for database efficiency only when the profiler tells you to, not when a conference talk from 2018 implies you should.

AI Disclosure: Rumor has it that this article was crafted by a real human named Mayank (of mayankraj.com fame) - but who's to say? The artwork, in some cases, took quite a few virtual brushstrokes from Bing Image Generator. Claude and Gemini kindly helped in hunting down typos and grammatical oopsies. But all the sentences (including this very one), the bad puns, quirky ideas, and alleged human charm? That's (probably) all Mayank... if he even exists!