AWS DynamoDB

AWS DynamoDB (DDB) is a NoSQL database that is, in essence, a highly durable B-tree, meaning searches, sequential access, insertions, and deletions can be done in logarithmic time (fast).

Available in regions such as us-east-2, us-east-1, us-west-1, us-west-2, and ca-central-1. DynamoDB is a tier-one service (a first-class citizen) in all AWS regions because AWS itself relies on it.

On Amazon Prime Day 2019 it delivered:
Across the 48 hours of Prime Day, there were 7.11 trillion calls to the DynamoDB API, peaking at 45.4M requests per second.

On Amazon Prime Day 2020 it delivered:
Across the 66 hours of Prime Day, there were 16.4 trillion calls to the DynamoDB API, peaking at 80.1M requests per second.
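
As a quick sanity check on these figures, the average request rate can be derived from the totals and compared with the reported peaks. This is a minimal sketch; the function name average_rps is illustrative only.

```python
def average_rps(total_calls: float, hours: float) -> float:
    """Average requests per second over a window of the given length."""
    return total_calls / (hours * 3600)


# Figures quoted above: (total API calls, window in hours, reported peak req/s)
prime_days = {
    2019: (7.11e12, 48, 45.4e6),
    2020: (16.4e12, 66, 80.1e6),
}

for year, (calls, hours, peak) in prime_days.items():
    avg = average_rps(calls, hours)
    print(f"{year}: average {avg / 1e6:.1f}M req/s, peak {peak / 1e6:.1f}M req/s")
```

The averages come out at roughly 41.1M and 69.0M requests per second, so in both years the peak was only about 10-16% above the sustained rate.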

Characteristics

  • You can try DynamoDB locally: install Java and run DynamoDB Local, or run DynamoDB Local with Docker.
  • Fully managed: no need to upgrade, and no storage limit. It has scaled beyond 100TB (see One Year of DynamoDB).
  • Its philosophy is Single Table Design (compared to the relational database world, where you have many normalized tables).
  • All data is encrypted: Amazon DynamoDB encrypts all customer data at rest. Access is controlled through IAM roles.
  • Under the hood, it is a NoSQL database built from a bunch of MySQL instances.

DynamoDB Terms mapped to Relational Database

  • table is a table in a relational database
  • item is a row in a relational database
  • attribute is a column in a relational database
  • Primary key is either one attribute (partition key) or two attributes (partition key + sort key). In a key schema, the key type is HASH for a partition key and RANGE for a sort key. See this post.
  • Supports JSON since 2014
  • PK: Partition key
  • SK: Sort Key
  • GSI: Global Secondary Index
  • LSI: Local Secondary Index
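
To make the HASH / RANGE terminology concrete, here is a minimal sketch that builds CreateTable-style parameters as a plain dictionary. The helper table_definition, the table name "app-data", and the attribute names "PK" and "SK" are hypothetical, and string-typed keys plus on-demand billing are assumptions made for the example.

```python
def table_definition(table_name, partition_key, sort_key=None):
    """Build CreateTable-style parameters for a table with string key attributes."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        # A composite primary key adds a RANGE (sort) key after the HASH key.
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append({"AttributeName": sort_key, "AttributeType": "S"})
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


# Hypothetical single-table layout with generic PK / SK attribute names.
params = table_definition("app-data", partition_key="PK", sort_key="SK")
print(params["KeySchema"])
```

With boto3 installed and credentials configured, a dictionary of this shape could be passed along the lines of boto3.client("dynamodb").create_table(**params); that call is not made here so the sketch runs without an AWS account.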

Who is using DynamoDB

Notion, Dropbox, Adobe, Autodesk, Netflix, Duolingo, Snapchat.

Duolingo Case Study

31 billion rows
24K read / 3.3K write ops

Duolingo supports 80 different language courses, 18 million active users, and 6 billion exercises, with only 2 people in DevOps.

Dropbox Alki, or how we learned to stop worrying and love cold

Alki: an audit logs application.

hot: high I/O throughput, fast, random access => RAM
cold: low I/O throughput => Disk

They use Amazon DynamoDB as the hot store, Amazon S3 as the cold store.
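
The hot / cold split can be illustrated with a toy two-tier store. This is only a sketch of the general idea, not Dropbox's actual implementation: the class TieredStore, its methods, and the timestamp-based cutoff are all hypothetical, and two dictionaries stand in for DynamoDB and S3.

```python
class TieredStore:
    """Toy two-tier store: recent records live in `hot`, older ones in `cold`."""

    def __init__(self):
        self.hot = {}   # stands in for the hot store (DynamoDB in the case study)
        self.cold = {}  # stands in for the cold store (S3 in the case study)

    def write(self, key, value, timestamp):
        # New data always lands in the hot tier.
        self.hot[key] = (timestamp, value)

    def offload(self, cutoff):
        """Move every record written before `cutoff` from hot to cold; return the count."""
        old_keys = [k for k, (ts, _) in self.hot.items() if ts < cutoff]
        for k in old_keys:
            self.cold[k] = self.hot.pop(k)
        return len(old_keys)

    def read(self, key):
        # Check the fast tier first, then fall back to the cheap one.
        record = self.hot.get(key)
        if record is None:
            record = self.cold.get(key)
        return None if record is None else record[1]


store = TieredStore()
store.write("log-1", "login", timestamp=100)
store.write("log-2", "upload", timestamp=200)
moved = store.offload(cutoff=150)
print(moved, store.read("log-1"), store.read("log-2"))
```

Readers never need to know which tier holds a record; only the offload step decides when data becomes cold.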

Pricing

DynamoDB charges for read/write operations plus storage, rather than for the number of CPUs and the amount of RAM as AWS RDS and other typical databases do.

  • $1.25 per million write request units
  • $0.25 per million read request units
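
The two prices above are enough to project a request bill. The sketch below assumes every request consumes exactly one request unit (small items); the function name request_cost and the example workload of 7 million reads and 3 million writes are illustrative.

```python
WRITE_PRICE_PER_MILLION = 1.25  # USD per million write request units (listed above)
READ_PRICE_PER_MILLION = 0.25   # USD per million read request units (listed above)


def request_cost(reads: int, writes: int) -> float:
    """On-demand request cost in USD, assuming one request unit per request."""
    return reads / 1e6 * READ_PRICE_PER_MILLION + writes / 1e6 * WRITE_PRICE_PER_MILLION


monthly = request_cost(reads=7_000_000, writes=3_000_000)
print(f"${monthly:.2f}")  # 7 * 0.25 + 3 * 1.25 = $5.50
```

Larger items consume more than one unit per request, so a real bill for the same request counts can be higher.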

Capacity is billed either on-demand or as (planned) provisioned capacity. Once you have figured out how your application behaves (its access patterns are identified), costs can be projected with DynamoDB.

There are strongly consistent reads (which reflect a write immediately) and eventually consistent reads (which may not reflect a recent write, and cost half as much). Reading an item that does not exist also costs 1 / 0.5 read capacity units for a strongly / eventually consistent read.
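
The read cost rule can be sketched as a small function. The 4 KB block size is not stated in this post; it is taken from the AWS documentation and should be treated as an assumption here, as should the function name read_units.

```python
import math

READ_UNIT_BYTES = 4 * 1024  # assumption (AWS docs): one read unit covers up to 4 KB


def read_units(item_size_bytes: int, strongly_consistent: bool) -> float:
    """Read capacity consumed by reading one item of the given size."""
    # A missing item (size 0) is still billed as one block.
    blocks = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    return blocks * (1.0 if strongly_consistent else 0.5)


print(read_units(0, strongly_consistent=True))      # missing item, consistent read
print(read_units(0, strongly_consistent=False))     # missing item, eventual read
print(read_units(6000, strongly_consistent=True))   # 6000 bytes spans two 4 KB blocks
```

Under these assumptions a missing item costs 1 or 0.5 units, matching the figures above, and a 6000-byte item costs 2 units for a consistent read.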

Besides write/read capacity:

Data Storage

  • $250 for 1TB per month
  • $2500 for 10TB per month
  • $25000 for 100TB per month

Real-world cost I found

  • $6k per month. On-demand cost per quarter: 1TB data, 10 billion rows, 1 billion read, 200M write, 2.5ms latency Source
  • $6 per month. 7 million read, 3 million write Source

What is available

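  • Query: Read multiple items that share the same partition key. The result is paginated.
  • Scan: Read all items in a table. The result is also paginated.
  • PutItem: Write a single item (create it, or replace an existing item with the same primary key).
  • BatchWriteItem: Put or delete multiple items in one request.
  • GetItem: Get a single item (row) by its primary key.
  • BatchGetItem: Get up to 100 items from one or more tables.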
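
Because Query results are paginated, callers usually loop until no LastEvaluatedKey comes back, passing it as ExclusiveStartKey on the next call. The sketch below shows that loop; query_all is a hypothetical helper, and FakeClient is an in-memory stand-in (with a simplified key shape) so the example runs without AWS.

```python
def query_all(client, **params):
    """Collect every item from a paginated Query by following LastEvaluatedKey."""
    items = []
    while True:
        page = client.query(**params)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # Resume the next page right after the last key of this page.
        params["ExclusiveStartKey"] = last_key


class FakeClient:
    """In-memory stand-in for a DynamoDB client, returning `page_size` items per call."""

    def __init__(self, items, page_size):
        self.items, self.page_size = items, page_size

    def query(self, ExclusiveStartKey=None, **_ignored):
        # Simplification: the "key" here is just a list index, not a real key map.
        start = 0 if ExclusiveStartKey is None else ExclusiveStartKey["index"] + 1
        end = start + self.page_size
        page = {"Items": self.items[start:end]}
        if end < len(self.items):
            page["LastEvaluatedKey"] = {"index": end - 1}
        return page


client = FakeClient(items=[{"id": n} for n in range(5)], page_size=2)
all_items = query_all(client, TableName="demo")
print(len(all_items))
```

A real client would also need a key condition in the Query parameters; it is omitted here because the stand-in ignores it.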

FAQ