DynamoDB for Beginners: Moving Beyond Relational Database Thinking

As our application CGCircuit transitions from a custom web application to integrating with Shopify, we're reimagining our data architecture with Amazon DynamoDB. In this post, I'll share insights from this journey to help you understand when and how to use DynamoDB effectively.

The Mental Shift: From SQL Tables to Access Patterns

The biggest challenge when moving from MySQL to DynamoDB isn't syntax or setup—it's the fundamental change in how you think about data. Let me illustrate this with our Users table.

In MySQL, we might design a Users table like this:

CREATE TABLE users (
  user_id INT PRIMARY KEY,
  first_name VARCHAR(255),
  last_name VARCHAR(255),
  email VARCHAR(255),
  is_admin BOOLEAN,
  membership_id INT,
  created_at TIMESTAMP
  -- ... other fields
);

With this structure, we can query users in virtually any way:

SELECT * FROM users WHERE email = 'user@example.com';
SELECT * FROM users WHERE is_admin = true;
SELECT * FROM users WHERE created_at > '2024-01-01';

Any of these queries can be made efficient by adding an index on the column involved, without changing the table design.

In DynamoDB, this approach doesn't work.

Instead of starting with table structure, you must begin by listing your access patterns:

Get user by user_id
Get users by shopify_id
Get users created between date range
Get active users
Get admin users
Get featured authors
Get users in membership

Then, you design your table and indexes to support these specific patterns—and only these patterns.

A Real Example: The Users Table

Here's how we implemented our Users table in DynamoDB:

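import boto3

# Assumed setup: the original post does not show how these two names are defined
dynamodb = boto3.resource('dynamodb')
TABLE_PREFIX = 'dev_'  # placeholder value

def create_users_table():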
    table = dynamodb.create_table(
        TableName=f"{TABLE_PREFIX}users",
        KeySchema=[
            {'AttributeName': 'user_id', 'KeyType': 'HASH'}  # Partition key
        ],
        AttributeDefinitions=[
            {'AttributeName': 'user_id', 'AttributeType': 'S'},
            {'AttributeName': 'shopify_id', 'AttributeType': 'S'},
            {'AttributeName': 'created_at', 'AttributeType': 'S'},
            {'AttributeName': 'is_admin', 'AttributeType': 'S'},
            {'AttributeName': 'is_featured_author', 'AttributeType': 'S'}
        ],
        GlobalSecondaryIndexes=[
            {
                'IndexName': 'shopify_id-index',
                'KeySchema': [
                    {'AttributeName': 'shopify_id', 'KeyType': 'HASH'}
                ],
                'Projection': {'ProjectionType': 'ALL'},
                'ProvisionedThroughput': {'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}
            },
            {
                'IndexName': 'admin-created_at-index',
                'KeySchema': [
                    {'AttributeName': 'is_admin', 'KeyType': 'HASH'},
                    {'AttributeName': 'created_at', 'KeyType': 'RANGE'}
                ],
                'Projection': {'ProjectionType': 'ALL'},
                'ProvisionedThroughput': {'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}
            },
            {
                'IndexName': 'featured_author-created_at-index',
                'KeySchema': [
                    {'AttributeName': 'is_featured_author', 'KeyType': 'HASH'},
                    {'AttributeName': 'created_at', 'KeyType': 'RANGE'}
                ],
                'Projection': {'ProjectionType': 'ALL'},
                'ProvisionedThroughput': {'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}
            }
        ],
        BillingMode='PROVISIONED',
        ProvisionedThroughput={'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}
    )
    return table

Let's break down what's happening here:

Primary Key: user_id is our partition key (HASH)
Global Secondary Indexes (GSIs):
- shopify_id-index: For looking up users by Shopify ID
- admin-created_at-index: For finding admin users, sorted by creation date
- featured_author-created_at-index: For finding featured authors, sorted by creation date

Key DynamoDB Concepts Illustrated

1. Primary Key Structure

DynamoDB offers two types of primary keys:

Simple: Just a partition key (user_id in our case)
Composite: Partition key + sort key

Our Users table uses a simple primary key because we primarily access users by their unique ID. For tables where range queries are important, we'd use a composite key.
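
To make the two key shapes concrete, here is a minimal sketch of the KeySchema each one produces. The helper build_key_schema is hypothetical and exists only for illustration, and the composite example reuses created_at as a sort key purely as an assumption; our real Users table keeps the simple key.

```python
def build_key_schema(partition_key, sort_key=None):
    """Return a KeySchema list in the shape create_table expects.

    build_key_schema is a hypothetical helper for illustration only.
    """
    schema = [{'AttributeName': partition_key, 'KeyType': 'HASH'}]
    if sort_key is not None:
        # Adding a sort key turns the simple key into a composite key
        schema.append({'AttributeName': sort_key, 'KeyType': 'RANGE'})
    return schema


# Simple primary key, as used by our Users table
simple_key = build_key_schema('user_id')

# Composite primary key (the choice of created_at here is an assumption)
composite_key = build_key_schema('user_id', 'created_at')
```

With a composite key, items that share a partition key are stored in sort-key order, which is what makes range queries within one partition possible.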

2. Design for Access Patterns

Notice how we created GSIs specifically for our common access patterns. Want to look up a user by Shopify ID? Use the shopify_id-index. Need all admin users? Query the admin-created_at-index with is_admin = 'true'.
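
One way to keep this honest is to write the access patterns down next to the key or index that serves each one. The sketch below does that for the list from earlier in this post; the dictionary and the helper uncovered_patterns are illustrative assumptions, and the None entries reflect my reading of the table definition shown here, not a complete picture of our production schema.

```python
# Maps each access pattern to the key or index that serves it.
# None marks a pattern that the table definition in this post does not cover.
ACCESS_PATTERNS = {
    'Get user by user_id': 'primary key (user_id)',
    'Get users by shopify_id': 'shopify_id-index',
    'Get admin users': 'admin-created_at-index',
    'Get featured authors': 'featured_author-created_at-index',
    'Get users created between date range': None,
    'Get active users': None,
    'Get users in membership': None,
}


def uncovered_patterns(patterns):
    """Return the access patterns that have no key or index yet."""
    return sorted(name for name, index in patterns.items() if index is None)


for name in uncovered_patterns(ACCESS_PATTERNS):
    print(f"No index yet for: {name}")
```

Each uncovered pattern needs either a new index or a decision that the pattern is not worth supporting.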

3. No Ad-hoc Queries

In MySQL, you can write arbitrary queries:

SELECT * FROM users WHERE email LIKE '%@company.com' AND is_premium = true ORDER BY last_login DESC;

In DynamoDB, you can only query efficiently on the keys and indexes you've defined. Our schema, for example, has no index on email or premium status, so the query above would require a Scan that reads every item in the table and filters the results afterwards.
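
To see why an unindexed lookup is costly, here is a toy in-memory model. It is not the DynamoDB API: the list, the dictionary index, and both functions are assumptions made up for this illustration. It only counts how many items each approach has to touch.

```python
from collections import defaultdict

# Toy in-memory stand-in for a table; this is NOT the DynamoDB API
users = [
    {'user_id': '1', 'email': 'ann@company.com', 'is_admin': 'true'},
    {'user_id': '2', 'email': 'bob@example.com', 'is_admin': 'false'},
    {'user_id': '3', 'email': 'cat@company.com', 'is_admin': 'false'},
    {'user_id': '4', 'email': 'dan@example.com', 'is_admin': 'false'},
]

# An index groups items by one attribute ahead of time, like a GSI partition key
admin_index = defaultdict(list)
for user in users:
    admin_index[user['is_admin']].append(user)


def query_admins():
    """Indexed lookup: touches only the items stored under the key."""
    matches = admin_index['true']
    return matches, len(matches)


def scan_by_email_domain(domain):
    """No index on email: every item is read first, then filtered."""
    items_read = 0
    matches = []
    for user in users:
        items_read += 1
        if user['email'].endswith(domain):
            matches.append(user)
    return matches, items_read


admins, admin_reads = query_admins()
company_users, scan_reads = scan_by_email_domain('@company.com')
print(f"query read {admin_reads} item(s), scan read {scan_reads} item(s)")
```

The indexed lookup reads one item while the scan reads all four, and that gap grows with the size of the table.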

Writing Data Access Logic

Let's look at how we'd implement basic user operations with this structure:

from boto3.dynamodb.conditions import Key

class UsersTable:
    def __init__(self):
        self.table_name = f"{TABLE_PREFIX}users"
        self.table = dynamodb.Table(self.table_name)
    
    def get_user_by_id(self, user_id):
        """Get a user by their primary user_id"""
        response = self.table.get_item(Key={'user_id': str(user_id)})
        return response.get('Item')
    
    def get_user_by_shopify_id(self, shopify_id):
        """Get a user by their Shopify ID using GSI"""
        response = self.table.query(
            IndexName='shopify_id-index',
            KeyConditionExpression=Key('shopify_id').eq(str(shopify_id))
        )
        items = response.get('Items', [])
        return items[0] if items else None
    
    def get_admin_users(self, created_after=None):
        """Get all admin users, optionally filtered by creation date"""
        key_condition = Key('is_admin').eq('true')
        
        if created_after:
            key_condition = key_condition & Key('created_at').gt(created_after)
            
        response = self.table.query(
            IndexName='admin-created_at-index',
            KeyConditionExpression=key_condition
        )
        return response.get('Items', [])
    
    def create_user(self, user_data):
        """Create a new user"""
        # Ensure boolean values are stored as strings for GSI keys
        if 'is_admin' in user_data:
            user_data['is_admin'] = str(user_data['is_admin']).lower()
        
        if 'is_featured_author' in user_data:
            user_data['is_featured_author'] = str(user_data['is_featured_author']).lower()
            
        # Ensure user_id is a string
        user_data['user_id'] = str(user_data['user_id'])
        
        self.table.put_item(Item=user_data)
        return user_data

Notice how our query methods map directly to the access patterns we defined. There's no "find users by arbitrary criteria" method—you can only look up data using the indexes you defined.
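
One caveat: a single Query call returns at most 1 MB of data, so the methods above return only the first page when a result set is larger. A minimal sketch of following LastEvaluatedKey is below. The helper query_all_pages and the FakeTable stand-in are hypothetical; FakeTable exists only so the example runs without AWS, and its offset-based key is an assumption (a real LastEvaluatedKey holds the key attributes of the last item read).

```python
def query_all_pages(table, **query_kwargs):
    """Run a Query repeatedly until no LastEvaluatedKey is returned.

    query_all_pages is a hypothetical helper; `table` is anything with a
    boto3-style query(**kwargs) method.
    """
    items = []
    while True:
        response = table.query(**query_kwargs)
        items.extend(response.get('Items', []))
        last_key = response.get('LastEvaluatedKey')
        if last_key is None:
            return items
        # Resume the next request where the previous page ended
        query_kwargs['ExclusiveStartKey'] = last_key


class FakeTable:
    """Stand-in for a boto3 Table that serves two items per page (demo only)."""

    def __init__(self, items, page_size=2):
        self.items = items
        self.page_size = page_size

    def query(self, **kwargs):
        start = kwargs.get('ExclusiveStartKey', {}).get('offset', 0)
        end = start + self.page_size
        response = {'Items': self.items[start:end]}
        if end < len(self.items):
            response['LastEvaluatedKey'] = {'offset': end}
        return response


fake = FakeTable([{'user_id': str(n)} for n in range(5)])
all_items = query_all_pages(fake, IndexName='admin-created_at-index')
```

Against a real table, the same helper could be called with self.table and the KeyConditionExpression used in get_admin_users.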

When to Use DynamoDB vs. MySQL

Use DynamoDB When:

You have predictable access patterns: DynamoDB excels when you know exactly how you'll query your data.
You need single-digit millisecond performance at any scale: DynamoDB can maintain consistent performance from 1 to 1,000,000 requests per second.
You have high write throughput requirements: DynamoDB's distributed architecture handles massive write loads efficiently.
You want managed scaling with minimal operational overhead: With on-demand capacity, DynamoDB scales automatically with no manual intervention.
Your data has a simple structure: If your data fits a key-value or document model without complex relationships.

Use MySQL When:

You need flexible querying: If you can't predict all query patterns in advance, SQL's flexible query language shines.
Your data is highly relational: If you need to join many tables or have complex relationships, relational databases excel.
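You need complex transactions: MySQL supports ACID transactions that span multiple tables and rows with multi-statement logic. DynamoDB also offers transactions, but each one is a single request limited to a bounded set of items.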
Your application requires complex aggregations: GROUP BY, HAVING, complex JOINs—these SQL features don't exist in DynamoDB.
You have a fixed scale with predictable growth: If your scale is moderate and predictable, the complexity of DynamoDB might not be worth it.

Real-World Implementation Tips

1. Denormalize Aggressively

In MySQL, we normalize data to avoid duplication. In DynamoDB, we often duplicate data to avoid joins.

For example, in our Users table, we might include membership plan details directly in the user record, even though this duplicates data across users with the same plan.
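
Here is a rough sketch of what that denormalized user item might look like. The plan fields (plan_name, max_streams), the plan ID 7, and the helper build_user_item are all made up for illustration; the post does not show our real membership attributes.

```python
# Hypothetical plan data; in MySQL this would live in its own table
MEMBERSHIP_PLANS = {
    7: {'plan_name': 'Pro', 'max_streams': 3},
}


def build_user_item(user_id, first_name, membership_id):
    """Return a user item with the plan details copied into it."""
    plan = MEMBERSHIP_PLANS[membership_id]
    return {
        'user_id': str(user_id),
        'first_name': first_name,
        'membership_id': membership_id,
        # Duplicated on every user with this plan, so a read needs no second lookup
        'membership_plan_name': plan['plan_name'],
        'membership_max_streams': plan['max_streams'],
    }


item = build_user_item(42, 'Ada', 7)
```

The trade-off is on the write side: when a plan changes, every user item that carries a copy of it has to be updated.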

2. Use Composite Attributes

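For our watched_info table, we key items on a composite attribute: video_user, which stores video_id and user_id joined as video_id#user_id. (This is different from the composite primary key described earlier, which pairs a partition key with a sort key.) Packing two IDs into one attribute is a common DynamoDB pattern for representing many-to-many relationships.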
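
A minimal sketch of building and parsing such a value is below. The function names are hypothetical, and the check that rejects IDs containing the separator is my own assumption about a sensible safeguard, not something our code is shown to do.

```python
SEPARATOR = '#'


def make_video_user_key(video_id, user_id):
    """Join video_id and user_id into a single video_user value."""
    video_id, user_id = str(video_id), str(user_id)
    if SEPARATOR in video_id or SEPARATOR in user_id:
        # The separator must not appear inside a part, or parsing is ambiguous
        raise ValueError("IDs must not contain '#'")
    return f"{video_id}{SEPARATOR}{user_id}"


def split_video_user_key(video_user):
    """Recover (video_id, user_id) from a video_user value."""
    video_id, user_id = video_user.split(SEPARATOR)
    return video_id, user_id


key = make_video_user_key(1001, 57)
```

Building the value in one place keeps the order and separator consistent across every reader and writer of the table.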

3. Plan for Data Evolution

DynamoDB is schema-less, which is both powerful and dangerous. You can add attributes at any time, and you can add or remove GSIs on an existing table, but changing a table's primary key requires creating a new table and migrating the data.

Document your data model thoroughly and think ahead about future access patterns.
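
Secondary indexes are the more forgiving part of this: a GSI can be added to an existing table with an update_table call. As a sketch, the helper below builds the arguments for adding an index on membership_id. The helper name, the index name membership_id-index, and the choice to store membership_id as a string are all assumptions for illustration; the capacity numbers mirror the ones in create_users_table.

```python
def build_add_gsi_request(table_name, index_name, partition_key):
    """Build keyword arguments for an update_table call that adds a GSI.

    build_add_gsi_request is a hypothetical helper for illustration only.
    """
    return {
        'TableName': table_name,
        # Any new key attribute must be declared with its type
        'AttributeDefinitions': [
            {'AttributeName': partition_key, 'AttributeType': 'S'}
        ],
        'GlobalSecondaryIndexUpdates': [
            {
                'Create': {
                    'IndexName': index_name,
                    'KeySchema': [
                        {'AttributeName': partition_key, 'KeyType': 'HASH'}
                    ],
                    'Projection': {'ProjectionType': 'ALL'},
                    'ProvisionedThroughput': {
                        'ReadCapacityUnits': 5,
                        'WriteCapacityUnits': 5,
                    },
                }
            }
        ],
    }


request = build_add_gsi_request('users', 'membership_id-index', 'membership_id')
# With a boto3 client this would be sent as: client.update_table(**request)
```

DynamoDB backfills a new index from the existing items, so the index is not usable until that process completes.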

4. String Representations for Boolean Indexes

Notice in our create_user method, we convert boolean values to strings:

if 'is_admin' in user_data:
    user_data['is_admin'] = str(user_data['is_admin']).lower()

This is because key attributes in DynamoDB, for both tables and indexes, must be of type String, Number, or Binary. A Boolean attribute cannot serve as a GSI key, so we store the strings 'true' and 'false' instead.
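
One thing to watch with str(value).lower() is that it accepts anything: the integer 1 becomes '1' and None becomes 'none', neither of which a query for 'true' will ever match. A stricter variant is sketched below; both function names are hypothetical and this is not what our create_user method currently does.

```python
def bool_to_index_value(value):
    """Convert a real boolean to the 'true'/'false' string stored in the index."""
    if not isinstance(value, bool):
        # str(1).lower() would silently produce '1', which no query looks for
        raise TypeError(f"expected bool, got {type(value).__name__}")
    return 'true' if value else 'false'


def index_value_to_bool(value):
    """Convert the stored string back to a boolean when reading an item."""
    if value not in ('true', 'false'):
        raise ValueError(f"unexpected index value: {value!r}")
    return value == 'true'


stored = bool_to_index_value(True)
restored = index_value_to_bool(stored)
```

Failing loudly on unexpected input keeps bad values out of the index, where they would otherwise be invisible to every query.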

Conclusion

DynamoDB requires a fundamental shift in how you think about data modeling. Start with access patterns, not table structure. Design for specific queries, not general flexibility. Optimize for the critical path, not for every possible scenario.

When designed properly, DynamoDB offers incredible performance, seamless scaling, and minimal operational overhead. But it requires careful planning and a willingness to learn a new data modeling paradigm.

As we continue our journey from a custom application to a Shopify-integrated platform, DynamoDB's flexibility and performance have proven invaluable—once we embraced its unique approach to data.

Happy NoSQL modeling!

Carlo Sansonetti
