DynamoDB: How To Design & Query One To Many Relationships

This guide, is going to go over how to design and query one-to-many relationships in Dynamo DB. We are going to focus on the “single table design” version of this problem.

If you are using multi table design: The plan for the design and query is exactly the same as you would do in a traditional DB. Just connect the two tables using the PK (partition key). The “child record” will have the PK of the parent record & we are done.

Our Goal : No Scans! Only Query

We need to be able to fetch the “children” (the many part of the relationship) with a DynamoDB Query.

We could have made it such that you get the result with a Scan. But that would be very inefficient and costly and slow.

Here is a quote about the difference between Queries and Scans from the AWS docs.

If possible, you should avoid using a Scan operation on a large table or index with a filter that removes many results. Also, as a table or index grows, the Scan operation slows. The Scan operation examines every item for the requested values and can use up the provisioned throughput for a large table or index in a single operation. For faster response times, design your tables and indexes so that your applications can use Query instead of Scan.

– From “Best practices for querying and scanning data (AWS Doc)

The Project / Data We Will Work On: A Mobile Phone Database

As an example, we are going to try to design a Database of Mobile phones. (A very small part of it.)

Let me first show you the ERD of a traditional DB.

Mobile Phone Database ERD With Many One To Many Relationships(Click on the image to open in full screen)

As you can see, the DB contains the following entities:

  1. Mobile Phone
  2. Brand
  3. Mobile Phone Camera
  4. In The Box (What comes in the box with a mobile phone)

Below is the list of “one to many” relationships in the DB:

  1. A Brand Has Many Mobile Phones
  2. A Mobile Phone Has Many Mobile Phone Cameras
  3. A Mobile Phone Has Many Items In The Box

Now, let’s try to construct this using a “single table design”.

Method 1 : Using A Partition Key & Sort Key To Create Relationships

The idea here is that when representing a one-to-many relationship, we use the following structure:

  1. For the parent of the relationship, use the partition key
  2. For the child in the relationship, use the sort key

And, what exactly do we store in the PK & SK?

For both the PK and SK, we store a string (hash) for uniquely identifying the particular entity.

For example: a4e5f41f3733586c5b143e80bb39cab8

But, we prefix that with the type of entity:

For example: PHONE

PHONE#a4e5f41f3733586c5b143e80bb39cab8

Using the above 2 things, we are going to be able to query everything we need!

Let’s Use An Excel / Google Sheet To See How All The Data Can Be Stored In A Single Table

I have prepared a Google Sheet, which you can study to see how all the data is stored.

Note: To make things more clear, I have made the hashes small and human-readable and having a clue about the kind of data they are representing.

You can view it here in a new screen.

I found it very useful to use Google Sheets to design how the data will be storedYour mileage may vary.

Check out the video below in order to see how all the one to many queries will be made.

How To Run A Query To Get All The Brands (Not ‘one to many’, but we will start with this one)

Now, we will start to see how these queries will be made in the AWS Console.

In order to get the brands, we will use the following Query Parameters:

  1. PK (Partition key) = BRAND
  2. No Sort Key Used

Note: Brand is a little bit different from the other queries. The Brand entity does not have a parent. So the PK of the Brand entries is just “BRAND”.

query to get all the brands. only use PK and no SK(Click on the image to open in full screen)

How To Get All The Phones Where The Brand Is Apple (Brand Has Many Phones, A One-To-Many Relationship)

In order to get the brands, we will use the following Query Parameters:

  1. PK (Partition key) = BRAND#apple_3EM (we got this from the last query)
  2. SK (Sort Key) Begins With PHONE

This is where the prefix we put in comes in handy. All the children will have the same prefix of “PHONE” and their PK will be BRAND#apple_3EM.

getting all the phones that belong to a brand(Click on the image to open in full screen)

How To Get All The Cameras Where The Phone Is iPhone 13 (Phone Has Many Cameras, A One-To-Many Relationship)

In order to get the brands, we will use the following Query Parameters:

  1. PK (Partition key) = PHONE#iphone_13_hbS (we got this from the last query)
  2. SK (Sort Key) Begins With CAMERA

one to many relationship between phone and cameras(Click on the image to open in full screen)

Method 2: When The Children Don’t Take Much Data To Store & You Are Never Going To Query On Basis Of The Children’s Attributes

There is one last one-to-many relationship we have not talked about yet. A phone has many things in the box.

Now, in this case, note that:

  1. The data that represents the items in a box is a small array of items like: Charger, Wire, Booklet. This will always be quite small and will surely take up less space than 400k. 400kb is the max size of the data. (See the image below.)
  2. Also, we are not at all likely to query on basis of the items in the box.

In this case, we can just put the items in an array or list and store it directly on the parent.

Note The Item Size Limits From The AWS Docs..

dynamo db item size limits from the AWS docs

(Click on the image to open in full screen)

Here is how things look on the AWS console when you create a “List” attribute…

aws dynamo db has many relationship using a list

(Click on the image to open in full screen)

This might make you feel very uncomfortable because it breaks everything you have learned about DB design in normal relational DBs like Postgres, MySQL etc.

Having said that, this is the NoSQL way for some has many relationships.

Conclusion

There are some more ways of doing one-to-many relationships. For example, using a GIS etc. But the above suggested methods are quite versatile. They should help you model the data in most situations.

Hope this helped.