DynamoDB - Part2

WeDoIT is a cloud solutions and consulting company founded by cloud experts. As a way of giving back and sharing knowledge with the community, our team has started creating these tutorials and cloud topic summaries. Check out our cloud computing cost optimizer tool, INVOKE, to save 50 to 80% on cloud hosting costs.

Tables Best practices

Tables are distributed across multiple partitions.

DynamoDB is “optimized” for uniform distribution of items across a table’s partitions.

A single partition can hold approximately 10 GB of data and can support a maximum of 3,000 read capacity units (RCU) and 1,000 write capacity units (WCU).

The formula to estimate how many partitions DynamoDB might create for a table, based on its provisioned throughput, is (rounded up to the next whole number):
          (provisioned read capacity units / 3000) + (provisioned write capacity units / 1000)

For example, if you create a table with 1,000 read capacity units and 1,000 write capacity units, a single partition cannot support the specified throughput capacity:
    ( 1,000 / 3,000 ) + ( 1,000 / 1,000 ) = 1.333 --> 2
In this case, DynamoDB will create 2 partitions for the table to store the data.
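
The calculation above can be sketched in a few lines of Python. This is only an illustration of the throughput formula from this post, not an AWS API. The function name and the constants are our own, and the result is an estimate because DynamoDB manages partitions internally.

```python
import math

# Per-partition throughput limits quoted in this post.
MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000


def estimate_partitions_by_throughput(read_capacity_units, write_capacity_units):
    """Estimate the partition count from provisioned throughput, rounding up."""
    raw = (read_capacity_units / MAX_RCU_PER_PARTITION
           + write_capacity_units / MAX_WCU_PER_PARTITION)
    # A table always has at least one partition.
    return max(1, math.ceil(raw))


print(estimate_partitions_by_throughput(1000, 1000))  # prints 2
```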

If your table’s data CAN FIT in a “single partition” (even with estimated data growth), then DON’T worry about fine-tuning the table or spending time on these best practices.

If your table data is (or could be) distributed across more than one partition, you need to follow best practices to achieve “maximum provisioned throughput capacity”. Otherwise your DynamoDB calls might suffer degraded performance due to “hot spots”.

“Hot spots” means repeatedly accessing ONLY A FEW of the partitions in a table. For example, if a table has 10 partitions but most of your data I/O operations hit only the first two, those two partitions are considered “hot spots”.

Why are “hot spots” bad? When your table has more than one partition, the table’s provisioned capacity is split approximately evenly across the partitions. As a result, 1) the “provisioned capacity” allocated to rarely used partitions is wasted, and 2) under high load, response times might not be what you expect, because the hot partitions have only their own share of the capacity to work with.

For example, assume you provisioned 100 RCU and 20 WCU for your table and the table has 10 partitions. The table’s provisioned capacity is split between the partitions, which means each partition gets 10 RCU and 2 WCU. If your table is designed in a way that most of the requests are served by only the FIRST TWO partitions, then the capacity allocated to the remaining 8 partitions is not utilized optimally.
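
The arithmetic in this example can be sketched as follows. This is a simplified model that assumes the even split described in this post. The function names are hypothetical, and the real behaviour of a table may differ.

```python
def per_partition_capacity(table_rcu, table_wcu, partition_count):
    """Split table capacity evenly across partitions (an approximation)."""
    return table_rcu / partition_count, table_wcu / partition_count


def usable_capacity(table_rcu, table_wcu, partition_count, hot_partitions):
    """Capacity reachable when traffic lands on only `hot_partitions` partitions."""
    rcu_each, wcu_each = per_partition_capacity(table_rcu, table_wcu, partition_count)
    return rcu_each * hot_partitions, wcu_each * hot_partitions


rcu_each, wcu_each = per_partition_capacity(100, 20, 10)
print(rcu_each, wcu_each)  # 10.0 2.0

usable_rcu, usable_wcu = usable_capacity(100, 20, 10, hot_partitions=2)
print(usable_rcu, usable_wcu)  # 20.0 4.0
```

Under this model, a workload that only touches two of the ten partitions can reach roughly 20 RCU and 4 WCU, even though 100 RCU and 20 WCU were provisioned.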

Avoid I/O “hot spots” by designing your tables so that read and write activity is spread across all items in the table.

When your workload (read/write requests) accesses more “distinct partition key values”, DynamoDB can spread the activity across more partitions, which lets you use more of the “provisioned throughput/capacity”.

From this we can conclude that the “workload pattern on individual items” and the choice of “primary key/partition key” together determine how well a table uses its provisioned throughput (in other words, its “throughput efficiency”).

So, while designing a table, pick a “primary key/partition key” by keeping two things in mind:
  1. Choose a field/attribute that can span a large number of distinct values (for example, UserID in a users table)
  2. Read/write requests based on this field/attribute should be as close to random (evenly spread across its values) as possible

A good balance between these two things is the best we can do to maximize provisioned throughput.
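
The effect of key choice can be illustrated with a small simulation. This is a rough sketch, not how DynamoDB works internally: its hash function is not public, so MD5 is used here purely as a stand-in, and the partition count, user IDs and status values are all hypothetical.

```python
import hashlib
import random
from collections import Counter

PARTITION_COUNT = 10  # hypothetical, same as the 10-partition example in this post


def partition_for(key):
    """Map a partition key value to a partition index (MD5 is a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % PARTITION_COUNT


def requests_per_partition(keys):
    """Count how many requests land on each partition."""
    return Counter(partition_for(key) for key in keys)


rng = random.Random(42)  # fixed seed so the run is repeatable

# High-cardinality key: 10,000 requests spread over 1,000 hypothetical user IDs.
user_requests = [f"user-{rng.randrange(1000)}" for _ in range(10_000)]

# Low-cardinality key: the same number of requests over just two values.
status_requests = [rng.choice(["ACTIVE", "INACTIVE"]) for _ in range(10_000)]

by_user = requests_per_partition(user_requests)
by_status = requests_per_partition(status_requests)

print("partitions hit by UserID key:", len(by_user))
print("partitions hit by status key:", len(by_status))
```

With only two distinct key values, the requests can never reach more than two partitions, no matter how many the table has. The UserID workload has enough distinct values to reach all of them.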

A good way to estimate how well your primary key will utilize provisioned throughput: throughput is used more efficiently as the ratio of “partition key values accessed” to the “total number of partition key values in the table” grows.
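
Read as a ratio (distinct partition key values accessed divided by the total number of partition key values in the table), this estimate could be computed roughly as below. The function name and sample data are hypothetical, and the ratio is a rule-of-thumb indicator rather than an official DynamoDB metric.

```python
def key_access_ratio(accessed_keys, total_distinct_keys):
    """Fraction of the table's partition key values that the workload touches."""
    if total_distinct_keys <= 0:
        raise ValueError("total_distinct_keys must be positive")
    return len(set(accessed_keys)) / total_distinct_keys


# Hypothetical workload: 5 requests touching 3 distinct users out of 4 in the table.
ratio = key_access_ratio(["u1", "u2", "u1", "u3", "u2"], total_distinct_keys=4)
print(ratio)  # 0.75
```

The closer the ratio is to 1, the more of the table's key values (and so, typically, its partitions) the workload is exercising.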
