Database Best Practices

General Database Best Practices

ACID and BASE

ACID (Atomicity, Consistency, Isolation, Durability) is a set of transaction properties that provide strict consistency guarantees. Each transaction either completes fully or has no effect, concurrent transactions do not interfere with one another, and committed changes survive failures, so the data stored in the system always remains in a consistent state.

BASE (Basically Available, Soft State, Eventually Consistent) is a set of properties that describe a more relaxed set of consistency guarantees. It favors availability and scalability over strict consistency: replicas may temporarily disagree, but they converge to the same state once updates stop arriving.
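The atomicity part of ACID can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for any transactional database; the `accounts` table, the `transfer` helper, and the `CHECK` constraint are assumptions made for illustration only.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when funds are insufficient.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # debit fails, so the credit is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

In the failed transfer, the credit to the destination account has already been executed when the debit is rejected, yet the rollback leaves both balances as they were after the first transfer.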

RDBMS

AWS RDS

AWS Aurora

MySQL

  • Backup

    • Back up all MySQL databases on a regular schedule, and test restores periodically to confirm that the backups are usable.
  • Schedule Maintenance

    • Routine maintenance of MySQL databases should be scheduled and performed.
  • Secure Access

    • Limit access to MySQL databases to authorized users only, and grant each account the minimum privileges it needs.
  • Enable Binary Logging

    • Enable binary logging to maintain a record of all changes in the database. The binary log is required for replication and point-in-time recovery.
  • Use Appropriate Storage Engines

    • Choose the storage engine that fits each table's workload. InnoDB, the default engine, supports transactions, row-level locking, and foreign keys.
  • Utilize Indexes

    • Index the columns used in WHERE, JOIN, and ORDER BY clauses to speed up lookups. Remove unused indexes, since every index adds overhead to writes.
  • Tune MySQL Configuration

    • MySQL configuration should be tuned regularly for optimal performance.
  • Monitor Database Performance

    • Database performance should be monitored regularly to detect potential issues.
  • Develop Guidelines

    • Guidelines should be developed in order to ensure proper use of MySQL databases.
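To illustrate the indexing and monitoring advice above, here is a small sketch that compares a query plan before and after an index is created. It runs against SQLite through Python's `sqlite3` module purely so that it is self-contained; in MySQL the equivalent check would use its own `EXPLAIN` statement, whose output format differs. The `orders` table, the `query_plan` helper, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL, "
    "total REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)


def query_plan(conn, sql, params=()):
    """Return the plan steps SQLite reports for a query, joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable description.
    return " | ".join(row[-1] for row in rows)


sql = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index on customer_id the engine has to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index in place the plan switches to an index search.
plan_after = query_plan(conn, sql, (42,))
```

Checking the plan before and after a schema change is a cheap way to confirm that an index is actually used rather than assuming it is.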

Multiple Table Joins

  • Using Primary Key for Joins

    • Join on primary key columns where possible. A primary key is unique and indexed, so each join value matches at most one row in that table, which keeps matches unambiguous and lookups fast.
  • Using Foreign Keys for Joins

    • Join child tables to parent tables through foreign keys. A foreign key constraint guarantees that every referencing value exists in the parent table, so the join returns only related rows.
  • Using Exact Data Types for Joins

    • When joining tables together, make sure the columns being joined have the same data type (and, for strings, the same character set and collation). Mismatched types force implicit conversions, which can prevent index use and produce unexpected matches.
  • Using Aliases for Joins

    • When joining multiple tables together, give each table an alias and qualify every column with it. This removes ambiguity when tables share column names and makes the query easier to read.
  • Using the Correct Join Type

    • Choose the join type that matches the result you need. An INNER JOIN returns only rows that have a match in both tables, while a LEFT JOIN also keeps rows from the left table that have no match, so the same tables can return different results depending on the join type.
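The join guidelines above can be sketched as follows. The example uses Python's `sqlite3` module so that it runs without a server; the SQL is kept generic enough that it should also be valid in MySQL. The `customers` and `orders` tables and their contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers (id),  -- same type as customers.id
    total       INTEGER NOT NULL
);
INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders (id, customer_id, total) VALUES (10, 1, 25), (11, 1, 40);
""")

# INNER JOIN: only customers that have at least one order appear.
inner_rows = conn.execute("""
    SELECT c.name, o.total
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

# LEFT JOIN: every customer appears, including those without orders.
left_rows = conn.execute("""
    SELECT c.name, COUNT(o.id) AS order_count
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()
```

Here the customer without orders is absent from `inner_rows` but present in `left_rows` with a count of zero, which is the difference the join type controls.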

Vertical Scaling

  • Monitoring

    • Monitor metrics such as CPU, memory, disk I/O and disk space utilization to understand how the server is performing and how it reacts to different workloads.
  • Query Optimization

    • Analyze queries (for example with EXPLAIN) to identify and remove costly operations such as full table scans and unnecessary joins.
  • Indexes

    • Create indexes on tables to improve query performance.
  • Partitioning

    • Partition large tables so that queries read only the relevant partitions instead of the whole table.
  • Sharding

    • Once a single server can no longer be scaled up, use sharding to distribute your database workload over multiple database instances or server nodes.
  • Caching

    • Use caching to reduce the load on the database server and improve query performance.
  • Replication

    • Use replication to increase database availability and read performance by distributing read queries over multiple database instances.
  • Storage

    • Choose the right type of storage for your database. For example, for I/O-intensive workloads, consider using SSDs for lower latency and higher throughput.
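As one illustration of the caching item above, the following is a minimal in-process sketch of a cache placed in front of a database lookup. The `ReadCache` class, the `load_user_from_db` function, and the TTL value are all hypothetical; a production setup would more likely use a shared cache service and a real query in place of the stub.

```python
import time


class ReadCache:
    """Small cache-aside style helper with a per-entry time to live."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # function that reads from the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh hit: the database is not touched
        value = self._loader(key)  # miss or expired: fall back to the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry, for example right after the row is updated."""
        self._entries.pop(key, None)


db_calls = []


def load_user_from_db(user_id):
    # Stand-in for a real query; it records each call so the effect is visible.
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = ReadCache(load_user_from_db, ttl_seconds=30.0)
first = cache.get(7)   # miss, loads from the "database"
second = cache.get(7)  # hit, served from memory
cache.invalidate(7)
third = cache.get(7)   # miss again after invalidation
```

Invalidating on write keeps the cache from serving stale rows, at the price of one extra database read after each update.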

NoSQL - AWS

AWS DynamoDB

According to AWS docs:

  • Partition key design – Use a high-cardinality partition key to distribute load evenly.

  • Adjacency list design pattern – For managing one-to-many and many-to-many relationships.

  • Sparse index

    • Use sparse indexes for your GSIs.
    • When you create a GSI, you specify a partition key and optionally a sort key. Only items in the base table that contain the corresponding GSI partition key attribute appear in the index, which makes it sparse.
    • This keeps GSIs smaller and cheaper to store and query.
  • Index overloading – Use the same GSI for indexing various types of items.

  • GSI write sharding – Add a calculated or random suffix to the GSI partition key so that writes are spread across partitions instead of concentrating on a single hot key, and query all suffixes when reading.

  • Large items – DynamoDB items are limited to 400 KB. Store only metadata inside the table, save the blob in Amazon S3, and keep the reference in DynamoDB. Alternatively, break large items into multiple items and index them efficiently by using sort keys.

  • Data modeling

    • Data modeling is a key concept for successful implementation of DynamoDB.
    • It is highly recommended to use the NoSQL Workbench for Amazon DynamoDB to understand how to best structure your data.
    • This will help you build the correct data model and optimize the read and write performance.
  • Choose the right partition key

    • Choosing the right partition key is important as this is used to physically store and retrieve data in DynamoDB.
    • Use a partition key that produces uniform access patterns across partitions. This avoids hot partitions and throttling, which helps you save costs and improve performance.
  • Use sparse indexes

    • DynamoDB provides the ability to create secondary indexes on one or more attributes.
    • It is recommended to use sparse indexes for attributes that are not present in all items.
    • This will help save storage and minimize the cost.
  • Use batch operations

    • DynamoDB supports batch operations which allow you to perform multiple operations in a single request.
    • Use batch operations when possible to reduce network round trips and improve throughput. Note that each item in a batch still consumes the same read or write capacity as it would in an individual request.
  • Use caching

    • Cache frequently accessed items and query results. This reduces the number of requests to DynamoDB, which lowers latency and cost.
  • Use the right read/write capacity mode

    • DynamoDB provides two modes for read and write capacity: on-demand and provisioned. On-demand suits unpredictable or spiky traffic, while provisioned capacity (optionally with auto scaling) is usually cheaper for steady, predictable workloads.
  • Optimize for workloads

    • Optimizing for your workload reduces costs and improves performance. For read-heavy workloads, consider Amazon DynamoDB Accelerator (DAX), an in-memory cache for DynamoDB, to improve read performance.
  • Use the right data type

    • Choosing the right data type is important for efficient storage and retrieval of data. It is recommended to use the correct data type for each attribute to save storage and optimize performance.
  • Monitor performance

    • Monitoring the performance of DynamoDB is important to ensure your application is performing optimally. It is recommended to use Amazon CloudWatch to monitor the performance and take corrective action if necessary.
  • Use server-side encryption

    • Use server-side encryption to protect your data at rest from unauthorized access. DynamoDB encrypts tables at rest by default, so the main decision is which type of encryption key to use.
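The write sharding idea from the list above can be sketched in plain Python without calling AWS. The shard count, the `#` separator, the key names, and both helper functions are assumptions for illustration; the right number of shards depends on the write volume of the key being sharded.

```python
import hashlib

SHARD_COUNT = 8  # assumption: tune to the write rate of the hot key


def sharded_key(base_key, item_id, shard_count=SHARD_COUNT):
    """Derive a partition key with a calculated suffix from a stable item ID."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key, shard_count=SHARD_COUNT):
    """List every partition key a reader has to query to see all items."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


# Writers: items that share one logical key are spread over several partitions.
keys_used = {sharded_key("STATUS#OPEN", f"order-{i}") for i in range(1000)}

# Readers: issue one query per shard key and merge the results.
keys_to_query = all_shard_keys("STATUS#OPEN")
```

A calculated suffix, as opposed to a random one, lets a reader that knows the item ID go straight to the right shard, while a full listing still requires one query per shard.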