Database Antipatterns
(Compiled from AWS, Azure, Google and and other community sources)
The following are antipatterns to avoid:
General Database Antipatterns
Avoid:
- Overnormalization:
- This is when the database design is overly normalized, leading to too many tables and complex relationships
- Solution: Make sure to design the database in a way that is easy to understand and maintain.
- Overreliance on Stored Procedures:
- Stored procedures can be useful for improving security, but can also add complexity and make debugging difficult.
- Solution: Use stored procedures sparingly and only when necessary.
- Poor Indexing:
- Indexing can improve query performance, but poor indexing can lead to inefficient queries.
- Solution: Utilize the appropriate indexing strategy for the given data.
- Poor Security Practices:
- Not using encryption, using weak passwords, or not following best practices can create security holes in the database.
- Solution: Follow security best practices and use encryption wherever possible.
- Poor Data Validation:
- Not validating data coming into or out of the database can lead to data integrity issues.
- Solution: Apply appropriate data validation on data entering and exiting the database.
- Unnecessary Complexity:
- Avoiding the use of complex design techniques when a simpler approach would suffice.
- Solution: Utilize the simplest design that meets the requirements.
- Poor Error Handling:
- Not handling errors in a consistent and reliable way can lead to unexpected and unpredictable behavior.
- Solution: Utilize a consistent and reliable error handling strategy.
- Poor Data Partitioning: Partitioning data improperly can lead to performance issues. Solution:
- Utilize an appropriate data partitioning strategy for the given data.
- Poor Monitoring:
- Not monitoring the database for performance or security issues can lead to unexpected problems.
- Solution: Utilize a suitable monitoring strategy for the given data.
- Poor Backup Strategy:
- Not having a suitable backup strategy can lead to data loss.
- Solution: Utilize a reliable and consistent backup strategy.
- Failing to divide large datasets into smaller subsets with similar characteristics, thereby missing out on the opportunity to use databases that are tailored to meet the data and growth needs.
- Not examining data access patterns initially, which can lead to extra time and effort being required in the future.
- Opting for data storage techniques that do not increase in scale at the rate required.
- Utilizing just one database type and supplier for all tasks.
- Maintaining the same database due to existing internal knowledge and expertise of it.
- Employing the same database solution because it worked well in an offline setting.
- Not evaluating data access patterns.
- Not being familiar with the setup options of the selected data management solution.
- Relying simply on increasing the instance size without considering other available configuration options.
- Not assessing the scaling capabilities of the chosen solution.
- You only use the pre-defined metrics provided by your monitoring software.
- You only review metrics when there are problems.
- You only track system-level metrics, not capturing data access or usage metrics.
- You only use one database vendor to streamline operations management.
- You presume that data access patterns will not fluctuate over time.
- You embed complex transactions, rollback, and consistency logic into your application.
- The database is configured to prepare for potential surges in traffic, leaving the database resources idle for most of the time.
- You share the same database for both transactional and analytical purposes.
SQL and RDBMS Database Antipatterns
- Not having a clear data model:
- Not having a clear data model can lead to issues with data consistency, performance, and scalability.
- Using the wrong data types:
- Using the wrong data types can lead to issues with data consistency, performance, and storage requirements.
- Not normalizing the database:
- Not normalizing the database can lead to data inconsistencies, redundancy, and reduced scalability.
- Overreliance on indexing:
- Overreliance on indexing can lead to poor performance, increased storage requirements, and reduced scalability.
- Not optimizing queries:
- Not optimizing queries can lead to poor performance, especially as the size of the database increases.
- Not using transactions:
- Not using transactions can lead to data inconsistencies, security vulnerabilities, and reduced durability.
- Not backing up the database:
- Not backing up the database can lead to data loss in case of hardware failures, software bugs, or other issues.
- Not handling concurrency correctly:
- Not handling concurrency correctly can lead to data inconsistencies, race conditions, and other concurrency issues.
- Not considering the growth of the database:
- Not considering the growth of the database can lead to issues with performance, storage requirements, and scalability as the size of the database increases.
- Not considering security:
- Not considering security can lead to data breaches, data loss, and other security vulnerabilities.
- Overreliance on triggers:
- Overreliance on triggers can lead to poor performance, data inconsistencies, and other issues.
- Using cursors instead of set-based operations:
- Using cursors instead of set-based operations can lead to poor performance and reduced scalability.
- Not considering the cost of JOINs:
- Not considering the cost of JOINs can lead to poor performance and reduced scalability, especially for large databases.
- Using too many stored procedures:
- Using too many stored procedures can lead to poor performance and reduced scalability, especially for complex databases.
- Not considering the impact of data archiving and deletion:
- Not considering the impact of data archiving and deletion can lead to issues with performance, storage requirements, and data consistency.
- Not considering the impact of time zones:
- Not considering the impact of time zones can lead to issues with data consistency, especially for distributed databases.
- Not considering the impact of hardware and network limitations:
- Not considering the impact of hardware and network limitations can lead to issues with performance, scalability, and durability, especially for high-traffic or high-volume databases.
- Not enforcing referential integrity:
- Not enforcing referential integrity can lead to data inconsistencies, orphaned records, and reduced data quality.
- Not considering data archiving and data retention policies:
- Not considering data archiving and data retention policies can lead to issues with performance, storage requirements, and data compliance.
- Using too many nullable columns:
- Using too many nullable columns can lead to issues with data consistency, data quality, and performance.
Key-Value Database Antipatterns
Avoid:
- Storing too much unstructured data:
- Storing unstructured data in a key-value database can lead to issues with querying, indexing, and data retrieval.
- Not defining a clear data model:
- Not defining a clear data model can lead to issues with data consistency, performance, and scalability.
- Using overly complex key structures:
- Using overly complex key structures can lead to issues with data retrieval and indexing.
- Not considering data serialization and deserialization:
- Not considering data serialization and deserialization can lead to issues with performance and data consistency.
- Using too many indexes:
- Using too many indexes can lead to issues with performance and storage requirements.
- Not optimizing queries:
- Not optimizing queries can lead to poor performance, especially as the size of the database increases.
- Not handling concurrency correctly:
- Not handling concurrency correctly can lead to data inconsistencies, race conditions, and other concurrency issues.
- Not considering the growth of the database:
- Not considering the growth of the database can lead to issues with performance, storage requirements, and scalability as the size of the database increases.
- Not considering data backup and recovery:
- Not considering data backup and recovery can lead to data loss in case of hardware failures, software bugs, or other issues.
- Not considering security:
- Not considering security can lead to data breaches, data loss, and other security vulnerabilities.
- Not considering data partitioning:
- Not considering data partitioning can lead to issues with performance and scalability as the size of the database increases.
- Not considering data consistency and durability:
- Not considering data consistency and durability can lead to data loss, data corruption, and other issues.
- Using too many secondary indexes:
- Using too many secondary indexes can lead to issues with performance and storage requirements.
- Not considering data encoding and compression:
- Not considering data encoding and compression can lead to issues with performance and storage requirements.
- Not considering the type of workload:
- Not considering the type of workload can lead to issues with performance and scalability.
- Using a single key for all data:
- Using a single key for all data can lead to issues with performance, storage requirements, and data retrieval.
- Not considering the trade-offs between consistency, availability, and partition tolerance:
- Not considering the trade-offs between consistency, availability, and partition tolerance can lead to issues with data consistency, availability, and scalability.