Storage Checklist
Growth projections
- Identify the types of data to be stored and the expected growth rate for each type.
- Estimate the average object size for each type of data and the expected number of objects.
- Calculate the total amount of data that will need to be stored by multiplying the average object size by the expected number of objects.
- Consider any data compression, deduplication, or other data reduction techniques that may be used to lower the amount of data stored.
- Account for data retention policies, backups, and disaster recovery requirements that may affect storage growth projections.
- Monitor actual storage utilization and adjust projections as needed.
Example projections of monthly usage at a 5%/month compound growth rate (all values in MB; the last column is the running total of the 10 MB/user tier):
Month | 1 MB/user | 2 MB/user | 10 MB/user | Cumulative (10 MB/user) |
---|---|---|---|---|
1 | 1.0500 | 2.1000 | 10.5000 | 10.50 |
2 | 1.1025 | 2.2050 | 11.0250 | 21.53 |
3 | 1.1576 | 2.3153 | 11.5763 | 33.10 |
4 | 1.2155 | 2.4310 | 12.1551 | 45.26 |
5 | 1.2763 | 2.5526 | 12.7628 | 58.02 |
6 | 1.3401 | 2.6802 | 13.4010 | 71.42 |
7 | 1.4071 | 2.8142 | 14.0710 | 85.49 |
8 | 1.4775 | 2.9549 | 14.7746 | 100.27 |
9 | 1.5513 | 3.1027 | 15.5133 | 115.78 |
10 | 1.6289 | 3.2578 | 16.2889 | 132.07 |
11 | 1.7103 | 3.4207 | 17.1034 | 149.17 |
12 | 1.7959 | 3.5917 | 17.9586 | 167.13 |
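The table is straightforward to reproduce and adapt. The minimal Python sketch below computes the same projection; the 5% rate, the three per-user tiers, and the twelve-month horizon are the assumptions of this example, so substitute your own measured baseline and growth rate.

```python
# Sketch: reproduce the growth-projection table above.
GROWTH_RATE = 0.05      # assumed 5% compound growth per month
TIERS_MB = [1, 2, 10]   # assumed average storage per user, by tier
MONTHS = 12

cumulative = 0.0
print("Month | " + " | ".join(f"{t} MB/user" for t in TIERS_MB) + " | Cumulative (10 MB/user)")
for month in range(1, MONTHS + 1):
    factor = (1 + GROWTH_RATE) ** month
    values = [t * factor for t in TIERS_MB]
    cumulative += values[-1]  # running total of the largest tier, as in the table
    row = " | ".join(f"{v:.4f}" for v in values)
    print(f"{month} | {row} | {cumulative:.2f}")
```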
Type of Storage
Option | S3 (object) | EBS (block) | EFS (file) |
---|---|---|---|
Typical workload | Write-once, read-many objects such as media, logs, and backups | Low-latency transactional I/O for a single EC2 instance | Shared file access for many instances |
Access patterns | Whole-object reads and writes over HTTP | Random block-level reads and writes | Concurrent file-level reads and writes over NFS |
Performance | High throughput, higher per-request latency | Lowest latency | Low latency; throughput scales with the file system |
Durability | Designed for 99.999999999%, stored across multiple Availability Zones | Replicated within a single Availability Zone | Stored across multiple Availability Zones (Standard class) |
Scalability and elasticity | Virtually unlimited and fully managed | Fixed-size volumes that must be resized explicitly | Elastic; grows and shrinks automatically with usage |
Security | Encryption at rest and in transit; IAM and bucket policies | Encryption at rest and in transit; access limited to attached instances | Encryption at rest and in transit; IAM plus POSIX permissions |
Cost | Lowest per GB | Pay for provisioned capacity whether used or not | Highest per GB, but pay only for data actually stored |
Object Storage Checklist
- Organize objects into separate buckets by application, environment, or data classification
- Use unique, consistently structured object keys
- Define a clear object lifecycle (see the lifecycle-policy sketch after this list)
- Implement security best practices
- Consider using multiple cloud providers
- Consider data transfer costs
- Optimize object storage performance
- Utilize object versioning judiciously
- Monitor and optimize costs regularly
- Ensure interoperability and portability between cloud providers
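As a concrete example of defining an object lifecycle, the sketch below uses boto3 to transition objects under a prefix to cheaper storage classes as they age and to expire them after a year. The bucket name, prefix, and transition ages are illustrative assumptions, not recommendations.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and prefix; tune the ages to your access patterns.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-app-logs",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```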
Block Storage Checklist
- Size volumes appropriately
- Stripe data across multiple volumes when a single volume's IOPS or throughput limits are insufficient
- Utilize the appropriate block storage type for the data's access patterns and performance requirements
- Implement encryption for block storage volumes (see the sketch after this list)
- Consider the impact of block size when storing data
- Monitor and optimize costs regularly
- Use block storage for persistent data only
- Implement access controls
- Consider the impact of network latency
- Plan for disaster recovery
- Test backup and recovery procedures regularly
- Consider the impact of volume type
- Avoid over-provisioning block storage resources
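On AWS, several of these items come together when a volume is created. The boto3 sketch below provisions an encrypted gp3 volume; the Availability Zone, size, IOPS, and throughput values are illustrative assumptions to be replaced with measured requirements.

```python
import boto3

ec2 = boto3.client("ec2")

# Hypothetical sizing; match size, IOPS, and throughput to the workload's
# measured needs rather than over-provisioning.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",   # assumption: same AZ as the instance
    Size=100,                        # GiB
    VolumeType="gp3",
    Iops=3000,                       # gp3 baseline
    Throughput=125,                  # MiB/s, gp3 baseline
    Encrypted=True,                  # uses the account's default KMS key
    TagSpecifications=[{
        "ResourceType": "volume",
        "Tags": [{"Key": "app", "Value": "example"}],
    }],
)
print(volume["VolumeId"])
```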
Archive Storage Checklist
- Determine which data should be stored in archive storage based on its access patterns and performance requirements.
- Decide whether archive storage is appropriate for your organization's needs rather than defaulting to it for all data storage.
- Implement retention policies to specify how long data should be retained in archive storage.
- Implement encryption for archive storage to protect against security vulnerabilities and data breaches.
- Consider the impact of access latency when choosing the location of archive storage.
- Implement backup and recovery procedures in addition to replication for archive storage.
- Reserve archive storage for data that can tolerate long retrieval times; keep frequently accessed data in hotter storage tiers.
- Regularly review and delete expired data based on retention policies to avoid unnecessary storage costs and potential compliance violations.
- Regularly test backup and recovery procedures to ensure that data can be restored in the event of a disaster or other unexpected event.
- Plan for disaster recovery by replicating data across multiple regions or cloud providers to minimize the risk of data loss and downtime.
AWS Archive Storage Checklist
Amazon S3 offers a variety of storage classes designed for different data access patterns and performance requirements. Here's an overview of the different S3 storage classes:
- Amazon S3 Standard: This is the default storage class for S3 and provides high durability, availability, and performance for frequently accessed data.
- Amazon S3 Intelligent-Tiering: This storage class automatically moves objects between access tiers based on observed access patterns. This can help optimize costs for data with unknown or changing access patterns.
- Amazon S3 Standard-Infrequent Access (S3 Standard-IA): This storage class is designed for infrequently accessed data that needs to be readily available when accessed. It offers a lower storage cost than Amazon S3 Standard with the same low retrieval latency, but adds a per-GB retrieval fee and a 30-day minimum storage duration.
- Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA): This storage class is similar to S3 Standard-IA, but data is stored in a single Availability Zone, which makes it less expensive than S3 Standard-IA. However, it does not survive the loss of that Availability Zone, so it's best suited for infrequently accessed data that can be recreated easily.
- Amazon S3 Glacier: This is a low-cost storage option for data archiving and long-term retention. Retrieval times range from minutes to hours depending on the retrieval option chosen, so it's best suited for infrequently accessed data with long retention periods.
- Amazon S3 Glacier Deep Archive: This storage class provides the lowest cost storage option for long-term data archiving and digital preservation. Standard retrievals from Glacier Deep Archive take up to 12 hours (bulk retrievals up to 48 hours), so it's best suited for data that is rarely accessed and has long retention periods.
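The storage class is chosen per object at upload time. The boto3 sketch below writes an object directly into S3 Standard-IA; the bucket, key, file name, and class choice are illustrative assumptions.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and key; STANDARD_IA suits data accessed less than
# about once a month but needed within milliseconds when it is accessed.
with open("q1-summary.pdf", "rb") as body:
    s3.put_object(
        Bucket="example-reports",
        Key="2023/q1-summary.pdf",
        Body=body,
        StorageClass="STANDARD_IA",  # or GLACIER / DEEP_ARCHIVE for archives
    )
```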
Amazon S3 Glacier Deep Archive
Amazon S3 Glacier Deep Archive is a storage class in Amazon S3 Glacier that provides the lowest cost storage option for long-term data archiving and digital preservation. It is designed for customers who need to retain large amounts of data for many years or decades, and who don't require immediate or frequent access to that data.
Data stored in Amazon S3 Glacier Deep Archive is durably stored across multiple Availability Zones within an AWS Region and is designed for 99.999999999% durability. This means that even in the unlikely event that multiple disks or facilities fail, the data remains durable.
The retrieval times for data stored in Amazon S3 Glacier Deep Archive can be as long as 12 hours, which makes it best suited for infrequently accessed data that is not needed immediately. Retrieving data from Amazon S3 Glacier Deep Archive is a multi-step process that involves initiating a retrieval job, waiting for the data to become available, and then downloading the data.
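A sketch of what that multi-step retrieval might look like with boto3 is shown below. The bucket, key, and seven-day availability window are hypothetical values.

```python
import boto3

s3 = boto3.client("s3")

# Step 1: initiate a retrieval job for an archived object.
s3.restore_object(
    Bucket="example-archive",
    Key="backups/2020-full.tar",
    RestoreRequest={
        "Days": 7,  # how long the restored copy stays available
        "GlacierJobParameters": {"Tier": "Standard"},  # ~12h for Deep Archive
    },
)

# Step 2: poll until the restore completes; the Restore header flips from
# ongoing-request="true" to ongoing-request="false" when the copy is ready.
head = s3.head_object(Bucket="example-archive", Key="backups/2020-full.tar")
print(head.get("Restore"))

# Step 3: once restored, download as usual with get_object / download_file.
```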
Because Amazon S3 Glacier Deep Archive is a low-cost storage option, it's ideal for storing large amounts of data that is rarely accessed and has long retention periods, such as compliance data, backups, and archives.
- Determine which data should be stored in Amazon S3 Glacier or Amazon S3 Glacier Deep Archive based on its access patterns and performance requirements.
- Consider whether archive storage is appropriate for your organization's needs; if not, consider other Amazon S3 storage classes, such as Amazon S3 Standard or Amazon S3 Standard-IA.
- Implement lifecycle policies to specify how long data should be retained in Amazon S3 Glacier or Amazon S3 Glacier Deep Archive, and automatically transition it to the appropriate storage class based on its age or other criteria.
- Implement server-side encryption for Amazon S3 Glacier and Amazon S3 Glacier Deep Archive to protect against security vulnerabilities and data breaches.
- Consider the impact of access latency when choosing the AWS Region where Amazon S3 Glacier or Amazon S3 Glacier Deep Archive data is stored.
- Implement backup and recovery procedures in addition to replication for Amazon S3 Glacier or Amazon S3 Glacier Deep Archive, such as by using AWS Backup to create and manage backups.
- Ensure that Amazon S3 Glacier or Amazon S3 Glacier Deep Archive is used only for data that is appropriate for archiving, and use other Amazon S3 storage classes for frequently accessed or transactional data.
- Regularly review and delete expired data based on lifecycle policies to avoid unnecessary storage costs and potential compliance violations.
- Regularly test backup and recovery procedures to ensure that data can be restored in the event of a disaster or other unexpected event.
- Plan for disaster recovery by replicating data across multiple AWS Regions to minimize the risk of data loss and downtime (see the replication sketch after this list), and consider AWS Storage Gateway to create a hybrid cloud storage solution.
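One way to set up that cross-region replication is S3 Replication. The boto3 sketch below is a minimal configuration, assuming versioning is already enabled on both buckets and that the IAM role ARN is a placeholder you would create separately; note that replication applies only to objects created after the rule is added.

```python
import boto3

s3 = boto3.client("s3")

# Assumptions: both buckets exist with versioning enabled, and the IAM role
# (a placeholder ARN here) grants the necessary replication permissions.
s3.put_bucket_replication(
    Bucket="example-archive-us-east-1",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/example-replication-role",
        "Rules": [
            {
                "ID": "dr-copy-to-second-region",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {},  # replicate every object in the bucket
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    "Bucket": "arn:aws:s3:::example-archive-eu-west-1",
                    # land replicas directly in the archive tier
                    "StorageClass": "DEEP_ARCHIVE",
                },
            }
        ],
    },
)
```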