Amazon RDS (Relational Database Service) offers a variety of instance types, each optimized for different workloads. These instances come with varying levels of CPU, memory, storage, and networking capacities. Choosing the right instance type is crucial because it directly affects three key aspects:
Performance: An undersized instance can lead to slow database queries, while an oversized one may result in unnecessary costs.
Cost: Larger instances are more expensive, so selecting the right size helps balance performance and cost.
Scalability: Instance size impacts your ability to scale effectively. Starting with an appropriate instance size sets you up for future growth.
Understanding RDS Instance Types
AWS offers a broad variety of instance types for both EC2 and RDS, each tailored to specific use cases:
Beware of Burstable Instances
Burstable instances (T2, T3, T4g) are cost-effective but operate on a CPU credit system. While CPU utilization stays below the instance's baseline, the instance accumulates credits.
These credits can be consumed when the CPU usage spikes. Once the credits are exhausted:
Standard Mode: The instance is throttled to a lower performance, and no additional charges are incurred.
Unlimited Mode (default for T3 and T4g): The CPU continues operating without throttling, but AWS bills extra for the burst usage, which can lead to higher costs.
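The credit mechanics described above can be sketched as a small simulation. This is a simplified model rather than AWS's actual accounting: it assumes one credit equals one vCPU running at 100% for one minute, that credits are earned at the baseline rate, and it ignores how surplus credits are paid down over time in Unlimited mode. The function name, the `BurstResult` type, and the parameter values in the usage lines are illustrative assumptions, not figures for any specific instance class.

```python
from dataclasses import dataclass


@dataclass
class BurstResult:
    balance: float          # credits remaining at the end of the run
    throttled_minutes: int  # minutes held at baseline (standard mode only)
    surplus_credits: float  # credits spent beyond the balance (unlimited mode only)


def simulate_credits(utilization, vcpus, baseline, start_balance, max_balance, unlimited):
    """Step through per-minute CPU utilization values (each between 0.0 and 1.0)."""
    earn_per_minute = baseline * vcpus  # credits earned each minute
    balance = float(start_balance)
    throttled = 0
    surplus = 0.0
    for u in utilization:
        # Spend u * vcpus credits this minute, earn at the baseline rate.
        balance += earn_per_minute - u * vcpus
        if balance < 0:
            if unlimited:
                surplus += -balance  # burst continues; the shortfall is billable
            else:
                throttled += 1       # instance is capped at baseline this minute
            balance = 0.0
        balance = min(balance, max_balance)
    return BurstResult(balance, throttled, surplus)


# Hypothetical 2-vCPU instance with a 20% baseline and 10 credits banked,
# hit by a 10-minute spike at 100% CPU.
spike = [1.0] * 10
print(simulate_credits(spike, 2, 0.2, 10, 576, unlimited=False))
print(simulate_credits(spike, 2, 0.2, 10, 576, unlimited=True))
```

Running both modes against the same utilization trace shows the trade-off: standard mode reports throttled minutes, while unlimited mode reports surplus credits that would show up on the bill.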
Factors Influencing Instance Size
When deciding on an RDS instance size, consider the following factors:
CPU Utilization: High CPU performance is necessary for fast query execution. Monitor CPU usage regularly and adjust accordingly.
Memory Requirements: Adequate memory is critical for performance. Memory-optimized instances are ideal for workloads that require large amounts of RAM.
Storage Needs: Evaluate whether you need General Purpose, Provisioned IOPS, or Magnetic storage based on your workload’s requirements.
Network Bandwidth: Ensure the instance can handle your network traffic.
Monitoring Performance Metrics
Amazon CloudWatch provides key performance metrics to help monitor your RDS instance:
CPU Utilization: Percentage of CPU capacity used.
Freeable Memory: How much RAM is available.
Read/Write Throughput: Data read from or written to disk per second.
Network Throughput: Rate of incoming and outgoing network traffic.
DB Connections: Number of client sessions connected to the database.
Regularly review these metrics in the Monitoring tab of the RDS console or in the CloudWatch console to confirm that the instance is sized appropriately.
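The same metrics can also be pulled programmatically. The sketch below builds a request for the CloudWatch `get_metric_statistics` call in the `AWS/RDS` namespace and only touches AWS inside `fetch_metric`, which assumes `boto3` is installed and credentials are configured. The helper names, the 28-day window, and the one-hour period are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch metric names corresponding to the metrics listed above.
RDS_METRICS = [
    "CPUUtilization",
    "FreeableMemory",
    "ReadThroughput",
    "WriteThroughput",
    "NetworkReceiveThroughput",
    "NetworkTransmitThroughput",
    "DatabaseConnections",
]


def build_metric_request(db_instance_id, metric_name, end_time, days=28, period_seconds=3600):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    return {
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "StartTime": end_time - timedelta(days=days),
        "EndTime": end_time,
        "Period": period_seconds,
        "Statistics": ["Average", "Maximum"],
    }


def fetch_metric(db_instance_id, metric_name):
    """Fetch datapoints for one metric; needs boto3 and AWS credentials."""
    import boto3  # third-party dependency, imported lazily

    cloudwatch = boto3.client("cloudwatch")
    request = build_metric_request(db_instance_id, metric_name, datetime.now(timezone.utc))
    return cloudwatch.get_metric_statistics(**request)["Datapoints"]
```

A call such as `fetch_metric("my-db-instance", "CPUUtilization")`, where the instance identifier is a placeholder, would return a list of datapoints carrying `Average` and `Maximum` values.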
Right-Sizing Strategies
Techniques for matching instance size to workload demands.
General Rules
We will consider downsizing if, over a four-week period:
If: vCPU utilization averages < 40% and memory utilization averages < 50%
Then: Evaluate stepping down to the next lower size in the same instance family
We will consider upsizing if, over a four-week period:
If: vCPU utilization averages > 80% and memory utilization averages < 50%
Then: Evaluate a compute-optimized instance upgrade
If: vCPU utilization averages < 50% and memory utilization averages > 80%
Then: Evaluate a memory-optimized instance upgrade
If: vCPU utilization averages > 80% and memory utilization averages < 80%
Then: Upgrade within the same instance family
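The rules above can be written down as a small decision function. This is a minimal sketch: it takes four-week averages as percentages, checks the rules in the order they are listed (so the first matching rule wins where conditions overlap), and returns "no change" for any combination the rules do not cover. The function name and return strings are illustrative.

```python
def recommend(avg_cpu_pct, avg_mem_pct):
    """Map four-week average CPU and memory utilization to a sizing action."""
    if avg_cpu_pct < 40 and avg_mem_pct < 50:
        return "downsize: next lower size in the same family"
    if avg_cpu_pct > 80 and avg_mem_pct < 50:
        return "upsize: evaluate a compute-optimized instance"
    if avg_cpu_pct < 50 and avg_mem_pct > 80:
        return "upsize: evaluate a memory-optimized instance"
    if avg_cpu_pct > 80 and avg_mem_pct < 80:
        return "upsize: larger size in the same family"
    return "no change"


for cpu, mem in [(30, 40), (90, 30), (30, 90), (90, 60), (60, 60)]:
    print(cpu, mem, "->", recommend(cpu, mem))
```

Keeping the thresholds in one function makes it easy to review them alongside the written policy and to adjust them as the policy changes.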
While AWS RDS has no direct 'memory utilization percentage' metric, we can estimate it from FreeableMemory:
UsedMemory = TotalMemory - FreeableMemory
MemoryUtilizationPercentage = (UsedMemory / TotalMemory) * 100
Note: The FreeableMemory metric doesn't capture all the ways a database engine like MySQL or PostgreSQL uses memory. You may need to consult OS-level metrics or database-specific tools for a complete picture.
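As a worked version of the estimate, the sketch below assumes FreeableMemory is read in bytes (the unit CloudWatch reports) and that total memory is taken from the instance class specification in GiB. The function name is illustrative.

```python
GIB = 1024 ** 3  # bytes per GiB


def memory_utilization_pct(total_memory_gib, freeable_memory_bytes):
    """Estimate memory utilization from the FreeableMemory metric."""
    total_bytes = total_memory_gib * GIB
    used_bytes = total_bytes - freeable_memory_bytes
    return used_bytes / total_bytes * 100


# Hypothetical instance with 16 GiB of RAM and 4 GiB reported as freeable.
print(memory_utilization_pct(16, 4 * GIB))
```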
Additionally, consider percentile-based utilization (e.g., p90, p95) for a more accurate understanding of peak usage periods. Plan for future growth by adding a buffer (e.g., 20%) to your utilization thresholds.
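To make the percentile and buffer idea concrete, the sketch below computes the average, p95, and p99.5 of a series of utilization samples and then applies headroom to the p95 value. It uses the nearest-rank percentile definition, and it reads "adding a buffer" as multiplying the observed p95 by 1.2; both choices are assumptions, and other definitions are equally valid.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples, headroom=0.20):
    """Return average, p95, p99.5, and p95 inflated by the headroom fraction."""
    p95 = percentile(samples, 95)
    return {
        "average": sum(samples) / len(samples),
        "p95": p95,
        "p99.5": percentile(samples, 99.5),
        "p95_with_headroom": p95 * (1 + headroom),
    }


# Hypothetical hourly CPU samples: mostly quiet with occasional spikes.
cpu_samples = [20] * 90 + [70] * 8 + [95] * 2
print(summarize(cpu_samples))
```

A workload like the hypothetical one in the usage lines looks lightly loaded on average but much busier at the tail, which is why the percentile view matters before downsizing.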
Example
Here’s an example of how different metrics (average, p95, p99.5) and headroom might affect your decision:
Reserved Instances vs. On-Demand Instances
Amazon offers two primary pricing models for RDS instances: On-Demand Instances and Reserved Instances.
Best Practices for Optimizing RDS Instances
Database Tuning: Before upgrading your instance, ensure your database is tuned for performance.
Analyze PostgreSQL logs to find slow-running queries. Use tools like pgBadger to pinpoint resource-intensive queries.
Optimize SQL queries. Use EXPLAIN ANALYZE to understand the query execution plan, and configure memory settings:
Adjust the shared_buffers parameter for the buffer pool.
Configure maintenance_work_mem and work_mem for local memory usage.
Use parallel restoration during data import/export. Set max_parallel_workers appropriately.
Continuous Monitoring: Regularly analyze performance metrics and iteratively adjust parameters for optimal performance.
Right-Sizing: Implement a right-sizing schedule and enforce instance tagging to streamline the process.
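For the PostgreSQL memory settings mentioned under Database Tuning, a rough budget check can help before changing instance size. The sketch below assumes a simple model: shared_buffers is allocated once, each connection may use one work_mem allocation, and each concurrent maintenance operation may use maintenance_work_mem. Real usage can differ, since a single query can use work_mem several times, so treat the result as a rough estimate only. The function names, the 80% limit, and the sample values are hypothetical.

```python
def estimate_memory_mib(shared_buffers_mib, work_mem_mib, max_connections,
                        maintenance_work_mem_mib, maintenance_workers=1):
    """Rough memory estimate in MiB under the simplified model described above."""
    return (
        shared_buffers_mib
        + work_mem_mib * max_connections
        + maintenance_work_mem_mib * maintenance_workers
    )


def fits_in_instance(instance_memory_mib, estimate_mib, limit=0.8):
    """True if the estimate stays under the chosen fraction of instance memory."""
    return estimate_mib <= instance_memory_mib * limit


# Hypothetical settings on an instance with 16 GiB (16384 MiB) of RAM.
estimate = estimate_memory_mib(4096, 4, 200, 512)
print(estimate, fits_in_instance(16384, estimate))
```

If the estimate does not fit, lowering work_mem or the connection count is often worth trying before moving to a larger instance.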
Conclusion
Right-sizing is one of the most effective ways to control cloud costs. It involves continually analyzing instance performance, usage needs, and usage patterns, then turning off idle instances and right-sizing instances that are over-provisioned or poorly matched to their workload. Because your resource needs are always changing, right-sizing must be an ongoing process if it is to keep costs optimized. You can make it a smooth process by establishing a right-sizing schedule for each team, enforcing tagging for all instances, and taking full advantage of the tools that AWS and others provide to simplify resource monitoring and analysis.