RDS Rightsize Postgres Cluster
AWS Resource Type:
Amazon RDS for PostgreSQL
Planning the compute requirements for a Postgres RDS cluster can be challenging, often leading users to provision excess capacity. This Finder/Fixer inspects Postgres RDS instance usage data to determine if a target instance is overprovisioned and safely resizes it to provide cost savings.
Criteria for identifying the opportunity:
This Finder identifies opportunities for Postgres RDS instances that meet ALL of the following criteria:
instance created >= 31 days ago
instance type has not changed in the last 31 days
instance family is not burstable (i.e., not in the T family type)
instance can be changed to an x2g instance type without reducing memory
projected maximum CPU utilization, network throughput, and storage throughput on the new instance type each stay at or below 90% of that type's capacity (based on CloudWatch metric calculations)
instance is not reserved, OR the RI released by this resizing action will be used up by other instances in the same region
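The criteria above can be read as a single all-or-nothing check. The following is a minimal sketch under stated assumptions, not CloudFix's actual implementation: the `Candidate` record, its field names, and the `is_opportunity` function are hypothetical, and the projected utilization percentages are assumed to have been computed elsewhere from CloudWatch metrics.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MIN_AGE = timedelta(days=31)      # from the criteria: 31 days
MAX_PROJECTED_PCT = 90.0          # from the criteria: 90% ceiling


@dataclass
class Candidate:
    """Hypothetical record describing one RDS instance and its resize target."""
    created_at: datetime
    type_changed_at: Optional[datetime]   # None if the type was never changed
    instance_class: str                   # e.g. "db.m5.2xlarge"
    current_memory_gib: float
    target_memory_gib: float
    projected_peak_cpu_pct: float         # projected on the target type
    projected_peak_network_pct: float
    projected_peak_storage_pct: float
    is_reserved: bool
    freed_ri_absorbed: bool               # freed RI would be used elsewhere in the region


def is_opportunity(c: Candidate, now: datetime) -> bool:
    """Return True only if every criterion holds."""
    family = c.instance_class.split(".")[1]   # "db.m5.2xlarge" -> "m5"
    peak = max(
        c.projected_peak_cpu_pct,
        c.projected_peak_network_pct,
        c.projected_peak_storage_pct,
    )
    return all([
        now - c.created_at >= MIN_AGE,
        c.type_changed_at is None or now - c.type_changed_at >= MIN_AGE,
        not family.startswith("t"),                    # burstable T family excluded
        c.target_memory_gib >= c.current_memory_gib,   # no memory reduction
        peak <= MAX_PROJECTED_PCT,
        (not c.is_reserved) or c.freed_ri_absorbed,
    ])
```

Because every check must pass, a single failing criterion (for example, a burstable instance class or a projected peak above 90%) is enough to exclude the instance.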
Potential savings (range in % on annual basis):
Resizing instances to their next smaller instance type in a family typically results in a 30-50% reduction in annualized spend based on the latest RDS pricing.
What happens when the Fixer is executed?
For each opportunity, the Fixer executes the following steps to change the RDS instance to the appropriate type:
First, the Fixer uses the RDS DescribeDBInstances API to confirm that the instance is not in a failed state.
Changing the type of an RDS instance requires a restart, so the Fixer defers the retyping operation until the next maintenance window by calling the RDS ModifyDBInstance API with the ApplyImmediately parameter set to false.
Next, the Fixer uses the RDS AddTagsToResource API to tag the instance with information to assist with rollbacks, including the original type of the RDS instance.
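The three steps above can be sketched as follows. This is an illustrative outline, not the Fixer's source code. It assumes a boto3-style RDS client is passed in (for example, one created with `boto3.client("rds")`). The function name `schedule_resize` and the tag key are hypothetical, and treating only the `failed` status as a failed state is a simplifying assumption.

```python
# Hypothetical tag key; the real Fixer's tag names are not documented here.
ORIGINAL_CLASS_TAG = "rightsize:original-instance-class"


def schedule_resize(rds, instance_id: str, target_class: str) -> str:
    """Schedule a resize for the next maintenance window; return the original class."""
    # Step 1: confirm the instance is not in a failed state.
    instance = rds.describe_db_instances(
        DBInstanceIdentifier=instance_id
    )["DBInstances"][0]
    if instance["DBInstanceStatus"] == "failed":
        raise RuntimeError(f"{instance_id} is in a failed state; skipping resize")

    original_class = instance["DBInstanceClass"]

    # Step 2: request the new class, deferred to the next maintenance window.
    rds.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass=target_class,
        ApplyImmediately=False,
    )

    # Step 3: record the original class in a tag so a rollback can find it.
    rds.add_tags_to_resource(
        ResourceName=instance["DBInstanceArn"],
        Tags=[{"Key": ORIGINAL_CLASS_TAG, "Value": original_class}],
    )
    return original_class
```

Passing the client in as an argument keeps the sketch easy to exercise against a stub before pointing it at a real account.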
Is it possible to roll back once CloudFix implements the Fixer?
Yes. The Fixer configures CloudWatch alarms to monitor the CPU utilization, network throughput, and storage throughput of the instance after it has been retyped.
If any one of these metrics exceeds 90% of its maximum capacity for a 1-hour period, the corresponding CloudWatch alarm is activated and an automated rollback is triggered to revert the instance to its original type.
Note that the rollback process calls the RDS ModifyDBInstance API with ApplyImmediately=false, so the instance will be reverted to its original type in the next maintenance window.
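As a rough illustration of the rollback mechanism, the sketch below builds the arguments for a CloudWatch `put_metric_alarm` call on CPU utilization and shows a rollback that reads the original type from a tag. Several details are assumptions rather than documented Fixer behavior: the alarm name, the tag key, the use of the `Average` statistic over a single 3600-second period, and the function names. Network and storage throughput alarms would need thresholds derived from the capacity of the specific instance type, which is not shown here.

```python
# Hypothetical tag key holding the pre-resize instance class.
ORIGINAL_CLASS_TAG = "rightsize:original-instance-class"


def cpu_rollback_alarm(instance_id: str, alarm_action_arn: str) -> dict:
    """Build keyword arguments for cloudwatch.put_metric_alarm(**kwargs)."""
    return {
        "AlarmName": f"rightsize-rollback-cpu-{instance_id}",  # assumed naming
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",          # assumption: source does not specify
        "Period": 3600,                  # one 1-hour period, in seconds
        "EvaluationPeriods": 1,
        "Threshold": 90.0,               # percent
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [alarm_action_arn],
    }


def roll_back(rds, instance_id: str, instance_arn: str) -> str:
    """Revert to the class recorded in the tag, at the next maintenance window."""
    tags = rds.list_tags_for_resource(ResourceName=instance_arn)["TagList"]
    original = next(
        (t["Value"] for t in tags if t["Key"] == ORIGINAL_CLASS_TAG), None
    )
    if original is None:
        raise LookupError(f"no {ORIGINAL_CLASS_TAG} tag on {instance_id}")
    rds.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass=original,
        ApplyImmediately=False,   # matches the deferred rollback described above
    )
    return original
```

With a boto3 CloudWatch client, the alarm would be created with `cloudwatch.put_metric_alarm(**cpu_rollback_alarm(instance_id, action_arn))`, where `action_arn` points at whatever triggers the rollback (for example, an SNS topic).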
Can CloudFix implement the fix automatically once I accept the recommendation?
Does this fix require downtime?
Yes. Retyping an RDS instance requires a restart, which causes downtime, so the Fixer defers the retyping operation until the next maintenance window by calling the RDS ModifyDBInstance API with ApplyImmediately=false.
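To see when the deferred change (and its downtime) is due, the DescribeDBInstances response can be inspected for a pending instance class and the preferred maintenance window. The helper below is a small sketch assuming a boto3-style RDS client; the function name `pending_resize` and the shape of its return value are hypothetical.

```python
from typing import Optional


def pending_resize(rds, instance_id: str) -> Optional[dict]:
    """Return the pending class change and maintenance window, or None if none is queued."""
    instance = rds.describe_db_instances(
        DBInstanceIdentifier=instance_id
    )["DBInstances"][0]

    # PendingModifiedValues lists changes waiting to be applied.
    pending_class = instance.get("PendingModifiedValues", {}).get("DBInstanceClass")
    if pending_class is None:
        return None

    return {
        "current_class": instance["DBInstanceClass"],
        "pending_class": pending_class,
        # Weekly window in UTC, formatted like "sun:05:00-sun:06:00".
        "maintenance_window": instance["PreferredMaintenanceWindow"],
    }
```

A `None` result means no instance class change is queued, so no resize-related restart is expected in the upcoming window.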