Create a single S3 bucket that contains two folders, od_pricedata and sp_pricedata; these will hold the On-Demand pricing data and the Savings Plans pricing data.
Log in to the console via SSO and go to the S3 service page:
Click Create bucket:
Enter a Bucket name starting with cost (we have used cost-sptool-pricingfiles; you will need to use a unique bucket name) and click Create bucket:
Click on the (bucket name):
Click Create folder:
Enter a folder name of od_pricedata, click Save:
Click Create folder:
Enter a folder name of sp_pricedata, click Save:
You have now set up the S3 bucket with the two folders that will contain the On-Demand and Savings Plans pricing data.
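If you prefer to script this step, the sketch below shows one way to create the bucket and the two folders with boto3; the bucket name and region are example values, so substitute your own unique bucket name:

# Sketch: create the pricing bucket and its two folders with boto3
# The bucket name and region are examples - substitute your own unique values
import boto3

s3 = boto3.client('s3', region_name='us-east-1')
bucket_name = 'cost-sptool-pricingfiles'  # must be globally unique

# Create the bucket (regions other than us-east-1 need a CreateBucketConfiguration)
s3.create_bucket(Bucket=bucket_name)

# S3 has no real folders; zero-byte keys ending in '/' appear as folders in the console
for folder in ('od_pricedata/', 'sp_pricedata/'):
    s3.put_object(Bucket=bucket_name, Key=folder)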
Go to the IAM Console
Select Policies and Create Policy
On the JSON tab, enter the following policy, replacing (S3 pricing bucket) with your bucket name, then click Review policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3SPTool",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:DeleteObjectVersion",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::(S3 pricing bucket)/*"
        }
    ]
}
Enter a Policy Name of S3PricingLambda and a Description of Access to S3 for Lambda SPTool function, then click Create policy:
Select Roles, click Create role:
Select Lambda, click Next: Permissions:
Select the S3PricingLambda policy, click Next: Tags:
Click Next: Review
Enter a Role name of SPToolS3Lambda and click Create role:
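The same policy and role can also be created with boto3; the sketch below mirrors the console steps above, with the bucket name as a placeholder:

# Sketch: create the S3PricingLambda policy and SPToolS3Lambda role with boto3
import boto3
import json

iam = boto3.client('iam')
bucket_name = 'cost-sptool-pricingfiles'  # replace with your bucket name

# Same policy document as above, scoped to your pricing bucket
policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "S3SPTool",
        "Effect": "Allow",
        "Action": ["s3:PutObject", "s3:DeleteObjectVersion", "s3:DeleteObject"],
        "Resource": f"arn:aws:s3:::{bucket_name}/*"
    }]
}

# Trust policy that lets Lambda assume the role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole"
    }]
}

policy = iam.create_policy(
    PolicyName='S3PricingLambda',
    PolicyDocument=json.dumps(policy_document),
    Description='Access to S3 for Lambda SPTool function'
)
iam.create_role(RoleName='SPToolS3Lambda', AssumeRolePolicyDocument=json.dumps(trust_policy))
iam.attach_role_policy(RoleName='SPToolS3Lambda', PolicyArn=policy['Policy']['Arn'])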
Create the On-Demand Lambda function to get the pricing information, and extract the required parts from it.
Go to the Lambda service page:
Click Create function:
Enter the following details:
Click Create function
Copy and paste the following code into the Function code section:
# Lambda Function Code - SPTool_OD_pricing_Download
# Function to download OnDemand pricing, extract the required lines & upload them to S3 as a gzipped file
# It will find lines containing 'OnDemand' and 'RunInstances', and write them to a file
# Written by natbesh@amazon.com
# Please reach out to costoptimization@amazon.com if there are any comments or suggestions
import boto3
import gzip
import urllib3


def lambda_handler(event, context):

    # Create the connection
    http = urllib3.PoolManager()

    try:
        # Get the EC2 OnDemand pricing file; it is huge, over 1GB
        r = http.request('GET', 'https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.csv')

        # Put the response data into a variable & split it into an array line by line
        plaintext_content = r.data
        plaintext_lines = plaintext_content.splitlines()

        # Variable to hold the OnDemand pricing data
        pricing_output = ""

        # Go through each of the pricing lines to find the ones we need
        for line in plaintext_lines:

            # If the line contains both 'OnDemand' and 'RunInstances' then add it to the output string
            if ((str(line).find('OnDemand') != -1) and (str(line).find('RunInstances') != -1)):
                pricing_output += str(line.decode("utf-8")).replace('"', '')
                pricing_output += "\n"

        # Write the output to a local temporary file & gzip it
        with gzip.open('/tmp/od_pricedata.txt.gz', 'wb') as f:
            f.write(pricing_output.encode())

        # Upload the gzipped file to S3
        s3 = boto3.resource('s3')

        # Specify the local file, the bucket, and the folder and object name - you MUST have a folder and object name
        s3.meta.client.upload_file('/tmp/od_pricedata.txt.gz', 'bucket_name', 'od_pricedata/od_pricedata.txt.gz')

    # Die if you can't get the pricing file
    except Exception as e:
        print(e)
        raise e

    # Return happy
    return {
        'statusCode': 200
    }
Edit the pasted code, replacing bucket_name with the name of your bucket:
Click Deploy above the code
Scroll down and edit Basic settings:
Scroll to the top and click Test
Enter an Event name of Test, click Create:
Click Test:
The function will run; it will take a minute or two given the size of the pricing files and the processing required, and then return success. Click Details and verify there is headroom in the configured resources and duration to allow for any increases in pricing file size over time:
One of the most common reasons for this lab failing is that the pricing files grow in size and the Lambda function times out. We have configured a balance between cost and performance with 4GB of memory; if the lab fails at some point, ensure the Lambda function has enough memory and passes this test.
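If you would rather check or adjust those Basic settings from code, the sketch below raises the memory and timeout and then test-invokes the function with boto3. The function name SPTool_ODPricing_Download matches the one used later in this lab; the 300 second timeout is an assumption, so tune both values until the test above passes:

# Sketch: adjust the function's memory/timeout and run a test invoke with boto3
# The timeout value is an assumption; 4096 MB matches the lab guidance
import boto3
from botocore.config import Config

# Use a long read timeout so the client does not give up while the function runs
lambda_client = boto3.client('lambda', config=Config(connect_timeout=30, read_timeout=900))

lambda_client.update_function_configuration(
    FunctionName='SPTool_ODPricing_Download',
    MemorySize=4096,
    Timeout=300
)

# Wait for the configuration update to finish before invoking
lambda_client.get_waiter('function_updated').wait(FunctionName='SPTool_ODPricing_Download')

# Synchronous test invoke with an empty event, then print the result
response = lambda_client.invoke(FunctionName='SPTool_ODPricing_Download', Payload=b'{}')
print(response['StatusCode'], response['Payload'].read())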
Create the Savings Plan Lambda function to get the pricing information, and extract the required parts from it.
Go to the Lambda service page:
Click Create function:
Enter the following details:
Click Create function
Copy and paste the following code into the Function code section:
# Lambda Function Code - SPTool_SP_pricing_Download
# Function to download SavingsPlans pricing, extract the required lines & upload them to S3 as a gzipped file
# It will get each region's pricing file in CSV, find lines containing 'Usage', and write them to a file
# Written by natbesh@amazon.com
# Please reach out to costoptimization@amazon.com if there are any comments or suggestions
import boto3
import gzip
import urllib3
import json


def lambda_handler(event, context):

    # Create the connection
    http = urllib3.PoolManager()

    try:
        # Get the SavingsPlans pricing index file, so you can get all the region files, which have the pricing in them
        r = http.request('GET', 'https://pricing.us-east-1.amazonaws.com/savingsPlan/v1.0/aws/AWSComputeSavingsPlan/current/region_index.json')

        # Load the json file into a variable and parse it
        sp_regions = r.data
        sp_regions_json = json.loads(sp_regions)

        # Variable to hold all of the pricing data; it is large, at over 150MB
        sp_pricing_data = ""

        # Cycle through each region's pricing file to get the data we need
        for region in sp_regions_json['regions']:

            # Get the CSV URL
            url = "https://pricing.us-east-1.amazonaws.com" + region['versionUrl']
            url = url.replace('.json', '.csv')

            # Create a connection & get the region's pricing data CSV file
            http = urllib3.PoolManager()
            r = http.request('GET', url)
            spregion_content = r.data

            # Split the lines into an array
            spregion_lines = spregion_content.splitlines()

            # Go through each of the pricing lines
            for line in spregion_lines:

                # If the line has 'Usage' then grab it for pricing data, exclude all others
                if (str(line).find('Usage') != -1):
                    sp_pricing_data += str(line.decode("utf-8"))
                    sp_pricing_data += "\n"

        # Compress the text into a local temporary file
        with gzip.open('/tmp/sp_pricedata.txt.gz', 'wb') as f:
            f.write(sp_pricing_data.encode())

        # Upload the file to S3
        s3 = boto3.resource('s3')

        # Specify the local file, the bucket, and the folder and object name - you MUST have a folder and object name
        s3.meta.client.upload_file('/tmp/sp_pricedata.txt.gz', 'bucket_name', 'sp_pricedata/sp_pricedata.txt.gz')

    # Die if you can't get the file
    except Exception as e:
        print(e)
        raise e

    # Return happy
    return {
        'statusCode': 200
    }
Edit the pasted code, replacing bucket_name with the name of your bucket:
Click Deploy above the code
Edit Basic settings below:
Scroll to the top and click Test
Enter an Event name of Test, click Create:
Click Test:
The function will run; it will take a minute or two given the size of the pricing files and the processing required, and then return success. Click Details and verify there is headroom in the configured resources and duration to allow for any increases in pricing file size over time:
One of the most common reasons for this lab failing is that the pricing files grow in size and the Lambda function times out. We have configured a balance between cost and performance with 4GB of memory; if the lab fails at some point, ensure the Lambda function has enough memory and passes this test.
We will set up a CloudWatch Event to periodically run the Lambda functions; this will update the pricing and include any newly released instances.
Go to the CloudWatch service page:
Click on Events, then click Rules:
Click Create rule
For the Event Source, select Schedule and set the required period (we have selected 5 days), then click Add target:
Add the SPTool_ODPricing_Download Lambda function, and click Add target:
Add the SPTool_SPPricing_Download Lambda function, and click Configure details:
Add the name SPTool-pricing, optionally add a description and click Create rule:
You have now successfully configured a CloudWatch event; it will run the two Lambda functions and update the pricing information every 5 days.
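The same schedule can be created programmatically if you prefer; the sketch below uses the CloudWatch Events API via boto3, with placeholder region, account ID, and Lambda ARNs that you would replace with your own:

# Sketch: create the SPTool-pricing rule and attach both Lambda functions as targets
# The region and account ID in the ARNs are placeholders
import boto3

events = boto3.client('events')
lambda_client = boto3.client('lambda')

rule_arn = events.put_rule(
    Name='SPTool-pricing',
    ScheduleExpression='rate(5 days)',
    State='ENABLED',
    Description='Update the SPTool pricing files every 5 days'
)['RuleArn']

targets = {
    'SPTool_ODPricing_Download': 'arn:aws:lambda:us-east-1:111122223333:function:SPTool_ODPricing_Download',
    'SPTool_SPPricing_Download': 'arn:aws:lambda:us-east-1:111122223333:function:SPTool_SPPricing_Download',
}

events.put_targets(
    Rule='SPTool-pricing',
    Targets=[{'Id': name, 'Arn': arn} for name, arn in targets.items()]
)

# Give the rule permission to invoke each function
for name in targets:
    lambda_client.add_permission(
        FunctionName=name,
        StatementId=f'{name}-scheduled-by-SPTool-pricing',
        Action='lambda:InvokeFunction',
        Principal='events.amazonaws.com',
        SourceArn=rule_arn
    )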
We will prepare a pricing data source which we will use to join with the CUR. In this example we will take 1-year No Upfront Savings Plans rates and join them to On-Demand pricing. You can modify this part to select 3-year, Partial Upfront, or All Upfront rates.
Go to the Glue Service page:
Click Crawlers from the left menu:
Click Add crawler:
Enter a crawler name of OD_Pricing and click Next:
Ensure Data stores is the source type, click Next:
Click the folder icon to list the S3 folders in your account:
Expand the bucket which contains your pricing folders, and select the folder name od_pricedata, click Select:
Click Next:
Click Next:
Create an IAM role with a name of SPToolPricing, click Next:
Leave the frequency as Run on demand, and click Next:
Click on Add database:
Enter a database name of pricing, and click Create:
Click Next:
Click Finish:
Select the crawler OD_Pricing and click Run crawler:
Once it has run, you should see tables created:
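The crawler can also be started and monitored from code; here is a small boto3 sketch, assuming the crawler name OD_Pricing and database name pricing used above:

# Sketch: run the OD_Pricing crawler and list the tables it creates
import time
import boto3

glue = boto3.client('glue')
glue.start_crawler(Name='OD_Pricing')

# Poll until the crawler returns to the READY state
while glue.get_crawler(Name='OD_Pricing')['Crawler']['State'] != 'READY':
    time.sleep(30)

# Show the tables created in the pricing database
tables = glue.get_tables(DatabaseName='pricing')['TableList']
print([table['Name'] for table in tables])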
Repeat Steps 3 through 17 with the following details, creating a crawler named SP_Pricing over the sp_pricedata folder:
Open the IAM Console in a new tab, click Policies:
Click on the AWSGlueServiceRole-SPToolPricing role:
Type in SPTool and click on the policy name AWSGlueServiceRole-SPTool:
Click Edit policy:
Click JSON:
Edit the Resource line by removing the od_pricedata folder, leaving just the bucket so the policy covers both pricing folders:
Click Review policy:
Click Save changes:
Go back to the Glue console, select the SP_Pricing crawler, click Run crawler:
Click on Databases:
Click on Pricing:
Click Tables in pricing:
Click od_pricedata:
Click Edit schema:
Click the double data type next to col9:
Select string and click Update:
Click Save:
You have successfully set up the pricing data source. We now have a database of On-Demand and Savings Plans rates.
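As a quick sanity check, you can preview the crawled tables with Athena; the sketch below assumes the pricing database and od_pricedata table created above, and uses a placeholder S3 location for query results:

# Sketch: preview the od_pricedata table with Athena via boto3
# The query result location is a placeholder bucket - use one of your own
import time
import boto3

athena = boto3.client('athena')

query_id = athena.start_query_execution(
    QueryString='SELECT * FROM pricing.od_pricedata LIMIT 10',
    ResultConfiguration={'OutputLocation': 's3://cost-sptool-queryresults/'}
)['QueryExecutionId']

# Wait for the query to finish
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)['QueryExecution']['Status']['State']
    if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        break
    time.sleep(2)

# Print the returned rows (the first row is the column header)
if state == 'SUCCEEDED':
    for row in athena.get_query_results(QueryExecutionId=query_id)['ResultSet']['Rows']:
        print([col.get('VarCharValue') for col in row['Data']])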
Now that you have completed this lab, make sure to update your Well-Architected review if you have implemented these changes in your workload.
Click here to access the Well-Architected Tool