Lab complete!
Now that you have completed this lab, make sure to update your Well-Architected review if you have implemented these changes in your workload.
Click here to access the Well-Architected Tool
If AWS updates pricing table with new column, values might get out of sync. Please contact costoptimization@amazon.com if you encounter any problems
We will combine the pricing information with our Cost and Usage Report (CUR). This will give us a usage data source which contains a summary of your usage at an hourly level, with multiple pricing dimensions.
Go to the Athena service page:
Copy the following query to Athean:
CREATE EXTERNAL TABLE `od_pricedata`(
`sku` string,
`offertermcode` string,
`ratecode` string,
`termtype` string,
`pricedescription` string,
`effectivedate` string,
`startingrange` string,
`endingrange` string,
`unit` string,
`priceperunit` double,
`currency` string,
`relatedto` string,
`leasecontractlength` string,
`purchaseoption` string,
`offeringclass` string,
`product family` string,
`servicecode` string,
`location` string,
`location type` string,
`instance type` string,
`current generation` string,
`instance family` string,
`vcpu` string,
`physical processor` string,
`clock speed` string,
`memory` string,
`storage` string,
`network performance` string,
`processor architecture` string,
`storage media` string,
`volume type` string,
`max volume size` string,
`max iops/volume` string,
`max iops burst performance` string,
`max throughput/volume` string,
`provisioned` string,
`tenancy` string,
`ebs optimized` string,
`operating system` string,
`license model` string,
`group` string,
`group description` string,
`transfer type` string,
`from location` string,
`from location type` string,
`to location` string,
`to location type` string,
`usagetype` string,
`operation` string,
`availabilityzone` string,
`capacitystatus` string,
`classicnetworkingsupport` string,
`dedicated ebs throughput` string,
`ecu` string,
`elastic graphics type` string,
`enhanced networking supported` string,
`from region code` string,
`gpu` string,
`gpu memory` string,
`instance` string,
`instance capacity - 10xlarge` string,
`instance capacity - 12xlarge` string,
`instance capacity - 16xlarge` string,
`instance capacity - 18xlarge` string,
`instance capacity - 24xlarge` string,
`instance capacity - 2xlarge` string,
`instance capacity - 32xlarge` string,
`instance capacity - 4xlarge` string,
`instance capacity - 8xlarge` string,
`instance capacity - 9xlarge` string,
`instance capacity - large` string,
`instance capacity - medium` string,
`instance capacity - metal` string,
`instance capacity - xlarge` string,
`instancesku` string,
`intel avx2 available` string,
`intel avx available` string,
`intel turbo available` string,
`marketoption` string,
`normalization size factor` string,
`physical cores` string,
`pre installed s/w` string,
`processor features` string,
`product type` string,
`region code` string,
`resource type` string,
`servicename` string,
`snapshotarchivefeetype` string,
`to region code` string,
`volume api name` string,
`vpcnetworkingsupport` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
'quoteChar'='\"',
'separatorChar'=',')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
's3://bucketname/od_pricedata/'
TBLPROPERTIES (
'CrawlerSchemaDeserializerVersion'='1.0',
'CrawlerSchemaSerializerVersion'='1.0',
'UPDATED_BY_CRAWLER'='Pricing CSV',
'areColumnsQuoted'='true',
'averageRecordSize'='1061',
'classification'='csv',
'columnsOrdered'='true',
'compressionType'='none',
'delimiter'=',',
'objectCount'='1',
'recordCount'='2089892',
'sizeKey'='2217375799',
'skip.header.line.count'='6',
'typeOfData'='file')
Change the bucketname to your bucket name and click Run:
Run the following query to create a single pricing data source, combining the OD and SP pricing:
CREATE VIEW pricing.pricing AS SELECT
sp.location AS Region,
sp.discountedoperation AS OS,
od."Instance Type" InstanceType,
od.Tenancy Tenancy,
od.priceperunit ODRate,
sp.discountedrate AS SPRate
FROM pricing.sp_pricedata sp
JOIN pricing.od_pricedata od ON
((sp.discountedusagetype = od.usageType)
AND (sp.discountedoperation = od.operation))
WHERE od.priceperunit IS NOT NULL AND
sp.location NOT LIKE '%Any%'
AND sp.purchaseoption LIKE 'No Upfront'
AND sp.leasecontractlength = 1
and od.TermType = 'OnDemand'
group by 1,2,3,4,5,6
Next we’ll join the CUR file with that pricing source as a view. Edit the following query, replace costmaster.costmasterfile with your existing database name and tablename of your CUR, then run the rollowing query:
CREATE VIEW costmaster.SP_Usage AS SELECT
costmaster.line_item_usage_account_id,
costmaster.line_item_usage_start_date,
to_unixtime(costmaster.line_item_usage_start_date) AS EpochTime,
costmaster.product_instance_type,
costmaster.product_location,
costmaster.product_operating_system,
costmaster.product_tenancy,
SUM(costmaster.line_item_unblended_cost) AS ODPrice,
SUM(costmaster.line_item_unblended_cost*(cast(pr.SPRate AS double)/cast(pr.ODRate AS double))) SPPrice,
abs(SUM(cast(pr.SPRate AS double)) - SUM (cast(pr.ODRate AS double))) / SUM(cast(pr.ODRate AS double))*100 AS DiscountRate,
SUM(costmaster.line_item_usage_amount) AS InstanceCount
FROM costmaster.costmasterfile costmaster
JOIN pricing.pricing pr ON (costmaster.product_location = pr.Region)
AND (costmaster.line_item_operation = pr.OS)
AND (costmaster.product_instance_type = pr.InstanceType)
AND (costmaster.product_tenancy = pr.Tenancy)
WHERE costmaster.line_item_product_code LIKE '%EC2%'
AND costmaster.product_instance_type NOT LIKE ''
AND costmaster.product_operating_system NOT LIKE 'NA'
AND costmaster.line_item_unblended_cost > 0
AND costmaster.line_item_line_item_type like 'Usage'
GROUP BY costmaster.line_item_usage_account_id,
costmaster.line_item_usage_start_date,
costmaster.product_instance_type,
costmaster.product_location,
costmaster.product_operating_system,
costmaster.product_tenancy
ORDER BY costmaster.line_item_usage_start_date ASC,
DiscountRate DESC
The code above will capture ONLY on-demand usage, as defined by costmaster.line_item_line_item_type like ‘Usage’. You can remove this to include Savings Plan usage, to see total commitment you should have, instead of additional commitment required.
Verify the data source is setup by editing the following query, replace costmaster. with the name of the database and run the following query:
SELECT * FROM costmaster.sp_usage limit 10;
You now have your usage data source setup with your pricing dimensions. You can modify the queries above to add or remove any columns you want in the view, which can later be used to visualize the data, for example tags.
Now that you have completed this lab, make sure to update your Well-Architected review if you have implemented these changes in your workload.
Click here to access the Well-Architected Tool