Level 300: Automated CUR Updates and Ingestion
- Nathan Besh, Cost Lead, Well-Architected
- Derrick Gold, Software Development Engineer, AWS Insights
If you wish to provide feedback on this lab, report an error, or make a suggestion, please email: firstname.lastname@example.org
Use this step when a single CUR is being delivered and you want Athena/Glue to update automatically when new versions or new months' data arrive.
We will follow the steps here: https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/setting-up-athena.html#use-athena-cf to implement the CloudFormation template, which will automatically update existing CURs, and include new CURs when they are delivered.
NOTE: IAM roles will be created. These are used to:
- Add event notifications to existing S3 buckets
- Create S3 buckets and upload objects
- Create and run a Glue crawler
- Create and update a Glue database and tables

Please review the CloudFormation template with your security team.
We will build the following solution:
Log in to the console as an IAM user with the required permissions. Go to the S3 dashboard and navigate to the bucket and folder that contain your CUR files. Open the CloudFormation (CF) file and save it locally:
Here is a sample of the CF file:
Go to the CloudFormation dashboard and create a stack:
Load the template and click Next:
Specify the details for the stack and click Next:
Review the configuration, click I acknowledge that AWS CloudFormation might create IAM resources, and click Create stack:
The stack will start in the CREATE_IN_PROGRESS state:
Once complete, the stack will show CREATE_COMPLETE:
Click on Resources to view the resources that the stack created:
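The console steps above can also be scripted. The following is a minimal sketch, assuming boto3 is installed and credentials are configured; the function name `create_cur_stack` and its parameters are hypothetical, and `cfn_client` is expected to be a `boto3.client("cloudformation")` object. It is an illustration of the same flow, not part of the lab's official tooling.

```python
def create_cur_stack(cfn_client, stack_name, template_path):
    """Create the stack from a local template and wait for completion.

    cfn_client is assumed to be boto3.client("cloudformation");
    stack_name and template_path are chosen by the caller.
    """
    with open(template_path, encoding="utf-8") as handle:
        template_body = handle.read()

    cfn_client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        # Equivalent of ticking "I acknowledge that AWS CloudFormation
        # might create IAM resources" in the console.
        Capabilities=["CAPABILITY_IAM"],
    )

    # Polls until the stack reaches CREATE_COMPLETE; raises an error
    # if creation fails or rolls back.
    cfn_client.get_waiter("stack_create_complete").wait(StackName=stack_name)

    # Summarise what was created, similar to the Resources tab.
    response = cfn_client.describe_stack_resources(StackName=stack_name)
    return [
        (res["LogicalResourceId"], res["ResourceType"], res["ResourceStatus"])
        for res in response["StackResources"]
    ]
```

It could be called as `create_cur_stack(boto3.client("cloudformation"), "cur-athena", "crawler-cfn.yml")`, where the stack name is an arbitrary example.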
Go to the AWS Glue dashboard:
Click on Databases and click the database starting with athenacurcfn:
View the table within that database and its properties:
You will see that the table is populated; the recordCount should be greater than 0. You can now go to Athena, load the partitions, and view the cost and usage reports.
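The same check can be made programmatically. This is a rough sketch, assuming `glue_client` is a `boto3.client("glue")` object and that the crawler stores `recordCount` as a string in each table's `Parameters`, as it appears in the console. The function names are hypothetical.

```python
def find_cur_databases(glue_client, prefix="athenacurcfn"):
    """Return the names of Glue databases whose name starts with prefix."""
    names, kwargs = [], {}
    while True:
        page = glue_client.get_databases(**kwargs)
        names += [db["Name"] for db in page["DatabaseList"]
                  if db["Name"].startswith(prefix)]
        token = page.get("NextToken")
        if not token:
            return names
        kwargs["NextToken"] = token


def table_record_counts(glue_client, database_name):
    """Map each table name to its recordCount (None if not reported)."""
    counts, kwargs = {}, {"DatabaseName": database_name}
    while True:
        page = glue_client.get_tables(**kwargs)
        for table in page["TableList"]:
            raw = table.get("Parameters", {}).get("recordCount")
            counts[table["Name"]] = int(raw) if raw is not None else None
        token = page.get("NextToken")
        if not token:
            return counts
        kwargs["NextToken"] = token
```

A table whose count is 0 or None suggests the crawler has not yet processed any report data.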
Use this step when multiple CURs are being delivered into the same bucket, for example a CUR with hourly granularity and one with daily granularity. It will automatically update Athena/Glue when there are new versions and new months' data for both reports.
The easiest way to work with multiple CURs is to deliver each CUR to a different S3 bucket, and follow the previous process. If you must deliver to a single bucket, configure your CURs with different prefixes or folders and follow this process.
Log in to the console as an IAM user with the required permissions and verify that you have multiple CURs with different prefixes being delivered into the same bucket. We will use the following configuration:
Format:
<bucket name>/<prefix>/<report_name>/

Configuration:
<bucket name>/DailyCUR/daily/
<bucket name>/HourlyCUR/hourly/
Open the S3 console, and navigate to one of the directories where CURs are stored. Open and save the crawler-cfn.yml file:
Open the file in your favourite text editor
Modify the following lines to remove all references to the prefix or report name. Replace the first line with the second in each case:
Name: 'athenacurcfn_daily'
Name: 'athenacurcfn'

Resource: arn:aws:s3:::<bucket name>/DailyCUR/daily/daily*
Resource: arn:aws:s3:::<bucket name>*

Name: AWSCURCrawler-daily
Name: AWSCURCrawler

Path: 's3://<bucket name>/DailyCUR/daily/daily'
Path: 's3://<bucket name>'
and under Exclusions after .zip add:
ReportKey: 'DailyCUR/daily/daily'
ReportKey: ''

DatabaseName: athenacurcfn_daily
DatabaseName: athenacurcfn

Location: 's3://<bucket name>/DailyCUR/daily/cost_and_usage_data_status/'
Location: 's3://<bucket name>/cost_and_usage_data_status/'
A modified sample is provided here: Code/crawler-cfn.yml
Look for the comments: ### New line
Save the template file.
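If you prefer not to edit by hand, the line replacements listed above could be applied with a small script. This is a sketch only: the function name `generalize_template` is hypothetical, it assumes the template contains the strings exactly as shown in this lab (with your bucket name, prefix and report name substituted), and it does not add the extra Exclusions entry, which must still be added manually.

```python
def generalize_template(text, bucket, prefix="DailyCUR", report="daily"):
    """Strip prefix/report references from a crawler-cfn.yml template.

    Returns the modified text and a list of expected strings that were
    not found, so mismatches with your template are visible.
    """
    key = f"{prefix}/{report}/{report}"
    pairs = [
        (f"Name: 'athenacurcfn_{report}'", "Name: 'athenacurcfn'"),
        (f"Resource: arn:aws:s3:::{bucket}/{key}*",
         f"Resource: arn:aws:s3:::{bucket}*"),
        (f"Name: AWSCURCrawler-{report}", "Name: AWSCURCrawler"),
        (f"Path: 's3://{bucket}/{key}'", f"Path: 's3://{bucket}'"),
        (f"ReportKey: '{key}'", "ReportKey: ''"),
        (f"DatabaseName: athenacurcfn_{report}", "DatabaseName: athenacurcfn"),
        (f"Location: 's3://{bucket}/{prefix}/{report}/cost_and_usage_data_status/'",
         f"Location: 's3://{bucket}/cost_and_usage_data_status/'"),
    ]
    missing = []
    for old, new in pairs:
        if old in text:
            text = text.replace(old, new)
        else:
            missing.append(old)  # template differs from the lab's example
    return text, missing
```

Review the returned `missing` list before deploying; any entry there means the corresponding line in your template needs to be changed by hand.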
Go to the CloudFormation dashboard and create a stack from the template you just modified.
Go to the Glue dashboard and verify that there is a single database, containing multiple tables:
To delete the Glue database, select the database name, click Action, and click Delete database:
To delete the CloudFormation stack, select the stack, click Actions, and click Delete stack:
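The teardown can be scripted in the same order as the console steps. This is a minimal sketch, assuming `glue_client` and `cfn_client` are `boto3.client("glue")` and `boto3.client("cloudformation")` objects; the function name `tear_down` and its parameters are hypothetical.

```python
def tear_down(glue_client, cfn_client, database_name, stack_name):
    """Delete the Glue database, then the stack, and wait for deletion."""
    # Removes the database and the tables it contains.
    glue_client.delete_database(Name=database_name)

    # Starts stack deletion; the call returns immediately.
    cfn_client.delete_stack(StackName=stack_name)

    # Polls until the stack is gone; raises if deletion fails.
    cfn_client.get_waiter("stack_delete_complete").wait(StackName=stack_name)
```

Double-check both names before running it, since neither deletion can be undone.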