Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
624 views
in Technique by (71.8m points)

amazon web services - Clear All Existing Entries In DynamoDB Table In AWS Data Pipeline

My goal is to take daily snapshots of an RDS table and put them in a DynamoDB table. The DynamoDB table should only contain data from a single day.

For this I have a Data Pipeline set up to query an RDS table and publish the results to S3 in CSV format.

Then a HiveActivity imports this CSV into a DynamoDB table by creating external tables for the file and an existing DynamoDB table.

This works great, but older entries from the previous day still exist in the DynamoDB table. I want to do this within Data Pipeline if at all possible. I need to:

1) Find a way to clear the DynamoDB table, or at least drop/recreate it, or
2) Include an extra column of the snapshot date and find a way to clear out all older entries.

Any ideas on how I can do this?


Fight the dragon too long and you become the dragon yourself; gaze too long into the abyss and the abyss will gaze back…

1 Reply

0 votes
by (71.8m points)

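You can use DynamoDB Time to Live (TTL), which lets you set a per-item expiration time after which the item is automatically deleted from the table. The expiration is stored in a Number attribute as a Unix epoch timestamp in seconds. TTL is useful when data loses its relevance after a specific time period; in your case the expiration can be the start of the next day. Note that TTL deletion runs in the background and is not instantaneous, so expired items can remain readable for a while after their expiration time. If the table must strictly show a single day's data, also filter on the timestamp when reading.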
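As a rough illustration of the TTL idea, here is a minimal Python sketch that computes the "start of next day" expiration and stamps it onto each row before it is written. The attribute name `expires_at`, the helper names, and the choice of UTC midnight as the day boundary are assumptions made for this example, not something from the question. The `enable_ttl` helper uses the boto3 `update_time_to_live` call and needs boto3 plus AWS credentials, so it is only defined here, not run.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical attribute name; TTL must be enabled on the table for this attribute.
TTL_ATTRIBUTE = "expires_at"


def next_day_start_epoch(now: datetime) -> int:
    """Return Unix epoch seconds for 00:00 UTC on the day after `now`."""
    now_utc = now.astimezone(timezone.utc)
    today_midnight = datetime(
        now_utc.year, now_utc.month, now_utc.day, tzinfo=timezone.utc
    )
    return int((today_midnight + timedelta(days=1)).timestamp())


def with_ttl(row: dict, now: datetime) -> dict:
    """Return a copy of `row` with the TTL attribute set to the next day's start."""
    item = dict(row)
    item[TTL_ATTRIBUTE] = next_day_start_epoch(now)
    return item


def enable_ttl(table_name: str) -> None:
    """One-time setup: turn on TTL for the table (requires boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("dynamodb")
    client.update_time_to_live(
        TableName=table_name,
        TimeToLiveSpecification={"Enabled": True, "AttributeName": TTL_ATTRIBUTE},
    )
```

In the pipeline described in the question, the same epoch value would presumably be added as an extra column in the Hive import rather than in Python; the sketch only shows what the value should look like.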

