Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


scala - Write dataframe as csv to S3 with kms encrypted keys without providing key

I am writing CSV files to S3 from a Spark DataFrame, and the files are being KMS-encrypted automatically.

For reference, here is a sample snippet that produces these KMS-encrypted files. Note that I do not specify any KMS key when writing. It would be really helpful if someone could explain the root cause.

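// Read the source CSV, treating the first row as the header
val df = spark.read.format("csv").option("header", "true").load("s3:///test/App_IP.csv")
df.createOrReplaceTempView("test")
val df1 = spark.sql("select name from test")
// Write a single CSV file; no KMS key or encryption option is set here
df1.coalesce(1).write.format("com.databricks.spark.csv").option("header", "true").save("s3://test/city5/")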

I am running this code from spark-shell on an EMR cluster (emr-5.24.0) with Spark 2.4.2.

question from:https://stackoverflow.com/questions/65890643/write-dataframe-as-csv-to-s3-with-kms-encrypted-keys-without-providing-key


1 Reply


You can configure S3 server-side encryption through EMRFS, as described in the EMR documentation section "Amazon S3 Server-Side Encryption":

fs.s3.enableServerSideEncryption: When set to true, objects stored in Amazon S3 are encrypted using server-side encryption. If no key is specified, SSE-S3 is used.

fs.s3.serverSideEncryption.kms.keyId: Specifies an AWS KMS key ID or ARN. If a key is specified, SSE-KMS is used.
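The rule quoted above can be sketched as a small decision function. This is only an illustration of how the two properties interact as the documentation describes them, not EMRFS's actual implementation. The names `SseMode`, `SseRule` and `resolveSseMode` are hypothetical, and treating a configured key as sufficient on its own to select SSE-KMS is an assumption read from the wording.

```scala
// Hypothetical model of the rule quoted from the EMR docs; this is not EMRFS code.
sealed trait SseMode
case object NoSse extends SseMode                        // no server-side encryption requested
case object SseS3 extends SseMode                        // encryption with S3-managed keys
final case class SseKms(keyId: String) extends SseMode   // encryption with a KMS key

object SseRule {
  val EnableKey = "fs.s3.enableServerSideEncryption"
  val KmsKeyKey = "fs.s3.serverSideEncryption.kms.keyId"

  // Assumption: a non-empty key ID selects SSE-KMS; otherwise the enable flag selects SSE-S3.
  def resolveSseMode(props: Map[String, String]): SseMode = {
    val kmsKey  = props.get(KmsKeyKey).map(_.trim).filter(_.nonEmpty)
    val enabled = props.get(EnableKey).exists(_.trim.equalsIgnoreCase("true"))
    kmsKey match {
      case Some(id)        => SseKms(id)
      case None if enabled => SseS3
      case None            => NoSse
    }
  }
}
```

Under this model, a write call that sets neither property itself (as in the question) gets whatever mode the cluster-level settings resolve to.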

Create a cluster with SSE-S3 enabled:

aws emr create-cluster --release-label emr-5.24.0 \
--instance-count 3 --instance-type m5.xlarge --emrfs Encryption=ServerSide

Create a cluster with SSE-KMS enabled:

aws emr create-cluster --release-label emr-5.24.0 --instance-count 3 \
--instance-type m5.xlarge --use-default-roles \
--emrfs Encryption=ServerSide,Args=[fs.s3.serverSideEncryption.kms.keyId=<keyId>]

Or provide a cluster configuration JSON:

[
  ...
  {
    "Classification": "emrfs-site",
    "Properties": {
      "fs.s3.enableServerSideEncryption": "true",
      "fs.s3.serverSideEncryption.kms.keyId": "<keyId>"
    }
  }
]
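If you generate this configuration from a script, a minimal sketch in Scala might look like the following. The `EmrfsSiteJson` object and its `render` method are hypothetical helpers, not part of any AWS library. The sketch emits only the emrfs-site entry (none of the other classifications elided by "..." above) and assumes the key ID contains no characters that need JSON escaping.

```scala
// Hypothetical helper that renders the emrfs-site classification as a JSON string.
object EmrfsSiteJson {
  // Wraps a value in double quotes; assumes no characters that require escaping.
  private def q(s: String): String = "\"" + s + "\""

  // kmsKeyId = None renders the enable flag only; Some(id) also adds the KMS key property.
  def render(kmsKeyId: Option[String]): String = {
    val props = Seq("fs.s3.enableServerSideEncryption" -> "true") ++
      kmsKeyId.map(id => "fs.s3.serverSideEncryption.kms.keyId" -> id)
    val propLines = props.map { case (k, v) => "      " + q(k) + ": " + q(v) }
    Seq(
      "[",
      "  {",
      "    " + q("Classification") + ": " + q("emrfs-site") + ",",
      "    " + q("Properties") + ": {",
      propLines.mkString(",\n"),
      "    }",
      "  }",
      "]"
    ).mkString("\n")
  }
}
```

The rendered string can be saved to a file and passed to `aws emr create-cluster` through its `--configurations` option, for example `--configurations file://emrfs-sse.json`, where the file name is only a placeholder.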
