1.1 Config Changes Required for Data Transfer from HDFS to S3 and Vice Versa

To protect your AWS access and secret keys, add them to the site files of your Hadoop cluster, as shown below. These configuration files are usually located in the /etc/hadoop/conf and /etc/hive/conf directories.

hdfs-site.xml and hive-site.xml:

<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
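
Note that the keys sit in plaintext XML this way. If you want to avoid that, Hadoop's credential provider API can hold them in an encrypted keystore instead; the following is a minimal sketch, where the jceks://hdfs/user/admin/s3.jceks keystore path is only an example:

# Store both keys in a JCEKS keystore on HDFS (example path)
hadoop credential create fs.s3a.access.key -value YOUR_ACCESS_KEY -provider jceks://hdfs/user/admin/s3.jceks
hadoop credential create fs.s3a.secret.key -value YOUR_SECRET_KEY -provider jceks://hdfs/user/admin/s3.jceks

Then point the cluster at the keystore instead of the plaintext properties:

<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs/user/admin/s3.jceks</value>
</property>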
  • Restart all affected services (NameNode, YARN, MapReduce, Hive, Oozie, etc.).

  • Assign the AdministratorAccess policy to the AWS user whose keys you configured.

  • After this, verify that the S3 endpoint bucket is accessible from your on-premises Hadoop cluster, as shown below.

hdfs dfs -ls s3a://<bucket-name>/*
hdfs dfs -mkdir s3a://<bucket-name>/newdirectory
hdfs dfs -put /opt/mnt/<local_file.csv> s3a://<bucket-name>/newdirectory
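
Once the bucket is reachable, the bulk transfer itself, HDFS to S3 and vice versa, is typically done with DistCp. A minimal sketch, where /data/source and /data/target are example HDFS paths:

# Copy a directory from HDFS to S3
hadoop distcp /data/source s3a://<bucket-name>/target

# Copy a directory back from S3 to HDFS
hadoop distcp s3a://<bucket-name>/source /data/target

DistCp runs as a MapReduce job, so the copy is parallelized across the cluster rather than funneled through a single client.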
