If you have ever tried to troubleshoot connecting EMR to a persistent remote metastore, you know it can be challenging. Here are the steps I take to test changes.
1. SSH into the master node of the cluster
2. sudo cp /usr/lib/hive/conf/hive-site.xml /usr/lib/hive/conf/hive-site.xml.old
3. sudo vi /usr/lib/hive/conf/hive-site.xml
4. Make changes as needed
5. ps -ef | grep metastore
6. kill <pid returned from previous step>
7. nohup hive --service metastore &
8. beeline -u jdbc:hive2://localhost:10000 -n hadoop
Now test your changes. Once you iterate and get the correct hive-site.xml settings, you can put them in your EMR config file and try launching a fresh cluster.
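If you find yourself repeating the backup-and-edit steps many times, they can be scripted. The following is only a minimal sketch, not part of my original workflow: `set_hive_property` is a hypothetical helper name, and it assumes hive-site.xml uses the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children under `<configuration>`. It only creates the `.old` backup if one does not already exist, so repeated runs do not overwrite the original file. Note that rewriting the file with ElementTree drops any XML comments.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def set_hive_property(site_path, name, value):
    """Back up hive-site.xml once, then set (or add) a single property."""
    site = Path(site_path)
    backup = site.with_name(site.name + ".old")
    if not backup.exists():
        # Same idea as the `cp hive-site.xml hive-site.xml.old` step.
        shutil.copy2(str(site), str(backup))

    tree = ET.parse(str(site))
    root = tree.getroot()  # assumed to be the <configuration> element
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No existing entry with this name, so append a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value

    tree.write(str(site), encoding="utf-8", xml_declaration=True)
    return backup


# Example call (the JDBC URL is a placeholder, not a real endpoint):
# set_hive_property("/usr/lib/hive/conf/hive-site.xml",
#                   "javax.jdo.option.ConnectionURL",
#                   "jdbc:mysql://<your-metastore-host>:3306/hive")
```

On the master node the file is owned by root, so a script like this would need to run with sudo. You still have to restart the metastore afterward for the change to take effect.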
Friday, March 3, 2017
Tag-Based Security for EMR Clusters in a Shared AWS Environment
In a shared AWS environment with multiple developers, you often want developers to be able to launch personal resources, and embed sensitive information in those resources, without fear that the information will be visible to the entire group. Consider the following scenario:
- Developers want to launch personal resources for developing code
- Developers need to embed personal credentials for external services like databases into the services they launch, either by supplying EC2 user data at launch or via a resource config file like the EMR config file
- You do not want developers to see any resources or sensitive information other than their own
- You want an account admin to be able to see all resources even if that admin is not an account owner
Therefore, the only modifications needed were:
- Allow users to add a limited set of IAM roles to EC2 instances
- Prevent users from seeing EMR service information for clusters they did not launch
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "statement1",
      "Condition": {
        "StringEquals": {
          "elasticmapreduce:RequestTag/owner": "${aws:username}"
        }
      },
      "Effect": "Allow",
      "Action": [
        "iam:PassRole",
        "iam:AddRoleToInstanceProfile"
      ],
      "Resource": [
        "arn:aws:iam::012345678910:role/EMR_DefaultRole",
        "arn:aws:iam::012345678910:role/EMR_EC2_DefaultRole"
      ]
    },
    {
      "Sid": "statement2",
      "Condition": {
        "StringNotEquals": {
          "elasticmapreduce:ResourceTag/owner": "${aws:username}"
        }
      },
      "Effect": "Deny",
      "Action": [
        "elasticmapreduce:AddTags",
        "elasticmapreduce:Describe*"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
The key to tag-based authorization that was always missing from the picture in my mind was the RequestTag element combined with an IAM policy variable inside a conditional allow statement. Statement 1 above effectively states that a developer is only allowed these elevated IAM privileges if he or she puts a tag on the resource request where tag name = owner and tag value = their username. This means they must put their username on every EMR resource request they make or the request will fail. Statement 2 above denies the developer the ability to add tags to, or describe, any EMR resource unless that resource carries a tag where tag name = owner and tag value = their username.
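To make the two conditions concrete, here is a toy model of how each one evaluates. It is a sketch for reasoning only, not the IAM policy engine, and the function names `statement1_allows` and `statement2_denies` are hypothetical. It assumes the caller's `${aws:username}` and the relevant tags are already known as plain Python values.

```python
def statement1_allows(username, request_tags):
    """Model of StringEquals on elasticmapreduce:RequestTag/owner.

    The Allow only applies when the request carries owner=<caller's username>.
    A request with no owner tag does not match. The comparison is
    case-sensitive, as StringEquals is in IAM.
    """
    return request_tags.get("owner") == username


def statement2_denies(username, resource_tags):
    """Model of StringNotEquals on elasticmapreduce:ResourceTag/owner.

    The Deny applies when the resource's owner tag differs from the caller.
    A negated operator also matches when the key is absent, so a cluster
    with no owner tag is denied as well.
    """
    return resource_tags.get("owner") != username


# A developer "alice" (hypothetical username) tagging her own request:
print(statement1_allows("alice", {"owner": "alice"}))  # allowed
# Alice trying to describe a cluster tagged by someone else:
print(statement2_denies("alice", {"owner": "bob"}))    # denied
```

The untagged case is the one worth remembering: because the deny uses a negated operator, a cluster launched without an owner tag is hidden from every developer covered by this policy.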