AWS Permissions

Harrier should run with read-only AWS permissions scoped to the runtimes and log locations it investigates.
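The read-only requirement can be spot-checked mechanically before a policy is attached. The sketch below is one possible approach, not part of Harrier: it assumes that an action whose operation name starts with Get, Describe, or List is read-only. That is a naming heuristic rather than an AWS guarantee, and the function name non_read_actions is hypothetical.

```python
# Heuristic (an assumption, not an AWS rule): operation names that start with
# one of these verbs are treated as read-only.
READ_ONLY_PREFIXES = ("Get", "Describe", "List")


def non_read_actions(policy: dict) -> list[str]:
    """Return Allow-ed actions whose operation name does not look read-only."""
    flagged = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # "Statement" may be a single object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        for action in actions:
            # "s3:GetObject" -> "GetObject"; wildcards such as "s3:*" are flagged.
            operation = action.split(":", 1)[-1]
            if not operation.startswith(READ_ONLY_PREFIXES):
                flagged.append(action)
    return flagged


if __name__ == "__main__":
    candidate = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:GetObject", "s3:PutObject"], "Resource": "*"}
        ],
    }
    print(non_read_actions(candidate))
```

An empty result means no action failed the heuristic. It does not prove that the policy is safe, so treat it as a first filter ahead of manual review.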

EMR on EC2

Required read actions:

elasticmapreduce:DescribeCluster
elasticmapreduce:DescribeStep
elasticmapreduce:ListInstanceGroups
elasticmapreduce:ListSteps
s3:ListBucket
s3:GetObject
cloudwatch:GetMetricStatistics

EMR Serverless

Required read actions:

emr-serverless:GetApplication
emr-serverless:GetJobRun
emr-serverless:ListJobRunAttempts
s3:ListBucket
s3:GetObject
logs:DescribeLogStreams
logs:GetLogEvents
cloudwatch:GetMetricStatistics

EMR on EKS

Required read actions:

emr-containers:DescribeVirtualCluster
emr-containers:DescribeJobRun
emr-containers:ListJobRuns
s3:ListBucket
s3:GetObject
logs:DescribeLogStreams
logs:GetLogEvents

Optional Kubernetes diagnostics require namespace-scoped read access to pods, pod logs, and events. Kubernetes access is best-effort and not required for baseline EMR on EKS log investigation.
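Because each runtime has its own action list, a deployment that uses only one or two runtimes can grant a smaller set. The following sketch computes the union of the actions listed above for a chosen set of runtimes. The runtime keys ("emr-ec2", "emr-serverless", "emr-eks") and the function name actions_for are illustrative labels, not Harrier configuration values.

```python
# Read actions per runtime, copied from the lists above.
# The dictionary keys are hypothetical labels chosen for this example.
REQUIRED_ACTIONS = {
    "emr-ec2": [
        "elasticmapreduce:DescribeCluster",
        "elasticmapreduce:DescribeStep",
        "elasticmapreduce:ListInstanceGroups",
        "elasticmapreduce:ListSteps",
        "s3:ListBucket",
        "s3:GetObject",
        "cloudwatch:GetMetricStatistics",
    ],
    "emr-serverless": [
        "emr-serverless:GetApplication",
        "emr-serverless:GetJobRun",
        "emr-serverless:ListJobRunAttempts",
        "s3:ListBucket",
        "s3:GetObject",
        "logs:DescribeLogStreams",
        "logs:GetLogEvents",
        "cloudwatch:GetMetricStatistics",
    ],
    "emr-eks": [
        "emr-containers:DescribeVirtualCluster",
        "emr-containers:DescribeJobRun",
        "emr-containers:ListJobRuns",
        "s3:ListBucket",
        "s3:GetObject",
        "logs:DescribeLogStreams",
        "logs:GetLogEvents",
    ],
}


def actions_for(runtimes: list[str]) -> list[str]:
    """Return the sorted, de-duplicated read actions for the given runtimes."""
    unknown = set(runtimes) - set(REQUIRED_ACTIONS)
    if unknown:
        raise ValueError(f"unknown runtime(s): {sorted(unknown)}")
    return sorted({action for name in runtimes for action in REQUIRED_ACTIONS[name]})


if __name__ == "__main__":
    for action in actions_for(["emr-serverless", "emr-eks"]):
        print(action)
```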

Example IAM Policy Shape

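Scope S3 and CloudWatch Logs resources to known log locations, such as specific bucket prefixes and log groups, whenever possible. Some EMR APIs require Resource: "*"; pair those statements with region or account conditions (for example, aws:RequestedRegion) and keep Agent Space membership narrow.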

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EmrRead",
      "Effect": "Allow",
      "Action": [
        "elasticmapreduce:DescribeCluster",
        "elasticmapreduce:DescribeStep",
        "elasticmapreduce:ListInstanceGroups",
        "elasticmapreduce:ListSteps",
        "emr-serverless:GetApplication",
        "emr-serverless:GetJobRun",
        "emr-serverless:ListJobRunAttempts",
        "emr-containers:DescribeVirtualCluster",
        "emr-containers:DescribeJobRun",
        "emr-containers:ListJobRuns"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:RequestedRegion": "ap-southeast-2"
        }
      }
    },
    {
      "Sid": "LogBucketRead",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::example-emr-log-bucket",
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "emr/*",
            "emr-serverless/*",
            "emr-eks/*"
          ]
        }
      }
    },
    {
      "Sid": "LogBucketLocation",
      "Effect": "Allow",
      "Action": "s3:GetBucketLocation",
      "Resource": "arn:aws:s3:::example-emr-log-bucket"
    },
    {
      "Sid": "LogObjectRead",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": [
        "arn:aws:s3:::example-emr-log-bucket/emr/*",
        "arn:aws:s3:::example-emr-log-bucket/emr-serverless/*",
        "arn:aws:s3:::example-emr-log-bucket/emr-eks/*"
      ]
    },
    {
      "Sid": "CloudWatchRead",
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogStreams",
        "logs:GetLogEvents",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:GetMetricData",
        "cloudwatch:ListMetrics"
      ],
      "Resource": "*"
    }
  ]
}
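The S3 portion of the policy above follows a repeatable pattern: list access on the bucket limited by an s3:prefix condition, plus object reads limited to the same prefixes. The sketch below generates those two statements for a given bucket and prefix list. The function name s3_log_statements is hypothetical, and the code assumes the standard "aws" partition in ARNs.

```python
import json


def s3_log_statements(bucket: str, prefixes: list[str]) -> list[dict]:
    """Build list and read statements limited to the given key prefixes."""
    # Accept "emr" and "emr/" alike by trimming surrounding slashes.
    cleaned = [prefix.strip("/") for prefix in prefixes]
    bucket_arn = f"arn:aws:s3:::{bucket}"  # assumes the standard "aws" partition
    return [
        {
            "Sid": "LogBucketRead",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": bucket_arn,
            # s3:prefix limits which key prefixes a ListBucket call may request.
            "Condition": {
                "StringLike": {"s3:prefix": [f"{prefix}/*" for prefix in cleaned]}
            },
        },
        {
            "Sid": "LogObjectRead",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            # Object reads are scoped by resource ARN instead of a condition.
            "Resource": [f"{bucket_arn}/{prefix}/*" for prefix in cleaned],
        },
    ]


if __name__ == "__main__":
    statements = s3_log_statements(
        "example-emr-log-bucket", ["emr", "emr-serverless", "emr-eks"]
    )
    print(json.dumps({"Version": "2012-10-17", "Statement": statements}, indent=2))
```

Generating both statements from one prefix list keeps the ListBucket condition and the GetObject resources from drifting apart when a log location is added or removed.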

Optional Kubernetes RBAC

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: harrier-pod-reader
  namespace: emr-jobs
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["", "events.k8s.io"]
    resources: ["events"]
    verbs: ["get", "list", "watch"]

Harrier must not create, patch, delete, exec into, port-forward to, or restart Kubernetes resources.
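The restriction above can be checked against a Role before it is applied. The sketch below is illustrative only: it takes a Role that has already been parsed into a Python dictionary (for example, by a YAML parser) and reports any verb outside get, list, and watch, plus the pods/exec and pods/portforward subresources. The function name rbac_violations is hypothetical.

```python
READ_VERBS = {"get", "list", "watch"}
# Subresources that allow interaction with a pod rather than observation.
# Granting even "get" on these can permit exec or port-forward sessions.
BLOCKED_SUBRESOURCES = {"pods/exec", "pods/portforward"}


def rbac_violations(role: dict) -> list[str]:
    """Return human-readable problems found in a parsed Role's rules."""
    problems = []
    for rule in role.get("rules", []):
        for verb in rule.get("verbs", []):
            if verb not in READ_VERBS:
                problems.append(f"verb not read-only: {verb}")
        for resource in rule.get("resources", []):
            if resource in BLOCKED_SUBRESOURCES:
                problems.append(f"interactive subresource: {resource}")
    return problems


if __name__ == "__main__":
    # Mirrors the rules of the harrier-pod-reader Role shown above.
    role = {
        "rules": [
            {"apiGroups": [""], "resources": ["pods", "pods/log"],
             "verbs": ["get", "list", "watch"]},
            {"apiGroups": ["", "events.k8s.io"], "resources": ["events"],
             "verbs": ["get", "list", "watch"]},
        ]
    }
    print(rbac_violations(role))
```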