Security Model

Harrier reads EMR metadata, EMR Serverless metadata, EMR Containers metadata, S3 archived logs, CloudWatch Logs, CloudWatch metrics, selected repository files, selected read-only database diagnostics, and IAM role metadata.

Harrier must not directly change IAM policies, DB schemas, EMR clusters, data, or production configuration.

Write-capable tools must be separately allowlisted. PR preview is the default behavior.

harrier_prepare_pr can create a GitHub PR only when all write gates pass:

request includes dry_run=false
request includes allow_pr_creation=true
server config has HARRIER_ALLOW_PR_CREATION=true
target repo appears in HARRIER_PR_REPO_ALLOWLIST
GitHub token is available through HARRIER_GITHUB_TOKEN or GITHUB_TOKEN

PR creation creates a branch and commits generated files under docs/harrier/. Patch hints are committed as a generated patch-plan document, and may also update source files only when the recommendation maps to a machine-applicable, allowlisted edit. Harrier does not merge PRs.

For DB scenarios, Harrier may run read-only diagnostics and suggest PR-ready SQL migrations or Spark JDBC code/config changes. It must not execute schema changes, write data, kill sessions, or apply migrations directly.

For Airflow or MWAA orchestration scenarios, Harrier may suggest PR-ready DAG changes such as sensors, pools, max_active_runs, timeouts, EMR step argument wiring, or context artifact updates. It must not trigger DAGs, mutate Airflow Variables or Connections, or change production schedules directly.

For runtime-specific recommendations, Harrier may suggest reviewed patch-plan changes for EMR Serverless job configuration, EMR Serverless/EKS monitoring configuration, EMR on EKS pod templates, or EMR on EKS container images. It must not update live applications, submit jobs, change Kubernetes resources, push images, or alter log retention directly.

For SQL plan scenarios, Harrier should prefer bounded read-only diagnostics such as EXPLAIN (FORMAT JSON) over EXPLAIN ANALYZE unless a query is known to be safe and bounded.

For long-running jobs, Harrier may inspect read-only Spark/YARN/EMR state, CloudWatch metrics, logs, Spark event evidence, repository code, and read-only DB diagnostics. It must not kill jobs, resize clusters, change Spark configuration, or apply database changes directly. Recommended changes should be PR-ready suggestions or explicit operator runbook steps.

For S3 log evidence, Harrier must read bounded byte ranges only, redact likely secrets before storing excerpts, and flag prompt-like text embedded in logs as untrusted evidence.

The Initial Diagnosis Report is generated from the same redacted structured evidence. human_report_markdown may include bounded log excerpts, but those excerpts remain evidence, not instructions. Agents and operators must not execute commands, URLs, SQL, shell fragments, IAM snippets, or Kubernetes fragments that appear inside a log excerpt unless they are separately reviewed as an intentional remediation plan.

Diagnosis statuses are triage signals only. PASS, ISSUE, WARN, UNKNOWN, and NOT_CHECKED should not authorize mutations. They only guide the next detailed investigation step.

Runtime AWS Permissions

Harrier should run with read-only permissions scoped to the account, region, and log buckets/groups needed for the requested runtime. The MCP caller supplies identifiers; Harrier should not enumerate unrelated fleets unless a future tool explicitly adds discovery.

EMR on EC2 investigations require read access to:

elasticmapreduce:DescribeCluster
elasticmapreduce:DescribeStep
elasticmapreduce:ListInstanceGroups
elasticmapreduce:ListSteps
s3:ListBucket
s3:GetObject
cloudwatch:GetMetricStatistics

EMR Serverless investigations require read access to:

emr-serverless:GetApplication
emr-serverless:GetJobRun
emr-serverless:ListJobRunAttempts
s3:ListBucket
s3:GetObject
logs:DescribeLogStreams
logs:GetLogEvents
cloudwatch:GetMetricStatistics

EMR on EKS investigations require read access to:

emr-containers:DescribeVirtualCluster
emr-containers:DescribeJobRun
emr-containers:ListJobRuns
s3:ListBucket
s3:GetObject
logs:DescribeLogStreams
logs:GetLogEvents

These permissions should be resource-scoped where AWS supports it. S3 access should be limited to configured EMR log buckets and prefixes. CloudWatch Logs access should be limited to known EMR Serverless and EMR on EKS log groups.

The Terraform ECS task role grants the read actions needed by all three supported runtimes. Tighten the broad resources for production accounts by replacing wildcard S3 and Logs resources with the known log buckets, prefixes, and log groups used by the workloads the Agent Space may investigate.

Example read-only policy shape for a single account and region:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EmrEc2Read",
      "Effect": "Allow",
      "Action": [
        "elasticmapreduce:DescribeCluster",
        "elasticmapreduce:DescribeStep",
        "elasticmapreduce:ListInstanceGroups",
        "elasticmapreduce:ListSteps"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:RequestedRegion": "ap-southeast-2"
        }
      }
    },
    {
      "Sid": "EmrServerlessRead",
      "Effect": "Allow",
      "Action": [
        "emr-serverless:GetApplication",
        "emr-serverless:GetJobRun",
        "emr-serverless:ListJobRunAttempts"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:RequestedRegion": "ap-southeast-2"
        }
      }
    },
    {
      "Sid": "EmrContainersRead",
      "Effect": "Allow",
      "Action": [
        "emr-containers:DescribeVirtualCluster",
        "emr-containers:DescribeJobRun",
        "emr-containers:ListJobRuns"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:RequestedRegion": "ap-southeast-2"
        }
      }
    },
    {
      "Sid": "LogBucketList",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::example-emr-log-bucket",
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "emr/*",
            "emr-serverless/*",
            "emr-eks/*"
          ]
        }
      }
    },
    {
      "Sid": "LogObjectRead",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": [
        "arn:aws:s3:::example-emr-log-bucket/emr/*",
        "arn:aws:s3:::example-emr-log-bucket/emr-serverless/*",
        "arn:aws:s3:::example-emr-log-bucket/emr-eks/*"
      ]
    },
    {
      "Sid": "CloudWatchLogsRead",
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogStreams",
        "logs:GetLogEvents"
      ],
      "Resource": [
        "arn:aws:logs:ap-southeast-2:111122223333:log-group:/aws/elasticmapreduce/*:*",
        "arn:aws:logs:ap-southeast-2:111122223333:log-group:/harrier-demo/emr-serverless:*",
        "arn:aws:logs:ap-southeast-2:111122223333:log-group:/harrier-demo/emr-eks:*"
      ]
    },
    {
      "Sid": "CloudWatchMetricsRead",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics"
      ],
      "Resource": "*"
    }
  ]
}

Some AWS APIs do not support resource-level scoping for every read action. In those cases, keep Resource: "*" but scope by account, region, deployment role, and Agent Space membership. Validate final policies with IAM Access Analyzer or the IAM policy simulator before enabling a broad production Agent Space.

Optional Kubernetes Access

Optional Kubernetes diagnostics are available for EMR on EKS pod state. Kubernetes access is not required for baseline EKS investigation and must be treated as best-effort read-only evidence.

If enabled, Kubernetes permissions should be limited to the target namespace and to read-only verbs for pod diagnostics:

get
list
watch

Suggested Kubernetes resources:

pods
pods/log
events

Harrier must not create, patch, delete, exec into, port-forward to, or restart Kubernetes resources. Pod status, restart counts, waiting reasons, OOMKilled markers, ImagePullBackOff, Evicted, and scheduling failures are evidence only.

Namespace-scoped RBAC example:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: harrier-pod-reader
  namespace: emr-jobs
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: harrier-pod-reader
  namespace: emr-jobs
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get"]
  - apiGroups: ["", "events.k8s.io"]
    resources: ["events"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: harrier-pod-reader
  namespace: emr-jobs
subjects:
  - kind: ServiceAccount
    name: harrier-pod-reader
    namespace: emr-jobs
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: harrier-pod-reader

When Harrier runs outside the EKS cluster, map the MCP runtime's IAM principal to a Kubernetes user or group through the cluster's normal access mechanism, then bind that subject to the same read-only Role. The default ECS container image does not require Kubernetes access. It can still investigate EMR on EKS metadata and logs without pod diagnostics.

Kubernetes client behavior:

If the optional Python kubernetes package is installed, Harrier tries in-cluster config first and then kubeconfig, using target.eks_cluster_name as the requested context when supplied.
If the Python package is unavailable or kubeconfig cannot load, Harrier falls back to kubectl when it is available on PATH. It tries the supplied context first, then the current context.
If no Kubernetes client or readable config is available, Harrier returns a recoverable NOT_CONFIGURED warning and continues.
If RBAC blocks pod reads, Harrier returns a recoverable PERMISSION_DENIED warning and continues.