Demo Lab
Harrier EMR Demo Lab is the companion repository for controlled AWS validation. It creates disposable EMR incidents so that Harrier can be tested against real Spark failure evidence, not only against synthetic fixtures.
The production MCP server lives in harrier-emr-mcp. The demo lab owns the AWS infrastructure, Spark jobs, scenario runners, expected findings, validation harness, alarms, cleanup scripts, and cost-control docs.
What It Covers
| Runtime | Coverage | Example Evidence |
|---|---|---|
| EMR on EC2 | Broad scenario coverage | EMR steps, YARN application IDs, S3 logs, CloudWatch metrics |
| EMR Serverless | Focused Spark failure coverage | job run metadata, S3 logs, CloudWatch logs and metrics |
| EMR on EKS | Focused Spark and Kubernetes coverage | EMR Containers job runs, pod status, S3 logs, CloudWatch logs |
| MWAA local runner | Orchestration demo | scenario DAGs packaged for ECS Fargate |
Scenario Catalog
| Scenario | Runtime Coverage | Expected Finding |
|---|---|---|
| happy_path | EC2, Serverless, EKS | success |
| executor_oom | EC2, Serverless, EKS | EXECUTOR_OOM |
| driver_oom | EC2 | DRIVER_OOM |
| missing_dependency | EC2, Serverless | DEPENDENCY_MISSING |
| s3_access_denied | EC2, EKS | S3_ACCESS_DENIED |
| s3_path_missing | EC2, Serverless | S3_PATH_MISSING |
| bad_input_data | EC2, Serverless | BAD_INPUT_DATA |
| image_pull_failure | EKS | EKS_IMAGE_PULL_FAILURE |
| pod_pending_resource_pressure | EKS | EKS_POD_PENDING |
The demo lab also includes EC2-focused data, SQL, storage, and Livy scenarios
such as data_skew, hdfs_full, db_connection_failure, db_lock_timeout,
db_large_join_spill, db_bad_sql_plan, and livy_session_failure.
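
As a rough illustration, the catalog table above can be expressed as a lookup structure for selecting scenarios by runtime. This is a minimal sketch, not the demo lab's actual code: the `Scenario` class, the `CATALOG` mapping, and the `scenarios_for` helper are hypothetical names, and only the nine scenarios listed in the table are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One catalog row. Hypothetical structure, not the demo lab's own type."""
    name: str
    runtimes: frozenset
    expected_finding: str


# Rows copied from the Scenario Catalog table; the EC2-only extras are omitted.
_ROWS = [
    ("happy_path", {"EC2", "Serverless", "EKS"}, "success"),
    ("executor_oom", {"EC2", "Serverless", "EKS"}, "EXECUTOR_OOM"),
    ("driver_oom", {"EC2"}, "DRIVER_OOM"),
    ("missing_dependency", {"EC2", "Serverless"}, "DEPENDENCY_MISSING"),
    ("s3_access_denied", {"EC2", "EKS"}, "S3_ACCESS_DENIED"),
    ("s3_path_missing", {"EC2", "Serverless"}, "S3_PATH_MISSING"),
    ("bad_input_data", {"EC2", "Serverless"}, "BAD_INPUT_DATA"),
    ("image_pull_failure", {"EKS"}, "EKS_IMAGE_PULL_FAILURE"),
    ("pod_pending_resource_pressure", {"EKS"}, "EKS_POD_PENDING"),
]

CATALOG = {
    name: Scenario(name, frozenset(runtimes), finding)
    for name, runtimes, finding in _ROWS
}


def scenarios_for(runtime: str) -> list:
    """Return the scenario names covered on a runtime, sorted alphabetically."""
    return sorted(name for name, s in CATALOG.items() if runtime in s.runtimes)


if __name__ == "__main__":
    for runtime in ("EC2", "Serverless", "EKS"):
        print(runtime, scenarios_for(runtime))
```

Keeping the expected finding next to the runtime coverage means a runner could skip unsupported combinations and still know what result to compare against.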
Validation Flow
flowchart LR
Deploy["Deploy demo infra"] --> Scenario["Run scenario"]
Scenario --> Context["Export Harrier context"]
Context --> MCP["Call Harrier MCP"]
MCP --> Compare["Compare expected finding"]
Compare --> Report["Write validation report"]
Report --> Cleanup["Cleanup or destroy"]
Validation reports are written under .harrier-demo/validation/ in the demo
repository and should not be committed.
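
The compare and report steps of the flow might look roughly like the sketch below. It assumes the finding returned by Harrier MCP is already available as a string; the function names, the JSON report layout, and the one-file-per-scenario naming are illustrative assumptions, not the demo lab's real harness. Only the `.harrier-demo/validation/` location comes from the text above.

```python
import json
from pathlib import Path

# Report location named in the docs; everything else here is an assumption.
REPORT_DIR = Path(".harrier-demo") / "validation"


def compare_finding(scenario: str, expected: str, actual: str) -> dict:
    """Build a result record for one scenario run (hypothetical layout)."""
    return {
        "scenario": scenario,
        "expected": expected,
        "actual": actual,
        "passed": expected == actual,
    }


def write_report(result: dict, report_dir: Path = REPORT_DIR) -> Path:
    """Write the result as JSON, creating the report directory if needed."""
    report_dir.mkdir(parents=True, exist_ok=True)
    path = report_dir / f"{result['scenario']}.json"
    path.write_text(json.dumps(result, indent=2))
    return path


if __name__ == "__main__":
    result = compare_finding("executor_oom", "EXECUTOR_OOM", "EXECUTOR_OOM")
    print(write_report(result))
```

Because the reports are generated output, the directory is a natural candidate for the repository's ignore rules.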
Safety First
The demo lab creates real AWS resources and incurs real AWS costs. Use a sandbox account, review the cost and cleanup docs before deploying, and destroy resources as soon as validation is complete.
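
One way to enforce the sandbox-account advice is a pre-flight check that refuses to deploy outside an allowlist of account IDs. The sketch below is an assumption about how such a guard could work, not something the demo lab is known to ship; `assert_sandbox_account` and the example account IDs are hypothetical.

```python
def assert_sandbox_account(account_id: str, allowed_accounts: set) -> None:
    """Raise if the active AWS account is not an approved sandbox account.

    Hypothetical guard. The account ID could be looked up with
    boto3.client("sts").get_caller_identity()["Account"]; it is passed in
    here so the check itself needs no AWS credentials.
    """
    if account_id not in allowed_accounts:
        raise RuntimeError(
            f"Account {account_id} is not an approved sandbox account; "
            "refusing to deploy demo infrastructure."
        )


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    assert_sandbox_account("111111111111", {"111111111111"})
    print("sandbox check passed")
```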
Repository
The demo lab repository is:
https://github.com/the-platform-layer/harrier-emr-demo-lab
If you have access to the repository, start with its README, cost controls, scenario catalog, and cleanup guide before running live workloads.