Automated Pipeline | Kearney AI Sandbox

What is this and why does it matter?

One of Kearney's core service offerings is helping federal agencies detect anomalies in their financial systems: unusual transactions, access patterns, or records that warrant closer review. Building and validating that kind of detection capability traditionally requires substantial setup time before any testing can happen.

The sandbox pipeline removes that obstacle. It automates the entire process from data generation through evaluation, so you can test a detection approach end-to-end in a single run and see concrete results within a few minutes.

What this is not

This pipeline runs on synthetic data only. The results are for internal validation only, not for client deliverables. The path from sandbox results to a client-ready capability requires internal review and formal accreditation.

What the pipeline does

Stage 1

Generate Synthetic Data

Creates realistic federal financial records that mimic the PBIS and STARS-FL formats. A configurable percentage of the records are seeded as known, labeled anomalies.
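As a rough illustration of this stage, the sketch below generates records and seeds a fixed share of them as anomalies. The field names (`record_id`, `amount`, `hour_of_day`, `is_anomaly`) and the value ranges are hypothetical and are not the actual PBIS or STARS-FL layouts. Only the defaults of 200 records and a 5% anomaly rate come from this page.

```python
import random


def generate_records(n_records=200, anomaly_rate=0.05, seed=0):
    """Return n_records synthetic transactions; a fixed share are seeded anomalies."""
    rng = random.Random(seed)
    n_anomalies = round(n_records * anomaly_rate)
    # Choose which record positions will be seeded as anomalies.
    anomaly_ids = set(rng.sample(range(n_records), n_anomalies))

    records = []
    for i in range(n_records):
        is_anomaly = i in anomaly_ids
        # Hypothetical pattern: normal amounts cluster near 1,000 during business
        # hours; seeded anomalies are much larger and posted overnight.
        amount = rng.gauss(50_000, 5_000) if is_anomaly else rng.gauss(1_000, 200)
        hour = rng.choice([2, 3]) if is_anomaly else rng.randint(8, 17)
        records.append({
            "record_id": f"TXN-{i:05d}",
            "amount": round(abs(amount), 2),
            "hour_of_day": hour,
            "is_anomaly": is_anomaly,  # ground-truth label kept for evaluation
        })
    return records


records = generate_records()
print(len(records), sum(r["is_anomaly"] for r in records))
```

Keeping the ground-truth label on each record is what makes the later evaluation stage possible.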

Stage 2

Normalize to OCSF

Converts raw records to the Open Cybersecurity Schema Framework, a standard format used across the federal security community.
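The normalization step can be pictured as a field mapping. The sketch below is an assumption-heavy example: the raw field names are hypothetical, and the choice of the OCSF Base Event class (`class_uid` 0) is made only for illustration; the sandbox pipeline may map records to a different event class. Fields without an obvious OCSF home are placed under `unmapped`.

```python
from datetime import datetime, timezone


def to_ocsf(raw):
    """Map one raw record (hypothetical layout) onto a few OCSF base attributes."""
    posted = datetime.fromisoformat(raw["posted_at"]).replace(tzinfo=timezone.utc)
    class_uid, activity_id = 0, 99  # assumption: Base Event, activity "Other"
    return {
        "class_uid": class_uid,
        "category_uid": 0,
        "activity_id": activity_id,
        "type_uid": class_uid * 100 + activity_id,
        "severity_id": 1,  # Informational
        "time": int(posted.timestamp() * 1000),  # OCSF timestamps are epoch milliseconds
        "metadata": {
            "version": "1.1.0",  # schema version assumed for this sketch
            "product": {"name": "sandbox-generator", "vendor_name": "Kearney"},
        },
        # Source-specific fields are preserved rather than dropped.
        "unmapped": {"record_id": raw["record_id"], "amount": raw["amount"]},
    }


raw_record = {"record_id": "TXN-00001", "amount": 1042.17,
              "posted_at": "2024-03-01T14:30:00"}
event = to_ocsf(raw_record)
print(event["type_uid"], event["unmapped"]["amount"])
```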

Stage 3

Score for Anomalies

Scores each record with an Isolation Forest model. A high score indicates an unusual pattern that deviates from normal behavior.
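A minimal scoring sketch using scikit-learn's `IsolationForest` follows. It assumes the pipeline uses scikit-learn, which this page does not confirm, and the two feature columns are hypothetical. Note that scikit-learn's `score_samples` returns lower values for more abnormal points, so the sketch negates it to match the "high score means unusual" convention described here.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# 195 ordinary records plus 5 extreme ones; columns are (amount, hour_of_day).
normal = np.column_stack([rng.normal(1_000, 200, 195), rng.integers(8, 18, 195)])
odd = np.column_stack([rng.normal(50_000, 5_000, 5), rng.integers(2, 4, 5)])
X = np.vstack([normal, odd])

# contamination mirrors the 5% seeded anomaly rate; random_state fixes the trees.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=0)
model.fit(X)

# Negate so that a higher value means a more unusual record.
anomaly_score = -model.score_samples(X)

# predict() returns -1 for records the model treats as outliers, 1 otherwise.
predicted_anomaly = model.predict(X) == -1

print(anomaly_score[:195].mean(), anomaly_score[195:].mean())
```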

Stage 4

Evaluate Results

Compares the model's predictions against the known anomaly labels and computes precision, recall, and F1 score.
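For readers less familiar with these metrics, here is a small worked example in plain Python. The counts are invented for illustration: 10 seeded anomalies in 200 records, with the model flagging 12 records, 9 of them correctly.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for boolean labels and predictions."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))        # flagged and seeded
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))  # flagged, not seeded
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))  # seeded, missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# 10 seeded anomalies followed by 190 normal records.
y_true = [True] * 10 + [False] * 190
# The model catches 9 of the 10, misses 1, and raises 3 false alarms.
y_pred = [True] * 9 + [False] * 1 + [True] * 3 + [False] * 187

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.75 0.9 0.818
```

Precision answers "of the records flagged, how many were real anomalies?", recall answers "of the real anomalies, how many were flagged?", and F1 is the harmonic mean of the two.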

What you get at the end

A JSON evaluation report saved to your S3 model artifacts bucket. The report shows how accurately the detection model identified the seeded anomalies and provides per-record scores for inspection. An email notification is sent to the sandbox owner when the run completes.

How to run the pipeline

The pipeline is triggered by invoking the inference harness Lambda function with a batch_test mode event. This can be done from your SageMaker Studio notebook using the AWS SDK, or from the AWS Lambda console directly.

Your team lead can provide a starter notebook that includes the invocation code. The default run generates 200 synthetic records with a 5% anomaly rate. Both values are configurable.
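Until you have the starter notebook, the invocation can be sketched roughly as follows. The function name `sandbox-inference-harness` and the event keys `mode`, `n_records`, and `anomaly_rate` are assumptions made for this example; only the `batch_test` mode and the default values come from this page. Check the starter notebook for the real names.

```python
import json

FUNCTION_NAME = "sandbox-inference-harness"  # hypothetical; use your sandbox's name


def build_event(n_records=200, anomaly_rate=0.05):
    """Build a batch_test event; the key names are assumed, not confirmed."""
    if n_records <= 0:
        raise ValueError("n_records must be positive")
    if not 0 < anomaly_rate < 1:
        raise ValueError("anomaly_rate must be between 0 and 1")
    return {"mode": "batch_test", "n_records": n_records,
            "anomaly_rate": anomaly_rate}


def invoke_pipeline(lambda_client, event, function_name=FUNCTION_NAME):
    """Invoke the Lambda synchronously and return its decoded JSON response."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the run to finish
        Payload=json.dumps(event).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())


# In a SageMaker Studio notebook with AWS credentials available:
#   import boto3
#   result = invoke_pipeline(boto3.client("lambda"), build_event(n_records=500))
print(build_event())
```

Passing the client in as an argument keeps the function easy to try out with a stub before pointing it at the real sandbox.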

Viewing results

Results are written to the model artifacts S3 bucket under the experiments/ prefix, organized by run ID. You can browse and download them from the AWS S3 console or read them directly in a notebook. The evaluation report is a structured JSON file that can be opened in any text editor or imported into Excel.
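Reading a run's output from a notebook might look like the sketch below. It assumes each run's files sit directly under `experiments/<run_id>/`, which is a guess at the key layout, and the bucket name shown in the usage comment is a placeholder.

```python
import json


def load_reports(s3_client, bucket, run_id):
    """Return {key: parsed JSON} for every .json object under one run's prefix."""
    prefix = f"experiments/{run_id}/"  # assumed layout: one sub-prefix per run
    # list_objects_v2 returns at most 1,000 keys per call; this sketch does not
    # follow continuation tokens, which is fine for a single run's output.
    listing = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    reports = {}
    for obj in listing.get("Contents", []):
        key = obj["Key"]
        if key.endswith(".json"):
            body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
            reports[key] = json.loads(body)
    return reports


# In a notebook with AWS credentials available:
#   import boto3
#   reports = load_reports(boto3.client("s3"), "<your-model-artifacts-bucket>", "<run-id>")
```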

Monitoring runs in CloudWatch

Every pipeline run produces structured log entries in CloudWatch Logs under the /kearney/sandbox/kearney-ai/experiments log group. If a run fails, this is the first place to check. The sandbox CloudWatch dashboard shows invocation counts, duration, and error rates for quick health monitoring.
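To pull recent error entries from that log group without opening the console, something like the following should work. The filter pattern `ERROR` is an assumption about how the pipeline words its failures; if the entries are JSON, a JSON-style filter pattern may suit better.

```python
import time

LOG_GROUP = "/kearney/sandbox/kearney-ai/experiments"


def recent_errors(logs_client, minutes=60, pattern="ERROR", now_ms=None):
    """Return messages from the last `minutes` that match the filter pattern."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # CloudWatch uses epoch milliseconds
    # Only the first page of results is read; follow nextToken for longer windows.
    response = logs_client.filter_log_events(
        logGroupName=LOG_GROUP,
        startTime=now_ms - minutes * 60 * 1000,
        endTime=now_ms,
        filterPattern=pattern,
    )
    return [event["message"] for event in response.get("events", [])]


# In a notebook with AWS credentials available:
#   import boto3
#   for line in recent_errors(boto3.client("logs"), minutes=30):
#       print(line)
```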

Questions about the pipeline

The pipeline is primarily used by the cybersecurity engineering team. If you are a financial auditor exploring the sandbox, the Prompt Engineering Assistant on the home page is likely a better starting point for your work. Contact sandbox-support@kearneyco.com if you want to set up a pipeline demonstration.
