How Ideas2IT Built the Agentic AI System That Turns Oncology Guideline PDFs into Deterministic Treatment Pathways for One of South Florida's Largest Public Health Systems

A large public health system needed to reduce the decision friction oncologists face when mapping pathology reports to therapy regimens. Ideas2IT built a two-pipeline agentic AI system on Azure AI Foundry that converts MOFFIT guideline PDFs into deterministic decision logic and produces explainable treatment recommendations from real-world biomarker inputs.

Client

Major US Healthtech

Industry

Healthcare

Service

Agentic AI

Artificial Intelligence

Location

South Florida, USA

Stack

Azure AI Foundry · GPT-4.1 · Python

01 Challenge

Oncologists at a large public health system were navigating complex guideline PDFs with branching logic, clinical qualifiers, and exception paths that don't translate cleanly into point-of-care decisions. Biomarker values arrived in inconsistent formats, prior treatment history changed the valid option set, and no system existed that could traverse multiple treatment lines while explaining every regimen it accepted or rejected.

02 Solution

Ideas2IT designed an agentic decision workflow with two separated pipelines: a batch pipeline that converts MOFFIT guideline PDFs into structured decision tree logic, and a real-time inference pipeline that takes pathology and biomarker inputs and returns ranked regimen shortlists with full reasoning. GPT-4.1 on Azure AI Foundry handles inference. Entity matching replaces vector search, making context retrieval deterministic and auditable.

03 Outcome

The platform converts unstructured oncology guideline PDFs into traversable decision logic, handles biomarker inputs across inconsistent formats and high-combination scenarios, and produces explainable regimen recommendations with explicit reasoning for every selection and rejection across multiple lines of treatment.

Phase 01

Turning unstructured clinical guidelines into model-ready decision context

Guideline Structuring Pipeline: Converting PDF Pathways into Deterministic Decision Logic

The first engineering decision set the constraint for everything that followed: a general-purpose language model cannot safely traverse oncology guidelines that live as branching logic inside PDFs. The source had to be converted before inference was possible.

Ideas2IT built a batch pipeline that takes MOFFIT guideline PDFs and converts them into structured decision-tree representations the inference layer can traverse with constraints. All conditions, prior treatment checks, and biomarker references cited in each pathway were extracted upfront during this batch phase. Pulling that extraction out of the inference loop reduced runtime complexity and made real-time recommendations faster and less error-prone.

The output of this pipeline is a structured pathway the agent can walk, step by step, with constraints enforced at each node.
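
As a rough illustration of what a walkable pathway with per-node constraints could look like, here is a minimal sketch. The `PathwayNode` class, the `walk` function, and the toy pathway are all hypothetical; the case study does not describe the actual schema, and the labels below are placeholders rather than content from any real guideline.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PathwayNode:
    """One step in a structured pathway (assumed shape, for illustration only)."""
    label: str
    condition: Callable[[dict], bool]   # constraint enforced before entering this node
    regimen: Optional[str] = None       # set only on leaf nodes
    children: list = field(default_factory=list)


def walk(node, patient, path=()):
    """Return (regimen, path) for every leaf whose ancestor conditions all hold."""
    if not node.condition(patient):
        return []                        # constraint failed: prune this branch
    path = path + (node.label,)
    if node.regimen is not None:
        return [(node.regimen, path)]
    results = []
    for child in node.children:
        results.extend(walk(child, patient, path))
    return results


# Toy pathway with placeholder names; not taken from any real guideline.
toy_pathway = PathwayNode(
    label="first-line",
    condition=lambda p: p["line"] == 1,
    children=[
        PathwayNode("marker-positive", lambda p: p["marker_x"] == "positive", regimen="regimen-A"),
        PathwayNode("marker-negative", lambda p: p["marker_x"] == "negative", regimen="regimen-B"),
    ],
)

matches = walk(toy_pathway, {"line": 1, "marker_x": "positive"})
```

Each returned path records which nodes were entered, which is one way a recommendation could be traced back to the constraints that allowed it.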

This phase produced:

  • Batch PDF-to-decision-tree conversion pipeline
  • Structured pathway representations of MOFFIT oncology guidelines
  • Upfront extraction of conditions, prior checks, and biomarker references
  • Constraint-encoded decision logic ready for deterministic retrieval

Phase 02

Replacing semantic search with entity-matched, auditable context retrieval

Deterministic Retrieval and Agentic Inference: An 8-Step Sequence That Mirrors Clinical Reasoning

With structured pathway logic in place, the design question was how the inference agent would retrieve context. Vector search was explicitly ruled out. Ideas2IT used entity matching instead, so context retrieval is predictable, reproducible, and auditable across every patient query.
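
To show why entity matching is reproducible in a way similarity search is not, here is a minimal sketch of retrieval as an exact set-containment lookup. The index layout, section names, and entity labels are assumptions made for illustration; the case study does not describe how the real retrieval layer is implemented.

```python
# Hypothetical index: each pathway section is keyed by the entities it requires.
PATHWAY_INDEX = [
    {"section": "section-1", "entities": frozenset({"tumor-type-a", "marker-x"})},
    {"section": "section-2", "entities": frozenset({"tumor-type-a", "marker-y"})},
    {"section": "section-3", "entities": frozenset({"tumor-type-b"})},
]


def retrieve(query_entities):
    """Return sections whose required entities all appear in the query.

    There is no similarity score or ranking threshold: a section either
    matches or it does not, so the same query always returns the same list.
    """
    query = frozenset(e.strip().lower() for e in query_entities)
    hits = [row["section"] for row in PATHWAY_INDEX if row["entities"] <= query]
    return sorted(hits)  # stable ordering for reproducible output


first = retrieve(["Tumor-Type-A", "marker-x"])
second = retrieve(["marker-x", "tumor-type-a"])  # same entities, different order
```

Because the match is a set comparison, an auditor can explain any retrieval result by listing the entities that matched.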

On top of that retrieval layer, the inference agent runs an 8-step sequence that mirrors how an oncologist walks through eligibility and exclusions for each treatment line. The checks, each addressed in sequence, include performance status, prior drug exposure, resistance markers, drug family naming variants, comorbidities, and line-specific constraints.

Because each check is a discrete step, the agent does not skip constraint checks the way a single flat prompt could.
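
The idea of running constraint checks as discrete, ordered steps can be sketched as a list of check functions applied one after another. This is only an illustration: the source does not enumerate all eight steps, so the three checks below, their names, and the patient and regimen fields are hypothetical.

```python
def check_performance_status(patient, regimen):
    """Reject when the patient's performance status exceeds the regimen's limit."""
    limit = regimen.get("max_performance_status")
    if limit is not None and patient["performance_status"] > limit:
        return f"performance status {patient['performance_status']} exceeds limit {limit}"
    return None


def check_prior_exposure(patient, regimen):
    """Reject when the patient has already received a drug in the regimen."""
    overlap = set(patient["prior_drugs"]) & set(regimen["drugs"])
    if overlap:
        return f"prior exposure to {sorted(overlap)}"
    return None


def check_resistance(patient, regimen):
    """Reject when a known resistance marker blocks the regimen."""
    blocked = set(patient["resistance_markers"]) & set(regimen.get("blocked_by", []))
    if blocked:
        return f"resistance markers present: {sorted(blocked)}"
    return None


# Order matters: every check runs, in this order, for every candidate regimen.
CHECKS = [check_performance_status, check_prior_exposure, check_resistance]


def evaluate(patient, regimen):
    """Run all checks in order and return (eligible, reasons_for_rejection)."""
    reasons = []
    for check in CHECKS:
        reason = check(patient, regimen)
        if reason:
            reasons.append(f"{check.__name__}: {reason}")
    return (not reasons, reasons)


patient = {"performance_status": 3, "prior_drugs": ["drug-a"], "resistance_markers": []}
regimen = {"name": "regimen-A", "drugs": ["drug-a", "drug-b"], "max_performance_status": 2}
eligible, reasons = evaluate(patient, regimen)
```

Collecting every failed check, rather than stopping at the first, keeps the rejection reasoning complete for the explanation shown to the clinician.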

This phase produced:

  • Entity-matching retrieval layer
  • 8-step agentic inference sequence
  • Performance status and prior exposure constraint handling
  • Drug resistance and drug family naming logic
  • Multi-line treatment pathway traversal
  • Deterministic, auditable context retrieval

Phase 03

Building a system that checks its own reasoning before returning output

Reflective Validation Loop and Explainability Layer: Generation, Gap Detection, and Iterative Correction

A single-pass generative system is insufficient for clinical decision support. Ideas2IT implemented a reflective pattern using two connected agents: one generates the regimen shortlist with reasoning, and the second checks it for constraint misses, unsafe logical leaps, and gaps across the treatment lines covered.

Where gaps are found, the generation agent is forced to correct and resubmit. The cycle repeats until the output satisfies the validation rules.
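
The generate, validate, correct cycle can be sketched as a small loop. The sketch below uses stub functions in place of the two agents and makes no model calls; the function names, the draft structure, and the `max_rounds` safety cap are assumptions added for illustration and are not described in the source.

```python
def reflective_loop(generate, validate, patient, max_rounds=5):
    """Regenerate until the validator reports no gaps.

    max_rounds is an assumed safety cap so the loop cannot run forever.
    """
    feedback = []
    for round_number in range(1, max_rounds + 1):
        draft = generate(patient, feedback)
        feedback = validate(patient, draft)
        if not feedback:
            return draft, round_number
    raise RuntimeError(f"validation still failing after {max_rounds} rounds: {feedback}")


def stub_generate(patient, feedback):
    """Stand-in for the generation agent; fills in whatever the validator flagged."""
    draft = {"selected": {"regimen-A": "matches marker-x pathway"}, "rejected": {}}
    if "missing rejection reasoning" in feedback:
        draft["rejected"]["regimen-B"] = "prior exposure to drug-b"
    return draft


def stub_validate(patient, draft):
    """Stand-in for the validation agent; returns a list of detected gaps."""
    gaps = []
    if not draft["rejected"]:
        gaps.append("missing rejection reasoning")
    return gaps


final_draft, rounds = reflective_loop(stub_generate, stub_validate, patient={})
```

In this toy run the first draft omits rejection reasoning, the validator flags it, and the second draft passes.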

The explainability layer runs through this same loop, producing reasoning for why each regimen was included and why others were rejected, traceable to the pathway constraints, not to probabilistic inference.

This phase produced:

  • Generation agent for regimen shortlisting and reasoning
  • Validation agent for gap detection and constraint verification
  • Reflective correction loop
  • Explainability output for regimen selection and rejection
  • Guardrails against hallucination across multi-line treatment scenarios
  • Support for inconsistent biomarker input formats
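
One way to handle inconsistent biomarker input formats is to normalize them to a canonical form before any pathway logic runs. The sketch below is an assumption about how that could be done; the synonym table, the `name: value` and `name = value` input shapes, and the placeholder marker names are hypothetical and do not come from the source.

```python
import re

# Hypothetical synonym table mapping reported values to a canonical status.
STATUS_SYNONYMS = {
    "positive": "positive", "pos": "positive", "+": "positive", "detected": "positive",
    "negative": "negative", "neg": "negative", "-": "negative", "not detected": "negative",
}

# Accepts "name: value" or "name = value" with arbitrary surrounding whitespace.
_PATTERN = re.compile(r"\s*(.+?)\s*[:=]\s*(.+?)\s*$")


def normalize_biomarker(raw):
    """Return (canonical_name, canonical_status) or raise ValueError if unrecognized."""
    match = _PATTERN.match(raw)
    if match is None:
        raise ValueError(f"unrecognized biomarker entry: {raw!r}")
    name = re.sub(r"[\s_]+", "-", match.group(1).lower())
    value = " ".join(match.group(2).lower().split())
    if value not in STATUS_SYNONYMS:
        # Fail loudly rather than guess at a clinical value.
        raise ValueError(f"unrecognized status {value!r} for {name!r}")
    return name, STATUS_SYNONYMS[value]


normalized = [
    normalize_biomarker("Marker X: POS"),
    normalize_biomarker("marker_x = Not  Detected"),
]
```

Raising an error on an unknown value, instead of falling back to a default, keeps ambiguous inputs out of the decision logic.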

The Outcome

| Category | Metric | Description |
| --- | --- | --- |
| Agentic pipeline stages | 8 steps | Inference sequence mirrors oncologist eligibility and exclusion workflow |
| Context retrieval method | Deterministic | Entity matching replaces vector search; behavior is auditable and reproducible |
| Processing architecture | 2 pipelines | Batch guideline structuring separated from real-time inference for safety and speed |
| Reasoning output | Selection + rejection | Every regimen recommendation includes explicit reasoning for inclusion and exclusion |
| Validation mechanism | Reflective loop | Generation-validation agent pair forces iterative correction before output is returned |
| Compliance | HIPAA | Maintained across all data handling and inference layers |
| Treatment lines | Multiple | Pathway traversal covers multi-line treatment scenarios with constraint enforcement at each line |

The platform's reliability is a product of the architectural choices made before a single inference call ran. Separating batch guideline structuring from real-time inference meant the model never had to parse a PDF under latency pressure. Replacing vector search with entity matching meant context retrieval behavior was predictable and traceable. The 8-step inference sequence and the reflective validation loop meant constraints were enforced, not assumed. The explainability output is not a UI feature bolted on at the end. It was built into the reasoning workflow from the first design decision.