Agentic AI for Cancer Pathway Detection: Converting PDFs into Deterministic Regimen Reasoning

From Oncology Guideline PDFs to Deterministic Treatment Pathways: A Case Study in Agentic AI for Clinical Decision Support


One-liner summary:

A large public healthcare system partnered with Ideas2IT to build an agentic AI workflow that converts oncology guideline PDFs into deterministic decision trees and delivers explainable regimen shortlists for oncologists based on real-world biomarker inputs and clinical constraints.

The Problem with the Status Quo

A leading public healthcare system in South Florida wanted to reduce the decision friction oncologists face when mapping a pathology report to the right therapy regimen across multiple lines of treatment.

The challenge was not a lack of guidelines. It was the opposite: complex guideline PDFs with branching logic, exceptions, and clinical qualifiers that are hard to operationalize at the point of care, especially when biomarker values arrive in inconsistent formats and prior treatment history changes the valid options.

They needed a system that could narrow regimens reliably, explain decisions, and behave like a careful clinician, not a creative chatbot.

Where the Gaps Were

They needed an AI system that could do five hard things at once:

Convert guideline PDFs into a standardized structure that can be safely used as model context.
Prevent hallucinations with strict validation and guardrails.
Keep prompts generic enough to work across different cancer pathways and multiple lines of treatment.
Handle biomarker inputs in many formats and combinations without breaking the workflow.
Generate clinical-grade reasoning for both:
- why a regimen is selected, and
- why other regimens are rejected.
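
The fourth requirement, tolerating biomarker inputs in many formats, can be illustrated with a small normalizer. This is a minimal sketch, not the delivered implementation: the alias table, the status vocabularies, and the function name `normalize_biomarker` are assumptions made for illustration. The idea it shows is that an input either maps to a canonical, typed value or is rejected outright, so nothing ambiguous reaches the reasoning step.

```python
import re
from typing import Optional

# Hypothetical alias table; a real system would derive canonical names
# from the structured guideline rather than hard-code them.
BIOMARKER_ALIASES = {
    "pd-l1": "PD-L1",
    "pdl1": "PD-L1",
    "pd l1": "PD-L1",
    "egfr": "EGFR",
    "her2": "HER2",
    "her-2": "HER2",
    "erbb2": "HER2",
}

# Assumed vocabularies for qualitative results.
POSITIVE = {"positive", "pos", "+", "detected", "mutated"}
NEGATIVE = {"negative", "neg", "-", "not detected", "wild type", "wild-type"}


def normalize_biomarker(name: str, raw_value: str) -> Optional[dict]:
    """Map a free-text biomarker entry to a canonical name and typed value.

    Returns None when the entry cannot be interpreted, so the caller can
    stop and ask for clarification instead of guessing.
    """
    canonical = BIOMARKER_ALIASES.get(name.strip().lower())
    if canonical is None:
        return None
    value = raw_value.strip().lower()
    if value in POSITIVE:
        return {"name": canonical, "kind": "status", "value": "positive"}
    if value in NEGATIVE:
        return {"name": canonical, "kind": "status", "value": "negative"}
    # Percentages such as "50%", "50 %", or ">= 50%".
    match = re.fullmatch(r"(>=|<=|>|<)?\s*(\d+(?:\.\d+)?)\s*%", value)
    if match:
        return {
            "name": canonical,
            "kind": "percent",
            "comparator": match.group(1) or "=",
            "value": float(match.group(2)),
        }
    return None
```

With these tables, `normalize_biomarker("PDL1", " 50 % ")` and `normalize_biomarker("PD-L1", "50%")` resolve to the same record, while an unrecognized name or value returns `None` rather than a best guess.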

On top of that, the system had to account for real clinical constraints that commonly invalidate choices:

performance status
prior drug exposure
drug resistance
different drug names within the same family
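
The four constraints listed above can be sketched as a screening pass that records a reason for every regimen, selected or rejected. This is an illustrative sketch only: the `Regimen` and `Patient` fields, the use of the ECOG scale for performance status, the drug-to-family map, and the rule that prior exposure to a family excludes a regimen are all assumptions, not rules taken from the guidelines or from the delivered system.

```python
from dataclasses import dataclass, field

# Hypothetical drug-to-family map for illustration; a production map
# would come from a curated formulary.
DRUG_FAMILY = {
    "pembrolizumab": "PD-1 inhibitor",
    "nivolumab": "PD-1 inhibitor",
    "osimertinib": "EGFR TKI",
    "erlotinib": "EGFR TKI",
}


@dataclass
class Regimen:
    name: str
    drugs: list[str]
    max_ecog: int  # highest performance status allowed (assumed field)


@dataclass
class Patient:
    ecog: int
    prior_drugs: list[str] = field(default_factory=list)
    resistant_families: list[str] = field(default_factory=list)


def families(drugs):
    """Resolve drug names to families; unknown drugs stand for themselves."""
    return {DRUG_FAMILY.get(d.lower(), d.lower()) for d in drugs}


def screen(regimens, patient):
    """Return (selected, rejected), each a list of (regimen name, reason)."""
    selected, rejected = [], []
    prior = families(patient.prior_drugs)
    resistant = set(patient.resistant_families)
    for r in regimens:
        fams = families(r.drugs)
        if patient.ecog > r.max_ecog:
            rejected.append((r.name, f"ECOG {patient.ecog} exceeds limit {r.max_ecog}"))
        elif fams & resistant:
            rejected.append((r.name, f"resistance to {sorted(fams & resistant)[0]}"))
        elif fams & prior:
            rejected.append((r.name, f"prior exposure to {sorted(fams & prior)[0]}"))
        else:
            selected.append((r.name, "no constraint violated"))
    return selected, rejected
```

Because matching happens on the family rather than the drug name, a patient previously treated with pembrolizumab is screened out of a nivolumab-based regimen even though the names differ.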

What We Delivered

Ideas2IT designed the solution as an agentic decision workflow with deterministic context retrieval and explicit step-by-step reasoning.

1) Built two separate pipelines for safety and speed

Batch pipeline: Converts guideline PDFs into structured decision logic (decision-tree-like representation).
Real-time pipeline: Takes pathology and biomarker inputs and returns regimen recommendations with explanations.

This separation ensures that guideline processing never slows down inference, and it reduces runtime complexity.
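
As a rough sketch of what the batch pipeline could hand to the real-time pipeline, the snippet below models a guideline pathway as a small tree and walks it with patient facts. The `Node` and `Leaf` types, the fact keys, and the placeholder regimen names are hypothetical; the source describes only a "decision-tree-like representation" and does not specify its schema.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    regimens: tuple[str, ...]  # candidate regimens at the end of a path


@dataclass
class Node:
    fact: str                    # key to look up in the patient facts
    branches: dict[str, "Tree"]  # answer -> subtree


Tree = Union[Node, Leaf]

# Placeholder pathway; names are illustrative, not clinical content.
TREE = Node("line_of_therapy", {
    "first": Node("biomarker_x", {
        "positive": Leaf(("regimen_A",)),
        "negative": Leaf(("regimen_B", "regimen_C")),
    }),
    "second": Leaf(("regimen_D",)),
})


def traverse(tree, facts):
    """Walk the tree and return (regimens, trace).

    A missing or unexpected fact stops the walk with an empty result
    instead of falling through to a default branch.
    """
    trace = []
    node = tree
    while isinstance(node, Node):
        answer = facts.get(node.fact)
        if answer not in node.branches:
            trace.append(f"{node.fact}={answer!r}: no matching branch")
            return [], trace
        trace.append(f"{node.fact}={answer}")
        node = node.branches[answer]
    return list(node.regimens), trace
```

The trace doubles as an audit record: every returned shortlist comes with the exact sequence of guideline decisions that produced it.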

2) Chose a model + platform built for production reasoning

The solution was implemented in Azure AI Foundry using GPT-4.1, chosen to balance reasoning quality and latency.

3) Replaced vector search with deterministic retrieval

Instead of vector search, the system uses entity matching so guideline context retrieval is predictable and reproducible.

This was a deliberate design choice to reduce ambiguity and make behavior auditable.
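
One way to picture entity-matched retrieval is a lookup from recognized entities to guideline section identifiers. The sketch below is an assumption-laden illustration: the `ENTITY_INDEX` contents, the section IDs, and the whole-token matching rule are hypothetical and stand in for whatever index the batch pipeline actually produces.

```python
import re

# Hypothetical index built offline: entity -> guideline section IDs.
ENTITY_INDEX = {
    "egfr": ["sec-egfr-first-line", "sec-egfr-progression"],
    "pd-l1": ["sec-immunotherapy"],
    "ecog": ["sec-performance-status"],
}


def retrieve_sections(report_text: str) -> list[str]:
    """Return section IDs for every indexed entity found as a whole token.

    The same text always yields the same sections in the same order,
    because there is no similarity score or ranking involved.
    """
    text = report_text.lower()
    hits = []
    for entity in sorted(ENTITY_INDEX):  # sorted for a stable order
        # Require non-alphanumeric boundaries so "vegfr" does not match "egfr".
        pattern = r"(?<![a-z0-9])" + re.escape(entity) + r"(?![a-z0-9])"
        if re.search(pattern, text):
            hits.extend(ENTITY_INDEX[entity])
    return hits
```

Unlike an embedding lookup, a miss here is explicit: if no indexed entity appears in the report, the function returns an empty list and the caller can flag the case rather than proceed with loosely related context.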

4) Pre-extracted pathway constraints to reduce inference load

All conditions, prior checks, and biomarkers referenced in the guideline pathways were extracted upfront so real-time inference is faster and less error-prone.

5) Designed an 8-step inference agent to mimic clinician workflow

The inference agent follows an 8-step sequence that mirrors how an oncologist walks through eligibility and exclusions so it does not miss constraints across lines of treatment.

6) Added a reflective “generation + validation” loop

A connected pair of agents runs in a reflective pattern:

one generates the regimen shortlist and its reasoning
the other checks for gaps, missed constraints, and unsafe leaps
the validator then forces iterative correction until the output satisfies the rules
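
The generate-and-validate pattern described above can be sketched as a bounded loop. To keep the example runnable without a model, the generator here is a stub and the validator is a plain rule check rather than a second agent; the function names, the `allowed` and `excluded` sets, and the round limit are all assumptions for illustration.

```python
def validate(shortlist, allowed, excluded):
    """Rule-based check; returns a list of issues (empty means pass)."""
    issues = []
    for name in shortlist:
        if name not in allowed:
            issues.append(f"{name} is not in the retrieved guideline context")
        elif name in excluded:
            issues.append(f"{name} violates a patient constraint")
    return issues


def reflect(generate, allowed, excluded, max_rounds=3):
    """Call generate(feedback) until validation passes or rounds run out."""
    feedback = []
    for round_no in range(1, max_rounds + 1):
        shortlist = generate(feedback)
        feedback = validate(shortlist, allowed, excluded)
        if not feedback:
            return shortlist, round_no
    # Fail loudly instead of returning an unvalidated shortlist.
    raise RuntimeError(f"unresolved after {max_rounds} rounds: {feedback}")


def stub_generator(feedback):
    """Stand-in for the model call: drops any regimen named in the feedback."""
    draft = ["regimen_A", "regimen_X", "regimen_B"]
    flagged = {issue.split(" ")[0] for issue in feedback}
    return [r for r in draft if r not in flagged]
```

The design point is the exit condition: the loop either returns a shortlist that passed every rule or raises, so an output that failed validation never reaches the oncologist.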

Outcomes We Achieved

The result is a pathway-driven agentic AI system that turns unstructured oncology guideline PDFs into structured, deterministic treatment logic and produces explainable regimen recommendations from messy real-world inputs.

Delivered capabilities included:

Standardized guideline ingestion from PDFs into decision logic usable by the model
Deterministic context retrieval via entity matching (more reproducible behavior)
Guardrails and validation flow to reduce hallucination risk
Support for inconsistent biomarker formats and high-combination scenarios
Explainability: reasons for both selection and rejection across regimen options
Coverage of real clinical modifiers like performance status, prior exposure, resistance, and drug family naming variation


Key Takeaways

Agentic systems in clinical settings must prioritize determinism and guardrails over “semantic flexibility.”

Splitting offline guideline structuring from online inference improves both safety and latency.

Explainability isn’t a UI feature. It must be designed into the reasoning workflow from step one.

When the “source of truth” is a PDF guideline, the win isn’t RAG. The win is turning the guideline into a structured pathway that the model can traverse with constraints, validation, and traceable decisions.
