Intelligent Data Automation with LLMs

Philipp Hartenfeller

$whoami

Philipp Hartenfeller

Düsseldorf, Germany
Oracle APEX Dev since 2016
Product Lead at United Codes
Web Dev and Databases

Blog: hartenfeller.dev/blog

Bluesky: @hartenfeller.dev

Agenda

(Shift + A: jump to Agenda, Shift + O: overview)

The pitch

LLMs are not just chatbots
They turn unstructured text and images into structured data
Perfect for relational databases
No Python, we can do it from PL/SQL
DB devs have perfect opportunities to become AI engineers

LLM quickstart

What is an LLM?

Large Language Model: generates text
Simple flow: text input → processing → text output
No memory, no learning — that all happens around the LLM
One specific instance is called a model; the companies that train them are providers

Conversations without memory?

The whole history gets sent again with every request
Request 1 → : 👱🏼 Capital France?
Request 1 ← : 🤖 Paris
Request 2 → : 👱🏼 Capital France? + 🤖 Paris + 👱🏼 River?
Request 2 ← : 🤖 Seine

Tokens & context window

Text is measured in tokens, not characters or words
~0.75 words per token in English
The context window = how much text the model can consider at once (history + system prompt + user message + response)
You pay per input token + per output token

Model providers

🇺🇸	🇨🇳	🇪🇺
OpenAI: GPT	Alibaba: Qwen	Mistral
Anthropic: Claude	DeepSeek
Google: Gemini	MoonshotAI: Kimi
xAI: Grok	MiniMax
Amazon: Nova

Models excel at different things — don't lock yourself into one provider.

First PL/SQL call

The inference API

You send a JSON array of messages, you get a text response back. This is just the messages part — the full request also includes model, and other parameters. Exact structure differs between providers.

Frameworks do the heavy lifting

You don't want to handle this manually for each provider
Python: LangChain, LlamaIndex
JavaScript: Vercel AI
Java: LangChain4j, Spring AI
Oracle: UC AI, Select AI (23ai/26ai native), APEX AI

Simplest possible call

Our demo company: ACME Insurance

Fictional mid-size insurance company
Offers home, auto, and life insurance products
Handles thousands of customer inquiries and claims daily
Uses Oracle APEX + PL/SQL as their core platform
We will follow ACME through several real-world LLM use cases

Domain Q&A from PL/SQL

Output

=== What is a deductible? ===
  A deductible in insurance is the amount of money you must pay
  out-of-pocket before your insurance company starts to cover the
  costs of a claim. For example, if your policy has a $500
  deductible and you file a claim for $2,000, you would pay the
  first $500 and the insurer would cover the remaining $1,500.
  Deductibles are common in auto, health, and homeowners insurance,
  and choosing a higher deductible typically lowers your premium.

Free-form text. Great for chat — but what about our relational world?

Structured output

The problem

Free-form text is unpredictable and hard to parse
For databases and automation we want structure
How do we tell the LLM exactly what shape we want back?

JSON schema in, JSON out

You provide a JSON schema describing the desired output
The model is constrained to return JSON matching that schema
JSON Schema is a pre-existing standard
Common use cases: data extraction, classification, content tagging, decision support

Schema for customer extraction

The PL/SQL call

Output

Reasoning

What is reasoning?

New capability in advanced LLMs (2024+)
Model thinks step-by-step before producing the final answer
Generates internal "thinking tokens" to decompose, explore, and validate
Effective for complex logic, multi-step analysis, math, code
Cost: 3–10x more tokens, 2–10s slower
Don't use for: simple tasks, real-time, cost-sensitive ops

Why fraud detection needs reasoning

No single red flag: risk comes from combining many weak signals
Cross-checking claim text, history, timing, amounts, locations
Weighing contradictions: plausible story vs. suspicious pattern
Decision has real cost: false positives anger customers, false negatives pay out fraud

Enable reasoning + structured output

A peek at the reasoning

=== MODEL REASONING (excerpt) ===
Claim is $9,800 against a $10,000 limit — that's 98% of coverage.
"Max-out" claims correlate strongly with fraud; flagging as a signal.

Policy started 18 days ago. Industry baseline for first claim is ~180+
days. Early claims after policy inception are a classic indicator.

No witnesses and no police report for a collision claim. Not proof of
fraud on its own — could be a minor parking incident — but combined
with the above, the pattern strengthens.

Customer risk score is 85 vs. typical 40-50 for this segment. Worth
weighting but shouldn't dominate the decision alone.

None of these is conclusive individually. Together: 3 strong signals +
1 elevated baseline → cannot auto-approve, but evidence is not strong
enough for outright denial. Right call is MANUAL_INVESTIGATION_REQUIRED
and route to SIU.

Output: high-risk claim

=== FRAUD RISK ASSESSMENT ===
  Overall Risk: HIGH
  Fraud Score: 82/100
  Recommendation: MANUAL_INVESTIGATION_REQUIRED

  Risk Factors Identified:
  1. [HIGH] Claim amount near coverage limit
  $9,800 claimed against $10,000 limit — fits "max-out" pattern.
  2. [HIGH] Very early claim after policy start
  Filed 18 days in vs. industry average of 180+ days.
  3. [MEDIUM] Elevated customer risk score (85 vs. typical 40-50)
  4. [MEDIUM] No witnesses, no police report

  Suggested Actions:
  1. Assign to special investigations unit (SIU)
  2. Request police report and any dashcam/road camera footage
  3. Cross-check prior claims across carriers via ISO ClaimSearch
  4. Hold payment pending investigation

File input

Multimodal models

Modern frontier models accept text and images, PDFs (some also audio/video)
Send a text prompt plus files; get a text answer back
PDFs: extracted as text and rendered as page screenshots
For CSV / Excel / DOCX: prepare locally as text and send as text
Token-priced like everything else; big images get scaled down automatically

Common use cases

Invoice processing: extract vendor, amount, date, line items
Receipt scanning for expense reports
Contract analysis: review and summarize
Diagram understanding: convert architecture diagrams to docs

The claim photo

Send a BLOB from your DB

Output (gpt-5.4)

=== VEHICLE DAMAGE ASSESSMENT ===
Vehicle Identification:
Brand (guess): Renault
Model (guess): Megane Scenic
Color: Black

Damaged Parts:
1. Engine bay / front structural [SEVERE] Est. $5,000
Engine compartment crushed; likely radiator and frame damage
2. Roof front / A-pillar area [SEVERE] Est. $2,500
Buckled from impact, possible structural deformation
3. Front suspension / alignment [SEVERE] Est. $2,200
4. Front fenders [SEVERE] Est. $1,400
5. Hood [SEVERE] Est. $1,200
6. Front bumper [SEVERE] Est. $900
7. Windshield [SEVERE] Est. $700
8. Headlights, grille, doors… Est. $2,750

Total Estimated Repair Cost: $16,650
Overall Severity: TOTAL_LOSS

Tools

Let the LLM call your PL/SQL functions

Generative vs agentic

So far: LLM produces text
What if the LLM could trigger an action — query a table, call an API, write data?
That's function calling, also known as tools
We define tools (name, description, parameter schema) and pass them with the request
The LLM asks us to call a tool with specific arguments — we execute it and return the result
Then it can ask again, or produce a final answer

The tool loop

Tool design principles

Names matter — clear, action-oriented: search_customers not tool_1
Descriptions are prompts — the LLM picks tools based on description quality
Right granularity — not too broad (do_everything), not too narrow (get_user_first_name)
Limit count — each definition costs input tokens; 20+ tools degrades accuracy
Filter per context — only send relevant tools (use tags)
Never raise — return descriptive error strings; LLMs recover and retry

Register a tool

Enable tools and ask

What happened under the hood

Message flow:
→ System prompt
→ User asked question
→ AI decided to call tool(s): DEMO_GET_CUSTOMER_INFO
→ System returned tool result for: DEMO_GET_CUSTOMER_INFO
→ AI provided final answer

AI Final Response:
Ashley Taylor (CUST-1000037), customer since 2022. Lives in
Phoenix, AZ. Risk score: 38 (low). 2 active policies:
• POL-AUTO-2024-00012 — Auto, $50k coverage
• POL-HOME-2024-00018 — Home, $400k coverage
Recent claims: 1 (status: closed, paid).

Tool calls made: 1

Agentic AI

End-to-end claim processing from a single email

From one tool to a suite

An agent = LLM + tools + loop
Give it a goal, it decides what to call and when
Multiple tools = multiple business operations — the model orchestrates them
Tip: pair with reasoning for complex workflows

The claim-handling agent

Four PL/SQL tools, one autonomous workflow:
DEMO_GET_POLICY_INFO — validate the policy exists and is active
DEMO_FILE_CLAIM — create the claim, return claim number
DEMO_ASSIGN_CLAIM — route to a specialist with appropriate priority
DEMO_SEND_EMAIL — confirm with the customer

One call, full workflow

What the agent did, on its own

Tool Calls Made by Agent:
───────────────────────────────────────
Tool: DEMO_GET_POLICY_INFO
Args: { "policy_number": "POL-HOME-2024-00001" }
Result: Active. Coverage $400k. Holder: Robert Anderson.

Tool: DEMO_FILE_CLAIM
Args: { "policy_number": "POL-HOME-2024-00001",
"claim_type": "Water Damage",
"description": "Basement flood from burst pipe..." }
Result: CLM-1000089 filed.

Tool: DEMO_ASSIGN_CLAIM
Args: { "claim_number": "CLM-1000089",
"specialization": "Property",
"priority": "HIGH" }
Result: Assigned to Maria Lopez (Property).

Tool: DEMO_SEND_EMAIL
Args: { "to_email": "robert.anderson@email.com",
"subject": "Claim CLM-1000089 received - urgent water
  damage",
"body": "Hi Robert, we've filed your claim..." }
Result: Email sent.

You only need PL/SQL

✓ Any modern Oracle DB
✓ An API key from a provider of your choice
✓ UC AI (open source) / APEX 26.1
✗ No Python · No MCP servers · No Autonomous DB

Resources

UC AI: github.com/united-codes/uc-ai
UC AI docs: united-codes.com/products/uc-ai/docs/
JSON Schema builder: json-schema builder
Model benchmarks: artificialanalysis.ai
Blog: hartenfeller.dev/blog

Thank you! — Q&A

philipp@united-codes.com

Slides + demos available after the talk