From Logon to Detection: AI-Powered Fingerprints with PLoB and Foundation-Sec

Shannon Davis

Over half of cybersecurity breaches now involve valid credentials — not zero-days or phishing payloads. Just logins. And the adversaries using them? They're staying undetected for months inside sensitive environments.

We built PLoB (Post-Logon Behaviour Fingerprinting and Detection) to tackle this problem directly — using AI-driven fingerprinting, graph analysis, vectorization, and contextual analysis via LLMs to flag risky behavior immediately after logon, before the real damage begins.

The primary model behind PLoB is our new Foundation-Sec-8B-Instruct, designed for security practitioners, researchers, and developers building AI-powered security workflows and applications.
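For readers who want to experiment, here is a minimal sketch of loading the model with Hugging Face transformers. The Hub repo ID (fdtn-ai/Foundation-Sec-8B-Instruct) and the generation settings are our assumptions, not official guidance:

```python
# Minimal sketch: loading Foundation-Sec-8B-Instruct via Hugging Face
# transformers. Repo ID and settings are assumed, not official guidance.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "fdtn-ai/Foundation-Sec-8B-Instruct"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user",
     "content": "Summarize the risk of certutil.exe -urlcache usage."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```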

Why This Matters: Credential Abuse is the #1 Breach Vector

According to the Cisco Talos IR Trends report for Q1 2025, over 50% of incidents involved the use of valid credentials. The Verizon DBIR attributes 22% of breaches to credential abuse. Mandiant's M-Trends report shows stolen credentials now surpass phishing as an initial access vector. And Splunk SURGe's Macro-ATT&CK perspective report shows that valid account use by adversaries shot up by 17% between 2023 and 2024.

These aren’t just statistics. They reflect a shift in how APTs operate: living off the land with legitimate credentials, evading traditional rules, heuristics, and baselines.

The Stack: From Flat Logs to Fingerprint Search

PLoB turns raw security data (e.g., from Splunk) into a rich session graph using the Neo4j graph database, modeling relationships between users, hosts, logons, and spawned processes. This lets us query post-logon behavior as a structured narrative — not a list of flat events.
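As an illustration, loading one session into the graph might look like the sketch below; the labels, relationship types, and properties are our own illustrative schema, not PLoB's exact data model:

```python
# Minimal sketch of a session-graph load, assuming a local Neo4j instance.
# The (User)-[:LOGGED_ON]->(Logon)-[:ON]->(Host) and (Logon)-[:SPAWNED]->
# (Process) schema is illustrative naming, not PLoB's exact model.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

CYPHER = """
MERGE (u:User {name: $user})
MERGE (h:Host {name: $host})
MERGE (l:Logon {id: $logon_id})
SET l.time = $time
MERGE (u)-[:LOGGED_ON]->(l)
MERGE (l)-[:ON]->(h)
WITH l
UNWIND $processes AS p
MERGE (proc:Process {guid: p.guid})
SET proc.image = p.image, proc.cmdline = p.cmdline
MERGE (l)-[:SPAWNED]->(proc)
"""

def load_session(tx, event):
    # One call per logon event; spawned processes ride along as a parameter.
    tx.run(CYPHER, **event)

with driver.session() as session:
    session.execute_write(load_session, {
        "user": "svc_backup", "host": "FS01", "logon_id": "0x3e7a1",
        "time": "2025-05-01T02:14:07Z",
        "processes": [{"guid": "p1", "image": "cmd.exe",
                       "cmdline": "cmd /c whoami"}],
    })
```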

We then generate a fingerprint: a compact, human-readable summary of session behavior (a code sketch follows the example below), highlighting signals like:

  • Novel commands
  • Machine-speed execution
  • Suspicious fan-out patterns

[Figure: Fingerprint text]
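A minimal sketch of how such a summary might be assembled from features pulled out of the session graph; the field names and phrasing are illustrative:

```python
# Illustrative sketch: building a human-readable fingerprint from session
# features. Field names and phrasing are our own, not PLoB's exact format.
from statistics import mean

def build_fingerprint(session: dict) -> str:
    deltas = session["exec_deltas"]  # seconds between successive process starts
    parts = [
        f"User {session['user']} logged on to {session['host']}.",
        f"Spawned {len(session['commands'])} processes: "
        + ", ".join(session["commands"]) + ".",
        f"Mean inter-process delta: {mean(deltas):.2f}s.",
    ]
    if session["novel_commands"]:
        parts.append("Novel commands for this user: "
                     + ", ".join(session["novel_commands"]) + ".")
    if session["fanout_hosts"] > 1:
        parts.append(f"Fan-out to {session['fanout_hosts']} downstream hosts.")
    return " ".join(parts)
```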

This fingerprint is converted into a vector via OpenAI's text-embedding-3-large model, then stored in Milvus, a high-speed vector database.

[Figure: Fingerprint vector]

We use text-embedding-3-large for its capacity to encode security nuance across 3,072 dimensions, enabling subtle distinctions between legitimate admin scripts and their weaponized twins. It's also natively compatible with Foundation AI's agent memory store and vector-native workflows.
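A sketch of the embed-and-store step, assuming an OpenAI API key and a running Milvus instance; the collection name and payload fields are illustrative:

```python
# Sketch of embed-and-store, assuming an OpenAI key and a Milvus instance
# with a "plob_sessions" collection (dim=3072) already created.
from openai import OpenAI
from pymilvus import MilvusClient

oai = OpenAI()
milvus = MilvusClient(uri="http://localhost:19530")

fingerprint = "User svc_backup logged on to FS01. ..."  # from build_fingerprint()

vec = oai.embeddings.create(
    model="text-embedding-3-large",  # 3,072-dimensional output
    input=fingerprint,
).data[0].embedding

milvus.insert(
    collection_name="plob_sessions",  # assumed collection name
    data=[{"vector": vec, "fingerprint": fingerprint, "session_id": "0x3e7a1"}],
)
```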

The Detection Strategy: Anomalies from Embeddings

Every new session vector is compared to historical behavior using cosine similarity (sketched in code below):

  • Outlier Hunt: If similarity < 0.92, it's behaviorally unique (e.g., novel living-off-the-land tools)
  • Cluster Hunt: If similarity > 0.99, it’s unnaturally repetitive (e.g., bot activity)

[Figure: Anomaly Detection]
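In code, the two hunts reduce to a nearest-neighbor similarity check. The sketch below compares raw vectors directly; in practice the nearest neighbor would come from a Milvus search configured with metric_type="COSINE", and the thresholds are the example values quoted above:

```python
# Sketch of the Outlier/Cluster hunts over raw vectors. Thresholds are the
# article's example values and are tunable per environment.
import numpy as np

OUTLIER_THRESHOLD = 0.92
CLUSTER_THRESHOLD = 0.99

def cosine(a, b) -> float:
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify(new_vec, history_vecs) -> tuple[str, float]:
    if not history_vecs:
        return "outlier", 0.0  # nothing to compare against yet
    best = max(cosine(new_vec, h) for h in history_vecs)
    if best < OUTLIER_THRESHOLD:
        return "outlier", best  # behaviorally unique
    if best > CLUSTER_THRESHOLD:
        return "cluster", best  # unnaturally repetitive
    return "normal", best
```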

The Problem: When Malicious Looks Normal

Our early attempts at fingerprinting failed to flag attacks using certutil.exe and schtasks.exe — because they blended in with admin activity. A malicious session scored 0.97 similarity to a legitimate one.

The fix? Prioritizing the signal. We now front-load suspicious behavior in the fingerprint summary. Instead of passively describing activity, we tell the embedding model what matters.

Example:

Key Signals: Novel commands for this user: certutil.exe -urlcache... | Extremely rapid execution (mean delta: 0.08s)

This shift dramatically lowered similarity scores for malicious sessions and improved detection fidelity — turning AI from a passive summarizer into an active threat hunter.
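A sketch of this signal-first assembly, building on the earlier fingerprint example; the 0.5-second "machine-speed" cutoff is an assumed illustrative value, not PLoB's tuned threshold:

```python
# Sketch of the signal-first fingerprint: suspicious findings are prepended
# so the embedding weights them most heavily. The 0.5s cutoff is assumed.
from statistics import mean

def build_fingerprint_v2(session: dict) -> str:
    signals = []
    if session["novel_commands"]:
        signals.append("Novel commands for this user: "
                       + ", ".join(session["novel_commands"]))
    mean_delta = mean(session["exec_deltas"])
    if mean_delta < 0.5:
        signals.append(f"Extremely rapid execution "
                       f"(mean delta: {mean_delta:.2f}s)")
    summary = build_fingerprint(session)  # descriptive text from earlier sketch
    if signals:
        return "Key Signals: " + " | ".join(signals) + "\n" + summary
    return summary
```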

The Analyst Loop: AI Agent Escalation

Once anomalies are identified, they're passed to our Foundation AI and OpenAI agents for context-aware analysis. We use the new Foundation-Sec-8B-Instruct model alongside OpenAI's gpt-4o to provide this analysis. Each session comes with a scoped prompt:

  • Outlier prompt: Why is this session so unique?
  • Cluster prompt: Why is this session so repetitive?

The model returns structured JSON containing a risk score, key indicators, and analyst-friendly reasoning — giving defenders a head start without requiring full triage.

[Figure: Foundation-Sec Session Analysis]
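As a rough sketch, the escalation call for an outlier might look like the following, using OpenAI's JSON mode with gpt-4o; the prompt wording and JSON fields are illustrative, and the same prompt could just as well be routed to a locally hosted Foundation-Sec-8B-Instruct:

```python
# Sketch of the outlier escalation call. Prompt text and JSON schema are
# illustrative, not PLoB's exact prompts.
import json
from openai import OpenAI

client = OpenAI()

OUTLIER_PROMPT = """You are a SOC analyst. This session scored {score:.2f}
cosine similarity to its nearest historical neighbor. Explain why it is so
unique. Fingerprint:
{fingerprint}

Respond as JSON with keys: risk_score (0-100), key_indicators (list), reasoning."""

def escalate_outlier(fingerprint: str, score: float) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # force structured JSON output
        messages=[{"role": "user",
                   "content": OUTLIER_PROMPT.format(score=score,
                                                    fingerprint=fingerprint)}],
    )
    return json.loads(resp.choices[0].message.content)
```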

Early Results: High Sensitivity, Low Noise

  • Synthetic malicious sessions now fall below the 0.92 similarity threshold
  • Closest matches are other malicious sessions — not benign baselines
  • Outlier/Cluster split improves both coverage and analyst trust

Thresholds are tunable by environment, and future work will explore auto-tuning based on score distribution and data drift.

A Portable Detection Framework

Though PLoB currently focuses on Windows logs, its logic is platform-agnostic. The core insight is:

Identity + Actions = Behavior

We’re already exploring how this same fingerprinting pipeline can extend to:

  • Cloud: AWS CloudTrail, Azure logs
  • Linux: auditd, process trees
  • Network gear: CLI behavior from routers/firewalls
  • SaaS: Office 365, Salesforce audit trails

What’s Next for PLoB

  • Human-in-the-loop feedback to refine model risk scoring
  • Graph Neural Networks to learn attack paths directly from Neo4j
  • Universal session fingerprinting for hybrid environments

Read the Full Breakdown

At Foundation AI, our goal is to make security AI both interpretable and operational. PLoB fits squarely into that mission — combining graph-native behavioral modeling with lightweight LLM agents that don’t just detect threats, but explain them. We believe this kind of context-aware fingerprinting will be critical in a world where threats increasingly hide behind the façade of normality.

Want the full technical breakdown, session examples, and detection results?

Read the full blog on Splunk.com and watch a video where I demonstrate PLoB on the Foundation AI YouTube channel.