The Fast and the Label-Curious: Off-the-shelf classification with Foundation-Sec-8B

Supriti Vijay

Security operations demand continuous classification across multiple taxonomies - SOC analysts classify thousands of alerts daily, threat intelligence teams categorize TTPs and attribution, vulnerability managers assess CVSS scores and remediation priorities. Traditional classification approaches require extensive fine-tuning, labeled datasets, and computational overhead that security teams cannot afford.

Foundation-Sec-8B offers a different approach: leveraging the base model's cybersecurity training to perform classification without fine-tuning, using perplexity-based scoring that transforms next-token prediction into structured classification outputs.

If you'd like to follow along with code or prefer a more hands-on walkthrough, check out the classification cookbook.

Transforming Base Models into Classifiers

Foundation-Sec-8B is trained to predict the next token in cybersecurity text, not to follow instructions. Yet this capability enables effective classification through a key insight: base language models implicitly encode classification knowledge through their training objective.

When Foundation-Sec-8B processes "Critical buffer overflow vulnerability in Apache HTTP server," it has learned statistical patterns indicating this text should continue with vulnerability-related terms rather than phishing or malware indicators. We extract this knowledge by measuring how naturally the model predicts different classification labels as continuations.

This transforms classification from "train a model to output categories" to "measure which category the model finds most natural." No fine-tuning required - just intelligent probing of existing cybersecurity knowledge.

Why Perplexity Drives Classification

Perplexity measures how surprised a language model is by text sequences. When calculating perplexity for "Critical buffer overflow...vulnerability" versus "Critical buffer overflow...phishing," the model shows much lower surprise for the correct continuation.

\text{Perplexity} = \exp\left( -\frac{1}{N} \sum_{i=1}^{N} \log P(\text{token}_i \mid \text{context}) \right)
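As a quick sanity check on the formula, here is a minimal sketch that computes perplexity from a list of per-token natural-log probabilities. The function name `perplexity` and the sample values are illustrative assumptions, not part of the cookbook.

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a token sequence from its natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # Average log-probability per token, negated and exponentiated.
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(-mean_logprob)

# Three tokens, each predicted with probability 0.5 (made-up values):
# the model is as uncertain as a fair coin flip, so perplexity is 2.
coin_flip_tokens = [math.log(0.5)] * 3
print(round(perplexity(coin_flip_tokens), 6))  # 2.0
```

A perplexity of 1.0 means the model predicted every token with certainty; larger values mean more surprise.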

Foundation-Sec-8B's training on threat intelligence, vulnerability databases, and security documentation creates strong statistical associations. Appending potential labels and measuring perplexity asks: "Given your cybersecurity training, how natural does this classification seem?"

The method extends naturally to multi-token labels: whether the label is phishing or privilege_escalation, we compute perplexity over the entire appended label sequence. This matters for security taxonomies that use descriptive labels such as "lateral_movement" or "credential_access."
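The sketch below illustrates one way to score multi-token labels, assuming you already have per-token log-probabilities for the prompt plus each candidate label. The helper `label_perplexity` and all log-probability values are hypothetical and chosen only for illustration; in practice they would come from the model.

```python
import math

def label_perplexity(sequence_logprobs, prompt_length):
    """Perplexity over the label tokens only, skipping the prompt tokens."""
    label_logprobs = sequence_logprobs[prompt_length:]
    if not label_logprobs:
        raise ValueError("no label tokens after the prompt")
    return math.exp(-sum(label_logprobs) / len(label_logprobs))

# Hypothetical log-probabilities: the prompt is shared, the labels differ
# in both content and token count.
prompt_logprobs = [-2.0, -1.5, -0.7]
candidates = {
    "phishing": prompt_logprobs + [-3.2],                         # 1 label token
    "privilege_escalation": prompt_logprobs + [-1.1, -0.4, -0.3], # 3 label tokens
}

perplexity_scores = {
    label: label_perplexity(logprobs, len(prompt_logprobs))
    for label, logprobs in candidates.items()
}
best_label = min(perplexity_scores, key=perplexity_scores.get)
print(best_label)
```

Averaging per token before exponentiating keeps a longer label from being penalized simply for having more tokens.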

Converting Perplexity to Probabilities

Raw perplexity scores are inverse indicators - lower perplexity means higher confidence. We convert to intuitive probabilities using inverse relationships:

raw_probabilities = {
    label: 1.0 / perp if perp != float('inf') else 0.0 
    for label, perp in perplexity_scores.items()
}

If label A has perplexity 2.0 and label B has perplexity 4.0, A receives twice the raw score of B (1.0/2.0 = 0.5 vs. 1.0/4.0 = 0.25). Dividing each raw score by their sum then yields a probability distribution (about 0.67 and 0.33 here) that preserves the relative ordering.
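To make the conversion concrete, here is a minimal sketch of the inverse-and-normalize step. The function name `perplexities_to_probabilities` is hypothetical, and the uniform fallback for the all-infinite case is an assumption of this sketch rather than documented behavior.

```python
def perplexities_to_probabilities(perplexity_scores):
    """Turn per-label perplexities into a normalized probability distribution."""
    if not perplexity_scores:
        return {}
    # Inverse perplexity: lower perplexity gives a higher raw score.
    raw = {
        label: (1.0 / perp if perp != float("inf") else 0.0)
        for label, perp in perplexity_scores.items()
    }
    total = sum(raw.values())
    if total == 0.0:
        # Assumption: if every label is infinitely surprising, fall back to uniform.
        return {label: 1.0 / len(raw) for label in raw}
    return {label: score / total for label, score in raw.items()}

normalized_probabilities = perplexities_to_probabilities({"A": 2.0, "B": 4.0})
print({label: round(p, 3) for label, p in normalized_probabilities.items()})
# {'A': 0.667, 'B': 0.333}
```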

Collision-Based Confidence Scoring

Classification confidence requires measuring probability distribution concentration. Collision probability quantifies this through squared probability sums:

confidence_score = sum(p**2 for p in normalized_probabilities.values())

\text{Collision Confidence} = \sum_{i=1}^{n} P(\text{label}_i)^2

For a uniform distribution across 5 categories (0.2 each), collision confidence equals 0.2. A concentrated prediction (0.8 on "malware," 0.05 on each of the other four) yields 0.65, indicating much higher confidence. The measure scales naturally with distribution sharpness.
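The two worked numbers can be reproduced with a few lines of Python. The function name `collision_confidence` is illustrative; the label set mirrors the five labels used elsewhere in this post.

```python
def collision_confidence(probabilities):
    """Sum of squared probabilities over a normalized label distribution."""
    return sum(p ** 2 for p in probabilities.values())

labels = ["malware", "phishing", "vulnerability", "incident_response", "threat_intelligence"]

# Uniform: the model has no preference among the five labels.
uniform = {label: 0.2 for label in labels}

# Concentrated: 0.8 on "malware", 0.05 on each of the other four.
concentrated = {label: 0.05 for label in labels}
concentrated["malware"] = 0.8

print(round(collision_confidence(uniform), 3))       # 0.2
print(round(collision_confidence(concentrated), 3))  # 0.65
```

Note that the score can never drop below 1/n for n labels and reaches 1.0 only when all probability sits on a single label, so any confidence threshold should be chosen with the size of the label set in mind.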

This approach works because language model training creates implicit classification boundaries. Here, Foundation-Sec-8B’s training leads it to statistically prefer certain continuations in specific contexts. "APT29 group observed using" strongly predicts threat intelligence terms rather than compliance vocabulary. Perplexity-based classification is able to extract this knowledge without additional training.

Practical Usage Example

# Initialize classifier with security labels
classifier = PerplexityClassifier(
    model_name="fdtn-ai/Foundation-Sec-8B",
    labels=["malware", "phishing", "vulnerability", "incident_response", "threat_intelligence"],
    run_quantized=True
)

# Example classification
text = "Critical buffer overflow vulnerability discovered in Apache HTTP server"
result = classifier.classify(text)

print(f"Predicted: {result.predicted_label}")
print(f"Confidence: {result.confidence_score:.3f}")
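# Report the perplexity of the predicted label rather than whichever label happens to be first
print(f"Perplexity: {result.perplexity_scores[result.predicted_label]:.3f}")

Output: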

Predicted: vulnerability
Confidence: 0.486
Perplexity: 2.847

Few-Shot Enhancement

Foundation-Sec-8B's cybersecurity pretraining reduces few-shot requirements, but examples improve consistency:

few_shot_examples = [
    ("SQL injection flaw found in web application", "vulnerability"),
    ("Fake Microsoft login page sent to employees", "phishing"),
    ("Ransomware encrypted file servers", "malware")
]

result = classifier.classify(text, few_shot_examples=few_shot_examples)

Few-shot examples guide the model toward consistent output formatting while leveraging its domain knowledge for accurate classification.

Multi-Label Classification

def classify_multi_domain(self, text: str) -> Dict[str, ClassificationResult]:
    """Classify across multiple security domains simultaneously"""
    domains = {
        "threat_type": ["malware", "phishing", "vulnerability", "insider_threat"],
        "severity": ["critical", "high", "medium", "low"],
        "mitre_tactic": ["initial_access", "execution", "persistence", "privilege_escalation"]
    }
    
    results = {}
    for domain, labels in domains.items():
        self.set_labels(labels)
        results[domain] = self.classify(text)
    
    return results

This approach classifies the same text against several security taxonomies in turn, reusing the same loaded base model. Each taxonomy still requires its own scoring passes, one per label.

Implementation Best Practices

  • Prompt Formatting:

    Structure prompts to leverage Foundation-Sec-8B's cybersecurity training:

    def _create_classification_prompt(self, text: str, few_shot_examples: Optional[List[Tuple[str, str]]] = None) -> str:
        prompt_parts = [
            "This is a cybersecurity text classification task.",
            f"Available labels: {', '.join(self.labels)}",
            "Choose the most appropriate label for the given text.\n"
        ]
        
        if few_shot_examples:
            prompt_parts.append("Examples:")
            for example_text, example_label in few_shot_examples:
                prompt_parts.append(f'Text: """{example_text}"""')
                prompt_parts.append(f"Chosen label: {example_label}\n")
        
        prompt_parts.extend([f'Text: """{text}"""', "Chosen label:"])
        return "\n".join(prompt_parts)
    
  • Confidence Thresholding:

    Implement automated vs. manual review routing:

    def classify_with_routing(self, text: str, confidence_threshold: float = 0.5) -> Dict:
        result = self.classify(text)
        return {
            "prediction": result.predicted_label,
            "confidence": result.confidence_score,
            "automated": result.confidence_score >= confidence_threshold,
            "requires_review": result.confidence_score < confidence_threshold
        }
    

The Good, The Bad, and the Practical

Perplexity-based classification works surprisingly well out of the box - but like any method, it has tradeoffs.

Why it Works Well

  • No fine-tuning loop: You get usable classifications straight from the base model by probing its cybersecurity knowledge.
  • Label control by design: The model chooses only from the provided set, minimizing off-target predictions.
  • Confidence made simple: Collision scores offer a lightweight, interpretable proxy for prediction certainty.
  • Best suited for focused taxonomies: Works efficiently when you care about a small number of well-defined categories.

When to Think Twice

  • Takes time: Each label is a separate model pass, so more labels = more latency. Not ideal for high-speed pipelines.
  • Fine-grained taxonomies need more: If you're classifying across hundreds of categories (e.g., full MITRE ATT&CK matrix), this gets inefficient fast.
  • It’s clever, but not clairvoyant: Domain-specific edge cases may still benefit from supervised learning.
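To get a feel for the latency tradeoff, the following back-of-the-envelope sketch counts forward passes under the assumption stated above: one model pass per label, per text, with no batching or caching. The function name `forward_passes` is hypothetical, and the taxonomy sizes are taken from the multi-domain example earlier in this post.

```python
def forward_passes(num_texts, labels_per_taxonomy):
    """Model passes needed when every label is scored separately for every text."""
    return num_texts * sum(labels_per_taxonomy.values())

# Four labels in each of the three taxonomies from the multi-domain example.
taxonomy_sizes = {"threat_type": 4, "severity": 4, "mitre_tactic": 4}

print(forward_passes(1, taxonomy_sizes))     # 12
print(forward_passes(1000, taxonomy_sizes))  # 12000
```

The count grows linearly with both the number of texts and the total number of labels, which is why taxonomies with hundreds of categories become expensive quickly.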

Want to Go Big? Use Fine-Tuning.

If your use case involves large taxonomies, tighter latency budgets, or domain-specific behavior, fine-tuning is the way to go. We’ve included a notebook for it here.

And for those who want a deeper dive, we’ll be releasing a blog post soon walking through the fine-tuning process in more detail. Until then, the notebook has everything you need to get started.

Conclusion

Foundation-Sec-8B enables effective security classification through perplexity-based scoring that leverages the model's cybersecurity domain expertise. The collision confidence mechanism provides a lightweight, interpretable confidence signal for operational deployment, while 4-bit quantization maintains performance and cuts weight memory by roughly 75% relative to 16-bit precision.

This implementation approach transforms Foundation-Sec-8B's text generation capabilities into structured classification outputs suitable for security operations, threat analysis, and automated incident response workflows.

Start implementing with our classification cookbook and deploy Foundation-Sec-8B from Hugging Face for immediate security classification capabilities.