Beyond Zero-Shot: Fine-Tuning Foundation-Sec-8B for Enhanced Cybersecurity Classification
Welcome back to our series introducing the wide variety of capabilities of Foundation-Sec-8B! If you've been following along, you've already seen the blog post introducing our cookbook and explaining how it works, watched our quick demo video, and explored the remarkable agility of off-the-shelf classification with Foundation-Sec-8B. That "zero-shot" magic, where the model can classify data without explicit prior training on that specific task, is truly a game-changer for rapid insights. But what happens when we need a more advanced and specialized solution?
When the Standard Playbook Isn't Enough: The Imperative for Fine-Tuning
While zero-shot classification offers incredible speed and versatility, there are use cases where its limitations become apparent. Imagine trying to categorize an ocean of cybersecurity events where the sheer number of distinct classes is overwhelming. Or perhaps you're building a system where every millisecond of latency matters. In these critical scenarios, a general-purpose, zero-shot approach might not deliver the precision or speed you need.
The core reason often boils down to this: zero-shot models cannot learn new concepts or adapt to highly specialized use cases. They rely on their pre-existing, broad understanding. If your cybersecurity use case is particularly unique, perhaps involving novel attack vectors or proprietary system logs, the model may be unfamiliar with the specific jargon and patterns required for top-tier performance. This is why fine-tuning can be essential: it transforms a powerful generalist into a highly specialized expert.
Crafting a Specialist: Two Paths to Fine-Tuning Foundation-Sec-8B
So, how do we embark on this journey of specialization? We've explored two distinct methods of fine-tuning Foundation-Sec-8B, each offering a unique set of strengths. For those eager to dive into the practicalities, you'll find detailed code examples for both approaches in the finetuning section of the cookbook. Let's unpack both below.
Fine-Tuning as CausalML: The Generative Classifier
Our first approach might seem counter-intuitive at first glance. Instead of explicitly re-training the model for classification, we continue to treat it as a text generator. The trick here is that the model learns to generate the class label as part of its output, effectively classifying by generating.
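To make the idea concrete, here is a minimal sketch of how training examples for this style of fine-tuning might be assembled. The prompt template and field names are illustrative assumptions, not the exact ones used in the notebook:

```python
# Minimal sketch: turn (description, category, label) triples into plain
# text so a causal language model learns, via the ordinary next-token
# objective, to emit the label as its continuation.
# The template below is an illustrative assumption.

def build_training_text(description: str, category: str, label: str) -> str:
    prompt = (
        f"Description: {description}\n"
        f"Category: {category}\n"
        f"Label:"
    )
    # During training the model sees prompt + " " + label;
    # at inference time it is given only the prompt and generates the label.
    return f"{prompt} {label}"

example = build_training_text(
    description="A use-after-free in the CEC driver allows local attackers ...",
    category="Attack Vector",
    label="Local",
)
print(example)
```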
The beauty of this method lies in its simplicity; it's often easier to prepare data and get the training process up and running. The trained model also retains its generative capabilities, meaning it could potentially be repurposed for other text generation tasks beyond the specific classification it was fine-tuned for, offering flexibility over the long term.
However, this flexibility comes with its own set of challenges. Because the model is still fundamentally a text generator, there's a possibility it might generate outputs that fall outside your predefined set of class labels. It might also struggle to consistently learn the precise distinctions required for accurate classification, leading to performance that isn't always as sharp as one might hope.
For a deeper dive into this approach, explore our Finetuning as CausalML notebook. The notebook demonstrates classifying a cybersecurity description into a Common Vulnerability Scoring System (CVSS) vector using a modified CTI-VSP (Cyber Threat Intelligence Vulnerability Severity Prediction) dataset. For example, given the following description:
In the Linux kernel through 6.7.1, there is a use-after-free in cec_queue_msg_fh, related to drivers/media/cec/core/cec-adap.c and drivers/media/cec/core/cec-api.c.
the model is asked to classify the description into the correct label for the category in question: if the category is Attack Vector, the choices are Network, Adjacent, Local, or Physical, while if the category is Integrity Impact, the choices are None, Low, or High.
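As a rough sketch, generation-based inference with a fine-tuned checkpoint could look like the following. The checkpoint path and prompt template here are hypothetical placeholders, not the notebook's exact setup:

```python
# Sketch of classification-by-generation with a fine-tuned causal LM.
# The model path and prompt template are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "path/to/finetuned-foundation-sec-8b"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

description = (
    "In the Linux kernel through 6.7.1, there is a use-after-free in "
    "cec_queue_msg_fh, related to drivers/media/cec/core/cec-adap.c "
    "and drivers/media/cec/core/cec-api.c."
)
prompt = f"Description: {description}\nCategory: Attack Vector\nLabel:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)
# Keep only the newly generated tokens and strip surrounding whitespace.
label = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
).strip()
print(label)  # expected: one of Network / Adjacent / Local / Physical
```

Note that nothing here forces the generated text to be a valid label; guarding against out-of-set outputs is exactly the weakness discussed above.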
Fine-Tuning by Adding a Classification Head: The Dedicated Expert
Our second strategy is more aligned with traditional machine learning classification. Here, we leverage Foundation-Sec-8B's vast pre-trained knowledge as a powerful feature extractor. We then replace its final layer with a purpose-built classification layer. Think of it as attaching a specialized "brain" to the model, specifically designed to output class predictions.
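As a rough sketch, attaching such a head with the Hugging Face transformers library might look like the following, assuming the public checkpoint is fdtn-ai/Foundation-Sec-8B; the label set is an illustrative example rather than the one used in the notebook:

```python
# Sketch: load the pre-trained backbone with a fresh classification head.
# The label set is an illustrative assumption.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

base = "fdtn-ai/Foundation-Sec-8B"
labels = ["Network", "Adjacent", "Local", "Physical"]

tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Llama-style models ship without a pad token

model = AutoModelForSequenceClassification.from_pretrained(
    base,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
model.config.pad_token_id = tokenizer.pad_token_id
# The new head is randomly initialized, so the model must be fine-tuned
# (e.g., with the Trainer API) before its predictions mean anything.
```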
The advantages of this approach are compelling. Since the final layer is explicitly a classification layer, the outputs are guaranteed to be one of your specified classes, eliminating the ambiguity sometimes present with CausalML. Moreover, you can obtain calibrated probabilities for each class, providing a clear measure of the model's confidence in its predictions, a crucial feature for risk assessment in cybersecurity. This dedicated optimization can also translate into higher classification performance compared to the CausalML approach.
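Continuing the sketch above, per-class probabilities fall out of a softmax over the head's logits once the head has been fine-tuned:

```python
import torch

text = "A use-after-free in the CEC driver allows local attackers ..."
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_labels)
probs = torch.softmax(logits, dim=-1)[0]
for label, p in zip(labels, probs.tolist()):
    print(f"{label}: {p:.3f}")  # per-class confidence
```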
This specialization comes with certain trade-offs. The fine-tuned model becomes highly focused, making it less usable for tasks outside its specific classification purpose. If your number of classes changes, especially if new ones are added, you'll often need to retrain the model entirely. Furthermore, because the new classification layer starts from scratch, it can sometimes require more training data and longer training times to achieve optimal performance.
To explore the intricacies of this method, check out our Finetuning by adding Classification Head notebook. Using the CTI-HAL dataset, the notebook demonstrates how to classify cybersecurity descriptions into their MITRE ATT&CK IDs. For instance, given "This malware was capable of stealing significant system and network information," the model's role is to correctly assign it the ID T1082. It's worth noting that in our specific demonstration, the performance might not appear exceptionally high, largely because the dataset used for fine-tuning was relatively small.
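The same classification-head pattern applies when the label space is MITRE ATT&CK technique IDs; a hedged sketch, assuming a hypothetical checkpoint already fine-tuned on a small subset of technique IDs:

```python
# Sketch: classification-head inference over ATT&CK technique IDs.
# The checkpoint path is hypothetical; id2label comes from fine-tuning.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

ckpt = "path/to/attack-id-classifier"  # hypothetical fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSequenceClassification.from_pretrained(ckpt)

text = "This malware was capable of stealing significant system and network information"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    pred = model(**inputs).logits.argmax(dim=-1).item()
print(model.config.id2label[pred])  # ideally: T1082 (System Information Discovery)
```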
Choosing Your Fine-Tuning Strategy
Ultimately, the choice between these two fine-tuning methodologies depends on your specific requirements. Both approaches empower you to push the boundaries of what Foundation-Sec-8B can achieve, transforming it from a general-purpose language model into a highly effective, domain-specific tool tailored to your unique cybersecurity challenges. We encourage you to experiment with these methods, explore the provided resources, and discover how fine-tuning can unlock new levels of intelligence and efficiency in your security operations. Stay tuned for more insights as we continue to build out the Foundation-Sec-8B series!
You can find our models, including a recently released instruct fine-tuned model, on Hugging Face. Stay updated by following our X account.