Intro: The End of Analysis Paralysis. Time for Pragmatism.
The world of AI is buzzing about large language models (LLMs). The hype can be intimidating and lead to “analysis paralysis”: the feeling that nothing meaningful can be achieved without access to immense computing power.
In my experience, the key is to match the tool precisely to the problem. Instead of immediately reaching for the heaviest, most resource-intensive solutions, it is often more effective to methodically analyze the problem and select the optimal technology. This is why the strategic choice of Small Language Models (SLMs) is becoming a very interesting alternative.
In this article, I will try to divide SLMs into practical “weight classes” and—more importantly—support it with real-world implementation examples I’ve dug up online.
Category 1: Featherweight (models ~270M – 3B parameters)
These are the most specialized and agile models. They run instantly and can be deployed almost anywhere, even on mobile devices. Think of them as a precise, dedicated tool: not universal, but unrivaled in their specific application.
- Practical Applications: Text classification, sentiment analysis, keyword extraction, simple FAQ chatbots.
- Examples from the Market:
- Google Gemma 270M: A model designed for task-specific fine-tuning, small enough to run on mobile devices, allowing for the creation of specialized assistants that work fully offline.
- IBM Granite-4.0-H-Micro (3B): IBM created this model for edge applications where speed and minimal resource consumption are critical.
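To make the “featherweight” use case concrete, here is a minimal sketch of how such a model is typically used for sentiment classification: a few-shot prompt with a constrained answer format. The model call itself is deliberately left out; you would send the resulting prompt to whatever local runtime you use (Ollama, LM Studio, llama.cpp). The example reviews and labels below are illustrative assumptions, not output from any real system.

```python
# Sketch: few-shot sentiment classification prompt for a small local model.
# The actual inference call is omitted -- only the prompt construction,
# which is where most of the task-specific work happens, is shown.

FEW_SHOT_EXAMPLES = [
    ("The delivery was fast and the support team was helpful.", "positive"),
    ("The app crashes every time I open the settings page.", "negative"),
]

def build_sentiment_prompt(text: str) -> str:
    """Build a few-shot classification prompt for a small local model."""
    lines = ["Classify the sentiment of the review as 'positive' or 'negative'.", ""]
    for example, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {example}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The trailing "Sentiment:" constrains the model to answer with a label.
    lines.append(f"Review: {text}")
    lines.append("Sentiment:")
    return "\n".join(lines)

prompt = build_sentiment_prompt("Great battery life, terrible camera.")
print(prompt)
```

A 270M–3B model fine-tuned on a few thousand such pairs usually no longer needs the few-shot examples at all, which is exactly why this class is so cheap to run at scale.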
Category 2: Versatile Tools (models ~4B – 13B parameters)
This is currently the “golden mean” and the most popular category. Models like Mistral 7B, Llama 3 8B, or Microsoft Phi-4 offer a fantastic cost-to-capability ratio. They can be seen as the “Swiss Army knife” of the AI arsenal: a reliable foundation for most typical business applications.
- Practical Applications: Advanced RAG systems (intelligent documentation search), marketing content generation, meeting summarization.
- Examples from the Market (I did some digging):
- Microsoft Phi-3 (3.8B) in the Healthcare Sector: Epic Systems, a giant in the US medical software market, has integrated the Phi-3 model. The goal? To improve patient support while maintaining full compliance with strict HIPAA privacy standards, as all sensitive data remains within their internal infrastructure. A perfect example of how SLMs address security concerns.
- Gemma 4B in Content Moderation: Adaptive ML, commissioned by the telecom SK Telecom, fine-tuned the Gemma 4B model for multilingual content moderation. The result? The specialized, small model not only matched but outperformed much larger, proprietary models in this specific task.
- Bielik 7B in the Polish Legal Field: The Polish model Bielik, created by the SpeakLeash Foundation and AGH University, shows its strength in niche applications. In legal document analysis tests, as reported by Deviniti, Bielik achieved a higher F1 score (0.95) than the global giant GPT-4 (0.89). This proves that a precisely fine-tuned, smaller model can beat a larger competitor in a specialized field. No wonder its capabilities are already being used by institutions like the Poznan Supercomputing and Networking Center and the Chancellery of the Prime Minister of Poland. I plan to write about Bielik separately, as I am very interested in this model myself; for now, I am eagerly waiting for it to be supported in tools like Ollama and LM Studio. More on those tools later.
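Since RAG comes up in every example in this class, here is a minimal sketch of its retrieval step. Real systems use an embedding model and a vector store; in this self-contained example, plain bag-of-words cosine similarity stands in for embeddings, and the final call to the SLM is omitted: the sketch stops at building the grounded prompt. The sample documents are invented for illustration.

```python
# Sketch: the retrieval step of a RAG pipeline, with word-overlap scoring
# standing in for a real embedding model so the example runs anywhere.
from collections import Counter
import math

def score(query: str, chunk: str) -> float:
    """Cosine similarity over bag-of-words counts (embedding stand-in)."""
    q, c = Counter(query.lower().split()), Counter(chunk.lower().split())
    dot = sum(q[w] * c[w] for w in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in c.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    return sorted(chunks, key=lambda ch: score(query, ch), reverse=True)[:k]

def build_rag_prompt(query: str, chunks: list[str]) -> str:
    """Assemble a prompt that grounds the SLM in the retrieved context."""
    context = "\n\n".join(retrieve(query, chunks))
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")

docs = [
    "Vacation requests must be submitted in the HR portal two weeks in advance.",
    "The cafeteria is open from 8:00 to 16:00 on weekdays.",
    "Unused vacation days expire at the end of the calendar year.",
]
print(build_rag_prompt("How do I submit a vacation request?", docs))
```

The point of the sketch: the 7B–13B model only ever sees the few most relevant chunks, which is why a mid-size SLM is usually enough for “intelligent documentation search” over even very large corpora.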
Category 3: Strategic Platforms (models ~27B – 90B parameters)
These are the most powerful of the small models, blurring the line with LLMs. They require more resources but offer deep reasoning and complex document analysis capabilities. They can be compared to a strategic command center that not only executes tasks but can also analyze a broader context and plan multi-step operations.
- Practical Applications: Analysis of multi-page contracts, building simple AI agents (function calling), advanced data analytics.
- Examples from the Market:
- IBM Granite 32B in Corporate Applications: IBM Consulting uses models from the Granite family as its default AI engines. They offer performance comparable to much larger systems at, according to IBM, drastically lower costs. The Granite vision model was trained on 85 million PDF documents, making it an expert in processing corporate documentation.
- Meta Llama (up to 90B) as a Business Foundation: A company, referred to under the pseudonym Case-Based Research (CBR), uses a fine-tuned Llama model to analyze complex research cases. Thanks to the lightness of the SLM, the company meets strict low-latency SLAs at costs unattainable with larger models. Meanwhile, Instagram and WhatsApp use smaller Llama variants for quick processing of user queries.
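The “function calling” mentioned above boils down to a simple loop: the model is prompted to emit a structured tool call, and the application parses it and dispatches to real code. Here is a minimal sketch of the application side. The model response is hard-coded illustrative output (in practice it would come from the model), and the tool name and JSON schema are assumptions for this example, not any vendor’s API.

```python
# Sketch: parsing an SLM's JSON tool call and dispatching it to a function.
import json

def get_contract_clause(contract_id: str, topic: str) -> str:
    """Toy 'tool': look up a clause in a hard-coded contract store."""
    clauses = {("C-42", "termination"):
               "Either party may terminate with 30 days' notice."}
    return clauses.get((contract_id, topic), "Clause not found.")

# Registry mapping tool names the model may emit to real functions.
TOOLS = {"get_contract_clause": get_contract_clause}

def dispatch(model_output: str) -> str:
    """Parse the model's JSON tool call and run the matching function."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]  # fails loudly on unknown tool names
    return fn(**call["arguments"])

# Pretend the SLM replied with this tool call (illustrative, not real output):
model_output = ('{"name": "get_contract_clause", '
                '"arguments": {"contract_id": "C-42", "topic": "termination"}}')
print(dispatch(model_output))
# -> Either party may terminate with 30 days' notice.
```

What makes the 27B+ class interesting here is reliability: emitting well-formed JSON and choosing the right tool across multi-step plans is exactly where smaller models tend to break down.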
Conclusion: Strategy, Not Strength (and Real Savings)
These examples perfectly illustrate that choosing an SLM is a strategic decision that translates into tangible economic benefits:
- Costs: IBM mentions costs lower by as much as 90% compared to the largest models.
- Performance: Inference (the model’s operation) is 2-5x faster, and energy consumption drops by 60-80%.
- Privacy: On-premise deployments, as in the case of Epic Systems, guarantee full control over data.
So what’s the point?
I often see posts on LinkedIn stating that this or that small model gives worse answers than ChatGPT, so what’s the point of it all? In short: if your company has sensitive data that absolutely cannot leave the organization, let alone the country, that is exactly what SLMs are for. I must admit I am personally still thinking about what to use an SLM for, and when I come up with ideas and examples, I will let you know. For now, I plan to attend https://bieliksummit.ai virtually and will report back right after 🙂 If you have other examples of the practical use of small models, let me know.