What is an LLM (large language model)?

January 13, 2025 · 13 min read

Series of articles on AI
This is the first article in a series of four:

LLMs: understanding what they are and how they work (this article).
NLP: exploring Natural Language Processing.
AI Agents: discovering autonomous artificial intelligences.
Comparison and AI Smarttalk’s positioning: an overall synthesis and perspective.

Imagine a field of wildflowers stretching as far as the eye can see, where an oversized swarm of bees is busily buzzing around. They flutter, gather pollen from every bloom, and turn it into incredibly complex honey. That honey is language. And these bees are the LLMs (Large Language Models), those giant language models that work tirelessly to transform vast amounts of textual data into something structured, coherent, and sometimes even highly creative.

In this article, we will dive deep into the bustling hive of LLMs: understanding how these massive bees build and refine their honeycombs (their architecture), what types of pollen they collect (the data), how they coordinate to produce honey (text generation), and finally how to guide and tame these swarms so they deliver a sweet, well-crafted nectar rather than a random substance.

We will cover several key points:

The origins and definition of an LLM
Training techniques and the role of attention
Concrete use cases and limitations
Ethical, energy, and technical challenges
Prompt engineering to get the best out of an LLM
Deployment and maintenance options

We will push the bee analogy quite far. You might find the image of a bee gentle and harmless, but remember that a poorly managed swarm can still inflict quite a few stings. Before we light the smoke to calm them down, let’s explore the very structure of an LLM, which will no longer hold many secrets once you’ve finished reading.

To start, here is a simplified diagram (with no extra commentary) of the path a piece of text takes within an LLM, from input to output, passing through all the key steps:

1. What is an LLM? The swarm that buzzed louder than all the others

1.1. Origin and concept

For decades, part of Artificial Intelligence research has focused on natural language: how can we make a model understand and generate relevant text? Initially, researchers relied on NLP (Natural Language Processing) techniques based on simple rules or basic statistics. Then a crucial step arrived: the advent of Deep Learning and neural networks.

Large Language Models stem from this revolution. They are called “large” because they contain billions of parameters, often tens or even hundreds of billions. A parameter is a single numeric weight inside the neural network, somewhat like the “position of a tiny component” in the hive’s complex organization. During training, each parameter is adjusted so that the model gets better at predicting the next token in a given sequence.

1.2. A hive built on massive amounts of data

To build their hive, LLMs need a huge amount of “pollen”: text. They ingest phenomenal volumes of content, from digitized books to press articles, forums, and social media. By absorbing all that data, the model’s internal structure becomes shaped to capture and reflect language regularities.

Hence, these artificial bees ultimately learn that, in a given context, certain words are more likely to appear than others. For the most part, they do not memorize text line by line (although passages that appear very often in the training data can be reproduced verbatim); instead, they learn how to “statistically reproduce” typical forms, syntax, and associations of ideas found in language.
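
To make the idea of "certain words are more likely than others" concrete, here is a minimal sketch that counts which word follows which in a tiny, made-up corpus. It is only an illustration: real LLMs learn these regularities with neural networks over tokens, not with a lookup table, and the function names and the toy corpus are assumptions introduced for this example.

```python
from collections import Counter, defaultdict

def train_bigram_counts(corpus):
    """Count, for each word, which words follow it in the corpus."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts

def next_word_probabilities(follow_counts, word):
    """Turn raw follow-up counts into probabilities for one context word."""
    counts = follow_counts.get(word, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

# Hypothetical three-sentence "field of flowers".
toy_corpus = [
    "the bee gathers pollen",
    "the bee makes honey",
    "the hive stores honey",
]
counts = train_bigram_counts(toy_corpus)
probs = next_word_probabilities(counts, "the")
print(probs)  # "bee" comes out twice as likely as "hive" after "the"
```

The table stores no sentence as a whole, only how often one word follows another, which is the simplest form of the statistical regularities described above.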

2. Stepping into the hive: an overview of how it works

2.1. Tokenization: gathering pollen piece by piece

The first step is tokenization. We take the raw text and break it into tokens. Imagine a field of flowers: each flower is like a word (or part of a word), from which a bee collects pollen. A “token” can be a whole word (“house”), a fragment (“hou-”, “-se”), or sometimes just a punctuation mark.

This segmentation depends on a vocabulary specific to the model: the larger the vocabulary, the more words can be kept whole rather than split into fragments, so the same text needs fewer tokens. Tokenization is crucial because the model then manipulates tokens rather than raw text. It is akin to the bee collecting just the pollen rather than taking the whole flower.
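
As a rough illustration, the sketch below splits a word into pieces using a greedy longest-match rule against a small vocabulary. This is a deliberately simplified stand-in: production tokenizers usually rely on learned schemes such as byte-pair encoding, and the vocabularies and the `tokenize` helper here are hypothetical.

```python
def tokenize(text, vocabulary):
    """Greedy longest-match split of a string into known vocabulary pieces."""
    tokens = []
    position = 0
    while position < len(text):
        # Try the longest candidate first, then shrink it.
        for end in range(len(text), position, -1):
            piece = text[position:end]
            if piece in vocabulary:
                tokens.append(piece)
                position = end
                break
        else:
            # Nothing matched: emit the single character as its own token.
            tokens.append(text[position])
            position += 1
    return tokens

# Two hypothetical vocabularies: the larger one also knows whole words.
small_vocab = {"hou", "se", "bee", "hive", "s"}
large_vocab = small_vocab | {"house", "houses"}

print(tokenize("houses", small_vocab))  # split into fragments
print(tokenize("houses", large_vocab))  # kept as a single token
```

With the small vocabulary the word is cut into several fragments, while the larger vocabulary keeps it whole.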

2.2. Embeddings: turning pollen into vectors

Once the pollen is gathered, it must be converted into a format the model can use: that step is called embedding. Each token is transformed into a vector (a list of numbers) encoding semantic and contextual information.

Think of it as the “color” or “flavor” of the pollen: two words with similar meanings will have similar vectors, just like two related flowers produce similar pollen. This step is essential, as neural networks only understand numbers.
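
One common way to measure how "similar" two vectors are is cosine similarity. The sketch below uses hand-written three-dimensional vectors purely for illustration; real embeddings are learned during training and typically have hundreds or thousands of dimensions, and the values shown are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, invented for this example.
toy_embeddings = {
    "bee": [0.9, 0.8, 0.1],
    "wasp": [0.8, 0.9, 0.2],
    "invoice": [0.1, 0.0, 0.9],
}

bee_vs_wasp = cosine_similarity(toy_embeddings["bee"], toy_embeddings["wasp"])
bee_vs_invoice = cosine_similarity(toy_embeddings["bee"], toy_embeddings["invoice"])
print(bee_vs_wasp > bee_vs_invoice)  # related words score higher
```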

2.3. The Transformer layers: the bee dance

In a hive, bees communicate through a “bee dance,” a complex choreography that indicates where the richest pollen is located. In an LLM, coordination is achieved via the attention mechanism, made famous by the 2017 paper “Attention Is All You Need,” which introduced the Transformer architecture.

Each Transformer layer applies Self-Attention: for every token, the model calculates its relevance to all other tokens in the sequence. It’s a simultaneous exchange of information, much like every bee saying, “Here’s the pollen type I have; what do you need?”

By stacking multiple Transformer layers, the model can capture complex relationships: it can learn that, in a certain sentence, the word “queen” refers to a concept linked to “bees” or “hive,” rather than “monarchy,” depending on the context.
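
The core computation behind self-attention can be sketched in a few lines. The version below is a strong simplification, stated as an assumption: it uses the token vectors directly as queries, keys, and values, whereas a real Transformer layer applies learned projection matrices, several attention heads, and (for text generation) a mask that hides future tokens. The vectors are invented toy values.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = inputs."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Relevance of every token (key) to the current token (query).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # New representation: weighted mix of all token vectors.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

tokens = ["queen", "bee", "crown"]
vectors = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # hypothetical 2-D embeddings
outputs, weights = self_attention(vectors)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much one token "listens" to every token in the sequence, including itself. In this toy setup, "bee" pays more attention to "queen" than to "crown" because their vectors overlap.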

2.4. Honey production: predicting the next token

Finally, the hive produces honey, i.e., the generated text. After analyzing the context, the model must answer a simple question: “What is the most likely next token?” This prediction relies on the network’s adjusted weights.

Depending on the sampling parameters (temperature, top-k, top-p, etc.), the process can be more random or more deterministic. A low temperature is like a very disciplined bee producing a predictable honey. A high temperature is like a more eccentric bee that can roam more freely and come up with more creative honey, at the risk of being inconsistent.
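
The effect of temperature can be shown on a handful of raw scores (logits). In this sketch, the scores are divided by the temperature before being turned into probabilities; the logit values are made up for the example, and the helper name is an assumption.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.2)  # disciplined bee
hot = softmax_with_temperature(logits, 2.0)   # eccentric bee

print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

At low temperature almost all of the probability lands on the top-scoring token, while at high temperature the alternatives keep a real chance of being picked.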

3. Honey in all shapes: use cases for LLMs

3.1. Assisted writing and content generation

One of the most popular uses is automatic text generation. Need a blog post? A video script? A bedtime story? LLMs can produce surprisingly fluent text. You can even steer the writing style: humorous, formal, poetic, and so forth.

Still, you must check the quality of the honey produced. Sometimes, the swarm can collect the wrong information, leading to “hallucinations”—the bee invents flowers that don’t exist!

3.2. Conversation tools and chatbots

Chatbots powered by LLMs have gained attention thanks to their more natural-sounding conversation. Picture a swarm that, upon receiving your request, flies from flower to flower (token to token) to deliver a fitting response.

These chatbots can be used for:

Customer service
Assistance (text or voice)
Training and interactive tutoring
Language learning

3.3. Automatic translation

Having absorbed texts in many languages, LLMs often know how to switch from one language to another. Many languages share grammatical structures, enabling the artificial bee to recognize them and offer translations. Results are not always perfect, but frequently surpass the quality of older rule-based systems.

3.4. Programming assistance

Some LLMs, such as those behind certain “copilot” systems for coding, can suggest code, propose solutions, and fix errors, although their suggestions still need to be reviewed and tested. This usage is increasingly popular, showing that “programming languages” are just another form of textual language in the big hive of content.

3.5. Document analysis and structuring

Besides generating text, LLMs can also summarize, analyze, label (classify), or even extract insights from text. This is quite handy for sorting large volumes of documents, gathering customer feedback, analyzing reviews, etc.

4. Possible stings: limitations and risks

4.1. Hallucinations: when the bee invents a flower

As mentioned, the bee (the LLM) can “hallucinate.” It isn’t connected to a truth database: it relies on probabilities. Hence, it can confidently provide false or nonexistent information.

Remember that an LLM is not an oracle; it predicts text without “understanding” it in a human sense. This can have serious consequences if used for critical tasks (medical, legal, etc.) without supervision.

4.2. Bias and inappropriate content

Bees gather pollen from all kinds of flowers, including dubious ones. Biases present in the data (stereotypes, discriminatory statements, etc.) seep into the hive. We may end up with honey tainted by these biases.

Researchers and engineers strive to implement filters and moderation mechanisms. But the task is complex: it requires identifying biases, correcting them, and avoiding overly restricting the model’s creativity.

4.3. Energy costs and carbon footprint

Training an LLM is like maintaining a giant swarm in a greenhouse heated around the clock. It requires huge computational resources, thus a lot of energy. Environmental concerns are therefore central:

Can we make training more eco-friendly?
Should we limit model size?

Debate is ongoing, and many initiatives aim to lower the carbon footprint through both hardware and software optimizations.

4.4. Lack of real-world contextualization

Though the model is impressive, it often lacks a real-world understanding beyond text. These artificial bees only know textual “pollen.” They do not realize that a physical object weighs a certain amount or that an abstract concept has legal implications, for example.

This gap is evident in tasks requiring deep “common sense” or real-world experiences (perception, action, sensory feedback). LLMs can fail on “easy” questions for a human because they lack sensory context.

5. The art of taming: “prompt engineering”

5.1. Definition

A prompt is the text you supply to the LLM to obtain a response. How you craft this prompt can make all the difference. Prompt engineering involves writing an optimal (or near-optimal) prompt.

It’s like blowing smoke into the hive to calm the bees and show them precisely what job to do: “Go gather pollen in this specific area, in that direction, for this type of flower.”

5.2. Prompt engineering techniques

Clear context: define the LLM’s role. For instance, “You are a botany expert. Explain…”
Precise instructions: specify what you want, the answer’s format, length, style, etc.
Examples: provide sample Q&A to guide the model.
Constraints: if you want to narrow the scope, say so (“Do not mention this topic; respond only in bullet lists,” etc.).
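
The four techniques above can be combined in a small prompt template. This is only one possible layout, offered as a sketch: the `build_prompt` helper, its section labels, and the sample texts are assumptions, and the resulting string would still need to be sent to whichever model you use.

```python
def build_prompt(role, instructions, examples, constraints, question):
    """Assemble role, instructions, examples, and constraints into one prompt."""
    parts = [f"You are {role}.", "", "Instructions:", instructions]
    if examples:
        parts += ["", "Examples:"]
        for sample_question, sample_answer in examples:
            parts += [f"Q: {sample_question}", f"A: {sample_answer}"]
    if constraints:
        parts += ["", "Constraints:"]
        parts += [f"- {rule}" for rule in constraints]
    # End with the real question and an open answer slot for the model.
    parts += ["", f"Q: {question}", "A:"]
    return "\n".join(parts)

prompt = build_prompt(
    role="a botany expert",
    instructions="Answer in two sentences, in plain language.",
    examples=[("What do bees collect?", "Bees collect nectar and pollen from flowers.")],
    constraints=["Respond only about plants and pollinators."],
    question="Why do flowers produce nectar?",
)
print(prompt)
```

Keeping the pieces separate makes it easier to adjust one element (for instance, the constraints) without rewriting the whole prompt.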

5.3. Temperature, top-k, top-p…

When generating honey, the bee can follow its recipe more or less strictly. Temperature is a key parameter:

Low temperature (~0): the hive is very disciplined. Responses are more “conservative” and coherent but less original.
High temperature (>1): the hive is more imaginative but might go off track.

Similarly, “top-k” limits the model to the k most likely tokens, and “top-p” imposes a cumulative probability threshold (nucleus sampling). Prompt engineering also involves tuning these parameters for the desired outcome.
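
Here is a minimal sketch of the two filters, applied to a made-up probability distribution over four candidate tokens. The function names and the numbers are assumptions; libraries that expose these settings may differ in details such as tie-breaking.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize their probabilities."""
    kept = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    kept, cumulative = [], 0.0
    for token, prob in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

# Hypothetical next-token distribution.
next_token_probs = {"honey": 0.5, "pollen": 0.3, "wax": 0.15, "tractor": 0.05}

print(top_k_filter(next_token_probs, 2))    # only the 2 best candidates remain
print(top_p_filter(next_token_probs, 0.9))  # candidates covering 90% of the mass
```

After filtering, the next token would be drawn at random from the remaining candidates (for example with `random.choices`), so unlikely tokens such as "tractor" can no longer be picked.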

6. Setting up a hive: deployment and integration

6.1. Deployment options

Hosted API: Use a provider that hosts the model. No heavy infrastructure needed, but you pay per use and rely on a third party.
Open-source model: Install an open-source LLM on your own servers. You retain total control but must handle logistics and energy costs.
Hybrid model: Use a smaller local model for simpler tasks and call an external API for more complex tasks.
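
For the hybrid option, the routing decision can be as simple as a rule on the incoming request. The sketch below is purely illustrative: the word limit, the keyword list, and the backend labels are hypothetical, and a real system would likely use a more careful measure of task complexity.

```python
def choose_backend(prompt, max_local_words=50,
                   complex_keywords=("analyze", "compare", "legal")):
    """Route short, simple prompts to the local model and the rest to the hosted API."""
    words = prompt.lower().split()
    looks_complex = any(keyword in words for keyword in complex_keywords)
    if len(words) <= max_local_words and not looks_complex:
        return "local"
    return "hosted_api"

print(choose_backend("Translate hello into French"))   # simple request
print(choose_backend("Please analyze this contract"))  # flagged as complex
```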

6.2. Security and moderation

Deploying an LLM means assuming responsibility for its output. You often need to add:

Filters to block hateful, violent, or discriminatory content
Mechanisms to block sensitive data (e.g., personal information)
A logging and monitoring policy to track exchanges and enhance the system
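
As a small illustration of the second point, the sketch below masks e-mail addresses and phone-like numbers before a text is logged or sent to a model, and checks the text against a blocklist. The regular expressions are intentionally simple and will miss many real-world formats, and the blocklist entry is a hypothetical placeholder, so treat this as a starting point rather than a complete filter.

```python
import re

# Simplified patterns, assumed for this example only.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{7,}\d")

BLOCKED_TERMS = {"forbiddenword"}  # hypothetical placeholder list

def redact_sensitive_data(text):
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL_PATTERN.sub("[EMAIL]", text)
    return PHONE_PATTERN.sub("[PHONE]", text)

def contains_blocked_term(text):
    """Return True if any whole word of the text is on the blocklist."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in BLOCKED_TERMS for word in words)

print(redact_sensitive_data("Contact me at jane.doe@example.com"))
print(redact_sensitive_data("Call +1 555 123 4567 now"))
```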

6.3. Ongoing monitoring and improvement

Even a well-set-up hive needs supervision:

Collect user feedback
Adjust prompts and generation parameters
Switch to a more recent model, or retrain the current one, as needed

It’s a continuous process, much like tending a real swarm: monitor its health, correct missteps, and leverage lessons learned.

7. Future flights: toward multimodal and adaptive models

LLMs are only at the beginning of their evolution. Soon, we’ll talk about multimodal models, capable of handling text, images, sounds, and videos—a swarm that gathers not only textual flowers but also visual or auditory ones.

Systems that combine vision and language are already emerging, as are systems that link symbolic reasoning with text generation. The bee might, for example, interpret an image and describe it, or pick up a sound and analyze it in context.

On a societal level, this rapid development raises many questions:

How can we ensure accountability and transparency in using these systems?
What impact will they have on jobs related to writing, translation, or text analysis?
How can we balance competition between major AI players (Big Tech, private labs, open-source projects)?

8. Our next flight path: a look at traditional NLP

In our next article, we’ll dive more generally into NLP (Natural Language Processing). We’ll examine how more classic, sometimes lighter, approaches still coexist alongside these massive LLMs.

Before LLMs, there was the traditional NLP hive, which used supervised classification, semantic search algorithms, syntactic rules, etc. We’ll explore:

Basic methods (bag-of-words, TF-IDF, n-grams)
Pre-Transformer neural models (RNN, LSTM, etc.)
Typical NLP pipelines (tokenization, POS tagging, parsing, etc.)
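
As a small preview of these basic methods, here is a sketch of TF-IDF scoring on three toy sentences. It uses one simple variant (term frequency multiplied by the logarithm of the inverse document frequency, without smoothing); libraries often apply different weighting or normalization, and the documents below are invented for the example.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: score} dict per document using a basic TF-IDF variant."""
    tokenized = [doc.lower().split() for doc in documents]
    doc_count = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(doc_count / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores

docs = [
    "the bee makes honey",
    "the hive stores honey",
    "the bee guards the hive",
]
scores = tf_idf(docs)
print({term: round(score, 3) for term, score in scores[0].items()})
```

A word that appears in every document, such as "the", scores zero, while a word unique to one document, such as "makes", scores highest there.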

This will help us understand how the LLM swarm has drawn on a broad ecosystem of earlier research.

9. Conclusion: the art of enjoying honey

We have taken a comprehensive look at LLMs, these gigantic bees capable of turning raw text into sophisticated answers. Here are the key points:

Training: LLMs are trained on massive datasets, learning the statistical patterns of language.
Architecture: Transformer layers are the model’s core, capturing contextual relationships through attention.
Use cases: From writing to translating, chatbots, code suggestions, and more—the range is huge.
Limitations: Hallucinations, biases, energy cost… LLMs are not flawless. They need guidance, oversight, and verification.
Prompt engineering: The art of crafting the right request (and setting the right parameters) to get the best response possible.
Deployment: Various strategies exist—relying on a hosted API, installing an open-source model, or combining both.

Bees are a symbol of organization, collaboration, and the production of delicious honey. In the same way, a well-managed LLM can be a tremendous asset for optimizing, creating, and assisting with numerous language-related tasks. But, like any powerful swarm, it demands caution and respect, or you risk unexpected stings.

In upcoming articles, we’ll continue our journey through the buzzing world of AI and NLP: we’ll see how AI developed around more specific modules (text processing, syntactic analysis, classification) before exploring AI Agents and concluding with a global comparison to understand where AI Smarttalk fits into all of this.

Until then, remember: you don’t have to be an expert to recognize good honey, but taking the time to understand the hive and its bees is the best way to savor it confidently.

See you soon for the next step in our journey through the buzzing world of AI!
