Artificial Mathematics: The Language of Generative Models

When we write a prompt in ChatGPT or any other generative model, we use words, phrases, and human linguistic structures. However, the model does not understand letters, meanings, or context as we do. Instead, it operates purely mathematically: each word, phrase, or idea is transformed into numbers and positions in a high-dimensional mathematical space.

From Letters to Numbers
Language processing in models like ChatGPT is based on a fundamental principle: representing text as numbers. This is done through a process called tokenization, where words or word fragments are converted into sequences of numerical tokens. These tokens are associated with vectors in a mathematical space, allowing the model to manipulate them as coordinates in a vast network of meanings.
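To make this concrete, here is a minimal Python sketch of the idea, using a hypothetical four-word vocabulary and randomly initialized vectors; real systems learn subword vocabularies with tens of thousands of entries and much higher-dimensional embeddings:

```python
# A minimal sketch of tokenization and embedding lookup, assuming a
# hand-built toy vocabulary. Real models learn subword vocabularies
# and vectors with hundreds or thousands of dimensions.
import numpy as np

vocab = {"the": 0, "sky": 1, "is": 2, "blue": 3}   # hypothetical vocabulary
embedding_dim = 4                                  # real models use far more
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), embedding_dim))  # one vector per token

def tokenize(text: str) -> list[int]:
    """Convert each word into its numerical token id."""
    return [vocab[word] for word in text.lower().split()]

tokens = tokenize("The sky is")
vectors = embeddings[tokens]   # the tokens' coordinates in the model's vector space
print(tokens)                  # [0, 1, 2]
print(vectors.shape)           # (3, 4): three tokens, four dimensions each
```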

Learning as Mathematical Positioning
Training a generative model is essentially about assigning and adjusting the positions of these tokens in a multidimensional space. In essence, a model learns probability distributions: how likely it is for one word to follow another, based on vast amounts of previous data. There is no real understanding, only a sophisticated mathematical correlation between word sequences.
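As a toy illustration of learning as probability estimation, the sketch below builds a bigram model: it counts, over an invented miniature corpus, how often each word follows another and normalizes those counts into probabilities. Real models learn far richer distributions over long contexts, but the principle is the same:

```python
# A toy bigram model: "learning" reduced to counting which word follows
# which, then turning the counts into probabilities. The corpus is
# invented for illustration.
from collections import Counter, defaultdict

corpus = "the sky is blue . the sky is gray . the sea is blue .".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1          # tally each observed word pair

def next_word_probs(word: str) -> dict[str, float]:
    """Probability distribution over the next word, given the current one."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

print(next_word_probs("is"))        # {'blue': 0.67, 'gray': 0.33} (approximately)
```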

Text Generation: A Numerical Prediction
When we write a prompt, the model calculates the most probable next word based on its prior training. It uses techniques such as transformers and embeddings to evaluate the relationships between the words in the prompt and the vector space it learned during training. The output is a sequence of numbers that, when decoded, becomes a human-readable response.
In other words, what appears to be a fluent conversation is actually the sequential selection of numbers optimized by probability functions and neural networks.
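As a sketch of that selection step, the snippet below turns a set of invented network scores (logits) into a probability distribution with the softmax function and then greedily picks the most probable word; the vocabulary and scores are made up for illustration:

```python
# From raw network scores to a chosen word: softmax converts logits
# into probabilities, and decoding maps the winning token id back to
# text. Vocabulary and logits are invented for illustration.
import numpy as np

vocab = ["blue", "gray", "red"]        # hypothetical tiny vocabulary
logits = np.array([4.0, 1.9, 1.2])     # invented scores for the prompt "The sky is"

def softmax(x: np.ndarray) -> np.ndarray:
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(x - x.max())            # subtract the max for numerical stability
    return e / e.sum()

probs = softmax(logits)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.0%}")          # blue: 85%, gray: 10%, red: 5%

next_id = int(probs.argmax())          # greedy decoding: take the most probable token
print("The sky is", vocab[next_id])    # The sky is blue
```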

Visualization of the Generation Process
A generative model follows a probability-based generation process. We can represent it graphically as follows:
Input: "The sky is"
Neural network analyzes probabilities:
("blue" - 85%) | ("gray" - 10%) | ("red" - 5%)
Generated output: "The sky is blue"
Each generated word is the result of statistical calculations based on millions of previously analyzed words.
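The same example can also be run as a weighted random draw. Instead of always taking the most probable word, models typically sample from the distribution, which is why the same prompt can yield different outputs; the probabilities below are taken from the diagram above:

```python
# Sampling instead of greedy selection: each completion is a weighted
# random draw from the probability distribution shown above.
import random

words = ["blue", "gray", "red"]
probs = [0.85, 0.10, 0.05]

random.seed(42)                  # fixed seed so the example is reproducible
completions = [random.choices(words, weights=probs)[0] for _ in range(10)]
print(completions)               # mostly 'blue', with the occasional 'gray' or 'red'
```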

Limitations and Challenges of Generative Models
Despite their impressive capabilities, generative models have limitations:
- Lack of real understanding: They do not comprehend meaning, only manipulate numerical patterns.
- Dependence on training data: If the data contains biases, the model will replicate them.
- Difficulty handling long-term context: Although improving, they can still lose coherence in long responses.
- Hallucinations: They can produce fluent, plausible responses that are factually wrong.

Conclusion: Mathematics in Action
"Artificial mathematics" is the foundation of generative artificial intelligence. There is no interpretation of meanings in human terms, only mathematical calculations that determine patterns and predictions. Every word you see in a ChatGPT response is nothing more than the manifestation of numerical operations on a model trained with millions of previous texts.
Ultimately, generative AI does not understand, it calculates; it does not think, it predicts. What is language to us is pure artificial mathematics to it.
