On Meaning
Meaning is a subject that I’m deeply drawn to, particularly because it is intangible. Communication is an attempt to express meaning.
Recently, I’ve realized that communication is an ever-iterating chain of vector embeddings. In other words, it is a sequence of functions that map the intangible to the tangible.
When we speak to LLMs, words are mapped to tokens, which are mapped to vector embeddings. These words attempt to express some meaning, and we attempt to capture this meaning via continuous numbers. I believe the more continuous a mapping is, the better it captures meaning. Words are fairly continuous given the massive lexicon of the English language, and tokens even more so, but numbers are truly continuous. If we look only at this micro-pipeline from words to numbers, LLMs succeed in their mapping.
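To make the micro-pipeline concrete, here is a minimal sketch of words being mapped to tokens and tokens to vectors. Everything in it is an assumption for illustration: the four-entry vocabulary, the whitespace tokenizer, the embedding size, and the random vectors are all hypothetical. Real LLMs use subword tokenizers and learned embeddings with far more dimensions.

```python
import random

# Hypothetical toy vocabulary; real tokenizers use subword units
# and vocabularies with tens of thousands of entries.
VOCAB = {"<unk>": 0, "meaning": 1, "is": 2, "intangible": 3}
EMBED_DIM = 4  # assumed size, chosen only to keep the example small

# One vector of continuous numbers per token id.
# Random here; in a real model these values are learned.
rng = random.Random(0)
EMBEDDINGS = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
    for _ in VOCAB
]


def tokenize(text):
    """Map each whitespace-separated word to a discrete token id."""
    return [VOCAB.get(word, VOCAB["<unk>"]) for word in text.lower().split()]


def embed(token_ids):
    """Look up the continuous vector for each token id."""
    return [EMBEDDINGS[i] for i in token_ids]


token_ids = tokenize("Meaning is intangible")
vectors = embed(token_ids)
print(token_ids)     # discrete ids, one per word
print(len(vectors))  # one continuous vector per id
```

Any word outside the toy vocabulary collapses to the single `<unk>` id, which is a small picture of how a discrete step can lose meaning before the continuous numbers ever see it.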
Yet I am still unsatisfied with my interactions with LLMs. The LLM often provides output unsuited to my needs. I notice this especially when learning abstract concepts. I recently fell into a rabbit hole trying to develop an intuitive understanding of Lagrange duality. There was an intangible gap in my understanding that I could not quite diagnose, and because I could not diagnose it, the LLM kept providing answers that dodged what I needed to hear. After some time, I asked the right question and everything clicked.
Whose fault was this? The LLM’s job is to map words to embeddings and produce a sufficient output from these embeddings. Nothing in this internal pipeline went wrong. We have to zoom out. What failed was my translation of the meaning I wished to convey into a discrete string of words.
In many ways, the English language itself is a vector embedding. I have some intangible meaning to convey which must be mapped to the vector of English words. This is certainly imperfect.
Look around us: translation, poetry, literature, public speaking and, most relevantly, prompt engineering are all the same task in this grand pipeline: mapping meaning to accurate vector embeddings.
What do we do with this information? We can try to optimize LLMs. We treat LLMs as a separate optimization problem, but if LLMs are just part of this grand pipeline, why don’t we zoom out and optimize the function of mapping our internal meaning to the “vector embeddings” that the LLMs receive?
One simple yet interesting line of thought is to compare different languages. Consider the following informal proof:
Suppose X is the intangible meaning you wish to convey. Now suppose C is the Chinese language and E is the English language. We can attempt to convey X in either language C or E. We can also trivially conclude that the vector embeddings of X in C and in E are unequal. Since they are unequal, and barring the coincidence that both lie exactly the same distance from X, either C or E must be closer to X. Therefore some languages ought to convey a given meaning more accurately than others.
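The argument can be sketched numerically under two loud assumptions that it does not itself supply: that X and its renderings in C and E can all be placed in one shared vector space, and that "closer" means smaller Euclidean distance. The vectors below are made up, and `closest_languages` is a hypothetical helper, not an established method.

```python
import math


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_languages(x, renderings):
    """Return the names of the renderings nearest to x.

    More than one name comes back only when the renderings are
    equidistant from x, the one case where neither is closer.
    """
    dists = {name: distance(x, vec) for name, vec in renderings.items()}
    best = min(dists.values())
    return [name for name, d in dists.items() if math.isclose(d, best)]


# Made-up coordinates: x stands for the meaning X, and the two
# entries for its renderings in language C and language E.
x = [1.0, 2.0]
renderings = {"C": [1.5, 2.0], "E": [1.0, 3.0]}
print(closest_languages(x, renderings))

# Unequal renderings can still be equidistant from x.
tie = closest_languages([0.0, 0.0], {"C": [0.0, 1.0], "E": [0.0, -1.0]})
print(tie)
```

The second call shows that unequal embeddings alone do not force one language to be closer. The conclusion holds whenever the two distances differ, which is the typical case.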
This conclusion leads to more questions. Which language is the most accurate? What makes it more accurate?
And the final question, and the reason I write this essay: What would it take to build a language that captures meaning itself?