<p>Alongside AI news and my own thoughts, I want to gradually cover the basic principles behind AI and LLMs. As the meme goes, "this is the base." So today we will look at one of the fundamental concepts in the field: embeddings. An embedding is a vector of numbers that represents a word, phrase, or sentence, computed so that texts with similar meanings end up with nearby vectors.</p>
<p>For example, the words "car", "машина", and "автомобиль" (the last two are Russian for "car") mean the same thing despite very different spellings, so their vectors will be close in n-dimensional space, while the phrases "critical thinking" and "flat earth" will be far apart.</p>
<p>Numerical example: the vector [1, 0] will be close to the vector [0.98, 0.01], but far from [0, 1].</p>
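<p>A minimal sketch of how that closeness can be measured, assuming cosine similarity as the metric (the post itself does not name one; the function name <code>cosine_similarity</code> is my own). A value near 1 means the vectors point the same way, and 0 means they are unrelated.</p>

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

base = [1, 0]
near = [0.98, 0.01]
far = [0, 1]

print(cosine_similarity(base, near))  # just under 1.0: almost the same direction
print(cosine_similarity(base, far))   # 0.0: orthogonal vectors
```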
<p>Embeddings are widely used, for example:</p>
<ul>
<li>in translation (an LLM captures the semantic proximity of words in different languages);</li>
<li>in clustering (support questions can be grouped by meaning);</li>
<li>in determining the proximity of a query to information in a knowledge base (this is how services like <a href="https://www.chatpdf.com">https://www.chatpdf.com</a> work);</li>
<li>in sentiment analysis (telling positive reviews from negative ones).</li>
</ul>
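<p>To make the knowledge-base case concrete, here is a rough sketch of ranking entries by similarity to a query. The three-dimensional vectors are invented by hand for illustration (real embeddings have hundreds or thousands of dimensions), and the names <code>knowledge_base</code> and <code>most_similar</code> are hypothetical.</p>

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hand-made toy vectors, NOT real embeddings: each entry maps text to its vector.
knowledge_base = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "What are the delivery times?": [0.1, 0.9, 0.1],
    "How can I get a refund?": [0.0, 0.2, 0.9],
}

def most_similar(query_vector, entries):
    """Return the entry texts sorted from most to least similar to the query."""
    return sorted(
        entries,
        key=lambda text: cosine_similarity(query_vector, entries[text]),
        reverse=True,
    )

# Pretend this is the embedding of the query "I forgot my password".
query_vector = [0.8, 0.2, 0.1]

ranking = most_similar(query_vector, knowledge_base)
print(ranking[0])  # How do I reset my password?
```

<p>The same ranking step works for grouping support questions: entries whose vectors score high against each other belong to the same topic.</p>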
<p>To use embeddings, you don’t need to understand the internals. OpenAI provides an API that takes any text and returns its embedding as a vector of numbers. By comparing two such vectors, you can measure how close the texts are in meaning.</p>
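<p>A hedged sketch of what this might look like with the official <code>openai</code> Python package. The model name <code>text-embedding-3-small</code> and the helper names <code>embed</code> and <code>semantic_closeness</code> are my assumptions, not something the post specifies; the call needs the package installed and an <code>OPENAI_API_KEY</code> environment variable. Cosine similarity is again an assumed choice of comparison.</p>

```python
import math

def embed(text, model="text-embedding-3-small"):
    """Ask the OpenAI API for the embedding of `text` (model name is an assumption)."""
    # Imported lazily so the rest of the sketch works without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.embeddings.create(model=model, input=text)
    return response.data[0].embedding

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def semantic_closeness(text_a, text_b, embed_fn=embed):
    """Embed both texts with `embed_fn` and compare the resulting vectors."""
    return cosine_similarity(embed_fn(text_a), embed_fn(text_b))
```

<p>With a valid key, <code>semantic_closeness("car", "автомобиль")</code> should come out noticeably higher than <code>semantic_closeness("critical thinking", "flat earth")</code>. The <code>embed_fn</code> parameter is there so a different provider or a local model can be swapped in.</p>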
<p><strong>More details</strong></p>
<ul>
<li>What are embeddings: <a href="https://platform.openai.com/docs/guides/embeddings">https://platform.openai.com/docs/guides/embeddings</a></li>
<li>Source code showing how to use embeddings to find relevant information in Wikipedia and pass it to GPT as context: <a href="https://github.com/openai/openai-cookbook/blob/main/examples/fine-tuned_qa/olympics-1-collect-data.ipynb">https://github.com/openai/openai-cookbook/blob/main/examples/fine-tuned_qa/olympics-1-collect-data.ipynb</a> and <a href="https://github.com/openai/openai-cookbook/blob/main/examples/fine-tuned_qa/olympics-2-create-qa.ipynb">https://github.com/openai/openai-cookbook/blob/main/examples/fine-tuned_qa/olympics-2-create-qa.ipynb</a></li>
<li>Course "Deep Learning for NLP" from Stanford University: <a href="https://web.stanford.edu/class/cs224n/">https://web.stanford.edu/class/cs224n/</a></li>
<li>TensorFlow Embedding Projector - interactive visualization of embeddings: <a href="http://projector.tensorflow.org/">http://projector.tensorflow.org/</a></li>
</ul>
<p>#basics #ai #llm #embeddings #openai</p>
