EN / RU / 🤖
← Back to essays
· Essay · 1 min

Unique Data for Training Models

Unique data for training models is becoming key in a world of similar models.

<p>It seems that in a world where models are becoming more and more similar to each other, unique data for training is becoming key. If something is publicly available (like a Wikipedia article), the value of that data approaches zero; it will be in every model.</p>
<p>On the other hand, data that is not accessible to anyone, especially personalized data, is a different story. For instance, the history of your correspondence in email, logs of your chats in ICQ during childhood and in Telegram now, as well as your notes in a notebook or a second brain system like Roam—this is what represents real value.</p>
<p>Projects that can utilize this data (ideally not just in RAG format) will stand out compared to all others.</p>
<p>What do you think?</p>