Different language models have different capabilities, and different training approaches yield very different results. Understanding these limitations, and how the models are actually trained, will only become more important as you navigate which tools to use, when, and how.
People talk about “data quality” a lot, and yes, good training is fundamentally about data quality. But rarely does anyone define what quality actually means. If you’re a Bitcoiner, you’re probably familiar with the idea of “subjective value”, and it applies here too.
If you trust the average ‘AI bro’ on Twitter, you’ll believe that you can just upload a few PDFs, a couple of books, a podcast, or a company’s financial report, and thereby “train your own AI”.
This is completely false.
There are myths and misconceptions in every industry, and the AI space has more than most, thanks to its enormous potential and the poorly understood nature of intelligence itself. Every single day, people either overestimate or underestimate these models, or simply misconstrue what they can do.
Nietzsche’s missing ingredient.