Stochastic parrots and AIs on the stopwatch
Is that the smell of content deals going up in smoke? LLMs are reaching a data and compute scaling plateau, and better data might not be the answer.
Scale, we were once assured, was the answer for LLM systems. The publicity-hungry commercial prophet and/or charlatan of such thinking has been Sam Altman of OpenAI. Indeed, it was Altman who wrote of a "Moore's Law for Everything" back in 2021.
The answer way back then in the mists of time was that more data and greater computational power would produce better results from LLMs. Simply put, he said, the more data LLMs have to work with and the more power they have to process that data, the better the results will be.
This week, that notion was brought into serious question by none other than OpenAI dissident and co-founder of Safe Superintelligence (SSI), Ilya Sutskever.
Speaking to Reuters, Sutskever said, "The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing.
"Scaling the right thing matters more now than ever."
Reuters added that, "Sutskever declined to share more details on how his team is addressing the issue, other than saying SSI is working on an alternative approach to scaling up pre-training."
Roughly, this "alternative approach" means looking at what is done with the data once the system has it, and not just adding more.
This all took place as a report in The Information suggested that OpenAI is struggling to make a significant leap forward with its forthcoming Orion model: "While Orion’s performance ended up exceeding that of prior models, the increase in quality was far smaller compared with the jump between GPT-3 and GPT-4, the last two flagship models the company released, according to some OpenAI employees who have used or tested Orion."
There is also a consensus that most of the high-quality data out there has already been vacuumed up, and, as a colleague here at Glide put it, "just adding everything The Economist has ever produced isn't the answer" to the question of diminishing returns.
The increasing use of synthetic data doesn't seem to be helping either, to the surprise of no one who has ever written anything original.
So not more data, and not more computational power. It seems the solution everyone is heading for is giving the LLMs more time to process. Note, I will not use the word "think" for what are, at heart, pattern recognition systems.
By way of example, the Reuters story quoted Noam Brown, a researcher at OpenAI who has worked on their latest o1 model. "It turned out that having a bot think for just 20 seconds in a hand of poker got the same boosting performance as scaling up the model by 100,000x and training it for 100,000 times longer."
Think about those words in terms of cost and you're looking at a vast difference in expenditure. And make no mistake, it is the economics of such technical advances that dictate the direction they take, and their eventual success or failure.
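To get a feel for the shape of that difference, here is a deliberately crude back-of-envelope sketch in Python. The unit costs are entirely invented for illustration; the only figures taken from Brown's quote are the two 100,000x factors.

```python
# Deliberately crude back-of-envelope sketch of Brown's comparison.
# All unit costs here are invented for illustration; only the two
# 100,000x factors come from the quote itself.

base_training_cost = 1.0  # arbitrary unit: cost of training the base model

# A model 100,000x bigger, trained for 100,000x longer: the one-off
# training bill multiplies out to ten billion base-model trainings.
scaled_training_cost = base_training_cost * 100_000 * 100_000

# Alternatively, pay a little extra at answer time instead. Assume
# (again, invented) that 20 seconds of extra inference costs a
# millionth of a base-model training, per query.
extra_inference_cost_per_query = 1e-6
queries = 1_000_000

print(f"Scaled-up training, paid once:   {scaled_training_cost:,.0f} units")
print(f"Extra inference over 1M queries: {extra_inference_cost_per_query * queries:,.0f} units")
# Even spread across a million queries, paying at answer time comes out
# many orders of magnitude cheaper than paying it all up front in training.
```

The absolute numbers are meaningless; the gap between the two totals is the point.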
At present, OpenAI's GPT-4 remains the perceived gold standard among publicly available systems. Plenty would argue with that assessment from a scientific standpoint, but for most people it's the only AI they know, and in this space that's what they mean by gold.
Such is our collective, and rather unreasonable, expectation of rapid technological advance in the field, reinforced by promises from people such as Altman, that realising there may currently be a hard limit, at least along the path we were sold, puts the brakes on the hype train somewhat.
And it raises the immediate question, of course, of what this means for mooted future income from content deals, if - as many hope - having quality content alone is the secret to a big cheque signed by Sam and his ilk.
The phrase currently in play is "test-time compute", along with the slightly more cryptic "inference compute". Essentially, this means spending extra processing at the moment the model answers a query, rather than during training, to explore and refine candidate answers. As in the poker example given by Brown above, the more time the system is given to refine an answer, the 'better' that answer will be.
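OpenAI has not published how o1 actually spends that extra time, so what follows is only a minimal sketch of one simple flavour of test-time compute - best-of-N sampling against a time budget. The generate_candidate and score functions are hypothetical stand-ins; in a real system they would be a model call and a verifier or reward model.

```python
import random
import time

def generate_candidate(prompt: str) -> str:
    """Stand-in for a single model call. A real system would sample
    one answer from the LLM here; we fake it with a random draw."""
    return f"answer-{random.randint(0, 9999)} to: {prompt}"

def score(answer: str) -> float:
    """Stand-in for a verifier or reward model that rates an answer."""
    return random.random()

def answer_with_budget(prompt: str, budget_seconds: float = 20.0) -> str:
    """Best-of-N at inference time: keep sampling and scoring candidate
    answers until the time budget runs out, then return the best one.
    More budget means more candidates, and so a 'better' answer,
    with no change to the underlying model."""
    best_answer, best_score = None, float("-inf")
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        candidate = generate_candidate(prompt)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best_answer, best_score = candidate, candidate_score
    return best_answer

print(answer_with_budget("Should I fold this hand?", budget_seconds=0.1))
```

The point is the shape of the loop: answer quality now scales with how long you let it run, not with how big the underlying model is.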
Twenty seconds. If only everyone had to wait 20 seconds before responding to stuff, we might live in a better world. Apart from the road accidents.
We love a Transatlantic success story, which is why we are thrilled to announce our new partnership with Refact, the LA-based agency which specialises in helping content-driven companies cut through the noise of the digital world and do fantastic things for their customers.
By partnering with Refact, we improve access to Glide CMS and Glide Nexa - our customer data platform - for U.S.-based publishing and media organisations. (Surely, "organizations"?! - Ed).
They are excellent, and you can read the full announcement for more.
NYC judge dismisses OpenAI copyright lawsuit
A New York judge has dismissed a case brought by Raw Story and AlterNet against OpenAI for copyright infringement, on non-IP grounds - they failed to prove 'injury'. While the ruling may disappoint publishing circles, it does not necessarily mean all other cases in the state will go the same way; NYC is where the New York Times is mounting its big case against OpenAI, seen as a trailblazer for the industry in the US.
Read
AI giants over-rely on publisher content
A Ziff Davis study reveals that AI companies like OpenAI and Google rely much more heavily on publisher content to train their models than they disclose. As expected, this has further enraged publishers over copyright theft, as the findings reinforce previous research showing that AI firms depend on publisher content for both current and future models.
Read
Google's bot extortion
Renewed attention on the (in our view) monopoly-defining Google requirement which says you must allow its AI content-scraping bot to access your site alongside its search engine bot, or you effectively disappear your business from search results. It's back in the limelight after Google started monetising publisher content in AI Overviews without compensation. Google takes with one hand, and also the other!
Read
Popular AI shocker
The Beatles' "Now and Then" has been nominated for two Grammys, more than 50 years after their breakup. The track, made possible by AI technology that isolated John Lennon's vocals from a 1970s demo, is considered the band's final song and marks a surprising return thanks to modern technology.
Read
AI firms missing the point
What's preventing more widespread adoption of AI by the publishing and media industries? Our very own Rich Fairbairn talks about the unique challenges publishers and media face when it comes to AI content.
Read
Chegg v ChatGPT: a dismal tale
Chegg, once a leading edtech platform, has seen its value plummet by 99% due to the rise of free AI tools like ChatGPT. Chegg's decline is a cautionary tale about the risk of failing to adapt a business model as free AI tools reshape the market.
Read
Google November 2024 Core Update: what we know so far
The November 2024 core update, which began rolling out on the 11th, has drawn mixed reactions, with some sites seeing further ranking drops while others report no changes yet. Google says it prioritises genuinely useful content over SEO-driven content. As always, check your traffic graphs.
Read
Bluesky gains 700K users, closing in on Threads
"Not X" Bluesky added 700K users in a week to reach 14.5 million total, making it the second most popular social app in the US. Alongside recent updates and the decentralised nature which seem to be attracting users seeking alternatives to X and Threads, Bluesky's appeal lies in its more open, user-driven approach.
Read
Guardian's X-it
The Guardian has dropped X, citing concerns over harmful content and nuttery, turning off its route to a total of 27m followers across all its official accounts. The decision highlights a broader trend of media organisations reassessing their engagement with social media platforms.
Read