Rather than needing tens of thousands of machines and millions of dollars to train a new model, an existing model can now be customized on a mid-priced laptop in a few hours. This fosters rapid innovation.
Nice summary of how innovation in AI might move out of the largest few companies.
Daily Beast
Coupled with the Writers Guild strike and the arguably reckless pace at which companies are willing to adopt a mostly unproven, experimental, and demonstrably harmful technology, the world seems to be falling headfirst into a labor struggle the likes of which it hasn’t seen in quite a while.
The answers here are to vote for labor-friendly politicians and unionize. Good evergreen advice but also frustratingly vague for a specific looming threat.
The Verge
That’s because there is no actual precedent for saying that scraping data to train an AI is fair use; all of these companies are relying on ancient internet law cases that allowed search engines and social media platforms to exist in the first place. It’s messy, and it feels like all of those decisions are up for grabs in what promises to be a decade of litigation.
The current round of language- and image-model speculation is based on the premise that using any public data for training is fair use, not a massive copyright violation.
Washington Post
The Post’s analysis suggests more legal challenges may be on the way: The copyright symbol — which denotes a work registered as intellectual property — appears more than 200 million times in the C4 data set.
This humble website is included in the C4 corpus. You can use this tool to see if your copyright has also been violated.
Dan York
If we as technologists want to help the broader public understand these AI systems, both their opportunities and challenges, then we need to speak in plain language.

I do think we need to go back to the beginning and just say “ChatGPT lies”.
I’m guilty of trying to find the perfect word with the correct nuance or shade of meaning to describe a situation. Sometimes that impulse works against clarity.
Samuel R. Bowman
36% of another sample of 480 researchers (in a survey targeting the language-specific venue ACL) agreed that “It is plausible that decisions made by AI or machine learning systems could cause a catastrophe this century that is at least as bad as an all-out nuclear war” (Michael et al., 2022).
So we've got that going for us. LLMs strategically manipulating people into acquiring power sure sounds like a serious flaw in the software. A bit more information and context at the unfortunately named NYU Alignment Research Group. (ARG? Seriously?!)

I like to think of language models like ChatGPT as a calculator for words.

This is reflected in their name: a “language model” implies that they are tools for working with language. That’s what they’ve been trained to do, and it’s language manipulation where they truly excel.

Nice framing for thinking about the best ways to use LLMs.

The pace of change in AI does feel as if it could soon overtake our collective ability to process it. And the change signatories are asking for — a brief pause in the development of language models larger than the ones that have already been released — feels like a minor request in the grand scheme of things.
Not sure it’s possible to slow this train, but it is interesting to hear arguments that we should let society absorb what we already have before releasing more evolutions.
"The models are built on statistics. They work by looking for patterns in huge troves of text and then using those patterns to guess what the next word in a string of words should be. They’re great at mimicry and bad at facts."
I, for one, welcome our intelligent octopus overlords.
Some Words To Not
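The "guess the next word from patterns in text" idea in the quote above can be sketched with a toy bigram counter. This is an illustrative assumption, not how modern LLMs are actually built (they use neural networks over vastly larger corpora), but the core mechanic of predicting the next word from statistics of prior text is the same:

```python
from collections import Counter, defaultdict

# Toy corpus; real models train on trillions of words.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which (the "patterns in text").
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def guess_next(word):
    """Return the word most often seen after `word`, or None if unseen."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(guess_next("the"))  # "cat" — it follows "the" twice; "mat" and "fish" once each
```

Note what this toy makes obvious: the model is "great at mimicry and bad at facts" because nothing in it checks whether "the cat ate the fish" is true; it only knows which words tend to follow which.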
"Letting an AI system do this work for you means giving up all of that. It's like sending a robot to do your WestWorld vacation for you, and just sharing the photos it took on your Instagram feed. Behaving in this way is not at all about cheating, it is about missing the whole point. If you care about having clear ideas and becoming better at what you do, you want to be writing."
I don't see anything in this post.
"But the need for humans to label data for AI systems remains, at least for now. 'They’re impressive, but ChatGPT and other generative models are not magic – they rely on massive supply chains of human labor and scraped data, much of which is unattributed and used without consent,' Andrew Strait, an AI ethicist, recently wrote on Twitter. 'These are serious, foundational problems that I do not see OpenAI addressing.'"
Warning: this article is disturbing. Companies shouldn't be able to cause people psychological damage to get funding.
"The problem with this description isn't just that it's wrong. It's that the AI is eliding an important reality about many loans: that if you pay them down faster, you end up paying less interest in the future. In other words, it's feeding terrible financial advice directly to people trying to improve their grasp of it."
Our unevenly distributed AI future is terrible already! Update (1/23): And who cares about copyright?