The current round of language and image model speculation rests on the premise that using any public data for training is fair use, not a massive copyright violation.
The Post’s analysis suggests more legal challenges may be on the way: The copyright symbol — which denotes a work registered as intellectual property — appears more than 200 million times in the C4 data set. This humble website is included in the C4 corpus. You can use this tool to see if your copyright has also been violated.
"…where you can download, share, and reuse millions of the Smithsonian’s images—right now, without asking. With new platforms and tools, you have easier access to more than 4.4 million 2D and 3D digital items from our collections—with many more to come." Fun collection to browse through. I can even post this image of a Bell X-1 cockpit without attribution!
"At minimum, Stable Diffusion’s ability to flood the market with an essentially unlimited number of infringing images will inflict permanent damage on the market for art and artists." Describing image models as sophisticated collage tools takes some of the mystery out of AI and makes it clear that work is being used without consent. This essay has a clear description of the diffusion process.
"What a strange, unexpected delight to be asked to return with the express goal of researching what the Commons has become and understanding how cultural institutions around the world have evolved through being a part of it. We want to design a stronger future for the program, with enduring longevity at its heart." Great to hear this! The new Flickr owners are investing in the Flickr Commons program.