

Author Topic: about AI training thru machine learning  (Read 1595 times)


« on: September 30, 2022, 12:31 »
0
There are two aspects of concern here, and many in the other thread are conflating them: the creation of new, unique images isn't violating any copyrights, and Getty et al. aren't making that claim.





And the saddest thing to me is that they did not even bother to pay for the images used to train the AI, as they were clearly watermarked.
Yes, they give copyright to the "describers" of the AI generated images, but they used images to train it where they did not own copyright themselves. They can probably get away with this legally, but morally I find all of this highly repulsive. They basically used our own images without paying for them to create something that one day in the future will most likely destroy our line of work.

This is an important, but separate, issue. Given that millions of images are being used in training, the effect of any particular image on the new creation is minuscule.

So, the question is whether the owners of the training images should be compensated, and if so, how? and how much?

What are the possible ways to address this concern?
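A back-of-envelope calculation (Python, with made-up numbers, purely to show the scale involved in any per-image payout):

# Hypothetical figures, just for illustration -- not actual dataset or licensing numbers:
training_images = 2_000_000_000      # rough order of magnitude for a large training set
licensing_pool = 10_000_000          # suppose $10M were set aside to pay creators
per_image = licensing_pool / training_images
print(f"${per_image:.3f} per image")   # $0.005 -- half a cent each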



« Reply #1 on: September 30, 2022, 13:15 »
+3

Simple.
Copyright owners have to be asked for their agreement
BEFORE their images are used for AI training.

« Reply #2 on: September 30, 2022, 14:51 »
+4
How is the training issue (not seeking permission or giving compensation to the copyright holder) different from using samples in music? There've been lots of lawsuits over this and I don't think the notion that the sample is short gets you off the hook.

The fact that you can't create these images without a large database to "train" with is not at issue, as far as I know. The fact that there are lots of people's copyrighted work that you're only stealing a very little bit from doesn't really change the basics of the transaction. Even images lifted from social media have copyright - the person who snapped the image holds it.

It's hard not to draw the conclusion that a big tech entity can rely on the lousy economics from the individual copyright holder's perspective when crowd-"sourcing" (read: crowd-stealing): paying a lawyer to go after the misuse is too expensive for most people to afford. "Sourcing" internationally makes it even less likely people will come after you.

IMO individual creators of the works used to train AI systems probably can't do anything about this wholesale misuse of their work, but that doesn't alter the fact that there is wholesale misuse.

https://www.vondranlegal.com/five-music-infringement-cases-mixingsampling

https://www.highsnobiety.com/p/unauthorized-rap-samples/

Music Sampling Lawsuits: Does Looping Music Samples Defeat the De Minimis Defense?

« Reply #3 on: September 30, 2022, 18:12 »
0
How is the training issue (not seeking permission or giving compensation to the copyright holder) different from using samples in music? There've been lots of lawsuits over this and I don't think the notion that the sample is short gets you off the hook.

The fact that you can't create these images without a large database to "train" with is not at issue, as far as I know. The fact that there are lots of people's copyrighted work that you're only stealing a very little bit from doesn't really change the basics of the transaction. Even images lifted from social media have copyright - the person who snapped the image holds it.....

thanks for those links!
Sampling music takes one work & adds/modifies it; similar to extracting the sky or background from an image & pasting it into your image. But there are some relevant issues:


==========
C. Fair Use Defense
Exceptions exist to the exclusive rights granted to copyright owners. The fair use doctrine allows someone other than the copyright owner to use the copyrighted work in a reasonable manner without permission.

...
If the trier of fact determines that the level of unauthorized use does not rise above the threshold of substantial similarity, then the trier should find the unauthorized use to be de minimis. [too small to matter]

...
Problems arise when the appropriation is quantitatively small and not the heart of the work.

The difference again is that these sampling cases address specific, detectable copying; in ML the individual images are just one of millions analyzed, with no single image detectable in the result.
So, if the data used is only one of thousands tested, it isn't "the heart of the work"? (very different than transforming one image)

For ML training, the images are analyzed and classified. The result is not copying or storing the image per se, but storing a digital analysis of the image, retrievable by its tags. Such a process is one-way, not reversible -- you can't reconstruct the image from the new collection [for some wonky details: https://nanonets.com/blog/machine-learning-image-processing/#working-of-machine-learning-image-processing ]. Analogous to Netflix scraping copyrighted works and producing recommendations?
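To make that concrete, here is a toy sketch (plain Python + numpy; the feature extractor, tags, and table are all made up for illustration and are nothing like a real image-generation model) of what "storing a digital analysis retrievable by its tags" could look like, and why it's one-way:

import numpy as np

# Stand-in for one training image (in reality, a decoded JPEG/PNG).
image = np.random.randint(0, 256, size=(512, 512, 3), dtype=np.uint8)
tags = ("sunset", "beach", "silhouette")   # hypothetical caption/tags

def extract_features(img, bins=8):
    # Boil the image down to a small color-histogram vector. A real model
    # learns far richer features, but the principle is the same: what gets
    # kept is a handful of numbers, not the pixels.
    hists = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    vec = np.concatenate(hists).astype(float)
    return vec / vec.sum()                  # normalized; the pixel data is gone

# What the (toy) training phase retains: tags -> a tiny feature vector.
training_table = {tags: extract_features(image)}

print(training_table[tags].shape)   # (24,)  -- 24 numbers kept
print(image.size)                   # 786432 -- pixel values not stored
# 24 numbers can't be inverted back into 786,432 pixel values: the mapping
# is lossy and one-way, which is the point about not storing the image.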

In Sandoval v. New Line Cinema, artist Sandoval alleged that New Line Cinema's theatrical release of Seven infringed his copyrighted work when his photographs appeared in a movie scene without his permission. The court reiterated the de minimis analysis in Ringgold, but held, as a matter of law, that the use of the photographs was de minimis because the photographs were not quantitatively observable.

Similarly, no individual image is observable in images created de novo? To create a new image, the generative prompt is compared with those non-image tables. The training results are a novel expression.

quotes from https://cpb-us-e1.wpmucdn.com/sites.suffolk.edu/dist/5/1153/files/2018/01/SRWILSONV1N1N-rp2cve.pdf (hopefully fair use!)

The copyright questions arise in the collection & use of the images in the training phase, not in the later creative phase. Should infringement in the first area contaminate the later use of that resource? If you use a thesaurus that was prepared from copyrighted collections/compilations, are you liable if you use that book to write the next great novel? Similarly, the results of training are usable in creating new images. IANAL.
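A companion toy sketch of that later creative phase (again purely illustrative: made-up tags and a fake "model" that is just the kind of averaged statistics the training sketch above would produce), showing that the output pixels are newly sampled rather than copied from any stored image:

import numpy as np

# Toy "model": nothing but tag sets mapped to small feature rows -- a
# non-image table, with no training pixels in it.
model = {
    ("sunset", "beach"):   np.random.rand(24),
    ("city", "night"):     np.random.rand(24),
    ("forest", "morning"): np.random.rand(24),
}

def generate(prompt, size=(64, 64, 3)):
    # Match the prompt's words against the stored tags, blend the matching
    # statistics, then sample brand-new pixels biased by that blend.
    words = set(prompt.lower().split())
    matches = [vec for t, vec in model.items() if words & set(t)]
    blend = np.mean(matches, axis=0) if matches else np.full(24, 1 / 24)
    rng = np.random.default_rng()
    pixels = rng.random(size) * blend.mean() * 255   # no stored image involved
    return pixels.astype(np.uint8)

new_image = generate("sunset over the beach")
print(new_image.shape, new_image.dtype)   # (64, 64, 3) uint8 -- novel pixels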

« Last Edit: September 30, 2022, 18:39 by cascoly »

« Reply #4 on: October 01, 2022, 00:16 »
+1
Read the article. There ARE concerns about copyrights. No one is conflating anything. There are still issues to iron out. The AI is STILL using parts of other people's images without permission. Yes, that might be five pixels, or it could be 1/4 of the image not altered.

And this, about DALL-E: "This public debut comes without answers to some key questions. It's not clear if AI-generated art is fair use or stolen, for instance. Getty Images and similar services have banned the material out of concern it might violate copyright. While this expansion will be welcome, it might test some legal limits."
https://www.engadget.com/dall-e-ai-image-generator-beta-no-waitlist-173746483.html


 
