

Author Topic: Shutterstock+AWS press release: your images as AI training data  (Read 5070 times)


« on: July 29, 2021, 13:07 »
+12
https://investor.shutterstock.com/news-releases/news-release-details/shutterstockai-launches-data-aws-data-exchange-advance-computer

I think the announcement means that Shutterstock will make images and metadata available for training AI systems, or for other non-traditional uses, with no compensation to the owners of the images or metadata (Shutterstock doesn't do any keywording). In spite of their claims about rigorous review, keyword spam is rife on their site.

"The datasets include collections of images and 3D models from Shutterstock.AI's library of 400 million visual assets, along with metadata backed by rigorous human and AI review. The datasets span multiple industry categories, and have been curated to align with some of the most common computer vision applications in ecommerce, travel and tourism, self-driving cars, and consumer electronics."

https://www.shutterstock.com/blog/shutterstock-ai-aws-team-up-to-help-companies-with-computer-vision

Here's some pricing information on the AWS web site:

https://aws.amazon.com/marketplace/seller-profile?id=fb34254c-c7cf-47b8-806c-24045a0a2807

How does that $10,000 price get shared out among contributors?

From the description on AWS as to what you get for your (minimum) $10,000, you're not licensing content, rather "This data license gives you the right to train models for the duration of the subscription. Data sets will be published to your S3 bucket." I could easily see how Shutterstock would decide nothing was due in royalties for training models, even though without contributor content they'd have nothing to offer.
« Last Edit: July 29, 2021, 16:00 by Jo Ann Snover »


« Reply #1 on: July 29, 2021, 15:05 »
+1
What contributors get as their share is a reset in January.

« Reply #2 on: July 30, 2021, 10:56 »
+4
If keyword spam is a true problem then everything will be classified as "background" :-)

A machine learning algorithm walks into a bar. The bartender asks, "What'll you have?" The algorithm says, "What's everyone else having?"

« Reply #3 on: July 30, 2021, 12:31 »
0
Comments in this thread could amount to free brainstorming for a company that puts effort into paying us less. Especially since it is, indeed, possible to separate the background noise from the valuable data.

« Reply #4 on: July 30, 2021, 13:13 »
+2
I wonder if hidden in one of the changes to the TOS was something saying they could profit off our keywording (intellectual property) without paying us.

« Reply #5 on: July 31, 2021, 02:31 »
+3
I wonder if hidden in one of the changes to the TOS was something saying they could profit off our keywording (intellectual property) without paying us.

Aren't we already sharing our keywords, or allowing others to use our keywords, via the Shutterstock keyword suggestion tool?

« Reply #6 on: July 31, 2021, 15:06 »
+8

Aren't we already sharing our keywords, or allowing others to use our keywords, via the Shutterstock keyword suggestion tool?

Shutterstock isn't charging buyers $10,000+ for 12 months access to the keyword tool. The issue here is them making money they don't share with the contributors who created the source material.

« Reply #7 on: August 01, 2021, 07:06 »
+1
Shutterstock isn't charging buyers $10,000+ for 12 months access to the keyword tool. The issue here is them making money they don't share with the contributors who created the source material.

If I'm not mistaken, nothing is mentioned about how keywords are handled as intellectual property or how they can be subject to royalties. Initially, keywords were meant to support the visibility of the content in the database; they had no other value. Developments in AI and the need for big datasets to train algorithms changed that. Shutterstock is sitting on such a set of data. It would be foolish of them not to try to monetize that opportunity. And it's a dick move not to share that revenue with the ones who built up that dataset: the contributors.

On the other hand, how much would we get? Let's say the Shutterstock database contains 400 million assets.
That means, on a $10,000 deal, the keyword set per asset would be worth $0.000025. If Shutterstock takes 85% and gives 15% to the contributors... how much would be left?
It would require Shutterstock to sign a lot of those deals before contributors with big portfolios that match the category restrictions start seeing some significant income from their... keywording efforts.
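
To put rough numbers on the estimate above, here is a minimal sketch in Python. The $10,000 deal, the 400 million assets and the 15% contributor share come from this post; the even per-asset split, the 10,000-asset portfolio and the $100 target are purely hypothetical assumptions for illustration, not anything Shutterstock has announced.

```python
import math

# Figures taken from the post above.
DEAL_USD = 10_000            # minimum subscription price quoted on AWS
TOTAL_ASSETS = 400_000_000   # assumed size of the whole library
CONTRIBUTOR_SHARE = 0.15     # hypothetical royalty rate used in the post

# Hypothetical figures, for illustration only.
PORTFOLIO_SIZE = 10_000      # assets owned by one contributor
TARGET_USD = 100             # income the contributor would like to reach


def per_asset_value(deal_usd: float, total_assets: int) -> float:
    """Value of one asset if a deal is split evenly across all assets."""
    return deal_usd / total_assets


value = per_asset_value(DEAL_USD, TOTAL_ASSETS)
royalty_per_asset = value * CONTRIBUTOR_SHARE
income_per_deal = royalty_per_asset * PORTFOLIO_SIZE
deals_needed = math.ceil(TARGET_USD / income_per_deal)

print(f"Per-asset value:       ${value:.6f}")
print(f"Royalty per asset:     ${royalty_per_asset:.8f}")
print(f"Income per deal:       ${income_per_deal:.4f}")
print(f"Deals to reach ${TARGET_USD}: {deals_needed}")
```

Under these assumptions, a 10,000-asset portfolio earns under four cents per deal, so it would take thousands of deals to reach even $100. A dataset curated from a smaller slice of the library would raise the per-asset figure, but only for the assets included.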

I understand your point and agree that it's again a greedy move of a company that milks their sources dry. But I can only apathetically shrug my shoulders, and move on after yet another case of exploiting contributors.

« Reply #8 on: August 03, 2021, 18:40 »
+2
Garbage In - Garbage Out might be amusing here.

If they shovel in some of the non-reviewed, random rubbish complete with keyword spam then all bets are off what the AI will "learn" from that.

Machine learning is only as good as the quality of the material it's fed.

« Reply #9 on: August 06, 2021, 11:09 »
0
I remember Pixta had a contract for AI training a few years ago.
1. Some contributors were invited to participate
2. For a small fee (it was not a final usage of the specific images on the client's side, that's why it was small)

As far as I remember, keyword sets and titles are intellectual property, and SS treated them as such before. What's changed since then? They just know we are going to swallow it.


 
