MicrostockGroup

Agency Based Discussion => Shutterstock.com => Topic started by: Jo Ann Snover on July 29, 2021, 13:07

Title: Shutterstock+AWS press release: your images as AI training data
Post by: Jo Ann Snover on July 29, 2021, 13:07
https://investor.shutterstock.com/news-releases/news-release-details/shutterstockai-launches-data-aws-data-exchange-advance-computer

I think that the announcement means that there will be images and metadata made available by Shutterstock to be used to train AI systems or other non-traditional uses with no compensation to the owners of the images or metadata (Shutterstock doesn't do any keywording). In spite of their claims about rigorous review, keyword spam is rife on their site.

"The datasets include collections of images and 3D models from Shutterstock.AI's library of 400 million visual assets, along with metadata backed by rigorous human and AI review. The datasets span multiple industry categories, and have been curated to align with some of the most common computer vision applications in ecommerce, travel and tourism, self-driving cars, and consumer electronics."

https://www.shutterstock.com/blog/shutterstock-ai-aws-team-up-to-help-companies-with-computer-vision

Here's some pricing information on the AWS web site:

https://aws.amazon.com/marketplace/seller-profile?id=fb34254c-c7cf-47b8-806c-24045a0a2807

How does that $10,000 price get shared out among contributors?

From the description on AWS as to what you get for your (minimum) $10,000, you're not licensing content; rather, "This data license gives you the right to train models for the duration of the subscription. Data sets will be published to your S3 bucket." I could easily see how Shutterstock would decide nothing was due in royalties for training models, even though without contributor content they'd have nothing to offer.
Title: Re: Shutterstock+AWS press release: your images as AI training data
Post by: oooo on July 29, 2021, 15:05
Contributors' share is a reset in January.
Title: Re: Shutterstock+AWS press release: your images as AI training data
Post by: melastmohican on July 30, 2021, 10:56
If keyword spam is truly a problem, then everything will be classified as "background" :-)

“A machine learning algorithm walks into a bar. The bartender asks, ‘What’ll you have?’ The algorithm says, ‘What’s everyone else having?’”
Title: Re: Shutterstock+AWS press release: your images as AI training data
Post by: changingsky on July 30, 2021, 12:31
Comments in this thread amount to free brainstorming for a company that puts its effort into paying us less, especially since it is indeed possible to separate background noise from the valuable data.
Title: Re: Shutterstock+AWS press release: your images as AI training data
Post by: pancaketom on July 30, 2021, 13:13
I wonder if hidden in one of the changes to the TOS was something saying they could profit off our keywording (intellectual property) without paying us.
Title: Re: Shutterstock+AWS press release: your images as AI training data
Post by: Roscoe on July 31, 2021, 02:31
I wonder if hidden in one of the changes to the TOS was something saying they could profit off our keywording (intellectual property) without paying us.

Aren't we already sharing our keywords, or allowing others to use our keywords, via the Shutterstock keyword suggestion tool?
Title: Re: Shutterstock+AWS press release: your images as AI training data
Post by: Jo Ann Snover on July 31, 2021, 15:06

Aren't we already sharing our keywords, or allowing others to use our keywords, via the Shutterstock keyword suggestion tool?

Shutterstock isn't charging buyers $10,000+ for 12 months access to the keyword tool. The issue here is them making money they don't share with the contributors who created the source material.
Title: Re: Shutterstock+AWS press release: your images as AI training data
Post by: Roscoe on August 01, 2021, 07:06
Shutterstock isn't charging buyers $10,000+ for 12 months access to the keyword tool. The issue here is them making money they don't share with the contributors who created the source material.

If I'm not mistaken, there's nothing mentioned about how keywords are handled as intellectual property or how they can be subject to royalties. Initially, keywords were meant to support the visibility of the content in the database; they had no other value. Developments in AI and the need for big datasets to train algorithms changed that. Shutterstock is sitting on such a set of data. It would be foolish of them not to try to monetize that opportunity. And it's a dick move not to share that revenue with the ones who built up that dataset: the contributors.

On the other hand, how much would we get? Let's say the Shutterstock database contains 400 million assets.
That means, on a $10,000 deal, the keyword set per asset would be worth $0.000025. If Shutterstock takes 85% and gives 15% to the contributors... how much would be left?
Shutterstock would have to sign a lot of those deals before contributors with big portfolios that match the category restrictions start seeing significant income from their... keywording efforts.
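
The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not anything Shutterstock has published: it assumes the deal price is split evenly across every asset in the library, it uses the hypothetical 15% contributor rate from the post, and the function names and the 10,000-asset portfolio are made up for the example.

```python
import math

# Figures taken from the thread; the even split and the 15% rate are assumptions.
DEAL_PRICE_USD = 10_000        # minimum price quoted on the AWS listing
LIBRARY_SIZE = 400_000_000     # "400 million visual assets" per the press release
CONTRIBUTOR_RATE = 0.15        # hypothetical: Shutterstock keeps 85%


def per_asset_value(deal_price: float, library_size: int) -> float:
    """Value of one asset if a deal is split evenly across the whole library."""
    return deal_price / library_size


def contributor_payout(
    portfolio_size: int,
    deal_price: float = DEAL_PRICE_USD,
    library_size: int = LIBRARY_SIZE,
    rate: float = CONTRIBUTOR_RATE,
) -> float:
    """What a contributor with `portfolio_size` assets would earn from one deal."""
    return portfolio_size * per_asset_value(deal_price, library_size) * rate


def deals_needed(target_usd: float, portfolio_size: int) -> int:
    """Number of deals required before the contributor's payout reaches the target."""
    return math.ceil(target_usd / contributor_payout(portfolio_size))


if __name__ == "__main__":
    portfolio = 10_000  # hypothetical portfolio size
    print(f"Per asset, per deal:      ${per_asset_value(DEAL_PRICE_USD, LIBRARY_SIZE):.6f}")
    print(f"Contributor, per deal:    ${contributor_payout(portfolio):.4f}")
    print(f"Deals needed to earn $100: {deals_needed(100, portfolio)}")
```

Under those assumptions, a 10,000-asset portfolio earns about $0.0375 per deal, so it would take roughly 2,667 deals to reach $100. A deal limited to one category would spread the price over fewer assets, so the real per-asset figure could be higher.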

I understand your point and agree that it's once again a greedy move by a company that milks its sources dry. But I can only apathetically shrug my shoulders and move on after yet another case of exploiting contributors.
Title: Re: Shutterstock+AWS press release: your images as AI training data
Post by: gnirtS on August 03, 2021, 18:40
Garbage In - Garbage Out might be amusing here.

If they shovel in some of the non-reviewed, random rubbish complete with keyword spam, then all bets are off as to what the AI will "learn" from that.

Machine learning is only as good as the quality of the material it's fed.
Title: Re: Shutterstock+AWS press release: your images as AI training data
Post by: PokemonMaster on August 06, 2021, 11:09
I remember Pixta had a contract for AI training a few years ago.
1. Some contributors were offered the chance to participate
2. For a small fee (the client was not making final use of the specific images, which is why it was small)

As far as I remember, keyword sets and titles are intellectual property, and SS treated them as such before. What's changed since then? They just know we are going to swallow it.