Author Topic: How Algorithms Work  (Read 2806 times)

Annie2022

« on: November 19, 2022, 17:41 »
+4
I came across this last night and thought it was interesting. Etsy explains how their search algorithm works:

https://www.etsy.com/seller-handbook/article/375461474487

and

https://www.etsy.com/seller-handbook/article/366469719354

Quote
How do we determine listing quality?

New shops and listings start with a neutral quality score. When a new listing is created, it gets a small, temporary boost in search results so Etsy search can quickly learn more about how shoppers interact with it. To determine a listing's quality score, we look at things like clicks, favorites, and purchases. Etsy search looks at things like past shopper behavior to predict how likely shoppers are to purchase listings from your shop in the future. So if you sell vintage or one-of-a-kind items and are frequently creating new listings, Etsy search looks at how popular listings from your shop have been in the past to predict how popular they might be with buyers in the future.

Optimizing your listings with conversion in mind is one step you can take to improve your search ranking and your shop's visibility.

Quote
Listing quality score

When a buyer searches on Etsy, our goal is to help them find items they want to purchase. To do this, Etsy search looks at clues from shoppers to determine whether a listing is appealing and meets their expectations once they click. We look at how well a listing converts (how many people view it and then make a purchase) to determine whether buyers are interested in it, which boosts that listing's quality score and placement in search results.

Quote
Customer and market experience score

In addition to helping people find items they want to purchase, we want buyers to have a great experience when they shop on Etsy. To do this, we give each shop a customer and market experience score and factor it into search placement.
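
To make the conversion idea in the Etsy quotes concrete, here is a rough Python sketch. It is not Etsy's formula: the quotes also mention clicks and favorites, which are left out here, and the names `ListingStats`, `neutral` and `prior_views` and their values are assumptions made up for illustration. It only shows how a listing could start at a neutral score and drift toward its observed conversion as views come in.

```python
from dataclasses import dataclass


@dataclass
class ListingStats:
    views: int
    purchases: int


def conversion_rate(stats: ListingStats) -> float:
    """Share of views that ended in a purchase (0.0 when there are no views)."""
    if stats.views == 0:
        return 0.0
    return stats.purchases / stats.views


def quality_score(stats: ListingStats, neutral: float = 0.02, prior_views: int = 100) -> float:
    """Blend the observed conversion with a neutral starting score.

    `neutral` and `prior_views` are invented numbers: a listing with few
    views stays close to `neutral`, and the observed conversion takes over
    as views accumulate.
    """
    return (stats.purchases + neutral * prior_views) / (stats.views + prior_views)


if __name__ == "__main__":
    print(quality_score(ListingStats(views=0, purchases=0)))      # brand-new listing
    print(quality_score(ListingStats(views=1000, purchases=50)))  # converts well
    print(quality_score(ListingStats(views=1000, purchases=5)))   # many views, few sales
```

Under these assumptions a listing with many views and few purchases ends up below the neutral score, which matches the "lots of views but doesn't sell" penalty described for Indivstock.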



Also, earlier this year, Roscoe started a thread (which unfortunately was taken down because of some unrelated fighting) that demonstrated how Indivstock openly shows how photos rank on searches with a plus/minus system. Things like how editorials lose considerable ranking points as they age, or how holidays and seasonal files have additional points to increase ranking, or if an item gets lots of views but doesn't sell, it loses ranking points. I can't remember them all, but it was very revealing.



I am guessing that the basics of designing an algorithm have a lot of similarities, whether it's for an online marketplace or a microstock agency, and that knowing them can be very helpful when deciding what to shoot.


Interested in what others say.



« Reply #1 on: November 20, 2022, 21:27 »
0
The basics of how the algorithm works are obvious - put pictures that are likely to be bought in front of buyers. Obviously previous purchases on similar searches have a high rank. What is less obvious are all the other factors such as location, age of the image, the port as a whole (and other factors about the artist such as how often they upload, their location, how well their other works do, and who knows what else), camera, image size, and who knows what other things they use. Plus they are probably continuously testing and changing it.

Add in the fact that there is real money to be made in SEO and gaming the system and it is a lot messier than it started out.

Annie2022

« Reply #2 on: November 24, 2022, 14:26 »
+5
Ah, I found it - this is what Roscoe sent me after the Indivstock thread was deleted:

Quote
image ranking | May 22, 2022
The ranking of individual images is now determined as follows. The following values have been converted to approximate percentages. The following descriptions are simplified reformulations.

+ 0.02% views in approx. 1 week period.
+- 0.20% views and downloads in relation.
+- 2.00% Marked as "outstanding" during selection.
+ 3.00% Keywords contain current topics, e.g.: climate change, Ukraine, Christmas, new year, ...
+ 5.00% new image and "outstanding".
+ 5.00% People photography.
- 10.00% Image was marked as "for adults only".
- 5.00% Image was marked as "editorial only".
- 5.00% Image was marked as "free".
+ 2.00% Marked as "outstanding" during selection and contains current topics.
- 20.00% Image is free but contains no keyword on current topics.
+- 2.00% Image is free and contains keywords on current topics.
+ 3.00% Image is free and keywords contain important long-term topics, e.g.: Bitcoin.
+ 2.00% keywords contain current topics, multiple match.
+ 2.00% Title contains current topics.
+ 2.00% Title contains current topics, multiple match.
+ 3.00% Artist bonus in general as well as keywords and titles of images predominantly without "spam" keywording, also title.
+ 2.00% Artist bonus in general as well as portfolio mostly popular.
+ 2.00% Artist bonus in general as well as portfolio mostly "outstanding".
+ 2.00% Keywords contain current topics, multiple matches in combination e.g.: "price" + "gas".
+ 2.00% keywords contain current topics, multiple matches in a combination e.g.: "price" + "gas" and marked as "outstanding".
+ 1.00% Keywords contain important long-term topics e.g.: Technology.
- 20.00% Image was marked as "editorial only" and older than current month.
- 10.00% Image was marked as "editorial only" and older than current and last year.
- 10.00% Image was marked as "editorial only" and older than current year.
+ 1.00% keywords contain current topics and image is not free but "outstanding".
+ 1.00% Image was recently purchased and is not "editorial only".
+ 2.00% Image has been bought several times and is not "only editorial".
- 40.00% Keyword older than current year. e.g.: 2021 in 2022.
+ 7.00% first 500 from top ranking of previous cycles, current year.
+ 3.00% first 500 from top ranking of previous cycles, older than current year.
- 100.00% compensation with massive increase in views due to Google, in proportion.
- 100.00% compensation, with massive increase in views but no download, in proportion, without Google.
+ 10.00% Image has just been downloaded and is not free.
+ 2.00% Image has just been downloaded and is free.
- 3.00% unnecessary keywords, number of keywords greater than necessary.
- 5.00% Title and keywords do not match.
- 1.00% keywords contradict each other e.g.: "background" and "isolated".
- 5.00% keywords contradict each other very strongly e.g.: photo of a woman but keyword "men".
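
For anyone who wants to play with the numbers, here is a minimal Python sketch of a plus/minus system like the one listed above. The percentages are copied from the list, but the flag names are invented, only a handful of rules are included, and the big assumption is that the adjustments simply add up. The list does not say how the agency actually combines them.

```python
# A few adjustments copied from the list above, in percent.
# The flag names are invented for this sketch.
ADJUSTMENTS = {
    "people_photography": 5.0,
    "current_topic_keywords": 3.0,
    "editorial_only": -5.0,
    "editorial_older_than_current_month": -20.0,
    "title_keywords_mismatch": -5.0,
    "just_downloaded_not_free": 10.0,
}


def total_adjustment(flags: set[str]) -> float:
    """Sum the adjustments for every flag that applies to an image.

    Assumes the adjustments are additive; that is a guess, not something
    the quoted list confirms.
    """
    unknown = flags - set(ADJUSTMENTS)
    if unknown:
        raise KeyError(f"unknown flags: {sorted(unknown)}")
    return sum(ADJUSTMENTS[flag] for flag in flags)


if __name__ == "__main__":
    # Topical people photo: +5 +3
    print(total_adjustment({"people_photography", "current_topic_keywords"}))  # 8.0
    # Older editorial on a current topic: +3 -5 -20
    print(total_adjustment({
        "current_topic_keywords",
        "editorial_only",
        "editorial_older_than_current_month",
    }))  # -22.0
```

If the penalties really do stack like this, an editorial image loses far more from its age than it gains from a topical keyword.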

Keywording and findability. Titles and keywords are removed individually, removed completely or replaced by the title. In the majority of cases, this also has an influence on the ranking. Short excerpt.
Spam like multiple commas in the description or repetitions.
If editorial but no creation date.
Mutually exclusive descriptions, e.g: Winter and summer.
Unclear description of what applies, spring and summer and autumn.
Remove typical non-descriptive words like "high resolution, close up, seamless" if inappropriate.
Remove untrue words like "seamless" if not seamless.
If "young" but not applicable.
If "tourists" but not a person.
If "tourism" but not a typical tourist image.
If "clouds" but not relevant.
If "environment", "business", "concept" but not relevant.
If "isolated" then not "background".
If "artificial" for plants but not as title.
If plural, but 1.
If "garden" but only single plant.
If single plant without name.
Convert symbols such as paragraph, question mark to words.
Remove "'and, the, with, in, at, of, ...".
If "Europe" but not relevant.
If "photography" but not relevant.
If "architecture" but not thematic.
If "decoration, decorative, antique, painting" but not applicable.
If "amazing, aesthetic, ..." but not descriptive.
If "beauty, beautyful,.." but neither woman, women, man, men, person, etc.".
If "beauty, beautyful,..." but animal or plant".
If "Close up images of, ...".
If "Detailed, showing,...".
If "Vector" but not vector.
If "template" but not template.
If "during, seen, view from..." but not matching.
If "group" but not human or clearly "group".
If "nature" but spam.
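
A couple of the purely mechanical rules above (dropping filler words, removing repeats, flagging keyword pairs that exclude each other) can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the word lists contain just the examples given above, and rules that need to look at the image itself ("young" but not applicable, "clouds" but not relevant) are outside what a sketch like this can do.

```python
# Filler words named in the list above.
STOPWORDS = {"and", "the", "with", "in", "at", "of"}
# Pairs the list treats as excluding each other; only two are encoded here.
CONTRADICTIONS = [("isolated", "background"), ("winter", "summer")]


def clean_keywords(keywords: list[str]) -> list[str]:
    """Lowercase, trim, drop stopwords and repeats, keep first-seen order."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for raw in keywords:
        word = raw.strip().lower()
        if not word or word in STOPWORDS or word in seen:
            continue
        seen.add(word)
        cleaned.append(word)
    return cleaned


def find_contradictions(keywords: list[str]) -> list[tuple[str, str]]:
    """Return every encoded pair for which both keywords are present."""
    present = set(keywords)
    return [pair for pair in CONTRADICTIONS if pair[0] in present and pair[1] in present]


if __name__ == "__main__":
    kws = clean_keywords(["Apple", "the", "apple ", "isolated", "of", "Background"])
    print(kws)                       # ['apple', 'isolated', 'background']
    print(find_contradictions(kws))  # [('isolated', 'background')]
```

Because the check is an exact match on single keywords, a two-word keyword such as "white background" would not be flagged by this sketch. Whether the real system treats it the same way is unknown.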


Uncle Pete

  • Great Place by a Great Lake - My Home Port
« Reply #3 on: November 25, 2022, 14:03 »
+4
I liked this part:

- 3.00% unnecessary keywords, number of keywords greater than necessary.
- 5.00% Title and keywords do not match.
- 1.00% keywords contradict each other e.g.: "background" and "isolated".
- 5.00% keywords contradict each other very strongly e.g.: photo of a woman but keyword "men".

« Reply #4 on: November 25, 2022, 16:59 »
+1
Quote
Ah, I found it - this is what Roscoe sent me after the Indivstock thread was deleted:
https://www.indivstock.com/members/support_feedback

Annie2022

« Reply #5 on: November 25, 2022, 18:42 »
+1
Quote
I liked this part:

- 3.00% unnecessary keywords, number of keywords greater than necessary.
- 5.00% Title and keywords do not match.
- 1.00% keywords contradict each other e.g.: "background" and "isolated".
- 5.00% keywords contradict each other very strongly e.g.: photo of a woman but keyword "men".

Yes.

Bad news for bad keywording.

Also bad news for editorials:

- 20.00% Image was marked as "editorial only" and older than current month.
- 10.00% Image was marked as "editorial only" and older than current and last year.
- 10.00% Image was marked as "editorial only" and older than current year.

bad news for lots of views but no sales:

- 100.00% compensation, with massive increase in views but no download, in proportion, without Google.


Good news for topical subjects:

+ 3.00% Keywords contain current topics, e.g.: climate change, Ukraine, Christmas, new year, ...
+ 2.00% keywords contain current topics, multiple match.
+ 2.00% Title contains current topics.
+ 2.00% Title contains current topics, multiple match.
+ 2.00% Keywords contain current topics, multiple matches in combination e.g.: "price" + "gas".
+ 1.00% Keywords contain important long-term topics e.g.: Technology.

and good news for artists with quality work:

+ 3.00% Artist bonus in general as well as keywords and titles of images predominantly without "spam" keywording, also title.
+ 2.00% Artist bonus in general as well as portfolio mostly popular.
+ 2.00% Artist bonus in general as well as portfolio mostly "outstanding".





OM

« Reply #6 on: November 26, 2022, 04:42 »
0
Quote
I liked this part:

- 3.00% unnecessary keywords, number of keywords greater than necessary.
- 5.00% Title and keywords do not match.
- 1.00% keywords contradict each other e.g.: "background" and "isolated".
- 5.00% keywords contradict each other very strongly e.g.: photo of a woman but keyword "men".

Yes.

Bad news for bad keywording.

Also bad news for editorials:

- 20.00% Image was marked as "editorial only" and older than current month.
- 10.00% Image was marked as "editorial only" and older than current and last year.
- 10.00% Image was marked as "editorial only" and older than current year.

bad news for lots of views but no sales:

- 100.00% compensation, with massive increase in views but no download, in proportion, without Google.


Good news for topical subjects:

+ 3.00% Keywords contain current topics, e.g.: climate change, Ukraine, Christmas, new year, ...
+ 2.00% keywords contain current topics, multiple match.
+ 2.00% Title contains current topics.
+ 2.00% Title contains current topics, multiple match.
+ 2.00% Keywords contain current topics, multiple matches in combination e.g.: "price" + "gas".
+ 1.00% Keywords contain important long-term topics e.g.: Technology.

and good news for artists with quality work:

+ 3.00% Artist bonus in general as well as keywords and titles of images predominantly without "spam" keywording, also title.
+ 2.00% Artist bonus in general as well as portfolio mostly popular.
+ 2.00% Artist bonus in general as well as portfolio mostly "outstanding".

Well (re)found Annie. Very interesting information. Thanks.

"+- 2.00% Marked as "outstanding" during selection"

Question: Who is doing this selection and marking as outstanding?

« Reply #7 on: November 26, 2022, 04:42 »
+1
While most of the keyword ratings make sense to me, I don't agree with all of them and don't think they are always beneficial to the customer.
For example: "If "isolated" then not "background"".
The term "isolated" in microstock language means "plain, shadowless, easy to remove background" (definition by Dreamstime, but it's mostly similar wherever you look).
I usually add the keyword "isolated" along with "white background". So the same image would have the keywords "isolated" and "background" and they are not contradicting each other at all. The same goes for the example "keywords contradict each other very strongly e.g.: photo of a woman but keyword "men"."
What if I have for example a "photo of a woman with bruises" and in the background there is a blurry man, depicting domestic violence? Then both keywords would be very relevant and not contradicting at all.

I find the whole "keywords contradicting each other" thing problematic. Whether two keywords are really contradicting each other depends too much on the context of the image, but an automated keyword ranking system can't understand that.

Another example: "If "garden" but only single plant." There are all kinds of different plants, like indoor plants, agricultural plants or plants you would plant in a garden. If I show the latter, I would add the keyword "garden plant" even if it was just one plant shown in the picture, to make the keywords as specific as possible, but apparently I am not supposed to do so?

Another issue I have is with the downranking of editorial images that are older than a month/a year. With some that makes sense, but many editorial images don't lose relevance after a month, or even a year. I have an image from a climate change protest that I submitted maybe 3 years ago that still keeps selling, though it keeps selling less and less over time. The topic isn't any less relevant now than it was 3 years ago and the image has not lost any of its usefulness. So it would gain 3% for "current topic", but lose 20% after just a month and another 10% after a year. Doesn't make sense to me.
« Last Edit: November 28, 2022, 06:59 by Her Ugliness »

« Reply #8 on: November 26, 2022, 11:00 »
+2
Quote
...
Another issue I have is with the downranking of editorial images that are older than a month/a year. With some that makes sense, but many editorial images don't lose relevance after a month, or even a year. I have an image from a climate change protest that I submitted maybe 3 years ago that still keeps selling, though it keeps selling less and less over time. The topic isn't any less relevant now than it was 3 years ago and the image has not lost any of its usefulness. So it would gain 3% for "current topic", but lose 20% after just a month and another 10% after a year. Doesn't make sense to me.

a larger issue is 'editorial' is not only 'newsworthy' - it applies to any image lacking the needed property or model releases

and a silly one - a depiction of a Nov 30 event would be penalized if submitted Dec 1

finally 'editorial only' isn't even used in most cases since most agencies mark editorial images.   
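
The Nov 30 / Dec 1 case is easy to reproduce if you assume the rule "older than current month" is a plain calendar-month comparison. That is only a guess at how the rule is implemented, and both function names below are made up. The second function shows an alternative reading (age in days) that would not penalize the image the next day.

```python
from datetime import date


def older_than_current_month(created: date, today: date) -> bool:
    """True when `created` falls in an earlier calendar month than `today`."""
    return (created.year, created.month) < (today.year, today.month)


def older_than_days(created: date, today: date, days: int = 30) -> bool:
    """Alternative reading: true only once the image is more than `days` days old."""
    return (today - created).days > days


if __name__ == "__main__":
    event = date(2022, 11, 30)
    submitted = date(2022, 12, 1)
    print(older_than_current_month(event, submitted))  # True: penalized after one day
    print(older_than_days(event, submitted))           # False: only one day old
```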

« Reply #9 on: November 28, 2022, 05:27 »
0
I don't think this applies to all agencies; it'd take too much time for a reviewer to flag what is outstanding or not... Other parameters may be flagged by AI, I guess...


 
