Very interesting analysis - in particular, the part about the distribution of keyword counts.
The stats for the top pages suggest that the ranking algorithm might be biased towards images with fewer keywords, so having fewer keywords might give you a better chance of reaching the top pages.
This hypothesis could be checked with a boxplot of the image rank distribution grouped by number of keywords - do you think you could include this in "part 2"?
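
In case it helps, here is a rough sketch of the kind of plot I have in mind. I don't know how your data is stored, so the input format (a list of `(rank, keyword_count)` pairs) and the function names `group_ranks_by_keyword_count` and `plot_rank_boxplot` are just my assumptions, and the plotting part assumes matplotlib is installed.

```python
from collections import defaultdict


def group_ranks_by_keyword_count(images):
    """Group image ranks by keyword count.

    `images` is assumed to be an iterable of (rank, keyword_count) pairs;
    this layout is hypothetical and should be adapted to the real dataset.
    Returns a dict {keyword_count: [ranks...]} ordered by keyword count.
    """
    groups = defaultdict(list)
    for rank, keyword_count in images:
        groups[keyword_count].append(rank)
    return dict(sorted(groups.items()))


def plot_rank_boxplot(groups, path="rank_by_keywords.png"):
    """Draw one box per keyword count and save the figure to `path`."""
    # Imported here so the grouping step works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.boxplot(list(groups.values()))
    ax.set_xticklabels([str(k) for k in groups])
    ax.set_xlabel("Number of keywords")
    ax.set_ylabel("Image rank (1 = top)")
    ax.invert_yaxis()  # put the best ranks at the top of the plot
    fig.savefig(path)
    plt.close(fig)


# Usage, with `images` loaded from the real dataset:
# plot_rank_boxplot(group_ranks_by_keyword_count(images))
```

With a wide range of keyword counts it would probably be more readable to bin them (e.g. 1-10, 11-20, ...) before grouping. If the boxes for low keyword counts sit clearly higher than the others, that would support the hypothesis, although it would not by itself show that the algorithm causes the difference.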

