Have bad graphs and faulty analysis led to evidence that Amazon has fake reviewers? Read on…

In my first post about Nick Bilton’s flawed analysis of Amazon’s Kindle, I left a few questions unanswered. One of those questions had to do with the ratings of the reviews themselves. Since Amazon allows each review to be rated by anyone, it might be interesting to see if the number of people who found a review useful varied by the number of stars the reviewer gave to the Kindle. So I ran an analysis examining Kindle 2 reviews.

So here are 4 plots*. The first, on the upper left, shows all reviews. Along the horizontal axis is the number of people reported to have found the review useful. Along the vertical axis is the star rating of the review. The plot on the upper right shows the same distribution, but for non-verified purchasers of the Kindle2 only. The plot on the lower left shows the same distribution, but this time for reviewers who Amazon said actually purchased a Kindle2. The plot on the lower right brings the Amazon verified and Amazon non-verified purchasers together. Each red + sign is an Amazon Verified purchaser and each blue circle is a non-verified purchaser.

Four scatterplots

Evidence of fake reviews?

These four charts tell us an interesting story. Each point on the chart represents a review. So in each chart (except on the bottom right**) you’re seeing 9,212 points. The two charts on top are roughly the same. That’s because the first chart shows all reviews and the second one shows just the reviews submitted by non-verified Kindle2 purchasers. You may recall that 75% of the reviews on the Kindle2 were submitted by people who Amazon said didn’t buy a Kindle2. So those dots dominate the charts. But take a look at the chart on the bottom left. You’ll notice that the cluster of reviews at the bottom of the top two charts, the ones between 1 and 2 stars and stretching out all the way to the end of the X axis, is gone. We knew that the non-verified purchasers were four times more likely to give a one star review compared to a verified purchaser, but we didn’t know that the 1 star non-verified reviewers were getting lots of people finding their reviews useful.

This dynamic really pops in the bottom right hand chart, the one with the red and blue lines in it. The blue line is made up of non-verified purchasers. As the number of people who said they found the review useful increases (starting around 8), the line dives down towards the 1-2 star ratings. The downward slope of the curve for the verified purchasers is much, much gentler.
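For readers who want to try something similar, here is a rough stand-in for those two lines. It is not how the chart's curves were produced; it simply averages the star rating within buckets of helpful votes, separately for verified and non-verified reviewers. The field names (`stars`, `helpful`, `verified`), the bucket width, and the sample reviews are all invented for illustration.

```python
from collections import defaultdict


def mean_stars_by_bucket(reviews, bucket_width=10):
    """Average star rating per (verified, helpful-vote bucket) pair.

    Each bucket is labeled by its lower edge, so with a width of 10 a
    review with 25 helpful votes lands in bucket 20.
    """
    sums = defaultdict(lambda: [0, 0])  # key -> [star total, review count]
    for r in reviews:
        bucket = (r["helpful"] // bucket_width) * bucket_width
        key = (r["verified"], bucket)
        sums[key][0] += r["stars"]
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


# Made-up reviews, only to show the shape of the output.
sample_reviews = [
    {"stars": 5, "helpful": 2, "verified": True},
    {"stars": 4, "helpful": 5, "verified": True},
    {"stars": 4, "helpful": 22, "verified": True},
    {"stars": 5, "helpful": 1, "verified": False},
    {"stars": 3, "helpful": 4, "verified": False},
    {"stars": 1, "helpful": 25, "verified": False},
    {"stars": 2, "helpful": 28, "verified": False},
]

bucket_means = mean_stars_by_bucket(sample_reviews)
for (verified, bucket), mean in bucket_means.items():
    label = "verified" if verified else "non-verified"
    print(f"{label:>12}  helpful >= {bucket:>2}: mean stars {mean:.2f}")
```

A steeper drop in the non-verified averages as the buckets climb would be the tabular version of the blue line diving toward 1-2 stars.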

This is a bit of a head-scratcher. I’ve heard people say that Amazon is full of fake reviews. These people aren’t saying that Amazon is the one doing the faking, but people who have some product that competes against the product being reviewed, or just people with an axe to grind. Is this an example of that? Do the fakers get their friends to say that their reviews are helpful? Maybe the Kindle2 verified purchasers post reviews that people just don’t find helpful. Right now, I don’t know what the correct answer is. But I have a feeling that some intelligent text-mining of the data will help flesh out an answer. Be on the lookout for a post about just that topic, by Marc Harfeld, coming soon, right here.

*To make the graphs easier to decipher I’ve excluded any review with more than 50 people finding the review useful. Taking the horizontal axis beyond 50 makes the plot very difficult to read. In all, this amounts to excluding 92 reviews out of the 9,304 I have gathered on the Kindle2. Because the star ratings are integers between 1 and 5, I needed to introduce a random jitter to the points (1 star becomes 1.221, another 1 star becomes 1.1321) so that they wouldn’t completely overlap each other on the scatterplot. I did the same to the values of how many people found each review helpful.
**Please note, to make an apples to apples comparison for the chart on the bottom right, I had to reduce the number of non-verified reviewers down to the same number as the Amazon-verified reviewers. The sampling was a simple random sample, so it did not distort the distribution.
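The three preparation steps in these footnotes (dropping reviews with more than 50 helpful votes, jittering both axes, and downsampling the larger group) can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the charts: the field names, the jitter width of 0.4, the fixed random seed, and the sample reviews are all assumptions.

```python
import random


def prepare_points(reviews, max_helpful=50, jitter=0.4, seed=0):
    """Drop reviews above max_helpful votes, then jitter both coordinates.

    The jitter is a small positive random offset, so a 1 star review
    might plot at 1.22 and another at 1.13 instead of both sitting at 1.
    """
    rng = random.Random(seed)
    points = []
    for r in reviews:
        if r["helpful"] > max_helpful:
            continue  # excluded to keep the horizontal axis readable
        x = r["helpful"] + rng.uniform(0, jitter)
        y = r["stars"] + rng.uniform(0, jitter)
        points.append((x, y, r["verified"]))
    return points


def balance_groups(points, seed=0):
    """Simple random sample of each group down to the smaller group's size."""
    rng = random.Random(seed)
    verified = [p for p in points if p[2]]
    unverified = [p for p in points if not p[2]]
    n = min(len(verified), len(unverified))
    return rng.sample(verified, n), rng.sample(unverified, n)


# Made-up reviews for illustration only.
sample_reviews = [
    {"stars": 1, "helpful": 12, "verified": False},
    {"stars": 1, "helpful": 75, "verified": False},  # dropped: over 50 votes
    {"stars": 5, "helpful": 3, "verified": True},
    {"stars": 4, "helpful": 0, "verified": False},
    {"stars": 2, "helpful": 30, "verified": False},
]

points = prepare_points(sample_reviews)
verified_sample, unverified_sample = balance_groups(points)
print(len(points), len(verified_sample), len(unverified_sample))
```

Because every member of the larger group has the same chance of being kept, the downsampled group should keep roughly the same mix of star ratings as the full one, which is why a simple random sample is a reasonable choice here.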

Pie Charts and faulty analytics in the NYTimes? Watch as the Biz Intel Guru fixes a seriously flawed blog post.

“Is Amazon Working Backward?” That’s the title of NYTimes blogger Nick Bilton’s post on Dec 24, 2009. Mr. Bilton is writing about Amazon’s product, the Kindle. Regarding the Kindle, he writes, “customers aren’t getting any happier about the end product.”

The day Mr. Bilton posted his story, best-selling author Seth Godin poked holes in it. Mr. Godin’s post is titled, “Learning from bad graphs and weak analysis.” Below is a brief listing of the serious flaws in Mr. Bilton’s approach. The listing is a mashup of Mr. Godin’s thoughts and mine.

1. Bilton should know better than to use pie charts because it’s really hard to determine the percentages when we’re looking at parts of a circle. Bar charts would’ve been much better. Stephen Few has stressed this for years. If you’re posting a chart in the NYTimes, you’d better have read your Stephen Few and Edward Tufte.
2. When your charts are the main support for your story, you’d better get them right. Mr. Bilton did get the table of numbers to the left of the pie charts correct. Perhaps he’d be better served by relying on them over the pie charts to make his point.
3. When you’re analyzing something, you shouldn’t compare opposite populations while ignoring their differences.

Mr. Godin cited 4 specific problems with the piece, ranging from the graphs being wrong (later corrected) to Bilton misunderstanding the nature of early adopters. In addition, Mr. Godin writes, “Many of the reviews are from people who don’t own the device.” Obviously, it’s hard to take a review of a Kindle seriously if the reviewer doesn’t own a Kindle. These are the different populations I’m talking about in item #3 above. I’ll address some of Mr. Godin’s concerns with Bilton’s post now and fill in some of the gaps that Godin left to be filled.

Mr. Bilton tried to make the case that each new version of the Kindle is worse than the one before it. His argument is based almost exclusively on the pie charts below, specifically, the gold slices of each pie. The gold slices are the percentage of one star reviews (lowest possible) each Kindle receives.

Here are the original 3 pies that Mr. Bilton showed in his post.

Despite the difficulty of estimating the size of each slice in a pie chart, it is apparent that the slice labeled 7% in the first pie chart is drawn much larger than 7%. His corrected version is here.

Another problem Godin has with Bilton’s piece goes to the nature of early adopters. “The people who buy the first generation of a product are more likely to be enthusiasts,” writes Godin. The first ins are more forgiving than the last ins. I can’t really argue with that insight. My brother, an avid tech geek, is an early adopter of lots of tech gadgets. He was the first person I knew to buy an Apple Newton. I don’t recall a single complaint from him about the Newton, despite it not being able to recognize handwriting, which was its main selling point.

Mr. Godin’s claim that many of the reviewers don’t own a Kindle intrigued me the most. If I could quantify the number of one star reviewers who don’t own a Kindle then I could show the difference in one star ratings between the two groups, owners and non-owners.

I recreated the dataset that Mr. Bilton used for his analysis, 18,587 reviews in all. I also read up on how Amazon determines if a reviewer is an “Amazon Verified Purchaser.” Basically, Amazon says that if the reviewer purchased the product from Amazon, they’ll be flagged with the Amazon Verified Purchase stamp. So let’s see, do the one star ratings vary between the Amazon Verified Purchaser reviews compared to the non-Amazon Verified Purchaser reviews? Why yes, they do!

Amazon Kindle one Star reviews

Amazon Kindle 1 Star reviews

It’s clear from these charts that the reviewers who didn’t purchase a Kindle are much more likely to give a one star rating compared to the reviewers who Amazon verified as purchasing the Kindle. With each Kindle release, the non-verified reviewers were consistently four times more likely to give a one star review than the Amazon Verified reviewers, the ones who actually purchased a Kindle. What’s up with that?
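The comparison itself is simple arithmetic: compute the share of one star reviews within each group, then divide one share by the other. Here is a minimal sketch. The field names and the toy counts are made up (chosen so the ratio lands near four, echoing the charts); they are not the actual review data.

```python
from collections import Counter


def one_star_share(reviews):
    """Share of one star reviews within each group, keyed by verified flag."""
    totals = Counter()
    one_stars = Counter()
    for r in reviews:
        totals[r["verified"]] += 1
        if r["stars"] == 1:
            one_stars[r["verified"]] += 1
    return {group: one_stars[group] / totals[group] for group in totals}


# Toy data: 25 verified reviews (1 one star), 75 non-verified (12 one star).
toy_reviews = (
    [{"stars": 1, "verified": True}] * 1
    + [{"stars": 5, "verified": True}] * 24
    + [{"stars": 1, "verified": False}] * 12
    + [{"stars": 5, "verified": False}] * 63
)

shares = one_star_share(toy_reviews)
ratio = shares[False] / shares[True]  # how many times more likely
print(f"verified: {shares[True]:.0%}, non-verified: {shares[False]:.0%}, "
      f"ratio: {ratio:.1f}")
```

Comparing shares rather than raw counts matters because the non-verified group is much larger, so it would have more one star reviews even if both groups rated the Kindle identically.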

Let’s look at the reviews from the verified purchasers. The percentage of one star ratings doubles from 2% with Kindle 1 to 4% with Kindle 2, and then moves up to 5% with the Kindle DX. However, this evidence provides very weak support for Bilton’s claim that Kindle owners are getting progressively less happy.

What about the reviewers who are happy to very happy with the Kindle, the four and five star reviewers? Once again, the non-verified Kindle reviewers provide consistently lower ratings than the reviewers who actually own a Kindle. And once again we see the trend of the non-verified reviewers liking each new version of the Kindle less than the previous one. The four and five star ratings for actual owners of the Kindle jibe with Mr. Godin’s claim that the early adopters are more likely to be enthusiasts than those late to the game.

4 & 5 star Amazon Kindle Reviews

Four & five star Amazon Kindle Reviews

So there you have it, Mr. Godin’s hunches are correct!

What’s most interesting to me, though, is the fact that 75% of reviews of the Kindle aren’t made by people who own a Kindle. In my next post on this subject we’ll hear from a good friend of mine, and text mining expert, Marc Harfeld. We’ll mine the text of the 15,000 customer reviews looking for differences in the words used between the verified and non-verified Kindle owners. Perhaps that will shed light on this mystery. We’re also going to weight the reviews by the number of people who told Amazon that they found the review helpful. You’d think that a review that was helpful to 1 out of 3 people is different from a review that was found helpful by 18,203 out of 19,111 people, like this one.
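One simple way such a weighting could work is to let each review count in proportion to its helpful votes when averaging star ratings. The sketch below is only one possible scheme, not the method we have settled on. The helpful counts come from the two examples above, but the star ratings attached to them are invented for illustration.

```python
def weighted_mean_stars(reviews):
    """Mean star rating with each review weighted by its helpful votes.

    Returns None when no review has any helpful votes, since the
    weighted mean is undefined in that case.
    """
    total_weight = sum(r["helpful"] for r in reviews)
    if total_weight == 0:
        return None
    return sum(r["stars"] * r["helpful"] for r in reviews) / total_weight


# Helpful counts from the text; the star ratings here are hypothetical.
example_reviews = [
    {"stars": 1, "helpful": 1},       # helpful to 1 out of 3 people
    {"stars": 5, "helpful": 18203},   # helpful to 18,203 out of 19,111
]

unweighted = sum(r["stars"] for r in example_reviews) / len(example_reviews)
weighted = weighted_mean_stars(example_reviews)
print(f"unweighted mean: {unweighted:.2f}, weighted mean: {weighted:.4f}")
```

With weights this lopsided, the heavily endorsed review dominates the average almost entirely, so a real analysis might prefer a gentler weight such as the logarithm of the vote count or the fraction of voters who found the review helpful.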

Lastly, we’d love to hear suggestions from you on other next steps we might take with this analysis.

Thanks for reading.