Software weeds out fake online reviews

Many online reviews appear too good to be true – and they are. Earlier this year, for example, the Federal Trade Commission fined a music tuition company $250,000 for posting fake reviews of its own products.

But it hasn’t always been easy to distinguish fake reviews from real ones – until now. Cornell researchers say they’ve developed a software program that’s pretty good at it.

Examining 800 reviews of Chicago hotels, the system was able to pick out deceptive reviews with almost 90 percent accuracy, says the team.

“While this is the first study of its kind, and there’s a lot more to be done, I think our approach will eventually help review sites identify and eliminate these fraudulent reviews,” says graduate student Myle Ott.

The team first asked a group of people to deliberately write false positive reviews of 20 Chicago hotels, which were given to three human judges along with an equal number of real ones.

But the judges scored no better than chance in identifying the fakes – indeed, they couldn’t even agree on which reviews they thought were deceptive.

The software, however, was able to do better, thanks to a rather subtle analysis of terms within the text.

Truthful hotel reviews, for example, tend to use concrete words relating to the hotel, like ‘bathroom’, ‘check-in’ or ‘price’.

Fake reviewers, on the other hand, seem so desperate to make their accounts sound plausible that they set the scene with terms such as ‘business trip’ or ‘my husband’.

Other differences include the use of words referring to human behavior and personal life, and sometimes the amount of punctuation or the number of long words. Deceivers also use more verbs, while truth-tellers use more nouns.

Using these guidelines, the researchers trained a computer on a subset of true and false reviews, then tested it against the rest of the database. The best results, they found, came from combining keyword analysis with the ways certain words are combined in pairs. Adding the two scores together identified deceptive reviews with 89.8 percent accuracy.
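The article doesn’t include the researchers’ code, but the approach it describes – training a classifier on single keywords plus word pairs – can be sketched in miniature. The snippet below uses a simple Naive Bayes classifier as a stand-in (not the authors’ actual method), and the training reviews and labels are invented for illustration:

```python
import math
from collections import Counter

def features(text):
    """Unigram + bigram features: keywords plus word pairs,
    mirroring the combination described in the article."""
    words = text.lower().split()
    return words + [f"{a} {b}" for a, b in zip(words, words[1:])]

class NaiveBayes:
    """Minimal multinomial Naive Bayes text classifier
    (a stand-in for whatever learner the researchers used)."""
    def fit(self, texts, labels):
        self.counts = {}                # label -> feature counts
        self.priors = Counter(labels)   # label -> number of examples
        self.vocab = set()
        for text, label in zip(texts, labels):
            c = self.counts.setdefault(label, Counter())
            for f in features(text):
                c[f] += 1
                self.vocab.add(f)
        self.totals = {l: sum(c.values()) for l, c in self.counts.items()}
        return self

    def predict(self, text):
        best, best_score = None, float("-inf")
        v = len(self.vocab)
        for label in self.counts:
            # log prior + log likelihoods with add-one smoothing
            score = math.log(self.priors[label])
            for f in features(text):
                score += math.log((self.counts[label][f] + 1) /
                                  (self.totals[label] + v))
            if score > best_score:
                best, best_score = label, score
        return best

# Toy training data, invented for illustration -- not from the study
train_texts = [
    "the bathroom was clean and check-in was quick",
    "great price and the bathroom was spotless",
    "my husband loved our business trip here",
    "we felt so welcome on our business trip",
]
train_labels = ["truthful", "truthful", "deceptive", "deceptive"]

clf = NaiveBayes().fit(train_texts, train_labels)
print(clf.predict("check-in was fast and the price was fair"))  # truthful
print(clf.predict("my husband enjoyed the business trip"))      # deceptive
```

Even on this toy data, the classifier picks up the patterns the article mentions: concrete hotel words signal truthfulness, scene-setting phrases signal deception.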

Right now, the system is only of limited use, as the database covers only hotel reviews – and only reviews of hotels in Chicago, at that. But there’s no reason the principle couldn’t be extended to reviews of any product.

Then, says Ott, this sort of software could be used by review sites as a ‘first-round filter’. If one particular product gets a lot of reviews that score as deceptive, the site could investigate further.
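A ‘first-round filter’ of the kind Ott describes might look something like the sketch below: flag any product whose share of deceptive-scoring reviews crosses a threshold, then hand the flagged products to a human for a closer look. The hotel names, scores, and the 40 percent cutoff are all invented for illustration:

```python
def flag_suspicious(review_scores, threshold=0.4):
    """review_scores maps product -> list of per-review deception
    probabilities from a classifier. Flag products where the share
    of reviews scoring as deceptive (> 0.5) meets the threshold."""
    flagged = []
    for product, scores in review_scores.items():
        deceptive = sum(1 for s in scores if s > 0.5)
        if deceptive / len(scores) >= threshold:
            flagged.append(product)
    return flagged

scores = {
    "Hotel A": [0.1, 0.2, 0.9, 0.3],       # 1 of 4 deceptive -> 25%
    "Hotel B": [0.8, 0.9, 0.2, 0.7, 0.6],  # 4 of 5 deceptive -> 80%
}
print(flag_suspicious(scores))  # ['Hotel B']
```

The point of the threshold is that no single review triggers an investigation; only a suspicious pattern across many reviews does.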

“Ultimately, cutting down on deception helps everyone,” says Ott. “Customers need to be able to trust the reviews they read, and sellers need feedback on how best to improve their services.”