Google asks for user help to wipe out scraper sites

Google’s ramping up its efforts to defeat so-called scraper sites – websites which lift content from others more or less word for word in order to improve their search rankings.

Earlier this year, the company launched an update called Panda designed to do just that – but many perfectly legitimate sites say they’ve since seen their rankings fall, while certain scraper sites still ride high.

“Google is testing algorithmic changes for scraper sites (especially blog scrapers),” says the company. “We are asking for examples, and may use data you submit to test and improve our algorithms.”

Google’s also explained just a little about how its algorithm – more carefully guarded than the recipe for Coca-Cola – works. It has several hundred elements, and around 500 changes are made every year.

“By some counts, we change our algorithm almost every day,” says Google fellow Amit Singhal.

It’s a constant tweaking process, says engineering director Scott Huffman.

“There are almost always a set of motivating searches, and these searches are not performing as well as we would like,” he says. “Ranking engineers then come up with a hypothesis about what signal, what data could we integrate into our algorithm.”

The first stage relies on people – testers who, Google says, have been trained to decide whether one set of rankings is better than another. They’re shown the results from two versions of the algorithm and asked to choose.

You and I are involved too, with Google also looking at the results of real searches. A small proportion of users’ searches will be sent to a ‘sandbox’, where different metrics are used.

In 2010, says Google, it ran over 20,000 different experiments, introducing changes where the results seemed to merit it.

“We really analyze each potential change very deeply to try to make sure that it’s the right thing for users,” says Huffman.