Results of the "Duplicate Pages" SEO experiment


(Eric Shepherd) #1

I’ve finally had an opportunity to write up the results of our second SEO experiment for MDN content, the “duplicate pages” experiment. Duplicate pages are pages similar enough that search engines tend to have trouble telling them apart, which can cause incorrect search results, poor ranking of individual pages, or splitting of “Google juice” among the similar pages.

If you’d like to know more about the experiment and what was learned, please read my blog post, “Results of the MDN ‘Duplicate Pages’ SEO experiment.”

If you have questions or comments about the experiment or its results, feel free to ask here!

Eric Shepherd
Senior Technical Writer
MDN Web Docs

(Florian Scholz) #2

Thanks for the write-up!

Where can I find a list of duplicate pages? I often see pages that have the same title, for example. Did you query the database to get a list of these?

An example I’ve just hit now: There were these two pages:

Obviously the latter was created by mistake. I’ve hit this case many times and the action to take here is to simply redirect the second to the first. This takes only a few seconds and I would love to fix this throughout the site.


(Eric Shepherd) #3

That’s a good question.

I actually got a list from our SEO guy. I don’t know how he created it but will find out. In the meantime, if you like, I can see about posting a public copy of the list.


(Jwhitlock) #4

On the “Outcome” table, can you include the percent increase in clicks and impressions for the whole site, so we can compare to the baseline change?

(Eric Shepherd) #5

There are tools that can gather this information. I’m working on coming up with one that will work for us. The current list I worked from (which is now over a year old) was provided by our SEO contractor at that time.

(Eric Shepherd) #6

Here’s a spreadsheet with the duplicate pages within the open web documentation as perceived by the Moz Pro crawler. As you can see, there are four sets of pages that are seen as being overly similar. I’ve created a collapsible group for each, with the top row of each group being the canonical page and the ones indented inside the collapsible block of rows being the ones that are duplicates of it.

I’m a little unclear on why some of these register as duplicates. I think it may be because of the low ratio of non-template text on some of them. There’s so much heading and table row caption material that the page-specific content isn’t enough to outweigh it and make the pages look different. This means the main problem is finding ways to make some of these pages longer without unnatural bloat.
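
To illustrate the template-ratio idea, here is a minimal sketch in Python. It is not how the Moz Pro crawler works (I don't know its method); it assumes a simple stand-in measure, Jaccard similarity over three-word shingles, and the `TEMPLATE` text and page bodies are made up for the example. It shows how two pages with different content can still score as very similar when shared headings and table captions make up most of the text, and how the score drops as the unique content grows.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical template text (headings, table captions) shared by every page.
TEMPLATE = (
    "Syntax Parameters Return value Examples Specifications "
    "Specification Status Comment Browser compatibility "
    "Feature Chrome Firefox Edge Safari Basic support See also"
)

def similarity(unique_a, unique_b):
    """Similarity of two pages that share TEMPLATE but differ in their own text."""
    page_a = TEMPLATE + " " + unique_a
    page_b = TEMPLATE + " " + unique_b
    return jaccard(shingles(page_a), shingles(page_b))

# Two short pages: a single sentence of page-specific content each.
short_a = "Returns the width of the element."
short_b = "Sets the height of the canvas."

# The same pages with 60 extra made-up words of page-specific content each.
long_a = short_a + " " + " ".join(f"alpha{i}" for i in range(60))
long_b = short_b + " " + " ".join(f"beta{i}" for i in range(60))

short_score = similarity(short_a, short_b)
long_score = similarity(long_a, long_b)

print(f"short pages: {short_score:.2f}")
print(f"long pages:  {long_score:.2f}")
```

With these made-up inputs the short pages score far higher than the long ones, even though neither pair shares any page-specific sentence. A real crawler presumably weighs things differently, so treat the numbers as illustrative only.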