Colons and Organic Sitelink Title Text: What is Google Doing?

Posted on Wednesday, August 31st, 2011

Categories - Featured, SEO

In researching Google’s sitelink Title text methodology, I discovered something a bit odd about how Google handles colons when it creates sitelink Title text.

NOTE: the term “colon” will be used quite often in this post and is in reference to the punctuation mark. So get your mind out of the gutter!

Anyway…

At first glance, regardless of whether Google pulls the sitelink Title text from the Title tag, on-page content or anchor text, it appeared Google might not pull any content after a colon when creating sitelink Title text. However, there ARE instances where Google includes a colon in the sitelink Title text even though the site itself does not use a colon in that text, and then includes the content after it.

Confused yet?

As always, the best way to show this is through a few examples. Come with me as we examine Google sitelink Title text creation and colons (stop laughing!).

Example 1a: No Content Pulled after Colon (PDF)

In the example below, we see a sitelink to a PDF (Google Instant). Since this is a PDF, it doesn’t have an HTML Title tag, but Google is obviously pulling something to create a Title for it.

The on-page Title of this PDF is “Google Instant: Potential Impact on SEM and SEO”.

Anchor Text: This doc is linked to internally and externally using the following text:

Internally: “Google Instant: Potential Impact on SEM and SEO” (http://www.thesearchagency.com/whitepapers/)

Externally: “Google Instant: Potential Impact on SEM and SEO” (http://www.thesearchagents.com/2010/09/new-white-paper-analyzes-impact-of-google-instant-on-sem-and-seo/)

“Google Instant: Potential Impact on SEM and SEO” is both the on-page title and the primary anchor text used to link to the PDF, so the assumption is that Google is pulling the anchor text for the sitelink Title text, but only taking the content up to the colon and ignoring the rest. An argument could be made that they are referencing the content within the PDF to create the sitelink Title text, but in either case, they’re not pulling the text after the colon.
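To make the "colon as a stop word" idea concrete, here is a minimal Python sketch of that rule. It only mimics the behavior seen in this example; it is not Google's actual method, and the function name `truncate_at_colon` is made up for illustration.

```python
def truncate_at_colon(title):
    """Return the text before the first colon, or the whole title if there is none.

    Hypothetical helper: it imitates the observed sitelink behavior only.
    """
    head, _sep, _tail = title.partition(":")
    return head.strip()


print(truncate_at_colon("Google Instant: Potential Impact on SEM and SEO"))
```

A title without a colon passes through unchanged, which is why this simple rule cannot, on its own, account for cases where a colon shows up that the site never used.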

Example 1b: No Content Pulled after Colon Usage (web page)

Here’s another example where we see the colon being used as a sort of stop word.

In the Fandango example below, we see a sitelink for the X-Men: First Class movie:

The link in the sitelink goes to the X-Men: First Class page BUT only shows the term X-Men. There are several X-Men movies, so why is Google not including First Class?

Let’s take a look at the page components:

URL: http://www.fandango.com/xmen:firstclass_133869/movieoverview

Title Tag:

H1

Anchor Text:

This page is linked to internally AND externally using the following text:

Internally: “X-Men: First Class”

(http://www.fandango.com/januaryjones/filmography/p298504)

Externally: “X-Men: First Class”

(http://www.meetup.com/libertarian-364/events/20926851/)

There definitely appears to be consistency in colon usage for on-page content and internal/external linking for this page. It’s unclear if Google is using the anchor text, Title tag, or on-page content to create the sitelink Title text, but regardless, it seems they’re stopping at the colon and not pulling anything else (ok seriously, stop the laughing!).

HOWEVER, this isn’t consistent. I discovered an instance where they’re adding a colon where a colon wasn’t originally used and then including content after it.

Example 2: No Colon Used BUT Google Added it AND Included Post-colon Content

In the example below, we see a sitelink result for IMDb. You’ll notice that their X-Men sitelink Title text also doesn’t have First Class in the Title text; however, their Battle: Los Angeles sitelink Title text has a colon AND the text after it.

When I took a closer look at IMDb, I noticed they do use a colon on their X-Men: First Class page but they do NOT use a colon on their Battle: Los Angeles page.

Let’s take a look at the page components of their X-Men: First Class page:

URL: http://www.imdb.com/title/tt1270798/

Title Tag

H1

Anchor Text:

This page is linked to internally AND externally using the following text:

Internally: “X-Men: First Class”

(http://www.imdb.com/name/nm0413168/)

Externally: “X-Men: First Class”

(http://herocomplex.latimes.com/2010/08/06/mined-to-death-xmen-director-says-hollywood-is-killing-the-superhero-movie/)

Externally: “X-Men First Class”

(http://wearecartel.org/nobudgetfilmfestival2011.php)

On-page, they use X-Men: First Class. When looking at the backlink anchor text, the majority of the text used is X-Men: First Class, but there were a few instances where the colon was not used. Regardless of where Google is pulling the sitelink Title text, they’re acknowledging the colon and ignoring the content after it.

Now let’s take a look at the Battle: Los Angeles page components:

URL: http://www.imdb.com/title/tt1217613/

Title Tag

H1

Anchor Text:

This page is linked to internally and externally using the following text:

Internally: “Battle Los Angeles”

(http://www.imdb.com/name/nm0001173/)

Externally: “Battle: Los Angeles”

(http://www.firstshowing.net/2010/check-out-the-first-official-teaser-poster-for-battle-los-angeles/)

Externally: “Battle Los Angeles”

(http://shreveport-bossierfilm.com/past_productions.html)

IMDb does not use the colon anywhere within their site in reference to Battle: Los Angeles. Backlink anchor text usage is mixed: some links use the colon and some do not.
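If you want to check how mixed the anchor text is for one of your own pages, a quick tally is enough. The sketch below is just an illustration: the helper name `colon_usage` is made up, and the sample list contains only the three anchor texts listed above, not a full backlink profile.

```python
from collections import Counter


def colon_usage(anchors):
    """Count how many anchor texts contain a colon and how many do not."""
    return Counter("with colon" if ":" in a else "without colon" for a in anchors)


# The three anchor texts listed above for the IMDb Battle: Los Angeles page.
anchors = ["Battle Los Angeles", "Battle: Los Angeles", "Battle Los Angeles"]
print(colon_usage(anchors))
```

In practice you would feed in the anchor texts exported from whatever backlink tool you use.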

So why does Google ignore the content after the colon in X-Men: First Class, yet add a colon to Battle: Los Angeles (which IMDb does NOT use anywhere within their site) AND include the content after it? Note that Google isn’t adding the colon to the Title of the page’s organic listing, only to the sitelink.

Random Guess: Google looks to Wikipedia (or other official websites) for official titles to use in sitelink Title text

Both the IMDb X-Men: First Class and Battle: Los Angeles pages are linked to from Wikipedia; however, the difference is that the Wikipedia page uses a colon both when referencing the movie Battle: Los Angeles and when linking to IMDb.

Also, the official website for the movie (http://www.battlela.com/site/) and the Sony Pictures website (http://www.sonypictures.com/homevideo/battlela/) refer to the movie as “Battle: Los Angeles” (with the colon). So it’s safe to say the official title structure of the movie is Battle: Los Angeles.

So what the hell is Google doing????

 

QUESTIONS FOR GOOGLE

When Google sees a colon used within text it wants to use for a sitelink Title, does the algo assume the colon is separating two unique and different statements within the text and ignore the content after the colon so Google can create a clean sitelink Title that’s somewhat clear and within 30 characters?

Does Google crosscheck official titles of movies with Wikipedia and/or the movie studios to create sitelink Titles? If so, why don’t they do it for related organic listings?

When Google forces the usage of a colon in sitelink Title text, is there a glitch in the algo that inadvertently includes the text after it when it should still ignore it? OR, is the glitch that they are assuming too much with colons by not pulling the text after a colon when actually they should (as is the case with the X-Men examples)?

What if IMDb did NOT use a colon in X-Men: First Class on their site? Would Google force the usage of the colon and then include First Class?
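The first question above can be turned into a testable rule: keep the full title when it fits, otherwise cut at the colon. Here is a rough Python sketch of that rule. The 30-character limit is the figure from the question, and the function name `sitelink_title` is made up for illustration; none of this is confirmed Google behavior.

```python
SITELINK_TITLE_LIMIT = 30  # assumption: the 30-character figure from the question above


def sitelink_title(title, limit=SITELINK_TITLE_LIMIT):
    """Keep the full title if it fits; otherwise fall back to the pre-colon text."""
    if len(title) <= limit:
        return title
    head, sep, _tail = title.partition(":")
    # With no colon to cut at, this sketch simply returns the title unchanged.
    return head.strip() if sep else title


for t in ("Google Instant: Potential Impact on SEM and SEO", "X-Men: First Class"):
    print(f"{t!r} -> {sitelink_title(t)!r}")
```

Under this rule the PDF title gets cut to Google Instant, but X-Men: First Class (18 characters) would be kept whole, which is not what the Fandango and IMDb sitelinks show. So length alone doesn't seem to explain the behavior.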

Regardless of the answers, these examples further support the theory that Google doesn’t use one specific page component to create sitelink Title text AND creates the text in a couple different ways. I think the best thing you can do at this point is stay consistent with on-page content and internal/external linking, and maybe use your colons lightly (OK, go ahead and laugh at that one).


10 Responses to “Colons and Organic Sitelink Title Text: What is Google Doing?”

  1. stever says:

    Maybe Google is pulling the title from the PDF metadata title?

  2. Valentina says:

    Thanks in support of sharing such a good thought, paragraph is nice, thats why i
    have read it entirely

  3. Jeremy Estes says:

    Nice detective work. This is starting to rear its ugly head again, and your 2-year-old post sheds a lot of light on the subject.

  4. Nathaniell says:

    I guess you know you’re an SEO nerd when you find articles like this absolutely fascinating. I was checking my rank for a page and noticed the discrepancies between some results showing NO colon, and some showing the whole title.

    When I searched for my main keyword (pre-colon), the content after the colon did not show in the site title. When I searched for a related keyword (including a word in the post-colon title), the full title appeared.

    Unfortunately, we seem no closer to finding a real answer to what’s going on, but in my case, it was clear that the search term influenced what was shown.

    Thanks for taking the time to do such a detailed write-up.

  5. Ailsa Ross says:

    Hey there,

    Great article! I was wondering if anyone has done testing on colons yet?

    To play it on the safe side, I wrote to some of my freelancers: “Question your colons: Google recognizes colons as separators which delineate keyword phrases. So if the title reads ‘Toronto: Food Tour, When Pigs Fry’, Google will read the keyword phrases as being ‘Toronto’ and ‘Food Tour When Pigs Fry’.” Is that the right advice to give?

    Thanks,
    Ailsa

  6. Vimax says:

    Very good article. I’m going through a few of these issues as
    well..

  7. tom sakell says:

    Thanks for this blog entry.

    I’m wondering about rhyme/ reason for choosing the SiteLink Description text. Any ideas?

    • Typically they pull content from the meta description. HOWEVER, I have seen instances where they pull the second sentence in a meta description instead of the first. I’ve seen this when the second sentence is shorter than the first and by default fits the sitelink description space better. If all sentences are long, I have seen them just pull the first one and cut it off.

      The bigger question is should we optimize meta descriptions more for sitelinks? Or should we have one sentence in our meta descriptions that fits the character count for sitelink descriptions and is pulled by default? hmmmmmmmm.


