Using non-alphanumeric characters in Sitemap URLs

development, encoding, google, links, php Add comments

This article in Google Help explains how to deal with special characters in Sitemaps that you can submit to Webmaster tools in order to increase the number of indexed pages of your website.

The main point is: the URLs must contain ASCII symbols only.

It can be done this way:

  • (obvious) ampersand, both quotes and <> symbols must be encoded,
  • Unicode symbols must be encoded, eg. ü must be converted to %FC sequence,
  • URLs that you submit must follow the  RFC-3986

If you use PHP, pay attention to one thing: it seems rawurlencode should be used instead of the usual urlencode since it’s follows the RFC-3986 as stated in PHP documentation.

Comments are closed.

WP Theme & Icons by N.Design Studio
Entries RSS Comments RSS Log in