Google has recently done a series on the usability of multilingual websites and it got me thinking about multilingual SEO. How do you, in fact, optimize the same website for keywords in multiple languages?
But let’s start with the basics. In simple terms, a multilingual website is a website that has content in more than one language. Such a website has a lot of on-page elements that are often done wrong. Let’s take a look at some common issues:
1) Language recognition
Once Google’s crawler lands on your multilingual website, it starts by determining the main language of every page. Google can recognize a page as being in more than one language, but you can avoid crawler confusion by doing the following:
- Stick to only one language per page
- Avoid side-by-side translations
- Use the same language for all elements of the page: headers, sidebars, menus, etc.
Some web editors create code-level attributes automatically, but these attributes are not very reliable, so keep in mind that Google ignores all code-level language information, from “lang” attributes to Document Type Definitions (DTDs), during language recognition.
2) URL structure
A typical pet peeve of SEO, but even more so with multilingual websites. To make the most of your URLs, consider language-specific extensions. Language-specific extensions are often used on multilingual websites to help users (and crawlers) identify the section of the website they are on and the language the page is in. For example, the /fr/ segment in http://www.website.ca/fr/ marks the French section of the site.
This is a great way to organize URLs on a multilingual website because not only does it help the user, but it also makes it easier for the crawler to analyze the indexing of your content. But what if you want to create URLs with non-English characters? Here’s how to do it right:
- Use UTF-8 encoding for non-English characters
- Make sure your UTF-8 encoded URLs are properly escaped when linked from within your content
For example, if a URL contains an é, which is a non-English character: http://www.website.ca/fr/contént.html
here’s how it looks when properly escaped: http://www.website.ca/fr/cont%C3%A9nt.html
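If you generate links programmatically, the escaping can be automated. Here is a minimal Python sketch using the standard library; the helper name escape_url is my own, and it only handles the path part of the URL:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escape_url(url: str) -> str:
    """Percent-encode non-ASCII characters in the URL path as UTF-8.

    escape_url is a hypothetical helper name, not part of any library.
    """
    parts = urlsplit(url)
    # quote() encodes as UTF-8 by default. "/" is kept so the path
    # structure survives, and "%" is kept so that an already-escaped
    # path is not escaped a second time.
    escaped_path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, escaped_path, parts.query, parts.fragment)
    )


print(escape_url("http://www.website.ca/fr/contént.html"))
# prints http://www.website.ca/fr/cont%C3%A9nt.html
```

Keeping "%" as a safe character is a design choice: it makes the function safe to run twice, at the cost of not escaping a literal percent sign in a path.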
It is important to note that Google directly extracts character encodings from HTTP headers, HTML page headers, and content. There isn’t much you need to do about character encoding, other than watching out for conflicting information – for example, between content and headers. While Google can recognize different character encodings, use UTF-8 on your website whenever possible.
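To spot conflicting information of this kind, you could compare the charset declared in the HTTP Content-Type header with the one declared in the page’s meta tag. The following is a rough Python sketch, not a full HTML parser; the function names are my own, and the regular expressions only cover the common ways of declaring a charset:

```python
import re
from typing import Optional


def charset_from_header(content_type: str) -> Optional[str]:
    """Extract the charset from a Content-Type header value, if present."""
    match = re.search(r"charset=\s*\"?([\w-]+)", content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_html(html: str) -> Optional[str]:
    """Extract the charset from the first <meta> tag that declares one.

    Covers both <meta charset="..."> and the older http-equiv form,
    since both contain the text "charset=".
    """
    match = re.search(
        r"<meta[^>]*charset=\s*[\"']?([\w-]+)", html, re.IGNORECASE
    )
    return match.group(1).lower() if match else None


def has_charset_conflict(content_type: str, html: str) -> bool:
    """Return True only when both sources declare a charset and they differ."""
    header = charset_from_header(content_type)
    meta = charset_from_html(html)
    return header is not None and meta is not None and header != meta


page = '<html><head><meta charset="iso-8859-1"></head></html>'
print(has_charset_conflict("text/html; charset=UTF-8", page))
# prints True
```

A page where either declaration is missing is not reported as a conflict, because there is nothing to contradict.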
3) Crawling and Indexing
Another common area of focus for SEO. On multilingual websites, follow these recommendations to get more pages crawled:
- Avoid redirects based on the user’s perceived language: they could, in fact, prevent both users and search engines from looking at more pages on your site.
- Keep the content for each language on separate URLs
- Cross-link page by page
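As an illustration of separate URLs and page-by-page cross-linking, here is a small Python sketch. It assumes a hypothetical site with English and French sections where every page keeps the same file name in each language; real sites often translate the file name too, in which case you would need a lookup table instead:

```python
from typing import Dict

BASE_URL = "http://www.website.ca"  # example domain used in this post
LANGUAGES = ["en", "fr"]  # assumed language sections, for illustration only


def language_url(lang: str, page: str) -> str:
    """Build the URL of a page inside a language-specific section."""
    return f"{BASE_URL}/{lang}/{page}"


def cross_links(current_lang: str, page: str) -> Dict[str, str]:
    """Return links from the current page to the same page in every other language."""
    return {
        lang: language_url(lang, page)
        for lang in LANGUAGES
        if lang != current_lang
    }


print(cross_links("en", "content.html"))
# prints {'fr': 'http://www.website.ca/fr/content.html'}
```

Each language version links directly to its counterparts, so both users and crawlers can reach every version without relying on a redirect.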
Last but not least, please remember that Google does not recommend automatic translations.
By getting the on-page basics right, you will set a great base for your multilingual SEO in the future and, unlike so many others, you will not have to beg search engine crawlers (in multiple languages) to come and index your content.