I’m using a WordPress permalink structure of the form: domain.com/category/post_name
The issue is that my post names contain non-Latin characters (Chinese, Hebrew, Arabic), so WordPress percent-encodes them into something like: %20%18%6b%20
Each encoded byte then takes up three characters (%XX) and is counted toward the slug length, so the encoded slug ends up several times longer than the title and even quite short slugs get truncated.
How can I fix that, or at least raise the length limit? I’ve tried extending the “post_name” database field from 200 to 500 characters, but slugs are still being truncated.
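To make the length expansion concrete, here is a minimal Python sketch (not WordPress code) that percent-encodes a few titles and compares lengths. The helper name `encoded_length` is made up for illustration, and the sample titles are assumptions; the 200-character figure is the field length mentioned above.

```python
from urllib.parse import quote

def encoded_length(slug: str) -> int:
    """Length of the slug after every UTF-8 byte is percent-encoded as %XX."""
    return len(quote(slug, safe=""))

# Hypothetical sample titles: ASCII, Hebrew (2 bytes per letter in UTF-8),
# Chinese (3 bytes per character in UTF-8).
for title in ("hello", "שלום", "中文拼音"):
    print(title, len(title), encoded_length(title))

# With a 200-character limit and 9 encoded characters per Chinese character,
# only this many Chinese characters fit before truncation:
print(200 // 9)
```

A Hebrew letter costs 6 encoded characters and a Chinese character costs 9, which is why a title that looks short can still exceed the limit.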
Permalinks like http://example/שָׁלוֹם
do work in my WordPress 3.3 install. This may be due to the remove_accents() improvements for i18n permalinks.
As Sean & Steve noted,
- make sure you’re using WordPress ≥ 3.3
- make sure your .htaccess file contains a rule similar to
RewriteRule . /index.php [L]
- check that your database is UTF-8 encoded (and consider converting to UTF-8 if not).
[My original answer follows, not so relevant now but maybe still useful:]
See
- http://queryposts.com/function/sanitize_title/
- http://queryposts.com/function/remove_accents/
- http://queryposts.com/function/esc_url/
If your post titles contain some ASCII characters, you can strip out non-ASCII characters when generating post slugs.
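As a rough illustration of that idea, here is a Python sketch of the transformation only. In WordPress the equivalent logic would have to be written in PHP and hooked into slug generation, which is not shown here; the function name `ascii_only_slug` and the sample titles are hypothetical.

```python
import re

def ascii_only_slug(title: str) -> str:
    """Drop every non-ASCII character, then turn what is left into a dashed slug."""
    # Discard characters that cannot be represented in ASCII.
    ascii_text = title.encode("ascii", "ignore").decode("ascii")
    # Collapse any run of characters other than a-z and 0-9 into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    return slug.strip("-")

print(ascii_only_slug("WordPress 中文 Guide"))
print(repr(ascii_only_slug("שלום")))
```

Note that a title with no ASCII characters at all collapses to an empty slug, so this approach only helps when titles mix in some Latin text.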
Some plugins may help:
- http://wordpress.org/extend/plugins/strings-sanitizer/
Aggressively sanitizes titles for clean, SEO friendly post slugs, and media filenames during upload. Works by converting common accentuated UTF-8 characters, as well as a few special cyrillic, hebrew, spanish and german characters.
- http://wordpress.org/extend/plugins/universal-slugs/
[…] if you happen to speak a language that uses characted that are not included on the English alphabet, then you either have to bear with massive, odd looking permalinks, or manually update each one whenever you write a post or a page. […] The plugin will also remove common words such as “and”, “και”, “το”, “the” etc. from the URLs, as they do simply contribute to the URL length without adding anything to the meaning or the SEO value.
- http://wordpress.org/extend/plugins/pinyin-slug/
For example, when you publish a post with a title like this: “Chinese PinYin” WordPress automatically assigns a long filename to your post, called a post slug: /%e4%b8%ad%e6%96%87%e6%8b%bc%e9%9f%b3
[…] With Chinese PinYin plugin activated, the slug for our example blog post would look like this: /zhongwenpinyin
- http://wordpress.org/extend/plugins/remove-utf-8-from-slug/
remove all UTF-8 from title to permalink
- http://wordpress.org/extend/plugins/pinyin-seo/
Convert Chinese characters to Pinyin Permalinks.
Also, some of the multilingual plugins might be able to translate your slugs into English (and hence Latin-only characters) automatically, but I haven’t used any of them, so I’m not sure.