CJK Support

Auto-unwrapping of lines with multi-byte characters

This issue came up on an emacs-china thread. When consecutive lines in the Org source contained Chinese characters, the exported HTML separated the last character of one line from the first character of the next line with a space, which is not grammatically correct in Chinese.

In such cases, those lines must be unwrapped with no space between the characters that meet at the line break.

That, of course, would not be grammatically correct in English, or even in some other languages with multi-byte characters (for example, Hindi and Gujarati).

So line unwrapping with space removal is done only if:

  1. The locale is auto-detected to be Chinese or Japanese via the environment variables LANGUAGE, LC_ALL or LANG, or
  2. The locale is manually set to Chinese or Japanese by setting it to zh or ja using the #+hugo_locale: keyword (or the EXPORT_HUGO_LOCALE property).
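
As an illustration of the manual setting, here is a minimal sketch of an Org file that sets the locale with the keyword. The two body lines are made-up sample text, not taken from the original thread.

```org
#+hugo_locale: zh

这是第一行
这是第二行
```

With the locale set to zh, the two lines should be joined in the export as 这是第一行这是第二行, with no space at the former line break. For a subtree-based export, the equivalent is to set the property in that subtree's drawer. The heading text and the EXPORT_FILE_NAME value below are placeholders chosen for this example.

```org
* 示例文章
:PROPERTIES:
:EXPORT_FILE_NAME: cjk-example
:EXPORT_HUGO_LOCALE: zh
:END:

这是第一行
这是第二行
```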