KNOWLEDGE BASE ARTICLE

Regex and Special Characters

When a regular expression needs to match characters outside the basic Latin set (e.g. Chinese or Arabic character sets), there are two ways to reference them.

Option 1 - Reference the character group

All CJK characters - REGEX([\p{IsCJKUnifiedIdeographs}]{1,})
All Arabic characters - REGEX([\p{IsArabic}]{1,})

Option 2 - Reference the character's Unicode value(s)

All CJK characters - REGEX([\u4E00-\u9FFF]{1,})
All Arabic characters - REGEX([\u0600-\u06FF]{1,})
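
To try a Unicode range outside Umango before using it, the same idea can be sketched in Python. This is only an illustration and not Umango's own regex engine: only the pattern inside REGEX(...) is used, Python's built-in re module accepts \uXXXX escapes but not the \p{...} group names from Option 1, and the name find_runs and the sample text are made up for this example.

```python
import re

# Same range as the Arabic example above: U+0600 to U+06FF.
ARABIC_RUN = re.compile(r"[\u0600-\u06FF]{1,}")

def find_runs(text):
    """Return every run of one or more Arabic-block characters in text."""
    return ARABIC_RUN.findall(text)

# Hypothetical sample: Latin text with one Arabic word in the middle.
sample = "Invoice 42 for \u0645\u0631\u062d\u0628\u0627 Ltd"
print(find_runs(sample))
```

Text with no characters in the range returns an empty list, which is a quick way to confirm that the range covers what you expect.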

Specific characters and smaller character groups can also be referenced. This is typically best done using the Unicode values described in Option 2.
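
As a rough illustration of a narrower reference, the Python sketch below matches only the Arabic-Indic digits (U+0660 to U+0669) and the Arabic comma (U+060C). The choice of characters, the name find_arabic_numbers and the sample text are assumptions made for this example, and the code uses Python's re module rather than Umango itself.

```python
import re

# Hypothetical pattern: one sub-range (Arabic-Indic digits, U+0660 to U+0669)
# plus one individual character (Arabic comma, U+060C), by Unicode value.
ARABIC_DIGITS = re.compile(r"[\u0660-\u0669\u060C]{1,}")

def find_arabic_numbers(text):
    """Return every run of Arabic-Indic digits and Arabic commas in text."""
    return ARABIC_DIGITS.findall(text)

# Hypothetical sample containing the digits one, two, three.
sample = "Total: \u0661\u0662\u0663 units"
print(find_arabic_numbers(sample))
```

Western digits such as 123 fall outside both references, so they are not matched by this pattern.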

Note that the Umango Regex builder does not support these character references, so you will need to type them into the expression manually.

More details on how you can reference special characters in your regex can be found here.

Link to this article http://umango.com/KB?article=112