ICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications. ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software. ICU is released under a nonrestrictive open-source license that is suitable for use with both commercial software and other open-source or free software.

Convert text data to or from Unicode and nearly any other character set or encoding. ICU's conversion tables are based on charset data collected by IBM over the course of many decades and are the most complete available anywhere.

Compare strings according to the conventions and standards of a particular language, region or country. ICU's collation is based on the Unicode Collation Algorithm plus locale-specific comparison rules from the Common Locale Data Repository, a comprehensive source for this type of data.
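As a rough illustration of the conversion and collation services described above, the following sketch transcodes bytes between encodings and sorts strings with locale-aware rules. It is an assumption-laden stand-in: it uses the JDK's own `java.nio.charset` and `java.text.Collator` classes rather than ICU itself, so that it runs without extra dependencies. ICU4J offers a similarly named `Collator` in `com.ibm.icu.text`, but its converter and collation data are broader than the JDK's. The class and method names (`ConversionAndCollationSketch`, `transcode`, `sortForLocale`) are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class ConversionAndCollationSketch {

    // Decode bytes from one encoding into a Unicode string, then
    // re-encode that string in the target encoding.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);
        return text.getBytes(to);
    }

    // Return a copy of the list sorted with the locale's collation rules
    // instead of raw code unit order.
    static List<String> sortForLocale(List<String> items, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(items);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        // "café" encoded as ISO-8859-1: the final byte 0xE9 is 'é'.
        byte[] latin1 = {(byte) 0x63, (byte) 0x61, (byte) 0x66, (byte) 0xE9};
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(utf8.length); // prints 5, since 'é' takes two bytes in UTF-8

        List<String> words = Arrays.asList("banana", "apple", "Cherry");

        // Plain code unit order puts the uppercase "Cherry" first.
        List<String> naive = new ArrayList<>(words);
        naive.sort(null);
        System.out.println(naive); // prints [Cherry, apple, banana]

        // English collation orders by letter first and treats case as a minor difference.
        System.out.println(sortForLocale(words, Locale.ENGLISH)); // prints [apple, banana, Cherry]
    }
}
```

The contrast between the two sorted lists shows why locale-aware comparison matters: code unit order is rarely what a reader of a given language expects.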
Features
- Support for handling text containing a mixture of left-to-right (English) and right-to-left (Arabic or Hebrew) data
- ICU's regular expressions fully support Unicode while providing very competitive performance
- Locate the positions of words, sentences, and paragraphs within a range of text, or identify locations that would be suitable for line wrapping when displaying the text
- Format numbers, dates, times and currency amounts according to the conventions of a chosen locale
- Compare strings according to the conventions and standards of a particular language, region or country
- A thorough set of time zone calculation APIs is provided
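A few of the features listed above can be sketched in a handful of lines. The example below is only a minimal illustration and rests on one assumption that should be stated loudly: it uses the JDK's `java.text` classes (`BreakIterator`, `NumberFormat`, `Bidi`) instead of ICU, so that it runs without extra dependencies. ICU4J provides similarly named classes in `com.ibm.icu.text`, but their behavior and data coverage may differ. The class and method names (`FeatureSketch`, `words`, `formatNumber`, `isMixedDirection`) are hypothetical.

```java
import java.text.Bidi;
import java.text.BreakIterator;
import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class FeatureSketch {

    // Split text at word boundaries and keep only the segments that
    // contain at least one letter or digit (dropping spaces and punctuation).
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (piece.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(piece);
            }
        }
        return out;
    }

    // Format a number using the grouping and decimal separators of the locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Report whether the text mixes left-to-right and right-to-left runs.
    static boolean isMixedDirection(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return bidi.isMixed();
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world", Locale.ENGLISH)); // prints [Hello, world]

        System.out.println(formatNumber(1234567.891, Locale.US));
        System.out.println(formatNumber(1234567.891, Locale.GERMANY));

        // Latin letters followed by three Hebrew letters (alef, bet, gimel).
        System.out.println(isMixedDirection("abc \u05D0\u05D1\u05D2")); // prints true
        System.out.println(isMixedDirection("abc"));                    // prints false
    }
}
```

The same value passed to `formatNumber` yields different separators per locale, which is the point of the number formatting feature: the caller supplies the locale, and the library supplies the conventions.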