Internationalization with Qt
The internationalization of an application is the process of making
the application usable by people in countries other than one's own.
In some cases internationalization is simple: for example, making a US
application accessible to Australian or British users may require
little more than a few spelling corrections. But to make a US
application usable by Japanese users, or a Korean application usable
by German users, will require that the software operate not only in
different languages, but use different input techniques, character
encodings and presentation conventions.
See also the Qt Linguist manual.
Step by Step
Writing multiplatform international software with Qt is a gentle,
incremental process. Your software can become internationalized in
the following stages:
Use QString for all User-visible Text
Since QString uses the Unicode encoding internally, every
language in the world can be processed transparently using
familiar text processing operations. Also, since all Qt
functions that present text to the user take a QString as a
parameter, there is no char* to QString conversion time overhead.
Strings that are in "programmer space" (such as QObject names
and file format texts) need not use QString; the traditional
char* or the QCString class will suffice.
You're unlikely to notice that you are using Unicode;
QString and QChar are just like easier versions of the crude
const char* and char from traditional C.
Use tr() for all Literal Text
Wherever your program uses "quoted text" for text that will
be presented to the user, ensure that it is processed by the QApplication::translate() function. Essentially all that is necessary
to achieve this is to use QObject::tr(). For example, assuming the
LoginWidget is a subclass of QWidget:
    LoginWidget::LoginWidget()
    {
        QLabel *label = new QLabel( tr("Password:"), this );
        ...
    }
This accounts for 99% of the user-visible strings you're likely to
write.
If the quoted text is not in a member function of a
QObject subclass, use either the tr() function of an
appropriate class, or the QApplication::translate() function
directly:
    void some_global_function( LoginWidget *logwid )
    {
        QLabel *label = new QLabel(
                LoginWidget::tr("Password:"), logwid );
    }

    void same_global_function( LoginWidget *logwid )
    {
        QLabel *label = new QLabel(
                qApp->translate("LoginWidget", "Password:"),
                logwid );
    }
If you need to have translatable text completely
outside a function, there are two macros to help: QT_TR_NOOP()
and QT_TRANSLATE_NOOP(). They merely mark the text for
extraction by the lupdate utility described below.
The macros expand to just the text (without the context).
Example of QT_TR_NOOP():
    QString FriendlyConversation::greeting( int greet_type )
    {
        static const char* greeting_strings[] = {
            QT_TR_NOOP( "Hello" ),
            QT_TR_NOOP( "Goodbye" )
        };
        return tr( greeting_strings[greet_type] );
    }
Example of QT_TRANSLATE_NOOP():
    static const char* greeting_strings[] = {
        QT_TRANSLATE_NOOP( "FriendlyConversation", "Hello" ),
        QT_TRANSLATE_NOOP( "FriendlyConversation", "Goodbye" )
    };

    QString FriendlyConversation::greeting( int greet_type )
    {
        return tr( greeting_strings[greet_type] );
    }

    QString global_greeting( int greet_type )
    {
        return qApp->translate( "FriendlyConversation",
                                greeting_strings[greet_type] );
    }
If you disable the const char* to QString automatic conversion
by compiling your software with the macro QT_NO_CAST_ASCII
defined, you'll be very likely to catch any strings you are
missing. See QString::fromLatin1() for more information.
Note that disabling the conversion can make programming more cumbersome.
If your source language uses characters outside Latin-1, you
might find QObject::trUtf8() more convenient than
QObject::tr(), as tr() depends on the
QApplication::defaultCodec(), which makes it more fragile than
QObject::trUtf8().
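The fragility comes from the fact that the same bytes in a source file
mean different text under different codecs. The following is a minimal,
Qt-free sketch in standard C++ that decodes one byte sequence both as
Latin-1 and as UTF-8. The names CodePoints, decodeLatin1() and
decodeUtf8() are made up for this illustration and are not part of Qt.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<unsigned long> CodePoints;

// Latin-1: every byte is the code point with the same value.
CodePoints decodeLatin1(const std::string& bytes)
{
    CodePoints out;
    for (std::size_t i = 0; i < bytes.size(); ++i)
        out.push_back(static_cast<unsigned char>(bytes[i]));
    return out;
}

// Minimal UTF-8 decoder: handles well-formed 1- to 4-byte sequences and
// emits U+FFFD for anything it cannot decode. It does not reject overlong
// forms or surrogates, so it is not a validating decoder.
CodePoints decodeUtf8(const std::string& bytes)
{
    CodePoints out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;      // number of continuation bytes expected
        unsigned long cp = 0;       // code point being assembled
        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else { out.push_back(0xFFFD); ++i; continue; }

        bool ok = (i + extra < bytes.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char next = static_cast<unsigned char>(bytes[i + k]);
            if ((next & 0xC0) != 0x80)
                ok = false;
            else
                cp = (cp << 6) | (next & 0x3F);
        }
        if (ok) { out.push_back(cp); i += extra + 1; }
        else    { out.push_back(0xFFFD); ++i; }
    }
    return out;
}
```

Given the two bytes 0xC3 0xA9, decodeUtf8() yields the single code point
U+00E9, while decodeLatin1() yields two code points, U+00C3 and U+00A9.
Pure US-ASCII input decodes identically under both.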
Use QKeySequence() for Accelerator Values
Accelerator values such as Ctrl+Q or Alt+F need to be
translated too. If you hardcode CTRL+Key_Q for "Quit" in
your application, translators won't be able to override
it. The correct idiom is
    QPopupMenu *file = new QPopupMenu( this );
    file->insertItem( tr("&Quit"), this, SLOT(quit()),
                      QKeySequence(tr("Ctrl+Q", "File|Quit")) );
Use QString::arg() for Simple Arguments
The printf() style of inserting arguments in strings
is often a poor choice for internationalized text, as it is
sometimes necessary to change the order of arguments when
translating. The QString::arg() functions offer a simple
alternative: each argument replaces a numbered place marker
(%1, %2, ...), so a translator can reorder the markers freely:
    void FileCopier::showProgress( int done, int total,
                                   const QString& current_file )
    {
        label.setText( tr("%1 of %2 files copied.\nCopying: %3")
                       .arg(done)
                       .arg(total)
                       .arg(current_file) );
    }
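To see why numbered place markers survive reordering, here is a small
sketch in standard C++ rather than Qt. The helper substitute() is
hypothetical and simplified: it replaces %N directly with the Nth
argument in a single pass, whereas QString::arg() is called once per
argument.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Qt: replaces each %N marker (N = 1..9)
// in `pattern` with args[N-1]. Markers without a matching argument are
// copied through unchanged.
std::string substitute(const std::string& pattern,
                       const std::vector<std::string>& args)
{
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '%' && i + 1 < pattern.size()
            && pattern[i + 1] >= '1' && pattern[i + 1] <= '9') {
            const std::size_t index =
                static_cast<std::size_t>(pattern[i + 1] - '1');
            if (index < args.size()) {
                out += args[index];
                ++i;            // skip the digit as well
                continue;
            }
        }
        out += c;
    }
    return out;
}
```

With the arguments "3" and "10", the pattern "%1 of %2 files copied."
gives "3 of 10 files copied.", and a reordered translation pattern such
as "%2 files in total, %1 copied." gives "10 files in total, 3 copied."
without any change to the calling code.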
Produce Translations
Once you are using tr() throughout an application, you can start
producing translations of the user-visible text in your program.
The Qt Linguist manual provides further information about Qt's
translation tools: Qt Linguist, lupdate and lrelease.
Translation of a Qt application is a three-step process:
- Run lupdate to extract translatable text from the C++ source
code of the Qt application, resulting in a message file for
translators (a .ts file). The utility recognizes the tr() construct
and the QT_*_NOOP macros described above and produces .ts files
(usually one per language).
- Provide translations for the source texts in the .ts file, using
Qt Linguist. Since .ts files are in XML format, you can also
edit them by hand.
- Run lrelease to obtain a light-weight message file (a .qm
file) from the .ts file, suitable only for end use. Think of the .ts files as "source files", and .qm files as "object files". The
translator edits the .ts files, but the users of your application
only need the .qm files. Both kinds of files are platform and
locale independent.
Typically, you will repeat these steps for every release of your
application. The lupdate utility does its best to reuse the
translations from previous releases.
Before you run lupdate, you should prepare a project file. Here's
an example project file (.pro file):
    HEADERS         = funnydialog.h \
                      wackywidget.h
    SOURCES         = funnydialog.cpp \
                      main.cpp \
                      wackywidget.cpp
    FORMS           = fancybox.ui
    TRANSLATIONS    = superapp_dk.ts \
                      superapp_fi.ts \
                      superapp_no.ts \
                      superapp_se.ts
When you run lupdate or lrelease, you must give the name of the
project file as a command-line argument.
In this example, four exotic languages are supported: Danish, Finnish,
Norwegian and Swedish. If you use qmake (or tmake), you usually don't need an extra project
file for lupdate; your qmake project file will work fine once
you add the TRANSLATIONS entry.
In your application, you must QTranslator::load() the translation
files appropriate for the user's language, and install them using QApplication::installTranslator().
If you have been using the old Qt tools (findtr, msg2qm and mergetr), you can use qm2ts to convert your old .qm files.
linguist, lupdate and lrelease are installed in $QTDIR/bin. Click Help|Manual in Qt Linguist to access the user's
manual; it contains a tutorial to get you started.
While these utilities offer a convenient way to create .qm files,
any system that writes .qm files is sufficient. You could make an
application that adds translations to a QTranslator with
QTranslator::insert() and then writes a .qm file with
QTranslator::save(). This way the translations can come from any
source you choose.
Qt itself contains about 400 strings that will also need to be
translated into the languages that you are targeting. You will find
translation files for French and German in $QTDIR/translations as
well as a template for translating to other languages.
Typically, your application's main() function will look like this:
    int main( int argc, char **argv )
    {
        QApplication app( argc, argv );

        // translation file for Qt
        QTranslator qt( 0 );
        qt.load( QString( "qt_" ) + QTextCodec::locale(), "." );
        app.installTranslator( &qt );

        // translation file for application strings
        QTranslator myapp( 0 );
        myapp.load( QString( "myapp_" ) + QTextCodec::locale(), "." );
        app.installTranslator( &myapp );

        ...

        return app.exec();
    }
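When several translators are installed, Qt searches them starting with
the most recently installed one, and the source text is returned
unchanged if none of them has a translation. The sketch below models
that lookup in standard C++ without Qt. Catalog, TranslatorChain and
demoTranslate() are hypothetical names, and the German strings are only
sample data.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one loaded message file: maps
// (context, source text) to a translation.
typedef std::map<std::pair<std::string, std::string>, std::string> Catalog;

// Hypothetical stand-in for the application's list of installed translators.
class TranslatorChain {
public:
    void install(const Catalog* catalog) { catalogs_.push_back(catalog); }

    // Search the most recently installed catalog first; fall back to the
    // source text when no catalog has an entry.
    std::string translate(const std::string& context,
                          const std::string& source) const
    {
        const std::pair<std::string, std::string> key(context, source);
        for (std::vector<const Catalog*>::const_reverse_iterator it =
                 catalogs_.rbegin(); it != catalogs_.rend(); ++it) {
            const Catalog::const_iterator found = (*it)->find(key);
            if (found != (*it)->end())
                return found->second;
        }
        return source;
    }

private:
    std::vector<const Catalog*> catalogs_;
};

// Usage sketch: a catalog for library strings is installed first and an
// application catalog second, mirroring the main() function above.
std::string demoTranslate(const std::string& context, const std::string& source)
{
    Catalog qt;
    qt[std::make_pair(std::string("QDialog"), std::string("Cancel"))] = "Abbrechen";
    qt[std::make_pair(std::string("QDialog"), std::string("Help"))] = "Hilfe";

    Catalog app;
    app[std::make_pair(std::string("LoginWidget"), std::string("Password:"))] = "Passwort:";
    app[std::make_pair(std::string("QDialog"), std::string("Cancel"))] = "Abbruch";

    TranslatorChain chain;
    chain.install(&qt);     // installed first, searched last
    chain.install(&app);    // installed last, searched first
    return chain.translate(context, source);
}
```

Because the application catalog is searched first, its entry for
"Cancel" takes precedence over the one in the library catalog, while
"Help" is still found in the library catalog and an unknown string such
as "Quit" comes back untranslated.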
Support for Encodings
The QTextCodec class and the facilities in QTextStream make it easy to
support many input and output encodings for your users' data. When an
application starts, the locale of the machine will determine the 8-bit
encoding used when dealing with 8-bit data, such as for font
selection, text display, 8-bit text I/O and character input.
The application may occasionally require encodings other than the
default local 8-bit encoding. For example, an application in a
Cyrillic KOI8-R locale (the de-facto standard locale in Russia) might
need to output Cyrillic in the ISO 8859-5 encoding. Code for this
would be:
    QString string = ...; // some Unicode text
    QTextCodec* codec = QTextCodec::codecForName( "ISO 8859-5" );
    QCString encoded_string = codec->fromUnicode( string );
    ...; // use encoded_string in 8-bit operations
For converting Unicode to local 8-bit encodings, a shortcut is
available: the local8Bit() method
of QString returns such 8-bit data. Another useful shortcut is the
utf8() method, which returns text in the
8-bit UTF-8 encoding: this perfectly preserves Unicode information
while looking like plain US-ASCII if the Unicode is wholly US-ASCII.
For converting the other way, there are the QString::fromUtf8() and
QString::fromLocal8Bit() convenience functions, or the general code,
demonstrated by this conversion from ISO 8859-5 Cyrillic to Unicode:
    QCString encoded_string = ...; // Some ISO 8859-5 encoded text.
    QTextCodec* codec = QTextCodec::codecForName("ISO 8859-5");
    QString string = codec->toUnicode(encoded_string);
    ...; // Use string in all of Qt's QString operations.
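What a codec such as the ISO 8859-5 one does can be sketched without Qt.
The following standard C++ code maps ISO 8859-5 bytes to Unicode code
points and writes them out as UTF-8. The function names are invented for
this example; a real application would use QTextCodec as shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ISO 8859-5 byte -> Unicode code point. Bytes 0x00-0xA0 map to themselves;
// 0xA1-0xFF map to U+0401-U+045F except for three non-Cyrillic characters.
unsigned long iso8859_5ToUnicode(unsigned char byte)
{
    if (byte <= 0xA0) return byte;
    if (byte == 0xAD) return 0x00AD;   // soft hyphen
    if (byte == 0xF0) return 0x2116;   // numero sign
    if (byte == 0xFD) return 0x00A7;   // section sign
    return 0x0400ul + (byte - 0xA0);
}

// Append one code point (up to U+FFFF, enough for this table) as UTF-8.
void appendUtf8(std::string& out, unsigned long cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert a whole ISO 8859-5 encoded string to UTF-8.
std::string iso8859_5ToUtf8(const std::string& encoded)
{
    std::string out;
    for (std::size_t i = 0; i < encoded.size(); ++i)
        appendUtf8(out,
                   iso8859_5ToUnicode(static_cast<unsigned char>(encoded[i])));
    return out;
}
```

US-ASCII input passes through unchanged, which matches the remark about
UTF-8 above: iso8859_5ToUtf8("Qt") is again "Qt". Each Cyrillic letter
becomes a two-byte UTF-8 sequence.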
Ideally Unicode I/O should be used as this maximizes the portability
of documents between users around the world, but in reality it is
useful to support all the appropriate encodings that your users will
need to process existing documents. In general, Unicode (UTF-16 or
UTF-8) is best for information transferred between arbitrary people,
while within a language or national group, a local standard is often
more appropriate. The most important encoding to support is the one
returned by QTextCodec::codecForLocale(), as this is the one the user
is most likely to need for communicating with other people and
applications (this is the codec used by local8Bit()).
Since most Unix systems do not have built-in support for converting
between local 8-bit encodings and Unicode, it may be necessary to
write your own QTextCodec subclass. Depending on the urgency, it may
be useful to contact Trolltech technical support or ask on the
qt-interest mailing list to see if someone else is already working
on supporting the encoding. A useful interim measure can be to use the
QTextCodec::loadCharmapFile() function to build a data-driven codec,
although this approach has a memory and speed penalty, especially with
dynamically loaded libraries. For details of writing your own
QTextCodec, see the main QTextCodec class documentation.
Localize
Localization is the process of adapting to local conventions such as
date and time presentations. Such localizations can be accomplished
using appropriate tr() strings, even "magic" words, as this somewhat
contrived example shows:
    void Clock::setTime(const QTime& t)
    {
        if ( tr("AMPM") == "AMPM" ) {
            // 12-hour clock
        } else {
            // 24-hour clock
        }
    }
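Filled in, the two branches might look like the following sketch in
standard C++. The helpers formatTime() and formatForLocale() are
hypothetical; the second takes the already translated magic word as a
plain string so that the example does not depend on Qt. In a real
application the AM and PM suffixes would need tr() as well.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a time of day either as "1:30 PM"
// (12-hour clock) or as "13:30" (24-hour clock).
std::string formatTime(int hour, int minute, bool twelveHourClock)
{
    assert(hour >= 0 && hour < 24 && minute >= 0 && minute < 60);
    char buffer[32];
    if (twelveHourClock) {
        // Hours 0 and 12 are both displayed as 12.
        const int display = (hour % 12 == 0) ? 12 : hour % 12;
        std::snprintf(buffer, sizeof buffer, "%d:%02d %s",
                      display, minute, hour < 12 ? "AM" : "PM");
    } else {
        std::snprintf(buffer, sizeof buffer, "%02d:%02d", hour, minute);
    }
    return buffer;
}

// The "magic word" test from the text, written against a plain string:
// an untranslated "AMPM" selects the 12-hour clock, anything else the
// 24-hour clock.
std::string formatForLocale(int hour, int minute,
                            const std::string& translatedMagicWord)
{
    return formatTime(hour, minute, translatedMagicWord == "AMPM");
}
```

A translator who wants a 24-hour clock only has to translate "AMPM" to
any other string; the program logic stays the same.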
Localizing images is not recommended. Choose clear icons that are
appropriate for all localities, rather than relying on local puns or
stretched metaphors.
System Support
Operating systems and window systems supporting Unicode are still in
the early stages of development. The level of support available in the
underlying system influences the support Qt provides on that platform,
but applications written with Qt need not generally be too concerned
with the actual limitations.
Unix/X11
- Locale-oriented fonts and input methods. Qt hides these and
provides Unicode input and output.
- Filesystem conventions such as UTF-8 are under development
in some Unix variants. All Qt file functions allow Unicode,
but convert all filenames to the local 8-bit encoding, as
this is the Unix convention (see QFile::setEncodingFunction()
to explore alternative encodings).
- File I/O defaults to the local 8-bit encoding,
with Unicode options in QTextStream.
Windows
- Qt provides full Unicode support, including input methods, fonts,
clipboard, drag-and-drop and file names.
- File I/O defaults to Latin-1, with Unicode options in QTextStream.
Note that some Windows programs do not understand big-endian
Unicode text files even though that is the order prescribed by
the Unicode Standard in the absence of higher-level protocols.
- Unlike programs written with MFC or plain winlib, Qt programs
are portable between Windows 95/98 and Windows NT.
You do not need different binaries to support Unicode.
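The byte order of a UTF-16 text file is usually announced by a byte
order mark at the start of the file. A reader that follows the rule
mentioned above, big-endian when nothing else is known, can be sketched
in standard C++ as follows. The names are made up for this illustration
and are not part of Qt.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum ByteOrder { BigEndian, LittleEndian };

// Inspect the first two bytes of UTF-16 data. A byte order mark (U+FEFF)
// stored as FE FF means big-endian, FF FE means little-endian. Without a
// mark, fall back to big-endian.
ByteOrder detectUtf16ByteOrder(const std::string& data, bool* hasMark = 0)
{
    bool mark = false;
    ByteOrder order = BigEndian;
    if (data.size() >= 2) {
        const unsigned char b0 = static_cast<unsigned char>(data[0]);
        const unsigned char b1 = static_cast<unsigned char>(data[1]);
        if (b0 == 0xFE && b1 == 0xFF) {
            mark = true;
            order = BigEndian;
        } else if (b0 == 0xFF && b1 == 0xFE) {
            mark = true;
            order = LittleEndian;
        }
    }
    if (hasMark)
        *hasMark = mark;
    return order;
}

// Decode the 16-bit code unit stored at `offset` using the given byte order.
unsigned int readCodeUnit(const std::string& data, std::size_t offset,
                          ByteOrder order)
{
    assert(offset + 1 < data.size());
    const unsigned int first = static_cast<unsigned char>(data[offset]);
    const unsigned int second = static_cast<unsigned char>(data[offset + 1]);
    return (order == BigEndian) ? ((first << 8) | second)
                                : ((second << 8) | first);
}
```

A program that assumes little-endian data without checking for the mark
would read the big-endian bytes 00 41 as U+4100 instead of U+0041.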
Supporting More Input Methods
While Trolltech doesn't have the resources or expertise in all the
languages of the world to immediately include support in Qt, we are
very keen to work with people who do have the expertise. Over the next
few minor version numbers, we hope to add support for your language of
choice, until everyone can use Qt and all the programs developed with
Qt, regardless of their language.
Languages with single-byte encodings (European Latin-1 and KOI8-R,
etc.) and multi-byte encodings (East Asian EUC-JP, etc.) are
supported. Support for the "complex" encodings, those requiring
right-to-left input or complex character composition (e.g. Arabic,
Hebrew, and Thai script), is implemented, but support for the range of
Indic scripts (Hindi, Devanagari, Bengali, etc.) is still under development.
The current state of activity is:
Encodings | Status
All encodings on Windows | The local encoding is always supported.
ISO standard encodings ISO 8859-1, ISO 8859-2, ISO 8859-3, ISO 8859-4, ISO 8859-5, ISO 8859-7, ISO 8859-9, and ISO 8859-15 | Fully supported.
KOI8-R | Fully supported.
eucJP, JIS, and ShiftJIS | Fully supported. Uses eucJP with the XIM protocol on X11, and the IME in Japanese Windows NT. Serika Kurusugawa and others are assisting with this effort. kinput2 is the tested input method for X11.
eucKR | Supported. Mizi Research are assisting with this effort. hanIM is the tested input method.
Big5 | Qt contains a Big5 codec developed by Ming Che-Chuang. Testing is underway with the xcin (2.5.x) XIM server.
eucTW | Under external development.
More information on the support of different writing systems in Qt can
be found in the documentation about writing systems.
If you are interested in contributing to existing efforts, or
supporting new encodings beyond those mentioned above, your work can
be considered for inclusion in the official Qt distribution, or just
included with your application.
Eventually, we hope to help Unix become as Unicode-oriented as Windows
is becoming. This means better font support in the font servers, with
new developments like the True Type font servers xfsft, xfstt, and x-tt,
as well as UTF-8 (a Unicode encoding) filenames such as with the
Unicode support in Solaris 7.
Note about Locales on X11
Many Unix distributions contain only partial support for some locales.
For example, if you have a /usr/share/locale/ja_JP.EUC directory,
this does not necessarily mean you can display Japanese text; you also
need JIS encoded fonts (or Unicode fonts), and that /usr/share/locale/ja_JP.EUC directory needs to be complete. For best
results, use complete locales from your system vendor.
Relevant Qt Classes
These classes are relevant to internationalizing Qt applications.