Discussion:
How to add the second (or other) languages
pozz
2025-02-12 16:26:26 UTC
I have an embedded project that runs on a platform without a full OS
(bare metal). The application interacts with its users through Italian
messages. These messages are displayed on a touch screen or sent in the
payload of SMS or push notifications.

I used a very stupid approach: sprintf() with hard-coded constant
strings. For example:

void
display_event(Event *ev)
{
    if (ev->type == EVENT_TYPE_ON) {
        display_printf("Evento %d: accensione", ev->idx);
    } else ...
    ...
}

Now I want to add a new language.

I could create a new build that replaces the constant strings at
preprocessor time:

#if LANGUAGE_ITALIAN
# define STRING123 "Evento %d: accensione"
#elif LANGUAGE_ENGLISH
# define STRING123 "Event %d: power up"
#endif

void
display_event(Event *ev)
{
    if (ev->type == EVENT_TYPE_ON) {
        display_printf(STRING123, ev->idx);
    } else ...
    ...
}

This way I can save some space in memory, but I will have two completely
different production binaries for the two languages.


Another approach is giving the user the possibility to change the
language at runtime, maybe with an option on the display. In some cases,
I have enough memory to store all the strings in all languages.

I know there are many possible solutions, but I'd like to hear some
suggestions from you. For example, it would be nice if there were a
tool that automatically extracts all the strings used in the source code
and helps manage multiple languages.
Stefan Reuther
2025-02-12 17:14:26 UTC
Post by pozz
#if LANGUAGE_ITALIAN
#  define STRING123            "Evento %d: accensione"
#elif LANGUAGE_ENGLISH
#  define STRING123            "Event %d: power up"
#endif
[...]
Post by pozz
Another approach is giving the user the possibility to change the
language at runtime, maybe with an option on the display. In some cases,
I have enough memory to store all the strings in all languages.
Put the strings into a structure.

struct Strings {
    const char* power_up_message;
};

I hate global variables, so I pass a pointer to the structure to every
function that needs it (but of course you can also make a global variable).

Then, on language change, just point your structure pointer elsewhere,
or load the strings from secondary storage.

One disadvantage is that this loses you the compiler warnings for
mismatching printf specifiers.
Post by pozz
I know there are many possible solutions, but I'd like to know some
suggestions from you. For example, it could be nice if there was some
tool that automatically extracts all the strings used in the source code
and helps managing more languages.
There's packages like gettext. You tag your strings as
'printf(_("Event %d"), e)', and the 'xgettext' command will extract them
all into a .po file. Other tools help you manage these files (e.g.
'msgmerge'; Emacs 'po-mode'), and gcc knows how to do proper printf
warnings.

The .po file is a mapping from English to Whateverish strings. So you
would convert that into some space-efficient resource file, and
implement the '_' macro/function to perform the mapping. The
disadvantage is that this takes a lot of memory because your app needs to
have both the English and the translated strings in memory. But unless
you also use a fancy preprocessor that translates your code to
'printf(getstring(STR123), e)', I don't see how to avoid that. In C++20,
you might come up with some compile-time hashing...

I wouldn't use that on a microcontroller, but it's nice for desktop apps.


Stefan
David Brown
2025-02-12 19:50:18 UTC
Post by Stefan Reuther
The .po file is a mapping from English to Whateverish strings. So you
would convert that into some space-efficient resource file, and
implement the '_' macro/function to perform the mapping. The
disadvantage is that this takes lot of memory because your app needs to
have both the English and the translated strings in memory. But unless
you also use a fancy preprocessor that translates your code to
'printf(getstring(STR123), e)', I don't see how to avoid that. In C++20,
you might come up with some compile-time hashing...
I wouldn't use that on a microcontroller, but it's nice for desktop apps.
Stefan
You don't need a very fancy pre-processor to handle this yourself, if
you are happy to make a few changes to the code. Have your code use
something like :

#define DisplayPrintf(id, desc, args...) \
    display_printf(strings[language][string_ ## id], ## args)

Use it like :

DisplayPrintf(event_type_on, "Event on", ev->idx);


A little Python preprocessor script can chew through all your C files
and identify each call to "DisplayPrintf". It can collect together all
the id's and generate a header with something like :

typedef enum {
    string_event_type_on, ...
} string_index;
enum { no_of_strings = ... };

typedef enum {
    lang_English, lang_Italian, ...
} language_index;
enum { no_of_languages = ... };

extern language_index language; // global var :-)
extern const char* strings[no_of_languages][no_of_strings];

Then it will have a C file :

#include "language.h"

language_index language;
const char* strings[no_of_languages][no_of_strings] = {
    { // English
        "Event %d: power up", // Event on
        ...
    },
    { // Italian
        "Evento %d: accensione", // Event on
    }
};

It would generate the strings based on language files:

# english.txt
event_type_on : Event %d: power up
...
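
A minimal sketch of how such a language file could be read. It assumes one "id : text" entry per line and treats lines starting with "#" as comments; both are assumptions, since only the two example lines above are given. The name parse_language_file is made up for illustration.

```python
def parse_language_file(text):
    """Parse 'id : translated text' lines into a dict (hypothetical format)."""
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' lines carry no translation.
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so the message itself may contain colons.
        ident, sep, message = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in line: {raw!r}")
        table[ident.strip()] = message.strip()
    return table


english = parse_language_file("""\
# english.txt
event_type_on : Event %d: power up
""")
print(english)  # {'event_type_on': 'Event %d: power up'}
```

Stripping the message means leading or trailing spaces in a translation are lost; a real tool would need a rule for that.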

If the preprocessor finds a use of DisplayPrintf where the id (which can
be as long or short as you want, but can't have spaces or awkward
characters) does not match the description, it should give an error -
duplicate uses of the same pair are skipped. (You could just use an id
and no description if you prefer.)
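
The id/description check described above could be sketched roughly like this. It assumes every call sits on one line with the shape DisplayPrintf(id, "description", ...); the regular expression and the name collect_ids are illustrative, not part of any existing tool.

```python
import re

# Assumed call shape: DisplayPrintf(identifier, "description" ...) on a single line.
# The lookbehind keeps names such as XDisplayPrintf from matching.
CALL_RE = re.compile(
    r'(?<![A-Za-z0-9_])DisplayPrintf\s*\(\s*([A-Za-z_]\w*)\s*,\s*"((?:[^"\\]|\\.)*)"'
)


def collect_ids(lines):
    """Return {id: description}; raise if one id is used with two descriptions."""
    ids = {}
    for lineno, line in enumerate(lines, start=1):
        for ident, desc in CALL_RE.findall(line):
            if ident in ids and ids[ident] != desc:
                raise ValueError(
                    f"line {lineno}: id {ident!r} used with {desc!r}, "
                    f"but earlier with {ids[ident]!r}"
                )
            # A repeated identical pair is simply skipped.
            ids.setdefault(ident, desc)
    return ids


source = [
    'DisplayPrintf(event_type_on, "Event on", ev->idx);',
    'DisplayPrintf(event_type_on, "Event on", other->idx);',
]
print(collect_ids(source))  # {'event_type_on': 'Event on'}
```

The macro definition line itself is not picked up, because its second parameter is not a string literal.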

Any ids that are not in the language files will be printed out or put in
a file, ids that are in the language files but not used in the program
will give warnings, etc.
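
As a rough sketch, that cross-check is just two set differences. The function name check_language and the example ids event_type_off and old_banner are hypothetical.

```python
def check_language(used_ids, translations):
    """Compare ids used in the C code with ids present in one language file.

    Returns (missing, unused): ids that still need a translation, and
    translations that no code refers to any more.
    """
    used = set(used_ids)
    translated = set(translations)
    missing = sorted(used - translated)
    unused = sorted(translated - used)
    return missing, unused


used = ["event_type_on", "event_type_off"]
italian = {"event_type_on": "Evento %d: accensione", "old_banner": "Benvenuto"}

missing, unused = check_language(used, italian)
for ident in missing:
    print(f"error: no Italian string for {ident}")
for ident in unused:
    print(f"warning: Italian string {ident} is not used")
```

Whether a missing id is an error or just a note for the translator is a policy decision for the project.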

It can all be done in a manner that makes it easy to get right, hard to
get wrong, and will not cause trouble as strings are added or removed.

It would be a lot simpler than gettext, and use minimal runtime space
and time. And it should be straightforward to change if you want to
have string tables stored externally or something like that. (I've made
systems with string tables in an external serial eprom, for example.)
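
For the generation step, a sketch along these lines would emit the header and the C file shown earlier. The function names are invented for illustration, and it assumes every language table already contains every id; a real script would need a policy for gaps.

```python
def c_literal(text):
    """Quote text as a C string literal (escapes only backslash and double quote)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def generate_header(ids, languages):
    """Build the text of language.h from the collected ids and language names."""
    out = ["typedef enum {"]
    out += [f"    string_{ident}," for ident in ids]
    out += ["} string_index;", f"enum {{ no_of_strings = {len(ids)} }};", ""]
    out += ["typedef enum {"]
    out += [f"    lang_{name}," for name in languages]
    out += ["} language_index;", f"enum {{ no_of_languages = {len(languages)} }};", ""]
    out += ["extern language_index language;",
            "extern const char* strings[no_of_languages][no_of_strings];"]
    return "\n".join(out) + "\n"


def generate_source(ids, tables):
    """Build the C file; tables maps language name -> {id: text}, in enum order."""
    out = ['#include "language.h"', "", "language_index language;",
           "const char* strings[no_of_languages][no_of_strings] = {"]
    for name, table in tables.items():
        out.append(f"    {{   // {name}")
        out += [f"        {c_literal(table[ident])},    // {ident}" for ident in ids]
        out.append("    },")
    out.append("};")
    return "\n".join(out) + "\n"


ids = ["event_type_on"]
tables = {
    "English": {"event_type_on": "Event %d: power up"},
    "Italian": {"event_type_on": "Evento %d: accensione"},
}
print(generate_header(ids, list(tables)))
print(generate_source(ids, tables))
```

Because c_literal escapes backslashes, a translator who types \n in a language file would get those two characters literally; that choice is an assumption of this sketch.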
pozz
2025-02-16 18:59:58 UTC
Post by David Brown
You don't need a very fancy pre-processor to handle this yourself, if
you are happy to make a few changes to the code.  Have your code use
#define DisplayPrintf(id, desc, args...) \
    display_printf(strings[language][string_ ## id], ## x)
    DisplayPrintf(event_type_on, "Event on", ev->idx);
A little Python preprocessor script can chew through all your C files
and identify each call to "DisplayPrintf".
Little... yes, it would be little, but not simple, at least for me. How
do I write a correct C preprocessor in Python?

This preprocessor should ingest a C source file after it is preprocessed
by the standard C preprocessor for the specific build you are doing.

For example, you could have a C source file that contains:

#if BUILD == BUILD_FULL
    DisplayPrintf(msg, "Press (1) for simple process, (2) for advanced process");
    x = wait_keypress();
    if (x == '1') do_simple();
    if (x == '2') do_adv();
#elif BUILD == BUILD_LIGHT
    do_simple();
#endif

If I'm building the project as BUILD_FULL, there's at least one
additional string to translate.

Another big problem is the Python preprocessor should understand C
syntax; it shouldn't simply search for DisplayPrintf occurrences.
For example:

/* DisplayPrintf(old_string, "This is an old message"); */
DisplayPrintf(new_string, "This is a new message");

Of course, only one string is present in the source file, but it's not
simple to extract it.
Thanks for the suggestion, the idea is great. However I'm not able to
write a Python preprocessor that works well.
David Brown
2025-02-17 08:51:05 UTC
Post by pozz
Post by David Brown
You don't need a very fancy pre-processor to handle this yourself, if
you are happy to make a few changes to the code.  Have your code use
#define DisplayPrintf(id, desc, args...) \
     display_printf(strings[language][string_ ## id], ## x)
     DisplayPrintf(event_type_on, "Event on", ev->idx);
A little Python preprocessor script can chew through all your C files
and identify each call to "DisplayPrintf".
Little... yes, it would be little, but not simple, at least for me. How
to write a correct C preprocessor in Python?
You don't write a C preprocessor - that's the point.

Tools like gettext have to handle any C code. That means they need to
deal with situations with complicated macros, include files, etc.

You don't need to do that when you make your own tools. You make the
rules - /you/ decide what limitations you will accept in order to
simplify the pre-processing script.

So you would typically decide you only put these DisplayPrintf calls in
C files, not headers, that you ignore all normal C preprocessor stuff,
and that you keep each call entirely on one line, and that you'll never
use the sequence "DisplayPrintf" for anything else. Then your Python
preprocessor becomes :

    for line in open(filename).readlines() :
        if "DisplayPrintf" in line :
            handle(line)

This is /vastly/ simpler than dealing with more general C code, without
significant restrictions to you as the programmer using the system.

If you /really/ want to handle include files, conditional compilation
and all the rest of it, get the C compiler to handle that - use "gcc -E" and
use the output of that. Trying to duplicate that in your own Python
code would be insane.
Post by pozz
This preprocessor should ingest a C source file after it is preprocessed
by the standard C preprocessor for the specific build you are doing.
#if BUILD == BUILD_FULL
  DisplayPrintf(msg, "Press (1) for simple process, (2) for advanced
process");
  x = wait_keypress();
  if (x == '1') do_simple();
  if (x == '2') do_adv();
#elif BUILD == BUILD_LIGHT
  do_simple();
#endif
The really simple answer is, don't do that.
Post by pozz
If I'm building the project as BUILD_FULL, there's at least one
additional string to translate.
The slightly more complex answer is that you end up with an extra string
in one build or the other. Almost certainly, this is not worth
bothering about. And if it is - say you have a large number of extra
strings in a debug test build - then I'm sure you can find convenient
ways to handle that. At a minimum, you'd probably not bother having
translated versions but fall back to English.
Post by pozz
Another big problem is the Python preprocessor should understand C
syntax; it shouldn't simply search for DisplayPrintf occurrences.
Why not?
Post by pozz
/* DisplayPrintf(old_string, "This is an old message"); */
DisplayPrintf(new_string, "This is a new message");
Of course, only one string is present in the source file, but it's not
simple to extract it.
It's extremely simple to extract it. Remember - /you/ make the rules.
If you don't want to bother skipping such commented-out lines, /you/
pick a convenient way to do so. For example, you would decide that the
opening comment token must be at the start of the white-space stripped
line :

    if line.strip().startswith("/*") :
        return False

    if line.strip().startswith("//") :
        return False

(I've been talking about Python here, because that's the language I use
for such tools, and it's a very common choice. If you are not familiar
with Python then you can obviously use any other language you like.)


Or alternatively, have :

#define XDisplayPrintf(...)

And now your commenting system becomes :

XDisplayPrintf(old_string, "This is an old message");
DisplayPrintf(new_string, "This is a new message");

The "XDisplayPrintf" can be inside comments or conditionally uncompiled
code if you like. (You do have to filter out XDisplayPrintf bits from
the earlier check for DisplayPrintf.)
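
Both rules can be folded into one small filter. This is only a sketch under the restrictions discussed above (one call per line, a comment marker at the start of the stripped line); the name is_live_call is made up.

```python
import re

# Match DisplayPrintf( only when it is not the tail of a longer name,
# so XDisplayPrintf( is filtered out.
LIVE_CALL_RE = re.compile(r"(?<![A-Za-z0-9_])DisplayPrintf\s*\(")


def is_live_call(line):
    """True if the line holds a DisplayPrintf call that should be extracted."""
    stripped = line.strip()
    # Rule: a commented-out call has its comment token at the start of the line.
    if stripped.startswith("/*") or stripped.startswith("//"):
        return False
    return LIVE_CALL_RE.search(stripped) is not None


lines = [
    '/* DisplayPrintf(old_string, "This is an old message"); */',
    'XDisplayPrintf(old_string, "This is an old message");',
    'DisplayPrintf(new_string, "This is a new message");',
]
print([is_live_call(line) for line in lines])  # [False, False, True]
```

A comment that opens on an earlier line, or in the middle of a line, is not detected; that is exactly the kind of limitation the rules are meant to rule out.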
Post by pozz
Thanks for the suggestion, the idea is great. However I'm not able to
write a Python preprocessor that works well.
Sure you can. You just have to redefine what you mean by "works well"
to suit what you can write :-)


For my own use, I probably wouldn't even bother handling commented-out
strings. I have used this kind of technique for message translation and
a variety of other situations.


For more fun, you could switch to modern C++ and use user-defined
literals combined with constexpr template variables to put together a
system that is all within the one source language and is fully checked
at compile-time. I'm not sure it would be clearer, however!
pozz
2025-02-17 15:05:30 UTC
Post by David Brown
If you /really/ want to handle include files, conditional compilation
and all rest of it, get the C compiler to handle that - use "gcc -E" and
use the output of that.  Trying to duplicate that in your own Python
code would be insane.
And this is the reason why it seemed like a complex task to me :-)

You're right, this is my own tool and I decide the rules. Many times I
try to solve the complete and general problem when, in reality, the
scope of the problem is much smaller.

The only drawback is that YOU (and all the developers who work on the
project now and in the future) have to remember your own rules forever
for that project.
Post by David Brown
The slightly more complex answer is that you end up with an extra string
in one build or the other.  Almost certainly, this is not worth
bothering about.
Oh yes, but that was only an example. We can think of other scenarios
where the preprocessor could change the string depending on the build.
Post by David Brown
It's extremely simple to extract it.  Remember - /you/ make the rules.
If you don't want to bother skipping such commented-out lines, /you/
pick a convenient way to do so.  For example, you would decide that the
opening comment token must be at the start of the white-space stripped
line :
    if line.strip().startswith("/*") :
        return False
    if line.strip().startswith("//") :
        return False
I see, more rules: no multi-line comments, no comments that start
in the middle of a line...
Post by David Brown
(I've been talking about Python here, because that's the language I use
for such tools, and it's a very common choice.  If you are not familiar
with Python then you can obviously use any other language you like.)
Python is fine for me too :-)
Post by David Brown
    #define XDisplayPrintf(...)
    XDisplayPrintf(old_string, "This is an old message");
    DisplayPrintf(new_string, "This is a new message");
The "XDisplayPrintf" can be inside comments or conditionally uncompiled
code if you like.  (You do have to filter out XDisplayPrintf bits from
the earlier check for DisplayPrintf.)
We are always talking about rules. In this case: if you comment out a
DisplayPrintf(), give it a leading X.
David Brown
2025-02-17 18:09:12 UTC
Post by pozz
And this is the reason why it seemed like a complex task to me :-)
You're right, this is my own tool and I decide the rules. Many times I
try to solve the complete and general problem when, in reality, the
scope of the problem is much smaller.
The only drawback is that YOU (and all the developers who work on the
project now and in the future) have to remember your own rules forever
for that project.
This is embedded development. It is not always easy or straightforward.
When a problem seems difficult, re-arrange it or subdivide it into
things that you /can/ solve. Here I've given one solution (of many
possible solutions) - it makes some things easier, but also requires
other changes. You can use a big, general solution like gettext and
document how that should work in your development, or you can make a
much smaller and simpler, but more limited, custom solution and document
/that/. There are /always/ pros and cons, tradeoffs and balances in
this game.
Post by pozz
Oh yes, but that was only an example. We can think of other scenarios
where the preprocessor could change the string depending on the build.
As the saying goes, you can burn that bridge when you come to it.
Imagining all the possible ways things can go wrong or be complicated
can be a lot more effort than getting a solution for the actual
practical situation.


I am not guaranteeing that my ideas here will be ideal for your needs.
But it is roughly in the direction of a system that I have used
successfully myself, and it's where I would start out in the situation
you described. Hopefully it gives you a good starting point for your
own solution - or at least something to compare to other potential
solutions when judging them.

pozz
2025-02-16 21:56:55 UTC
Post by David Brown
You don't need a very fancy pre-processor to handle this yourself, if
you are happy to make a few changes to the code.  Have your code use
#define DisplayPrintf(id, desc, args...) \
    display_printf(strings[language][string_ ## id], ## x)
What is the final "## x"?
Post by David Brown
    DisplayPrintf(event_type_on, "Event on", ev->idx);
Other problems came to mind.

There are many functions that accept "translatable" strings, not only
DisplayPrintf(). OK, I can write a macro for each of these functions.

I could have other C statements that make the task more complex. For
example:

    char mymsg[32];
    sprintf(mymsg, "Ciao mondo");
    DisplayPrintf(hello_msg, mymsg);

A Python preprocessor can't detect where the string to translate is.
David Brown
2025-02-17 08:57:35 UTC
Post by pozz
Post by David Brown
You don't need a very fancy pre-processor to handle this yourself, if
you are happy to make a few changes to the code.  Have your code use
#define DisplayPrintf(id, desc, args...) \
     display_printf(strings[language][string_ ## id], ## x)
What is the final "## x"?
It's a gcc extension that skips the extra comma if args is empty
(combined with a typo in my post - "x" should have been "args").

If you want to stick to standard C, C23 introduced the __VA_OPT__
feature to handle this in a less convenient manner.
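As a rough sketch of the __VA_OPT__ variant, assuming a compiler that accepts it (C23, or recent gcc/clang as an extension in older modes). The strings table, the language variable, the "hello" entry and the display_printf() stand-in that writes into a buffer are all assumptions made for the example, not part of the real project.

```c
#include <stdarg.h>
#include <stdio.h>

enum { LANG_IT, LANG_EN, LANG_COUNT };
enum { string_event_type_on, string_hello, STRING_COUNT };

/* Hypothetical string table, indexed by language and string id. */
static const char *const strings[LANG_COUNT][STRING_COUNT] = {
    [LANG_IT] = { "Evento %d: accensione", "Ciao mondo"  },
    [LANG_EN] = { "Event %d: power up",    "Hello world" },
};
static int language = LANG_IT;

/* Stand-in for the real display routine: formats into a buffer. */
static char display_line[64];
static int display_printf(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(display_line, sizeof display_line, fmt, ap);
    va_end(ap);
    return n;
}

/* desc only documents the call site and is discarded.
 * __VA_OPT__(,) emits the comma only when extra arguments are present. */
#define DisplayPrintf(id, desc, ...) \
    display_printf(strings[language][string_ ## id] __VA_OPT__(,) __VA_ARGS__)
```

Both DisplayPrintf(event_type_on, "Event on", ev->idx) and DisplayPrintf(hello, "Greeting") expand to valid calls, which is what the gcc ", ## args" form achieves.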
Post by pozz
Post by David Brown
     DisplayPrintf(event_type_on, "Event on", ev->idx);
Other problems that came to my mind.
There are many functions that accept "translatable" strings, not only
DisplayPrintf(). Ok, I can write a macro for each of these functions.
Yes.

Or write a single macro for the translation, and use that within those
functions:

DisplayPrintf(trans(event_type_on, "Event on"), ev->idx);
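A sketch of the single-macro idea, assuming the same kind of table indexed by language and string id. The names strings, language, the sms_event_on entry and the two formatting helpers are invented for the example; the point is only that one trans() macro serves every function that takes a translatable string.

```c
#include <stdio.h>

enum language  { LANG_IT, LANG_EN, LANG_COUNT };
enum string_id { string_event_type_on, string_sms_event_on, STRING_COUNT };

/* Hypothetical table, e.g. generated by a script from the trans() call sites. */
static const char *const strings[LANG_COUNT][STRING_COUNT] = {
    [LANG_IT] = { "Evento %d: accensione", "SMS: evento %d, accensione" },
    [LANG_EN] = { "Event %d: power up",    "SMS: event %d, power up"    },
};
static enum language language = LANG_IT;

/* Maps an id to the string in the current language; desc is a hint for
 * the reader (and for the extraction script) and is discarded. */
#define trans(id, desc) (strings[language][string_ ## id])

/* Two unrelated consumers share the same macro, so no per-function
 * wrapper macro is needed. */
static int display_event_on(char *out, size_t size, int idx)
{
    return snprintf(out, size, trans(event_type_on, "Event on"), idx);
}

static int sms_event_on(char *out, size_t size, int idx)
{
    return snprintf(out, size, trans(sms_event_on, "SMS event on"), idx);
}
```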
Post by pozz
I could have other C statements that make the task more complex. For
example:
char mymsg[32];
sprintf(mymsg, "Ciao mondo");
DisplayPrintf(hello_msg, mymsg);
A Python preprocessor isn't able to detect where the string to translate is.
So don't write your code that way.
pozz
2025-02-16 22:15:21 UTC
Post by Stefan Reuther
There's packages like gettext. You tag your strings as
'printf(_("Event %d"), e)', and the 'xgettext' command will extract them
all into a .po file. Other tools help you manage these files (e.g.
'msgmerge'; Emacs 'po-mode'), and gcc knows how to do proper printf
warnings.
The .po file is a mapping from English to Whateverish strings. So you
would convert that into some space-efficient resource file, and
implement the '_' macro/function to perform the mapping. The
disadvantage is that this takes a lot of memory because your app needs to
have both the English and the translated strings in memory. But unless
you also use a fancy preprocessor that translates your code to
'printf(getstring(STR123), e)', I don't see how to avoid that. In C++20,
you might come up with some compile-time hashing...
I wouldn't use that on a microcontroller, but it's nice for desktop apps.
In some projects keeping all the translated strings is not a problem.

All the gettext tools seem good (xgettext, marking strings to translate
in the source code, .pot file, msginit, msgmerge, msgfmt, .po files, .mo
files, ...) except the final step.

.mo files are meant to be installed in a file system, and the gettext
library automatically loads the correct .mo file from a suitable path.
All of this is impractical on microcontroller systems.

Is it so difficult to import mo files as C const unsigned char arrays
and implement the gettext() function to search strings from them?

Another approach could be to rewrite a custom msgfmt tool that converts
a .po file into a simpler .mo file (or directly a .c file) that can be
used by a custom gettext() function.
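To make that last idea concrete, here is a sketch of what the generated .c file and the custom lookup could look like. It assumes a hypothetical converter that emits the entries sorted by msgid in strcmp() order; the catalog contents and the names catalog_it, active_catalog and my_gettext() are made up, and my_gettext() is not the GNU gettext() function.

```c
#include <stdlib.h>
#include <string.h>

struct translation {
    const char *msgid;   /* original (English) string, used as the key */
    const char *msgstr;  /* translated string */
};

/* Would be generated from the .po file; must be sorted by msgid. */
static const struct translation catalog_it[] = {
    { "Event %d: power down", "Evento %d: spegnimento" },
    { "Event %d: power up",   "Evento %d: accensione"  },
    { "Hello world",          "Ciao mondo"             },
};

/* Catalog currently in use; NULL selects the untranslated strings. */
static const struct translation *active_catalog = catalog_it;
static size_t active_count = sizeof catalog_it / sizeof catalog_it[0];

static int compare_msgid(const void *key, const void *elem)
{
    const struct translation *t = elem;
    return strcmp(key, t->msgid);
}

/* Binary search in the active catalog; falls back to msgid itself
 * when there is no catalog or no matching entry. */
static const char *my_gettext(const char *msgid)
{
    if (active_catalog == NULL)
        return msgid;
    const struct translation *t = bsearch(msgid, active_catalog, active_count,
                                          sizeof *t, compare_msgid);
    return t ? t->msgstr : msgid;
}

#define _(s) my_gettext(s)
```

Call sites stay in the usual gettext style, e.g. display_printf(_("Event %d: power up"), ev->idx), so xgettext can still extract them. The cost, as noted above, is that the English strings remain in the binary as keys.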
David Brown
2025-02-17 08:59:50 UTC
Post by pozz
Is it so difficult to import mo files as C const unsigned char arrays
and implement the gettext() function to search strings from them?
You know the answer... a little Python script that reads mo files and
generates files with C constant arrays. You'd also probably need to
make a few changes to the gettext language choice functions. (I've used
gettext with big Python programs, but never in embedded C code.)
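For the variant that keeps the .mo file as-is, a lookup over the in-memory image might look roughly like the sketch below. It relies on the documented GNU .mo layout (magic number, string count, two tables of length/offset pairs, originals sorted) and makes several assumptions: the image is little-endian, it is trusted (no bounds checks), and plural forms and message contexts are ignored. mo_lookup() and the tiny hand-built demo_mo image are illustrative only; a real image would be generated from msgfmt output by a script.

```c
#include <stdint.h>
#include <string.h>

#define MO_MAGIC 0x950412deu

/* Reads a 32-bit little-endian value from the image. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Binary search over the sorted originals table; returns msgid itself
 * if the image is not recognised or the string is not found. */
static const char *mo_lookup(const unsigned char *mo, const char *msgid)
{
    if (rd32(mo) != MO_MAGIC)
        return msgid;
    uint32_t count = rd32(mo + 8);
    const unsigned char *orig = mo + rd32(mo + 12);  /* originals table */
    const unsigned char *tran = mo + rd32(mo + 16);  /* translations table */

    uint32_t lo = 0, hi = count;
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        /* Each table entry is {length, offset}; the offset is at +4. */
        const char *candidate = (const char *)(mo + rd32(orig + 8 * mid + 4));
        int cmp = strcmp(msgid, candidate);
        if (cmp == 0)
            return (const char *)(mo + rd32(tran + 8 * mid + 4));
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return msgid;
}

/* Hand-made image for illustration: "Off" -> "Spento", "On" -> "Acceso". */
static const unsigned char demo_mo[] = {
    0xde, 0x12, 0x04, 0x95,      /* magic, little-endian */
    0, 0, 0, 0,                  /* format revision 0 */
    2, 0, 0, 0,                  /* number of strings */
    28, 0, 0, 0,                 /* offset of originals table */
    44, 0, 0, 0,                 /* offset of translations table */
    0, 0, 0, 0,                  /* hash table size (none) */
    60, 0, 0, 0,                 /* hash table offset (unused) */
    3, 0, 0, 0,  60, 0, 0, 0,    /* "Off": length 3, offset 60 */
    2, 0, 0, 0,  64, 0, 0, 0,    /* "On": length 2, offset 64 */
    6, 0, 0, 0,  67, 0, 0, 0,    /* "Spento": length 6, offset 67 */
    6, 0, 0, 0,  74, 0, 0, 0,    /* "Acceso": length 6, offset 74 */
    'O', 'f', 'f', 0,  'O', 'n', 0,
    'S', 'p', 'e', 'n', 't', 'o', 0,  'A', 'c', 'c', 'e', 's', 'o', 0,
};
```

Language selection then reduces to choosing which image pointer is passed to mo_lookup(), which replaces the path-based loading that the desktop library does.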
Stefan Reuther
2025-02-17 18:00:43 UTC
Post by pozz
Another approach could be to rewrite a custom msgfmt tool that converts
a .po file into a simpler .mo file (or directly a .c file) that can be
used by a custom gettext() function.
That's precisely what I tried to suggest (and personally use).


Stefan
Niocláiſín Cóilín de Ġloſtéir
2025-02-13 21:51:10 UTC
Pozz wrote:
"Another approach is giving the user the possibility to change the
language at runtime, maybe with an option on the display."

Ciao!

This is a good idea.