#if LANGUAGE_ITALIAN[...]
# define STRING123 "Evento %d: accensione"
#elif LANGUAGE_ENGLISH
# define STRING123 "Event %d: power up"
#endif
Another approach is giving the user the possibility to change the
language at runtime, maybe with an option on the display. In some cases,
I have enough memory to store all the strings in all languages.
I know there are many possible solutions, but I'd like to hear your
suggestions. For example, it would be nice if there were some tool that
automatically extracts all the strings used in the source code and helps
manage several languages.
On 12.02.2025 at 17:26, pozz wrote:
> Another approach is giving the user the possibility to change the
> language at runtime, maybe with an option on the display. In some cases,
> I have enough memory to store all the strings in all languages.
Put the strings into a structure.
struct Strings {
    const char* power_up_message;
};
I hate global variables, so I pass a pointer to the structure to every function that needs it (but of course you can also make a global variable).
Then, on language change, just point your structure pointer elsewhere,
or load the strings from secondary storage.
One disadvantage is that this loses you the compiler warnings for
mismatching printf specifiers.
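Part of that check can be won back offline. As a rough sketch (not something from this thread's code base), a small script could compare the printf conversions of each translated string against the reference string before the tables are built. The names `conversions` and `same_conversions` are made up for illustration, and the comparison is deliberately strict: `%5d` and `%3d` count as different.

```python
import re

# One printf conversion such as %d, %5.2f, %-10s or %lu; "%%" is a literal percent sign.
_SPEC = re.compile(
    r"%(?:%|[-+ #0]*(?:\d+|\*)?(?:\.(?:\d+|\*))?(?:hh|h|ll|l|j|z|t|L)?[diouxXeEfFgGaAcspn])"
)

def conversions(fmt):
    """Return the conversion specifiers of a printf format string, in order, ignoring %%."""
    return [spec for spec in _SPEC.findall(fmt) if spec != "%%"]

def same_conversions(reference, translation):
    """True if both strings contain exactly the same conversions in the same order."""
    return conversions(reference) == conversions(translation)
```

Run over every language table at build time, this would flag a translation that turned `%d` into `%s` before it ever reaches the target.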
> I know there are many possible solutions, but I'd like to know some
> suggestions from you. For example, it could be nice if there was some
> tool that automatically extracts all the strings used in the source code
> and helps managing more languages.
There are packages like gettext. You tag your strings as
'printf(_("Event %d"), e)', and the 'xgettext' command will extract them
all into a .po file. Other tools help you manage these files (e.g.
'msgmerge'; Emacs 'po-mode'), and gcc knows how to do proper printf
warnings.
The .po file is a mapping from English to Whateverish strings. So you
would convert that into some space-efficient resource file, and
implement the '_' macro/function to perform the mapping. The
disadvantage is that this takes a lot of memory because your app needs to
have both the English and the translated strings in memory. But unless
you also use a fancy preprocessor that translates your code to
'printf(getstring(STR123), e)', I don't see how to avoid that. In C++20,
you might come up with some compile-time hashing...
I wouldn't use that on a microcontroller, but it's nice for desktop apps.
Stefan
On 12/02/2025 18:14, Stefan Reuther wrote:
> But unless you also use a fancy preprocessor that translates your code to
> 'printf(getstring(STR123), e)', I don't see how to avoid that.
You don't need a very fancy pre-processor to handle this yourself, if
you are happy to make a few changes to the code. Have your code use something like :
#define DisplayPrintf(id, desc, args...) \
display_printf(strings[language][string_ ## id], ## x)
Use it like :
DisplayPrintf(event_type_on, "Event on", ev->idx);
A little Python preprocessor script can chew through all your C files
and identify each call to "DisplayPrintf".
It can collect together all the ids and generate a header with
something like :
typedef enum {
    string_event_type_on, ...
} string_index;
enum { no_of_strings = ... };
typedef enum {
    lang_English, lang_Italian, ...
} language_index;
enum { no_of_languages = ... };
extern language_index language; // global var :-)
extern const char* strings[no_of_languages][no_of_strings];
Then it will generate a C file :
#include "language.h"
language_index language;
const char* strings[no_of_languages][no_of_strings] = {
    { // English
        "Event %d: power up", // Event on
        ...
    },
    { // Italian
        "Evento %d: accensione", // Event on
    }
};
It would generate the strings based on language files:
# english.txt
event_type_on : Event %d: power up
...
If the preprocessor finds a use of DisplayPrintf where the id (which can
be as long or short as you want, but can't have spaces or awkward
characters) does not match the description, it should give an error -
duplicate uses of the same pair are skipped. (You could just use an id
and no description if you prefer.)
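A minimal sketch of that collection step, assuming every call sits on one line with the id followed by a string-literal description. The function name `collect_ids` and the regular expression are only illustrative:

```python
import re

# Assumes the self-imposed rule: DisplayPrintf(id, "description", ...) on a single line.
# The leading \b keeps names such as XDisplayPrintf from matching.
CALL = re.compile(r'\bDisplayPrintf\s*\(\s*(\w+)\s*,\s*"((?:[^"\\]|\\.)*)"')

def collect_ids(lines):
    """Map each id to its description; raise if an id reappears with another description."""
    seen = {}
    for number, line in enumerate(lines, start=1):
        for ident, desc in CALL.findall(line):
            if ident in seen and seen[ident] != desc:
                raise ValueError(
                    f"line {number}: id '{ident}' already used with description {seen[ident]!r}"
                )
            seen.setdefault(ident, desc)  # identical repeats are simply skipped
    return seen
```

The resulting dictionary is all that is needed to write out the enum of string ids.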
Any ids that are not in the language files will be printed out or put in
a file, ids that are in the language files but not used in the program
will give warnings, etc.
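As a sketch of that cross-check, assuming the 'id : text' layout shown above with '#' lines treated as comments (the helper names `parse_language_file` and `cross_check` are invented here):

```python
def parse_language_file(text):
    """Parse 'id : text' lines; blank lines and lines starting with '#' are skipped."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        ident, sep, message = line.partition(":")  # split at the first colon only
        if not sep:
            continue  # not an 'id : text' line
        table[ident.strip()] = message.strip()
    return table

def cross_check(used_ids, table):
    """Return (missing, unused): ids without a translation, and translations never used."""
    used = set(used_ids)
    missing = sorted(used - set(table))
    unused = sorted(set(table) - used)
    return missing, unused
```

The build could treat a non-empty `missing` list as an error and print `unused` as warnings.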
It can all be done in a manner that makes it easy to get right, hard to
get wrong, and will not cause trouble as strings are added or removed.
It would be a lot simpler than gettext, and use minimal runtime space
and time. And it should be straightforward to change if you want to
have string tables stored externally or something like that. (I've made systems with string tables in an external serial eprom, for example.)
On 12/02/2025 20:50, David Brown wrote:
> A little Python preprocessor script can chew through all your C files
> and identify each call to "DisplayPrintf".
Little... yes, it would be little, but not simple, at least for me. How
to write a correct C preprocessor in Python?
This preprocessor should ingest a C source file after it is preprocessed
by the standard C preprocessor for the specific build you are doing.
For example, you could have a C source file that contains:
#if BUILD == BUILD_FULL
    DisplayPrintf(msg, "Press (1) for simple process, (2) for advanced process");
    x = wait_keypress();
    if (x == '1') do_simple();
    if (x == '2') do_adv();
#elif BUILD == BUILD_LIGHT
    do_simple();
#endif
If I'm building the project as BUILD_FULL, there's at least one
additional string to translate.
Another big problem is the Python preprocessor should understand C
syntax; it shouldn't simply search for DisplayPrintf occurrences.
For example:
/* DisplayPrintf(old_string, "This is an old message"); */
DisplayPrintf(new_string, "This is a new message");
Of course, only one string is present in the source file, but it's not
simple to extract it.
Thanks for the suggestion, the idea is great. However I'm not able to
write a Python preprocessor that works well.
On 12/02/2025 20:50, David Brown wrote:
> #define DisplayPrintf(id, desc, args...) \
>     display_printf(strings[language][string_ ## id], ## x)
What is the final "## x"?
> Use it like :
> DisplayPrintf(event_type_on, "Event on", ev->idx);
Other problems came to my mind.
There are many functions that accept "translatable" strings, not only
DisplayPrintf(). OK, I can write a macro for each of these functions.
Other C statements could make the task more complex. For example:
char mymsg[32];
sprintf(mymsg, "Ciao mondo");
DisplayPrintf(hello_msg, mymsg);
A Python preprocessor isn't able to detect where the string to translate is.
On 12/02/2025 18:14, Stefan Reuther wrote:
> I wouldn't use that on a microcontroller, but it's nice for desktop apps.
In some projects keeping all the translated strings is not a problem.
All the gettext tools seem good (xgettext, marking strings to translate
in the source code, .pot file, msginit, msgmerge, msgfmt, .po files, .mo
files, ...) except the final step.
The .mo files have to be installed in a file system, and the gettext
library automatically loads the correct .mo file from a suitable path.
All of this is impractical on microcontroller systems.
Would it be so difficult to import .mo files as C const unsigned char
arrays and implement a gettext() function that looks strings up in them?
Another approach could be to rewrite a custom msgfmt tool that converts
a .po file into a simpler .mo file (or directly a .c file) that can be
used by a custom gettext() function.
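To make the second idea concrete, here is a rough sketch of such a converter. It is not a replacement for msgfmt: it assumes every msgid and msgstr fits on one line, ignores plurals and contexts, and copies the escaped text straight into the C source. The names `parse_po`, `emit_c_table` and the generated `struct translation` are invented for illustration.

```python
import re

# A single-line 'msgid "..."' or 'msgstr "..."' entry; multi-line entries are not handled.
ENTRY = re.compile(r'^(msgid|msgstr)\s+"((?:[^"\\]|\\.)*)"\s*$')

def parse_po(text):
    """Return (msgid, msgstr) pairs; the header entry and untranslated entries are skipped."""
    pairs, msgid = [], None
    for line in text.splitlines():
        m = ENTRY.match(line.strip())
        if not m:
            continue
        kind, value = m.groups()
        if kind == "msgid":
            msgid = value
        elif msgid:               # msgstr that follows a non-empty msgid
            if value:             # empty msgstr means "not translated yet"
                pairs.append((msgid, value))
            msgid = None
    return pairs

def emit_c_table(pairs, name="translations"):
    """Emit C source for a constant lookup table, sorted for reproducible output."""
    lines = [
        "struct translation { const char *msgid; const char *msgstr; };",
        f"static const struct translation {name}[] = {{",
    ]
    for msgid, msgstr in sorted(pairs):
        lines.append(f'    {{ "{msgid}", "{msgstr}" }},')
    lines.append("};")
    return "\n".join(lines)
```

A custom gettext() on the target could then walk this table with strcmp() and return the original string when nothing matches.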
On 16/02/2025 19:59, pozz wrote:
> On 12/02/2025 20:50, David Brown wrote:
>> A little Python preprocessor script can chew through all your C files
>> and identify each call to "DisplayPrintf".
> Little... yes, it would be little, but not simple, at least for me.
> How to write a correct C preprocessor in Python?
You don't write a C preprocessor - that's the point.
Tools like gettext have to handle any C code. That means they need to
deal with situations with complicated macros, include files, etc.
You don't need to do that when you make your own tools. You make the
rules - /you/ decide what limitations you will accept in order to
simplify the pre-processing script.
So you would typically decide you only put these DisplayPrintf calls in
C files, not headers, that you ignore all normal C preprocessor stuff,
and that you keep each call entirely on one line, and that you'll never
use the sequence "DisplayPrintf" for anything else. Then your Python
preprocessor becomes :
for line in open(filename) :
    if "DisplayPrintf" in line :
        handle(line)
This is /vastly/ simpler than dealing with more general C code, without
significant restrictions to you as the programmer using the system.
If you /really/ want to handle include files, conditional compilation
and all rest of it, get the C compiler to handle that - use "gcc -E" and
use the output of that. Trying to duplicate that in your own Python
code would be insane.
> This preprocessor should ingest a C source file after it is
> preprocessed by the standard C preprocessor for the specific build you
> are doing.
> For example, you could have a C source file that contains:
> #if BUILD == BUILD_FULL
>     DisplayPrintf(msg, "Press (1) for simple process, (2) for advanced process");
>     x = wait_keypress();
>     if (x == '1') do_simple();
>     if (x == '2') do_adv();
> #elif BUILD == BUILD_LIGHT
>     do_simple();
> #endif
The really simple answer is, don't do that.
> If I'm building the project as BUILD_FULL, there's at least one
> additional string to translate.
The slightly more complex answer is that you end up with an extra string
in one build or the other. Almost certainly, this is not worth
bothering about. And if it is - say you have a large number of extra
strings in a debug test build - then I'm sure you can find convenient
ways to handle that. At a minimum, you'd probably not bother having
translated versions but fall back to English.
> Another big problem is the Python preprocessor should understand C
> syntax; it shouldn't simply search for DisplayPrintf occurrences.
Why not?
> For example:
> /* DisplayPrintf(old_string, "This is an old message"); */
> DisplayPrintf(new_string, "This is a new message");
> Of course, only one string is present in the source file, but it's not
> simple to extract it.
It's extremely simple to extract it. Remember - /you/ make the rules.
If you don't want to bother skipping such commented-out lines, /you/
pick a convenient way to do so. For example, you would decide that the
opening comment token must be at the start of the white-space stripped
line :
if line.strip().startswith("/*") :
    return False
if line.strip().startswith("//") :
    return False
(I've been talking about Python here, because that's the language I use
for such tools, and it's a very common choice. If you are not familiar
with Python then you can obviously use any other language you like.)
Or alternatively, have :
#define XDisplayPrintf(...)
And now your commenting system becomes :
XDisplayPrintf(old_string, "This is an old message");
DisplayPrintf(new_string, "This is a new message");
The "XDisplayPrintf" can be inside comments or conditionally uncompiled
code if you like. (You do have to filter out XDisplayPrintf bits from
the earlier check for DisplayPrintf.)
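Put together, both rules fit in one small filter. This is only a sketch under the conventions above (a comment token at the start of the stripped line, or the XDisplayPrintf spelling); `is_live_call` is a made-up name:

```python
import re

# \b on both sides: "XDisplayPrintf" has no word boundary before the "D", so it never matches.
CALL = re.compile(r"\bDisplayPrintf\b")

def is_live_call(line):
    """True if the line holds a DisplayPrintf call that should be extracted."""
    stripped = line.strip()
    if stripped.startswith("/*") or stripped.startswith("//"):
        return False  # rule: a commented-out call starts its line with the comment token
    return CALL.search(stripped) is not None
```

The main loop would then call `is_live_call(line)` instead of the plain substring test.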
> Thanks for the suggestion, the idea is great. However I'm not able to
> write a Python preprocessor that works well.
Sure you can. You just have to redefine what you mean by "works well"
to suit what you can write :-)
For my own use, I probably wouldn't even bother handling commented-out strings. I have used this kind of technique for message translation and
a variety of other situations.
For more fun, you could switch to modern C++ and use user-defined
literals combined with constexpr template variables to put together a
system that is all within the one source language and is fully checked
at compile-time. I'm not sure it would be clearer, however!
On 17/02/2025 09:51, David Brown wrote:
> Tools like gettext have to handle any C code. That means they need to
> deal with situations with complicated macros, include files, etc.
> You don't need to do that when you make your own tools. You make the
> rules - /you/ decide what limitations you will accept in order to
> simplify the pre-processing script.
And this is the reason why it appeared to me a complex task :-)
You're right, this is my own tool and I decide the rules. Many times I
try to solve the complete and general problem when, in reality, the
scope of the problem is much smaller.
The only drawback is that YOU (and all the developers who work on the
project now and in the future) have to remember your own rules forever
for that project.
> The slightly more complex answer is that you end up with an extra
> string in one build or the other. Almost certainly, this is not worth
> bothering about.
Oh yes, but that was only an example. We can think of other scenarios
where the preprocessor could change the string depending on the build.