Re: [SG16-Unicode] Ideas for the future

From: keld_at <keld_at_[hidden]>
Date: Tue, 30 Jul 2019 16:14:12 +0200
hi all

I would like C++ programs in the future to be as portable
as possible and also as adaptable to cultures as possible, so that when you write
a program it is easy to provide it to as many users as possible.

The first thing - portability - is what we have been aiming at for many years,
and it involves a basic character set as we have it now. That is basically ASCII.
No funny characters like · and ± and ÷.

The second - cultural adaptability - is about having all input and
output in a form that feels natural to users. We go a long way
with the locale facilities we have, but I would like the language to support marking
strings as translatable, and an ecosystem to support it. Most serious programs
today are written for translation. So some syntax for strings
like g"translatable text" could be good. And then maybe some notion for voice too
- and other possible outputs - e.g. for disabled people.
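
To make the idea a bit more concrete, here is a rough sketch of how a translatable string marker could be approximated in today's C++ with a user-defined literal. Everything in it is an assumption for illustration only: the `_tr` suffix stands in for the g"..." syntax, and `catalog()` is a hypothetical in-memory message table, where a real ecosystem would load translations for the user's locale from files.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical message catalog: maps source text to its translation.
// A real program would fill this from a translation file for the user's locale.
inline std::map<std::string, std::string>& catalog() {
    static std::map<std::string, std::string> table;
    return table;
}

// Hypothetical literal suffix standing in for the suggested g"..." syntax.
// Returns the translation if the catalog has one, otherwise the original text.
inline std::string operator""_tr(const char* text, std::size_t length) {
    const std::string key(text, length);
    const auto found = catalog().find(key);
    return found != catalog().end() ? found->second : key;
}
```

With `catalog()["Hello"] = "Hej";` in place, `"Hello"_tr` yields the translated text, while a string with no catalog entry comes back unchanged. A tool could also scan the source for `_tr` literals to extract the texts that need translating.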

Keld

On Mon, Jul 29, 2019 at 11:02:41PM +0000, Lev Minkovsky wrote:
> All,
>
> Tom Honermann encouraged me to share with you several ideas that at some point in the future may become proposable.
>
> First is the ↑ character (Alt-24 with NumLock on). We had a discussion a while back with Bjarne and a few other C++ luminaries regarding a possible exponentiation operator. None of the more conventional alternatives appeared to be a good candidate, while ↑ is a symbol used for that purpose by Donald Knuth, see https://en.wikipedia.org/wiki/Knuth%27s_up-arrow_notation, and would be excellent for readability. Perhaps we can add it at some point to the basic character set. I am not at all worried about its absence on the keyboard; math folks will quickly get used to Alt-24.
>
> I would imagine the right approach for this to happen is to ask ourselves: what are the specific characters that we wish were in the basic character set? My initial list would be: $,@,↑,??? or ·,÷ . $ is already in Microsoft's basic character set, see https://docs.microsoft.com/en-us/cpp/cpp/character-sets?view=vs-2019, so perhaps this would be low-hanging fruit. The middle dot symbol and the obelus could be used as alternative multiplication and division operators. Swift already has user-defined operators; if we ever get them, it would be awesome to have something like
>
> long long operator ·(long m, long n) { return (long long)m * (long long)n; }
>
> The second, far more impactful idea would be to unicodize the entire language and let the users use keywords in their national languages. Programmers outside the USA (surprise, surprise) often think in their native languages and often prefer to write comments in them. For example, I know that the SAP codebase is full of comments in German. A source file is a specialized text, and every language switch is a disorienting experience, especially if the languages are not related. The Algol 68 designers already understood this and translated the language into Russian, German, French, Bulgarian, Chinese and Japanese, including of course the keywords. This could facilitate teaching and studying the language as well.
> As an illustration, let us consider 3 variants of Hello-World: first the canonical version with comments, second the same program with the comments in Russian, and third a hypothetical Hello world/Привет мир in C++ with Russian keywords:
>
>
> //This is needed for printf
> #include <stdio.h>
>
> //Program entry
> int main()
> {
> //Let's greet the world
> printf("Hello world!\n");
> }
>
>
> //Это требуется для printf
> #include <stdio.h>
>
> //Вход в программу
> int main()
> {
> //Приветствуем мир
> printf("Привет мир!\n");
> }
>
>
> //?????? ?????????????????? ?????? ????????????
> #???????????????? <??????????.??>
>
> //???????? ?? ??????????????????
> ?????? ??????????????()
> {
> //???????????????????????? ??????
> ????????????("???????????? ??????!\n");
> }
>
>
> I would imagine that for most if not all of you, the third example looks like gibberish. I can assure you that, for young future programmers from countries where English isn't widely spoken, the first Hello World looks just as much like gibberish. Some of them may even be reluctant to enter a career where they would have to deal with pages and pages of such stuff on a daily basis.
>
> Finally, I wanted to show you a couple of additional "hello-world"s. The first is valid C++ that stress-tests the system it runs on by using English, Russian, Georgian and Chinese words in the same sentence:
>
>
> #include <stdio.h>
>
> int main()
> {
> printf(u8"Hello-Привет-გამარჯობა-你好, world!\n");
> }
>
> The second is something I put together as a 21st-century version of Hello world. Alas, only a very small fraction of it is now well-formed.
>
>
> /*
>
> The first program to write is the same for all languages:
>
> Print the words
>
> hello, world
>
> #include <stdio.h>
>
> int main()
> {
> printf("hello, world\n");
> }
>
> */
>
> import std.ui; //future UI module
> import std.core;
>
> int main()
> {
> static std::map<std::language_id_t, std::u8string> hellos{ //language_id_t comes from std.ui
> { "English"lid, "Hello, world" }, //a literal produces the right language type
> { "Chinese"lid, "???????????????" },
> { "Hindi"lid, "?????????????????? ??????????????????" },
> { "Spanish"lid, "Hola Mundo" },
> { "French"lid, "Bonjour le monde" },
> { "Arabic"lid, "?????????? ??????????????" },
> { "Bengali"lid, "????????? ???????????????" },
> { "Russian"lid, "????????????, ??????" },
> { "Portuguese"lid, "Olá Mundo" },
> { "Indonesian"lid, "Halo Dunia" },
> { "Urdu"lid, "?????????? ????????" },
> { "German"lid, "Hallo Welt" },
> { "Japanese"lid, "?????????????????????" },
> { "Swahili"lid, "Salamu, Dunia" },
> { "Punjabi"lid, "????????? ???????????? ???????????? ???????????????" },
> { "Telugu"lid, "?????????, ?????????????????????" },
> { "Javanese"lid, "Hello, donya" },
> { "Marathi"lid, "????????????, ??????" },
> { "Turkish"lid, "Selam Dünya" },
> };
>
> //More than 75 % of the world population would be able to read and understand its greeting.
>
> std::post_notification ( //this also comes from std.ui
> // if we can define variables in an if statement, why can't we in a ternary operator?
> (std::optional<std::u8string> hello = hellos[std::get_language_id()]) //get the default system language
> ? *hello
> : *hellos["English"lid];
> );
> }
>
>
> Thank you,
>
> Lev Minkovsky
>
>
>
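
The lookup in the quoted `hellos` example relies on `operator[]`, which in today's `std::map` inserts an empty value for a missing key rather than signalling absence. A minimal present-day sketch of the same "system language, else English" fallback might look like the following. The `language_id` alias and `greeting_for` function are hypothetical names for illustration; nothing like `std::language_id_t` or `std::get_language_id()` exists in the standard library.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the proposed std::language_id_t: a plain language name.
using language_id = std::string;

// Return the greeting for `lang`, falling back to the English entry when it is missing.
// find() is used instead of operator[] so that a failed lookup does not insert an entry.
inline std::string greeting_for(const std::map<language_id, std::string>& hellos,
                                const language_id& lang) {
    const auto found = hellos.find(lang);
    if (found != hellos.end()) {
        return found->second;
    }
    const auto english = hellos.find("English");
    return english != hellos.end() ? english->second : std::string{};
}
```

Given a map holding "English" and "German" entries, `greeting_for(hellos, "German")` returns the German greeting, and a language that is not in the map falls back to the English one without modifying the map.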

> _______________________________________________
> SG16 Unicode mailing list
> Unicode_at_[hidden]
> http://www.open-std.org/mailman/listinfo/unicode

Received on 2019-07-30 16:14:12