Re: Draft of proposed SG16 questions for the 2023 C++ Developer Survey

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Tue, 21 Jun 2022 22:17:34 +0200
On Tue, Jun 21, 2022 at 9:49 PM Tom Honermann <tom_at_[hidden]> wrote:

> On 6/18/22 2:32 PM, Corentin Jabot wrote:
>
>
>
> On Sat, Jun 18, 2022, 19:39 Tom Honermann via SG16 <sg16_at_[hidden]>
> wrote:
>
>> A draft of proposed SG16 questions for the 2023 C++ Developer Survey is
>> now available here
>> <https://docs.google.com/document/d/1lRU7uErn2Vc7LOGG2H3PrzCvmf69u8S_v-43by_Vb9c/edit?usp=sharing>.
>> Anyone with the link should be able to view and comment on the draft.
>> Please feel free to add suggestions, corrections, and other comments.
>>
>> The list of questions (19 currently) is likely too long and will need to
>> be trimmed. For reference, the 2022 C++ Developer Survey
>> <https://isocpp.org/blog/2022/06/results-summary-2022-annual-cpp-developer-survey-lite>
>> (described as "Lite") had 19 questions.
>>
> Thanks Tom,
>
> Yes, the list is pretty long, and remember the survey is biased (a few
> thousand people among those who follow standardisation closely). The
> longer the survey, the lower the participation. I can easily imagine each
> study group could come up with a long list of questions too, many of which
> are not relevant to all participants.
>
> I agree and I've been a little worried about other SGs jumping on the
> bandwagon here :)
>
>
> I guess the essence of what we are trying to get at is whether people use
> or would like to use C++ for text processing. Asking that directly is
> probably sufficient. Given a fairly low participation rate, letting people
> write a detailed answer to something like "what would you like to see
> improved in regard to text processing and localization?" would give us good
> replies that we could summarize fairly easily.
>
> The last question in the proposed list is intended for that purpose.
>
>
> I have strong objections to the formulation of question 4, as it isn't
> possible to use emojis in a conforming implementation.
>
> Historically it has been possible to use some emoji, but yes, we fixed
> that.
>
>
> Question 3 is also weird - why these specific languages? It excludes, among
> others, languages using Cyrillic, Arabic, and Brahmic scripts, so probably
> around 2 billion people in total and a fair number of C++ developers -
> although the survey results are likely to be biased towards Europeans and
> North Americans to begin with.
>
> That is an artifact of me being too quick to get draft questions prepared
> and being too uninformed about languages used around the world. The Unicode
> supported scripts list <https://unicode.org/standard/supported.html>
> enumerates 159 scripts. I don't have a good sense of which ones should be
> on this list.
>
> Peter Brett requested this question. Peter, perhaps you have some insight
> into which languages you feel should be explicitly listed?
>

In addition to the existing list: Hindi, Bengali, Arabic, Russian. It's far
from exhaustive, but it covers a large chunk of the global population
without getting technical about which script is derived from which.


>
> More importantly, what is the desired outcome of question 4? C++ supports
> arbitrary characters in comments already, and hopefully no one is
> considering restrictions.
> In some ways question 4 is also redundant with question 1.
>
> I think the main desire is just to get some data regarding whether
> programmers actually use non-basic-characters in identifiers. If many
> programmers answer yes, that might suggest we should do more analysis to
> see if the identifier restrictions put into C++23 via P1949
> <https://wg21.link/p1949> will require some migration assistance.
> Likewise, if many programmers answer I-didn't-know-that-was-possible, that
> may suggest a lack of awareness worth trying to address in some way. The
> survey itself could serve as a way to increase awareness.
>

Fair enough

>
> If question 1 is going to list EBCDIC, surely it should list Shift-JIS and
> GB18030.
>
> Yes, thank you, I added those.
>
>
> What do we want to learn from questions 9 and 14?
>
> Question 9 goes towards motivation for putting normalization form into the
> type system. E.g., should std::text be parameterized by normalization form?
>
> Question 14 was requested by someone else; I don't recall who. I think the
> intent is to help gauge whether we can stop treating these types as
> character types and instead dedicate them for use as small integers, à la
> int8_t and uint8_t. The answer is likely no due to unsigned char being
> used for UTF-8, but having data would be helpful.
>
> What is the motivation behind asking about collation independently of
> locale?
>
> It is an opportunity to ask specifically about use of strcoll and
> std::collate. That motivation may be too weak to justify the question.
>
>
> Why not merge 15 and 17?
>
> That might be possible. Question 15 probes what purposes people use the
> standard locale facilities for. Question 17 probes what facilities people
> use to actually localize text.
>
Yup, I think that's not a distinction worth making. If people use both
std::locale and ICU, they can check two boxes.


> Tom.
>
>
>
>
>
>
>
> The set of questions was culled from:
>>
>> - Prior discussion on the SG16 mailing list
>> <https://lists.isocpp.org/sg16/2022/06/3214.php>.
>> - Discussion during the 2022-06-08 SG16 telecon
>> <https://github.com/sg16-unicode/sg16-meetings#june-8th-2022>.
>>
>> Tom.
>> --
>> SG16 mailing list
>> SG16_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>
>

Received on 2022-06-21 20:17:46