Date: Tue, 21 Jun 2022 15:49:12 -0400
On 6/18/22 2:32 PM, Corentin Jabot wrote:
>
>
> On Sat, Jun 18, 2022, 19:39 Tom Honermann via SG16
> <sg16_at_[hidden]> wrote:
>
> A draft of proposed SG16 questions for the 2023 C++ Developer
> Survey is now available here
> <https://docs.google.com/document/d/1lRU7uErn2Vc7LOGG2H3PrzCvmf69u8S_v-43by_Vb9c/edit?usp=sharing>.
> Anyone with the link should be able to view and comment on the
> draft. Please feel free to add suggestions, corrections, and other
> comments.
>
> The list of questions (19 currently) is likely too long and will
> need to be trimmed. For reference, the 2022 C++ Developer Survey
> <https://isocpp.org/blog/2022/06/results-summary-2022-annual-cpp-developer-survey-lite>
> (described as "Lite") had 19 questions.
>
> Thanks Tom,
>
> Yes, the list is pretty long, and remember the survey is biased (a few
> thousand people among those who follow standardisation closely). The
> longer the survey, the lower the participation. I can easily imagine
> each study group could come up with a long list of questions too, many
> of which are not relevant to all participants.
I agree and I've been a little worried about other SGs jumping on the
bandwagon here :)
>
> I guess the essence of what we are trying to get at is whether people
> use or would like to use C++ for text processing. Asking that directly
> is probably sufficient. Given a fairly low participation rate, letting
> people write a detailed answer to something like "what would you like
> to see improved in regard to text processing and localization?" would
> give us good replies that we could summarize fairly easily.
The last question in the proposed list is intended for that purpose.
>
> I have strong objections to the formulation of question 4, as it isn't
> possible to use emojis in a conforming implementation.
Historically it has been possible to use some emoji, but yes, we fixed that.
>
> Question 3 is also weird - why these specific languages? It excludes,
> among others, languages using Cyrillic, Arabic, and Brahmic scripts, so
> probably around 2 billion people in total and a fair number of C++
> developers - although the survey results are likely to be biased
> towards Europeans and North Americans to begin with.
That is an artifact of me being too quick to get draft questions
prepared and being too uninformed about languages used around the world.
The Unicode supported scripts list
<https://unicode.org/standard/supported.html> enumerates 159 scripts. I
don't have a good sense of which ones should be on this list.
Peter Brett requested this question. Peter, perhaps you have some
insight into which languages you feel should be explicitly listed?
>
> More importantly, what is the desired outcome of question 4? C++
> supports arbitrary characters in comments already, and hopefully no one
> is considering restrictions.
> In some way question 4 is also redundant with question 1.
I think the main desire is just to get some data regarding whether
programmers actually use non-basic-characters in identifiers. If many
programmers answer yes, that might suggest we should do more analysis to
see if the identifier restrictions put into C++23 via P1949
<https://wg21.link/p1949> will require some migration assistance.
Likewise, if many programmers answer I-didn't-know-that-was-possible,
that may suggest a lack of awareness worth trying to address in some
way. The survey itself could serve as a way to increase awareness.
>
> If question 1 is going to list EBCDIC, surely it should list Shift-JIS
> and GB18030.
Yes, thank you, I added those.
>
> What do we want to learn from questions 9 and 14?
Question 9 goes towards motivation for putting normalization form into
the type system. E.g., should std::text be parameterized by
normalization form.
Question 14 was requested by someone else; I don't recall who. I think
the intent is to help gauge whether we can stop treating these types as
character types and instead dedicate them for use as small integers ala
int8_t and uint8_t. The answer is likely no due to unsigned char being
used for UTF-8, but having data would be helpful.
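To illustrate why these types are hard to repurpose, here is a minimal
C++ sketch. It is not part of the survey draft, and the helper name
`streamed` is hypothetical. It shows that the standard stream inserters
treat signed char and unsigned char as characters. On implementations
where int8_t and uint8_t are aliases for those types (common, but an
assumption here), streaming them therefore prints a character rather
than a number.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: stream a value through operator<< and return
// the text that was produced.
template <typename T>
std::string streamed(T value) {
    std::ostringstream out;
    out << value;  // overload resolution picks the inserter for T
    return out.str();
}
```

With this helper, streamed(static_cast<unsigned char>('A')) yields the
one-character string "A", while applying unary plus first promotes the
value to int and yields its numeric code instead.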
> What is the motivation behind asking about collation independently of
> locale?
It is an opportunity to ask specifically about use of strcoll and
std::collate. That motivation may be too weak to justify the question.
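For readers unfamiliar with the two facilities, here is a rough sketch
of how each is typically called. The function names `compare_c` and
`compare_cpp` are hypothetical, and the sketch defaults to the classic
"C" locale so that it does not depend on which named locales are
installed.

```cpp
#include <cstring>
#include <locale>
#include <string>

// C interface: std::strcoll compares two null-terminated strings using
// the LC_COLLATE category of the current global C locale.
int compare_c(const std::string& a, const std::string& b) {
    return std::strcoll(a.c_str(), b.c_str());
}

// C++ interface: the std::collate facet of an explicit std::locale.
// compare() returns -1, 0, or 1.
int compare_cpp(const std::string& a, const std::string& b,
                const std::locale& loc = std::locale::classic()) {
    const auto& coll = std::use_facet<std::collate<char>>(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size());
}
```

The main design difference is that strcoll depends on global state set
through setlocale, whereas std::collate is reached through a locale
object that the caller passes explicitly.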
>
> Why not merge 15 and 17?
That might be possible. Question 15 probes what purposes people use the
standard locale facilities for. Question 17 probes what facilities
people use to actually localize text.
Tom.
>
> The set of questions was culled from:
>
> * Prior discussion on the SG16 mailing list
> <https://lists.isocpp.org/sg16/2022/06/3214.php>.
> * Discussion during the 2022-06-08 SG16 telecon
> <https://github.com/sg16-unicode/sg16-meetings#june-8th-2022>.
>
> Tom.
>
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>
Received on 2022-06-21 19:49:15