Date: Wed, 22 Jun 2022 14:07:11 -0400
On 6/21/22 4:17 PM, Corentin Jabot wrote:
>
>
> On Tue, Jun 21, 2022 at 9:49 PM Tom Honermann <tom_at_[hidden]> wrote:
>
> On 6/18/22 2:32 PM, Corentin Jabot wrote:
>>
>>
>> On Sat, Jun 18, 2022, 19:39 Tom Honermann via SG16
>> <sg16_at_[hidden]> wrote:
>>
>> A draft of proposed SG16 questions for the 2023 C++ Developer
>> Survey is now available here
>> <https://docs.google.com/document/d/1lRU7uErn2Vc7LOGG2H3PrzCvmf69u8S_v-43by_Vb9c/edit?usp=sharing>.
>> Anyone with the link should be able to view and comment on
>> the draft. Please feel free to add suggestions, corrections,
>> and other comments.
>>
>> The list of questions (19 currently) is likely too long and
>> will need to be trimmed. For reference, the 2022 C++
>> Developer Survey
>> <https://isocpp.org/blog/2022/06/results-summary-2022-annual-cpp-developer-survey-lite>
>> (described as "Lite") had 19 questions.
>>
>> Thanks Tom,
>>
>> Yes, the list is pretty long, and remember the survey is biased
>> (a few thousand people among those who follow standardisation
>> closely). The longer the survey, the lower the participation. I can
>> easily imagine each study group could come up with a long list of
>> questions too, many of which are not relevant to all participants.
> I agree and I've been a little worried about other SGs jumping on
> the bandwagon here :)
>>
>> I guess the essence of what we are trying to get at is whether
>> people use or would like to use C++ for text processing. Asking
>> that directly is probably sufficient. Given a fairly low
>> participation rate, letting people write a detailed answer to
>> something like "what would you like to see improved in regard to
>> text processing and localization?" would give us good replies that
>> we could summarize fairly easily.
> The last question in the proposed list is intended for that purpose.
>>
>> I have strong objections to the formulation of question 4, as it
>> isn't possible to use emojis in a conforming implementation.
> Historically it has been possible to use some emoji, but yes, we
> fixed that.
>>
>> Question 3 is also weird - why these specific languages? It
>> excludes, among others, languages using Cyrillic, Arabic, and Brahmic
>> scripts, so probably around 2 billion people in total and a
>> fair number of C++ developers - although the survey results are
>> likely to be biased towards Europeans and North Americans to
>> begin with.
>
> That is an artifact of me being too quick to get draft questions
> prepared and being too uninformed about languages used around the
> world. The Unicode supported scripts list
> <https://unicode.org/standard/supported.html> enumerates 159
> scripts. I don't have a good sense of which ones should be on this
> list.
>
> Peter Brett requested this question. Peter, perhaps you have some
> insight into which languages you feel should be explicitly listed?
>
>
> In addition to the existing list: Hindi, Bengali, Arabic, Russian.
> It's far from exhaustive but it covers a large chunk of the global
> population, without getting technical about which script is derived
> from which
Thank you, updated!
Tom.
>>
>> More importantly, what is the desired outcome of question 4? C++
>> supports arbitrary characters in comments already, and hopefully
>> no one is considering restrictions.
>> In some ways question 4 is also redundant with question 1.
>
> I think the main desire is just to get some data regarding whether
> programmers actually use non-basic-characters in identifiers. If
> many programmers answer yes, that might suggest we should do more
> analysis to see if the identifier restrictions put into C++23 via
> P1949 <https://wg21.link/p1949> will require some migration
> assistance. Likewise, if many programmers answer
> I-didn't-know-that-was-possible, that may suggest a lack of
> awareness worth trying to address in some way. The survey itself
> could serve as a way to increase awareness.
>
>
> Fair enough
>
>>
>> If question 1 is going to list EBCDIC, surely it should list
>> Shift-JIS and GB18030
> Yes, thank you, I added those.
>>
>> What do we want to learn from questions 9 and 14?
>
> Question 9 goes towards motivation for putting normalization form
> into the type system. E.g., should std::text be parameterized by
> normalization form.
>
> Question 14 was requested by someone else; I don't recall who. I
> think the intent is to help gauge whether we can stop treating
> these types as character types and instead dedicate them for use
> as small integers ala int8_t and uint8_t. The answer is likely no
> due to unsigned char being used for UTF-8, but having data would
> be helpful.
>
>> What is the motivation behind asking about collation
>> independently of locale?
> It is an opportunity to ask specifically about use of strcoll and
> std::collate. That motivation may be too weak to justify the question.
>>
>> Why not merge 15 and 17?
>
> That might be possible. Question 15 probes what purposes people
> use the standard locale facilities for. Question 17 probes what
> facilities people use to actually localize text.
>
> Yup, I think that's not a distinction worth making. If people use both
> std::locale and icu, they can check 2 boxes.
>
> Tom.
>
>>
>>
>>
>>
>>
>>
>> The set of questions was culled from:
>>
>> * Prior discussion on the SG16 mailing list
>> <https://lists.isocpp.org/sg16/2022/06/3214.php>.
>> * Discussion during the 2022-06-08 SG16 telecon
>> <https://github.com/sg16-unicode/sg16-meetings#june-8th-2022>.
>>
>> Tom.
>>
>> --
>> SG16 mailing list
>> SG16_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>
>
>
Received on 2022-06-22 18:07:15