Out of the 5 options laid out, I feel it would be best to make it ill-formed.  The source code author's intent is (at best) ambiguous, and the onus should not be placed on the compiler to try to make it work.  The status quo allows for programs to be broken in surprising ways.

I think you might be right.
Especially given that making it work adds implementation burden for the sake of legacy encodings, and there is a somewhat easy fix: NFC-normalize your sources.
Wait, does this apply to u, U, and u8 strings? Users can't have non-NFC-normalized strings?

Well, that was the original question:
If the source is NFD-normalized, and the execution encoding is not a Unicode encoding, what should be done about combining characters?
Okay. I guess I got the answer. For "plain" strings, there is some desire to make some problematic sequences go away. The perceived badness goes away with NFC normalization, but users who apply a blanket NFC normalization to their Unicode source may damage the integrity of their Unicode literals.

The proposed ill-formedness is rather dependent on how "characters" in source code are identified, though. The intent is to preserve whatever normalization the source has (including none at all), so that Unicode literals keep their integrity through the phases of translation. Given that, by "characters" we do mean (for UTF-8 source) UCS scalar values.

And yes, normalizing the source does not preserve Unicode literals.
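
To make that concrete, here is a minimal sketch (the constant names are made up for illustration, not taken from any proposal). It spells the same user-perceived "é" both ways using universal-character-names, so the result does not depend on how the file containing it happens to be normalized. Rewriting the first spelling into the second is exactly what a blanket NFC pass over the source would do to a literal typed with a combining character.

```cpp
#include <cstddef>

// Hypothetical names, for illustration only.

// NFD spelling: U+0065 (e) followed by U+0301 (COMBINING ACUTE ACCENT).
// In UTF-8 that is 1 + 2 code units.
constexpr std::size_t nfd_utf8_units = sizeof(u8"e\u0301") - 1;

// NFC spelling: the single precomposed U+00E9, which is 2 code units in UTF-8.
constexpr std::size_t nfc_utf8_units = sizeof(u8"\u00E9") - 1;

// The same comparison counted in UCS scalar values (one char32_t each).
constexpr std::size_t nfd_scalars = sizeof(U"e\u0301") / sizeof(char32_t) - 1;
constexpr std::size_t nfc_scalars = sizeof(U"\u00E9") / sizeof(char32_t) - 1;

// The two spellings render the same but are different literals.
static_assert(nfd_utf8_units == 3 && nfc_utf8_units == 2,
              "UTF-8 code unit counts differ between NFD and NFC");
static_assert(nfd_scalars == 2 && nfc_scalars == 1,
              "scalar value counts differ between NFD and NFC");
```

So a program that relies on the length or contents of such a literal would observe a change after its source is NFC-normalized.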

The question is, is that a reasonable workaround?

As for normalization, I am writing wording to make sure normalization is preserved in phase 1 when applicable.


