Date: Mon, 25 Feb 2019 08:42:10 -1000
On Sun, Feb 24, 2019, 7:39 PM Ben Boeckel <ben.boeckel_at_[hidden]> wrote:
>
> I have GCC writing out JSON-like syntax right now. It isn't 100% valid
> since it isn't UTF-8, but I don't want *that* in these files either.
>
It seems reasonable to have non-ASCII in user-provided data fields. We
should figure out how to handle the case where the user's path is invalid
UTF-8, like on Linux where it can be a random bag-o-bytes, or on UCS-2
platforms that allow mismatched surrogates. If the compiledb format handles
these cases, we should probably just do whatever they do.
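
To make the problem case concrete, here is a minimal Python sketch, not a proposal for the format. The helper name `is_valid_utf8` and the sample paths are made up for illustration; the `surrogateescape` error handler is the mechanism Python itself uses to carry arbitrary path bytes through a string on POSIX.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the raw path bytes decode as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample paths, for illustration only.
good = "src/naïve.cpp".encode("utf-8")
bad = b"src/\xff\xfe.cpp"  # a legal file name on Linux, but not valid UTF-8

# A lossless round trip is still possible: each invalid byte is mapped
# to a lone surrogate in the str, and mapped back on encode.
as_text = bad.decode("utf-8", errors="surrogateescape")
round_tripped = as_text.encode("utf-8", errors="surrogateescape")
```

The catch is that `as_text` contains lone surrogates, so it can't be written out as strict UTF-8 JSON either; whatever the format does, it needs an explicit rule for these bytes.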
>
> Well, you can't know until you actually compile the BMI whether it has
> changed or not. The best we can ask for is "only update if contents are
> unchanged". Getting that for .o files would be nice as well. Ninja can
> then optimize no-change compilations via `restat`.
>
I didn't just mean for the scan phase. The BMI can change in ways that
don't require the downstream stuff to be recompiled, e.g. a comment string
was changed on a line of source included only for better error reporting.
Similarly, I could see something like that happening with the .o and
split-dwarf / osx-style unsplit-split-dwarf.
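
For reference, the "only touch the output when the bytes differ" behavior that `restat` can take advantage of looks roughly like this. This is a sketch in Python under my own assumptions (the name `write_if_changed` is hypothetical, and no compiler is claimed to work this way):

```python
import os
import tempfile


def write_if_changed(path: str, data: bytes) -> bool:
    """Write data to path only if it differs from the current contents.

    Returns True if the file was (re)written, False if it was left alone.
    """
    try:
        with open(path, "rb") as f:
            if f.read() == data:
                return False  # identical bytes: leave the file and its mtime alone
    except FileNotFoundError:
        pass  # no previous output, fall through and write it

    # Write to a temporary file in the same directory, then swap it in,
    # so a reader never sees a half-written output.
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
    return True
```

Note that this only catches byte-identical outputs. The cases I'm describing, where the BMI differs but not in a way that matters downstream, would need the compiler to tell the build system which part of the output is significant.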
>
> > And for the love of $deity, don't put any locale-sensitive strings in
> > this metadata!
>
> I'd rather have it just be "a series of bytes that is a valid lookup on
> the filesystem". The `\` and `"` characters are escaped using `\` for
> obvious reasons. Maybe we do it for control characters as well. Is that
> good enough for a specification?
>
I think I made my point poorly, and was misinterpreted. I was just making a
joke about /showIncludes. The equivalent behavior would be to make the
field names in the JSON file match the user's language. I hope no vendor is
mean enough to actually do that! Obviously users need to be able to use
their language in their files and paths. I'm not suggesting we limit that
in any way, just that the field names are predictable.
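
As for the escaping rule quoted above, a rough Python sketch of my reading of it follows. The function name `escape_path` is made up, and the `\u00XX` spelling for control characters is my assumption borrowed from JSON; the quoted text only says "maybe" for those.

```python
def escape_path(raw: bytes) -> bytes:
    """Escape path bytes for embedding in a double-quoted field."""
    out = bytearray()
    for b in raw:
        if b in (0x5C, 0x22):
            # Backslash and double quote get a backslash prefix.
            out += b"\\" + bytes([b])
        elif b < 0x20 or b == 0x7F:
            # ASSUMPTION: control characters use a JSON-style \u00XX form.
            out += b"\\u%04x" % b
        else:
            # Everything else, including bytes >= 0x80, passes through
            # untouched, so the result need not be valid UTF-8.
            out.append(b)
    return bytes(out)
```

Field names never go through this; they stay fixed ASCII strings regardless of locale.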
Received on 2019-02-25 19:42:23