Re: [SG16-Unicode] BOM in JSON (was: Re: SG16 meeting summary for July 31st, 2019)

From: Ben Boeckel <ben.boeckel_at_[hidden]>
Date: Mon, 19 Aug 2019 14:57:02 -0400
On Mon, Aug 19, 2019 at 21:36:38 +0300, Henri Sivonen wrote:
> Presumably the reason to use JSON instead of a custom format is to make the
> format consumable with JSON libraries.
> Therefore, it makes sense for it not
> to profile JSON but to work with off-the-shelf libraries. I haven't
> actually surveyed JSON libraries for UTF-8 BOM acceptance, but there are
> three reasons why UTF-8 BOM acceptance makes sense for a general-purpose
> JSON parsing library:

Presumably no (widely usable) JSON library *requires* a BOM in order to
parse properly, though.

> 1. Compatibility with Windows-ish text editors for those JSON formats that
> _are_ edited with text editors.

Notepad? edit.exe? I'm OK with not supporting them. Heck, Notepad only
recently gained support for Unix line endings (edit is dead
AFAICT[1]). Pandering to such a lowest common denominator would have me
asking why ed users are then left out by not using a line-oriented
format ;) (answer: there is more structure here than a
linefeed-delimited format would make easy to parse anyway).

> 2. Consistency with Web browsers.

I don't see why a web browser would care about these files. I guess
there are people using browsers as editors (Atom and VSCode), but I'd
expect those to try and be more editor-y than browser-y personally and
not choke on BOM-less JSON.

> 3. Doing the MAY from the RFC aligns with Postel's Law (which admittedly
> has lost quite a bit of its charm).

As I read the RFC, a BOM is *not* valid in (network-transferred)
JSON. I infer from that statement that it isn't valid in JSON generally
either. The `MAY` is on allowing readers to tolerate a BOM, not on
writers *emitting* one. I see no reason to poke a hole in it for this
purpose. Just as readers and writers will have to disable any
normalization logic, they'll have to disable the BOM where relevant as
well.
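
To illustrate that reader/writer split, here is a rough sketch in Python.
The helper names `loads_tolerating_bom` and `dumps_without_bom` are made
up for this example and are not part of any library or of the proposed
format; it only assumes the input is UTF-8 encoded bytes.

```python
import codecs
import json


def loads_tolerating_bom(data: bytes):
    """Reader side: skip a single leading UTF-8 BOM if present, then parse.

    Hypothetical helper; this is the "MAY ignore" behaviour for readers.
    """
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return json.loads(data.decode("utf-8"))


def dumps_without_bom(obj) -> bytes:
    """Writer side: encode as plain UTF-8 and never prepend a BOM.

    Hypothetical helper; note the "utf-8" codec (unlike "utf-8-sig")
    does not add a BOM when encoding.
    """
    return json.dumps(obj, ensure_ascii=False).encode("utf-8")
```

A reader written this way accepts both BOM-prefixed and BOM-less input,
while the writer's output never starts with the bytes EF BB BF.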


[1] Much to my dismay; it was the best editor shipped with stock
Windows while it was around.

Received on 2019-08-19 20:57:05