Re: [SG16-Unicode] BOM in JSON (was: Re: SG16 meeting summary for July 31st, 2019)

From: Tony V E <tvaneerd_at_[hidden]>
Date: Mon, 19 Aug 2019 16:52:34 -0400
https://en.wikipedia.org/wiki/Byte_order_mark#Usage

That page has some pertinent advice.
It also notes that Visual Studio uses (or used) the BOM to tell whether a
file is UTF-8 or some other encoding.
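
On the python3 error in the survey quoted below: here is a minimal sketch
of what I believe is going on, assuming the bytes are the ones that echo
command writes (BOM, "{}", newline). The helper names parse_strict and
parse_tolerant are made up for illustration. Decoding as plain "utf-8"
leaves the BOM in the string as U+FEFF, which json.loads refuses; decoding
as "utf-8-sig" drops a leading BOM if there is one.

```python
import json

# Assumed to match what the echo command in the survey writes:
# a UTF-8 BOM, an empty JSON object, and a trailing newline.
BOM_JSON_BYTES = b"\xef\xbb\xbf{}\n"


def parse_strict(data: bytes):
    """Decode as plain UTF-8; a leading BOM survives as U+FEFF and is rejected."""
    return json.loads(data.decode("utf-8"))


def parse_tolerant(data: bytes):
    """Decode as utf-8-sig, which strips a leading BOM when present."""
    return json.loads(data.decode("utf-8-sig"))


if __name__ == "__main__":
    try:
        parse_strict(BOM_JSON_BYTES)
    except json.JSONDecodeError as err:
        print("strict:", err)
    print("tolerant:", parse_tolerant(BOM_JSON_BYTES))
```

So whether Python "accepts the BOM" depends on how the caller decoded the
bytes before handing the parser a string, which is in line with Ben's
file-versus-string distinction.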


On Mon, Aug 19, 2019 at 3:46 PM Ben Boeckel <ben.boeckel_at_[hidden]> wrote:

> On Mon, Aug 19, 2019 at 22:25:05 +0300, Henri Sivonen wrote:
> > On Mon, Aug 19, 2019 at 9:57 PM Ben Boeckel <ben.boeckel_at_[hidden]> wrote:
> > > Notepad?
> >
> > Yes, Notepad. It's generally easier to make parsers of all kinds (XML
> > before, JSON later) accept the UTF-8 BOM than to fight Notepad. It'll
> > take a long time for the existing installed base to get replaced with
> > the newest: https://mobile.twitter.com/JenMsft/status/1163474010509701120
>
> BOMs only make sense for a JSON file at rest in storage that the
> parser reads directly. Given a string, a JSON parser should *certainly*
> not accept a leading BOM.
>
> Quick survey:
>
> % echo $'\xEF\xBB\xBF{}' > bom.json
>
> - jsoncpp: no mention of a BOM in the source, probably unhappy about
>   it
> - jq: fine
> - python3:
>     json.decoder.JSONDecodeError: Unexpected UTF-8 BOM (decode using
>     utf-8-sig): line 1 column 1 (char 0)
> - ruby:
>     /usr/share/ruby/json/common.rb:156:in `parse': 765: unexpected token
>     at '\xEF\xBB\xBF{}' (JSON::ParserError)
> - C#: https://jimmybogard.com/the-curious-case-of-the-json-bom/
>
> Based on this short survey, I don't know that BOM support is actually
> all that widespread in readers. And the solution seems to be "don't
> write the BOM" where the problem is encountered.
>
> I think those sticking to their Notepad guns are just going to have to
> wait for something better because waiting for the libraries to catch up
> (and the relevant fixes to be backported to declared minimum supported
> versions) is likely going to take *even longer*. Or they can download a
> real editor and actually contribute to whatever codebase they're trying
> to build.
>
> > > > 2. Consistency with Web browsers.
> > >
> > > I don't see why a web browser would care about these files.
> >
> > Maybe not _these_ JSON files, but a general-purpose JSON parser can
> > still care about consistency with Web browsers.
>
> That's fine. They can then accept the BOM-less files that every writer
> for this format would write, just like all the other BOM-less
> network-transferred JSON content in the world.
>
> --Ben
>


-- 
Be seeing you,
Tony

Received on 2019-08-19 22:52:51