Date: Fri, 08 Nov 2024 08:20:09 -0800
On Friday 8 November 2024 03:56:43 Pacific Standard Time Frederick Virchanza
Gotham via Std-Proposals wrote:
> I haven't coded the following yet, but it's what I'm planning to do.
> So the plugin will be in the form of:
>
> Windows : DLL file
> Linux : SO file
> Apple : DYLIB file
>
> I will build the plugin as a shared library file, and then I'll append
> a known UUID to the file contents, something like
>
> echo "8749cd5638ddd35726a524567dc9e63e" >> libplugin.so
>
> After the UUID, I'll append any info I need, such as strings and
> translations, version numbers, etc.

Bad idea. How do you know that the file formats don't require some parsing-
from-the-end too, for signature verification? Or that they won't complain
that the file is bigger than it should be? Plus, of course, you could only do
this *after* stripping, because stripping the debug symbols would remove
your content too. That means you need to ensure no one runs the symbol
stripping again, because it would remove your tail payload again:

$ file a.out
a.out: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically
linked, interpreter /lib64/ld-linux-x86-64.so.2,
BuildID[sha1]=e49755b9bdbcdfb0dbb823b9f7357faf40778d4a, for GNU/Linux 3.2.0,
with debug_info, not stripped
$ echo "Hello World" >> a.out
$ grep "Hello World" a.out
grep: a.out: binary file matches
$ strip a.out
$ grep -c "Hello World" a.out
0
$ echo "Hello World" >> a.out
$ strip a.out
$ grep -c "Hello World" a.out
0

Put this data in the file's actual content instead. That way, it's protected
from corruption by the file's own format and by any other tamper-protection
mechanisms that the vendor may have deployed.

Qt 4 initially just had a variable with a magic string that we could search
for ("QTMETADATA"). But:
a) scanning the entire file is expensive, because files can be arbitrarily big
b) the same string may be duplicated, for example in the debugging symbol for
the string itself
c) in fat file formats (like on Apple systems), every slice would contain the
same string, but the contents found next to it may not be the correct ones
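
To make problems (a) and (b) concrete, here is a minimal sketch of such a
naive scan. find_all is a hypothetical helper written for this illustration,
not Qt code, and it assumes the whole file has already been read into memory.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper (not Qt code): returns the offset of every occurrence
// of `magic` inside `image`, which holds the raw bytes of a binary file.
std::vector<std::size_t> find_all(std::string_view image, std::string_view magic)
{
    std::vector<std::size_t> hits;
    if (magic.empty())
        return hits;                       // nothing sensible to search for
    // The cost grows with the size of the whole file: problem (a).
    for (std::size_t pos = image.find(magic); pos != std::string_view::npos;
         pos = image.find(magic, pos + 1))
        hits.push_back(pos);
    return hits;                           // more than one hit: problems (b), (c)
}
```

If the scan returns more than one offset, nothing in the bytes themselves says
which copy is the real metadata.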

So by late Qt 4, we moved the generic variable to a specific section in the
binary. Then we could scan the binary's section table to find this specific
section. This of course required writing Mach-O and ELF parsers (the COFF-PE
one is much more recent). The format of each of those is described in headers
shipped on the respective platform.
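
As a rough sketch of the "variable in a specific section" approach, assuming
GCC or Clang: the section name, the macro, the variable, and the payload below
are all made up for illustration and are not what Qt uses.

```cpp
#include <string_view>

// Assumption: GCC or Clang. The section name "plugin_meta" and the payload
// are hypothetical; this is not how Qt names or lays out its metadata.
#if defined(__APPLE__)
#  define PLUGIN_META_SECTION "__TEXT,plugin_meta"   // Mach-O: "segment,section"
#else
#  define PLUGIN_META_SECTION ".plugin_meta"         // ELF section name
#endif

// `used` keeps the variable even though nothing in the program refers to it.
static const char plugin_metadata[]
    __attribute__((section(PLUGIN_META_SECTION), used)) = "MYPLUGIN;version=1.2.3";

// Convenience accessor for code running inside the already-loaded plugin.
inline std::string_view plugin_metadata_view()
{
    return std::string_view(plugin_metadata, sizeof(plugin_metadata) - 1);
}
```

On an ELF platform, `readelf -S` on the resulting library should list the
extra section. A host application would find it by parsing the section table
rather than by scanning the whole file.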

But then we found out that the section table on ELF is not mandatory for
execution and loading. There are tools like sstrip that can remove it. So in
Qt 6.2 I moved from an ELF section to an ELF note, which can be located
through the program headers (PT_NOTE) that loading does depend on.
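
For reference, an ELF note is a small record: three 32-bit words (name size,
payload size, type) followed by the padded owner name and the padded payload.
Notes can be reached through PT_NOTE program headers, so they do not depend on
the section table. The sketch below walks such records. It assumes the caller
has already located the note segment, that the data is in host byte order, and
that the common 4-byte padding is in use. find_note and NoteHeader are
hypothetical names, and this is not Qt's parser.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <string_view>

// Fixed-size header that starts every ELF note record.
struct NoteHeader {
    std::uint32_t namesz;  // length of the owner name, including its NUL
    std::uint32_t descsz;  // length of the payload ("descriptor")
    std::uint32_t type;    // owner-defined note type
};

// Name and payload are each padded to a 4-byte boundary (assumption: the
// common 4-byte note alignment, not the 8-byte variant).
constexpr std::uint64_t align4(std::uint64_t n) { return (n + 3) & ~std::uint64_t{3}; }

// Hypothetical helper: `segment` holds the bytes of one PT_NOTE segment in
// host byte order. Returns the payload of the first matching note.
std::optional<std::string_view>
find_note(std::string_view segment, std::string_view owner, std::uint32_t type)
{
    std::uint64_t pos = 0;
    while (segment.size() - pos >= sizeof(NoteHeader)) {
        NoteHeader h;
        std::memcpy(&h, segment.data() + pos, sizeof h);
        pos += sizeof h;

        const std::uint64_t name_len = align4(h.namesz);
        const std::uint64_t desc_len = align4(h.descsz);
        const std::uint64_t remaining = segment.size() - pos;
        if (name_len > remaining || desc_len > remaining - name_len)
            return std::nullopt;           // truncated or corrupt record

        const std::string_view name = segment.substr(pos, h.namesz);
        const std::string_view desc = segment.substr(pos + name_len, h.descsz);
        // The stored name carries a trailing NUL that `owner` does not.
        if (h.type == type && name.size() == owner.size() + 1
            && name.substr(0, owner.size()) == owner && name.back() == '\0')
            return desc;

        pos += name_len + desc_len;        // skip to the next record
    }
    return std::nullopt;
}
```

Because every record states its own length, the walk only touches the note
segment and can stop at the first match. It does not need to scan the whole
file.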

We've also recently found out that on the Apple embedded platforms, the
binaries can be encrypted, so you can't read the contents of the sections at
all without loading them. Fortunately, on these platforms, apps are downloaded
from the App Store as a single bundle with everything they need inside, so the
chance of binary incompatibility is effectively zero.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Principal Engineer - Intel DCAI Platform & System Engineering
Received on 2024-11-08 16:20:15