> 1. And I have come up with the idea of stack swapping

This sounds similar to, though slightly different from, how a kernel call works: when a syscall occurs, the CPU switches to the kernel stack associated with that thread, moves to a more privileged ring, and can then access pages marked supervisor in addition to the mapped user-mode pages.

If you were to model it a little closer to system calls, where you have a set of numbered routines/functions which can be called, then the program could define the specific functions that are allowed to be called with raised permissions, but they must be entered via the kernel.

Entering and leaving wouldn't be too difficult to engineer. The complexity, I think, would come from managing unique page tables for threads. Typically each process has its own page table, so you would need to duplicate it per thread. You wouldn't need to duplicate the entire thing, but at least the top level and then everything down to the page size you want for the protected region. The kernel doesn't need to do this because page table entries already have a user/supervisor bit. There are other complexities too, such as page table updates needing to be applied to multiple threads. If, instead of giving every thread its own page table and accessible memory, you had a single privileged mode with access to one additional page table, the design would probably be simpler. You could take this further with more operating system support, for example having something like Linux's seccomp apply different permissions depending on whether the thread is in the elevated user mode or not.
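To make the "entering and leaving" part concrete, here is a minimal sketch of what can already be done today with mmap/mprotect on a POSIX system. It is only an approximation: the protection change is process-wide, so every thread can touch the page while the gate is open, which is exactly the gap the per-thread page table idea is meant to close. The names GatedRegion and with_access are made up for illustration.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// One page that is PROT_NONE except while a callback runs inside with_access().
// Assumption: POSIX mmap/mprotect; the change applies to ALL threads.
class GatedRegion {
public:
    GatedRegion() : size_(static_cast<std::size_t>(sysconf(_SC_PAGESIZE))) {
        void* p = mmap(nullptr, size_, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed");
        base_ = static_cast<unsigned char*>(p);
    }

    ~GatedRegion() {
        // Open the page one last time to scrub it before handing it back.
        if (mprotect(base_, size_, PROT_READ | PROT_WRITE) == 0) {
            volatile unsigned char* v = base_;
            for (std::size_t i = 0; i < size_; ++i) v[i] = 0;
        }
        munmap(base_, size_);
    }

    GatedRegion(const GatedRegion&) = delete;
    GatedRegion& operator=(const GatedRegion&) = delete;

    // "Enter", run f(pointer, size) with the page readable/writable, "leave".
    template <typename F>
    auto with_access(F&& f)
        -> decltype(f(static_cast<unsigned char*>(nullptr), std::size_t{})) {
        Gate gate(*this);  // re-locks the page even if f throws
        return f(base_, size_);
    }

private:
    struct Gate {
        GatedRegion& r;
        explicit Gate(GatedRegion& region) : r(region) {
            if (mprotect(r.base_, r.size_, PROT_READ | PROT_WRITE) != 0)
                throw std::runtime_error("could not open region");
        }
        ~Gate() { mprotect(r.base_, r.size_, PROT_NONE); }
    };

    std::size_t size_;
    unsigned char* base_ = nullptr;
};
```

Outside with_access() any read or write of the page faults, so a stray pointer in unrelated code cannot reach the secret, but a second thread running at the wrong moment still can.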

It wouldn't make you invulnerable to things like buffer overflows, but it would mean you can separate your secret-handling code and memory from the rest of your user-mode code. In the end, though, you could already do this with some IPC method and by putting them in different processes (or even machines).
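As a rough sketch of the IPC alternative, assuming a POSIX system: a forked child holds the key and answers requests over a socketpair, so the key is never placed in the parent's data. SecretHolder and toy_tag are hypothetical names, the key value is a placeholder, and the keyed FNV-1a hash is a stand-in for real cryptography, not a secure MAC.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Read/write exactly n bytes; false on EOF or error.
static bool read_all(int fd, void* buf, std::size_t n) {
    auto* p = static_cast<unsigned char*>(buf);
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0) return false;
        p += r;
        n -= static_cast<std::size_t>(r);
    }
    return true;
}
static bool write_all(int fd, const void* buf, std::size_t n) {
    auto* p = static_cast<const unsigned char*>(buf);
    while (n > 0) {
        ssize_t r = write(fd, p, n);
        if (r <= 0) return false;
        p += r;
        n -= static_cast<std::size_t>(r);
    }
    return true;
}

// Toy keyed hash (FNV-1a seeded with the key). NOT a real MAC.
static std::uint64_t toy_tag(std::uint64_t key, const std::string& msg) {
    std::uint64_t h = key;
    for (unsigned char c : msg) { h ^= c; h *= 1099511628211ULL; }
    return h;
}

class SecretHolder {
public:
    SecretHolder() {
        int fds[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
            throw std::runtime_error("socketpair failed");
        child_ = fork();
        if (child_ < 0) {
            close(fds[0]);
            close(fds[1]);
            throw std::runtime_error("fork failed");
        }
        if (child_ == 0) {  // child: the only process that uses the key
            close(fds[0]);
            serve(fds[1]);
            _exit(0);
        }
        close(fds[1]);
        fd_ = fds[0];
    }
    ~SecretHolder() {
        close(fd_);  // child sees EOF and exits
        waitpid(child_, nullptr, 0);
    }
    SecretHolder(const SecretHolder&) = delete;
    SecretHolder& operator=(const SecretHolder&) = delete;

    // Parent side: send the message, get back only the tag.
    std::uint64_t tag(const std::string& msg) {
        std::uint32_t len = static_cast<std::uint32_t>(msg.size());
        std::uint64_t out = 0;
        if (!write_all(fd_, &len, sizeof len) ||
            !write_all(fd_, msg.data(), len) ||
            !read_all(fd_, &out, sizeof out))
            throw std::runtime_error("secret holder went away");
        return out;
    }

private:
    static void serve(int fd) {
        // Placeholder key; a real holder would load it here, after the fork.
        const std::uint64_t key = 0x9e3779b97f4a7c15ULL;
        std::uint32_t len = 0;
        while (read_all(fd, &len, sizeof len)) {
            std::string msg(len, '\0');
            if (len > 0 && !read_all(fd, &msg[0], len)) break;
            const std::uint64_t t = toy_tag(key, msg);
            if (!write_all(fd, &t, sizeof t)) break;
        }
    }
    int fd_ = -1;
    pid_t child_ = -1;
};
```

A memory-disclosure bug in the parent can then leak messages and tags, but not the key itself, at the cost of a round trip per operation.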

The more I think about this approach, the further you can go down the path of sandboxed executables, similar to what some web browsers have. If this were easier to architect with the support of some library (standard or not), people might consider it for a design; at the moment I suspect 99% of programmers wouldn't know where to begin.

> 2. Messing with code generation isn’t a bad idea, especially if we are dealing with open-source applications. I can envision an additional argument being passed to the compiler in order to randomize the layout of where certain functions or variables are relative to each other, or change other aspects of the resulting code to add additional fuzzing.

This sounds a little similar to the work Charlie Curtsinger and Emery D. Berger at the University of Massachusetts Amherst did with Stabilizer ( https://github.com/ccurtsinger/stabilizer/ ), which randomized everything, though probably to a bigger extreme. Their intention was more to do with performance analysis than security.

ASLR gives overall layout randomization, so an attacker first needs to find the addresses. Randomizing the text section might add a slight barrier on top of that, but probably not as much as you would think. From the perspective of a hacker, if you have a read primitive then you can still read the text segment and extract enough to find the ROP gadgets. pwntools has DynELF, which lets you parse and work with ELF files in a compromised process using just a read primitive. If someone so desires, they could even attempt to dump the entire binary through the read primitive, generate FLIRT signatures from a binary they compiled themselves, and determine where functions are placed within the one they just acquired. (I have done similar things in CTF security challenges.)

Some other ideas to consider:

- Separate the stack data from the return-address stack. Doing it without an ABI break would be hard: you would probably still need some data on the actual stack when there are lots of arguments, you couldn't easily designate one register as the data stack pointer, so you would need thread-local storage to hold it, and you would also have to support calls crossing between code that does and doesn't support it.
- Have compiler warnings for any array stored on the stack. Many compilers already generate canaries for functions that contain an array; this could be taken a step further with a warning, since arrays are a typical source of buffer overflows.
- Give the .text segment execute-only permissions, with no read access. It might hinder cases where things like switch lookup tables are stored within the .text segment, but it would make it harder to leak the text segment with something like the DynELF approach mentioned above.
- Lightweight production-level asserts, so that things like std::array operator[] can terminate if the index is out of bounds.
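For the last idea in the list above, a minimal sketch of what a lightweight always-on check could look like as a free function, without touching std::array itself. The name checked_at is hypothetical; unlike assert() the check is not compiled out by NDEBUG, and unlike at() it terminates rather than throwing.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Always-on bounds check: one compare and a predictable branch on the hot
// path, std::abort() on violation (no exceptions, no unwinding).
template <typename T, std::size_t N>
inline T& checked_at(std::array<T, N>& a, std::size_t i) noexcept {
    if (i >= N) {
        std::fputs("checked_at: index out of bounds\n", stderr);
        std::abort();
    }
    return a[i];
}

// Const overload so read-only arrays get the same check.
template <typename T, std::size_t N>
inline const T& checked_at(const std::array<T, N>& a, std::size_t i) noexcept {
    if (i >= N) {
        std::fputs("checked_at: index out of bounds\n", stderr);
        std::abort();
    }
    return a[i];
}
```

With std::array<int, 3> a, checked_at(a, 2) behaves like a[2], while checked_at(a, 3) terminates the process instead of reading past the end.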

Unfortunately all my emails to the sg14 list are hitting moderator approval, so only those I'm doing reply-all to are seeing anything, that's what I get for being a lurker the entire time.

Thanks,

James Mitchell

On Sun, 3 Mar 2024 at 08:54, Tiago Freire via SG14 <sg14@lists.isocpp.org> wrote:

I have to admit that at the beginning I was a bit skeptical about whether what was being asked was either achievable or usable.

However, I did some further thinking on the issue and how to combine certain concepts to make something useful.

1.

And I have come up with the idea of stack swapping, i.e. mid execution a thread can swap its stack for another, given that this “another” could be a protected page locked to the current thread (i.e. only that thread can read it).

Assuming that restricting pages to specific threads is possible.

The idea being that access can never be changed. When the application needs to do something sensitive, it will swap its stack to this special protected one, do all of its cryptography there, return, swap the stack back, zero out the special stack before returning it to the system.

You could even use existing cryptographic libraries to keep it safe as long as they do everything on the stack, they wouldn’t be able to tell that they were running on a special stack. If they happen to require heap allocation then that will of course be leakable, but you can fix that by providing a special allocator where the pages are locked to the running thread.
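The mechanical part of the stack swap can be sketched with the POSIX ucontext functions (marked obsolescent in POSIX but still provided by glibc). This is only an illustration under that assumption: the private stack here is ordinary anonymous memory, not a page locked to the thread, and run_on_private_stack, sensitive_work and the globals are hypothetical names.

```cpp
#include <sys/mman.h>
#include <ucontext.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

static ucontext_t g_main_ctx;    // where to resume once the work is done
static ucontext_t g_secret_ctx;  // context that runs on the private stack
static unsigned g_result = 0;              // non-secret output of the work
static std::uintptr_t g_local_addr = 0;    // recorded only for demonstration

// Stand-in for "do all of its cryptography there": locals live on whatever
// stack the function happens to run on, and it cannot tell the difference.
static void sensitive_work() {
    volatile unsigned char key[32];
    for (unsigned i = 0; i < 32; ++i)
        key[i] = static_cast<unsigned char>(i * 7 + 1);
    unsigned sum = 0;
    for (unsigned i = 0; i < 32; ++i) sum += key[i];
    g_result = sum;
    g_local_addr = reinterpret_cast<std::uintptr_t>(&key[0]);
}   // returning resumes g_main_ctx via uc_link

// Swap to a fresh stack, run fn, swap back, scrub the stack, release it.
static bool run_on_private_stack(void (*fn)(), std::size_t size,
                                 std::uintptr_t& base_out) {
    void* stack = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (stack == MAP_FAILED) return false;
    base_out = reinterpret_cast<std::uintptr_t>(stack);

    if (getcontext(&g_secret_ctx) != 0) {
        munmap(stack, size);
        return false;
    }
    g_secret_ctx.uc_stack.ss_sp = stack;
    g_secret_ctx.uc_stack.ss_size = size;
    g_secret_ctx.uc_link = &g_main_ctx;
    makecontext(&g_secret_ctx, fn, 0);

    const bool ok = swapcontext(&g_main_ctx, &g_secret_ctx) == 0;

    // Zero the special stack before returning it to the system.
    volatile unsigned char* p = static_cast<volatile unsigned char*>(stack);
    for (std::size_t i = 0; i < size; ++i) p[i] = 0;
    munmap(stack, size);
    return ok;
}
```

Calling run_on_private_stack(sensitive_work, 64 * 1024, base) leaves the address recorded in g_local_addr inside [base, base + size), showing that the locals of the unmodified function ended up on the swapped-in stack. The sketch uses globals for brevity, so it is not reentrant or thread-safe as written.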

As long as this part of the code is done correctly (which can be made to have a small testable surface), this kind of system would be invulnerable to overflow attacks. And even if you managed to get remote execution to work on some other part of the code you may not get that far, as there would be no facility available to unlock the page assigned to another thread, unless you can:

a) swap the running context of the thread that currently has privileged access to the memory at the right time. This is a higher bar to achieve.

b) get a root kit to gain the OS privileged access. At that point the entire system is screwed, and this type of protection wouldn’t make much of a difference.

As a bonus point, if you happen to leak this privileged memory, the system would be able to reclaim it when the thread exits, or with a facility specially crafted to clear it.

This wouldn’t need to affect code generation of current applications given that you would need to explicitly opt-in by using special functions.

2.

Messing with code generation isn’t a bad idea, especially if we are dealing with open-source applications. I can envision an additional argument being passed to the compiler in order to randomize the layout of where certain functions or variables are relative to each other, or change other aspects of the resulting code to add additional fuzzing.

The side effect would be that, even if someone managed to replicate the exact build environment, figure out the exact version of the application that is running, and have an exploit that can target a specific place in memory, they wouldn’t have access to the exact build, and it would be much harder for them to figure out the code layout to make the exploit work.

Sure, this will have some impact on the predictability of run-time performance because of how the instructions appear in cache, but if what you are trying to do is protect data, predictable performance is not as high a priority.

3.

There’s always the loose end of how the cryptographic keys/credit card secrets end up in the application to begin with. As James hinted at, it seems like a bad idea that the user-facing application, which is subject to attacks and exploits from external malicious actors, is also the application that has direct access to your passwords. If you have a separate application whose only responsibility is to manage the secrets, and it does it right, then the issue isn’t as much of a problem. This is not to say that this sort of memory protection isn’t useful, or that protecting your one-time usable tokens isn’t worth doing, but it may be less important if better security practices were adopted instead. There’s no magic solution that can save anyone if the developer just does “stupid shit”, and a minimum level of competence is required.

And I’m not sure if adopting better security standards is more productive.

In summary.

In any case, it seems to me there is indeed a great deal that can actually be done, and it is definitely worth researching. But most of it involves either hardware or operating system design. This could benefit all programming languages that can be compiled into byte code, not just C++. The role of C++ would only be to standardize the APIs to make it available to the user. But these facilities will need to be created first, outside of the C++ standard, before the committee could do anything about it.

It is an interesting point of discussion; somebody should do research on this topic; maybe it will become standard practice in the future. But the C++ committee may not be the right venue.

Br,

From: Tiago Freire <tmiguelf@hotmail.com>
Sent: Saturday, March 2, 2024 9:04 AM
To: Robin Rowe <robin.rowe@heroicrobots.com>; sg14@lists.isocpp.org
Cc: undefined-behavior-study-group@sei.cmu.edu
Subject: Re: [SG14] Memory Safety and Page Protected Memory

I agree it doesn't have to be foolproof in order to work.

And an answer could be all of the above.

Disconnected heap spaces

memory locks

memory scrubbers

safer designs to interact with sensitive data

They all do something. Even if not perfect, if they can at least frustrate attacks to the point of being statistically impractical for 50% of applications, we have still made things safer.

As long as it is understood that safer doesn't mean perfectly safe, I think we do have some points of action that can be researched and that can become reality.

_______________________________________________
SG14 mailing list
SG14@lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/sg14