Re: [std-proposals] SIMD by just operating on 2 arrays

From: Sebastian Wittmeier <wittmeier_at_[hidden]>
Date: Wed, 12 Apr 2023 18:41:19 +0200
Hi Samuel,

what exactly are you mainly after?

a) To express vector operations with an implicit fallback to a loop
b) To have a nice way to express SIMD code with simple mathematical operations
c) To create better interoperation between C-style arrays and SIMD code?

Or strictly a combination of those?

All of those can currently be done with the existing language to some degree.

For standardization you will probably get the most pushback for c): there is a huge amount of decades-old code out there, there is C compatibility to consider, and the existing C-style arrays already have lots of conceptual, usage and safety issues as they are. Nevertheless, with the current language facilities, implicit conversions from C-style arrays to any SIMD class type can be created, if one wishes for it. You will still have the alignment issue on some architectures and with some SIMD instructions, so your program will either run slower or terminate. Some operating systems recover by emulating the memory operation when a program has those issues, but this is very, very slow.

Best,
Sebastian

-----Original Message-----
From: samuel ammonius via Std-Proposals <std-proposals_at_[hidden]>
Sent: Wed 12.04.2023 18:16
Subject: Re: [std-proposals] SIMD by just operating on 2 arrays
To: std-proposals_at_[hidden]
CC: samuel ammonius <sfammonius_at_[hidden]>

On Wed, Apr 12, 2023 at 1:02 PM Jason McKesson via Std-Proposals <std-proposals_at_[hidden]> wrote:

> The other elephant in the room is that SIMD often requires specific
> alignment for its objects. And C arrays don't come with that; you'd
> have to explicitly align them with `alignas`. Which means you now have
> to know what alignment to use for the platform, or to use some type
> that provides the alignment (which will often require you to restate
> the size of the array). Whereas with a proper, dedicated type,
> alignment comes for free.

I think a vectorized CPU would still be able to copy the elements of an unaligned array with the same speed though, right? If not, I guess the requirement would have to be "a list of 2, 4, or 8 floats or ints with no byte padding".

--
Std-Proposals mailing list
Std-Proposals_at_[hidden]
https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
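
To make the two points in this exchange concrete (an implicit conversion from a C-style array to a SIMD class type, and alignment that "comes for free" with a dedicated type), here is a minimal sketch. The names `simd_pack`, `float4`, `add_arrays` and `is_aligned` are hypothetical and invented for illustration; they are not part of any standard library or proposal. The lanes are processed with plain loops rather than intrinsics, so this only shows the type-level mechanics, and it assumes `sizeof(T) * N` is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper type, for illustration only.
// alignas gives every object the alignment a SIMD register of this width
// would typically want (assumes sizeof(T) * N is a power of two).
template <typename T, std::size_t N>
struct alignas(sizeof(T) * N) simd_pack {
    T lane[N];

    // Implicit conversion from a C-style array of exactly N elements.
    // The elements are copied, so the source array may have any alignment.
    simd_pack(const T (&src)[N]) {
        for (std::size_t i = 0; i < N; ++i) lane[i] = src[i];
    }

    // Element-wise addition; a real implementation would map this to a
    // vector instruction, here it is a plain loop.
    friend simd_pack operator+(simd_pack a, const simd_pack& b) {
        for (std::size_t i = 0; i < N; ++i) a.lane[i] += b.lane[i];
        return a;
    }
};

using float4 = simd_pack<float, 4>;

// The wrapper carries its own alignment; the user never writes alignas.
static_assert(alignof(float4) == sizeof(float) * 4,
              "float4 should be aligned to its full width");

// Helper: true if p is a multiple of the given alignment.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Adds two plain C arrays. The first is converted explicitly, the second
// implicitly through the converting constructor.
inline float4 add_arrays(const float (&a)[4], const float (&b)[4]) {
    float4 sum = float4(a) + b;
    assert(is_aligned(&sum, alignof(float4)));
    return sum;
}
```

Calling `add_arrays` with two ordinary `float[4]` arrays works without any `alignas` at the call site, because the data is copied into the aligned wrapper first. Whether that copy costs anything on a given CPU is exactly the open question raised above; this sketch does not answer it.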

Received on 2023-04-12 16:41:21