sg13

From: Roger Orr <rogero_at_[hidden]> · Date: Tue, 23 Jul 2019 21:20:31 +0000

--
Minutes from the Friday afternoon SG13 meeting held on 2019-07-19 (during the WG21 meeting in Cologne)
There were 11 attendees
---++ P1386R2 A Standard Audio API for C++
Slides based on P1386R2 + some modifications
Q: What is the difference between frames_contiguous and channels_contiguous and is_contiguous?
Interleaved means frames are contiguous, deinterleaved means channels are contiguous
Comment: I might call the is_contiguous function data_is_contiguous
A: These names match mdspan names
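To make the interleaved/deinterleaved distinction concrete, here is a minimal sketch of how a (frame, channel) pair could map onto flat storage in each layout. The names `flat_buffer`, `layout` and `offset` are hypothetical and are not taken from P1386R2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of P1386: a flat sample store with one of two layouts.
enum class layout { interleaved, deinterleaved };

struct flat_buffer {
    std::vector<float> samples;   // num_frames * num_channels values
    std::size_t num_frames;
    std::size_t num_channels;
    layout order;

    // Map (frame, channel) to an offset in the flat storage.
    std::size_t offset(std::size_t frame, std::size_t channel) const {
        assert(frame < num_frames && channel < num_channels);
        return order == layout::interleaved
            ? frame * num_channels + channel   // frames are contiguous: L0 R0 L1 R1 ...
            : channel * num_frames + frame;    // channels are contiguous: L0 L1 ... R0 R1 ...
    }

    float& operator()(std::size_t frame, std::size_t channel) {
        return samples[offset(frame, channel)];
    }
};
```

In the interleaved case the samples of one frame sit next to each other; in the deinterleaved case the samples of one channel do. A layout that is neither (for example, one separate allocation per channel) cannot be described by a single flat offset like this.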
Comment: I care about the use case. I either want to make use of the frames being one after another in memory or I would make use of at least the samples of a single channel. You need to tell me what I need to do to make that design work.
I think there's no design for making the channel contiguous ...
A: In R2 we introduced channel_data()
Comment: But I can only see that if the channels are contiguous. I want channels_contiguous() to mean I can get channel_data().
( ... possibility of bikeshedding is_contiguous() ... )
Q: So, should channel_data and data return span? This would create an inconsistency with the STL containers. LEWG question?
Comment: We didn't have span when we wrote the containers, and this isn't a container.
Cool, we'll return span.
Q: The case where channels_are_contiguous is true doesn't mean is_contiguous is true. So does that match span as well?
No, wait... The span is not a mdspan.
Maybe there is a redundancy...
You would have to ask if the span is empty.
Q: _question about is_contiguous_
Q: What do I do with this? If I have the interleaved case frames are contiguous. Or, channels are contiguous, I go to channel data. If neither is the case...
is_contiguous DOES tell you that you can take a block of memory and drop it to the API.
If data is deinterleaved then you can use channel data. If it's not contiguous then we just check to see if it's deinterleaved and we know if we can use channel data.
You know if you can use this regardless of whether it's contiguous or not.
_further discussion_
OK, is_contiguous and is_interleaved won't work: the other thing you want to explicitly support is what if the underlying layout is completely weird and nothing is interleaved or contiguous anywhere. This is why you always need three. We may have the wrong three though.
Comment: FB For data() to make sense, you want the frames to be contiguous. Otherwise you use channel_data(). If neither are true, and you need to go to the operator to get the data, then neither are true. So you need is_contiguous() to identify this.
So we remove frames_are_contiguous and we're fine.
_dissent_
Comment: We need an accessor for data frames perhaps?
I never considered the case where the frames would be pointed to.
Comment: You're never going to have the frames as separate allocations, but that doesn't mean you can't have an indexer that just indexes into the array.
Perhaps rename data() to sample_data()...
I'm happy to return a pointer rather than a span just to make this clearer that this is a different use case.
I wanted to address wacky formats: I don't think it's as simple as the way it's being discussed. I don't think we can discuss this structure uninformedly.
This is a data structure for simple unencoded PCM that lives in your memory.
Presentation continues
Comment: It sounds like supplying a default is fine; I would be happy to start with get_default_audio_input_device()/output_device()
Agree
Comment: This same aspect is true of the 2D graphics. If you have a notion of a backend then what we should do for standardisation is describe how access to any backend should work and require the compiler to ship a default backend but likewise allow vendors to ship a backend.
You keep the defaults but you declare a concept that matches your device that anyone can implement.
You do not need to create a polymorphic list to cover runtime implementations
A: You do: this is a pattern in audio software being able to change backends.
Comment: A function would call the backends to get the strings identifying them.
So audio device is not a class anymore, it's a concept
Comment: FB We keep the class and add a concept that describes this thing
it's a category of devices.
Comment: It does sound like some discussion with SG1 is required about executors. You want to specify things that aren't standard but extensible by the vendor. We don't want to specify different backends, but specify a standard way of querying availability of backends.
Q: Are virtual function calls significant? Can this work?
Many frameworks have a base class you inherit from. But then you have people who say "we don't do polymorphism in the standard"
Comment: PS Having a dynamically extensible set of implementations is where polymorphism shines. If the set was fixed up front you could use variant. But you can't, it's not even fixed at compile time.
Type erased wrapper which hides the virtualness is what we do with executors. There isn't a base class that users inherit from
Q: We want a base class though that users inherit from. How do you implement an audio device?
(Look at Sean Parent's talk.)
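As an illustration of the "type-erased wrapper which hides the virtualness" idea, here is a minimal sketch. All names (`any_audio_device`, `null_device`, `name`, `start`) are hypothetical and chosen for the example; they are not the interface in P1386R2 or in the executors papers.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical type-erased handle; none of these names come from P1386.
class any_audio_device {
    struct concept_t {                       // internal interface, never seen by users
        virtual ~concept_t() = default;
        virtual std::string name() const = 0;
        virtual bool start() = 0;
    };
    template <class Device>
    struct model_t final : concept_t {       // adapts any type with name() and start()
        explicit model_t(Device d) : device(std::move(d)) {}
        std::string name() const override { return device.name(); }
        bool start() override { return device.start(); }
        Device device;
    };
    std::unique_ptr<concept_t> self_;

public:
    template <class Device>
    any_audio_device(Device d)
        : self_(std::make_unique<model_t<Device>>(std::move(d))) {}

    std::string name() const { return self_->name(); }
    bool start() { return self_->start(); }
};

// A vendor backend: a plain struct, no base class required.
struct null_device {
    std::string name() const { return "null"; }
    bool start() { return true; }
};
```

The virtual calls still happen, but only inside the wrapper. A backend author writes an ordinary type with the expected member functions, and the set of backends can grow at run time, which a `std::variant` over a fixed list would not allow.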
Either we do this or the concept thing. Let's take this offline. It's not obvious to me which we should go for. The simple thing is there is only one backend provided by the standard.
Q: Why not that?
Some users would like to be able to choose different drivers for their sound card.
Q: Do you see any pro users using this?
I've heard pro audio people say that they would use this.
Comment: I think the level of comprehensiveness for pro use is never going to be met and so those people won't use this.
There are many workstations that use JUCE which this is similar to.
Presentation continues.
_discussion about plugging in hardware_
Comment: I think on entry level audio output it would be nice to have the hello world.
You can implement on top of this.
Q: Do we need to specify on which thread, and under which conditions, the callback is called? Do we try to categorise all the targets?
Comment: It's going to be critical to know whether you're in the IO thread or not.
The callback doesn't tell you about what part of the list has changed
Typically you want to reiterate over the list again.
Q: What is the purpose of the event argument?
It's a compromise between having one callback and a very granular API where you get different callbacks for all kinds of events.
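A minimal sketch of that compromise: one registration point, with an enumerated event argument telling the callback what kind of change occurred. The names `device_list_event` and `device_list_notifier` are hypothetical stand-ins, not the paper's spelling.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical names for illustration; the paper's actual spelling may differ.
enum class device_list_event {
    device_list_changed,
    default_input_changed,
    default_output_changed
};

class device_list_notifier {
    std::function<void(device_list_event)> callback_;

public:
    // One registration point for every kind of change.
    void set_callback(std::function<void(device_list_event)> cb) {
        callback_ = std::move(cb);
    }

    // Called by the (imaginary) backend when something changes.
    void notify(device_list_event e) const {
        if (callback_) callback_(e);
    }
};
```

The event says which category of change happened but not which device was affected, so a client would typically re-enumerate the device list inside the callback.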
Comment: You could have another type of callback and that CB takes an argument which is the device list event and an overload of set audio device list callback...
Yes we considered this in R1 and decided it yielded too much boilerplate.
One of the executor discussions is to treat this like a stream of empty events notifying that something has changed. If you had a function which returned a stream of device change events then you could apply the filter algorithm to that. Those filters could be built as pre-available helper algorithms that compose that for users.
Q: When would you want to compose someone changing an audio device with something else?
You might have a timeout waiting for the user to plug in a device and after 30 seconds you want to cancel the subscription to the stream of device changes.
I would be interested in use cases where this approach would be better. It's technically very interesting but I'm struggling to find use cases.
Comment: The closest to where we're going is what I'll be presenting shortly.
Presentation continues.
_discussion about Configuring/Running/Unavailable state diagram_
Comments: Transferring the ownership transfers responsibilities.
That doesn't give you a wide contract though.
Look at P0408 on basic_stringbuf.
Presentation continues.
Comment: I think there should be a value type which identifies a device on a backend and the same type across all possible backends.
Q: How is thread_id like that?
There's only one thread_id type per library.
Q: The device can have both inputs and outputs?
Yes
Q: How can can_connect and can_process ever be constexpr?
It depends on a system property: are threads available.
Q: Should they be static?
Yes
Q: What is the meaning of and what does the client do with bool returns from start and stop?
Good question: what does it do with CoreAudio?
Comment: Error code handling
Excellent, we address that in the final slide in future work. Please seek clarification for all of us.
Q: Is the start operation always complete synchronously?
It returns instantly but you know nothing. Only when you get the callback do you know what's going on
Then you need to deliver the return code. Also, we are looking at stop_token for C++20.
---++ D1746R1 — Feedback on audio
Presentation of slides based on D1746R1
Comment: Convinced we need to talk about forward guarantees, but everyone else says no. My incomplete understanding is that we shouldn't specify anything. If you run your program in a way that the output gets garbled that's not an error in your program, it's not something you can describe in C++. I think this is out of scope.
Comment: I don't have more to say, this is for posterity.
Comment: It's more "of broader scope" rather than "out of scope".
Comment: Key area of concern is about audio routing and policies. Audio APIs are policies about shared resources, sound being mixed between applications, privacy concerns about microphone access, concerns about ensuring that any specification here doesn't preclude strengthening of any privacy concerns.
Q: When I read through that I was thinking of what model does the default device have in the standard if the default device changes dynamically? Does that mean the environment would provide a virtual device and switch transparently? Do you see the standardisation effort describing hardware or virtualised hardware?
Comment: In my mind I would consider that one of the core questions to answer. I think the proposal as it stands straddles both worlds a little bit. I think this points to the great complexity when going to the lower level.
Comment: As I see it we describe conforming implementation and then it is implementation defined to say how that is realised by implementers. If you care about these things then you implement it with a virtual device. On that end I don't know how much of that complexity we need to pull into the standard. I admit there is a thing that we can easily do which is some notion of your audio needs to be paused. That I can see.
Comment: It's interesting that you say it's not decisive. On the one extreme imagine an embedded device where you're moving bytes around, nothing tells you anything. On the other extreme is something like a phone where if you have a piece of music and get a phone call and the OS handles a nice fade for you. They're both equally useful. We're somewhere in between.
Comment: There's a third, the production grade audio device which knows about the hardware.
Comment: Maybe a device is a more specific version of a more generic thing, a handle to something that lets you exchange audio with the system.
Comment: I will say that one of my goals is to question the level of abstraction here. It's a little bit ambiguous right now. We'd like to see it go a little higher because some of the current APIs are a little challenging now. It's going to be really hard, I would also say that the lower level of abstraction describes desktop computers and that's it. It doesn't map to embedded. Even on desktop computers it doesn't map to most audio needs.
Comment: Maybe I should add that the original motivation of this was not to add an audio API, but rather what is the lowest level of audio that we can provide such that portable code can be written on top of it.
Q: I question that motivation. Who is this going to serve? I think anyone with general audio needs is not going to be able to get a sufficiently good implementation built on top of this because of things the OS does with higher level APIs. It just does not map to most platforms. You can fudge all this but it doesn't exist. You're presenting API complexity that isn't there. If you choose the low level route and fully commit to that it gets far more complex than the proposal is currently.
Comment: If you could look at how concurrency came in to the standard it started out with the lowest level things. We built more on top of that. If you start at too high a level we might end up with another async that's not working or implementable, and we might just need to take the time to start from a lower level.
Audio is different from what PS gave as an analogy: audio is more bifurcated, the operating system does a lot, the standard cannot provide all kinds of audio format conversions, and a standardized audio API is insufficient to build the high-level APIs provided by operating systems.
Comment: higher-level standardization might use values provided by lower-level API.
Q: what types are shared?
Comment: explains that something like a buffer might have meaning for higher and lower level
Comment: that something has different paradigms (phone vs. desktop) does not mean it is worth to be standardized (filesystem access on apple phones example given, that was not working); where are boundaries of APIs that one can access
phones and PCs provide different levels of abstraction for accessing audio.
Comment: low-level audio has different cross-platform libraries, that are portable, that is what we want to standardize. A simple access to the audio device cross platform.
Q: What would be the ideal audio API?
audio provide some functionality,
comments on analogy of atomics&threads starting, then higher-level abstractions like executors.
refers to high level surround example not working via the proposed API
responses to request: (ideal audio API) lowest-level audio interface is not common across platforms - only desktop - consumers are professional audio applications, the highest-level is across platforms and simpler to use (larger community) - highest-level can provide highest quality audio. abstraction: number of channels, stereo/3D, spatialization encoding is done by the platform, AVFoundation by apple would be an example of such a high-level API.
Comment: gives example on executors by providing "execution context" - GPU vendors might want to provide specializations to allow customization points e.g. special algorithm implementations that are platform-specific. Parameterize higher-level algorithms by an "audio engine" to obtain both pieces. Emulate lower-level features imperfectly using higher-level features on platforms not allowing access to lower-level features.
Comment: on some platforms the best way to implement higher-level is to access platform specific and not the lower-level API provided by the standard. On some platforms hl will be via ll. It is not only on desktop platforms that provide access to the lower-level facilities.
On some platforms the underlying platform APIs for the lower level do not exist, and libraries emulate them via higher level APIs; claims that the lower-level libraries might not be useful to many people.
Comment: two different set of requirements/perceived user groups. Both can be useful. lower-level API might be easier to implement to provide a minimal QOI (instead of the no-op audio device), especially on small devices.
Comment: bifurcation as mentioned - some platforms can only provide lower-level, other platforms only provide higher-level API, some platforms provide both. proposal should clearly say, lower-level is addressed. example in opposing device: 'get the microphone' property is beyond the platform. clarify intent of API (LL)
This layer of abstraction is not viable for some companies, unclear who this is serving, and professional level audio software needs every bit of performance and wouldn't like any additional library layer
Comment: from my experience about digital audio workstations, they use a cross-platform layer, that shouldn't hold them back
Comment: the paper has more than the device, is that the only problem
there's more
*continuing* the sample type shouldn't be manifested in the type system, because there are more esoteric types like non-native endian.
Comment: the performance should come through using a 3rd party lib
Comment: I'm from the DSP world where samples describe all sorts of things, in the audio world specifically the samples could very well be a vocabulary type instead of modeling the hardware.
Comment: I don't see why having a template parameter on the sample type precludes other interpretations than the type would naively give, like encoding anything in an integer.
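One way to read this comment: the storage type and the meaning of the stored bits can be decoupled by conversion functions. The sketch below treats `std::int16_t` storage as Q15 fixed point. The helper names `q15_to_float` and `float_to_q15` are hypothetical and appear in neither paper.

```cpp
#include <cstdint>

// Hypothetical conversion helpers: int16_t storage read as Q15 fixed point.
constexpr float q15_to_float(std::int16_t s) {
    return static_cast<float>(s) / 32768.0f;   // maps [-32768, 32767] to [-1, 1)
}

constexpr std::int16_t float_to_q15(float x) {
    // Clamp first so out-of-range input cannot overflow the integer.
    const float hi = 32767.0f / 32768.0f;
    const float clamped = x < -1.0f ? -1.0f : (x > hi ? hi : x);
    return static_cast<std::int16_t>(clamped * 32768.0f);
}
```

A buffer templated on `std::int16_t` would then carry Q15 samples, and code that wants `float` converts at the boundary. A non-native-endian format could be handled similarly with a byte-swapping conversion, at the cost that the integer type alone no longer tells you how to interpret the data.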
On a higher level of abstraction you wouldn't be concerned with the sample type, but just give data in a particular type like float.
Comment: RO It would be good to capture a list of *all* known audio types that we need to look at for reference during the discussion
---++ P1678R0 — Callbacks and Composition
Presenting slides for callbacks and composition (attached to SG13 Wiki)
This paper contrasts different approaches of doing callbacks
Q: a callback isn't necessarily async, right?
right, like with for_each
Comment: the completion struct could apply to start and stop of audio device
Comment: I can see that work for start/stop, but not for the IO callback, it's not obvious to me
Q: why would a callback come from different execution contexts
the success and failure case may come from different contexts, such implementations may exist
Comment: in the networking TS the callbacks are always run on the same executor
Q: may the destructor of the callback be called from anywhere else?
yes, the dtor may, but the callbacks must not run on any other execution context. Catering for the error case possibly running anywhere will make the implementation of that callback more difficult.
I'm just saying whether or not the error callback runs on the same or possibly a different execution context depends on the executor
Comment: this will have a runtime cost for carrying the arguments to the sender
Q: how does this relate to audio and graphics?
also in audio and graphics we see callbacks that need to deliver either errors or values, and this pattern addresses this.  With overloaded error functions I can use my own error type.
Q: would a 3rd party library create a new error type?
there are lots of error types in the world and with an overloaded error function in the callback object you could passthrough unknown error types in generic code.
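A minimal sketch of a callback object with an overloaded error function, showing how generic code can pass through an error type it knows nothing about. The member names `set_value` and `set_error`, and the types `logging_callback` and `fail_with`, are assumptions made for this example; see P1660 for the actual concept.

```cpp
#include <exception>
#include <string>
#include <system_error>
#include <utility>

// Hypothetical callback object with a value channel and an overloaded error channel.
struct logging_callback {
    std::string* log;   // where results are recorded, owned by the caller

    void set_value(int frames_written) {
        *log += "value:" + std::to_string(frames_written) + ";";
    }

    // Error types this callback understands.
    void set_error(std::error_code) { *log += "error_code;"; }
    void set_error(std::exception_ptr) { *log += "exception;"; }

    // Catch-all: any other error type is accepted without being interpreted.
    template <class E>
    void set_error(E&&) { *log += "unknown_error;"; }
};

// Generic code that forwards whatever error it is given.
template <class Callback, class Error>
void fail_with(Callback& cb, Error&& e) {
    cb.set_error(std::forward<Error>(e));
}
```

`fail_with` never names the error type, so a third-party library can introduce its own error struct and it still reaches the callback. Note that the catch-all overload wins for any argument that is not exactly one of the known types, including types that are merely convertible to `std::error_code`.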
Comment: the standard library should have a consistent take on this
Q: can we see this material in a form detached from the networking TS to make it simpler and more obvious?
networking TS is the most advanced in this regard so I pulled it into the comparison.  See P1660 for the concept.
Q: you mentioned callback composition.  What did that do?
if they all have the same surface you can start sequencing or otherwise composing callbacks.
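A minimal sketch of what "same surface" buys: an adapter that wraps any callback exposing `set_value`/`set_error` and itself exposes the same two functions, so adapters can be stacked. The names `transform_callback`, `transform` and `store_result` are hypothetical and not from P1678R0.

```cpp
#include <utility>

// Hypothetical adapter: applies fn to the value, forwards errors untouched.
template <class Fn, class Next>
struct transform_callback {
    Fn fn;
    Next next;

    template <class V>
    void set_value(V&& v) { next.set_value(fn(std::forward<V>(v))); }

    template <class E>
    void set_error(E&& e) { next.set_error(std::forward<E>(e)); }
};

template <class Fn, class Next>
transform_callback<Fn, Next> transform(Fn fn, Next next) {
    return {std::move(fn), std::move(next)};
}

// A terminal callback that stores what it receives.
struct store_result {
    int* value;
    bool* failed;
    void set_value(int v) { *value = v; }
    template <class E>
    void set_error(E&&) { *failed = true; }
};
```

Because the adapter has the same shape as the callback it wraps, `transform(f, transform(g, sink))` sequences `f` then `g` on the value path, while an error skips both functions and goes straight to `sink`.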
---++ P1677R0 — Cancellation is not an Error
presenting slides for the paper (attached to SG13 Wiki)
Comment: we have four different callbacks in the audio proposal already.  I'm curious for this paper here to help in the audio space.