Arrange audio and video from different sources instead of stopping a source plugin

Current situation:
Kodi seems to have only one exclusive (fullscreen) video output stream and one exclusive audio output stream. So whenever a new stream uses audio or video, any active stream is stopped before the new stream starts, e.g. when I start a video, my audio playback stops. In this particular situation that is usually the desired behaviour.

Issues:
(1) When I’m playing a picture slideshow (with videos included and enabled in the settings) while an audio stream (e.g. from the “Digitally Imported” plugin) is playing, the audio stream stops when the video playback starts. After the video has finished, the slideshow continues, but the audio stays silent. A “simple” (temporary) fix could be to pause the “audio only” stream as long as a video is playing in the slideshow and to resume it afterwards.
(2) The restriction to exclusive, blocking media streams limits Kodi’s versatility (although it also keeps Kodi extremely simple to use); e.g. there is no picture-in-picture feature (as far as I know, at least not for multiple plugins). So the challenge is to extend the scope of features while keeping the ease of use, and thereby to improve usability.
(3) Automatically stopped streams don’t resume automatically, so if my audio playback was stopped because a video started, I must navigate back to the audio menu to resume the audio stream I was listening to before. This is quite cumbersome.

Suggested [detailed] solution:
I imagine an in-between layer for audio streams and for video streams which manages multiple media sources and “mixes” the data for the output devices. A kind of source focus must be implemented: When a new source is started, its status is set to “focussed”, the secondary audio stream’s status is set to “reduced” and any other audio stream is set to “auto-muted” (I guess it makes no sense to play 3+ audio sources at a time). The source focus must be manually switchable in the user interface (as long as multiple sources are playing at a time).
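The focus handling described above could be sketched roughly as follows. This is only an illustration, not existing Kodi code: the class `FocusManager`, its method names and the source identifiers are all hypothetical, and the sketch assumes that sources are simply ranked by how recently they received focus.

```python
# Hypothetical sketch of the proposed source focus handling (not a Kodi API).
FOCUSSED, REDUCED, AUTO_MUTED = "focussed", "reduced", "auto-muted"


class FocusManager:
    def __init__(self):
        # Source ids, most recently focussed first.
        self._sources = []

    def connect(self, source_id):
        # A newly started source always takes the focus.
        self._sources.insert(0, source_id)

    def disconnect(self, source_id):
        # Called by a source when it has finished playback.
        self._sources.remove(source_id)

    def focus(self, source_id):
        # Manual focus switch from the user interface.
        self._sources.remove(source_id)
        self._sources.insert(0, source_id)

    def statuses(self):
        # Rank 0 is focussed, rank 1 is reduced, everything else is auto-muted.
        result = {}
        for rank, source_id in enumerate(self._sources):
            if rank == 0:
                result[source_id] = FOCUSSED
            elif rank == 1:
                result[source_id] = REDUCED
            else:
                result[source_id] = AUTO_MUTED
        return result
```

Because the status is derived from the ranking, removing a source automatically promotes the remaining ones without any extra bookkeeping.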
For audio management, all audio sources are mixed (by calculating the average value of each audio sample), but each stream source needs an individual volume setting in addition to the already implemented master volume. The individual volume setting is always initialized to 100% at the start of playback for the primary (focussed) audio stream. The master volume is applied to the overall output of the stream management layer, which keeps the behaviour of the user interface as it is today. A (hidden) individual “reduced” volume setting should be initialized to 100% multiplied by a user-defined factor (e.g. 20%). This setting is recalculated automatically whenever the visible individual volume changes, and it is applied while the management status of the corresponding source is “reduced”. No audio stream cancels another audio stream anymore. Instead, each audio plugin only has to manage its own playback (e.g. when skipping to the next/previous track), which also makes plugins more modular (independent). When an audio source has finished playback, it must actively disconnect from the management layer. The management layer should then re-arrange the status of the remaining sources: the secondary source becomes primary (= focussed), the tertiary source’s status is set to “reduced” and any other source is again “auto-muted”. I would say the volume buttons should adjust the individual volume of the primary audio source, while the master volume should be adjustable via a slider, the OSD or similar. The mute button should act on the master volume, because its intended function is to stop any kind of sound instantly. Manually controllable mute functions could be added to each audio plugin if required; these should work independently of the management layer’s “auto-mute”, so that the user’s choice takes priority over automatic management.
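To make the volume rules above more concrete, here is a minimal sketch of the mixing step. The function names `effective_gain` and `mix`, the tuple layout and the sample range of -1.0 to 1.0 are assumptions made for illustration only; a real implementation inside Kodi's audio engine would look different.

```python
# Hypothetical sketch of the proposed audio mixing (not Kodi's audio engine).
def effective_gain(status, individual_volume, reduce_factor=0.2):
    # "focussed" uses the visible individual volume, "reduced" the hidden
    # reduced volume (individual volume times the user-defined factor),
    # and "auto-muted" sources contribute nothing.
    if status == "focussed":
        return individual_volume
    if status == "reduced":
        return individual_volume * reduce_factor
    return 0.0


def mix(streams, master_volume=1.0, reduce_factor=0.2):
    # streams: list of (samples, status, individual_volume) tuples,
    # where samples are floats assumed to be in the range -1.0..1.0.
    if not streams:
        return []
    length = min(len(samples) for samples, _, _ in streams)
    mixed = []
    for i in range(length):
        total = sum(
            samples[i] * effective_gain(status, volume, reduce_factor)
            for samples, status, volume in streams
        )
        # Average over all connected sources, then apply the master volume.
        value = master_volume * total / len(streams)
        # Clamp in case a volume above 100% is ever allowed.
        mixed.append(max(-1.0, min(1.0, value)))
    return mixed
```

One consequence of plain averaging is that the focussed stream gets quieter as more sources are connected, so a real mixer might prefer summing with a limiter instead.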
For video management, I would suggest the following: The primary video source is scaled to fullscreen, which keeps the current app behaviour. If more than one video source is active (e.g. YouTube and TV), the focussed video should show a focus outline/overlay (e.g. a fullscreen glowing rectangle) on video startup, which disappears after a few seconds without user input to maximize the video experience. The secondary video output should be windowed. Window size (screen percentage), position (a combination of [top, middle, bottom] and [left, center, right]) and offset (screen percentage) should be user-defined in Kodi’s settings. Pressing the “select” (or “ok”) button while the video window is focussed swaps both video sources and moves the focus to the other window at the same time, so the current (focussed) fullscreen playback is moved to the window (including the focus) and the windowed playback is then shown in fullscreen. Video playback does not even have to stop anymore if the user wants to go back to the user interface: the UI just has to re-connect as a new, focussed video source, so the video continues in the small window. The UI should disconnect itself as a source after the video playback is connected. Tertiary video sources should not be shown at all (similar to the way more than 2 audio sources are muted). Alternatively, you could create a user setting for the maximum number of windows shown (and the corresponding placement options). But if more than 2 video streams are playing, an OSD warning/reminder icon should be visible all the time. This icon (if visible) can be focussed using the arrow buttons (after the small window was focussed). On clicking, it should show a screen with a 2×2 (or 3×3) matrix of all playing video streams for selecting a new primary (fullscreen) stream. Pressing a media button (play/pause/next/prev/etc.) while a stream is focussed but not yet confirmed in this stream selection sends the command to the corresponding source plugin. So it’s quite easy to e.g. stop or pause a tertiary video stream. Normally, media buttons should always control the focussed stream.
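The window placement settings for the secondary video could translate into screen coordinates roughly like this. The function `window_rect` and its parameter names are hypothetical, and the sketch assumes that the size and offset percentages apply to both screen dimensions and that the offset is ignored for the "middle" and "center" positions.

```python
# Hypothetical sketch of the window placement calculation (not a Kodi API).
def window_rect(screen_w, screen_h, size_pct, vertical, horizontal, offset_pct):
    # Returns (x, y, width, height) of the secondary video window in pixels.
    width = screen_w * size_pct / 100
    height = screen_h * size_pct / 100
    offset_x = screen_w * offset_pct / 100
    offset_y = screen_h * offset_pct / 100

    if horizontal == "left":
        x = offset_x
    elif horizontal == "center":
        x = (screen_w - width) / 2
    elif horizontal == "right":
        x = screen_w - width - offset_x
    else:
        raise ValueError("horizontal must be left, center or right")

    if vertical == "top":
        y = offset_y
    elif vertical == "middle":
        y = (screen_h - height) / 2
    elif vertical == "bottom":
        y = screen_h - height - offset_y
    else:
        raise ValueError("vertical must be top, middle or bottom")

    return tuple(round(v) for v in (x, y, width, height))
```

For example, `window_rect(1920, 1080, 25, "bottom", "right", 5)` places a 480×270 window at x = 1344, y = 756.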
Please keep in mind that audio and video streams may depend on each other, so switching the video source focus must also switch the corresponding audio source focus and vice versa. The last thing to take into consideration is how to switch the focus of an “audio only” stream: I would implement it in 2 places. The first is the side bar, so that it is accessible from nearly everywhere. The second is the video source selection screen (which is only available when more than 2 video streams are playing, but which is a suitable place for an additional audio source selection).

Possible later extensions:
– more options for the stream management layer, e.g. rules for pausing a plugin when a video playback interrupts an audio playback (and for auto-resuming it afterwards) [or even rules for certain plugins]
– assigning certain plugins to specific hardware (e.g. a second display, another sound card [or specific channels of the sound card, e.g. if the sound card supports multiple stereo {or higher} outputs])
– streaming a simplified but full-featured user interface to a mobile device via LAN or Wi-Fi
– a user option to make the picture slideshow play its videos in a window while the pictures continue to be shown (and changed) in fullscreen