The challenge is to keep the app’s media files small so that they can be served on-demand from a content delivery network (CDN). This ensures only content that is relevant to the current user is downloaded and the main app size is kept small. Instead of using a mp4 or other type of movie file to convey the town 360 scene I am using the new Google webp format which I have previously validated within the 360 Flutter component. My aim is to keep this file as small as possible so it can be quickly served from a remote server. Other content such as voice and 3d characters which are part of the experience but not specific to a town may be compiled as assets with the build as they will not need to update as often as other interactions. There is also the option to stream the non-town specific video over the underlying 360 animated webp image file.
Using an Adobe Media Encoder plugin I was able to export a short section of my 360 movie into a webm file. The movie version of the webp format. However, this format turned out not to be supported by the 360 component I am using. So, I am looking to convert to webp as I will not need embedded audio; which can be played from a separate file.
I found that Google provides decent documentation for webp as well as a number of command line programs to help convert.
https://developers.google.com/speed/webp/docs/using
The conversion worked nicely. I am now able to open up my town scene with movement; in this case rainfall. However, (there is always a however…) the 360 plugin supports the standard Flutter Image widget which in turn support webp animated images, but I am so far unable to loop. So, the animation stops after the final frame.
The tools I am using can be downloaded here:
https://storage.googleapis.com/downloads.webmproject.org/releases/webp/index.html

Adding Sound and Interaction
As the 360 visual content of the spatial narrative is town based; the relevant media file is served from the CDN. Its important that the image files are as optimized as possible do they transfer quickly. Sound effects will be embedded within the app as they are common to all users, whereas narrative speech is also, like the town based visual content, based on the user’s language but also the path they take through the adventure based on their decision making. So, it also makes sense to serve this content from CDN. Speech files do not need to load instantly as they are usually triggered through an interaction. So, the final recipe involves playing embedded sound effects and an embedded visual effect while the main 360 media loads. The media is then cached so the slight delay is only noticable on the first playback. The narrative sound is then played on top of the 360 visual content at the appropriate moments.
For the sound I have opted for an AI generated voice; albeit with intonation and a Hollywood accent.