Commons:Requests for comment/Technical needs survey/Video with multiple audio tracks

Video with multiple audio tracks

Description

Problem description: i was just thinking of making a video tutorial for v2c, or it could be a cooking recipe. in consideration of different languages, it would be best to make a video suitable for audio commentary in all languages. then other users can make their dubs and add them to the videos.
here comes the problem. to add additional soundtracks to a video, the whole video has to be remuxed. it's a daunting task for many people, and each new edit would mean creating a new version of the video, i.e. lots of redundant big files.--RZuo (talk) 18:40, 29 December 2023 (UTC)
Proposal type: feature request
Proposed solution: additional soundtracks can be hosted separately, and during video playback, users can choose which tracks to play. that would allow playing different languages, or enjoying a movie (with the original sound) while listening to a commentary soundtrack.
youtube is just experimenting with multiple tracks: https://support.google.com/youtube/thread/129769858/updates-to-captions-and-audio-features-on-youtube .--RZuo (talk) 18:40, 29 December 2023 (UTC)
i think i should clarify my concept.

the idea is probably not to enable a single video file with multiple soundtracks, because adding soundtracks to a video file is harder.

it's to have a video file (with or without a single soundtrack), and separate audio files that are soundtracks to this video.

during playback, users can select which soundtrack to play, just like how users can now select which timedtext file to use for cc. RZuo (talk) 10:44, 7 January 2024 (UTC)
Phabricator ticket:
Further remarks: copied from https://commons.wikimedia.org/w/index.php?title=Commons:Idea_Lab&oldid=836423569#Video_with_multiple_audio_tracks .
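
The clarified concept above (one video file plus separately hosted audio files, chosen at playback like timedtext) could be sketched roughly as follows. This is only an illustration, not an existing MediaWiki or player API: the `AudioTrack` type, the `pick_track` function, and the file titles are all hypothetical names made up for this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AudioTrack:
    """One separately hosted soundtrack for a video (hypothetical model)."""
    language: str    # language code, e.g. "de" or "eu"
    file_title: str  # title of the separate audio file (made-up examples below)


def pick_track(tracks: Iterable[AudioTrack],
               preferred: str,
               fallback: Optional[str] = None) -> Optional[AudioTrack]:
    """Pick a soundtrack for the viewer's language.

    Tries the exact code, then the base language ("de-AT" -> "de"),
    then an optional fallback language. Returns None when nothing matches,
    in which case a player would keep the video's original sound.
    """
    by_lang = {t.language: t for t in tracks}
    for lang in (preferred, preferred.split("-")[0], fallback):
        if lang and lang in by_lang:
            return by_lang[lang]
    return None


tracks = [
    AudioTrack("de", "File:Tutorial (de).ogg"),
    AudioTrack("eu", "File:Tutorial (eu).ogg"),
]
chosen = pick_track(tracks, "de-AT")
print(chosen.file_title if chosen else "original sound")
```

Returning `None` instead of raising keeps the "video with or without a single soundtrack" case simple: the video plays as uploaded unless a matching track exists.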

Discussion

Votes

Support It's a large strain on storage space, and an inconvenience and burden on users and contributors, to have a separate video file for each language rather than separate audio tracks. See the categories about redubbed explanatory videos in this new cat as an example of how this can be useful: Wikimedia projects and audio.

It would make the site more multilingual, improve global education and access, reduce storage requirements (example), etc. Moreover, there really should be machine-translated video captions for all languages where that's feasible; people could use them as a starting point for human-made captions, because the text would only need editing and would already be set to the right timings. That's a separate issue, though. Moreover, once a video has captions and separate audio tracks, AI-generated voices, which have recently improved dramatically, could be used for auto-redubbed videos (audio tracks per language) – sometimes also manually renarrated videos by WMC narrators – which could substantially improve global education and the usefulness of WMC's files. That's also another issue. Copied this over from Commons:Idea Lab. However, I wouldn't consider this a top-priority issue, just an important one where the sooner it's done the better, and where the potential benefit can be large, mainly due to easily redubbed videos. --Prototyperspective (talk) 23:50, 29 December 2023 (UTC)

...Machine translation and AI voices aren't nearly at a level where that's feasible. Adam Cuerden (talk) 05:51, 3 February 2024 (UTC)

Say that again after using DeepL translator and listening to these free AI-narrated audiobooks. I didn't say humans aren't involved anymore; it just speeds things up by a lot. People already have the roughly translated text with the exact timing, rather than writing and timing everything from scratch, and only need to edit things slightly. I didn't say we're there yet, but we're certainly close, and you'd have to give a good reason why the two examples above don't demonstrate that we actually are there already. It would be useful to have e.g. explanatory videos redubbed this way. It's already very feasible, the one issue being that this AI voice tech is not yet free. Prototyperspective (talk) 11:44, 3 February 2024 (UTC)

AI redubbing is possible and performant now or soon, depending on the user's skill in using the tools (e.g. preventing audio overlaps). Commenting to add this link to a related proposal in the Community Wishlist:

meta:Community Wishlist/Wishes/Multiple audio tracks for the same video
Prototyperspective (talk) 12:39, 31 July 2024 (UTC)

Support Is this actually possible with WebM? --PantheraLeo1359531 😺 (talk) 21:01, 6 January 2024 (UTC)
Support We have a values commitment to multilingual support and already routinely produce audio translations for videos related to Wikimedia Movement governance. Increasing the accessibility of audio seems worthwhile. Bluerasberry (talk) 16:39, 16 January 2024 (UTC)
Support, great idea to support videos with multi-language audio. MGeog2022 (talk) 20:01, 16 January 2024 (UTC)
Support. — 🇺🇦Jeff G. ツ_{please ping or talk to me}🇺🇦 14:37, 22 January 2024 (UTC)
yes.--RZuo (talk) 12:01, 23 January 2024 (UTC)
Support -- Schlurcher (talk) 13:40, 23 January 2024 (UTC)
Support Mach61 (talk) 23:21, 23 January 2024 (UTC)
Support Right now we are subtitling the videos of the Ikusgela project, but we have also dubbed some of them into Catalan, and I think this option would be very interesting.--Demonocrazy (talk) 12:45, 29 January 2024 (UTC)
@Demonocrazy: Please use internal links like eu:Atari:Hezkuntza/Ikusgela and ca:Portal:Ikusgela en català. I fixed one of your links. — 🇺🇦Jeff G. ツ_{please ping or talk to me}🇺🇦 13:38, 29 January 2024 (UTC)
Support This is the future! -Theklan (talk) 12:52, 29 January 2024 (UTC)
Support ‑‑ Kays (T | C) 13:37, 30 January 2024 (UTC)
Oppose (Ignoring the enthusiastic talk in the comments about AI voices and machine translation, both of which are absolute disasters waiting to happen...) The basic proposal would be game-changing and fantastic, if it could be done, but I'm not sure it'd be game-changing for Wikipedia: it's going to apply to a fairly small proportion of videos, since we're not going to want to replace the background noises in a lot of things, and for anything historically significant, the exact wording may well matter more. I find it hard to imagine good cases for this outside of a small number of Wikipedian-created videos, and maybe things related to internal Wikipedia business.

Basically, it's a niche tool, and, while it would be amazing in that niche, the WMF track record hasn't been great for such improvements. Simple projects like the reply buttons on talk pages? Great. Something they'll put all possible resources into fixing, like VisualEditor, amazing. But this is in the same middle ground as problematic projects like the half-broken MediaViewer and abandoned Flow (a talk page rethink that failed). I don't think that we could reasonably get the massive resources needed. Adam Cuerden (talk) 05:48, 3 February 2024 (UTC)

Briefly: the AI things are just related to the proposal, not part of it. There are very many videos with just voice, and these are usually the most useful. Concrete example: all of these. The number of useful free videos we get may rise if we offered this. Regarding the cost-benefit ratio: a) consider the redundancy and storage space avoided this way; b) it depends on the cost of implementation, and maybe somebody could come up with a low-effort way to do this (e.g. just an ffmpeg command being run if somebody clicks on "upload new audio track"). Prototyperspective (talk) 11:51, 3 February 2024 (UTC)
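
As a rough illustration of the "just an ffmpeg command" idea above, here is a minimal sketch that only builds (and does not run) such a command. The function name `build_mux_command`, its parameters, and the file names are assumptions made up for this example; nothing here reflects how Commons or its transcoding setup actually works. It also assumes the new audio is already in a codec the container accepts (for WebM, Vorbis or Opus), since `-c copy` does not re-encode.

```python
from typing import List


def build_mux_command(video: str,
                      new_audio: str,
                      language: str,
                      existing_audio_tracks: int,
                      output: str) -> List[str]:
    """Build an ffmpeg argument list that adds one audio track to a video.

    Hypothetical helper: all streams are copied without re-encoding, and the
    new track is tagged with a language code (e.g. "deu").
    """
    return [
        "ffmpeg",
        "-i", video,        # input 0: the existing video (with its own streams)
        "-i", new_audio,    # input 1: the newly uploaded audio track
        "-map", "0",        # keep every stream from the existing video
        "-map", "1:a",      # append the audio stream(s) from the new file
        "-c", "copy",       # remux only, no re-encoding
        # The new track becomes output audio stream number `existing_audio_tracks`
        # (zero-based), so tag that stream with its language.
        f"-metadata:s:a:{existing_audio_tracks}", f"language={language}",
        output,
    ]


cmd = build_mux_command("tutorial.webm", "tutorial.de.opus", "deu", 1, "tutorial.multi.webm")
print(" ".join(cmd))
```

Even in this low-effort form, each added track produces a new full-size output file, which is the redundancy the original proposal wanted to avoid by hosting soundtracks separately.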

Basically, I do agree the idea is extremely useful in the specific set of cases it's valid for. I just have doubts as to whether there are enough resources available to actually do it when it only applies to a relatively small set of videos. It'd probably also need a certain amount of review of the new audio track uploads for accuracy.

If the WMF had a better track record with this sort of thing, I'd be more enthusiastic. Adam Cuerden (talk) 19:25, 3 February 2024 (UTC)

Support, though I doubt it would be used much. SWinxy (talk) 02:51, 6 February 2024 (UTC)
