FFmpeg 4.3 (ffmpeg.org)
554 points by mfilion on June 16, 2020 | 220 comments



In case you didn't know, you can use FFmpeg to convert Audible aax files to DRM-free MP3/AAC/whatever [0] (scroll down for FFmpeg instructions).

This is useful if for some reason you want to archive them or play them in an app that doesn't constantly change its UI and bombard you with ads.

[0]: https://www.kylepiira.com/2019/05/12/how-to-break-audible-dr...
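
For reference, once you have recovered your activation bytes, the FFmpeg step typically boils down to something like this (the hex value below is just a placeholder):

    ffmpeg -activation_bytes 1CEB00DA -i book.aax -c:a libmp3lame -q:a 2 book.mp3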


If you're converting .aax files you should consider using .m4b as the output, since it preserves chapters and remembers your last listened timestamp [0]:

> Audiobook and podcast files, which also contain metadata including chapter markers, images, and hyperlinks, can use the extension .m4a, but more commonly use the .m4b extension. An .m4a audio file cannot "bookmark" (remember the last listening spot), whereas .m4b extension files can.

The cool thing is that once you rip your activation bytes, it works for all your audiobooks. Would definitely recommend it.

[0] https://en.wikipedia.org/wiki/MPEG-4_Part_14
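
For example, a plain stream copy keeps the original AAC audio and the chapter markers intact (activation bytes are again a placeholder):

    ffmpeg -activation_bytes 1CEB00DA -i book.aax -c copy book.m4b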


I don't think that's quite accurate.

There's no functional difference between a .m4b file and a .m4a file. Both use the MP4 container, so they adhere to the same specification and support all the same features (including bookmarks). FFmpeg even uses the same muxer and demuxer for both "formats".

The only difference is a non-standard convention used by certain software (like iTunes) to write audiobook-related metadata only to MP4 files that use the .m4b file extension.

You'll get exactly the same result if you just change the file extension after remuxing/transcoding.


Well, it's also worth noting that different file extensions can have different associations, so .m4b is more likely to open in an app that the user wants to use for audiobooks, rather than opening in a generic mp4 audio playing app.

Even iTunes, I think, would treat files differently between m4r (Ringtone) and m4a (audio) files, so despite there being no difference at all, using the 'correct' extension might be quite a bit more convenient in the long run.


This is true and I'm not saying you shouldn't use the .m4b extension, I just want to make it clear that that's the only difference.

A ".m4b file" is just an MP4 with a funny file extension.


>Even iTunes, I think, would treat files differently[...]

Yes, because it's Apple (of course) who started using non-standard .m4a, .m4b and .m4r extensions, instead of the standard .mp4.


I don’t see a problem with .m4a and .m4v for standalone streams of AAC and AVC respectively.


Wait, so an m4b stores playback position in the file itself? As in, the checksum will change and file syncers like Syncthing will re-upload every time I hit pause?


No, that's entirely player-specific and has nothing to do with the MP4 format at all.


I don't know Syncthing internals, but state of the art in file syncing is to use rolling checksums to identify which parts of the file have changed. If only a few bytes of the file are overwritten, only the immediate vicinity of these bytes would be synced.


audiobook-related metadata does not include the current play position, in this case. It includes things like author and editor and narrator.


The information online about M4A vs M4B is wrong. There is no difference other than the file extension. The Wikipedia article links to a lifewire.com article about the bookmark claim. This container format can store XMP metadata, and you can certainly have a player that saves a playback position in the file's XMP metadata, regardless of its M4A or M4B extension. But no player I know of does that. They store playback positions in their own internal database.

This claim seems to originate from the fact that the old iPods only remembered the last played position on M4B files. But that's entirely a player convention, not a file format convention.


So it writes the timestamp to the file metadata? That would cause issues with syncing, backups, running from a read-only filesystem, etc. My audiobook app already keeps track of current timestamp for me.


it sounds like .m4b is just a flag that tells the player "treat this as a book, not a music file"


And where does the audiobook app store the current timestamp?


Probably in the app data directory, which I prefer.


Isn't bookmarking a job for the player? Then it shouldn't matter what file type you use.


Note that Audible now uses the significantly harder to break AAXC file format in its mobile apps.

AFAIK FFmpeg does not yet support this decoding. There is a patch available: https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/202004...


That's disappointing. Do you know if there's a cracking tool out yet?

I just tried and it looks like you can still download aax through the browser.


The amount of time I spent on correcting wrong chapter marks (like the first four books of the Wheel of Time) is masochistic. But I am absolutely happy that I have the possibility of doing so after the conversion :)


Maybe you would like to try m4b-tool chapter-adjustment by musicbrainz id or silence-detection? :-) Disclaimer: I'm the author - https://github.com/sandreas/m4b-tool

I once planned a chapter database (https://www.chapter-db.org) to collect work like yours, provide an online chapter editor and bind an API to m4b-tool, but I did not have the time to finish the project.


Oh, very interesting, thanks for the hint :D


Heh, I just started The Dragon Reborn on my 3rd read-through. I have noticed the chapters aren't right, but it hasn't really bothered me. What does your workflow look like when fixing these? Is there a way to share chapter corrections so others can apply them to their files?


I have the corresponding ebooks too, so I open up the whole audiobook as one file and guess where the chapters are from the waveform. I listen to the guess and compare to the ebook, checking whether I am too far or too early. Once I find the correct position (and I got quite good at spotting it from the waveform), I set a marker and start with the next chapter. In the end I split it along the markers.

I was planning on writing something to spot when they say "chapter" as it is always the same but I never got around to that. Also, doing all that work was almost meditative :)

A way to share the corrections would be to export the markers from audacity but sadly I don't have that data anymore, though I could calculate the markers from the files I exported if you are interested.


Well, if you own the epub, you could try to find out the whole length of the audiobook, then extract the whole text of the epub split by chapters, and then relatively match the text length to the audio length and put the chapters where the nearest silence is (chapter 2 is at 3.3845% of the whole text, so seek for a silence around 3.3845% of the audio length).

I got some pretty good matches with m4b-tool here, while it does not work for all audio books (you need the latest pre-release for this very experimental undocumented feature!):

  # try to match my-book.epub on my-audiobook.m4b
  # ignore first, second and last two epub-chapters for the match (dedication etc.)
  # split chapters into sub chapters to ensure they are between 5 and 15 minutes
  # create a backup of the original chapters (done automatically)
  m4b-tool chapters -v --epub=my-book.epub --epub-ignore-chapters=0,1,-1,-2 --max-chapter-length=300,900 "my-audiobook.m4b"

  # omg it did not work and messed up all chapters, please restore the original chapters
  m4b-tool chapters -v --epub-restore "my-audiobook.m4b"

  # ok, lets only dump the findings in chapter.txt format to do it manually
  m4b-tool chapters -v --epub-dump --epub=my-book.epub my-audiobook.m4b


Yeah that does sound like a lot of work. I appreciate the offer but like I said it hasn't bothered me much. I don't know that I've ever relied on chaptering for audiobooks, other than for breaking the book into smaller pieces to make the scrubbing less sensitive. My mental model is much more of a linear monolith.


I wanted to create a slideshow a couple weeks ago and came across this article [0] on creating a Ken Burns Effect Slideshow. Very cool and is a great demo of some of ffmpeg's functionality.

However, the final command is a little crazy:

  ffmpeg -i 1.jpg -i 2.jpg -i 3.jpg 
  -filter_complex "color=c=black:r=60:size=1280x800:d=10[black];[0:v]format=pix_fmts=yuva420p,crop=w=2*floor(iw/2):h=2*floor(ih/2),zoompan=z='if(eq(on,1),1,zoom+0.000417)':x='0':y='ih-ih/zoom':fps=60:d=60*4:s=1280x800,crop=w=1280:h=800:x='(iw-ow)/2':y='(ih-oh)/2',fade=t=in:st=0:d=1:alpha=0,fade=t=out:st=3:d=1:alpha=1,setpts=PTS-STARTPTS[v0];[1:v]format=pix_fmts=yuva420p,crop=w=2*floor(iw/2):h=2*floor(ih/2),pad=w=9600:h=6000:x='(ow-iw)/2':y='(oh-ih)/2',zoompan=z='if(eq(on,1),1,zoom+0.000417)':x='0':y='0':fps=60:d=60*4:s=1280x800,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=3:d=1:alpha=1,setpts=PTS-STARTPTS+1*3/TB[v1];[2:v]format=pix_fmts=yuva420p,crop=w=2*floor(iw/2):h=2*floor(ih/2),zoompan=z='if(eq(on,1),1,zoom+0.000417)':x='0':y='0':fps=60:d=60*4:s=1600x800,crop=w=1280:h=800:x='(iw-ow)/2':y='(ih-oh)/2',fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=3:d=1:alpha=0,setpts=PTS-STARTPTS+2*3/TB[v2];[black][v0]overlay[ov0];[ov0][v1]overlay[ov1];[ov1][v2]overlay=format=yuv420"
  -c:v libx264 out.mp4
[0] https://el-tramo.be/blog/ken-burns-ffmpeg/


Just so you know, FFMPEG has a long-standing bug (reported 8 years ago) [1] that affects precisely the type of work you're doing, converting static images to video.

The gist of it is that if your static image is in bgr24 format (most BMP images), the color will be distorted when converting to a typical video pixel format (like yuv420p).

This can be worked around by converting to rgb24 first (which is exactly why this bug is bizarre, since the two should be practically identical.)

(There is also the BT601/BT709 conversion thing, but that's not a bug, just something that needs to be taken care of.)

[1] https://trac.ffmpeg.org/ticket/979

Edit: small correction: I said PNG before, but it's actually BMPs that are usually in bgr24.
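
If I'm reading the workaround right, one way to express "convert to rgb24 first" in a single command is to force an intermediate pixel format in the filter chain (a sketch, not verified against that ticket):

    ffmpeg -loop 1 -i image.bmp -vf "format=rgb24,format=yuv420p" -t 5 -c:v libx264 out.mp4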


If you read the post at the link, each of these settings (or rather groups thereof) looks reasonable and readable in isolation.

This demonstrates, in a rather extreme example, the colossal composability of the interface, even if it lacks nice formatting here.


I feel the same about imagemagick. Oh the things you can do with just curl, bash, imagemagick... And gnuplot if you're rich.


Why do you need to be rich for gnuplot? Aren't gnu things generally GPL?


Sorry it was a bit tongue in cheek.

On many systems I work with and have to debug, gnuplot won't be already installed (and won't be installable on a system not connected to the Internet) so 'rich' would be 'flush in packages with a full system available'.

Sometimes, even imagemagick isn't there, but rsvg-convert is, you can still do amazing things with just bash+curl+svg...


Apparently gnuplot is unrelated to GNU, and the original developers were making a pun on "newplot." Its license is a bit complicated, which is why most systems don't have it.


Or the IT security guys that expertly hand-picked all your available packages for minimum attack surface removed it (but let imagemagick... after some time you stop asking some questions :-)


My guess is they're referring to the need for lots of compute power to get it to run in a reasonable time, which is expensive.


FFmpeg is infamous for absurdly complex command options. It's really too much for a "one-liner" CLI.

When I use it, I just look for "pre-baked recipes" otherwise it's a really unpleasant rabbit hole to get into.


If you look at it as more of a "composable" interface (as the sibling poster suggested), it makes a lot more sense.

No one is going to type out this one-liner from scratch, or have an easy time understanding what it means by reading it, but as it's made up of a series of smaller, more easily understood commands, in a shell or Python script it could be vastly more legible and, dare I say it, usable.

This is also the reason why there are so many frontends to ffmpeg, to simplify various specific tasks. I can't count how many one-off apps I've seen that do one thing and do it well, and just ship a full copy of ffmpeg to do that one thing. Making an actual GUI for all of this would be just... insane, really, but it's so versatile and flexible that you can basically do anything with it.


> Making an actual GUI for all of this would be just... insane, really, but it's so versatile and flexible that you can basically do anything with it.

Challenge accepted! (No seriously, that's what I'm working on)


The problem with calling ffmpeg multiple times in a script is you often waste compute time.

If you can manage to cobble together a single command (as illegible as that may be), you might be able to do what you're looking to do in far less time


Not to mention you also lose quality (generation loss) unless you use a lossless format as an intermediate.


Maybe what's missing is some kind of ffmpeg shell where you can build pipelines and then run them? A bit like spark-shell, or avisynth on windows?


I've been working on making "pre-baked recipes" for a while to help with simple tasks like cutting/merging a video [0]. I recently made an npm package for making time lapses with effects from the command line with FFmpeg as well [1].

[0] https://github.com/zvakanaka/ffeasy [1] https://www.npmjs.com/package/mklapse


I'd recommend looking at something like kdenlive; it generates the ffmpeg command and script for you. It's still complex and difficult to do some things, but it can be much nicer than trying to work out the command line interface for something you want to do. It's also nice because you can save the project at a higher level and reopen it later without having to do all the work of figuring things out again.


> I'd recommend looking at something like kdenlive, it generates the ffmpeg command and script for you.

Doesn't kdenlive use melt [0] instead of ffmpeg directly?

[0] https://www.mltframework.org


The syntax is very difficult to wrap your head around, but beyond that, it is the implicit behavior that makes it difficult to reason about. I find the short ffmpeg commands that take advantage of automatic stream selection, and that omit filter inputs/outputs because ffmpeg can hook them up automatically, much harder to reason about than fully written out, fully explicit ffmpeg command lines.

It almost seems like the best way to understand an ffmpeg complex filter graph is to actually draw a graph...


I actually made my own CLI frontend just because I didn't want to try and memorize ffmpeg options to do the simple things that I want to do most of the time. Now I can just do `--h264 -s X -e Y` to do a h.264 encode from X timecode to Y timecode.


Is -s X -e Y really much easier than -ss X -t $((Y-X)) ?

Every couple of years worth of not using an option I need to look it up again. But it's usually not too hard.


Yes, because that doesn't actually work properly. In order to seek fast while having the ability to start encoding at any point without issues with keyframes, I need to actually do `ffmpeg -ss X1 -i FILE -ss X2 -t (Y-X)` where X1 + X2 = X.

Compare:

    frontend -s 05:28:38.667 -e 05:28:58.767 input.ts
    ffmpeg -ss 05:28:28.667 -i input.ts -ss 10 -t 20.100 -pass 1
    ffmpeg -y -ss 05:28:28.667 -i input.ts -ss 10 -t 20.100 -pass 2


I wrap it in shell scripts or batch files.
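
A minimal sketch of the kind of wrapper I mean (file name and argument order are made up):

    #!/bin/sh
    # cut.sh IN START DURATION OUT -- stream-copy a section without re-encoding
    ffmpeg -ss "$2" -i "$1" -t "$3" -c copy "$4"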


That's awesome. Every section of the command is simple, but taken as a whole, it is pretty thorny. I find that MoviePy (https://zulko.github.io/moviepy/) is a great tool for more complex operations like the above. A lot of MoviePy's functionality is derived from an FFMPEG wrapper, but it is just easier to split things up into a small script.


Insert a line break at every instance of a ":" (while editing) and it becomes a lot clearer.


> However, the final command is a little crazy

Someone could make a million dollars by making a blog that just writes about ffmpeg incantations for use-cases like this.


Just did a little 1-day hackathon project where we built out a video presentation automation tool. One of those components would compile together a series of photos, video, and audio clips (with optional text to go with it for subtitle generation) and it would build everything out. The final command would look insane, there being a line for each added clip and any spacing between them, but it worked perfectly.


There's probably a way to fix the compression artefacts there too. Something like https://stackoverflow.com/a/42989654/280795


https://trac.ffmpeg.org/wiki/Xfade

You can try the xfade filter. I hope one day I can write OpenGL GLSL effects in ffmpeg.


I have used ffmpeg. It's a damn good project, and under current development and support.

It also appears to be the only game in town. Many commercial offerings are really just veneers over custom ffmpeg implementations.

Tuning it is also pretty crazy. Some folks can make entire careers out of just tuning ffmpeg.

I think the biggest issue with video software (besides it being difficult and performance-intensive), is the prevalence of a lot of old, highly-enforceable patents.

Video has been around a while, and companies like Ampex patented a heck of a lot of stuff that can easily be applied to current video.

ffmpeg actually has a couple of build configs that are designed to remove coercive-licensed components.

I'm not so happy about that, but it's the world we live in.

I do have a project that I was playing around with (and will get back to, sooner or later), where I made a simple MacOS wrapper for ffmpeg:

https://github.com/RiftValleySoftware/RVS_MediaServer

I wrote about that, here:

https://littlegreenviper.com/series/streaming/


John Carmack recently commented on the ubiquity of FFMPEG: https://twitter.com/ID_AA_Carmack/status/1258531455220609025

FFMPEG is amazing. I don't think that it's the only game in town, but it's the best by far.


It's so ubiquitous that a video-related project I work on ships with multiple copies of FFmpeg: several dependencies rely on it independently.


It's not the only game in town, the other major open source project for audio/video coding is gstreamer. https://gstreamer.freedesktop.org


It is worth mentioning that it's fairly common to use FFmpeg via gstreamer in the form of a plugin.

https://packages.debian.org/stretch/gstreamer1.0-libav


Cool! I'll check that out.


There is also mencoder, which is derived from mplayer.


I use FFMPEG as a thermostat to keep my apartment nice and warm :) I encode all of my videos (phone, DSLR, dash cameras) to h265 on my Ryzen workstation when it's not in use. I have a primitive "PID-controller" script pulling temperatures from influxdb (data collected using a few esp8266 with ds18b20 sensors) and adjusting the -threads parameter accordingly. It automatically adjusts presets (slow, veryslow, placebo, etc) depending on the number of videos in the queue, so it never runs out of material to encode :) It saves me from using stinky baseboard heaters and reduces my HDD bills!
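
A very rough sketch of the control loop; the temperature source, thresholds and queue handling below are simplified stand-ins for the real thing:

    #!/bin/bash
    # Pick more encoder threads when the room is below target, fewer when above.
    # query_room_temperature and NEXT_VIDEO are placeholders.
    TARGET=21
    TEMP=$(query_room_temperature)
    if (( $(echo "$TEMP < $TARGET" | bc -l) )); then THREADS=16; else THREADS=2; fi
    ffmpeg -i "$NEXT_VIDEO" -c:v libx265 -preset veryslow -threads "$THREADS" "${NEXT_VIDEO%.*}.h265.mkv"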


This is amazing. If your alternative is a non-heatpump electric heater, then why not? :)



Can someone explain to me how FFmpeg seems to be the only open-source software to do even just basic functionality with audio.

I was looking at getting the sound wave graph for a piece of audio a while ago, and not only was FFmpeg the only option I found to be able to do it, it was amazingly fast and also free.


SoX is amazing. http://sox.sourceforge.net/

It’s not perfect but it’s way easier to use for audio stuff than FFmpeg is. I have a bunch of scripts I reuse that do basic stuff like high-pass, normalize, automatically trim audio files, add fade-in or fade-out, downmix to mono, and then resample / dither to the right depth and size.
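
A typical chain like that fits in one SoX invocation, something like this (the effect values are arbitrary examples):

    sox in.wav -b 16 out.wav highpass 80 norm -1 fade t 0.05 remix - rate 44100 dither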

It also will spit out spectrograms.

Generally when I need to record a ton of sound clips, I chop the audio up and rename it in a GUI editor similar to Audacity, and then do all the processing in SoX. I might also do a bunch of work in a DAW beforehand.


> SoX is amazing.

100% agree.

The man pages are chock full of examples too, which is great because the tool does a lot. Some of the examples are really interesting too, such as the delay effect showing how to synthesise a guitar chord.

I use an audio player built largely around sox¹, and it allows you to take advantage of the power of sox.

1. https://80x24.org/dtas.git/about/


SoX is amazing because it indeed makes very nice spectrograms which visually show how audio is encoded. It makes it easy to see if this really is a lossless FLAC or a crappy 192 kbps MP3 audio source.

Whether you personally hear the difference is a completely different subject, of course.


I hadn't even thought about SoX in about 10 years, until your comment. And looking at the page, there hasn't been a new release since 2015.

From what I recall, it only worked on wav files back in the day, but now it supports OGG. A lot has changed in even 5 years, though - does it support MP3 now that the patents have expired?


> From what I recall, it only worked on wav files back in the day

It depends on your build, but on my system it supports: 8svx aif aifc aiff aiffc al amb amr-nb amr-wb anb au avr awb caf cdda cdr cvs cvsd cvu dat dvms f32 f4 f64 f8 fap flac fssd gsm gsrt hcom htk ima ircam la lpc lpc10 lu mat mat4 mat5 maud mp2 mp3 nist ogg paf prc pvf raw s1 s16 s2 s24 s3 s32 s4 s8 sb sd2 sds sf sl sln smp snd sndfile sndr sndt sou sox sph sw txw u1 u16 u2 u24 u3 u32 u4 u8 ub ul uw vms voc vorbis vox w64 wav wavpcm wv wve xa xi. You can check your own with `sox --help`.

On Debian mp3 support requires `libsox-fmt-mp3`.


I just use SoX for processing audio data, and then pass the result to LAME if I want an MP3. Each format has so many different options for encoding and metadata anyway. It’s not like video, where the sheer amount of data discourages you from working uncompressed.
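
For example, something like this pipes the processed audio straight into LAME (flags are illustrative):

    sox input.wav -t wav - highpass 80 norm -1 | lame -V 2 - output.mp3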

Sure, there hasn’t been a new release since 2015… but would that be necessary? It’s not missing any features I want.


It's not important that it doesn't support mp3. That's not its purpose - it doesn't need to. The unix philosophy. Feel free to pipeline it on either side with tools that do support MP3.


according to https://github.com/chirlu/sox/commit/af261dcc91071cafd7d8305..., sox added support for Ogg Vorbis files in 2001, which is a little more than 5 years ago. since sox didn't exist until 1999 and vorbis didn't exist until 2000, that seems like pretty solid format support to me.


In addition to what others have said, there's also gstreamer and its suite of plugins. I find gstreamer a bit easier to work with, although both are very complex pieces of software and each have their own quirks.

If you're looking for audio production work, there's Ardour, although I haven't used it myself. http://ardour.org/


Does gstreamer have a command-line interface in addition to the libraries?


Indeed it does; it's about as complex as ffmpeg, and in my opinion has a somewhat more intuitive interface for building up complicated pipelines of processing steps:

https://gstreamer.freedesktop.org/documentation/tools/index....


There is a "test tool" for gstreamer pipelines: gst-launch, but generally it's encouraged to run the gstreamer more as a library instead.

Example, if you have gstreamer libs installed:

gst-launch-1.0 videotestsrc ! videoconvert ! autovideosink


You can use gst_parse_launch to create a pipeline using the launch syntax.

I've found this helpful to prototype with gst-launch-1.0 and then pull into a separate program down the road. I found it to be pretty hairy trying to create and link all the individual elements manually in complex pipelines.

https://gstreamer.freedesktop.org/documentation/gstreamer/gs...


My DAW is bash+sox+ecasound because I don't want to be distracted by visuals when working with audio. However, I just started working on a project involving about 15 hours of digital audio recorded under less than ideal circumstances a couple of decades ago and need a reliable way to analyze the data. SoX produces spectrograms that are insufficient for my needs and I've had reliability issues with Audacity. So far, DFasma looks very promising:

https://gillesdegottex.gitlab.io/dfasma-website/



What ever happened to Facebook's (or was it Netflix?) technology to create a new unit of time measurement to help align audio and video files? I believe it was called a "flick"...



Doesn't support NTSC. Better to force everyone to deal with irrational sampling rates than pretend they don't exist.


The NTSC framerates are weird, but they are rational, in the mathematical sense.


Python with librosa is also quite practical. Can work interactively by doing it in a Jupyter notebook. Some tutorials (from 2015). https://www.youtube.com/watch?v=0ALKGR0I5MA https://www.youtube.com/watch?v=MhOdbtPhbLU

Not affiliated with the project, just use it quite a lot.


I don't know much, but isn't Audacity a thing?


Audacity is a great GUI for working with audio files. I would think it has a way to export a graph of the wave that it shows you when you open up an audio file.

You can install an FFMPEG plugin for Audacity if you need broader support of audio formats (either import or export).


At least on Windows, Audacity only supports an ancient version of ffmpeg. I don't know why.



What about Reaper? Its a full fledged DAW. http://reaper.fm/


Reaper is proprietary.


I mean there's plenty of open source stuff for audio. Visualization/oscilloscope tools included.

But that kind of visualization isn't what I'd call "basic" functionality, since it's functionally useless for the vast majority of audio applications.


In case that's helpful, here is the code we use to generate waveforms (.mp3 -(1)-> .wav -> numpy -> matplotlib): https://github.com/learningequality/pressurecooker/blob/mast...

ffmpeg is used for the key step (1) ...


why duplicate something that works really well?


- You don't trust so much complex logic, taking untrusted input, written in C and want to rewrite it in Rust.

- You want to code it all again using an API that doesn't expect to get its input from a blocking read() function.

- ...

I think the main reason there isn't any alternative is that it supports soooo many formats that the task seems impossible to anybody thinking about it.


> - You want to code it all again using an API that doesn't expect to get its input from a blocking read() function.

In which real world situation/scenario is this a problem? It is hard to think of one, but I am probably missing something?

In any case, if that was a real show-stopper, it would probably be much wiser to go with a fork that would modify that one thing, instead of re-writing the whole project.


I could see it being an issue if you were doing a bunch of streaming transcodes, and wanted that in an event loop instead of blocking... but

a) you're probably going to want to control the number of simultaneous streams to a low enough number that you could just fork

b) the responsible thing to do when decoding streams with ffmpeg is to disable all formats except your whitelisted format, but still sandbox the heck out of it, because there's been a lot of CVEs where a crafted input allows remote code execution

Sandboxing is going to be much more complete if the ffmpeg process is only dealing with one input fd, one output fd (maybe an error reporting fd), and no network or filesystem access --- you don't want a decoder error to influence media you're encoding/decoding for another user.


As a little side project, I've been trying to automate creation of those "1 second everyday" style videos [1], and used FFmpeg to achieve this.

For things like trimming and concatenating videos, one thing that surprised me was that it was slower than using a tool like ScreenFlow. Note, we're talking about hundreds of gigabytes worth of 4K videos.

slower = if I manually performed the same operation in a professional video editing tool like ScreenFlow, ScreenFlow exported the video in less time than it took FFmpeg to finish executing the command.

Interestingly, there seems to be a fast and a slow way to do things in FFmpeg [2]. The slow way is free of quirks, whereas the fast way introduces something unexpected to the video, like half a second of a black screen with the audio continuing to play as normal.

I'm still curious as to how a tool like ScreenFlow can achieve faster trimming/concatenation/subtitle overlaying, than FFmpeg. I suspect if I read their documentation and do some more research, I'll discover a more optimal way of ordering the various flags on the command line which can speed up the execution, while preserving accuracy.

[1] https://github.com/umaar/video-everyday

[2] https://superuser.com/questions/499380/accurate-cutting-of-v...


> "1 second everyday" style videos

Tangent: you reminded me of one of the coolest auditory experiences I've ever had. Roughly one-and-a-half decades ago I attended a public lecture by Olivier Nijs, a sound design guy from my region[0], about how he built an automated set-up from an old desktop to record one second of 7:00 in the morning every day. Then he manually cut together one whole year. The amazing thing about it was that after a few seconds the long-term trends really started to become noticeable. The changing sounds of birds, people and other living things. How rains in spring were somehow just a little different than the rains in summer or autumn. It was really, really amazing.

(the artist himself wasn't that impressed with his own work - perhaps he saw someone else do it before and didn't feel like showing off with something unoriginal or something?)

[0] https://www.oliviernijs.nl/


Sounds fascinating, stuff like that interests me a lot. Hope that video surfaces one day, would really like to see it.


Not being able to read Dutch, do you happen to have a direct link to the lecture or the recording he made?


i'm thinking it's audio only. Maybe it's hiding here?

https://soundcloud.com/oliviernijs



That's seven in the afternoon, I guess I misremembered! Thank you so much for finding it! :)

EDIT: This is six in the morning, maybe I heard that version https://soundcloud.com/oliviernijs/perdag600


Sounds about right


Mm, that's one of those things I read and wish I had thought to try that. :)


Most likely you are transcoding the video instead of copying the raw stream. A lot of the more complicated stuff requires that, but things like trimming can be done the fast way by simply cutting out the irrelevant pieces in the raw encoded data itself, which is much faster. It's kind of a sport to find the exact string of flags that has the correct effect without transcoding :)

The reason it often fails is that ffmpeg can do so many things that any time you are using some curious combination of flags A, B and C, it is likely that no one else has ever done that and there are some side effects ;)

Anyway, some of it can be avoided by learning how containers and codecs work, what I-frames are and all the other nitty gritty details of the world of video, where there is so much to learn!

Here’s a great intro to get started for anyone who got curious: https://github.com/leandromoreira/digital_video_introduction


Yes, copying the raw stream is the way to go if you're not rescaling, etc. This is the command to extract 15 seconds of video starting at the first minute, and not reencoding; it should be quite fast and it's also quite self-explanatory:

    ffmpeg -ss 00:01:00 -i in.mp4 -t 00:00:15 -c copy out.mp4
I'm really no expert so I just keep a few of those commands around... You don't need a deep understanding of video streams, etc. just to use ffmpeg.


This is the biggest weakness of ffmpeg, in my opinion.

It oftentimes requires so much understanding of how things actually work, and has too little to offer in the way of abstracting those things away.


> The slow way is free of quirks, whereas the fast way introduces something unexpected to the video, like a half a second of a black screen with audio continue playing like normal.

The reason here is the the "fast way" and the "slow way" work very differently behind the scenes.

The "fast way" looks only needs to look at the container, the bits of metadata that tell a player which bits of data need to be given to the decoder at which time. It can just take a blob of data and stick it in another blob of data without looking at the contents.

The "slow way" actually decodes the frames, that is, it takes the blobs of compressed data and turns them into actual pixels, which especially in the case of 4K video, is very slow.

The reason the "fast way" might be less accurate is that the frame you're asking for might not be possible to obtain without decoding the video. Modern video codecs have different kinds of frames and some frames depend on the frames before or after them. If you took such a frame and just jammed it into another video, things would break, because the other frames it refers to are missing.

> I'm still curious as to how a tool like ScreenFlow can achieve faster trimming/concatenation/subtitle overlaying, than FFmpeg.

It's likely that they have optimisations that FFmpeg doesn't or cannot have. FFmpeg has a bit of an emphasis on being able to play and handle pretty much anything you throw at it, no matter how broken. It could be that accurate input seeking is difficult while preserving that reliability.

One option that's probably not an option for you but you might consider is encoding your video in an all-intra format. This is fairly standard in the video editing world. All-intra means that all your frames are independent of each other and can be moved around by editing software without decoding anything. Doing this will result in larger files, however.
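
If you want to experiment with that, a minimal sketch is to force every frame to be a keyframe (the quality settings here are arbitrary, and files get much bigger):

    ffmpeg -i in.mp4 -c:v libx264 -g 1 -crf 18 -c:a copy in_allintra.mp4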


Agree overall, but a couple nits:

> The "slow way" actually decodes the frames, that is, it takes the blobs of compressed data and turns them into actual pixels, which especially in the case of 4K video, is very slow.

The "slow way" decodes and re-encodes the frames. Decoding is a little slow. Encoding is very very slow if done at high quality. (And even high quality is still lossy.)

> The reason the "fast way" might be less accurate is that the frame you're asking for might not be possible to obtain without decoding the video.

This is often possible to solve with an .mp4 "edit list". You include more data than is expected to be displayed, along with instructions for the player to skip part of it. One obvious caveat is that the person you send the video to can remove the edit list, so the hidden frames shouldn't be anything you want to redact for privacy.


This is a great explanation thank you! Makes much more sense now.

Will indeed read into the all-intra format.


It depends what you're doing, there are many different ways to cut a section out of a video. If you do it copying the streams (not reencoding anything) ffmpeg should do it at almost the speed that it can read and write to disk. However if you are reencoding the video, the speed will depend on every parameter controlling the encoder. If that's what ScreenFlow is doing, it's probably using hardware accelerated encoding too.


Did you configure FFMPEG to encode your video with the exact same encoder as ScreenFlow? If Screenflow is using hardware accelerated encoding and FFMPEG was configured for software x264 encoding, that might explain the discrepancy.


ffmpeg is amazing, but it’s like tar on steroids: I can never remember the right incantation. Google is required for even the simplest of things.

I don’t know if it’s their goal, but I’d love a more user friendly set of command line arguments.


This is where GUIs shine, as opposed to the command line.

Perhaps this is slightly off-topic, but my dream is an interface that combines the best of both worlds.

Kind of an automated GUI-builder for command-line tools, that analyzes the combinations of options used most, breaks them down into workflows with options (that can be manually named), and you can thus execute one-off commands easily and quickly without having to hunt through man pages, but still export the command as a command-line incantation for reuse, to use in a script, etc.


I had a similar idea (if not the same), while learning about ffmpeg a year or two ago, and quickly put together a small prototype of it. At least, for the GUI command builder part of it.

Here's the demo: https://jonbo.github.io/project/simply/

There's also a small ffmpeg example. (If you click the fetch button, on the demo.) The schema "recipes" for it are just a basic JSON format and can be seen here https://gist.github.com/jonbo/c4067cd18e5fa687e896b2358aaf9e... and https://github.com/jonbo/simply.recipes/blob/master/docs/REA...

I never did publish the (non-bundled) source code for the demo, since it was done in a hurry (spaghetti), but I might if there's interest.

Maybe this will inspire somebody else to build something better!


That looks awesome, what a great idea. Then just have to turn man pages/help info into JSON..


This sounds great! Expanding on this a little:

Seems like you'd need a universal CLI tool usage traverser and parser to figure out what's possible. Likely this would produce a decision tree of sorts with different modes and options excluding or including new options. We'd need a way to show all this, maybe nested tabs for modes and check boxes and other inputs at each appropriate level.

Layer on this a way to optimize for the most common cases like you said. Further, if this becomes popular, CLI tools could emit some sort of standard description language that would optionally customize the GUI. The GUI's output should be both the text command it constructed and the ability to run that command directly.

More future steps would be a way to reason about multiple commands, pipes, and other combinators.

Adding this to my side projects backlog. Thanks for the idea!


In the emacs world, there are textual interfaces like magit's [0] or dired's [1] interface, that tick many, perhaps most, boxes that you mention. Magit is basically that interface, but tailored to `git`, that also constructs the actual git command, should you want to see it. Dired is like that for ls, rm, cp, mv and other file utils. So, as a general design, they _may_ be of useful interest.

A tool to automatically parse `man` pages or help prompts from tools would be a dream come true, basically.

Apart from the possible commands, it may be useful for the GUI to also show some kind of state, for example filesize (akin to invoking `ls` before `ffmpeg`, as you would normally do on the CLI).

Done with the right abstractions, command combinations should come almost for free.

[0] https://magit.vc/ [1] https://en.wikipedia.org/wiki/Dired


Exactly that -- all of it. And I love the idea of CLI tools optionally emitting their own defaults for it too.

I (selfishly) hope you build it, since I don't have the time!


That's interesting, but I'm aiming for something much simpler. Sensible defaults (faststart, for instance) and easy arguments:

  ffmpeg mymovie.mov -w 720 mymovie.mp4
Should do what you expect
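
For comparison, the closest stock-ffmpeg equivalent today is probably something like:

    ffmpeg -i mymovie.mov -vf scale=720:-2 -movflags +faststart mymovie.mp4

which is exactly the kind of boilerplate a friendlier front end could hide.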


Here's probably the most popular GUI: https://github.com/WinFF/winff


It's fine for simple transcoding jobs. Once it reaches a certain complexity, you at the very minimum want to outsource it to a shell script to organize the arguments over multiple lines, or switch to some other language bindings that aren't limited by shell argument parsing.

Compare the CLI and python invocation in this example: https://github.com/kkroening/ffmpeg-python#complex-filter-gr...

For some subsets of ffmpeg functionality (e.g. creating webm videos from other sources) there also are dedicated GUIs.


That example is fantastic and really illustrates why a solid API can make things so much easier for the developer.

Reading the command line I could barely tell what ffmpeg was doing. This also could mean the command line is badly designed.


As mentioned, splitting over multiple lines can help https://trac.ffmpeg.org/wiki/FilteringGuide#Multipleinputove...


Several years ago I was attempting to automate a very long chain of AV transformations using ffmpeg and such. I distinctly remember having a dream of a green-text black window command line prompt of ffmpeg incantations, and that is when I realized that I had perhaps been digging too deep.


On my side, I frequently forget my own age, people's names, things my girlfriend remembers from 5 summers ago.

However, I'll never forget things like "tar -zxvf", "ffmpeg -i vid.mp4 image-%04d.png" and "convert image-*.png +dither anim.gif".


I have muscle memorized "youtube-dl --extract-audio --audio-format mp3 https://YouTube.com?v=123etc"


You probably want "-f bestaudio" instead of "--extract-audio". The former will download just the audio and skip the video which is significantly faster (and cheaper when on a metered connection). However you will have to do the conversion to MP3 yourself afterwards (e.g. with ffmpeg) if you want that format specifically.
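
Roughly, that two-step version looks like this (output template and quality settings are just examples; the glob assumes a single downloaded file):

    youtube-dl -f bestaudio -o 'audio.%(ext)s' 'https://youtube.com/watch?v=...'
    ffmpeg -i audio.* -c:a libmp3lame -q:a 2 audio.mp3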


These days, with AAC being supported pretty much everywhere you're probably better off just using the " -f 140" option to get the 128kbps m4a file, which at least saves the further (albeit subtle) degradation caused by another lossy to lossy transcode.


`ffmpeg` reminds me of `convert` from ImageMagick, it's complex because there's a lot of media formats and the command lets you do a lot things to the media as well.

I liked using WinFF a time ago for simple conversion jobs though where I didn't care too much about tweaking all the knobs.


Is WinFF still around in some capacity? The website doesn't load for me. Too bad if it's abandoned, because I remember it being pretty amazing.

EDIT: From Big Matt's blog:

> Unfortunately I lost the WinFf.org domain. I was broke but I don't think it's that important after all these many years. You can still go to github, video help, and others.

https://www.biggmatt.com/2020/04/winfforg.html


ImageMagick's convert is much friendlier than ffmpeg:

  convert image.png image.jpg 
Does what it's supposed to, and it's as simple as possible. Sure, you can get really fancy with resizing proportions, but easy stuff should be easy.


I mean,

    ffmpeg -i something.avi something.mp4
works as well, but there's just _so_ many more options inherently involved in converting a video: most inputs and outputs will be at least 2 tracks, so that's 2 codecs, and the majority of containers used now (MP4, MKV, etc.) support all sorts of codecs, so while ffmpeg will "guess" what you want just like ImageMagick does in your short example, the chances that it guesses right are a lot lower just on the basic level of what format you wanted your output to be. But it's not exactly ffmpeg's fault.


True, but that -i is telling. Remove it and it breaks spectacularly


You might want to look into Handbrake. I understand it's basically just a GUI frontend for ffmpeg.


I use it almost everyday.


I don't know if it fits you but there are several GUI frontends dedicated to help with the sheer amount of arguments/flags.

https://github.com/swl-x/MystiQ is what comes to mind


I think gstreamer's gst-launch syntax is a bit easier to read and write, although it has a different set of capabilities than ffmpeg, so may not be a whole replacement for what you want to do.


I love ffmpeg but those release notes suck.

If all you can say is how much time there's been, why isn't this just 4.2.4?

I suspect there are major features and improvements buried in the 30 pages of commits.

https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/refs/heads/rel...


There is an actual changelog, but it's just as sparse on details. At the very least tell me whether those filters are new, removed, fixed, etc.

https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/HEAD:/Changelo...


A formal news entry isn't up yet. Expect it in a few days.


That's so much better than the shortlog that is linked from the release page: https://git.ffmpeg.org/gitweb/ffmpeg.git/shortlog/n4.3


Now this sounds interesting

   - LEGO Racers ALP (.tun & .pcm) demuxer
I think I will dig through to figure out what that actually means later.


ffmpeg can now extract data from the .tun & .pcm file formats (but it might not be able to decode it yet, depending on the encoding used in such files).


It can, that's what the "High Voltage Software ADPCM decoder"'s for.


With so much video processing research using neural networks now, I hope ffmpeg gets better support for them. It has some filters that use them, like sr[0] for super resolution and dnn_processing for general processing, but the user experience isn't great. They need a model file that isn't shipped, and since there don't seem to be any pretrained models included, you have to train one yourself. Hopefully they add better support in the future, together with more dnn filters.

[0] https://ffmpeg.org/ffmpeg-filters.html#sr


Changelog: https://github.com/FFmpeg/FFmpeg/blob/master/Changelog

Seems like the big feature is Vulkan support?



TIL there’s a 9to5Linux. Is it new/affiliated with the other 9to5 properties?


Looking at their respective About pages they seem unrelated:

https://9to5mac.com/about/

https://9to5linux.com/about-us


Other pretty big ones look like AV1 encoding (4.2 had decoding) and hardware acceleration for VP9.


Also AV1 via librav1e. Is this the first time a Rust library has been compiled into FFmpeg?


ffmpeg is fantastic. I'm an iOS dev and when I put something on Github, I first record the iOS simulator with QuickTime. Then I convert the resulting .mov file with ffmpeg:

    ffmpeg -i example.mov -r 15 example.gif
Voilà, an animated gif. Quality is atrocious but it gets the message across, plus the filesize is not too big.


If you're prepared to run ffmpeg twice (allowing it to analyze the GIF), the quality will be significantly better: https://cassidy.codes/blog/2017/04/25/ffmpeg-frames-to-gif-o...


No need to run it twice.

Basic syntax is

    ffmpeg -i video -filter_complex "[0]split[vid][pal];[pal]palettegen[pal];[vid][pal]paletteuse" out.gif
ffmpeg 4.0 and later will automatically insert fifo buffers for the main video while one copy is analyzed to generate the palette.


Great tip, thanks!


FFmpeg is hands down one of the most powerful and feature-packed tools I've used out there.[0] The associated complexity is also daunting, but thankfully there's a lot of documentation out there and it reflects the low level nuances of audiovisual formats.

I highly recommend anyone struggling to utilize it to write wrapper scripts around it so you only need to figure out things once. Here are some things I've done with it by that approach:

* Extracting any embedded subtitle files from MKVs. Nice if I want to search them or make changes (see the sketch after this list).

* Back when GIFs were more popular, I converted any that were over 3MB to video to save space. If the output wasn't small enough, it would do a second pass with different settings to get it more compact. Not needed that much these days.

* "Barcodes" for videos, that is, it takes every second converted to a vertical sliver and combined you get an overview of how the average color of the film changes through its duration.

* A tool for creating video excerpts that lets me specify a start time and end time in more flexible timestamp formatting, and other things like a simple parameter for the output width.[1] It also allowed specifying a target filesize and did the math so the right bitrate would be chosen. I even include metadata so I know which original file it was made from and the parameters specified.

* Thumbnail previews. A lot of file sharing sites will include a file that includes some timestamped screenshots in a grid with encoding information at the top. This is good for movies so you see a high-level overview. The best part about doing this myself is that I could make it highly configurable, like choosing exactly how many images I want, the interval, whether I want timestamps, etc.
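
For the subtitle and thumbnail items above, the commands boil down to roughly this (paths, intervals and grid size are just examples; the subtitle one assumes a text-based track):

    # first embedded subtitle track to SRT
    ffmpeg -i movie.mkv -map 0:s:0 movie.srt

    # 4x4 contact sheet, one frame every 5 minutes
    ffmpeg -i movie.mkv -vf "fps=1/300,scale=320:-1,tile=4x4" -frames:v 1 sheet.png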

Note, for some of these, I also needed ImageMagick.

Also, when compiled with the right flags and libraries, FFmpeg has some really neat features: things like embedding subtitles, stabilizing video, hiding logos, etc. I recommend looking into the filters.

Thank you for all the manpower that goes into the project!

[0]: Two other ones that are also powerful are ImageMagick and Pandoc.

[1]: I initially wrote this in Bash, but later converted the code to Python to better handle command line arguments and allow things like using config files.


ffmpeg is absolutely phenomenal. I recently used it to combine multiple separate audio tracks from a webrtc session into a single file.

For anyone that hates compiling ffmpeg from source, John Van Sickle does an amazing job of doing the work for you by making binaries publicly available for each version: https://johnvansickle.com/ffmpeg/


I rely on those static builds and they are a lifesaver. Be careful with linking to this website from automation scripts, etc., as the downloads are regularly swapped and only certain versions (IIRC the latest in a point release) are kept available. E.g., 4.3.0 will be taken down once 4.3.1 is available; however, 4.3.1 might be permanently moved to the archive if it turns out to be the last 4.3.x release before 4.4.0.


Great point. What I've done in the past is have a Jenkins job that points at the latest release and trigger the job manually when you know you want to upgrade to the latest version.


I don't usually rely on a specific version and am fine running a release (or five) behind. If you don't need the latest and greatest pointing to an archived old release may save you some hassle.


Good resource, but I usually end up building my own to take advantage of the improved AAC encoding with libfdk-aac which can't be distributed pre-compiled.


If anyone is compiling their own version for use with IP video streams, here is a modification that adds to the av_read_frame() function a call to the avformat interrupt callback.

https://gist.github.com/bsenftner/ba3d493fa36b0b201ffd995e8c...

This effectively adds the ability to monitor the IP stream for unexpected termination.

The current implementation of av_read_frame() will hang if, for example, a human trips over a camera's cables and the stream abruptly terminates. Without modification to the API, this change to av_read_frame() calls the avformat interrupt callback each loop through av_read_frame()'s reading of packets. All the callback needs to do is look at the time, and signal error if the time between callbacks exceeds something reasonable.

I am not sure why, but this change was not accepted by the ffmpeg developers. I find it essential for working with IP video and IP video cameras.


I've been an ffmpeg user for something like 15 years now. The project never ceases to amaze :heart:

I remember using it to encode MPEG-2 DV footage to FLV (yes, flash video) to live-stream footage captured over firewire from early prosumer HD cameras :) It's always been a solid video swiss-army knife!


FFmpeg is one of the prime examples of open-source software, but a lot depends on the 3rd party libraries it uses. I tried using my Android phone to convert my videos from H.264 to HEVC (H.265) for storage, but it was ~3.5 times slower in comparison to x86 CPUs, in terms of frames encoded per second per watt of power consumed (FPS/Watt): https://quanticdev.com/articles/h265-encoding-on-arm-cpus

Though still, I can't even imagine attempting this test without FFmpeg in the first place. It is available directly in Termux on Android.


FFmpeg started out as frustrating for me but the more I use it the more I love it. The ability to split videos into smaller segments and applying different filters to each segment and finally combining the segments is just great.

I've been working on a web-based video editor for app features: https://glitter.now.sh/ and i've had tons of fun tweaking FFmpeg.

My only wish would be that the documentation would include video samples for the example commands (I'd love to help with this).


I use ffmpeg to run a fun twitch stream VOD to highlight reel pipeline. Example with explicit language: https://www.youtube.com/watch?v=ETR3IXyGgEo. ffmpeg handles everything from frame-extraction (to feed a deep learning model), audio spectrogram calculation (also features for the model), video trimming (to cut the interesting clips), video concatenation (to join the clips), and the text overlay


What is the deep learning model and what does it do?


It's just a standard vision convnet like ResNet-18, or ResNet-50. It gets fed facecam with an audio spectrogram concatenated to it (pretty hacky, but seems to help). All it does is binary prediction of {interesting, not interesting}, and I use some heuristics to pick regions of video based on how many frames were "interesting" to the model.

Feel free to take a look at the (research quality at best) code: https://github.com/eqy/autotosis


Thanks.


Huge thank you to FFmpeg for making my Video Hub App possible: extracting screenshots from all videos in your video collection to create an easy to browse/search library.

https://github.com/whyboris/Video-Hub-App

https://videohubapp.com/


That's a great idea, looks awesome.

If I understand the pricing, it's free for up to 50 files in each folder/hub; for more it's $3.50, which you donate 100% to an anti-malaria charity!


Good way to describe it. A "Hub" is all the videos in a folder and all its subfolders. The demo will be 100% the same as the final app but with a 50 videos per hub restriction.

You can build your own (without 50 videos limitation) with just `npm install` and `npm run electron:windows` (or `mac`, or `linux`).

If you choose to buy, it's $3.50 minimum - you can pay more. $3.50 of every purchase goes to Against Malaria Foundation

Cheers!


I was looking at ffmpeg in the Wayback Machine earlier today:

https://web.archive.org/web/20010218084709/http://ffmpeg.sou...

Fascinating to see such an important and sophisticated product come from these modest beginnings.


From https://9to5linux.com/ffmpeg-4-3-released-with-vulkan-suppor...:

> support for the ZeroMQ Message Transport Protocol (ZMTP)

This is fascinating. I've never heard of zmq being exposed in a public API. Cool idea.


Does anyone have a better documentation source for modern ffmpeg? Usually when I use it I get all sorts of different answers with different 'methods', and the official docs only confuse it more.

Also I hope to see pure GPU transcoding sometime. H264 to h265 transcodes in pure GPU space are Uber fast, but so far only done by other software.


> Also I hope to see pure GPU transcoding sometime. H264 to h265 transcodes in pure GPU space are Uber fast, but so far only done by other software.

FFmpeg has had pure GPU transcoding for quite some time. See https://trac.ffmpeg.org/wiki/HWAccelIntro

It even has a GPU-based scaler for NVIDIA.
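
For instance, the wiki describes full-GPU pipelines roughly along these lines (assuming an NVIDIA card and an FFmpeg build with NVDEC/NVENC enabled):

    ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i in_h264.mp4 -c:v hevc_nvenc out_h265.mp4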


It doesn't work pure GPU for transcoding between formats. You can do h264 to h264, but not h264 to h265, which can be done with other software if the hardware supports it.


I’ve pretty much accepted that the project is too big and changing too fast to ever have detailed documentation written.

Apple and Google don’t even fully document their products anymore.


If you get seriously confused, ask a question on stackoverflow. There's at least one dev around that answers. If it's a bug, they'll fix it.


Exciting improvements I am glad to see: Intel QSV-accelerated VP9 decoding, Support AMD AMF encoder on Linux (via Vulkan)


What is wrong with these people? Why is "libvidstab" always disabled? Are they trying to save valuable disk space? -- Libvidstab works quite well, and better than anything else readily available on Linux.


First I've heard of this library. The github page is pretty informative. It looks quite worthwhile, thanks for informing me.


My favorite memory of FFmpeg involved wowing a friend with the "magic of computer hacking" by slicing a subsection of a youtube video, cutting the source audio, and replacing it with audio from a song he had discovered "fit perfectly". Never mind that the entire process took about 20 minutes of googling and trial/error with the Windows Subsystem for Linux. It was better than the native Windows alternative :P

https://www.youtube.com/watch?v=3tr8JVZdHfc


I thought that was Mad Rush for a second.


nice, I’ve been toying with it recently because I wanted to get some video on the Apple TV I’ve got connected to my TV.

In the process I was reminded of how important Intellectual Property is in our field.

I downloaded a video from youtube [0] and learned that the video is in a WEBM container for certain types of video compression formats (VP8/9 + another one) as well as vorbis/opus audio. It turns out that to get the best quality on the Apple TV, I should encode to HEVC (H265) video and I guess AAC audio.

There’s some sort of history behind this divide. A pain in the neck for people who want to toy around with video, but huge decisions for these companies in choosing the formats they use to move these bits around.

So I can use ffmpeg to re-encode into the new format, and I can play it on my apple devices if the file is local, but I can’t shoot it via UDP to the TV. When VLC (app on apple tv) is listening on a port for UDP I only get choppy audio:

Doing this on the sending side doesn’t seem to work:

  ffmpeg \
      -re -i video_stream_ready.mp4 \
      -c:v copy -c:a copy \
      -f mpegts "udp://$APPLETVIP:2300"
has anyone toyed around with shooting pre-generated (or even real time generated) video at their TV this way?
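
For the re-encode step itself, a commonly suggested recipe looks something like this (treat the quality settings as placeholders; the hvc1 tag is the part Apple players reportedly care about, since ffmpeg defaults to hev1):

  # re-encode a WEBM (VP9/Opus) download to HEVC + AAC in MP4 for Apple devices
  ffmpeg -i input.webm -c:v libx265 -crf 23 -tag:v hvc1 -c:a aac -b:a 192k output.mp4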

Being stuck at home makes me want to make something artsy that could be fun to look at through the day.

[0] https://youtu.be/wPXSFVruIHI


We recently started building a video player for go-flutter-desktop using FFmpeg. It's been immensely frustrating, as the audio and video are slightly out of sync and it requires a bit of tuning.

If anyone has experience with FFmpeg, I'd greatly appreciate if you could take a look! https://github.com/telefuel/video_player_testbed/issues/1


I use it with fd to convert audio files in parallel, such as FLAC to MP3 or Opus (or ALAC to FLAC when someone let me … sample a few albums). Found the tip on the Arch wiki: https://wiki.archlinux.org/index.php/Convert_FLAC_to_MP3#Par...
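
In case it's useful, the pattern is roughly this (fd runs one job per core by default; the bitrate is just an example):

  # convert every FLAC under the current directory to Opus, in parallel
  fd -e flac -x ffmpeg -i {} -c:a libopus -b:a 128k {.}.opus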


I agree with the shared sentiment that FFmpeg is awesome.

It's the only way I can get mkv converted for playback in browsers. I believe that browsers don't support mkv/aac natively because of licensing but I would be interested if anyone has a different solution for browser playback.
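
One thing worth knowing: if the streams inside the MKV are already browser-friendly (H264 + AAC), you often don't need to re-encode at all, just remux. Something like:

  # copy the existing streams into an MP4 container without re-encoding
  ffmpeg -i input.mkv -c copy -movflags +faststart output.mp4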


And here's my incantation for screen recording under X Window (also used one like it under MS Windows, but I don't have it on hand):

  ffmpeg -rtbufsize 2147M -f x11grab -framerate 30 -video_size 1920x1080 -i :0.0 -c:v libx264 -preset ultrafast output.mp4


nice but it doesn't record audio and doesn't use hardware acceleration.


So if that's possible, please tell us how!
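
From what I can tell, the usual recipe adds a pulse input for the audio and uses VAAPI for the hardware encode; something like this (untested sketch, device path may differ), if anyone can confirm:

  # X11 screen capture + PulseAudio capture, encoded on an Intel/AMD GPU via VAAPI
  ffmpeg -f x11grab -framerate 30 -video_size 1920x1080 -i :0.0 \
         -f pulse -i default \
         -vaapi_device /dev/dri/renderD128 -vf 'format=nv12,hwupload' \
         -c:v h264_vaapi -c:a aac output.mp4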


Breaks chromium, at least on arch. Beware.

https://bugs.archlinux.org/task/67020?project=1&string=chrom...


I wish FFmpeg still had NDI support

NDI is super useful for low-latency local network streaming



I feel the same way. It's too bad that the protocol is being mismanaged by its creators.


Love me some ffmpeg, but it would be nice to have some bullet-point summaries of updates in the release notes instead of pointing to the changelog dump. I know there is some information loss, yadda yadda.


I've been feeling stupid these past few days because I've been using ffmpeg for the first time. I've been trying to take an RTSP stream from a security camera and transcode/restream it to something that can be viewed in an HTML video tag. It works (with poor quality) in Firefox, but in Chrome I just get a green screen. (I also tried using vlc/cvlc's streaming.) I've been feeling stupid because of all the copy/pasting from the web into the command line without understanding what I was doing.
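
For reference, the kind of incantation I've been cargo-culting looks roughly like this (camera URL and output path are placeholders):

  # restream an RTSP camera as HLS segments an HTML <video> tag (via hls.js) can play
  ffmpeg -rtsp_transport tcp -i rtsp://camera.local:554/stream1 \
         -c:v libx264 -preset veryfast -pix_fmt yuv420p \
         -f hls -hls_time 2 -hls_list_size 5 -hls_flags delete_segments /var/www/stream.m3u8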


I wonder how hard it would be to build a video editing suite with FFmpeg.

Avidemux is nice, but currently I only see DaVinci Resolve as an alternative to Premiere.


I just used ffmpeg to extract audio from a video and audacity to remove audio noise... love those open source tools
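
The extraction step is a one-liner, something like this (filenames made up; use a WAV/PCM output instead if Audacity prefers it):

  # pull the audio track out without re-encoding, assuming the source audio is AAC
  ffmpeg -i input.mp4 -vn -c:a copy audio.m4a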


I've been an ffmpeg user for something like 15 years now. The project never ceases to amaze :heart:


does anyone have a good installation guide, which includes all the stuff that isn't supported in the default installation? ffmpeg is great, but a lot of times I go to use it and find a feature isn't supported because I didn't compile it with a flag set.


If you're on Windows or macOS there are good premade builds available at https://ffmpeg.zeranoe.com/builds/. If you want to build it yourself look at https://github.com/rdp/ffmpeg-windows-build-helpers (this supports more than just Windows).


There are several static builds out there if you're okay with binaries. This person's builds[1] have the sources and build dependencies listed.

[1] https://johnvansickle.com/ffmpeg/


You didn't say which OS, but I wrote a set of cross-platform instructions for my PhotoStructure users: https://photostructure.com/getting-started/video-support/

tldr: if you're on Windows, use Chocolatey or Scoop. For macOS, use homebrew.
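
i.e., something like:

  choco install ffmpeg     # Windows (Chocolatey)
  scoop install ffmpeg     # Windows (Scoop)
  brew install ffmpeg      # macOS (Homebrew)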


ffmpeg is a great tool. Once I needed to do some basic video editing (cuts, a fixed logo overlay, some background audio) on a low-resource computer; I quickly downloaded ffmpeg and finished the work in about 10 minutes with a few Google searches, without putting any real load on the system.


Somewhat related. Anyone have a good utility for downloading youtube videos?



I'm a recent convert to youtube-dl. I use some bash scripts[0] to make youtube-dl even easier. I have a Mac but these should work elsewhere.

Copy the video page URL to your clipboard and just type

yd - download the video. Works on most popular sites with video.

yda - download just audio, best available

ypl - makes a subfolder with the whole playlist in it (Copy the playlist URL)

yc - downloads every video on the channel (Copy the channel URL)

    yd () { youtube-dl "$(pbpaste)" ; }
    yda () { youtube-dl -f bestaudio "$(pbpaste)" ; }
    ypl () { youtube-dl -i -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' "$(pbpaste)" ; }
    yc () { youtube-dl -i -o '%(channel)s/%(title)s.%(ext)s' "$(pbpaste)" ; }
I use especially yd and ypl constantly. For sites where youtube-dl doesn't work, this usually does: get the master m3u8 or an mp4 link from the page using Developer Tools->Network[1], copy the link to your clipboard, and

vd myfile - downloads the video as myfile.mp4

    vd () { youtube-dl -o "$1.mp4" "$(pbpaste)" ;  }
[0] Save them in your .bash_profile or equivalent on your machine.

[1] i.e. with Network tab open, refresh page and start video playing.


Nice. Thanks for the aliases. I just started using it myself for downloading and transcribing some videos. It works pretty well.


And, back on topic: youtube-dl uses the system ffmpeg for converting YouTube's .webm and other separately-delivered audio/video streams into proper video (or audio) files.


Just a reminder, depending on your installation method of youtube-dl , you may have to manually install ffmpeg globally on your system in order to download/mux highest quality videos.
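
That's because requesting the best quality typically means downloading separate DASH video and audio streams that then have to be muxed together, e.g. (placeholder URL):

  # needs ffmpeg available to merge the separate video and audio streams
  youtube-dl -f 'bestvideo+bestaudio' --merge-output-format mkv 'https://www.youtube.com/watch?v=VIDEO_ID'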


It should be noted that despite the name it supports a lot of different websites, not just YouTube. It's great for capturing HLS streams, for instance.



And for good reason. Insane amount of customization, and once you've finally pieced together your 10-line command with 40 parameters, you can just give it a huge list of URLs and let it do all the work in the background.
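
The batch part is literally one flag, e.g. (filename made up):

  # download everything listed in urls.txt, one URL per line, skipping failures
  youtube-dl -i -a urls.txt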


Thanks.


Is it just me, or does the mobile version of their website have a lot of issues?



