Show HN: Ez FFmpeg – Video editing in plain English

(npmjs.com)

211 points | by josharsh 6 hours ago

24 comments

  • qbow883 1 hour ago
    Days since last ffmpeg CLI wrapper: 0

    It's incredible what lengths people go to to avoid memorizing basic ffmpeg usage. It's really not that hard, and the (F.) manual explains the basic concepts fairly well.

    Now, granted, ffmpeg's defaults (reencoding by default and only keeping one stream of each type unless otherwise specified) aren't great, which can create some footguns, but as long as you remember to pass `-c copy` by default you should be fine.

    Also, hiding those footguns is likely to create more harm than it fixes. Case in point: "ff convert video.mkv to mp4" (an extremely common usecase) maps to `ffmpeg -i video.mkv -y video.mp4` here, which does a full reencode (losing quality and wasting time) for what can usually just be a simple remux.

    Similarly, "ffmpeg extract audio from video.mp4" will unconditionally reencode the audio to mp3, again losing quality. The quality settings are also hardcoded and hidden from the user.
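
    For comparison, a minimal sketch of the remux and lossless-extract variants of those two cases. The helper names `remux_cmd` and `extract_audio_cmd` are made up for illustration, and they only print the command so it can be inspected before running. The `.m4a` output assumes the source audio is AAC; other codecs would need a different container.

```shell
# Print (not run) a remux command: streams are copied, not re-encoded.
remux_cmd() {
  printf 'ffmpeg -i %s -c copy %s\n' "$1" "$2"
}

# Print a command that copies the audio stream out untouched.
# -vn drops the video; -c:a copy avoids the lossy re-encode to mp3.
extract_audio_cmd() {
  printf 'ffmpeg -i %s -vn -c:a copy %s\n' "$1" "$2"
}

remux_cmd video.mkv video.mp4
extract_audio_cmd video.mp4 audio.m4a
```

    File names containing spaces would need quoting before pasting the printed command into a shell.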

    I can sympathize with ffmpeg syntax looking complicated at first glance, but the main reason for this is just that multimedia is really complicated and that some of this complexity is necessary in order to not make stupid mistakes that lose quality or waste CPU resources. I truly believe that these ffmpeg wrappers that try to make it seem overly simple (at least when it's this simple, i.e. not even exposing quality settings or differentiating between reencoding and remuxing) are more hurtful than helpful. Not only can they give worse results, but by hiding this complexity from users they also give users the wrong ideas about how multimedia works. "Abstractions" like this are exactly how beliefs like "resolution and quality are the same thing" come to be. I believe the way to go should be educating users about video formats and proper ffmpeg usage (e.g. with good cheat sheets), not by hiding complexity that really should not be hidden.

    Edit: Reading through my comment again, I have to apologize for the slightly facetious opening statement, even if I qualify it later on. The fact that so many ffmpeg wrappers exist is saying something about its apparent difficulty, but as I argue above, a) there are reasons for this (namely, multimedia itself just being complicated), and b) I believe there are good and bad ways to "fix" this, with oversimplified wrappers being more on the "bad" side.

    • juujian 28 minutes ago
      Yes, I use ffmpeg about once a year, in about 350 years I really ought to have all the syntax figured out.
    • e-Minguez 51 minutes ago
      If you only use it from time to time it would be very challenging to remember the millions of different options ffmpeg has.
    • WhitneyLand 22 minutes ago
      “It's really not that hard”

      I’m going to guess your job does not involve much UX design?

      • qbow883 12 minutes ago
        I'm not saying it couldn't be better (and I even gave examples), my point is that the drawbacks of such a wrapper outweigh the benefits, at least when it's such an oversimplified one. I've said in other replies how I'd be very interested in e.g. an alternative libav* frontend with better defaults and more consistent argument syntax, but I don't think that this invalidates my criticism of the linked project.
    • Forgeties79 1 hour ago
      Some people just want to use an intuitive tool with better QoL, even if it leads to compromises, to do a job swiftly without going over documentation/learning a ton of new things. Not everything has to be an educational experience. ffmpeg exists in its original form like you prefer, but some folks want to use lossless cut. Nothing wrong with that IMO.

      Personally I think it’s great that it’s such a universally useful tool that it has been deployed in so many different variations.

      • hnarn 1 hour ago
        > Some people just want to use a tool to do a job swiftly. Not everything has to be educational.

        > some folks want to use lossless cut

        In that case I would encourage you to ruminate on what the following in the post you're replying to means and what the implications are:

        > "ff convert video.mkv to mp4" (an extremely common usecase) maps to `ffmpeg -i video.mkv -y video.mp4` here, which does a full reencode (losing quality and wasting time) for what can usually just be a simple remux

        Depending on the size of the video, the time it would take you to "do the job swiftly" (i.e. not caring about how the tools you are using actually work) might be more than just reading the ffmpeg manual, or at the very least searching for some command examples.

        • wpm 12 minutes ago
          The thing is that when a video is being re-encoded, so long as I'm not trying to play games on my computer at the same time, I'm free to go do something else. It does not command any of my attention while it's working, whereas sitting and reading the man pages commands my attention absolutely.
        • foodevl 14 minutes ago
          > > some folks want to use lossless cut > In that case I would encourage you to ruminate on what the following in the post you're replying to means and what the implications are:

          You may have misunderstood the comment: "lossless cut" is the name of an ffmpeg GUI front end. They're not discussing which exact command line gives lossless results.

      • qbow883 1 hour ago
        Yes, I am not opposed to ffmpeg wrappers in and of themselves. Some decent ffmpeg wrappers definitely exist. But I argue in my comment above that this specific tool does not have better QoL - again, since it reencodes unconditionally with quality settings that are usually not configurable.
    • kristopolous 51 minutes ago
      so you know how to swap audio with -map without having to look it up?
      • qbow883 50 minutes ago
        I do, yes. Though that's not really the point, it'd already be enough to know where to look it up.
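
        For reference, a sketch of one way to do the audio swap, assuming the replacement audio lives in a second file. `swap_audio_cmd` is a made-up helper that only prints the command, and the file names are placeholders.

```shell
# Print a command that takes video from the first input and audio from the second.
# -map 0:v selects input 0's video streams, -map 1:a selects input 1's audio streams.
swap_audio_cmd() {
  printf 'ffmpeg -i %s -i %s -map 0:v -map 1:a -c copy %s\n' "$1" "$2" "$3"
}

swap_audio_cmd video.mp4 new_audio.m4a out.mp4
```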
        • kristopolous 46 minutes ago
          no the point is that there are some things I've done a hundred times and I never remember them because it's designed in a wildly bad way. ffmpeg, gpg, openssl and git have those things all over the place. Is it -c:v or -v:c? I don't know. used to be -vcodec so it's -v:c now? no it's -c:v I think because they swapped it?

          There isn't internal consistency to really hold on to ... it's just a bunch of seemingly independent options.

          The biggest problem is open source teams really don't get people on board that focus on customer and product the way commercial software does. This is what we get as a result

          • qbow883 39 minutes ago
            > Is it -c:v or -v:c?

            Sure, I agree with all of this. Like I said above, the syntax (and, even more, the defaults) isn't great. I'm just arguing that "improving the syntax" should not mean "hiding complexity that should not be hidden", as the linked project does. An alternative ffmpeg frontend (i.e. a new CLI frontend using the libav* libraries like ffmpeg is, not a wrapper for the ffmpeg CLI program) with better syntax and defaults but otherwise similar capabilities would be a very interesting project.

            (The answer to your question is that both -vcodec and -c:v are valid, but I imagine that's not the point.)

            > The biggest problem is open source teams really don't get people on board that focus on customer and product the way commercial software does.

            I believe in this case it may be more of a case of backwards compatibility, with options being added incrementally over time to add what was needed at the moment. Though that's just my guess.

            • kristopolous 37 minutes ago
              ffmpeg doesn't go away. it's still there. people can use tig and git, having something that isn't insane can live in harmony with the other thing.
  • dllu 5 hours ago
    When converting video to gif, I always use palettegen, e.g.

        ffmpeg -i input.mp4 -filter_complex "fps=15,scale=640:-2:flags=lanczos,split[a][b];[a]palettegen=reserve_transparent=off[p];[b][p]paletteuse=dither=sierra2_4a" -loop 0 output.gif
    
    See also: this blog post from 10 years ago [1]

    [1] https://blog.pkh.me/p/21-high-quality-gif-with-ffmpeg.html

    • CrossVR 5 hours ago
      I've been thinking of integrating pngquant as an ffmpeg filter, it would make it possible to generate even better palettes. That would get ffmpeg on par with gifski.
    • dspillett 1 hour ago
      Does ffmpeg's gif processing support palette-per-frame yet? Last time I compared them (years ago, maybe not long after that blog post), this was a key benefit of gifski allowing it to get better results for the same filesize in many cases (not all, particularly small images, as the total size of the palette information can be significant).
    • crazysim 2 hours ago
      Gifski (https://gif.ski/) might be a good alternative to look to that's gif-palette aware.
    • xattt 4 hours ago
      Those command flags just roll off the tongue like two old friends catching up!

      /s

  • karmakaze 29 minutes ago
    I would definitely use an LLM, to see what the suggested options do and tweak them.

    Using a different package name could be helpful. I searched for ezff docs and found a completely different Python library. Also ez-ffmpeg turns up a Rust lib which looks great if calling from Rust.

  • HelloUsername 6 hours ago
    The one good usecase I've found for AI chatbots, is writing ffmpeg commands. You can just keep chatting with it until you have the command you need. Some of them I save as an executable .command, or in my .txt note.
    • Terr_ 5 hours ago
      As pessimistic about it as I am, I do think LLMs have a place in helping people turn their text description into formal directives. (Search terms, command-line, SQL, etc.)

      ... Provided that the user sees what's being made for them and can confirm it and (hopefully) learn the target "language."

      Tutor, not a do-for-you assistant.

      • left-struck 5 hours ago
        I agree apart from the learning part. The thing is unless you have some very specific needs where you need to use ffmpeg a lot, there’s just no need to learn this stuff. If I have to touch it once a year I have much better things to spend my time learning than ffmpeg command
        • rolfus 2 hours ago
          Agreed. I have a bunch of little command-line apps that I use 0.3 to 3 times a year* and I'm never going to memorize the commands or syntax for those. I'll be happy to remember the names of these tools, so I can actually find them on my own computer.

          * - Just a few days ago I used ImageMagick for the first time in at least three years. I downloaded it just to find that I already had it installed.

        • serial_dev 3 hours ago
          There is no universe where I would like to spend brain power on learning ffmpeg commands by heart.
          • skydhash 1 hour ago
            No one learns those. What people do is just learning the UX of the cli and the terminology (codec, opus, bitrate, sampling,…)
      • famahar 2 hours ago
        Do most devs even look at the source code for packages they install? Or the compiled machine code? I think of this as just a higher level of abstraction. Confirm it works and not worry about the details of how it works
        • d-us-vb 1 hour ago
          For the kinds of things you’d need to reach for an LLM, there’s no way to trust that it actually generated what you actually asked for. You could ask it to write a bunch of tests, but you still need to read the tests.

          It isn’t fair to say “since I don’t read the source of the libraries I install that are written by humans, I don’t need to read the output of an llm; it’s a higher level of abstraction” for two reasons:

          1. Most Libraries worth using have already been proven by being used in actual projects. If you can see that a project has lots of bug fixes, you know it’s better than raw code. Most bugs don’t show up unless code gets put through its paces.

          2. Actual humans have actual problems that they’re willing to solve to a high degree of fidelity. This is essentially saying that humans have both a massive context window and an even more massive ability to prioritize important things that are implicit. LLMs can’t prioritize like humans because they don’t have experiences.

        • skydhash 1 hour ago
          I don’t because I trust the process to get the artifacts. Why? Because it’s easy to replicate and verify. Just like how proof works in math.

          You can’t verify LLM’s output. And thus, any form of trust is faith, not rational logic.

          • ben_w 51 minutes ago
            I don't install 3rd party dependencies if I can avoid them. Why? Because although someone could have verified them, there's no guarantee that anybody actually did, and this difference has been exploited by attackers often enough to get its own name, a "supply-chain attack".

            With an LLM’s output, it is short enough that I can* put in the effort to make sure it's not obviously malicious. Then I save the output as an artefact.

            * and I do put in this effort, unless I'm deliberately experimenting with vibe coding to see what the SOTA is.

      • xattt 4 hours ago
        If you stretch it a little further, those formal directives also include language and vocabulary of a particular domain (legalese, etc…).
      • eviks 5 hours ago
        The "provided" isn't provided, of course, especially the learning part, that's not what you'd turn to AI for vs more reliable tutoring alternatives
    • Tempest1981 6 hours ago
      One that older AI struggled with was the "bounce" effect: play from 0:00 to 0:03, then backwards from 0:03 to 0:00, then repeat 5 times.
      • geysersam 2 hours ago
        Just tried it and got this, is it correct?

        > Write an ffmpeg command that implements the "bounce" effect: play from 0:00 to 0:03, then backwards from 0:03 to 0:00, then repeat 5 times.

          ffmpeg -i input.mp4 \
          -filter_complex "
          [0:v]trim=0:3,setpts=PTS-STARTPTS[f];
          [f]reverse[r];
          [f][r]concat=n=2:v=1:a=0[b];
          [b]loop=loop=4:size=150:start=0
          " \
          output.mp4
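
        A possible issue with the command above: the label `[f]` is consumed twice, and as far as I know a filtergraph link label can only feed one input, so a `split` is likely needed. A sketch with that change follows. `bounce_cmd` is a made-up helper that only prints the command, the frame-rate argument is an assumption used to size the loop buffer (3 s forward plus 3 s reversed), `loop=4` is kept from the command above, and `-an` is added because the original audio would not match the bounced video.

```shell
# Print a bounce command: 0-3 s forward, then the same 3 s reversed, then looped.
# $3 = frame rate of the input (assumption; needed to size the loop buffer).
bounce_cmd() {
  frames=$(( 2 * 3 * $3 ))   # forward + reversed frames in one bounce
  printf 'ffmpeg -i %s -filter_complex "[0:v]trim=0:3,setpts=PTS-STARTPTS,split[f][g];[g]reverse[r];[f][r]concat=n=2:v=1:a=0,loop=loop=4:size=%s:start=0" -an %s\n' "$1" "$frames" "$2"
}

bounce_cmd input.mp4 output.mp4 25
```

        At 25 fps this gives the same `size=150` as above; other frame rates need a different value.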
    • corobo 4 hours ago
      LLMs are an amazing advance in natural language parsing.

      The problem is someone decided that and the contents of Wikipedia was all something needs to be intelligent haha

      • madeofpalk 3 hours ago
        The confusion was thinking that language is the same thing as intelligence.
        • Kiro 37 minutes ago
          You and me are great examples of that. We are both extremely stupid and yet we can speak.
        • Marazan 2 hours ago
          This seems like a glib one liner but I do think it is profoundly insightful as to how some people approach thinking about LLMs.

          It is almost like there is hardwiring in our brains that makes us instinctively correlate language generation with intelligence and people cannot separate the two.

          It would be like if for the first calculators ever produced instead of responding with 8 to the input 4 + 4 = printed out "Great question! The answer to your question is 7.98" and that resulted in a slew of people proclaiming the arrival of AGI (or, more seriously, the ELIZA Effect is a thing).

    • beepbooptheory 5 hours ago
      But doesnt something like this interface kind of show the inefficiency of this? Like we can all agree ffmpeg is somewhat esoteric and LLMs are probably really great at it, but at the end of the day if you can get 90% of what you need with just some good porcelain, why waste the energy spinning up the GPU?
      • pixelpoet 4 hours ago
        Requiring the installation of a massive kraken like node.js and npm to run a commandline executable hardly screams efficiency...
        • RadiozRadioz 2 hours ago
          That's a deficiency with this particular implementation, not an inherent disadvantage to the method
      • chpatrick 4 hours ago
        Because FFmpeg is a swiss army knife with a million blades and I don't think any easy interface is really going to do the job well. It's a great LLM use case.
        • skydhash 1 hour ago
          But you only need to find the correct tool once and mark it in some way. Aka write a wrapper script, jot down some notes. You are acting like you’re forced to use the cli each time.
      • geysersam 2 hours ago
        Because getting 90% might not be good enough, and the effort you need to expend to reach 97% costs much more than the energy the GPU uses.
      • imiric 5 hours ago
        Because the porcelain is purpose built for a specific use case. If you need something outside of what its author intended, you'll need to get your hands dirty.

        And, realistically, compute and power is cheap for getting help with one-off CLI commands.

  • vithalreddy 5 hours ago
    Can't access the GitHub repo https://github.com/josharsh/ezff
  • btbuildem 11 minutes ago
    npm? Have we learned nothing from the weekly node/npm security breaches? Not putting that hot mess anywhere near my dev box, thanks.
  • gcanyon 1 hour ago
    I can only speak to my experience, but I spent a long time being puzzled by video editor user interfaces, until I ran into ScreenFlow about ten years ago. For whatever reason, the UI clicked, and I've used it ever since. It's a single purchase, not monthly, and relatively affordable. https://www.telestream.net/screenflow/overview.htm
  • ramon156 2 hours ago
    > it handles 20 common patterns ... that cover 90%

    Could you elaborate on this? I see a lot of AI-use and I'm wondering if this is claude speaking or you

  • ramigb 2 hours ago
    That's beautiful! I see a .claude folder in your code, I am curious if you've "vibecoded" the whole project or just had claude there for some tasks! not that it matters or takes away from your work but just pure curiosity as someone who enjoys betting on the LLM output XD
  • eviks 5 hours ago
    That's the problem ideally solved by typed data, i.e., some UI where instead of trying to memorize whether it's thumb/s/nails you can read the closed list of alternatives, read contextual help and pick one
    • my_brain_saying 5 hours ago
      This is why we have fish tab completions. Does exactly that; list of possible commands with contextual help. Fish rules.
      • eviks 4 hours ago
        Yeah, no, that's a pale imitation that only addresses the one specific example given. But, like, how would you even know what target formats are supported? Break the flow and look it up or simply read the drop-down list? The free type-any-text interface with poor helpers is the worst in accessibility

        Which format is the default if no argument is given?

        Or more complicated contextual knowledge - if you cut 1sec of a video file, does fish autocomplete to tell you whether the video is reencoded or cut (otherwise) losslessly

        Also, what does fish complete to on Windows?

        • skydhash 1 hour ago
          Which flow is being broken here? Especially when the information is easily accessible with `man`.
          • eviks 1 hour ago
            the flow that doesn't require you to open a different tab or cancel a command to `man` your way through dozens of poorly searchable pages of documentation, but allows you to continue translating what you want in your mind into the interface command with potentially subsecond interrupts
            • skydhash 1 hour ago
              Is there some kind of reward for speed running typing ffmpeg flags? Like an advent of ffmpeg?

              I know what I want to do, I don't know how it's being done, but there's a wealth of information that is very accessible. So I just read it.

              It's very easy to type `apropos ffmpeg`. And even if you typed `man ffmpeg`, if you go to the end, you will find related manual names for more information. And you can always use the pager (`less` in most cases) facility for quick search.

              I believe that a lot of frustration comes from people unwilling to learn the conceptual basis of the tools they are using.

              • eviks 55 minutes ago
                What's the reward for trivializing real issues and coming up with broken "solutions"?

                > It's very easy to type `apropos ffmpeg`

                No it's not. First, that's not a Windows command, so right off the bat you've cut off the largest OS. Second, your command is naively empty and it's telling that you've given it instead of an actual search query because you wouldn't be able to come up with a great one right away that would result in the correct result at the top - while the correct result is "hardcoded" in the field type in the UI. So yeah, go on, find that perfect query and then explain why you think every single user should be able to do the same quickly. Then you can think about how justified your other beliefs are about basic workflow issues you don't understand

  • petterroea 4 hours ago
    Somehow it seems ffmpeg has become the "Can it run crysis" of UX design
  • alexellisuk 1 hour ago
    This looks handy.. along with the odd gist of "convert mkv to mp4" that I have to use every other week.

    Quite telling that these tools need to exist to make ffmpeg actually usable by humans (including very experienced developers).

    • teitoklien 1 hour ago
      i figure out the niche ffmpeg commands various chain filters, etc then expose them from my python cli tool with words similar to what this gentleman above has done.

      If one has fewer such commands its as simple as just bash aliases and just adding it to ~/.bashrc

      alias convertmkvtomp4='ffmpeg command'

      then just run it anytime with just that alias phrase. i use ffmpeg a lot so i have my own dedicated cli snippet tool for me, to quickly build out complex pipeline in easier language

      the best part is i have --dry-run, which exposes the flow + explicit commands being used at each step, if i need details on whats happening and verbose output at each step
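
      a rough sketch of the shell-function version of that idea with a dry-run switch. `mkv2mp4` and the `DRY_RUN` variable are made-up names for illustration, not taken from any existing tool, and `-c copy` assumes the streams in the mkv are MP4-compatible.

```shell
# Remux an .mkv to an .mp4 next to it; with DRY_RUN=1 only print the command.
mkv2mp4() {
  out="${1%.mkv}.mp4"          # strip the .mkv suffix, add .mp4
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'ffmpeg -i %s -c copy %s\n' "$1" "$out"
  else
    ffmpeg -i "$1" -c copy "$out"
  fi
}

DRY_RUN=1 mkv2mp4 video.mkv
```

      unlike an alias, a function can put the input file name in the middle of the command and derive the output name from it.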

    • sallveburrpi 1 hour ago
      I have a text file with some common commands, so no tools needed.

      But yea ffmpeg is awesome software, one of the great oss projects imo. working with video is hellish and it makes it possible.

  • mmahemoff 6 hours ago
    Very cool idea since ffmpeg is one of those tools that has a few common tasks but most users would need to look up the syntax every time to implement them (or make an alias). In line with the ease of use motivation, you might consider supporting tab completion.
  • spullara 4 hours ago
    I have a little script that I use on the CLI to do this kind of stuff (calls an LLM to figure out how to do CLI stuff) but you can just as easily now use any of the coding agents.
  • Kwpolska 5 hours ago
    GitHub repo link returns 404.
  • broken-kebab 4 hours ago
    I like the idea, but a CLI utility dependent on Node.js is not a good thing frankly.
    • AnonC 39 minutes ago
      I agree. Apart from having to use npm (and its package repository being susceptible to security issues), I’d prefer something a lot simpler. Could’ve been a Rust program or a Go program (a single executable) that could be built locally or installed (using several different methods and offering a choice).
    • tclancy 3 hours ago
      That ship sailed some time ago.
  • vivzkestrel 1 hour ago
    I would love to see something like this for OpenSSL
  • gamer191 3 hours ago
    Thanks, will definitely check this out

    Has anyone else been avoiding typing FFmpeg commands by using file:// URLs with yt-dlp?

  • pdyc 4 hours ago
    interesting approach, i solved similar problem by creating visual tool to generate ffmpeg commands but its not the same(it cant do conversion etc.)

    I like that you took no AI approach, i am looking for something like this i.e. understanding intent and generating command without using AI but so far regex based approaches have proved to be inadequate. I also tried indexing keywords and creating index of keywords with similar meaning that improved the situation a bit but without something heavy like bert its always a subpar experience.

  • bdbdbdb 5 hours ago
    Sometimes an idea comes along that's so obvious it makes me angry. I have been struggling with ffmpeg commands for well over a decade. All the time I wasted googling and creating scripts so I wouldn't have to regoogle and this could have existed literally from day one
  • Tempest1981 6 hours ago
    I was surprised that macOS (QuickTime/Preview, iMovie) can't read .mp4 files. Not sure if it was due to H.265 or the audio codec. I tried using ffmpeg to convert to .mov but that also failed to open, since I guess MOV is just another container format.

    Is there an easier way?

    • kiicia 4 hours ago
      MP4 is a container, not a codec, so if you have an unsupported codec packed into an MP4 container it won’t be played. An example is trying to play AV1 video on devices with an M2 chip or older. It won’t play. But it will play on devices with an M3 chip and newer. The easiest solution is to use another player so that you can watch any MP4 file, but with software decoding where hardware decoding is not available. Examples of such players are MPV or VLC.
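
      To see which codecs are actually inside the container, something like the following should work. `probe_cmd` is a made-up helper that only prints the `ffprobe` invocation, and the file name is a placeholder.

```shell
# Print an ffprobe command that lists each stream's codec name and type.
probe_cmd() {
  printf 'ffprobe -v error -show_entries stream=codec_name,codec_type -of csv=p=0 %s\n' "$1"
}

probe_cmd in.mp4
```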
      • Tempest1981 9 minutes ago
        Yes, VLC works fine for playing. The user wanted to edit some mp4 videos with iMovie (vs ffmpeg).

        I think it was an M4 Mac. Does iMovie need a codec pack? I know some PC OEMs don't ship an h.265 codec, pointing users to a $0.99 download. Thought Mac would include it, being promoted as for content creators.

    • felixfoertsch 5 hours ago
      IMHO the de-facto video player for macOS is [IINA](https://iina.io/).
      • trvz 4 hours ago
        That exists, but it’s still VLC.
        • wging 4 hours ago
          It's based on mpv, not vlc.
    • andrewf 2 hours ago
      Try something like: ffmpeg -i in.mp4 -c:v h264 -c:a aac out.mp4

      To re-encode the content into H.264+AAC, rather than simply "muxing" the encoded bitstreams from the MP4 container into a new MOV container.

    • codegladiator 6 hours ago
      vlc
  • Joyfield 3 hours ago
    Uhm... Millibit, Millibyte, Megabit, Megabyte?
  • maximgeorge 3 hours ago
    [dead]
  • Kcnfjhggjbh 1 hour ago
    Nnjdjfuvugnguh