One underappreciated thing Docker changed: it made the gap between "works on my machine" and "works in production" observable. Before containers, debugging a production issue often meant reconstructing an environment from memory.
The next gap that still feels unsolved: knowing whether what's running in that container is actually behaving correctly, not just running. Uptime checks tell you the process is alive; they don't tell you the API is returning what it should.
I've seen countless attempts to replace "docker build" and the Dockerfile. They often want to give tighter control over the build, sometimes binding it tightly to a package manager. But the Dockerfile has continued because of its flexibility. Starting from a known filesystem/distribution, copying some files in, and then running arbitrary commands within that filesystem mirrors so nicely what operations teams have been doing for a long time. And as ugly as that flexibility is, I think it will remain the dominant solution for quite a while longer.
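That pattern is easy to see in even the most minimal Dockerfile (the base image, paths, and packages here are purely illustrative):

```dockerfile
# Start from a known filesystem/distribution...
FROM debian:12
# ...copy some files in...
COPY app/ /opt/app/
# ...then run arbitrary commands within that filesystem.
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 && \
    rm -rf /var/lib/apt/lists/*
CMD ["python3", "/opt/app/main.py"]
```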
> But the Dockerfile has continued because of its flexibility.
The flip side is that the world still hasn’t settled on a language-neutral build tool that works for all languages. Therefore we resort to running arbitrary commands to invoke language-specific package managers. In an alternate timeline where everyone uses Nix or Bazel or some such, docker build would be laughed out of the room.
As a Nix evangelist, I have to say: Nix is really not capable of replacing language-specific package managers.
> running arbitrary commands to invoke language-specific package managers.
This is exactly what we do in Nix. You see this everywhere in nixpkgs.
What sets Nix apart from docker is not that it works well at a finer granularity, i.e. source-file-level, but that it has real hermeticity and thus reliable caching. That is, we also run arbitrary commands, but they don't get to talk to the internet and thus don't get to e.g. `apt update`.
In a Dockerfile, you can `apt update` all you want, and this makes the build layer cache a very leaky abstraction. This is merely an annoyance when working on an individual container build but would be a complete dealbreaker at linux-distro-scale, which is what Nix operates at.
Fundamentally speaking, the key point is really just hermeticity and reliable caching. Running arbitrary commands was never the problem. What makes gcc a blessed command but the compiler for my own language an "arbitrary" command, anyway?
And in languages with insufficient abstraction power like C and Go, you often need to invoke a code generation tool to generate the sources; that's an extremely arbitrary command. These are just non-problems if you have hermetic builds and reliable caching.
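To make the leaky-cache point above concrete: the cache key for a RUN step is essentially the instruction text plus the parent layer, not anything the network returned, so a line like this is a sketch of the problem rather than of a fix:

```dockerfile
# Cached by the literal command string, not by the apt metadata it fetched.
# Rebuild next month and this layer may be silently reused with last month's
# package list -- or, after an unrelated cache bust, silently pull newer
# versions. The output is not a pure function of the Dockerfile.
RUN apt-get update && apt-get install -y curl ca-certificates
```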
Well, arbitrary granularity is possible with Nix, but today's build systems simply do not utilise it. For example, I've written an experimental C build system for Nix which handles all compiler orchestration, and it works great: you get minimal recompilations and free distributed builds. It would be awesome if something like this were actually available for major languages (Rust?). Let me know if you're working on or have seen anything like this!
On my nixos-rebuild, building a simple config file in /etc takes much longer than a typical gcc invocation to compile a C file. I suspect that is due to something in Nix's Linux sandbox setup being slow, or at least I remember some issue discussions around that; I think the worst part of that got improved, but it's still quite slow today.
Because of that, it's much faster to do N build steps inside 1 nix build sandbox, than the other way around.
Another issue is that some programming languages have build systems that are better than the "oneshot" compilation used by most languages (one compiler invocation per file producing one object file, e.g. `gcc -c x.c -o x.o`). For example, Haskell has `ghc --make`, which compiles the whole project in one compiler invocation, with very smart recompilation avoidance (per-function dependency tracking, comment changes don't affect compilation, etc.), avoidance of repeated work (e.g. parsing/deserialising the inputs to a module's compilation only once and keeping them in memory), and avoidance of repeated compiler startup cost.
Combining that with per-file general-purpose hermetic build systems is difficult and currently not implemented anywhere as far as I can tell.
To get something similar with Nix, the language-specific build system would have to invoke Nix in a very fine-grained way, e.g. to get "avoidance of codegen if only a comment changed", Nix would have to be invoked at each of the parser/desugar/codegen parts of the compiler.
I guess a solution to that is to make the oneshot mode much faster by better serialisation caching.
Reminds me of the “Electric cars in reverse” video where the guy envisions a world where all vehicles are electric and tries to make the argument for gas engines.
There are some hurdles preventing that flow from achieving reproducible builds. As the bad guys get more sophisticated, it's going to become more and more important that one party can say "we trust this build hash" and a separate party can say "us too".
That's not going to work if both parties get different hashes when they build the image, which won't happen as long as file modification timestamps (and other such hazards) are part of what gets hashed.
Recent versions of buildkit have added support for SOURCE_DATE_EPOCH. Before that, I'd been making images reproducible with my own tooling, using regctl image mod [1] to backdate the timestamps.
It's not just the timestamps you need to worry about. Tar needs to be consistent about storing uid vs username, gzip compression depends on the implementation and settings, and the JSON encoding can vary by implementation.
And all this assumes the commands being run are reproducible themselves. One issue I encountered there was how alpine tracks their package install state from apk, which is a tar file that includes timestamps. There are also timestamps in logs. Not to mention installing packages needs to pin those package versions.
All of this is hard, and the Dockerfile didn't make it easy, but it is possible. With the right tools installed, reproducing my own images has a documented process [2].
[1]: https://regclient.org/cli/regctl/image/mod/
[2]: https://regclient.org/install/#reproducible-builds
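For anyone wanting to try the BuildKit route mentioned above, a hedged sketch (flag spellings are per recent BuildKit/buildx docs, so check your versions; this only addresses timestamps, not the tar/gzip/JSON variance):

```sh
# Clamp file timestamps in the produced layers to a fixed epoch. The env var is
# picked up by newer buildx (assumption); the explicit build-arg covers older ones.
SOURCE_DATE_EPOCH=0 docker buildx build \
  --no-cache \
  --build-arg SOURCE_DATE_EPOCH=0 \
  --output type=image,name=registry.example.com/app:repro,rewrite-timestamp=true \
  .
```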
I’m more concerned about sources being poisoned than about the build process. Xz is a great example of this.
Both are needed, but you get more bang for your buck focusing on build security than on audited sources. If the build is solid then it forces attackers to work in the open, where all auditors can work together towards spoiling the attack.
If you flip it around and instead have magically audited source but a shaky build, then perhaps a diligent user can protect themselves, but they do so by doing their own builds, which means they're unaware that the attack even exists. This allows the attacker to just keep trying until they compromise somebody who is less diligent.
Getting caught requires a user who analyses downloaded binaries in something like Ghidra... and who does that when it's much easier to just build from source instead? (Answer: vanishingly few people.) And even once the attacker is found out, they can just hide the same payload a bit differently, and the scanners will once again stop finding it.
Also, "maybe the code itself is malicious" can only ever be solved the hard way, whereas we have a reasonable hope of someday providing an easy solution to the "maybe the build is malicious" problem.
The lack of docker registry-like solutions really does seem to be the chokepoint for many alternatives.
Personally I love using mkosi, and while it has all the composability and deployment options I'd care for, it's clear not everyone wants to build starting only with a blank set of OS templates.
Does Nix do one layer per dependency? Does it run into >=128 layers issues?
In Spack [1] we do one layer per package; it's appealing, but I never checked whether, the layer limit aside, it's actually bad for performance when doing filesystem operations.
[1] https://spack.readthedocs.io/en/latest/containers.html
tl;dr it will put one package per layer as much as possible, and compress everything else into the final layer. It uses the dependency graph to implement a reasonable heuristic for what is fine-grained and what gets combined.
That layering algorithm is also configurable, though I couldn’t really understand how to configure it and just wrote my own post processing to optimize layering for my internal use case. I believe I can open source this w/o much work.
The layer layout is just a json file so it can be post processed w/o issue before passing to the nix docker builders
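For reference, roughly what driving that layering looks like from nixpkgs (a sketch; the attribute names come from dockerTools, while the package and maxLayers value are illustrative):

```sh
# Build a layered image: one store path per layer up to maxLayers, the rest
# folded into a final layer, then load it into the local docker daemon.
nix-build -E 'with import <nixpkgs> {}; dockerTools.buildLayeredImage {
  name = "hello-layered";
  tag = "latest";
  maxLayers = 100;        # stay under the historical ~127-layer limits
  contents = [ hello ];
}' && docker load < result
```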
Yes but then you're committed to using Nix which doesn't work so well the moment you need some software not packaged by Nix.
Want to throw a requirements.txt in there? No no, why would you even ask that? Meanwhile docker says yeah sure just run pip install, why should I care?
Then you're committing to maintaining a package for that software.
Like all LLM boosters, you've ignored the fact that the largest time sink in many kinds of software is not initial development, but perpetual maintenance.
This. I wouldn't have touched Nix when you needed someone who was really good at Nix to keep it working, but agents make it viable to use in a number of places.
I don't know if I'd say it's "easy". The Python ecosystem in particular is quite hard to get working in a hermetic way (Nix or otherwise). Multiple attempts at getting Python easy to package with Nix have come and gone over the years.
Nix doesn't make sense if all you're going to use it for is building Docker images. It only makes sense if you're all in in the first place. Then Docker images are free.
I'm not sure if this is what you mean but in some ways it would be nice to have tighter coupling with a registry. Docker build is kind of like a multiplexer - pull from here or there and build locally, then tag and push somewhere else. Most of the time all pulls are from public registries, push to a single private one and the local image is never used at all.
It seems overly orthogonal for the typical use case but perhaps just not enough of an annoyance for anyone to change it.
You make an excellent point about Dockerfile's flexibility being its enduring strength. In my experience, this flexibility is especially valuable in microservices environments where different services might need vastly different runtime environments - from Java Spring apps to Node.js services to Python data processing scripts.
The "copy files and run commands" approach you mention is indeed similar to traditional ops workflows, which I think is why it has such staying power. It's conceptually simple and maps well to how developers already think about building and deploying applications.
> the Dockerfile has continued because of its flexibility
I wish we had standardized on something other than shell commands, though. Puppet or terraform or something more declarative would have been such a better alternative to “everyone cargo cults ‘RUN apt-get upgrade’ onto the top of their dockerfiles”.
Like, the layer/stage/caching behavior is fine. I just wish the actual execution parts had been standardized using something at a higher level of abstraction than shell.
> Puppet or terraform or something more declarative would have been such a better alternative
Until you need to do something that isn't covered with its DSL, and you extend it with an external command execution declaration... At which point people will just write bash scripts anyway and use your declarative language as a glorified exec.
If you have 90-95% of everyone's needs (installing packages, compiling, putting files in place) covered in your DSL, and it has strong consistency and declarativeness, it's not that big of a problem if you need an escape hatch from time to time. Terraform, Puppet, Ansible, and SaltStack show this pretty well, and the vast majority of what isn't a bash escape hatch in them is better and more maintainable than its equivalent in pure bash would be.
The problem is, ironically, that each DSL has its own execution platform, and is not designed for testability. Bash scripts may be hard to maintain, but at least you can write tests for them.
In Azure YAML I had an odd bug because I used succeeded() instead of not(failed()) as a condition. I had no way of testing the pipeline without executing it. And each DSL has its own special set of sharp edges.
However, Dockerfiles are so popular because they run shell commands and permit 'socially' extending someone else's shell commands; tacking commands onto the end of someone else's shell script is a natural process. /bin/sh is unreasonably effective at doing anything you need to a filesystem, and if the shell exposes a feature, it has probably been used in a Dockerfile somewhere.
Every other solution, especially the declarative ones, tends to come up short when _layering_ images quickly and easily. However, I agree they're good if you control the entire declarative spec.
I'd say LLB is the "standard", Dockerfile is just one of human-friendly frontends, but you can always make one yourself or use an alternative. For example, Dagger uses BuildKit directly for building its containers instead of going through a Dockerfile.
Dockerfile has the flexibility to do what you want though, no? Use a base image with terraform or puppet or opentofu or whatever pre-installed, then your Dockerfile can just run the right command to apply some declarative config file from the build context.
And if you want something weird that's not supported by your particular tool of choice, you have the escape hatch of running arbitrary commands in the Dockerfile.
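e.g. something along these lines (purely a sketch; the base image, playbook name, and package choice are illustrative):

```dockerfile
FROM debian:12
RUN apt-get update && apt-get install -y --no-install-recommends ansible
COPY image-setup.yml /tmp/image-setup.yml
# Apply a declarative description of packages/files/users to "localhost",
# i.e. to the build container's own filesystem, instead of hand-written RUNs.
RUN ansible-playbook --connection=local -i localhost, /tmp/image-setup.yml
```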
Give https://github.com/project-dalec/dalec a look.
It is more declarative. Has explicit abstractions for packages, caching, language level integrations, hermetic builds, source packages, system packages, and minimal containers.
It's a BuildKit frontend, so you still use "docker build".
Oof, not terraform please. If you use foreach and friends, dependency calculations are broken, because dependency resolution happens before dynamic rules are processed.
I'd get much better results if I used something else to do the foreach and gave terraform only static rules.
Bash is pretty darn abstracted from the OS, though. Puppet vs Bash is more about abstraction relative to the goal.
If your dockerfile says “ensure package X is installed at version Y”, that’s a lot clearer (and also easier to make performant/cached and deterministic) than “apt-get update; apt-get install $transitive-at-specific-version; apt-get install $the-thing-you-need-at-specific-version”. I’m not thrilled at how distro-locked the shell version makes you, and how easy it is for accidental transitive changes to occur too.
But neither of those approaches is at a particularly low abstraction level relative to the OS itself; files and system calls are more or less hidden away in both package-manager-via-bash and puppet/terraform/whatever.
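To put the pinning point above in concrete terms, the closest plain shell gets to "ensure package X at version Y" is something like this, which is still imperative and distro-locked (the package name and version are made up):

```sh
apt-get update
# Pin the exact version you tested against, instead of whatever is newest today...
apt-get install -y --no-install-recommends libfoo1=1.2.3-1
# ...and keep later 'apt-get upgrade' runs from silently moving it.
apt-mark hold libfoo1
```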
>You can pretty much replace "docker build" with "go build".
I'll tell that to my CI runner, how easy is it for Go to download the Android SDK and to run Gradle? Can I also `go sonarqube` and `go run-my-pullrequest-verifications` ? Or are you also going to tell me that I can replace that with a shitty set of github actions ?
I'll also tell Microsoft they should update the C# definition to mark it down as a scripting language. And to actually give up on the whole language, why would they do anything when they could tell every developer to write if err != nil instead
Just because you have an extremely narrow view of the field doesn't mean it's the only thing that matters.
Go is just one language, while Dockerfile gives you access to the whole universe with myriads of tools and options from early 1970s and up to the future. I don't know how you can compare or even "replace" Docker with Go; they belong to different categories.
In some situations, yes, others no. For instance if you want to control memory or cpu using a container makes sense (unless you want to use cgroups directly). Also if running Kubernetes a container is needed.
You have to differentiate container images, and "runtime" containers. You can have the former without the latter, and vice versa. They are entirely orthogonal things.
E.g. systemd exposes a lot of resource control as well as sandboxing options, to the point that I would argue that systemd services can be very similar to "traditional" runtime containers, without any image involved.
The math of “a decade” seemed wrong to me, since I remembered Docker debuting in 2013 at PyCon US Santa Clara.
Then I found an HN comment I wrote a few years ago that confirmed this:
“[...] I remember that day pretty clearly because in the same lightning talk session, Solomon Hykes introduced the Python community to docker, while still working on dotCloud. This is what I think might have been the earliest public and recorded tech talk on the subject:”
Just being pedantic though. That’s about 13 years ago. The lightning talk is fun as a bit of computing history.
(Edit: as I was digging through the paper, they do cite this YouTube presentation, or a copy of it anyway, in the footnotes. And they refer to a 2013 release. Perhaps there was a multi-year delay between the paper being submitted to ACM with this title and it being published. Again, just being pedantic!)
From another comment below, it's just a nice short title to convey that we're going back in time and not one to set your watch by.
We first submitted the article to the CACM a while ago. The review process takes some time and "Twelve years of Docker containers" didn't have quite the same vibe.
(The CACM reviewers helped improve our article quite a bit. The time spent there was worth it!)
You’re right, it was 2014. I was there on HN when docker was announced by shykes. It was a godsend because I was getting bummed by the alternatives like LXC, juju charms or vagrant.
I still prefer LXC to docker. Improving libvirt and making virtualization a first-class OS feature with a library interface - vs. relying on an external tool and company interested in monetization - was and is the right approach.
> Docker repurposed SLIRP, a 1990s dial-up tool originally for Palm Pilots, to avoid triggering corporate firewall restrictions by translating container network traffic through host system calls instead of network bridging.
Until recently, Podman used slirp4netns [1] for its container networking. About two years ago, they switched over to Pasta [2][3], which works quite a bit differently.
I don't think SLIRP was originally for palm pilots, given it was released two years before.
SLIRP was useful when you had a dial-up shell and they wouldn't give you SLIP or PPP, or it would cost extra. SLIRP is just a userspace program that uses the socket APIs, so as long as you could run your own programs and make connections to arbitrary destinations, you could make a dial script to connect your computer up like you had a real PPP account. No incoming connections though (afaik), so you weren't really a peer on the internet; a foreshadowing of ubiquitous NAT/CGNAT perhaps.
> I don't think SLIRP was originally for palm pilots, given it was released two years before.
That's a mistake indeed; "popularised by" might have been better. Before my beloved PalmPilot arrived one Christmas, I was only using SLIRP to ninja Netscape and MUD sessions onto a dialup connection, which wasn't a very mainstream use.
Repurposing a Palm Pilot dial-up tool to sneak container traffic past enterprise firewalls is unhinged, and yet it worked. The best infrastructure hacks are never clever in the moment; they are just desperate. The cleverness only shows up after someone else has to maintain it.
VPNKit (the SLIRP component) has been remarkably bug free over the years, and hasn't been much of a burden overall.
There was another component that we didn't have room to cover in the article that has been very stable (for filesystem sharing between the container and the host) that has been endlessly criticised for being slow, but has never corrupted anyone's data! It's interesting that many users preferred potential-dataloss-but-speed using asynchronous IO, but only on desktop environments. I think Docker did the right thing by erring on the side of safety by default.
Except it's plain NAT, which was named 'bridge' because there were no sysadmins around to slap some sense into the authors. Slirp is for 'unprivileged network namespaces', which is for the 'rootless' variants of docker, i.e. attaching a container network to the host without the need for root-level privileges.
I've not done serious networking stuff for over two decades, and never in as complex an environment as that in the article, so the networking part of the article went pretty much over my head.
What I want to do when running a Docker container on Mac is to be able to have the container have an IP address separate from the Mac's IP address that applications on the Mac see. No port mapping: if the container has a web server on port 80 I want to access it at container_ip:80, not 127.0.0.1:2000 or something that gets mapped to container port 80.
On Linux I'd just used Docker bridged networking and I believe that would work, but on Mac that just bridges to the Linux VM running under the hypervisor rather than to the Mac.
Is there some officially recommended and supported way to do this?
For a while I did it by running WireGuard on the Linux VM to tunnel between that and the Mac, with forwarding enabled on the Linux VM [1]. That worked great for quite a while, but then stopped and I could not figure out why. Then it worked again. Then it stopped.
I then switched to this [2] which also uses WireGuard but in a much more automated fashion. It worked for quite a while, but also then had some problems with Docker updates sometimes breaking it.
It would be great if Docker on Mac came with something like this built in.
(co-author of the article and Docker engineer here) I think WireGuard is a good foundation to build this kind of feature. Perhaps try the Tailscale extension for Docker Desktop which should take care of all the setup for you, see https://hub.docker.com/extensions/tailscale/docker-extension
BTW are you trying to avoid port mapping because ports are dynamic and not known in advance? If so you could try running the container with --net=host and in Docker Desktop Settings navigate to Resources / Network and Enable Host Networking. This will automatically set up tunnels when applications listen on a port in the container.
I'm basically using Docker on Mac as an alternative to VMWare Fusion, with a much faster startup time and more flexible directory sharing.
I want to avoid port mapping because I already have things on the Mac using the ports that my things in the container are using.
I have a test environment that can run in a VM, container, or an actual machine like an RPi. It has copies of most of our live systems, with customer data removed. It is designed so that as much as possible things inside it run with the exact same configuration they do live. The web sites in it are on ports 80 and 443, MySQL/MariaDB is on 3306, and so on. Similarly, when I'm working on something that needs to access those services from outside the test system I want to as much as possible use the same configuration they will use when live, so they want to connect to those same port numbers.
Thus I need the test environment to have its own IP that the Mac can reach.
Or maybe not...I just remembered something from long ago. I wanted a simpler way to access things inside the firewall at work than using whatever crappy VPN we had, so I made a poor man's VPN with ssh. If I needed to access things on say port 80 and 3306 on host foo at work, I'd ssh to some machine I could reach inside the firewall at work, setting that up to forward say local 10080 and 13306 to foo:80 and foo:3306. I'd add an /etc/hosts entry for foo giving it some unused address like 10.10.10.1. Then I'd use ipfw to set it up so that any attempt to connect to 10.10.10.1:80 or 10.10.10.1:3306 would get forwarded to 127.0.0.1:10080 or 127.0.0.1:13306, respectively. That worked great until Apple replaced ipfw with something else. By then we had a decent VPN for work, so I no longer needed my poor man's VPN and didn't look into how to do this in whatever replaced ipfw.
Learning how to do that in whatever Apple now uses might be a nice approach. I'll have to look into that.
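For what it's worth, the ipfw replacement on modern macOS is pf. A hedged sketch of that flow today (hosts, ports, and addresses are illustrative; the pf rules are from memory of pf.conf(5) and worth double-checking):

```sh
# Tunnel the interesting ports through a host you can ssh to behind the firewall.
ssh -N -L 10080:foo:80 -L 13306:foo:3306 jump.example.com &

# Give the fake address somewhere to live locally...
sudo ifconfig lo0 alias 10.10.10.1/32

# ...then redirect the "real" ports on it to the ssh tunnels (pf.conf fragment):
#   rdr pass on lo0 inet proto tcp from any to 10.10.10.1 port 80   -> 127.0.0.1 port 10080
#   rdr pass on lo0 inet proto tcp from any to 10.10.10.1 port 3306 -> 127.0.0.1 port 13306
# loaded with something like: sudo pfctl -f /etc/pf.anchors/poor-mans-vpn -e
```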
I don't have a Mac environment, but I have researched this a bit for devex purposes, and I would go with the Colima project as an open source solution for containers on Mac. Have you tried it?
A full decade since we took the 'it works on my machine' excuse and turned it into the industry standard architecture ('then we'll just ship your machine to production').
Well, before Docker I used to work on Xen and that possible future of massive block devices assembled using Vagrant and Packer has thankfully been avoided...
One thing that's hard to capture in the article -- but that permeated the early Dockercons -- is the (positive) disruption Docker had in how IT shops were run. Before that going to production was a giant effort, and 'shipping your filesystem' quickly was such a change in how people approached their work. We had so many people come up to us grateful that they could suddenly build services more quickly and get them into the hands of users without having to seek permission slips signed in triplicate.
We're seeing another seismic cultural shift now with coding agents, but I think Docker had a similar impact back then, and it was a really fun community spirit. Less so today with the giant hyperscalers all dominating, sadly, but I'll keep my fond memories :-)
And those lightweight VM base images are possible because Docker applied a downward pressure on OS base image sizes! Alpine Linux doesn't get enough credit for this; in addition to being a great base image, it was also the first distro to prioritise fast and small image creation (Gentoo and Arch were small, but not fast to create).
It's not as easy; a block device has to be bootable and so usually bundles a kernel (large). And because the filesystem inside is opaque, you can't do layering like Docker does easily via overlayfs and friends. libguestfs does a heroic job of making VM images easier to manipulate from code, but it's an uphill battle...
Great point about coding agents! Back then, Docker gave us 'it works on my machine, let's ship the machine'. Now, AI agents are giving us 'I have no idea how this works, let's ship the prompt'. The early Docker community spirit really was legendary though—before every hyperscaler wrapped it in 7 layers of proprietary managed services. Thanks for the memories and the write-up!
Thanks for the kind words! I've been prodding @justincormack to resurrect the single most fun OS unconference I've ever attended -- New Directions in Operating Systems (last held back in 2014). https://operatingsystems.io
Some of those talks strangely make more sense today (e.g. Rump Kernels or unikernels + coding agents seems like a really good combination, as the agent could search all the way through the kernel layers as well).
Exactly my feeling. Docker is "works on this machine" with an executable recipe to build the machine and the application. Newer better solutions like OCI-compliant tools will gradually replace Docker, but the paradigm shift has provided a lot of lasting value.
Yeah docker codifies what the process to convert a base linux distro in to a working platform for the app actually is. Every company I've worked at that didn't use docker just has this tribal knowledge or an outdated wiki page on the steps you need to take to get something to work. Vs a dockerfile that exactly documents the process.
> I don't care about glibc or compatibility with /etc/nsswitch.conf.
So what do you do when you need to resolve system users? I sure hope you don't parse /etc/passwd, since plenty of users (me included) use other user databases (e.g. sssd or systemd-userdbd).
The real trick was making "ship your machine" sound like best practice. Ten years later we're doing the same thing with AI: "it works in my notebook" just became "containerize the notebook and call it a pipeline". The abstraction always wins because fixing the actual problem is just too hard.
I think it’s laziness, not difficulty. That’s not meant to be snide or glib: I think gaining expertise in how to package and deploy non-containerized applications isn’t difficult or unattainable for most engineers; rather, it’s tedious and specialized work to gain that expertise, and Docker allowed much of the field to skip doing it.
That’s not good or bad per se, but I do think it’s different from “pre-container deployment was hard”. Pre-container deployment was neglected and not widely recognized as a specialty that needed to be cultivated, so most shops sucked at it. That’s not the same as “hard”.
It's not even laziness or expertise. A lot of people are against learning conventions. They want their way, meaning what works on their computer. That's why they like the current scope of package managers, docker, flatpack,... They can do what they want in the sandbox provided however nonsensical and then ship the whole thing. And it will break if you look at it the wrong way.
I mean, walking through a door is easier than tearing down a wall, walking through it, and rebuilding the wall. That doesn't mean the latter is a good idea.
Those are global to the machine; generally not an issue and seccomp rules can filter out undesirable syscalls to other containers. But GPU kernel/userspace driver matching has been a huge headache; see https://cacm.acm.org/research/a-decade-of-docker-containers/... in the article for how the CDI is (sort of) helping standardise this.
Oh, thank you... I'm not alone... I'm so tired of seeing crappy containers with pseudo service management handled by Dockerfiles, used instead of proper and serious packaging like that of many venerable Linux distributions.
Yeah, but if the problem you are solving is rare for most practitioners, effectively theoretical until it actually happens, then people won't switch until they get bit by that particular problem.
But they’re roughly the same paradigm as docker, right? My understanding of the Nix approach is that it’s still reproducing most of a user land/filesystem in a captive/separate/sandbox environment. Like, docker is using namespaces for more stuff, Nix has a heavier emphasis on reproducibility/determinism, but … they’re both still throwing in the towel on deploying directly on the underlying OS’s userland (unless you go all the way to nixOS) and shipping what amounts to a filesystem in a box, no?
I daily drive NixOS. I don't have a global "userland". Packages are shipped from upstream and pull in the dependencies they need to function.
That means unlike Gentoo, I've never dealt with a "slot conflict" where two packages want conflicting dependencies. And unlike Ubuntu, I have new versions of everything.
Pick 2: share dependencies, be on the bleeding edge, or waste your time resolving conflicts.
Yeah, nix is great for this. Also I can update infrequently and still package anything I want on the bleeding edge without any big issues, other than maybe some building from source.
> But they’re roughly the same paradigm as docker, right?
Absolutely not. Nix and Guix are package managers that (very simplified) model the build process of software as pure functions mapping dependencies and source code as inputs to a resulting build as their output. Docker is something entirely different.
> they’re both still throwing in the towel on deploying directly on the underlying OS’s userland
The existence of an underlying OS userland _is_ the disaster. You can't build a robust package management system on a shaky foundation, if nix or guix were to use anything from the host OS their packaging model would fundamentally break.
> unless you go all the way to nixOS
NixOS does not have a "traditional/standard/global" OS userland on which anything could be deployed (excluding /bin/sh for simplicity). A package installed with nix on NixOS is identical to the same package being installed on a non-NixOS system (modulo system architecture).
> shipping what amounts to a filesystem in a box
No. Docker ships a "filesystem in a box", i.e. an opaque blob, an image. Nix and Guix ship the package definitions from which they derive what they need to have populated in their respective stores, and either build those required packages or download pre-built ones from somewhere else, depending on configuration and availability.
With docker two independent images share nothing, except maybe some base layer, if they happen to use the same one. With nix or Guix, packages automatically share their dependencies iff it is the same dependency. The thing is: if one package depends on lib foo compiled with -O2 and the other one depends on lib foo compiled with -O3, then those are two different dependencies. This nuance is something that only the nix model started to capture at all.
> This nuance is something that only the nix model started to capture at all.
Unpopular opinion, loosely held: the whole attempt to share any dependencies at all is the source of evil.
If you imagine the absolute worst case scenario that every program shipped all of its dependencies and nothing was shared, then the end result would be… a few gigabytes of duplicated data? Which could plausibly be deduped at the filesystem level rather than at the build or deployment layer?
Feels like a big waste of time. Maybe it mattered in the 70s. But that was a long, long time ago.
Node.js basically tried this — every package gets its own copy of every dependency in node_modules. Worked great until you had 400MB of duplicated lodash copies
and the memes started.
pnpm fixed it exactly the way you describe though: content-addressable store with hardlinks. Every package version exists once on disk, projects just link to it.
So the "dedup at filesystem level" approach does work, it just took the ecosystem a decade of pain to get there.
It used to be, but only in cases where your distro doesn't just package whatever software you require. Nowadays I prefer Flatpak or AppImage over crappy custom Windows installers for those cases. They allow for sandboxing and reliable updating/deinstallation.
These days, I equate anything that ships via docker/flatpak first as built by someone who only cares about their own computer, especially if the project is open source. As soon as a library or a tool updates, they usually rush to add a hard requirement on it for no reason other than to be on the "bleeding edge".
We've given up on native Windows containers in OCaml after trying to use them for our CI builds for many years. See https://www.tunbury.org/2026/02/19/obuilder-hcs/ for our recent switch to HCS instead. Compared to Linux containers, they're very much a second-class citizen in the Microsoft worldview of Docker.
This is because your team doesn’t know how to ship software without using containers.
If you have adopted a bad tool then people are likely to want the bad tool in more places. This is the opposite of a virtuous cycle and is a horrible form of tech debt.
With ML and AI now being pushed into everything, images have ballooned in size. Just having torch as a dependency adds multiple gigabytes. I miss the times of aiming for 30MB images.
Have others found this to be the case? Perhaps we're doing something wrong.
I’ve seen images that accidentally install tensorflow twice, too. It wouldn’t be so bad if large files were shared between layers but they aren’t. It’s bad enough that I’m building an alternative registry and snapshotter with file level dedupe to deal with it.
Sounds like it would be useful. Many common dev workflows started falling apart when it's not just tiny code files they need to deal with. In the python world, uv has helped massively, with pip we were seeing 30+ min build times on fairly simple images with torch
The historic information in here was really interesting, and a great example of an article rapidly expanding in scope and detail. How they combatted corporate IT “security” software by pretending to be a VPN is quite unexpected.
The discussion about Dockerfile vs Nix is fascinating. Having dealt with both in production environments, I think Docker's "ugly flexibility" mentioned by bmitch3020 is precisely why it won.
The barrier to entry matters enormously for adoption. A Dockerfile that starts with `FROM ubuntu:22.04` and runs shell commands is immediately understandable to any ops person, even if they've never used Docker before. Meanwhile Nix, despite its superior hermetic builds, requires learning an entirely new paradigm.
That said, the reproducible builds issue __MatrixMan__ raised is becoming critical. I've seen subtle supply chain attacks that exploit the non-deterministic nature of traditional builds. The SOURCE_DATE_EPOCH support in buildkit is a good step, but we need more tooling that makes reproducibility the default, not an afterthought.
I'm optimistic we will succeed in efforts to simplify Linux application/dependency compatibility instead of relying on abstractions that just work around it.
Maybe if you only look at it through the lens of building an app/service, but containers offer so much more than that. By standardizing their delivery through registries and management through runtimes, a lot of operational headaches just go away when using a container orchestrator. Not to mention better utilization of hardware since containers are more lightweight than VMs.
Hah indeed that's my perspective. I'm used to being able to compile program, distribute executable, "just works", across win, Linux, MacOs. (With appropriate compile targets set)
When compared to a VM, yes. But shipping a separate userspace for each small app is still bloat. You can reuse software packages and runtime environments across apps. From an I/O, storage, and memory utilization point of view, it feels baffling to me that containers are so popular.
"bloat" has always been the last resort criticism from someone who has nothing valid. Containers are incredibly light, start very rapidly, and have such low overhead in general that the entire industry has been using them.
Docker containers also do reuse shared components, layers that are shared between containers are not redownloaded. The stuff that's unique at the bottom is basically just going to be the app you want to run.
I was referring to the userspace runtime stack, not the kernel. What I criticize is that multiple containers that share a single host usually overdo it with filesystem isolation. Hundreds of MBs of libraries and tools needlessly duplicated, even though they could just as well have used distro packages and deployed their apps as system-level packages and systemd unit files with `DynamicUser=`.
You can hardly call this efficient hardware utilization.
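For the curious, the kind of unit meant here is a sketch like this (the directive names are real systemd options; the app itself is made up, and the heredoc is just for illustration):

```sh
cat > /etc/systemd/system/myapp.service <<'EOF'
[Unit]
Description=myapp, packaged at the system level instead of in an image

[Service]
ExecStart=/usr/bin/myapp
DynamicUser=yes         # ephemeral uid/gid, no user bookkeeping
ProtectSystem=strict    # read-only OS for the service
StateDirectory=myapp    # private, writable /var/lib/myapp

[Install]
WantedBy=multi-user.target
EOF
```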
The duplication is a necessity to achieve the isolation. Having shared deps and hordes of unit files for a multi-tenant system is hell - versioning issues can and will break this paradigm; no serious shop is doing this.
For running your own machine, sure. But this would become unmaintainable for a sufficiently multi-tenant system. Nix is the only thing that really can begin to solve this outside of container orchestration.
I've recently switched from docker compose to process compose and it's super nice not to have to map ports or mount volumes. What I actually needed from docker had to do less with containers and more with images, and nix solves that problem better without getting in the way at runtime.
Assuming I've found the right process-compose [1], it struck me as having much overlap with the features of systemd. Or at least, I would tend to reach for systemd if I wanted something to run arbitrary processes. Is there something additional/better that process-compose does for you?
What process-compose gives me is a single parent with all of that project's processes as children, and a nice TUI/CLI for scrolling through them to see who is happy/unhappy and interrogating their logs, and when I shut it down all of that project's dependencies shut down. Pretty much the same flow as docker-compose.
It's all self-contained so I can run it on MacOS and it'll behave just the same as on Linux (I don't think systemd does this, could be wrong), and without requiring me to solve the docker/podman/rancher/orbstack problem (these are dependencies that are hard to bundle in nix, so while everything else comes for free, they come at the cost of complicating my readme with a bunch of requests that the user set things up beforehand).
As a bonus, since it's a single parent process, if I decide to invoke it through libfaketime, the fake time is inherited by every subprocess, so it's consistently faked in the database, in the services, and in observability tools...
My feeling for systemd is that it's more for system-level stuff and less for project-level dependencies. Like, if I have separate projects which need different versions of postgres, systemd commands aren't going to give me a natural way to keep track of which project's postgres I'm talking about. process-compose, however, will show me logs for the correct postgres (or whatever service) in these cases:
~/src/projA$ process-compose process logs postgres
~/src/projB$ process-compose process logs postgres
This is especially helpful because AI agents tend to be scoped to working directory. So if I have one instance of claude code on each monitor and in each directory, which ever one tries to look at postgres logs will end up looking at the correct postgres's logs without having to even know that there are separate ones running.
Basically, I'm allergic to configuring my system at all. All dependencies besides nix, my text editor, and my shell are project-level dependencies. This makes it easy to hop between machines and not really care about how they're set up. Even on production systems, I'd rather just clone the repo and `nix run` in that dir (it then launches process-compose, which makes everything just like it was in my dev environment). I am however not in charge of any production systems, so perhaps I'm a bit out of touch there.
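In case it helps anyone picture it, a minimal config sketch (the schema here is from memory of process-compose's docs, so treat the key names as an assumption; commands are illustrative):

```sh
cat > process-compose.yaml <<'EOF'
processes:
  postgres:
    command: "postgres -D .data/pg -k /tmp"
  api:
    command: "python -m myapp.server"
    depends_on:
      postgres:
        condition: process_started
EOF
process-compose up   # one parent process, TUI with per-process logs, shared lifecycle
```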
I'm curious why. To me "We updated our library to change some things in a way that's an improvement on net but only mostly backwards compatible" seems like an extremely common instinct in software development. But in an environment where people are doing that all the time, the only way to reliably deploy software is to completely freeze all your direct and indirect dependencies at an exact version. And Docker is way better at handling that than traditional Linux package managers are.
Why do you think other tools will make a comeback?
You can write any software you want without worrying about depending on a specific set of system dependencies. I like software that "just works", and making something that will give you inscrutable linking or dependency errors if the OS isn't set up just so is a practice I think should go away.
I am also optimistic we will succeed in efforts to properly annotate the data on the Internet with useful and accurate meta-data and achieve the semantic web vision instead of relying on search engines and LLMs.
docker is bloated. i'm almost certain half of every image is dead weight. unused apt packages, full distros for a single binary, shell configs nobody touches. but the incentive is to make things work, not make them small. so bloat wins.
still, i use it every day and i don't see what replaces it. every "docker killer" solves one problem while ignoring the 50 things docker does well enough.
Docker released "Docker Hardened Images" last year and made them free. They contain less bloat.
Buying more RAM for your server or only touching a select few images that are run most often is also a way to make things work.
It might not be the most elegant software engineering approach, but it just works.
I realise Apple containers haven't quite taken off as expected, but their omission from the article stands out. Nice that it mentions alternative approaches like podman and kata though.
Unfortunately Apple managed to omit the feature we all want that only they can implement: namespaces for native macOS!
Instead we got yet another embedded-Linux-VM which (imo) didn't really add much to the container ecosystem except a bunch of nice Swift libraries (such as the ext2 parsing library, which is very handy).
Great. I started developing in the Docker era, and while I can see some flaws, it is one of the easiest, most reliable tools I constantly use. I can't imagine how people dealt with those problems before Docker
We have shipped unikernels for the last decade. Zero sec issues so far. I highly recommend looking into the unikernel space for a docker alternative. MirageOS being a good start.
cool! What services have you shipped as unikernels? Docker doesn't have to be an alternative; it can help with the build/run pipeline for them too: https://www.youtube.com/watch?v=CkfXHBb-M4A (Dockercon 2015!)
Somewhere along the line they started prioritising docker desktop over docker. It's a bit jarring to see new features coming to desktop before it comes to Linux, such as the new sandbox features.
Is there any insight into this? I would have thought the opposite, where developers on the platform that made docker succeed are given the first preview of features.
This was AI generated, right? And saying that in 2010 Linux was complex and needed everything compiled across virtual machines for cloud solutions, and that socket solved that… really, in what universe did people live?
Docker made it convenient to distribute functionality that was planned as a stack of services, rather than packaging it appropriately for each distribution and handing configuration over to an administrator. That's it. And it has many inconveniences. And Linux, as well as BSD, already had containers, chroots, and many other things.
I remember being pretty skeptical of “dockerizing” applications when it first became popular. But I’ve come around to it, if for no other reason than it provided an easily understandable concept which anyone could understand and more importantly use. The onramp to using docker is very gentle.
I've been a professional in the industry for over a decade and I've still not found any meaningful benefit for learning or using containerization in the real world. I just install my dependencies with good old fashioned version managers (asdf) and I develop the project. I ignore all docker documentation, and everything works fine. When I try to use containers to develop, it's seemingly two dozen gotchas that sum up to my IDE needing endless configuration to function properly. I don't get it.
Curious what you build/work on? If you're only shipping software library or tools, then yeah, your perspective makes sense - `asdf` or `mise` seem superior to `devcontainer`s. But I wouldn't want to deploy a web application without Dockerizing it.
Mostly web applications actually. Most web stuff is not that complicated. Usually the deployment itself is from docker (ops insists) but I just do development without it and I've never had a problem. I understand in theory I could get mismatched versions of things from production and thereby introduce a bug; in practice this has never happened a single time.
> If you are a developer, our goal is to make Docker an invisible companion
I want it not just to be invisible but to be missing. If you have kubernetes, including locally with k3s or similar, it won't be used to run containers anyway. However, it still often is used to build OCI images. Podman can fill that gap. It has a Containerfile format with the same syntax, but it is simpler than Docker's build, which now provides build orchestration features similar to earthly.dev that I think are better kept separate.
Increasingly flake.nix is present in good repos. Zillions of packages are available. But yes, you will need to learn and sometimes File System Hierarchy assumptions need to be worked around. The rewards however, dominate the (few) inconveniences once you know your way around.
> While convenient, this single shared filesystem makes it very difficult to install multiple applications at the same time if they have conflicting dynamic library requirements.
No, it does not make it "very difficult".
And like other comments here grumble - this rationale is essentially a sanctification of the sentiment of "It builds and runs on my system and I can't be bothered to make fewer assumptions so that it runs on yours".
The fact that docker still, in 2026, will completely overwrite iptables rules silently to expose containers to external requests is, frankly, fucking stupid.
I am so thoroughly convinced that Docker is a hacky-but-functional solution to an utterly failed userspace design.
Linux user space decided to try and share dependencies. Docker obliterates this design goal by shipping dependencies, but stuffing them into the filesystem as-if they were shared.
If you’re going to do this then a far far far simpler solution is to just link statically or ship dependencies adjacent to the binary. (Aka what windows does). Replicating a faux “shared” filesystem is a gross hack.
This is a distinctly Linux problem. Windows software doesn’t typically have this issue. Because programs ship their dependencies and then work.
Docker is one way to ship dependencies. So it’s not the worst solution in the world. But I swear it’s a bad solution. My blood boils with righteous fury anytime anyone on my team mentions they have a 15 minute docker build step. And don’t you damn dare say the fix to Docker being slow is to add more layers of complexity with hierarchical Docker images ohmygodiswear. Running a computer program does not have to be hard I promise!!
Okay, so what's the best solution? What's even just a better solution than Docker? I mean really truly lay out all the details here or link to a blog post that describes in excruciating detail how they shipped a web application and maintained it for years and was less work than Docker containers. Just saying "a far far simpler solution is to just link statically or ship dependencies adjacent to the binary" is ignoring huge swaths of the SDLC. Anyone can cast stones, very few can actually implement a better solution. Bring the receipts.
Given that distributions are the distributors of packages and not the upstream developers, I think static linking is fine as is dep-shipping. The now dead Clear Linux was great at handling package distribution.
Personally, I think docker is dumb, so is AppImage, so is FlatPak, so are VMs… honestly, it’s all dumb. We all like these various things because they solve problems, but they don’t actually solve anything. They work around issues instead. We end up with abstractions and orchestrations of docker, handling docker containers running inside of VMs, on top of hardware we cannot know, see, control, or inspect. The containers are now just a way to offer shared hosting at a price premium with complex and expensive software deployment methods. We are charged extortionate prices at every step, and we accept it because it’s convenient, because these methods make certain problems go away, and because if we want money, investors expect to see “industry standards.”
The first half of my career was spent shipping video games. There is no such thing as shipping a game in Docker. Not even on Linux. You depend on minimum version of glibc and then ship your damn dependencies.
The more recent half of my career has been more focused on ML and now robotics. Python ML is an absolute clusterfuck. It is close to getting resolved with uv and Pixi. The trick there is to include your damn dependencies… via symlink to a shared cache.
Any program or pipeline that relies on whatever arbitrary ass version of Python is installed on the system can die in a fire.
That’s mostly about deploying. We can also talk about build systems.
The one true build system path is a monorepo that contains your damn dependencies. Anything else is wrong and evil.
I’m also spicy and think that if your build system can’t cross-compile then it sucks. It’s trivial to cross-compile for Windows from Linux because Windows doesn’t suck (in this regard). It’s almost impossible to cross-compile to Linux from Windows because Linux userspace is a bad, broken, failed design. However, Andrew Kelley is a patron saint and Zig makes it feasible.
Use a monorepo, pretend the system environment doesn’t exist, link statically/ship adjacent so/dll.
Docker clearly addresses a real problem (that Linux userspace has failed). But Docker is a bad hack. The concept of trying to share libraries at the system level has objectively failed. The correct thing to do is to not do that, and don’t fake a system to do it.
Windows may suck for a lot of reasons. But boy howdy is it a whole lot more reliable than Linux at running computer programs.
It solves a practical problem that’s obvious. And on one hand the practical where-we’re-at-now is all that matters; that’s a legitimate perspective.
There’s another one, at least IMHO, that this entire stack from the bottom up is designed wrong and every day we as a society continue marching down this path we’re just accumulating more technical debt. Pretty much every time you find the solution to be, “ok so we’ll wrap the whole thing and then…” something is deeply wrong and you’re borrowing from the future a debt that must come due. Energy is not free. We tend to treat compute like it is.
Maybe I’m in a big club but I have a vision for a radically different architecture that fixes all of this and I wish that got 1/2 the attention these bandaids did. Plan 9 is an example of the theme if not the particular set of solutions I’m referring to.
Something that I have recently explored is the optimization of Docker layers and startup time for large containers. Shared storage, preloading tar layers, and overlayBD (https://github.com/codeexec/overlaybd-deploy) are things I would like to see supported more natively. Great article!
Just pull a tarball from a signed URL, install deps, and run from systemd. Rolls out in 30 seconds, remarkably stable. Initial bootstrap of deps/paths is maybe 5 minutes.
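That flow is roughly this (the URL, paths, and unit name are all illustrative):

```sh
# Fetch a signed/immutable release artifact and unpack it...
curl -fsSL "https://releases.example.com/app-1.2.3.tar.gz" -o /tmp/app.tar.gz
tar -xzf /tmp/app.tar.gz -C /opt/app
# ...then let an ordinary unit (whose ExecStart points at /opt/app) pick it up.
systemctl restart app.service
```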
The next gap that still feels unsolved: knowing whether what's running in that container is actually behaving correctly, not just running. Uptime checks tell you the process is alive; they don't tell you the API is returning what it should.
The flip side is that the world still hasn’t settled on a language-neutral build tool that works for all languages. Therefore we resort to running arbitrary commands to invoke language-specific package managers. In an alternate timeline where everyone uses Nix or Bazel or some such, docker build would be laughed out of the window.
> running arbitrary commands to invoke language-specific package managers.
This is exactly what we do in Nix. You see this everywhere in nixpkgs.
What sets apart Nix from docker is not that it works well at a finer granularity, i.e. source-file-level, but that it has real hermeticity and thus reliable caching. That is, we also run arbitrary commands, but they don't get to talk to the internet and thus don't get to e.g. `apt update`.
In a Dockerfile, you can `apt update` all you want, and this makes the build layer cache a very leaky abstraction. This is merely an annoyance when working on an individual container build but would be a complete dealbreaker at linux-distro-scale, which is what Nix operates at.
And in languages with insufficient abstraction power like C and Go, you often need to invoke a code generation tool to generate the sources; that's an extremely arbitrary command. These are just non-problems if you have hermetic builds and reliable caching.
On my nixos-rebuild, building a simple config file for in /etc takes much longer than a common gcc invocation to compile a C file. I suspect that is due to something in Nix's Linux sandbox setup being slow, or at least I remember some issue discussions around that; I think the worst part of that got improved but it's still quite slow today.
Because of that, it's much faster to do N build steps inside 1 nix build sandbox, than the other way around.
Another issue is that some programming languages have build systems that are better than the "oneshot" compilation used by most programming languages (one compiler invocation per file producing one object file, e.g. ` gcc x.c x.o`). For example, Haskell has `ghc --make` which compiles the whole project in one compiler invocation, with very smart recompilation avoidance (pet-function, comment changes don't affect compilation, etc) and avoidance of repeat steps (e.g. parsing/deserialising inputs to a module's compilation only once and keeping them in memory) and compiler startup cost.
Combining that with per-file general-purpose hermetic build systems is difficult and currently not implemented anywhere as far as I can tell.
To get something similar with Nix, the language-specific build system would have to invoke Nix in a very fine-grained way, e.g. to get "avoidance of codegen if only a comment changed", Nix would have to be invoked at each of the parser/desugar/codegen parts of the compiler.
I guess a solution to that is to make the oneshot mode much faster by better serialisation caching.
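For concreteness, a sketch of the two modes (module names illustrative):

```sh
# Oneshot mode: one invocation per module. A hermetic per-file build
# system can cache each step, but each step pays compiler startup cost.
ghc -c Utils.hs
ghc -c Main.hs
ghc Main.o Utils.o -o app

# Whole-project mode: one invocation with compiler-internal recompilation
# avoidance, which is opaque to a per-file caching build system.
ghc --make Main.hs -o app
```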
https://crane.dev/getting-started.html
That's not going to work unless both parties get the same hash when they build the image, and they won't as long as file modification timestamps (and other such hazards) are part of what gets hashed.
It's not just the timestamps you need to worry about. Tar needs to be consistent with the uid vs username, gzip compression depends on implementations and settings, and the json encoding can vary by implementation.
And all this assumes the commands being run are reproducible themselves. One issue I encountered there was how alpine tracks their package install state from apk, which is a tar file that includes timestamps. There are also timestamps in logs. Not to mention installing packages needs to pin those package versions.
All of this is hard, and the Dockerfile didn't make it easy, but it is possible. With the right tools installed, reproducing my own images has a documented process [2].
[1]: https://regclient.org/cli/regctl/image/mod/
[2]: https://regclient.org/install/#reproducible-builds
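For anyone wanting to try this, a hedged sketch of the BuildKit side (assumes BuildKit >= 0.13; exact flags may differ on your version):

```sh
# Pin all file timestamps in the image to the last commit's time
export SOURCE_DATE_EPOCH="$(git log -1 --pretty=%ct)"
docker buildx build \
  --build-arg SOURCE_DATE_EPOCH \
  --output type=image,name=example/app:repro,rewrite-timestamp=true \
  .
```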
I’m more concerned about sources being poisoned than about the build process. The xz backdoor is a great example of this.
If you flip it around and instead have magically audited source but a shaky build, then perhaps a diligent user can protect themself, but they do so by doing their own builds, which means they're unaware that the attack even exists. This allows the attacker to just keep trying until they compromise somebody who is less diligent.
Getting caught requires a user who analyses downloaded binaries in something like ghidra... who does that when it's much easier to just build them from sources instead? (answer: vanishingly few people). And even once the attacker is found out, they can just hide the same payload a bit differently, and the scanners will stop finding it again.
Also, "maybe the code itself is malicious" can only ever be solved the hard way, whereas we have a reasonable hope of someday providing an easy solution to the "maybe the build is malicious" problem.
Personally I love using mkosi, and while it has all the composability and deployment options I'd care for, it's clear not everyone wants to build starting from only a blank set of OS templates.
In Spack [1] we do one layer per package. It's appealing, but beyond the layer limit I never checked whether it's actually bad for performance when doing filesystem operations.
[1] https://spack.readthedocs.io/en/latest/containers.html
tl;dr: it will put one package per layer as much as possible, and compress everything else into the final layer. It uses the dependency graph to implement a reasonable heuristic for what stays fine-grained and what gets combined.
The layer layout is just a json file so it can be post processed w/o issue before passing to the nix docker builders
Want to throw a requirements.txt in there? No no, why would you even ask that? Meanwhile docker says yeah sure just run pip install, why should I care?
Like all LLM boosters, you've ignored the fact that the largest time sink in many kinds of software is not initial development, but perpetual maintenance.
If you care about getting it to work with minimal effort right now more than about it being sustainable later, then sure.
It seems overly orthogonal for the typical use case but perhaps just not enough of an annoyance for anyone to change it.
The "copy files and run commands" approach you mention is indeed similar to traditional ops workflows, which I think is why it has such staying power. It's conceptually simple and maps well to how developers already think about building and deploying applications.
I wish we had standardized on something other than shell commands, though. Puppet or terraform or something more declarative would have been such a better alternative to “everyone cargo cults ‘RUN apt-get upgrade’ onto the top of their dockerfiles”.
Like, the layer/stage/caching behavior is fine. I just wish the actual execution parts had been standardized using something at a higher level of abstraction than shell.
Until you need to do something that isn't covered with its DSL, and you extend it with an external command execution declaration... At which point people will just write bash scripts anyway and use your declarative language as a glorified exec.
In Azure YAML I had an odd bug because I used succeeded() instead of not(failed()) as a condition. I had no way of testing the pipeline without executing it. And each DSL has its own special set of sharp edges.
At least Bash's common edges are well known.
However, Dockerfiles are so popular because they run shell commands and permit 'socially' extending someone else shell commands; tacking commands onto the end of someone else's shell script is a natural process. /bin/sh is unreasonably effective at doing anything you need to a filesystem, and if the shell exposes a feature, it has probably been used in a Dockerfile somewhere.
Every other solution, especially declarative ones, tend to come up short when _layering_ images quickly and easily. However, I agree they're good if you control the entire declarative spec.
And if you want something weird that's not supported by your particular tool of choice, you have the escape hatch of running arbitrary commands in the Dockerfile.
What more do you want?
It's a BuildKit frontend, so you still use "docker build".
They sounded nice on paper but the work they replaced was somehow more annoying.
I moved over to Docker when it came out because it used shell.
I'd get much better results if I used something else to do the foreach and gave terraform only static rules.
If your dockerfile says “ensure package X is installed at version Y”, that’s a lot clearer (and also easier to make performant/cached and deterministic) than “apt-get update; apt-get install $transitive-at-specific-version; apt-get install $the-thing-you-need-at-specific-version”. I’m not thrilled at how distro-locked the shell version makes you, and how easy it is for accidental transitive changes to occur, too.
But neither of those approaches is at a particularly low abstraction level relative to the OS itself; files and system calls are more or less hidden away in both package-manager-via-bash and puppet/terraform/whatever.
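Today the closest a stock Dockerfile gets to "ensure package X at version Y" is explicit apt pinning, which shows exactly the verbosity and distro-lock being complained about (version strings illustrative):

```Dockerfile
FROM debian:12
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        curl=7.88.1-10+deb12u8 && \
    rm -rf /var/lib/apt/lists/*
```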
But as long as people want to use scripting languages (like PHP, Python, etc.), I guess Docker is the necessary evil.
I'll tell that to my CI runner. How easy is it for Go to download the Android SDK and run Gradle? Can I also `go sonarqube` and `go run-my-pullrequest-verifications`? Or are you also going to tell me that I can replace that with a shitty set of GitHub Actions?
I'll also tell Microsoft they should update the C# definition to mark it down as a scripting language. And to actually give up on the whole language, why would they do anything when they could tell every developer to write if err != nil instead
Just because you have an extremely narrow view of the field doesn't mean it's the only thing that matters.
E.g. systemd exposes a lot of resource control as well as sandboxing options, to the point that I would argue that systemd services can be very similar to "traditional" runtime containers, without any image involved.
Interesting. How does go build my python app?
Then I found an HN comment I wrote a few years ago that confirmed this:
“[...] I remember that day pretty clearly because in the same lightning talk session, Solomon Hykes introduced the Python community to docker, while still working on dotCloud. This is what I think might have been the earliest public and recorded tech talk on the subject:”
YouTube link: https://youtu.be/1vui-LupKJI?t=1579
Note: starts at t=1579, which is 26:19.
Just being pedantic though. That’s about 13 years ago. The lightning talk is fun as a bit of computing history.
(Edit: as I was digging through the paper, they do cite this YouTube presentation, or a copy of it anyway, in the footnotes. And they refer to a 2013 release. Perhaps there was a multi-year delay between the paper being submitted to ACM with this title and it being published. Again, just being pedantic!)
Here’s the announcement from 2013:
https://news.ycombinator.com/item?id=5408002
https://news.ycombinator.com/item?id=5409678
Genuinely fascinating and clever solution!
[1] https://github.com/rootless-containers/slirp4netns
[2] https://blog.podman.io/2024/03/podman-5-0-breaking-changes-i...
[3] https://passt.top/passt/about/#pasta-pack-a-subtle-tap-abstr...
SLIRP was useful when you had a dial-up shell account and they wouldn't give you SLIP or PPP, or it would cost extra. SLIRP is just a userspace program that uses the socket APIs, so as long as you could run your own programs and make connections to arbitrary destinations, you could make a dial script to connect your computer up like you had a real PPP account. No incoming connections though (afaik), so you weren't really a peer on the internet: a foreshadowing of ubiquitous NAT/CGNAT, perhaps.
That's a mistake indeed; "popularised by" might have been better. Before my beloved Palmpilot arrived one Christmas, I was only using SLIRP to ninja in Netscape and MUD sessions onto a dialup connection which wasn't a very mainstream use.
There was another component we didn't have room to cover in the article: the filesystem sharing between the container and the host. It has been very stable, and endlessly criticised for being slow, but it has never corrupted anyone's data! It's interesting that many users preferred potential-dataloss-but-speed using asynchronous IO, but only on desktop environments. I think Docker did the right thing by erring on the side of safety by default.
Sir, this is a hacker news.
What I want to do when running a Docker container on Mac is to be able to have the container have an IP address separate from the Mac's IP address that applications on the Mac see. No port mapping: if the container has a web server on port 80 I want to access it at container_ip:80, not 127.0.0.1:2000 or something that gets mapped to container port 80.
On Linux I'd just used Docker bridged networking and I believe that would work, but on Mac that just bridges to the Linux VM running under the hypervisor rather than to the Mac.
Is there some officially recommended and supported way to do this?
For a while I did it by running WireGuard on the Linux VM to tunnel between that and the Mac, with forwarding enabled on the Linux VM [1]. That worked great for quite a while, but then stopped and I could not figure out why. Then it worked again. Then it stopped.
I then switched to this [2] which also uses WireGuard but in a much more automated fashion. It worked for quite a while, but also then had some problems with Docker updates sometimes breaking it.
It would be great if Docker on Mac came with something like this built in.
[1] https://news.ycombinator.com/item?id=33665178
[2] https://github.com/chipmk/docker-mac-net-connect
BTW are you trying to avoid port mapping because ports are dynamic and not known in advance? If so you could try running the container with --net=host and in Docker Desktop Settings navigate to Resources / Network and Enable Host Networking. This will automatically set up tunnels when applications listen on a port in the container.
Thanks for the links, I'll dig into those!
I want to avoid port mapping because I already have things on the Mac using the ports that my things in the container are using.
I have a test environment that can run in a VM, container, or an actual machine like an RPi. It has copies of most of our live systems, with customer data removed. It is designed so that as much as possible things inside it run with the exact same configuration they do live. The web sites in then are on ports 80 and 443, MySQL/MariaDB is on 3306, and so on. Similarly, when I'm working on something that needs to access those services from outside the test system I want to as much as possible use the same configuration they will use when live, so they want to connect to those same port numbers.
Thus I need the test environment to have its own IP that the Mac can reach.
Or maybe not...I just remembered something from long ago. I wanted a simpler way to access things inside the firewall at work than using whatever crappy VPN we had, so I made a poor man's VPN with ssh. If I needed to access things on, say, port 80 and 3306 on host foo at work, I'd ssh to somewhere I could reach inside the firewall at work, setting that up to forward, say, local 10080 and 13306 to foo:80 and foo:3306. I'd add an /etc/hosts entry for foo giving it some unused address like 10.10.10.1. Then I'd use ipfw to set it up so that any attempt to connect to 10.10.10.1:80 or 10.10.10.1:3306 would get forwarded to 127.0.0.1:10080 or 127.0.0.1:13306, respectively. That worked great until Apple replaced ipfw with something else. By then we had a decent VPN for work, so I no longer needed my poor man's VPN and didn't look into how to do this in whatever replaced ipfw.
Learning how to do that in whatever Apple now uses might be a nice approach. I'll have to look into that.
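The ssh half of that trick still works unchanged; a sketch (hostnames illustrative):

```sh
# -N: no remote command, just hold the tunnels open
ssh -N \
  -L 10080:foo:80 \
  -L 13306:foo:3306 \
  user@jumphost.example.com
```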
Well, before Docker I used to work on Xen and that possible future of massive block devices assembled using Vagrant and Packer has thankfully been avoided...
One thing that's hard to capture in the article -- but that permeated the early Dockercons -- is the (positive) disruption Docker had in how IT shops were run. Before that going to production was a giant effort, and 'shipping your filesystem' quickly was such a change in how people approached their work. We had so many people come up to us grateful that they could suddenly build services more quickly and get them into the hands of users without having to seek permission slips signed in triplicate.
We're seeing another seismic cultural shift now with coding agents, but I think Docker had a similar impact back then, and it was a really fun community spirit. Less so today with the giant hyperscalers all dominating, sadly, but I'll keep my fond memories :-)
Funny comment considering lightweight/micro-VMs built with tools like Packer are what some in the industry are moving towards.
Some of those talks strangely make more sense today (e.g. Rump Kernels or unikernels + coding agents seems like a really good combination, as the agent could search all the way through the kernel layers as well).
"Ship your machine to production" isn't so bad when you have a ten-line script to recreate the machine at the push of a button.
Wonder when some enterprising OSS dev will rebrand dynamic linking in the future...
I don't care about glibc or compatibility with /etc/nsswitch.conf.
look at the hack rust does because it uses libc:
> pub unsafe fn set_var<K: AsRef<OsStr>, V: AsRef<OsStr>>(key: K, value: V)
So what do you do when you need to resolve system users? I sure hope you don't parse /etc/passwd, since plenty of users (me included) use other user databases (e.g. sssd or systemd-userdbd).
I think it’s laziness, not difficulty. That’s not meant to be snide or glib: I think gaining expertise in how to package and deploy non-containerized applications isn’t difficult or unattainable for most engineers; rather, it’s tedious and specialized work to gain that expertise, and Docker allowed much of the field to skip doing it.
That’s not good or bad per se, but I do think it’s different from “pre-container deployment was hard”. Pre-container deployment was neglected and not widely recognized as a specialty that needed to be cultivated, so most shops sucked at it. That’s not the same as “hard”.
Minus the kernel of course. What is one to do for workloads requiring special kernel features or modules?
I sort of had the problem in mind. Docker is the answer. Not clever enough to have invented it.
If I did I would probably have invented octopus deploy as I was a Microsoft/.NET guy.
Good luck convincing people to switch!
Using it, solving problems with it, and building a real community around it tend to make a much greater impact in the long run.
That means unlike Gentoo, I've never dealt with a "slot conflict" where two packages want conflicting dependencies. And unlike Ubuntu, I have new versions of everything.
Pick 2: share dependencies, be on the bleeding edge, or waste your time resolving conflicts.
Absolutely not. Nix and Guix are package managers that (very simplified) model the build process of software as pure functions mapping dependencies and source code as inputs to a resulting build as their output. Docker is something entirely different.
> they’re both still throwing in the towel on deploying directly on the underlying OS’s userland
The existence of an underlying OS userland _is_ the disaster. You can't build a robust package management system on a shaky foundation, if nix or guix were to use anything from the host OS their packaging model would fundamentally break.
> unless you go all the way to nixOS
NixOS does not have a "traditional/standard/global" OS userland on which anything could be deployed (excluding /bin/sh for simplicity). A package installed with nix on NixOS is identical to the same package being installed on a non-NixOS system (modulo system architecture).
> shipping what amounts to a filesystem in a box
No. Docker ships a "filesystem in a box", i.e. an opaque blob, an image. Nix and Guix ship the package definitions from which they derive what they need to have populated in their respective stores, and either build those required packages or download pre-built ones from somewhere else, depending on configuration and availability.
With docker two independent images share nothing, except maybe some base layer, if they happen to use the same one. With nix or Guix, packages automatically share their dependencies iff it is the same dependency. The thing is: if one package depends on lib foo compiled with -O2 and the other one depends on lib foo compiled with -O3, then those are two different dependencies. This nuance is something that only the nix model started to capture at all.
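A rough way to see this in practice with a flakes-enabled nix CLI (flags vary by nix version, so treat this as an assumption):

```sh
# Same package, different compile flags: two distinct store paths,
# and therefore two distinct dependencies.
nix build nixpkgs#hello --print-out-paths
nix build --impure --print-out-paths --expr \
  '(import <nixpkgs> {}).hello.overrideAttrs (_: { NIX_CFLAGS_COMPILE = "-O3"; })'
```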
Unpopular opinion, loosely held: the whole attempt to share any dependencies at all is the source of evil.
If you imagine the absolute worst-case scenario, where every program ships all of its dependencies and nothing is shared, then the end result would be… a few gigabytes of duplicated data? Which could plausibly be deduped at the filesystem level rather than the build or deployment layer?
Feels like a big waste of time. Maybe it mattered in the 70s. But that was a long, long time ago.
pnpm fixed it exactly the way you describe though: content-addressable store with hardlinks. Every package version exists once on disk, projects just link to it. So the "dedup at filesystem level" approach does work, it just took the ecosystem a decade of pain to get there.
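You can watch the mechanism at work (output is machine-specific):

```sh
pnpm store path    # the single content-addressable store on this machine
pnpm install       # node_modules is assembled from hard links into that store
# a link count > 1 in the second column means the file is shared
ls -l node_modules/.pnpm/*/node_modules/*/package.json | head
```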
Much harder to get reproducibility with C++ than JavaScript to say the least.
If you have adopted a bad tool then people are likely to want the bad tool in more places. This is the opposite of a virtuous cycle and is a horrible form of tech debt.
[1]: https://anil.recoil.org/papers/2025-docker-icfp.pdf
Have others found this to be the case? Perhaps we're doing something wrong.
The barrier to entry matters enormously for adoption. A Dockerfile that starts with `FROM ubuntu:22.04` and runs shell commands is immediately understandable to any ops person, even if they've never used Docker before. Meanwhile Nix, despite its superior hermetic builds, requires learning an entirely new paradigm.
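The entire on-ramp is something like this, and any ops person can guess what each line does (contents illustrative):

```Dockerfile
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3
COPY . /app
CMD ["python3", "/app/main.py"]
```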
That said, the reproducible builds issue MatrixMan raised is becoming critical. I've seen subtle supply chain attacks that exploit the non-deterministic nature of traditional builds. The SOURCE_DATE_EPOCH support in buildkit is a good step, but we need more tooling that makes reproducibility the default, not an afterthought.
When compared to a VM, yes. But shipping a separate userspace for each small app is still bloat. You can reuse software packages and runtime environments across apps. From an I/O, storage, and memory utilization point of view, it feels baffling to me that containers are so popular.
Docker containers also do reuse shared components, layers that are shared between containers are not redownloaded. The stuff that's unique at the bottom is basically just going to be the app you want to run.
Why? It's not virtualization, it's containerization. It's using the host kernel.
Containers are fast.
You can hardly call this efficient hardware utilization.
For running your own machine, sure. But this would become unmaintainable for a sufficiently multi-tenant system. Nix is the only thing that really can begin to solve this outside of container orchestration.
I've recently switched from docker compose to process compose and it's super nice not to have to map ports or mount volumes. What I actually needed from docker had to do less with containers and more with images, and nix solves that problem better without getting in the way at runtime.
[1]: https://github.com/F1bonacc1/process-compose
What process-compose gives me is a single parent with all of that project's processes as children, and a nice TUI/CLI for scrolling through them to see who is happy/unhappy and interrogating their logs, and when I shut it down all of that project's dependencies shut down. Pretty much the same flow as docker-compose.
It's all self-contained so I can run it on MacOS and it'll behave just the same as on Linux (I don't think systemd does this, could be wrong), and without requiring me to solve the docker/podman/rancher/orbstack problem (these are dependencies that are hard to bundle in nix, so while everything else comes for free, they come at the cost of complicating my readme with a bunch of requests that the user set things up beforehand).
As a bonus, since it's a single parent process, if I decide to invoke it through libfaketime, the faked time is inherited by every subprocess, so it's consistently faked in the database, in the services, and in observability tools...
My feeling for systemd is that it's more for system-level stuff and less for project-level dependencies. Like, if I have separate projects which need different versions of postgres, systemd commands aren't going to give me a natural way to keep track of which project's postgres I'm talking about. process-compose, however, will show me logs for the correct postgres (or whatever service) in these cases:
This is especially helpful because AI agents tend to be scoped to a working directory. So if I have one instance of claude code on each monitor, each in its own directory, whichever one tries to look at postgres logs will end up looking at the correct postgres's logs without having to even know that there are separate ones running.
Basically, I'm allergic to configuring my system at all. All dependencies besides nix, my text editor, and my shell are project-level dependencies. This makes it easy to hop between machines and not really care about how they're set up. Even on production systems, I'd rather just clone the repo and `nix run` in that dir (it then launches process-compose, which makes everything just like it was in my dev environment). I am however not in charge of any production systems, so perhaps I'm a bit out of touch there.
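The libfaketime invocation is roughly this (the preload path varies by distro, so treat it as an assumption):

```sh
# Every child of process-compose inherits the preload, so the database,
# the services, and the observability tools all see the same fake clock.
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/faketime/libfaketime.so.1 \
FAKETIME="2024-01-01 00:00:00" \
process-compose up
```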
Why do you think other tools will make a comeback?
Still, I use it every day and I don't see what replaces it. Every "docker killer" solves one problem while ignoring the 50 things docker does well enough.
Buying more RAM for your server or only touching a select few images that are run most often is also a way to make things work. It might not be the most elegant software engineering approach, but it just works.
These are pretty handy to use
(article author here)
Apple containers are basically the same as how Docker for Mac works; I wrote about it here: https://anil.recoil.org/notes/apple-containerisation
Unfortunately Apple managed to omit the feature we all want that only they can implement: namespaces for native macOS!
Instead we got yet another embedded-Linux-VM which (imo) didn't really add much to the container ecosystem except a bunch of nice Swift libraries (such as the ext2 parsing library, which is very handy).
> It is also something developers seem to enjoy using
Count me out.
Is there any insight into this? I would have thought the opposite: developers on the platform that made Docker succeed would be given the first preview of features.
Docker made it convenient to distribute functionality that was designed as a stack of services, instead of packaging it properly for each distribution and handing the configuration over to an administrator. That's it. And it has many inconveniences. And Linux, as well as BSD, had containers and chroots and many other things long before.
I want it not to just be invisible but to be missing. If you have kubernetes, including locally with k3s or similar, it won't be used to run containers anyway. However it still often is used to build OCI images. Podman can fill that gap. It has a Containerfile format that is the same syntax but simpler than the Docker builds, which now provides build orchestration features similar to earthly.dev which I think are better kept separate.
No, it does not make it "very difficult".
And like other comments here grumble - this rationale is essentially a sanctification of the sentiment of "It builds and runs on my system and I can't be bothered to make fewer assumptions so that it runs on yours".
Linux user space decided to try and share dependencies. Docker obliterates this design goal by shipping dependencies, but stuffing them into the filesystem as-if they were shared.
If you’re going to do this then a far far far simpler solution is to just link statically or ship dependencies adjacent to the binary. (Aka what windows does). Replicating a faux “shared” filesystem is a gross hack.
This is a distinctly Linux problem. Windows software doesn’t typically have this issue. Because programs ship their dependencies and then work.
Docker is one way to ship dependencies. So it’s not the worst solution in the world. But I swear it’s a bad solution. My blood boils with righteous fury anytime anyone on my team mentions they have a 15 minute docker build step. And don’t you damn dare say the fix to Docker being slow is to add more layers of complexity with hierarchical Docker images ohmygodiswear. Running a computer program does not have to be hard I promise!!
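For what it's worth, "ship adjacent" has existed on Linux all along via $ORIGIN rpaths; it's just not the prevailing culture (library name illustrative):

```sh
# Bundle the dependency next to the binary; $ORIGIN resolves at runtime
# to the executable's own directory, so app + libs/ relocate together.
mkdir -p libs && cp /path/to/libfoo.so libs/
gcc main.c -o app -Llibs -lfoo -Wl,-rpath,'$ORIGIN/libs'
```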
Personally, I think docker is dumb, so is AppImage, so is FlatPak, so are VMs… honestly, it’s all dumb. We all like these various things because they solve problems, but they don’t actually solve anything. They work around issues instead. We end up with abstractions and orchestrations of docker, handling docker containers running inside of VMs, on top of hardware we cannot know, see, control, or inspect. The containers are now just a way to offer shared hosting at a price premium with complex and expensive software deployment methods. We are charged extortionate prices at every step, and we accept it because it’s convenient, because these methods make certain problems go away, and because if we want money, investors expect to see “industry standards.”
The more recent half of my career has been focused on ML and now robotics. Python ML is an absolute clusterfuck. It is close to getting resolved with uv and Pixi. The trick there is to include your damn dependencies… via symlink to a shared cache.
Any program or pipeline that relies on whatever arbitrary ass version of Python is installed on the system can die in a fire.
That’s mostly about deploying. We can also talk about build systems.
The one true build system path is a monorepo that contains your damn dependencies. Anything else is wrong and evil.
I’m also spicy and think that if your build system can’t cross-compile then it sucks. It’s trivial to cross-compile for Windows from Linux because Windows doesn’t suck (in this regard). It’s almost impossible to cross-compile to Linux from Windows because Linux userspace is a bad, broken, failed design. However Andrew Kelley is a patron saint and Zig makes it feasible.
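Concretely, the Zig escape hatch is a one-liner per target (targets illustrative):

```sh
# Cross-compile C from any host without hunting down sysroots
zig cc -target x86_64-windows-gnu hello.c -o hello.exe
zig cc -target aarch64-linux-musl hello.c -o hello
```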
Use a monorepo, pretend the system environment doesn’t exist, link statically/ship adjacent so/dll.
Docker clearly addresses a real problem (that Linux userspace has failed). But Docker is a bad hack. The concept of trying to share libraries at the system level has objectively failed. The correct thing to do is to not do that, and don’t fake a system to do it.
Windows may suck for a lot of reasons. But boy howdy is it a whole lot more reliable than Linux at running computer programs.
Docker images come in layers, which may or may not change depending on your release, and may or may not be shared across services.