envd vs. Other tools
envd is a tool to build portable environments for machine learning and data science scenarios with Python-like syntax.
Benefits of using envd
- Clear dev environment setup and management for each project. You no longer need to worry about conflicts between CUDA, system packages, virtualenv, and so on. Each project has its own dev environment, which you can use on any machine and share with your collaborators.
- Unify dev and production environments with envd's multi-target support.
- Integrate with the container ecosystem. envd can be used with container tooling such as Docker and Kubernetes.
- Dataset support. envd can be used to download and preprocess datasets.
How does it work?
- Parallelize the image build process. Powered by BuildKit, envd can install different tools in parallel to minimize the build time.
- Caching support for standard installation tools, such as pip, conda, and apt. In a standard dockerfile build, each layer depends on the previous layer, so a change to an earlier layer triggers a rebuild of every later layer, which greatly increases the build time. envd has built-in cache support for the most commonly used tools and reduces the dependencies between them, which minimizes the time you wait after modifying your environment.
- Python-like language syntax. envd uses a Python dialect called starlark as its language and provides a set of built-in functions to reduce your burden. Declare what you want, and envd will take care of the rest, including user permissions, the ssh server, entrypoint setup, and so on.
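The caching point above can be illustrated with a small model. This is a simplified sketch, not how envd or BuildKit actually computes cache keys: the helper names (`digest`, `linear_keys`, `independent_keys`, `rebuilt`) and the step specs are made up for illustration. It compares a dockerfile-style chain, where each step's key includes the previous step's key, with steps whose keys depend only on their own inputs.

```python
import hashlib

def digest(*parts):
    """Stable short hash used as a cache key (illustrative only)."""
    return hashlib.sha256("|".join(parts).encode()).hexdigest()[:12]

def linear_keys(steps):
    """Dockerfile-style chain: each key folds in the previous step's key."""
    keys, prev = {}, ""
    for name, spec in steps:
        prev = digest(prev, name, spec)
        keys[name] = prev
    return keys

def independent_keys(steps):
    """Decoupled steps: each key depends only on the step's own spec."""
    return {name: digest(name, spec) for name, spec in steps}

def rebuilt(old, new):
    """Names of steps whose cache key changed and therefore must rerun."""
    return [name for name in new if old.get(name) != new[name]]

# Hypothetical environment: only the apt package list changes.
before = [("apt", "git curl"), ("conda", "python=3.9"), ("pip", "numpy")]
after = [("apt", "git curl vim"), ("conda", "python=3.9"), ("pip", "numpy")]

linear_rebuilds = rebuilt(linear_keys(before), linear_keys(after))
independent_rebuilds = rebuilt(independent_keys(before), independent_keys(after))

print(linear_rebuilds)       # ['apt', 'conda', 'pip']
print(independent_rebuilds)  # ['apt']
```

In the chained model, editing the first step invalidates every later step. In the decoupled model, only the edited step reruns, and unrelated steps are also free to run in parallel.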
envd vs. conda
envd doesn't conflict with conda. You can also use conda inside envd.
- An envd environment is built from the pure official image, so there are no legacy installation artifacts. Only what you declare in the file appears in the environment. conda's environment isolation relies on environment variables and on rewriting libraries' rpath. Users might hit conflicts between system packages and conda packages (such as CUDA), which in corner cases can result in unexpected behavior.
- envd can export containers used for production or pipeline stages, narrowing the gap between development and production. You can build multiple environments for different purposes (research, development, serving, data processing, etc.) from the same file.
envd vs. docker
envd is built on top of the docker ecosystem and is fully compatible with docker.
- envd parallelizes the image build process, which shortens build time.
- docker uses dockerfile as its main language. envd uses starlark, which is more familiar to data science practitioners.
- envd supports multi-target builds from a single file, while docker usually requires a separate file for each scenario, which makes reusing building blocks complicated.
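As a rough sketch of the single-file, multi-target idea, a build file might define one function per target. The built-ins used here (`base`, `install.python_packages`, `config.jupyter`) follow the syntax shown in envd's README at one point and are not defined in this text; their names and arguments may differ between envd versions, so check the reference for your release. The `serving` target name and the package list are hypothetical.

```python
# build.envd (sketch). Starlark shares this subset of Python syntax.
# base, install and config are assumed to be built-ins that envd provides
# when it evaluates the file; they are not defined or imported here.

def build():
    # Development target: interactive environment with Jupyter.
    base(os="ubuntu20.04", language="python3")
    install.python_packages(name=["numpy"])
    config.jupyter()

def serving():
    # Hypothetical production target: same base, no dev tooling.
    base(os="ubuntu20.04", language="python3")
    install.python_packages(name=["numpy"])
```

Both targets live in one file and declare only what each environment needs, which is the reuse that a per-scenario dockerfile layout makes harder.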