Containerized environments are increasingly popular, and Docker remains the most popular container solution for developers. But the process of moving from virtual machines to containers is complex. If you’re just getting started with Docker, check out our list of 50 useful Docker tutorials for IT professionals, which includes tutorials for beginners, intermediate users, and advanced Docker pros.
It’s common to make mistakes during the transition from VMs to Docker containers, and it’s important to remember that Docker won’t fix all your problems in the cloud. There are also security issues you need to weigh in order to keep your environment fully secure both during and after the transition. Threat Stack’s Docker integration offers full visibility into your container environment, alerting you to internal and external threats — along with the context needed to understand what happened during a security event so you can take appropriate action.
Aside from failing to implement robust security measures for your containerized environment, people make other common mistakes when switching to Docker containers. To gain some insight into the most common, we reached out to a panel of Docker experts and asked them this question:
“What’s the biggest mistake people make in switching to Docker containers?”
Meet Our Panel of Docker Experts:
Read on to find out what mistakes you might be making when switching to Docker containers and how to avoid them.
Artsiom Som is a DevOps engineer at ScienceSoft.
“To mention some of the gravest mistakes…”
A common misstep is to pack more than one service inside a container, since containers are mostly used for microservices-based projects. Bundling services this way deprives you of many of the benefits of microservices and containers: it hinders targeted updates and independent scaling of a function, makes monitoring and debugging harder, and forces the services to share a compatible tech stack and a single development team. You may also face a situation where one of the services in the container is far more heavily loaded than the rest and tries to claim all available resources to keep things going, starving the other packaged services.
Another crucial mistake is to store the data inside the container. Containers are designed to be temporary components that are easy to start, stop, and replace. Thus, it is better to place data storage that has distinctive update and administration regulations outside the container.
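A minimal sketch of this pattern, using a Docker Compose named volume for a Postgres database (the service and volume names are illustrative):

```yaml
# docker-compose.yml sketch: database files live in a named volume,
# so replacing the container does not destroy the data.
services:
  db:
    image: postgres:15
    volumes:
      - db-data:/var/lib/postgresql/data  # data survives container replacement
volumes:
  db-data:  # managed by Docker, independent of any container's lifecycle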
Randy Chou is the CEO at Nubeva.
“The biggest mistakes people make when switching to Docker containers include these three missteps…”
- They try to run containers like a virtual machine (VM). They run multiple processes that require separate monitoring and configuration. A VM is a virtual server that emulates a hardware server. It relies on the system’s physical hardware to emulate the same environment where the applications are installed. Docker containers are much smaller and require fewer resources than a VM. They are much faster than a VM, taking just milliseconds to start a Docker container from a container image. And containers are more portable, so team members can share them across the development pipeline.
- They use storage performance intensive applications with the default file system. To get the most from containers for data-intensive applications, enterprises must use persistent storage designed to fully support container frameworks and meet reliability requirements of the applications.
- They try to route networks in containers like they do in their own data center with VMs. Users can create multiple networks with Docker and add containers to one or more networks. Containers can communicate within networks but not across networks unless attachments are made to multiple networks.
Speed and efficiency are the biggest advantages of Docker containers over VMs. Given these advantages, Docker and VMs will co-exist, which ultimately gives DevOps more choice when running their cloud applications.
Artem Aksenkin is a DevOps engineer at Belitsoft. He has been working in DevOps and systems engineering for over 7 years, starting in the military and then moving to private enterprise. Currently, he is a part of a team delivering a large telecommunications application.
“Docker images have a layered file system…”
Each directive written in the Dockerfile creates a new layer. And one of the typical mistakes is placing too many instructions into a single directive:
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
&& rm -rf /var/lib/apt/lists/* \
&& mkdir -p /usr/share/jenkins \
&& chmod 755 /usr/share/jenkins \
&& chmod 644 /usr/share/jenkins/slave.jar
Every time you change or add an instruction in that directive, the whole layer is rebuilt from scratch. This wastes time and resources. Instead of overcrowding the layer, you should just add another directive:
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
RUN mkdir -p /usr/share/jenkins \
&& chmod 755 /usr/share/jenkins \
&& chmod 644 /usr/share/jenkins/slave.jar
This way, the unchanged directive’s layer is served from the build cache almost instantly, and only the new additions need to be built.
Rob Black, CISSP is the Founder and Managing Principal of Fractional CISO. He helps organizations reduce their cybersecurity risk as a Virtual CISO. Rob earned an MBA from the Kellogg School of Management and two engineering degrees from Washington University. He is the inventor of three security patents. He consults, speaks, and writes on IoT and security.
“Don’t run as root…”
A container that runs as root has privileges that can affect the host. This could have detrimental effects on the security of your system. For example, one Docker instance could shut down the host, turning off all the other Docker instances. That would be bad!
Instead, use Docker user namespaces. They allow your Docker instances to run as a user with a reduced set of privileges, so the container will not (hopefully) be able to negatively impact the host or other Docker instances. Preventing privilege escalation is especially important for someone new to Docker.
This is not an academic problem. Developers are often just trying to get things to work. Once they do, they don’t think about what privileges the instance should be operating under. I have worked with many developers who were ready to launch their solution. When I asked what privileges were associated with the container, they invariably said “root.”
Don’t be that person!
Kyle Sloka-Frey is a Partner and Technical Lead at Aces Design.
“One of the biggest mistakes we run into is…”
Not planning the size of your Docker containers correctly. It’s easy to build a container for a specific purpose, then a few weeks later think, well, we need a container for this similar process, let’s add it in. A few months down the line, you end up with a monstrosity of a container with too large an environment and too many services set up unnecessarily.
Take the time to put together a new container for each instance where it makes sense.
Bich Le is co-founder and Chief Architect at Platform9. In 14 years at VMware as Principal Engineer, Bich enhanced x86 virtualization performance, developed P2V Migration, USB emulation, and virtualized application management. More recently, he became involved in web technologies through a secure web browsing product. That combined work resulted in the granting of more than 10 patents.
“Trying to convert a virtualized application to…”
A container with no modifications — that is, treating a container as a VM. This is an anti-pattern. A Docker container works best when hosting a specific component of an application. It doesn’t work well when multiple programs and scripts are stuffed into it. Doing that will make it difficult to build the Docker image and to operate and debug the resulting container.
Jack Bedell-Pearce is the CEO of 4D Data Centres, a UK-based cloud and technology company.
“One potential pitfall people make when moving to containerized applications is…”
Using more than one process or function in a single container. Expecting a one-to-one mapping from non-containerized applications can also cause problems. When moving to the cloud, containerization requires planning and possibly changes to how the application processes data and to the overall workflow. Trying to map one-to-one and move into a containerized solution without re-adapting your application can cause unforeseen bottlenecks.
The best solution is to move slowly, one service or process at a time: optimize that, test thoroughly, and then move the next part of the application. Training staff is key when moving to containers, since certain types, such as Docker, use different management techniques and have many new commands to learn. You can build and test a container first, then deploy to the host(s) later; combining Docker and Puppet can automate large deployments, upgrades, and rollouts. However, unless an application specifically needs to scale, the management overhead of Docker containers (without a centralized automation system to deploy them, and with commands, functions, and networking that are easy to forget when used infrequently) can sometimes make them more complicated than a sensibly set up VM with all the services running together.
Rahul Varshneya is the co-founder of Arkenea, a custom software development company, and an advisor to several technology startups, including a publicly listed company.
“Docker is a great tool for making operating system level virtualization…”
This open source software development platform is great for packaging applications in containers and making them portable across any system running Linux. The biggest mistake people make while switching to Docker containers is not adhering to the container architecture.
The container is not a typical Linux environment. As a rule of thumb, Docker runs one process per container (i.e., for every service that runs through the application, a separate container should be used). Having multiple services running in the same container is a big no.
Running more than one service in the same container causes several issues: it adds extra dependencies and layers, which slows the build; it makes horizontal scaling of the application difficult; and it makes writing the Dockerfile, as well as its subsequent maintenance and debugging, harder.
If it is truly necessary to have two or more processes running within the same container, it is preferable to start from a baseline Linux image and use a process supervisor to handle the services.
Ian McClarty holds an MBA from Thunderbird School of Global Management. He has over 20 years of executive management experience in the cybersecurity and data center industry. Currently, he is the CEO and President of PhoenixNAP Global IT Solutions.
“One of the mistakes when switching to Docker is going all-in, all-at-once…”
Docker must be launched within a well-thought-out, properly planned architecture, so that its benefits as a code deployment method can actually be taken advantage of.
There is plenty of legacy code that will not take advantage of Docker. There needs to be a cultural shift in software architecture, and code has to be written in a way that is more microservices oriented.
Security is also a significant concern with Docker deployments; more time needs to be spent locking the platform down and utilizing good security business practices along the way.
Jeff Capone, PhD, is co-founder and CEO of the security startup, SecureCircle. With expertise in enterprise software development, network and storage solutions, and IoT applications, Jeff has a track record of founding and selling successful software companies. Previously, Jeff served as CTO at NETGEAR and CEO/Co-Founder of Leaf Networks, which was acquired by NETGEAR.
“The biggest mistake people make in switching to Docker containers is that…”
Containers get treated like standard virtual machines instead of disposable, immutable processes-in-a-package. Data should not be stored directly in containers; rather, it should be stored in external volumes or databases. Configuration, likewise, should be provided externally. Container images should be limited to only the dependencies needed for your process to run, since development libraries and user tools make images larger and harder to distribute. Stick to running one process per container, as multiple processes increase container management overhead.
Tim has more than 15 years of software development experience. He started his career at Aprimo, co-founded GreenSuite, worked in a customer success leadership role at an Austin-based tech company, served as a Director of Software Engineering at Salesforce, and is now the VP of Enterprise Delivery at Lumavate.
“Individuals tend to start too BIG…”
A Docker container is intended to be process-specific, and often, when starting with Docker, the initial thought is to throw an entire application into a container and run it holistically, without looking at the barebones requirements of the application being run. For example, many individuals start by basing a container on a full underlying OS (like Ubuntu), rather than starting from a slimmed-down (and even hardened) version suited to the application, like Alpine. Likewise, using multi-stage builds can significantly reduce the overhead of running an application in a container and should be used from the start to improve overall efficiency, understanding, and security across containers.
Manuraj M.R. is a full-stack Web Application Developer and a Senior Software Developer, currently working at Fingent Corp.
“The biggest mistakes people make when making the switch to Docker containers are addressed in the following list of best practices:”
- Don’t run processes as root user: Processes in a container should not run as root or assume that they are root. Instead, create a user in your Dockerfile with a known UID and GID, and run your process as this user. Images that follow this pattern are easier to run securely by limiting access to resources.
- Never store data in containers: A container can be stopped, destroyed, or replaced. An application version 1.0 running in a container should be easily replaceable by version 1.1 without any impact or loss of data. For that reason, if you need to store data, do it in a volume, and treat anything written inside the container itself as non-persistent.
- Never use the “latest” tag: When creating containers, don’t rely on the latest tag, since it will create problems down the road. If you run the container again months later, “latest” may point to a newer version of the image, which can lead to version dependency issues and failures.
- Don’t run more than one process in a single container: Containers are meant to run a single process; with more than one, you will have more trouble managing the processes, retrieving their logs, and updating them individually. It is better to use Docker Compose and create a separate service or container for each process instead of using a single container.
- Don’t store credentials in the image: Use environment variables. Please don’t hardcode any usernames/passwords in your image. Instead, use the environment variables to retrieve that information from outside the container.
- Don’t install unnecessary packages: A good image should contain only the required packages and files. Avoid installing updates and unnecessary packages because this will increase the size of the container and make it difficult to distribute.
- Do not use IP addresses: Every container has its own internal IP address, and it could change if you restart the container. If your application needs to communicate with another container, use service names or use environment variables instead of IP.
- Ignoring caching in Dockerfiles: It won’t take long to notice problems with the Dockerfile build cache, because your container images will suddenly start taking a very long time to build. If directives like ADD, VOLUME, or RUN appear in the wrong places, they may be invalidating your cache. Once the cache is invalidated, all subsequent Dockerfile commands generate new layers, and the cache is not used.
Dan Wahlin founded Wahlin Consulting, which provides training and architecture services on front-end and back-end web technologies, Microservices, and Docker/Kubernetes. He has published multiple courses on Pluralsight.com and is a Docker Captain, Microsoft MVP/Regional Director, and Google GDE. Dan speaks at multiple conferences and runs the Code with Dan development newsletter.
“Containers provide many benefits to developers and DevOps alike, but it’s not all roses…”
While images and containers provide a great way to consistently deploy applications between environments, the biggest mistake I see companies making is failing to adequately plan for container monitoring.
Kubernetes provides a great way to deploy, scale, and even heal containers, but without a rich set of tools to monitor the status of containers (as well as clusters and pods), it can be challenging for companies to know how healthy an application actually is. This is especially true when containers are used to deploy multiple microservices at scale.
Companies that don’t adequately plan to monitor container availability and health, resource metrics, network connectivity, and more end up trying to find a needle in a haystack when a problem occurs. Several monitoring tools exist, but the monitoring process and solution needs to be baked into the overall company planning and processes before going live with an application to help identify and solve issues that will certainly arise.
Randy Apuzzo is the Founder and CTO of Zesty.io.
“For Docker, if you were running cron jobs, you’d better find another solution…”
Also, internal networking can get messy when switching to Docker. If you were previously running local databases, consider running those as cloud services.
Peter Baxter has 20+ years in technology and is currently the Director of Products at human_code.
“Definitely one of the easiest mistakes is to…”
Set up one container that handles too many processes.
Giving each process its own container makes processing quicker and more robust. For example, with one of our customers, we initially set up one Docker container to run both Apache and MySQL, but there was far too much data to run both in the same container, so we ended up separating them into their own containers.
Davy Hua is a leader and entrepreneur who is on a lifelong mission to optimize complex infrastructures. During the last two decades, he has specialized in designing and managing complex DevOps infrastructures. After working as an Engineering Manager, Principal Engineer, and Architect as well as founding his own company, he is presently Head of DevOps at ShiftLeft Inc., an application security start-up.
“People who are making the switch to Docker containers from their legacy application tend to…”
Misinterpret the purpose of using containers as the preferred deployment model. They tend to map the application running on a server as a 1:1 relationship within a container.
Oftentimes an application has multiple services and as such, the mistake is committed when these are lumped into a single container. As a best practice, each Docker container should contain exactly one running service or daemon. In a legacy application, deploying multiple containers as a single pod or group can be an effective approach to optimizing the switch without violating best practice guidelines for Docker containers.
Marko Anastasov is a co-founder of Semaphore, a cloud-based CI/CD platform.
“Docker containers standardize, but also change how deployment is being done…”
Compared to a full PaaS, such as Heroku, using Docker is more complicated for the average developer. It requires the team to learn and adopt a whole new DevOps toolchain.
A team can easily underestimate the amount of work required to bring container-based systems to production reliably. If an organization doesn’t manage it as a project with clear goals and dedicated resources, moving to Docker may distract the team and slow down development for a period of time.
However, containers are the only way to deploy to the cloud that doesn’t lead to vendor lock-in, so making the effort is worth it in the long run.
Nadav Leibo is a DevOps Engineer at Namogoo, which is dedicated to creating better digital experiences through customer hijacking prevention and third party tag monitoring.
“A very common mistake when migrating to Docker containers is…”
Trying to write application logs the same way as on a server. Unlike on a server, a Docker container is supposed to print to standard output (STDOUT) and run its process in the foreground, but it is not uncommon to mistakenly write logs to a local file. This very common mistake makes debugging much harder, especially in a large environment with several containers. In fact, when a Docker container restarts or is replaced by a new container, for example during deployment, all local files not included in the image are deleted. In Docker, you should integrate with a centralized logging solution and redirect logging to STDOUT.
Sean McGowan is the content writer at Codal.
“One of the biggest mistakes people make when implementing Docker is…”
Not using Kubernetes to manage multiple Docker containers. While Docker is by far the most popular container build solution, it’s missing one crucial function. Luckily, Kubernetes is an open source solution that helps you manage what Docker can’t.
Say you have multiple containers running, maybe even across different machines or environments. How do you coordinate the isolated containers to operate as one system? How will they communicate? Integrate? What if one fails? This isn’t a far-fetched scenario — in fact, it’s necessary if you’re running microservices. And thankfully, there’s a clean, simple tool on the market that allows you to easily orchestrate and manage multiple containers at once.
Another mistake people make when implementing Docker is storing sensitive information like passwords or environment variables in the containers, rather than in the cloud. If the container is compromised or left behind, the information you stored locally can be exposed. Instead, make sure you store only temporary data in the containers, and keep the sensitive information under lock and key on a private server.
Justin Hough, CDO at Hounder, has been developing web applications for the last 15 years. At Hounder, he is responsible for building incredible and scalable web applications across all platforms (web, mobile, desktop and server) for startups to large enterprise companies.
“A common mistake that I have seen when developers switch from another development stack to Docker containers is…”
Not taking the time to fully document the setup and the underlying technology stacks that need to run in concert across separate Docker containers. For example, I have seen many containers running a PHP environment while the database runs from a local MySQL instance (not in Docker) or from a remote MySQL service. This kind of setup can cause unexpected connection issues if the MySQL instances differ between dev and production. I have also seen cases where the contents of the PHP environment are deployed to a production server outside of Docker, which makes Docker in those cases more of a development environment than a true source of truth for your application.
Docker for me has become far more powerful and useful for development and for debugging common problems when the entire development stack is structured to use Docker from local to production. It’s especially important when you need to get new developers set up quickly or when spinning up new load-balanced instances that are true clones of each other.
Brian is a Senior Software Architect at custom software development firm Small Footprint. As a solutions architect he is always trying to figure out a better way to solve a problem and works with the development teams to oversee delivery of the solution to ensure that technical goals are met.
“One of the biggest mistakes teams make when starting to use Docker is…”
Not understanding that containers shouldn’t be used to store data, as changes are lost when the container is removed or replaced. For services like a database, data volumes or external services like Amazon RDS should be used instead.
Final Words . . .
If you’re ready to make the switch to Docker containers, don’t lose sight of the importance of securing your containerized environments. Threat Stack’s Docker integration will help you make smarter security decisions while you’re streamlining your data consumption. In fact, you can even deploy Threat Stack as a container.