What is Docker? An Introduction to Container Technology

–Originally published as a LinkedIn post and now moved to WordPress–

Introduction

In this blog post, I will give a short introduction to container technologies and explain why I have chosen to start my proofs of concept with Docker rather than with another container technology.

For more blogs in this series on Docker, see here.

Why Containers?

Containers are the new way to run applications in the cloud. It is high time to gain hands-on experience with container technologies. However, before I plunge into the installation part, I will investigate the benefits of container technologies over other virtualization technologies and compare the different types of containers.

The VMware Hype: a History

I remember being very excited about the new virtualization trend in the early 2000s: with the free VMware Server 2.0, I had the chance for the first time to install and test our cool new IP telephony system for up to 100,000 telephones on my laptop. There was even the prospect of sharing defined test environments by sending around VMs containing all the networks and VMs needed for testing.

Okay, that dream of sharing whole test environments never came true, partly because VMs consume a tremendous amount of hardware resources, especially if you have several VMs encapsulated inside a VM. Still, hypervisor virtualization has proven to have tremendous operational advantages over hardware servers: in our lab as well as in our SaaS production environment, we could spin up a new host whenever needed, without waiting for our managers to approve the order of new hardware and without waiting for its delivery. Cool stuff.

Containers: are they Game Changers?

So, why do we need containers? Both containers and hypervisors offer a way to provide applications with an encapsulated, defined, and portable environment. However, because of the high resource cost and low performance on my laptop, I (as a hobby developer) do not develop my applications on a virtual machine. Even if I decided to do so, the environment on my laptop and the one in production would differ: I would need to convert the image from VMware Workstation format to ESXi format. Network settings would need to be changed. Java code would need to be compiled. Many things can happen before my application reaches its final destination in the production cloud.

Containers promise to improve the situation: in theory, containers are much more lightweight, since you can get rid of the hypervisor layer, the virtual hardware layer and the guest OS layer.

Containers are lightweight and offer better performance

Good performance is not just a promise: I was able to show here that a containerized Rails web application runs 50% faster than the same web application running on Rails on native Windows.

And this is the case even though the container runs on a VirtualBox Linux VM that in turn runs on Windows. The 50% performance gain holds even when we work with shared folders, i.e. even if both the Rails code and the database are located outside the container in a mapped folder on the Linux VM. Quite surprising.

Note, however, that you need to be careful with the automatic share between the Linux VM and the Windows C:\Users folder described in the official Docker documentation: using it causes a performance drop by a factor of ~10 (!).

Container technology increasing the Portability?

Docker evangelists claim that Docker images work the same way in every environment, be it on the developer’s laptop or on a production server. Never again say: „It does not work in production? It works fine on my development PC, though“. When you start a container, you can be sure that all libraries are included in compatible versions.

Never again say: „It does not work in production? It works fine on my development PC, though!“

Okay, over the course of the DDDocker series, you will see that this statement remains a vision that can only be fulfilled by taking certain careful measures. One topic I stumbled over is network connectivity and HTTP proxies. By default, Docker commands as well as cluster technologies like CoreOS depend on Internet connectivity to public services like Docker Hub and cluster discovery services. In the case of the discovery services, this is aggravated by the fact that the discovery service protocol does not yet support HTTP proxies.

Container images might still work in one environment but not in another. It is still up to the image creator to reduce the dependency on external resources. This can be done by bundling the required resources with the application image.

Even with container technologies, it is still up to the image creator to offer a high degree of portability.

E.g. I have found in this blog post that CoreOS (a clustered Docker container platform) requires a discovery agent that can be reached without passing through an HTTP proxy. As a solution to this cluster discovery problem behind an HTTP proxy, a cluster discovery agent could be included in the image. Alternatively, the cluster discovery agent can be installed and configured automatically by automation tools like Vagrant, Chef, Puppet or Ansible. In both cases, the Docker (CoreOS) host cluster no longer needs to access the public cluster discovery service.
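To make the proxy problem a bit more tangible, here is a minimal Python sketch of how a client typically decides whether a request is sent through an HTTP proxy. It is only an illustration under assumptions: the function name goes_through_proxy, the host names and the proxy address are hypothetical, and the matching of the conventional http_proxy/https_proxy/no_proxy variables is simplified compared to what real tools do.

```python
from urllib.parse import urlsplit


def goes_through_proxy(url, env):
    """Return True if a request to `url` would be sent via a proxy.

    `env` is a dict of proxy-related environment variables. This is a
    simplified model (an assumption), not the logic of any specific tool.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Look up the proxy for the URL scheme, e.g. https_proxy or HTTPS_PROXY.
    proxy = env.get(f"{parts.scheme}_proxy") or env.get(f"{parts.scheme.upper()}_PROXY")
    if not proxy:
        return False  # no proxy configured: direct connection

    # Hosts listed in no_proxy (exact match or parent domain) are reached directly.
    no_proxy = env.get("no_proxy") or env.get("NO_PROXY") or ""
    for entry in no_proxy.split(","):
        entry = entry.strip().lower().lstrip(".")
        if entry and (host == entry or host.endswith("." + entry)):
            return False
    return True


# Hypothetical setup: a corporate proxy plus a locally hosted discovery agent.
example_env = {
    "https_proxy": "http://proxy.example:3128",
    "no_proxy": "discovery.internal.example",
}
public_needs_proxy = goes_through_proxy("https://discovery.public.example/new", example_env)
local_needs_proxy = goes_through_proxy("https://discovery.internal.example/new", example_env)
```

In this hypothetical setup, the public discovery service would have to be reached through the proxy, while a discovery agent hosted inside the own network is reached directly. This is why a locally installed discovery agent avoids the problem.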

Containers are getting social

With images that are more lightweight than those of VMware, KVM, Xen & Co, and with public repositories like Docker Hub, the exchange of containerized applications might be a factor that boosts the acceptance of containers in the developer community. Docker Hub still needs to be improved (e.g. the size of the images is not yet visible; the image layers and the status of the images are often opaque to the user), but Docker Hub images can help developers get started more easily with new, unknown applications, since the image comes with all the software and libraries needed.

Containers helping with Resiliency and horizontal Scalability

Containers help with resiliency and with scalability. E.g. CoreOS and Atomic offer options to easily build cluster solutions that automatically restart containers on another cluster node if the original node fails. Moreover, emerging container orchestration systems like Kubernetes offer the possibility to scale containers and their applications horizontally by making sure that the right number of containers of the same type is always up and running. Kubernetes also promises to unify the communication between the containers of one type and the rest of the world, so other applications talking to a scaled application do not need to care how many container instances they are talking to.
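The idea of "making sure that the right number of containers is always up and running" can be sketched as a simple comparison between a desired state and the observed state. The following Python snippet is only a toy model under assumptions: the function reconcile and the container names are hypothetical, and it does not use the Kubernetes API or reflect how Kubernetes is implemented.

```python
import itertools


def reconcile(desired_count, running, name_prefix="web"):
    """Compare desired and observed state and return (to_start, to_stop).

    `running` is the list of container names currently up. The names and
    the naming scheme are made up for this illustration.
    """
    running = list(running)

    # Too many containers: stop the surplus ones.
    if len(running) > desired_count:
        return [], running[desired_count:]

    # Too few containers: pick unused names for the missing ones.
    used = set(running)
    candidates = (f"{name_prefix}-{i}" for i in itertools.count())
    fresh = (name for name in candidates if name not in used)
    to_start = list(itertools.islice(fresh, desired_count - len(running)))
    return to_start, []


# Example: three instances are desired, but "web-1" has died with its node.
to_start, to_stop = reconcile(3, ["web-0", "web-2"])
```

An orchestrator would run such a comparison again and again, so a container that dies together with its node is replaced without an administrator having to step in.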

Why Docker?

Before plunging into container technology, I want to make sure that I spend my free time on the container technology that has the highest potential of becoming the de facto standard. Searching on Google, I found articles about „Docker vs. Rocket vs. Vagrant“ and the like. Some examples are here and here.

Vagrant is often mentioned together with Docker, but it is not a container technology; it is a virtual machine configuration/provisioning tool.

I also found „Docker vs. LXC“ pages on Google. What is LXC? According to Stack Overflow, LXC refers to capabilities of the Linux kernel (specifically namespaces and control groups) which allow sandboxing processes from one another. Flockport points out that LXC is more like a lightweight virtual machine, while Docker is designed for single applications. Docker will lose all „uncommitted“ data after a reboot of the container, while LXC will keep the data, similar to virtual machines. See here for more details.

The CoreOS project, a lightweight, clustered Docker host operating system, has developed a competing container technology called Rocket (say: rock-it). Even after reading the articles, I still do not know who has won the race: Docker or Rocket. However, the post here states:

Yet when you see both Amazon and Google announcing Docker container services within weeks of each other, it’s easy to assume that Docker has won.

Looking at Google Trends, the assumption that Docker has won the race seems to be confirmed.

Let us assume that Docker is a good first choice. It is also supposed to be more mature than Rocket, since it has been available longer. Anyway, Rocket does not seem to be available for Windows (see its GitHub page).*

* = Okay, later I found out that Docker is supported on Windows only by installing a Linux host VM, which in turn hosts the containers. A Docker client talks to a Docker agent on the Linux host, so it looks as if Docker were supported on Windows. Still, I think Docker is a good choice, considering its popularity.

Summary

This blog post has discussed the advantages of container technologies like Docker and Rocket compared to traditional virtualization techniques like VMware.

Docker images are more lightweight and, in a first Ruby on Rails test, a containerized web service has shown 50% higher performance than the same service running on native Windows.
Container images have the potential to improve portability. However, external dependencies limit portability.
Containers help with resiliency, scalability and ease of operations by making sure that a container image gets restarted on another host if the original host dies. Moreover, container orchestration systems like Kubernetes by Google make sure that the administrator-specified number of containers is up and running at all times, and that this set of containers appears to the outside world as if it were a single application.
Google Trends shows that Docker is much more popular than other container technologies like Rocket and LXC.

These are the reasons why I have started to dig deeper into container technology, and why I have chosen Docker to do so. An overview of those efforts can be found here.
