Infrastructure as Code

Infrastructure as code and Docker
Lesson 1: What Is Infrastructure as Code?
Infrastructure as code: What is it? Why is it important?
(This is a fantastic 3 minute explanation of the value of Infrastructure as Code.)

Infrastructure as code (IaC) is the idea that, rather than manually provisioning servers, or setting up hardware through a point-and-click GUI, the "server room" should itself be managed by code. That code can then be put under version control, tested, deployed with automated build tools, and so on. The code also serves as necessarily up-to-date documentation of what the infrastructure is.

The advantages of IaC can be divided into three main categories:

  • Cost savings: By automating hardware provisioning, the time of the people who would have been doing that by hand is freed up for other tasks.
  • Speed of deployment: It is much faster to configure infrastructure by running a script than by manually setting a bunch of parameters in a GUI.
  • Lower error rates: Configuring systems "by hand" is error-prone, because it is boring. A script can be debugged once, and then will run reliably again and again. Furthermore, as code, the infrastructure can be read and reasoned about. It is very hard to do that with a bunch of check-boxes!

(Source: Wikipedia on Infrastructure as Code )
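
To make the idea concrete, here is a minimal Python sketch of "the server room managed by code." It is not the API of any particular tool: the desired state is an ordinary data structure that could be kept under version control, and a function works out which changes are needed. The server names and fields are hypothetical.

```python
# Hypothetical desired state: in a real project this would live in a
# version-controlled file. The names and fields are illustrative only.
DESIRED = {
    "web-1": {"size": "small", "packages": ["nginx"]},
    "db-1": {"size": "large", "packages": ["mysql"]},
}


def plan(desired, actual):
    """Return the actions needed to turn `actual` into `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    # Pretend this is what is currently running.
    actual = {
        "web-1": {"size": "small", "packages": ["nginx"]},
        "old-box": {"size": "small", "packages": []},
    }
    for action, name in plan(DESIRED, actual):
        print(action, name)
```

Because the plan is computed from data, it can be reviewed and tested before anything is touched, which is where the lower error rates come from.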

A Software Engineering principle: Asking people to behave like automatons bores and dehumanizes them. Asking them to devise clever ways to automate things interests them, and treats them as the rational beings that they are!
Aristotle: humans are rational animals.

Lesson 2: Available Tools
Puppet
Pros:
  • Easy installation
  • Supports all major operating systems
  • GUI is user friendly
  • Stable and mature solution
Cons:
  • New users must learn Puppet DSL (domain-specific language)
  • Remote execution is challenging

More on puppet

Chef
Pros:
  • Meant to be used by programmers
  • Useful for large-scale development
  • Stable and mature solution
  • Good version control capabilities
Cons:
  • Complicated tool to use
  • Familiarity with Ruby is desirable.
  • Documentation can be overwhelming

A sample Chef recipe that enables the MySQL service and installs a custom configuration file for it:

                
    service "mysql" do
      supports :restart => true
      action :enable
    end

    if ['solo', 'db_master', 'db_slave'].include? node[:instance_role]
      template "/etc/mysql.d/custom.cnf" do
        owner 'root'
        group 'root'
        source 'custom.cnf.erb'
        notifies :restart, resources(:service => 'mysql')
      end
    end
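
The recipe above restarts MySQL only when the configuration template actually changes. As a rough illustration of that idea (not of how Chef itself is implemented), here is a small Python sketch of an idempotent "ensure this file has this content" step. The function name and the sample setting are assumptions made for the example.

```python
import os
import tempfile


def ensure_file(path, content):
    """Write `content` to `path` only if needed; return True if it changed."""
    try:
        with open(path) as f:
            if f.read() == content:
                return False  # already in the desired state
    except FileNotFoundError:
        pass
    with open(path, "w") as f:
        f.write(content)
    return True


if __name__ == "__main__":
    # Use a temporary directory so the sketch never touches /etc.
    with tempfile.TemporaryDirectory() as tmp:
        cnf = os.path.join(tmp, "custom.cnf")
        for attempt in (1, 2):
            changed = ensure_file(cnf, "[mysqld]\nmax_connections = 100\n")
            # A real tool would restart the service only when changed is True.
            print(f"run {attempt}: restart needed = {changed}")
```

Running the step twice changes the file only the first time, which is what makes such scripts safe to run again and again.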

                
                

More on Chef

Ansible
Pros:
  • Easy and fast deployment
  • Secure SSH connection
  • Meant for environments that can scale rapidly
  • Push and pull models are supported
Cons:
  • Basic support for Windows
  • GUI is not very interactive
  • Difficult to locate syntax errors with YAML

Sample Ansible playbook:

                
    - name: Install WordPress, MySQL, Nginx, and PHP-FPM
      hosts: all
      remote_user: root
      # remote_user: user
      # become: yes
      # become_method: sudo

      roles:
        - common
        - mysql
        - nginx
        - php-fpm
        - wordpress
                
                

Source: https://github.com/ansible/ansible-examples/blob/master/wordpress-nginx/site.yml

More on Ansible

SaltStack
Pros:
  • Implemented in Python and controlled with YAML files which are simple to understand
  • Fast communication between master and client
  • Provides high scalability and resiliency
  • Vibrant support community
Cons:
  • Difficult to set up
  • Salt GUI is under development
  • Does not support a variety of operating systems

Lesson 3: What Is Docker?
Docker versus Virtual Machines
Docker structure

Before we look at how to run Docker, let's examine a few aspects of what Docker is. A very simple way to describe it, for those who know virtual machines, is that it is a lighter-weight, somewhat-less-walled-off virtual machine. But this is only a way to give you a rough idea of how to think about Docker: let's go deeper into the details and see what it really is.

The most important feature of Docker is its extremely fast load time: a container can start in under a second, while a virtual machine may take minutes to boot. Docker achieves this speed increase by building on top of virtualization aspects of the Linux kernel. That is also why Docker is slower, and more difficult to work with, on Mac or Windows machines: there it is actually running a Linux virtual machine, which in turn runs Docker. Ironically, this may mean that your container actually starts up much faster on a busy cloud service than it does on your otherwise idle laptop.

How Docker rests on Linux kernel services

Here are the major technical capabilities upon which Docker relies:

  • cgroups
    cgroups, short for "control groups," limit a process, and all of its children, to certain amounts of CPU, memory, and network bandwidth, and particular namespaces and devices on the system. When we kick off a Docker container, it runs within a cgroup that limits what it can do.
  • namespaces
    Here, the Wikipedia article on Linux namespaces describes these as well as I could wish to, so:
    "Namespaces are a feature of the Linux kernel that partitions kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources. The feature works by having the same name space for these resources in the various sets of processes, but those names referring to distinct resources. Examples of resource names that can exist in multiple spaces, so that the named resources are partitioned, are process IDs, hostnames, user IDs, file names, and some names associated with network access, and interprocess communication."
    "Namespaces are a fundamental aspect of containers on Linux." (See source below.)
    It is worth noting that Google Chrome uses namespaces as a security, not a virtualization, technique.
  • libnetwork
    This library enables separate network connections and IP addresses for processes running on the same machine, so that with docker compose we can run an entire distributed application on our laptops.
  • Union File System
    As Wikipedia has it, "In computer operating systems, union mounting is a way of combining multiple directories into one that appears to contain their combined contents."
    UNIX and Linux filesystems are themselves files. To access them, they must be mounted at some point under the root directory. If we mount two directories at the same mount point, one after another, we will only find the contents of the second there. But if we union mount them, we will have the union of the two.
    Union mounting is how Docker builds up your image from multiple layers. That means you don't have to start from scratch every time you want to make an image. And containers can share files until one of them writes to one, at which point that file will be copied from the read-only shared area to the read-write layer specific to that container (this is called copy-on-write).
  • AppArmor
    "AppArmor ("Application Armor") is a Linux kernel security module that allows the system administrator to restrict programs' capabilities with per-program profiles. Profiles can allow capabilities like network access, raw socket access, and the permission to read, write, or execute files on matching paths." -- Wikipedia
  • libcontainer / runC
    libcontainer is a library written in Go, and runC is the command-line container runtime built on top of it. They execute programs in a chroot-style jail environment. This means that a program is "jailed" within the directory tree rooted in the directory in which the program was started.
  • Netlink
    "The Netlink socket family is a Linux kernel interface used for inter-process communication (IPC) between both the kernel and userspace processes, and between different userspace processes, in a way similar to the Unix domain sockets. Similarly to the Unix domain sockets, and unlike INET sockets, Netlink communication cannot traverse host boundaries." -- Wikipedia
  • Netfilter
    "Netfilter is a framework provided by the Linux kernel that allows various networking-related operations to be implemented in the form of customized handlers. Netfilter offers various functions and operations for packet filtering, network address translation, and port translation, which provide the functionality required for directing packets through a network and prohibiting packets from reaching sensitive locations within a network." -- Wikipedia
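
The union-mount layering described in the list above can be modelled, very loosely, with Python's collections.ChainMap. This is only an analogy: real union file systems work on directories rather than dictionaries, and the sketch does not model file deletion. The layer contents and the helper name are invented for illustration.

```python
from collections import ChainMap

# Read-only image layers, modelled as dicts of path -> contents.
base_layer = {"/bin/sh": "shell", "/etc/motd": "hello from base"}
app_layer = {"/app/main.py": "print('hi')", "/etc/motd": "hello from app"}


def new_container(*image_layers):
    """Stack a fresh writable layer on top of shared image layers.

    Layers are given bottom-first; ChainMap searches its first map first,
    so we reverse them and put the writable layer in front.
    """
    return ChainMap({}, *reversed(image_layers))


if __name__ == "__main__":
    c1 = new_container(base_layer, app_layer)
    c2 = new_container(base_layer, app_layer)

    print(c1["/etc/motd"])   # the upper layer hides the lower one
    print(c1["/bin/sh"])     # still visible from the base layer

    c1["/etc/motd"] = "changed in c1"   # the write lands in c1's own layer
    print(c1["/etc/motd"], "|", c2["/etc/motd"])
```

Both "containers" share the same image layers, and a write in one is invisible to the other, which is the behaviour the read-write layer provides.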

Sources:
Docker: What's Under the Hood?
Linux Namespaces
Introduction to Control Groups
Union mount from Wikipedia
Docker and Union File System

Lesson 4: Running Docker

by
Prashantkumar Patel and Prof. Callahan

First thing: Make sure you have Docker installed! You won't get any further in following along on your laptop if you do not.

Secondly: Please clone (if you have not already) our online DevOps repo:
git clone https://github.com/gcallah/OnlineDevops.git

Once you have cloned that repo, please open two shells: in one, we will look at your local environment, and in the other we will explore the container.

In one of the two shells, in your OnlineDevops directory, please run:
./container.sh
That should put you inside the OnlineDevops container: if that command worked, you should see your prompt change.
If it did, let's explore the shells you are in a little to try to understand better what a container is.
I am going to proceed by showing you the results of running the same command inside and outside the container on my machine: your results will be different from, but similar to, mine.

First of all, let's look at the root file system, inside and outside the container:

Outside the container:
ls /

    Macintosh:OnlineDevops gcallah$ ls /
    Applications            bin                net
    Library                cores                opt
    Network                debug.txt            private
    Shockwave Log            debug.txt.1            sbin
    System                dev                tmp
    User Guides And Information    etc                usr
    User Information        home                var
    Users                installer.failurerequests
    Volumes                logFile.xsl
                    

Inside the container:
ls /

    root@a5dd222a9812:/home/DevOps# ls /
    bin   dev  home  lib64    mnt  proc           root  sbin  sys    usr
    boot  etc  lib     media    opt  requirements.txt  run   srv   tmp    var
                    

What's of note here: From outside and from inside the container, we see completely different file systems! Inside the container, we are in a chroot file system.
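
If you would rather make the same comparison from a script, here is a small Python sketch you could run in both shells. The /.dockerenv check is only a heuristic (Docker usually creates that file inside a container), not a guaranteed test.

```python
import os


def root_entries(root="/"):
    """Return the sorted names found directly under `root`."""
    return sorted(os.listdir(root))


def looks_like_docker():
    """Heuristic only: Docker usually creates /.dockerenv in a container."""
    return os.path.exists("/.dockerenv")


if __name__ == "__main__":
    print("inside Docker (heuristic):", looks_like_docker())
    print(" ".join(root_entries()))
```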

What about our view of what processes are running?
Outside the container we see:
ps -ef | wc -l

     Macintosh:OnlineDevops gcallah$ ps -ef | wc -l
     667
                    

From outside the container, the OS lists 667 processes as running on my Mac.

Inside the container we see:
ps -ef

    root@a5dd222a9812:/home/DevOps# ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 Oct25 pts/0    00:00:00 bash
    root       192     1  0 21:18 pts/0    00:00:00 ps -ef
                    

From inside the container, there are two processes running! The container has process isolation from the host. It has its own process namespace separate from its host's namespace and from the namespace of any other containers running on that host. The separate namespace also provides the container with its own hostname, its own user IDs, and its own inter-process communication names.
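
On Linux, the namespaces a process belongs to are visible as symbolic links under /proc. The following sketch, which assumes a Linux system (such as the inside of the container) and returns nothing useful elsewhere, prints the hostname and the namespace identifiers of the current process. Two processes that print different identifiers for the same namespace type are in different namespaces.

```python
import os
import socket


def namespace_ids(pid="self"):
    """Map namespace name -> identifier, in the form 'pid:[<inode>]'.

    Reads the /proc/<pid>/ns symlinks, which exist only on Linux;
    returns an empty dict elsewhere, and stops early if access is denied.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        for name in sorted(os.listdir(ns_dir)):
            ids[name] = os.readlink(os.path.join(ns_dir, name))
    except OSError:
        pass
    return ids


if __name__ == "__main__":
    print("hostname:", socket.gethostname())
    for name, ident in namespace_ids().items():
        print(f"{name:20} {ident}")
```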

Some Docker commands

Now let's look at which Docker commands are available, and what they do.

docker ps

This will list your currently running Docker containers. When Prof. Callahan runs it while preparing this lecture, he sees:

    Macintosh:OnlineDevops gcallah$ docker ps
    CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                     NAMES
    429ff1a37d21        devops              "bash"              11 seconds ago      Up 6 seconds        0.0.0.0:32768->8000/tcp   stupefied_swanson
                    

(The 0.0.0.0 means the port is published on all of the host's network interfaces. 32768 is the port on the host, and traffic sent to it is forwarded to port 8000 inside the container.)
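
To make the parts of the PORTS column explicit, here is a short Python sketch that splits a mapping such as the one in the listing above. It handles only this simple single-mapping form, and the function name is made up for the example.

```python
def parse_port_mapping(mapping):
    """Split '0.0.0.0:32768->8000/tcp' into its four parts."""
    host_side, container_side = mapping.split("->")
    host_ip, host_port = host_side.rsplit(":", 1)
    container_port, protocol = container_side.split("/")
    return {
        "host_ip": host_ip,
        "host_port": int(host_port),
        "container_port": int(container_port),
        "protocol": protocol,
    }


if __name__ == "__main__":
    print(parse_port_mapping("0.0.0.0:32768->8000/tcp"))
```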

docker ps -a
This will list all containers on the system, including those that have exited or were created but never started, not just those that are running:

    Macintosh:OnlineDevops gcallah$ docker ps -a
    CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                           PORTS                    NAMES
    c560c3729d7e        devops              "bash"              18 minutes ago      Up 18 minutes                    0.0.0.0:8000->8000/tcp   brave_kirch
    429ff1a37d21        devops              "bash"              About an hour ago   Exited (130) 44 minutes ago                               stupefied_swanson
    32b272487675        devops              "bash"              5 days ago          Created                                                   wonderful_heyrovsky
    a5dd222a9812        devops              "bash"              10 days ago         Exited (130) About an hour ago                            naughty_beaver
    e82e4fb65823        devops              "bash"              11 days ago         Exited (130) 11 days ago                                  youthful_noyce
    0433f774c394        devops              "bash"              11 days ago         Created                                                   stupefied_hawking
    1ba39e023554        devops              "bash"              11 days ago         Created                                                   practical_ptolemy
    4a6bc8cf7ab0        devops              "bash"              11 days ago         Created                                                   vigilant_curie
    47cb14ca5b1b        devops              "bash"              2 weeks ago         Exited (255) 11 days ago         0.0.0.0:8000->8000/tcp   determined_goldwasser
    7af5f14f88d9        f418f33054e8        "bash"              2 weeks ago         Exited (130) 2 weeks ago                                  modest_hypatia
    e3292cdc1449        indra               "bash"              2 weeks ago         Exited (130) 2 weeks ago                                  competent_hugle
    3ad630752ebf        indra               "bash"              2 weeks ago         Exited (130) 2 weeks ago                                  amazing_meitner
    fe439d8e15c2        indra               "bash"              2 weeks ago         Exited (130) 2 weeks a
                        

docker images

This command lists the images that are available on the system. For example, on Prashant's system it looks something like this:

    ENG-EJC369-02:$ docker images
    REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
    gcallah/jenkins_py3   latest              21130e104ca1        4 weeks ago         742MB
    jenkins               latest              cd14cecfdb3a        6 weeks ago         696MB
    gcallah/indra         v7                  ad9670e8b27f        2 months ago        946MB
    python                latest              efb6baa1169f        5 months ago        691MB
    ubuntu                latest              f975c5035748        5 months ago        112MB
    gcallah/emu86         v4                  f6833ae8bf9e        6 months ago        776MB
    gcallah/django        latest              432de70e222d        6 months ago        769MB
    bash                  latest              59507b30b48a        6 months ago        12.2MB
    alpine                latest              3fd9065eaf02        7 months ago        4.15MB
                    

In the above listing, the nginx image is not installed. Running the following command will pull the nginx image from Docker Hub, which is like GitHub, but for Docker images.

docker pull nginx

Now that you have installed the nginx image, run the docker images command again. In the list you should see the nginx image, as below:

    ENG-EJC369-02:$ docker images
    REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
    nginx                 latest              71c43202b8ac        7 hours ago         109MB
    gcallah/jenkins_py3   latest              21130e104ca1        4 weeks ago         742MB
    jenkins               latest              cd14cecfdb3a        6 weeks ago         696MB
    gcallah/indra         v7                  ad9670e8b27f        2 months ago        946MB
    python                latest              efb6baa1169f        5 months ago        691MB
    ubuntu                latest              f975c5035748        5 months ago        112MB
    gcallah/emu86         v4                  f6833ae8bf9e        6 months ago        776MB
    gcallah/django        latest              432de70e222d        6 months ago        769MB
    bash                  latest              59507b30b48a        6 months ago        12.2MB
    alpine                latest              3fd9065eaf02        7 months ago        4.15MB
                    

(You can put an image into Docker Hub using docker push.)

Download website code

Ok, now that you have downloaded the nginx image, let's download the static website that you are going to host inside the Docker container. We are going to use the algorithms website for another course. You can find the code for the website here.

Please remember the location where you have cloned the repository. Prashant has cloned it in /Users/prashant/school/algorithms. Your location will be different, so please make a note of it.

Let's make a container

Ok, so we have downloaded the nginx image and the code of the website that we want to host. All we need to do now is make a container out of the image. We will make the website's code available inside the container so that the web server (nginx, in our case) can read the HTML files and serve them locally. The command for that is shown below.

docker run --name algo_website -p 127.0.0.1:8080:80 -v /Users/prashant/school/algorithms/:/usr/share/nginx/html -d nginx

Please don't forget to change the location of the algorithms directory in the above command. After you run the command, open a browser and go to http://localhost:8080, and you should see the webpage. Windows users may need to use the address http://0.0.0.0:8080 instead of localhost.
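
The following Python sketch spells out what each part of the docker run command above does by building the same argument list piece by piece. The function name and the placeholder path are assumptions; the sketch only prints the command, and the commented-out last line shows how you might actually run it.

```python
import shlex


def nginx_run_command(site_dir, name="algo_website", host_port=8080):
    """Build the argument list for the `docker run` command shown above."""
    return [
        "docker", "run",
        "--name", name,                             # name of the new container
        "-p", f"127.0.0.1:{host_port}:80",          # host address:port -> container port 80
        "-v", f"{site_dir}:/usr/share/nginx/html",  # mount the site where nginx serves files
        "-d",                                       # detached: run in the background
        "nginx",                                    # image to create the container from
    ]


if __name__ == "__main__":
    # Replace the path with wherever you cloned the website.
    cmd = nginx_run_command("/path/to/algorithms")
    print(shlex.join(cmd))
    # To actually start the container: subprocess.run(cmd, check=True)
```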

When we are done with the container, we can get rid of it using docker rm algo_website. Because we started it in the background with -d, it will still be running, so we have to stop it first with docker stop algo_website.

Our Docker Implementation

We use Docker in our projects for two main reasons, with a third to come:

  1. To set up a local version of a web server that will be configured "just like" our production server. ("Just like" is in quotes because that is always the ideal, but it may not be fully achieved.)
  2. To provide our full suite of development tools, such as the correct Python version, make, flake8, various Python libraries, etc., in one simple-to-build package, so all developers have a consistent environment.
  3. Ultimately, we should be deploying the container where we locally test our web servers right into production, guaranteeing that development and production are identical environments. Unfortunately, at the moment, the places we are hosting do not support that. We are exploring other options.

So we need to know how to create the right container for each project. Each project we work on should have a Dockerfile consisting of instructions on how to build the image for that project, a requirements.txt listing what external modules need to be included in the image, and a line in the project's makefile automating the build of the image. This is infrastructure as code, since the infrastructure for the project is built from these files of code. (For development containers, we also should have a file called requirements-dev.txt.)

So, in the makefile we want something like:

container: $(DOCKER_DIR)/Dockerfile $(DOCKER_DIR)/requirements.txt
        docker build -t indra docker

  • Here is a sample Dockerfile.
    The FROM python:3.6.0 instruction says what base image to build our image from.
    The COPY requirements.txt /requirements.txt line brings the requirements file into the image.
    The line RUN pip install -r requirements.txt installs everything from the requirements file in the image.
    The line ENV user_type TERMINAL sets an environment variable (user_type) that will be available inside the container.
    WORKDIR /home/IndrasNet/ sets the starting directory inside the container.
  • Here is the requirements file it uses.
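
As a rough illustration of how the five instructions described above fit together, this Python sketch assembles them into the text of a Dockerfile. The order of the lines is an assumption (the real Dockerfile is the one linked above), and the function name is hypothetical.

```python
def render_dockerfile(base="python:3.6.0", workdir="/home/IndrasNet/",
                      env=("user_type", "TERMINAL")):
    """Return Dockerfile text using the five instructions described above."""
    lines = [
        f"FROM {base}",                             # base image
        "COPY requirements.txt /requirements.txt",  # bring the file into the image
        "RUN pip install -r requirements.txt",      # install the listed modules
        f"ENV {env[0]} {env[1]}",                   # environment variable
        f"WORKDIR {workdir}",                       # starting directory
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile(), end="")
```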
Our More Advanced Docker Usage

In the Online DevOps course project, we use a more advanced Docker setup: we employ Docker Compose to define and run an application composed of more than one Docker container.

Docker Compose uses a YAML file to specify the configuration of a multi-container Docker app.

In the Online DevOps course setup, we are running just two containers: one to run the MySQL database, and the other to run our Django web server. But other Docker Compose setups might include a web server, a load balancer, a database, an authentication server, and more!
Here is the YAML file that specifies our two-container application.
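
To give a feel for the shape of such a configuration, here is a Python sketch that builds a two-service setup as a dictionary and prints it as JSON. It is not the course's actual file: the service names, port numbers, and password are placeholders. Since YAML is designed as a superset of JSON, the printed output should be readable wherever a YAML Compose file is expected, but treat that as an assumption to verify with your own tooling.

```python
import json


def compose_config(db_password="change-me"):
    """Return a hypothetical two-service Compose configuration as a dict."""
    return {
        "services": {
            "db": {
                "image": "mysql",
                "environment": {"MYSQL_ROOT_PASSWORD": db_password},
            },
            "web": {
                "build": ".",             # build the web image from ./Dockerfile
                "ports": ["8000:8000"],   # host port 8000 -> container port 8000
                "depends_on": ["db"],     # start the database container first
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(compose_config(), indent=2))
```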

Other Readings
Quiz

    The Docker server is responsible for...?

    1. creating images
    2. creating containers
    3. maintaining containers
    4. all of the above

    Infrastructure as code is less error-prone than manual deployment because...?

    1. it can be debugged once and that's that
    2. manual deployment is boring, and humans will make errors
    3. we can look over the code and reason about it
    4. all of the above

    'COPY index.js .' means...?

    1. we are turning index.js into a hidden file
    2. we are copying index.js from the current working directory inside the Docker container to the local filesystem
    3. we are copying index.js from the local filesystem to the current working directory inside the Docker container
    4. the above code is NOT syntactically correct

    Among the reasons infrastructure as code (IAC) is desirable is/are...?

    1. IAC can be put under version control
    2. IAC can be automatically deployed
    3. IAC can be put through automated testing
    4. all of the above

    Docker Compose...?

    1. is a CLI that gets installed with Docker
    2. can be used for running multiple containers
    3. automates some of the commands for running Docker
    4. all of the above

    The Docker client is...?

    1. a tool that we issue commands to
    2. responsible for creating images, running containers, etc.
    3. another name for the Docker daemon
    4. both 1 and 2

    To store the data generated by the Docker container, we use...?

    1. volumes
    2. Docker Hub
    3. makefile
    4. shell script

    Docker Hub is...?

    1. an IDE for running Docker containers
    2. the equivalent of GitHub, but for storing Docker images
    3. a container orchestration tool
    4. the name of the company maintaining Docker

    'docker pull redis'...?

    1. will pull the redis image from Docker Hub
    2. will pull the redis container from Docker Hub
    3. will create a Docker container with the name redis
    4. is an invalid command

    Frequently changing parts of our program should be...?

    1. at the top of the Dockerfile
    2. in the middle of the Dockerfile
    3. at the bottom of the Dockerfile
    4. left out of the Dockerfile