ZHAW Zurich University of Applied Sciences
Bachelor's degree programme Computer Science
Bachelor Thesis

Code Driven Infrastructure and Deployment

Author: Fabian Vogler, fabian.vogler@gmail.com
Student number: 10-966-596
Advisors: Beat Seeliger, Silvan Spross
2015-07-31, Version 1.1

This thesis was done as part of a bachelor's degree study at ZHAW Zurich University of Applied Sciences in Zurich. The source code of this document is available online at https://github.com/fabian/code-driven-infrastructure-and-deployment.

All sources are numbered [n] and listed in the bibliography in the appendix. Basic knowledge in computer science is required for reading and understanding the thesis. The most important acronyms and concepts are explained in the glossary that can be found in the appendix as well.

This document has been written in HTML and was converted into a PDF document with Prince. The font used is Helvetica Neue, created by D. Stempel AG and based on Helvetica by Max Miedinger.

The icons used on the cover and in this document are from small-n-flat; they are released into the public domain.

Abstract

WorldSkills International manages its web infrastructure manually with a small team of developers. There is no designated system administrator; the responsibility of managing the web infrastructure is shared among the developers. The author of this thesis is employed as a developer by WorldSkills International. The goal of this thesis is to develop a concept for a testable and reproducible infrastructure where all changes are made within a revision control system. This increases the visibility of changes within the team of developers and makes every change traceable.

Many software solutions exist for the provisioning of servers and the deployment of applications. They can be grouped into three types: software containers, configuration repository, and remote command execution. With software containers, the software needed to run an application is encapsulated and run in operating system-level virtualization. A configuration repository is a centralized repository of configuration files from which client software configures the servers. With remote command execution, installation commands and configuration files are transmitted to a remote server in a coordinated manner.

The current infrastructure requirements were analyzed and documented. The different types of software for the provisioning of servers and the deployment of applications were considered for solving the problem. Each type was evaluated with a popular representative according to the requirements. The evaluation showed that software containers need additional software for orchestrating provisioning and deployment. Due to this additional complexity, and because no requirement demands software containers, their introduction was deferred to a potential separate project after this thesis. The automation software Ansible was chosen as the best fit for the requirements.

An architecture concept for an automated infrastructure was developed and successfully verified in a proof-of-concept. All required software and configuration files are defined in structured text files that can be read and transmitted to the server with Ansible. Local development and testing can be done in a virtual machine on the developers' computers. The whole infrastructure can easily be cloned by running the provisioning scripts against new servers. A continuous integration server executes the provisioning script after each change and verifies the infrastructure with system tests. The automated creation of the proof-of-concept infrastructure takes about 20 minutes. The automated process of setting up a new infrastructure environment for testing is initiated simply by creating a new branch adhering to a naming scheme. The implementation of the architecture concept and a migration to the new automated infrastructure are planned for fall 2015.

Contents

    Introduction

    Current Situation

    WorldSkills International is a non-profit membership association which organizes a world championship in skilled professions every two years. The author of this thesis is employed by WorldSkills International. To manage members and to organize the preparation and execution of the competition the organization runs multiple web applications. The mix of PHP and Java applications consists of legacy systems and a newly developed software system with a service-oriented architecture.

    Screenshots applications

    All web software is running on rented virtual servers with a Linux operating system. They are managed manually using a web control panel (Parallels Plesk). Changes to the infrastructure are made manually by the four internal software developers.

    Fundamental changes like the migration to a new server or the switch to a new runtime engine require a lot of knowledge about the existing system and manual testing of the new installation.

    Objectives

    The main goal of this thesis is to develop a concept for a versioned, testable and reproducible infrastructure. Changes to the system should be visible to the IT team and traceable if needed. As a result, knowledge is shared in written form.

    To achieve this goal, the manual steps to build or change the infrastructure should be replaced by code stored in a revision control system. Three different types of software for the provisioning of servers and deployment exist at the moment: software containers, configuration repository, and remote command execution.

    These types of software should be evaluated and an architecture documentation as well as a test concept should be written. The proposed architecture should then be tested in a proof-of-concept.

    Tasks

    The following tasks will be completed by the student as part of the bachelor thesis:

    1. Analyze infrastructure requirements
    2. Test requirements with popular representatives for each type of provisioning software
    3. Evaluate provisioning software
    4. Write architecture documentation
    5. Develop test concept
    6. Implement proof-of-concept with automated provisioning and deployment

    Project Management

    The following Gantt diagram shows an overview of the actual timeline of the project as well as the dates of the most important milestones.

    [Gantt chart: project timeline from February to July 2015 with the milestones Kick-Off (25.02.15), Design Review (29.04.15) and final date (31.07.15); phases: requirements analysis, software evaluation, architecture concept, test concept, proof-of-concept]
    Project schedule

    According to the rules at least 360 hours have to be invested into the bachelor thesis. The planning was done according to those hours.

    Both the requirements analysis and the software evaluation took longer than expected as many aspects had to be studied in detail. In return, the architecture concept could be completed faster. The proof-of-concept also took less time than expected as only a small number of problems occurred during development. Writing the documentation was slightly underestimated.

    Description Planned Actual
    Requirements analysis 64 h ~72 h
    Software evaluation 48 h ~64 h
    Architecture concept 88 h ~80 h
    Test concept 40 h ~40 h
    Proof-of-concept 80 h ~64 h
    Write documentation 40 h ~56 h
    Total 360 h 376 h
    Comparison planned and actual hours

    Requirements

    Overview

    The following requirements were established by analysing the existing infrastructure and taking known problems into account.

    Stakeholders

    The table below shows the stakeholders identified for the current infrastructure. They have a direct or indirect influence on the requirements. These stakeholders are also used in the system context diagram.

    Stakeholder Description
    Developer

    Works at WorldSkills International and is responsible for the development and maintenance of the infrastructure. There are four developers working full-time for WorldSkills International. There is no designated system administrator.

    Developers have different backgrounds and therefore different knowledge about specific components of the infrastructure. They all share the responsibility for keeping the infrastructure running.

    All developers are using Mac OS X for development. They work from three different time zones.

    User

    Interacts with applications running on the WorldSkills infrastructure. This includes the Secretariat, Members, competition personnel and website visitors. Most of them are registered users. Their expectations for fast and continuously running services influence the requirements.

    Hosting provider

    Provides virtual Linux servers for the infrastructure. As hosting providers operate in a highly competitive and fast-moving market, their offering can become obsolete, requiring a switch to another hosting provider with a better offering.

    GitHub

    Hosts Git code repositories for WorldSkills International. Provides a web interface for managing permissions of the repositories. They control how code can be accessed.

    Codeship

    Provides a hosted continuous integration software for WorldSkills International. The software is based on Linux with support for PHP, Java and JavaScript applications. Their functionality defines how applications can be built and deployed.

    Stakeholders

    System Context

    System context

    Both Developer and User need to interact with the infrastructure or the applications running on it. They have a direct influence on the requirements and lie within the system context.

    The source code of most applications is stored on GitHub. Codeship is used for running automated tests and executing the deployment of new versions. The hosting provider supplies the servers for running the infrastructure. All three vendors influence the requirements indirectly with the constraints of their services. They are outside of the system context as they cannot be influenced.

    Applications

    The current infrastructure is composed of multiple applications deployed on three servers. There is no requirement to keep them on separate servers. The following diagram gives an overview of all applications. A short description of each application can be found in the appended table.

    System overview

    The following table lists all applications and their special requirements.

    Application Description Requirements
    RabbitMQ MySQL JavaMail Uploads
    Web services Java applications for managing organization information Yes Yes Yes Yes
    Management JavaScript applications for accessing the web services
    Auth PHP application for login
    worldskills.org Organization website Yes Yes Yes
    WSC2015 website WorldSkills São Paulo 2015 event website Yes Yes
    WSC2017 website WorldSkills Abu Dhabi 2017 event website Yes Yes
    Members map World map with Facebook pages from other countries
    IL PHP application for managing infrastructure lists Yes
    Aggregator PHP application which serves the mobile app content Yes
    Who-is-who PHP application for managing organization personnel Yes Yes
    Registrations PHP application for registering people and a web service Yes
    Forums Discussion forums Yes Yes
    Portal Website with information about WorldSkills Competition participants Yes Yes
    CIS demo Competition Information System demo Yes Yes
    Rooms Java application for reserving meeting rooms Yes Yes
    Archive Static copies of old event websites
    Mailer PHP application for sending emails to groups of people Yes
    SMP Skill Management Plan for planning the skill competitions Yes Yes
    CPT Competition Planning Timetable with important deadlines Yes Yes
    Application requirements

    User Stories

    User stories are used in this chapter to describe the functional requirements for the new infrastructure. They are prioritized in agreement with the developers at three levels: Must, Should, Could.

    Name R01 PHP applications
    Description As a developer I want to run multiple PHP applications so users can access them. A PHP application usually needs a MySQL database, the source code is stored on GitHub.
    Acceptance Criteria Each PHP application is running and can be accessed with a web browser.
    Priority Must

    Name R02 Java applications
    Description As a developer I want to run multiple Java applications on a Tomcat server so users and other applications can use the services provided by the applications. A Java application usually needs a MySQL database, the source code is stored on GitHub.
    Acceptance Criteria Each Java application is running and can be accessed over HTTP/S.
    Priority Must

    Name R03 Server configuration
    Description As a developer I might want to change a server configuration file to optimize a setting. For example the MySQL Query Cache size needs to be increased. Another example would be that the servlet container configuration needs a new variable due to a change in the application.
    Acceptance Criteria A configuration file gets modified and the change is pushed to the code repository. The new configuration file is automatically deployed to the server and the affected applications load the new configuration.
    Priority Must

    Name R04 Application deployment
    Description As a developer I want to deploy a new version of an application so users can benefit from new features or bug fixes. The deployment causes no downtime of the application.
    Acceptance Criteria A new version of an application gets pushed to the code repository. Automated tests of the application are executed and if they pass the application gets deployed to the server. The old version keeps responding to requests until the new version is ready.
    Priority Should

    Name R05 Staging environment
    Description As a developer, I want to test a new version of a web service in a staging environment so I can make sure it works correctly with all other components of the system.
    Acceptance Criteria A new version of a web service is pushed in a separate branch, the functionality is available for testing in a staging environment within 30 minutes from the push.
    Priority Should

    Name R06 Hosting provider switch
    Description As a developer I want to switch the server hosting provider so I can profit from a better offer. Another reason could be that the current hosting provider shuts down or that using it is no longer justifiable because of its actions (e.g. security problems).
    Acceptance Criteria The infrastructure can be ported to a different provider within 48 hours. All needed software is installed automatically, databases and user files are transferred manually.
    Priority Could

    Non-functional Requirements

    The following non-functional requirements are not exhaustive but they are the most critical ones. They are classified according to the quality model of ISO/IEC 25010:2011 and prioritized in agreement with the developers at three levels: Must, Should, Could.

    The requirements are based on the current infrastructure and events in the past related to it.

    Name R11 Configuration files
    Classification Changeability
    Description All configuration files are stored in a code repository.
    Priority Must

    Name R12 Change history
    Classification Accountability
    Description Every change must be traceable by a developer. Associated with every change is an explanation.
    Priority Must

    Name R13 Open Source
    Classification Replaceability
    Description All software used for the infrastructure has to be built on open-source software. This guarantees that components can easily be ported to different providers or maintainers. It also allows other Members to easily copy parts of the infrastructure.
    Priority Must

    Name R14 Test environment
    Classification Testability
    Description To test configuration changes, the whole infrastructure or parts of it can be started in a test environment. This differs from the staging environment in that the test environment can be local and automated tests are executed against it.
    Priority Must

    Name R15 Encrypted passwords
    Classification Confidentiality
    Description Server passwords should be stored on third-party systems only in encrypted form. The advantage of storing encrypted passwords in the code repository and sharing a key file, instead of sharing the passwords themselves in a file, is that the file doesn't need to be redistributed to everyone each time a password is added.
    Priority Must

    Name R16 Superuser access
    Classification Technical accessibility
    Description In case of a problem that only occurs in a certain environment, a developer needs unrestricted access to the server to debug the error and try out different solutions.
    Priority Must

    Name R17 Custom software
    Classification Interoperability
    Description New software can be installed without restrictions. New features or analytics tools might require the installation of additional software.
    Priority Must

    Name R18 Learning curve
    Classification Learnability
    Description How to use the software system to install and configure the infrastructure can be learned quickly so all developers can make changes to the infrastructure without spending weeks studying it.
    Priority Should

    Name R19 Horizontal scaling
    Classification Changeability
    Description Horizontally scalable applications can be installed on multiple servers and served through a load balancer.
    Priority Could

    Evaluation

    Introduction

    Based on the given requirements the following evaluation defines the guiding principles for the architecture concept. It compares the different approaches for configuring and deploying software and how well they fit to the existing environment.

    Only specific attributes of the approaches and their software are analyzed; general comparisons have been written before.

    Cloud solutions for managing applications exist, but they usually come with vendor lock-in and are targeted at high volumes. Our scaling requirements are low as the infrastructure mainly needs to serve the Competition and the Members, and both are limited by other resources (e.g. Member budgets).

    Software Containers

    This type of software recently became popular with Docker. The idea is to retain the advantages of a virtual machine (isolation, portability, efficiency) while sharing resources. Docker has been chosen as it is perceived as the most active project. A similar project is rkt (Rocket) by CoreOS.

    Logo Docker

    Docker only works on Linux, so for development on Mac OS X the software Boot2Docker is used on the development laptop. Installation instructions are provided in the Docker documentation in the chapter Installation for Mac OS X.

    After installation the virtual machine with Linux running Docker is launched with the following two commands.

    $ boot2docker init
    $ boot2docker start
    Waiting for VM and Docker daemon to start...
    .............ooo
    Started.
    Writing /Users/fabian/.boot2docker/certs/boot2docker-vm/ca.pem
    Writing /Users/fabian/.boot2docker/certs/boot2docker-vm/cert.pem
    Writing /Users/fabian/.boot2docker/certs/boot2docker-vm/key.pem
    
    To connect the Docker client to the Docker daemon, please set:
        export DOCKER_HOST=tcp://192.168.59.103:2376
        export DOCKER_CERT_PATH=/Users/fabian/.boot2docker/certs/boot2docker-vm
        export DOCKER_TLS_VERIFY=1
    
    Boot2Docker initialization

    Another useful tool provided by Docker is Docker Compose, which allows multiple instances to be started based on a configuration file. However, its usage in production is not recommended at the moment as it's missing some features for managing running instances.

    During the following verification The Docker Book was used as a reference.

    Requirements verification

    Requirement R01 PHP applications
    Verification

    Docker publishes official images for certain applications and programming languages in the Docker Hub Repository. There's also an official image for PHP which can be found on GitHub. Maintained images for recent PHP versions are provided with built-in Apache or PHP-FPM for running the application as a service.

    The easiest way to run a PHP application with Docker is to use the Apache web server so the application itself and static files (JavaScript, CSS, etc.) can both be served from one process.

    The MySQL server would be started in a separate instance, so an additional Dockerfile for MySQL is needed, which then has to be linked with the container running the PHP application. To simplify this, Docker Compose can be used: a tool that starts multiple Docker instances based on a configuration file.
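    A minimal Docker Compose file for such a setup might look as follows. This is only a sketch: the image tag, port mapping and password are illustrative assumptions, and the web container is assumed to be built from a Dockerfile based on the official PHP image.

    web:
      build: .            # Dockerfile based on the official PHP/Apache image
      ports:
        - "80:80"
      links:
        - db              # MySQL container reachable from PHP under the hostname "db"
    db:
      image: mysql:5.6
      environment:
        MYSQL_ROOT_PASSWORD: secret

    Running docker-compose up would then build and start both containers together.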

    Verdict Pass

    Requirement R02 Java applications
    Verification

    For Tomcat official images are available from the Docker Hub Repository as well. Multiple versions for Java 7 and 8 are available.

    Configuration and the application code can be copied to the working directory of Tomcat - the application is then started automatically. Multiple applications can be bundled into one container or one container per application can be used. Again as with PHP, a MySQL server would be started in a separate instance.

    Verdict Pass

    Requirement R03 Server configuration
    Verification

    To guarantee reproducible containers, configuration files are usually copied into the container at build time. This allows the container to be tested in a staging environment and guarantees the same results in the production environment.

    Docker provides the tools to build the image and upload it to a central repository, however it doesn't provide any tools to do an actual deployment of the image to a live environment out of the box.

    Verdict Fail

    Requirement R04 Application deployment
    Verification

    There are different ways to implement this with Docker. One way would be to build a new image, deploy it separately and then route requests with a load balancer to the new image. However, as noted before, Docker doesn't provide tools for deploying the image or for orchestrating running machines.

    Another possibility would be to use volumes to deploy the new application into the running container but this would require each application to have built-in support for zero downtime deployment, which is not always the case (Tomcat supports this with parallel deployment but not PHP).

    Verdict Fail

    Requirement R05 Staging environment
    Verification

    Containers can be started quickly in a new environment (e.g. a new virtual server). Database fixtures can be loaded from additional containers which are linked to the database container.

    Verdict Pass

    Requirement R06 Hosting provider switch
    Verification

    Thanks to virtualization the only requirement for a new hosting provider would be to run Docker. Transfer of the data is possible by launching additional containers on the old server and mounting the data volumes. The installation of Docker would need to be done manually if Docker is not pre-installed by the hosting provider. But more importantly the configuration of the host machine (e.g. SSH keys, logging, security settings) could not be automated with Docker alone.

    Verdict Fail

    Requirement R11 Configuration files
    Verification

    All configuration files can be stored in the code repository and then added to the container during build time.

    Verdict Pass

    Requirement R12 Change history
    Verification

    By storing all Dockerfiles and configuration files in a code repository and building the container images on a continuous integration server each change can be traced back.

    Verdict Pass

    Requirement R13 Open Source
    Verification

    Both Docker and the registry server for Docker are Open Source on GitHub, they are licensed under the Apache License.

    Verdict Pass

    Requirement R14 Test environment
    Verification

    Multiple containers can be launched locally so tests can be executed.

    Verdict Pass

    Requirement R15 Encrypted passwords
    Verification

    Secret data is usually passed as environment variables to the Docker container. Docker doesn't provide functionality for encrypting information.

    Verdict Fail

    Requirement R16 Superuser access
    Verification

    Containers can be accessed from the outside with different methods and superuser access to the host system is required by Docker.

    Verdict Pass

    Requirement R17 Custom software
    Verification

    There are some limitations on what can be run inside a container at the kernel level, but they do not affect most use cases.

    Verdict Pass

    Requirement R18 Learning curve
    Verification

    Immutable applications are a complex topic and Docker requires specific methods for normal tasks.

    Verdict Fail

    Requirement R19 Horizontal scaling
    Verification

    Multiple instances of images can be started in parallel and Docker Swarm, a clustering tool for Docker, can be used to manage them.

    Verdict Pass

    Summary Software Containers

    One notable property of Software Containers is their iteration speed. Due to how file changes are saved by the file system, setup commands don't need to be repeated each time but the resulting files can be restored within seconds. This makes experimenting with the container much faster than setting up a complete virtual machine each time.

    It became clear that Docker itself is a great tool for reducing the overhead of virtual machines, however it is not a complete solution for infrastructure management (yet).

    Advantages
    • Active Docker community
    • Software can be updated for each container independently
    • Clear separation of concerns
    • Can be scaled well horizontally
    • Fast development iterations

    Disadvantages
    • Added complexity
    • Additional virtualization layer makes debugging harder
    • New technology, still in development

    Analysis Software Containers

    A comparison of all requirements can be found after the evaluation of the three different software types.

    Configuration Repository

    Chef and Puppet are the most popular representatives of this type of software. It has existed for a few years now and aims to provide a central repository for infrastructure configuration from which server clients regularly pull their changes. This type of software usually also provides a way to use it without a central server, but that loses the advantage of having a central point to coordinate changes.

    Logo Chef

    Chef is used here because of already existing Ruby knowledge and because of its popularity. It usually runs on a Linux machine, so for development a virtual machine with Linux is required. To simplify the setup of the virtual machine, Vagrant can be used. Vagrant is software to run virtual machines based on a configuration file. After installing Vagrant, a virtual machine can be started with a Linux image provided by Chef which already has all the required software pre-installed.

    $ vagrant up
    Bringing machine 'default' up with 'virtualbox' provider...
        default: The Berkshelf shelf is at "/Users/fabian/.berkshelf/…"
    ==> default: Sharing cookbooks with VM
    ==> default: Importing base box 'chef/ubuntu-14.04'...
    […]
    ==> default: Running provisioner: chef_solo...
        default: Installing Chef (latest)...
    Generating chef JSON and uploading...
    ==> default: Running chef-solo...
    ==> default: stdin: is not a tty
    ==> default: […] INFO: Forking chef instance to converge...
    ==> default: […] INFO: *** Chef 12.2.1 ***
    ==> default: […] INFO: Chef-client pid: 2030
    ==> default: […] INFO: Setting the run_list to ["recipe[example-chef::default]"]
    ==> default: […] INFO: Run List is [recipe[example-chef::default]]
    ==> default: […] INFO: Run List expands to [example-chef::default]
    ==> default: […] INFO: Starting Chef Run for example-chef-berkshelf
    ==> default: […] INFO: Running start handlers
    ==> default: […] INFO: Start handlers complete.
    ==> default: […] INFO: Chef Run complete in 0.017230899 seconds
    
    Vagrant initialization with Chef

    Chef organizes commands in recipes and cookbooks. Variables are stored in files called data bags. The books Cooking Infrastructure by Chef and Taste Test were used during the following verification.

    Requirements verification

    Requirement R01 PHP applications
    Verification

    Cookbooks for PHP are available in the Supermarket, the official repository for cookbooks. The PHP cookbook is published and maintained by the company behind Chef itself.

    However, the functionality provided by the cookbook is mostly focused on the PHP extension repository PEAR, which is not a requirement. A more popular alternative is the apache2 cookbook. It provides the possibility to install the Apache web server with mod_php to run PHP within the web server.

    For MySQL there's an official cookbook available as well which installs and starts a MySQL server as required.

    Verdict Pass

    Requirement R02 Java applications
    Verification

    The Supermarket also has an official cookbook for Tomcat. Java must be installed separately, again there's a cookbook available in the Supermarket. The same MySQL cookbook as mentioned before can be used.

    Verdict Pass

    Requirement R03 Server configuration
    Verification

    With Chef, configuration files are usually built from templates which are stored in the code repository. Once uploaded to the Chef server, the client receives them, replaces the variables in them and writes them to the target location. In addition, the service using the file can be restarted.

    Verdict Pass

    Requirement R04 Application deployment
    Verification

    For Tomcat parallel deployment can be used here. The actual deployment to the server is done using the Tomcat Manager App.

    The deploy directive can be used to update a PHP application to the latest revision without causing any downtime. It keeps a copy of the current version running and uses symlinks to make the latest version active once all dependencies have been installed.

    Verdict Pass

    Requirement R05 Staging environment
    Verification

    Chef can be used to set up a virtual machine with the needed environment. Once the virtual machine with the Chef client is running, it can pull all information needed to set up the environment from the Chef server.

    Verdict Pass

    Requirement R06 Hosting provider switch
    Verification

    The Chef client needs to be installed manually on a new virtual machine if the hosting provider doesn't provide machines with Chef pre-installed. Once the Chef client is installed it can install all required software and configure the machine as needed.

    Verdict Pass

    Requirement R11 Configuration files
    Verification

    Configuration files can be stored as templates in the code repository. They get compiled and written to their target destination by Chef.

    Verdict Pass

    Requirement R12 Change history
    Verification

    All cookbooks and templates can be stored in a code repository, a continuous integration server uploads them to the Chef server for distribution to the clients.

    Verdict Pass

    Requirement R13 Open Source
    Verification

    Chef Client, Chef Server and Chef Development Kit (DK) are Open Source on GitHub, they are licensed under the Apache License.

    Verdict Pass

    Requirement R14 Test environment
    Verification

    The cookbooks can also be used to configure the virtual machine on which the tests are running.

    Verdict Pass

    Requirement R15 Encrypted passwords
    Verification

    Data bags store variables on the Chef server. Confidential information can be stored in encrypted data bags. The secret key can be shared among the developers over a separate secure channel.

    Verdict Pass

    Requirement R16 Superuser access
    Verification

    No restrictions about access to the server are imposed by Chef.

    Verdict Pass

    Requirement R17 Custom software
    Verification

    Packages from the Linux distribution are simple to install with Chef, but custom software can also be downloaded and installed.

    Verdict Pass

    Requirement R18 Learning curve
    Verification

    Chef has a complex architecture and many dependencies on other libraries. Developers not familiar with Ruby also need to learn the syntax first.

    Verdict Fail

    Requirement R19 Horizontal scaling
    Verification

    Multiple Chef clients can connect to the same Chef server and the same cookbooks can be installed on multiple hosts.

    Verdict Pass

    Summary Configuration Repository

    The installation of the Chef DK was difficult as it relies on overriding rbenv, a tool for managing Ruby versions, in the PATH variable. Some modifications to .bash_profile were needed to get it running properly.

    The possibility to test the cookbooks locally inside a virtual machine with Vagrant proved handy. However a few times the cookbook failed unexpectedly (e.g. because a wrong package was accidentally installed first) - a complete rebuild solved the problem but took quite some time.

    A popular cookbook that was initially selected for deploying Java applications turned out to be incompatible with the latest version of the official Tomcat cookbook for no obvious reason. In general the whole system seemed complex: many components rely on others and compatibility has to be figured out manually.

    Advantages
    • Can be used to manage a large number of servers
    • Utilizes operating system packages
    • Central server as single source for configuration

    Disadvantages
    • Complex dependencies
    • Maintained server or hosted solution for the Chef server needed
    • Complete rebuild is slow

    Analysis Configuration Repository

    Remote Command Execution

    This type of software is similar to the Configuration Repository described in the previous chapter: software to be installed and commands to be executed are defined in text files. The main difference is that instead of having a central server from which everything is pulled, everything gets pushed to the hosts. In fact, Configuration Repository software like Chef also supports this mode of operation with Chef Solo.

    Ansible, Salt, and Rex are implementations of this software type. Ansible is used here because it focuses solely on this approach and because of its popularity.

    Logo Ansible

    Ansible is a Python application and can be installed on a development machine as a Python package according to the Ansible documentation. Ansible uses SSH to communicate to remote servers and execute commands on them. Directives are organized in playbooks and roles.

    Again Vagrant can be used to simplify the setup of a remote host and test the playbook. As Ansible requires no specific software on the host any Linux distribution can be used with Vagrant. The following snippet shows the initialization process of Vagrant with Ansible:

    $ vagrant up
    Bringing machine 'default' up with 'virtualbox' provider...
    ==> default: Running provisioner: ansible...
    […] ansible-playbook […] site.yml
    
    PLAY [all] ********************************************************************
    
    […]
    
    PLAY RECAP ********************************************************************
    default                    : ok=1    changed=0
    
    Vagrant initialization with Ansible

    The book Taste Test was used during the following verification.

    Requirements verification

    Requirement R01 PHP applications
    Verification

    Ansible already includes most tools needed for installing and configuring software using the operating system mechanisms. Examples of how to use them are provided on GitHub by Ansible itself.

    The Apache web server and mod_php can be installed with the operating system packages, their configuration can be created with the Ansible template directive and the service directive makes sure Apache is running. The source code of the PHP application can either be copied to the host or cloned from Git.

    MySQL can be installed and configured the same way as Apache.
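    A role task file for this could look roughly as follows. The package names are from the Ubuntu repositories, the template path is illustrative and the restart apache handler is assumed to be defined in the role's handlers file.

    ---
    - name: install Apache and mod_php
      apt: name={{ item }} state=present
      with_items:
        - apache2
        - libapache2-mod-php5

    - name: create virtual host configuration
      template: src=vhost.conf.j2 dest=/etc/apache2/sites-enabled/app.conf
      notify: restart apache

    - name: make sure Apache is running
      service: name=apache2 state=started enabled=yes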

    Verdict Pass

    Requirement R02 Java applications
    Verification

    The examples provided by Ansible also describe how to install a standalone instance of Tomcat. The installation package gets downloaded from the Tomcat download server and extracted to an appropriate location, the configuration files are created from templates and Tomcat is started as a service. MySQL can be installed with the operating system packages.
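    A sketch along the lines of those examples is shown below; the Tomcat version and paths are assumptions, and the restart tomcat handler is assumed to exist in the role.

    ---
    - name: download Tomcat
      get_url: url=http://archive.apache.org/dist/tomcat/tomcat-7/v7.0.62/bin/apache-tomcat-7.0.62.tar.gz dest=/opt/apache-tomcat-7.0.62.tar.gz

    - name: extract Tomcat
      command: tar xzf /opt/apache-tomcat-7.0.62.tar.gz chdir=/opt creates=/opt/apache-tomcat-7.0.62

    - name: create server configuration
      template: src=server.xml.j2 dest=/opt/apache-tomcat-7.0.62/conf/server.xml
      notify: restart tomcat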

    Verdict Pass

    Requirement R03 Server configuration
    Verification

    Configuration files can either be simply copied to the server from the local playbook or created from templates with variables. Ansible checks if there were any changes in a configuration file and notifies a service to restart if needed. The playbook can also be run from a continuous integration server.
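    The change detection works through handlers: a task notifies a handler, and the handler only runs if the task actually changed something. The following sketch (file names and paths are illustrative) shows the two halves.

    # tasks file of the role
    - name: update MySQL configuration
      template: src=my.cnf.j2 dest=/etc/mysql/my.cnf
      notify: restart mysql      # only triggered when the file content changed

    # handlers file of the role
    - name: restart mysql
      service: name=mysql state=restarted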

    Verdict Pass

    Requirement R04 Application deployment
    Verification

    For Tomcat parallel deployment can be used here again. The actual deployment to the server can be done using the Tomcat Manager App or by using Ansible to copy the file to the server.

    A combination of the git and file directives can be used to check out the latest version of the PHP application and activate it with a symlink as soon as all dependencies have been installed.
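    A sketch of such a deployment task list is shown below; the repository URL, the version variable and the use of Composer for installing dependencies are assumptions.

    - name: check out the new version
      git: repo=git@github.com:worldskills/app.git dest=/var/www/app-{{ version }} version={{ version }}

    - name: install dependencies
      command: composer install chdir=/var/www/app-{{ version }}

    - name: activate the new version
      file: src=/var/www/app-{{ version }} dest=/var/www/app state=link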

    Verdict Pass

    Requirement R05 Staging environment
    Verification

    As long as Ansible can connect via SSH to a machine it can set it up as needed. Ansible can also launch new virtual machines in the cloud which is useful for quickly starting a staging environment.

    Verdict Pass

    Requirement R06 Hosting provider switch
    Verification

    As mentioned before, the only requirement on the host machine is the ability for Ansible to connect to it. Once this is provided, Ansible can set up the new server as required. The only limitation is that Ansible differentiates between package management systems, so switching from Apt (used by Debian-based distributions) to Yum (used by Red Hat-based distributions) would require some code changes, but such a switch is outside the scope of this requirement.
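    The required code changes can be kept small by branching on Ansible's ansible_os_family fact, as in the following sketch.

    - name: install Apache (Debian-based)
      apt: name=apache2 state=present
      when: ansible_os_family == "Debian"

    - name: install Apache (Red Hat-based)
      yum: name=httpd state=present
      when: ansible_os_family == "RedHat"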

    Verdict Pass

    Requirement R11 Configuration files
    Verification

    All configuration files can be stored as templates or raw files in the code repository of the playbook. Ansible copies them to the target server when being executed.

    Verdict Pass

    Requirement R12 Change history
    Verification

    The playbook and all its files can be stored in the code repository. A continuous integration server can run the playbook if a change has been made.

    Verdict Pass

    Requirement R13 Open Source
    Verification

    Ansible is Open Source on GitHub and licensed under the GNU General Public License (GPL) v3.0.

    Verdict Pass

    Requirement R14 Test environment
    Verification

    The playbook can also be executed locally to configure a test environment on the same machine.

    Verdict Pass

    Requirement R15 Encrypted passwords
    Verification

    Ansible provides a tool called Vault to encrypt files with variables. The encrypted file is automatically decrypted when running the playbook. The password can be stored in a separate file and shared among developers.
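    The corresponding workflow might look like this; the file names are illustrative.

    $ ansible-vault encrypt vars/secrets.yml
    $ ansible-playbook site.yml --vault-password-file ~/.vault_pass.txt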

    Verdict Pass

    Requirement R16 Superuser access
    Verification

    Ansible doesn't restrict how the server can be accessed. In fact, Ansible requires that the server can be accessed using SSH, which is also the preferred way for developers to access the server.

    Verdict Pass

    Requirement R17 Custom software
    Verification

    Operating system packages as well as custom downloaded software can be installed using Ansible.

    Verdict Pass

    Requirement R18 Learning curve
    Verification

    Ansible files are easy to read and the concept can be understood quickly. Only the YAML syntax requires a bit of learning, but most examples already give a good idea of how it works.

    Verdict Pass

    Requirement R19 Horizontal scaling
    Verification

    Ansible can connect to multiple servers and execute the defined commands there. Servers can also be grouped by their role so that only specific commands are executed on certain servers.

    Verdict Pass

    Summary Remote Command Execution

    Ansible was easy to install with the Python package manager pip and was instantly ready to use. Because all servers are already configured with SSH key authentication, no additional setup was required to make Ansible usable on them.

    It is basically just a thin layer on top of shell scripts but provides all the tools needed for provisioning and maintaining a server (commands, services, configuration files). It lacks a few advanced features for highly complex setups, but the fact that it's easy to learn is valuable in a small team with shared responsibilities.

    Advantages
    • Few dependencies
    • Easy to learn
    • Utilizes operating system packages
    • No central server needed

    Disadvantages
    • Limited reusability of playbooks
    • No programming in playbooks possible
    • System package installation not distribution independent

    Analysis Remote Command Execution

    Conclusion

    Requirement Software Containers Configuration Repository Remote Command Execution
    R01 PHP applications Pass Pass Pass
    R02 Java applications Pass Pass Pass
    R03 Server configuration Fail Pass Pass
    R04 Application deployment Fail Pass Pass
    R05 Staging environment Pass Pass Pass
    R06 Hosting provider switch Fail Pass Pass
    R11 Configuration files Pass Pass Pass
    R12 Change history Pass Pass Pass
    R13 Open Source Pass Pass Pass
    R14 Test environment Pass Pass Pass
    R15 Encrypted passwords Fail Pass Pass
    R16 Superuser access Pass Pass Pass
    R17 Custom software Pass Pass Pass
    R18 Learning curve Fail Fail Pass
    R19 Horizontal scaling Pass Pass Pass
    Requirements comparison

    Software Containers don't comply with all Must requirements, a Configuration Repository misses one Should requirement (Learning curve), and Remote Command Execution matches all requirements.

    Even though Software Containers don't comply with all requirements, they are still considered a useful concept if combined with a Configuration Repository or Remote Command Execution for orchestration. The next chapter evaluates the possibility of combining Software Containers with a Configuration Repository or Remote Command Execution.

    Software Containers in Combination

    The following rough estimate of the time needed to introduce each software type, alone or together with Software Containers, under regular workload also reflects the higher complexity of a Configuration Repository compared with Remote Command Execution.

    Training Setup Migration Total
    Configuration Repository 4 weeks 7 weeks 3 weeks 14 weeks
    Configuration Repository with Software Containers 6 weeks 10 weeks 5 weeks 21 weeks
    Remote Command Execution 2 weeks 4 weeks 2 weeks 8 weeks
    Remote Command Execution with Software Containers 5 weeks 8 weeks 5 weeks 18 weeks
    Estimation introduction time

    The introduction of an orchestration software together with Software Containers could be divided into two phases. In the first phase the orchestration could be set up with a Configuration Repository or Remote Command Execution for the existing infrastructure. In a second phase new applications which fit into the model of Software Containers could be adapted to those.

    [Gantt chart: introduction schedule over five months; Phase 1: training, setup and migration of the orchestration software; Phase 2: training, setup and migration of Software Containers]
    Project schedule

    Recommendation

    Software Containers add additional complexity and their advantages are not urgently needed for a team of four developers. One of these advantages, being able to quickly set up a development environment, is less important as no new developers are expected to join the team within the next four years. The initial architecture concept of this thesis will therefore focus only on phase one, based on the requirements and the time available.

    The evaluation has shown that Remote Command Execution is easier to understand and maintain than a Configuration Repository. Understanding the system easily is crucial as all developers need to be able to make quick changes to the infrastructure. Based on this, the infrastructure concept will use Remote Command Execution software to implement phase one mentioned above. The software Ansible will be used based on its popularity and the good experience during the evaluation.

    Concept

    Introduction

    The following architecture concept is based on the arc42 template, which provides a guideline for documenting software architecture. Goals, constraints and the scope (requirements related information) have already been documented in the previous chapters. For easier understanding some core concepts of Ansible are outlined in this introduction.

    Ansible uses the term playbook to describe the collective of configuration files, deployment instructions, and environment-specific variables. Each playbook can have multiple roles which allow modular grouping of related files. Best practices on how to organize playbooks and roles are provided in the Ansible documentation.

    Playbook structure

    A playbook can also be described as a better-structured shell script that tells the server which commands to execute in which order. Before a command is executed, however, Ansible checks if the command is still required or if it is redundant. For configuration files it can check if the file needs to be updated by comparing the desired content to the actual content on the server.

    Environment-specific variables can be grouped in inventory files. An inventory can contain a list of server addresses of an environment as well as configuration variables specific to that environment.

    [webservers]
    foo.example.org
    bar.example.org
    
    [webservers:vars]
    proxy_host=proxy.example.org
    
    Example inventory

    The YAML syntax is used in playbooks for defining variables and commands to execute. The following is an example of a file for setting up a webserver.

    ---
    - hosts: webservers
      vars:
        http_port: 80
        max_clients: 200
      remote_user: root
      tasks:
      - name: ensure apache is at the latest version
        apt: name=apache2 state=latest
    
    Example Ansible file

    Solution Strategy

    The whole infrastructure is described in an Ansible playbook which is shared with a code repository on GitHub. Different services are split up into roles. Inside the roles provisioning and deployment are separate tasks and can be executed independently.

    Changes to the infrastructure are pushed to the code repository and then Ansible runs on the continuous integration system Codeship to deploy the changes to the servers. Confidential information like passwords and encryption keys is stored in encrypted Ansible Vaults.

    For application deployment the playbook is downloaded by the continuous integration system of the application and the required deployment tasks are executed with Ansible to update the application on the server.

    Building Block View

    The files and their organization within the playbook used to maintain the WorldSkills infrastructure are explained below. The most important parts of the playbook are the inventories and the roles; they are displayed in the following overview diagram.

    WorldSkills playbook overview

    Environment specific variables are stored in different inventories. There are two inventories: prod (for production) and staging (for system tests). As environment specific variables often contain sensitive information they are encrypted with Ansible Vault.

    The role common is used to execute frequently needed tasks like adding the public keys of all developers. Instead of hardcoding all servers in the inventory file, they are created or fetched dynamically with the servers role. Each software component has its own role for the installation and the setup of the configuration files. Additionally, each self-developed application has a role to define the deployment and the setup of configuration files.

    The following table shows the file structure of the playbook and explains the function of the most important files and folders. The role apache is used as an example; the structure applies to all roles.

    File Description
    site.yml Main playbook file, delegate execution of commands to servers
    inventories
    prod
    hosts Inventory file with list of servers for production
    group_vars
    all Variables for production (e.g. database password)
    staging
    hosts Inventory file with list of servers for staging
    group_vars
    all Variables for staging environment
    roles
    apache Role for the installation and configuration of the webserver
    defaults
    main.yml Default webserver configuration variables (e.g. virtual hosts)
    files
    worldskills.org.crt SSL certificate
    worldskills.org.key SSL key
    handlers
    main.yml Commands for restarting the webserver
    tasks
    vhosts.yml Setup of virtual hosts
    main.yml Installation of the webserver and setup of the configuration files
    templates
    vhost.conf.j2 Virtual host configuration file template
    File structure playbook
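    To illustrate how these pieces fit together, the main playbook file site.yml could look roughly like the following sketch; the group name webservers and the exact role list are illustrative.

    ---
    - hosts: localhost
      connection: local
      roles:
        - servers      # create or look up the virtual servers, fill the dynamic inventory

    - hosts: webservers
      roles:
        - common       # public keys of all developers, base setup
        - apache       # webserver installation and configuration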

    Runtime View

    This view explains the dynamic aspects of the playbook execution. In particular, the creation of the dynamic inventory and how it works together with the roles are highlighted.

    Playbook execution

    While running the playbook, the first role that gets executed is the servers role. This role doesn't execute any remote commands but runs locally to connect to the web service of the hosting provider. It either creates and boots the servers if they don't exist yet or just fetches their information. The returned IP addresses are added to the dynamic inventory so following commands can be executed on the servers.
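    A minimal sketch of such a servers role is shown below. The provider module name and the returned field are placeholders, as the real module depends on the hosting provider's API; add_host, however, is a built-in Ansible module.

    - name: create server if it doesn't exist yet
      cloud_server: name=web1 state=present     # placeholder for the provider-specific module
      register: result

    - name: add server to the in-memory inventory
      add_host: name={{ result.ip_address }} groups=webservers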

    Subsequently all other roles are executed. This includes setup of software and applications. The commands are sent to the remote servers by Ansible.

    Deployment View

    In the bigger context, the playbook is not executed manually on the developer machine but executed automatically on a continuous integration server.

    Deployment overview

    Source code is stored in Git repositories. The playbook and each application each have their own private code repository. The continuous integration server has a build pipeline for each code repository. For each pipeline a unique private SSH key is used. The public keys for those private keys can be exported and are deployed to the code repository provider to guarantee access to the source code. They are also copied to the virtual servers to allow Ansible to connect to the servers via SSH.

    Each application pipeline uses the playbook to provision the infrastructure and then deploys the application to the targeted environment. The servers required for the production and staging environment are supplied by the hosting provider.

    Design Decisions

    Playbook organization

    A fundamental decision of the project is how to organize and structure the playbooks. Each application has its own code repository which allows actions to be executed after a push.

    Two variations are possible: One code repository with a playbook for the infrastructure and deployment, or a playbook in each application code repository. The first approach has been chosen to easily allow sharing of variables from the infrastructure setup with the application deployment.

    Usage of inventories

    Ansible is built to also manage legacy infrastructures with existing servers, so the main purpose of inventories is to provide a list of servers grouped by their function. But Ansible also has built-in server provisioners, which allow taking the approach of an idempotent infrastructure one step further and creating the virtual servers with Ansible as part of the playbook.

    Usually inventory files in Ansible playbooks contain a list of IP addresses or server names which belong to the infrastructure, but with the proposed concept the only server listed in the inventory file is localhost. This is because the servers get added to the inventory list in memory while running the playbook. If the servers don't exist yet, they get created; if they already exist, their IP addresses are added to the inventory.

    This approach has the advantage that temporary environments (like staging) can easily be created and the IP addresses of existing servers don't need to be manually maintained inside an inventory file.
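    The static part of such an inventory can therefore shrink to a single entry (a sketch; the group name is illustrative):

    [local]
    localhost ansible_connection=local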

    Ansible Tower

    The company behind Ansible also distributes commercial software called Ansible Tower for running playbooks with a user interface. The software provides a web interface with permission management, an audit trail and visualizations.

    Screenshot Ansible Tower

    Ansible Tower was not used for this project due to the high price (starting at $5000 per year) and because it wouldn't bring many additional benefits. As all four developers responsible for the system are used to working with Git, there's no need for a graphical user interface. It would also contradict the principle of having all changes driven by code and its advantage of making every change easily traceable.

    Technical Risks

    The following paragraphs outline potential technical risks of the architecture concept. The factors convenience and security often conflict and need to be weighed in each case.

    One disadvantage of automating the server setup is that the executor needs full administrative access to the servers. In the proposed concept the continuous integration server has superuser access to all servers. This leads to two attack scenarios: an attacker could compromise the continuous integration server to gain access to the servers, or the attacker could compromise the code repository and infiltrate the servers through malicious commands. Both GitHub and Codeship take appropriate actions to prevent these kinds of attacks.

    As the staging environment is always built by syncing databases from production, the risk of data leakage increases because staging environments are often regarded as non-critical. By treating the staging environment as confidentially as the production environment (e.g. encrypting staging passwords), incidents can be prevented.

    Testing

    The testing is divided into multiple stages which all aim to find bugs in the Ansible playbook. Testing of the self-developed applications has to be done individually for each application. All automated tests are executed on the continuous integration server.

    Testing flow

    After each code change a syntax check of the Ansible playbook is performed. The playbook is then executed to create a temporary staging environment. To verify the integrity of the staging environment, system tests are performed as a last step. Once the staging environment is ready, manual tests can be executed depending on the change made. Testing during development can be done locally with Vagrant.

    Syntax check

    Ansible provides two flags for running playbooks which are particularly useful for testing. --syntax-check validates the YAML syntax of the playbook and can be used to discover problems early.

    The flag --check runs the playbook in a special mode where a dry run is performed. In this mode no changes are made on remote systems, but all tasks are checked to see whether they could be executed. Thanks to this it can easily be verified whether the playbook would run or fail (e.g. because of missing variables). Unfortunately this is not possible in all cases as some modules require certain dependencies to be installed (e.g. rabbitmq_plugins).
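    Assuming the main playbook file site.yml, the two checks are invoked as follows.

    $ ansible-playbook site.yml --syntax-check
    $ ansible-playbook site.yml --check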

    Because of the restrictions of the check mode only the syntax check is performed as an automated test.

    Staging environment

    New functionality is developed in feature branches in the application repositories. To test new functionality which requires changes in multiple applications, they all need to be deployed to the same staging environment. For this, a branch naming convention is defined as follows: each time a branch starting with staging- is created, a new staging environment is automatically created.

    The first time an application uses the branch name, the playbook boots the required servers and installs all applications from their master branches. Only the application that uses the branch name is installed from the branch itself.

    If another application creates a branch with the same name it gets deployed to the same staging environment. The playbook makes sure the servers in the environment are in the desired state, and as applications are only updated during the initial provisioning, the first application won't get overridden with the master branch.

    After testing is completed and the branch has been merged into the master branch the staging environment can be removed with a separate Ansible script.

    System tests

    Simple system tests can be done with the uri module of Ansible. The module allows sending HTTP requests and storing the response in a variable. The response can then be checked for certain content.

    - name: check Auth API
      action: uri url="https://api.worldskills.org/auth/ping" return_content=yes # send HTTP request
      register: uri_response # store response in a variable

    - name: verify Auth API
      fail: msg="Auth API not reachable"
      when: "'pong' not in uri_response.content" # fail if the expected content is missing
    Example system test

    Testing the web interface of the applications also verifies that all the underlying software like the Apache web server and PHP is running properly. For each self-developed application at least one system test should be created to make sure the application is running as expected.

    Migration

    To migrate the existing infrastructure to the new automated infrastructure multiple steps need to be taken. Not only do all applications need to be installed and configured on the new infrastructure but existing data needs to be transferred and DNS entries need to be updated.

    The following steps should be executed in the order outlined. Some have to be done manually, others can be automated with Ansible.

    1. Reduce DNS time to live (TTL), manual
    2. Create new infrastructure, automated
    3. Stop existing infrastructure, manual
    4. Synchronize database to new infrastructure, automated
    5. Synchronize user files to new infrastructure, automated
    6. Switch DNS entries, manual
    7. Increase DNS time to live (TTL), manual

    The first step is reducing the time DNS entries are cached so the switch to the new servers propagates faster. The new infrastructure is then created with the main Ansible playbook. Once the new infrastructure is ready, the old one can be switched off so users can no longer make changes. To synchronize the database and user files from the existing infrastructure a separate Ansible playbook should be written so it can be executed on its own. After everything has been transferred to the new infrastructure, the DNS entries can be changed to point to the new infrastructure and the DNS TTL can be increased again.
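
    A minimal sketch of such a synchronization playbook, reusing the rsync approach shown later in the Data Synchronization section (the local path and database name are assumptions):

    - hosts: database
      tasks:
        - name: get database dump from the existing infrastructure
          command: >
            rsync -ze "ssh -o StrictHostKeyChecking=no"
            root@beuk.worldskills.org:/path/to/database/mysqldump.sql.gz
            /tmp/migration.sql.gz

        - name: import database dump
          mysql_db: name=worldskills state=import target=/tmp/migration.sql.gz

    Synchronization playbook (sketch)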

    Implementation

    Proof-of-concept

    For the proof-of-concept a reduced infrastructure has been selected to test the feasibility of the concept with a focus on the requirements. The proof-of-concept infrastructure includes a PHP application (R01), two Java applications (R02), one JavaScript application, multiple MySQL databases and the RabbitMQ message queue.

    Proof-of-concept infrastructure

    The proof-of-concept also includes the dynamic generation of staging environments based on branch naming (R04). The source code for all applications and the Ansible playbook is stored in a separate organization on GitHub. The domain name worldskills.ch is used for the proof-of-concept.

    To prepare for horizontal scaling (R19) databases are installed on a separate server. Initial data fixtures are imported from the existing WorldSkills infrastructure (using an SSH connection).

    All Must and Should requirements can be tested with the proof-of-concept infrastructure. The following list includes all requirements that are checked with the proof-of-concept infrastructure.

    The following Could requirements are not explicitly tested with the proof-of-concept infrastructure because of their complexity.

    Ansible Playbook

    The playbook code repository is organized with roles. Each role is responsible for the installation and configuration of one application or piece of software. The main playbook assigns the roles to the servers for execution.

    - hosts: database
    
      roles:
        - mysql
        - rabbitmq
    
    - hosts: api
    
      roles:
        - tomcat
        - worldskills_api_auth
        - worldskills_api_events
    
    - hosts: web
    
      roles:
        - apache
        - php
        - worldskills_concrete5
        - worldskills_auth
        - worldskills_events
    
    Main playbook

    System Packages

    The PHP role is an example of software that gets installed with the package manager of the operating system. The template module of Ansible is used to create the required configuration files. The same approach is used for the installation of MySQL, RabbitMQ, Tomcat and the Apache web server.

    - name: install PHP
      apt: name={{ item }} state=present # install software packages
      with_items:
        - php5
        - libapache2-mod-php5
        - php5-mysql
        - php5-curl
        - php5-gd
      notify: restart apache # restart Apache after PHP installation
    
    - name: put configuration file in place
      template: src=worldskills.ini.j2 dest="{{ php_ext_path }}/worldskills.ini"
      notify: restart apache # restart Apache if configuration changes
    
    Role php tasks

    Virtual Hosts

    Virtual hosts in the Apache web server are also created with the template module: the configuration files are written to /etc/apache2/sites-available and enabled by linking them into /etc/apache2/sites-enabled.

    - name: add configuration files
      template: >
        src="vhost.conf.j2"
        dest="/etc/apache2/sites-available/{{ item.servername }}.conf"
      with_items: vhosts
      notify: restart apache
    
    - name: enable vhosts
      file: >
        src="/etc/apache2/sites-available/{{ item.servername }}.conf"
        dest="/etc/apache2/sites-enabled/{{ item.servername }}.conf" state=link
      with_items: vhosts
      notify: restart apache
    
    Virtual hosts tasks
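
    The referenced template vhost.conf.j2 is not reproduced in this document; a minimal version could look like the following sketch (the documentroot attribute of a vhost entry is an assumption):

    <VirtualHost *:80>
        ServerName {{ item.servername }}
        # documentroot is an assumed attribute of each vhost entry
        DocumentRoot {{ item.documentroot }}

        <Directory "{{ item.documentroot }}">
            AllowOverride All
            Require all granted
        </Directory>
    </VirtualHost>

    Template vhost.conf.j2 (sketch)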

    PHP and JavaScript Applications

    Self-developed applications are installed by checking out the source code and then installing all dependencies. To keep the existing application running, the source code is checked out into a new folder and a symlink pointing to the latest release is updated at the end of the installation process.

    - name: clone project
      git: >
        repo="git@github.com:worldskills-infrastructure/worldskills-events.git"
        dest="{{ worldskills_events_path }}/repo"
        version="{{ worldskills_events_branch }}"
        update="{{ worldskills_events_update }}"
      register: worldskills_events_clone
    
    - name: set release path
      set_fact: >
        worldskills_events_release="{{ worldskills_events_path }}/releases/
        {{ worldskills_events_clone.after }}"
    
    - name: export a copy of the repo
      command: >
        git checkout-index -a --prefix="{{ worldskills_events_release }}/"
        chdir="{{ worldskills_events_path }}/repo"
        creates="{{ worldskills_events_release }}"
    
    - name: install npm dependencies
      npm: path="{{ worldskills_events_release }}"
    
    - name: install bower dependencies
      command: >
        bower install chdir="{{ worldskills_events_release }}"
        creates="{{ worldskills_events_release }}/app/bower_components"
    
    - name: create config
      template: >
        src=config.js.j2 dest="{{ worldskills_events_release }}
        /app/scripts/config.js"
      notify: build worldskills_events
    
    - name: link release to current
      file: >
        state=link path="{{ worldskills_events_path }}/current"
        src="{{ worldskills_events_release }}"
    
    Role Events application tasks

    Java Applications

    Java applications are deployed using the Tomcat Maven Plugin. This plugin builds the application and uploads it to the Tomcat server. Again the old version of the application is kept running until the new application has been started and is ready. For Tomcat this is achieved with the built-in functionality for parallel deployment as described in the Tomcat Documentation.

    The current build timestamp is used as a substitute for the version number of the application and configured in the Maven file pom.xml.

    <plugin>
        <groupId>org.apache.tomcat.maven</groupId>
        <artifactId>tomcat7-maven-plugin</artifactId>
        <version>2.2</version>
        <configuration>
            <port>9090</port>
            <contextFile>src/test/resources/context.xml</contextFile>
            <url>http://api.worldskills.ch/manager/text</url>
            <path>/events##${maven.build.timestamp}</path>
            <server>tomcat</server>
            <update>true</update>
            <mode>war</mode>
        </configuration>
    </plugin>
    
    Tomcat Maven Plugin configuration
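
    With this configuration a deployment can be triggered through the deploy goal of the plugin, assuming the credentials for the server entry tomcat are configured in the Maven settings.xml:

    $ mvn tomcat7:deploy

    Tomcat Maven Plugin deploy command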

    Ansible Vault

    As environment specific variables contain sensitive data like database passwords, they are encrypted with Ansible Vault. The password for the Vault is stored in the file vault_password.txt which is not part of the code repository but shared manually between the developers. To prevent the password file from accidentally being added to the code repository it is explicitly excluded in the file .gitignore.

    The encrypted file in the inventory can be edited with the following command which uses the password file and opens the configured text editor with the content of the file.

    $ ansible-vault \
        edit \
        --vault-password-file=vault_password.txt \
        inventories/prod/group_vars/all
    
    Ansible Vault edit command
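
    New files can be encrypted in the same way with the encrypt subcommand; the staging inventory path below is an assumption:

    $ ansible-vault \
        encrypt \
        --vault-password-file=vault_password.txt \
        inventories/staging/group_vars/all

    Ansible Vault encrypt command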

    Data Synchronization

    As the CMS of the website heavily depends on content in the database to work properly, the database must be synced from the current infrastructure to the proof-of-concept infrastructure. Additionally this synchronization must be done when creating a staging environment.

    Because Ansible connects over SSH, the server on which the new database runs can fetch an export of the database directly from the production database server.

    - name: get worldskills_concrete5 dump from prod
      command: >
        rsync -ze "ssh -o StrictHostKeyChecking=no"
        root@beuk.worldskills.org:/path/to/database/mysqldump.sql.gz
        /usr/local/worldskills-playbook/worldskills_concrete5.sql.gz
        creates=/usr/local/worldskills-playbook/worldskills_concrete5.sql.gz
    
    - name: import worldskills_concrete5 database
      mysql_db: >
        name="{{ worldskills_api_auth_database_name }}"
        state=import
        target=/usr/local/worldskills-playbook/worldskills_concrete5.sql.gz
    
    Data synchronization tasks

    Vagrant Setup

    For local development the playbook can be executed in Vagrant. Ansible is configured as the provisioner in Vagrant and points to the main playbook. As the Ubuntu image used by Vagrant has the root user disabled by default, the playbook is executed with the user vagrant and with superuser permissions (sudo). Environment specific variables like URLs are overridden with the extra_vars parameter. To simplify things, all server groups are mapped to the default server, so only one virtual machine is needed to run the infrastructure locally.

    config.vm.provision "ansible", run: "always" do |ansible|
      ansible.playbook = "site.yml" # WorldSkills playbook
      ansible.raw_arguments = ['--diff'] # show file changes
      ansible.sudo = true # force sudo
      ansible.extra_vars = {
        ansible_ssh_user: "vagrant",
        server_prefix: "",
        worldskills_auth_url: "http://vagrant-auth.worldskills.ch/",
        worldskills_events_url: "http://vagrant-events.worldskills.ch/",
      }
      ansible.groups = {
        "servers" => ["default"],
        "web" => ["default"],
        "api" => ["default"],
        "database" => ["default"],
      }
    end
    
    Vagrant provisioner configuration
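
    The local environment is then managed with the standard Vagrant commands; because of run: "always" the playbook is executed on every start:

    $ vagrant up        # create the virtual machine and run the playbook
    $ vagrant provision # re-run the playbook on the running machine

    Vagrant commands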

    GitHub Setup

    An independent GitHub organization is used to keep the proof-of-concept separated from existing WorldSkills projects. All projects which are part of the proof-of-concept have been replicated as private repositories in this organization. Most notable is the new code repository worldskills-playbook which contains the Ansible playbook.

    Screenshot GitHub repositories

    Every commit is also displayed in Slack, the chat application used by WorldSkills. This ensures that all developers are aware of changes.

    Screenshot commit in Slack

    Codeship Setup

    Codeship distinguishes between setup, test and deployment commands. Setup and test commands are always executed, while deployment commands can differ per branch. Deployment commands are only executed if the test commands succeed; if there's a syntax error in the Ansible playbook, the deployment commands are not executed.

    The setup commands on Codeship install Ansible and the other required software. The Vault password is stored as an environment variable on Codeship. During setup the variable is written to a Vault password file; this approach prevents the password from being displayed in the execution logs.

    pip install ansible dopy httplib2
    echo $VAULT_PASSWORD > vault_password.txt
    
    Codeship setup commands

    The test command then runs the syntax check of the main Ansible playbook file locally. The comma at the end of localhost, tells Ansible to use the passed string as inventory instead of trying to load an inventory file.

    ansible-playbook -i localhost, site.yml --syntax-check
    
    Codeship test commands

    With the deployment commands, the master branch gets deployed to the production environment. The production inventory is loaded from the file inventories/prod and the Vault password file created with the setup commands is used for decrypting the production variables.

    ansible-playbook \
        -i inventories/prod \
        --vault-password-file=vault_password.txt \
        site.yml
    
    Codeship deployment commands

    All branches starting with staging- are deployed to an environment named after the branch. Codeship provides the current branch name in the environment variable $CI_BRANCH. The variable digital_ocean_droplet_prefix gets overridden from the command line with the option -e (extra variables).

    ansible-playbook \
        -i inventories/staging \
        --vault-password-file=vault_password.txt \
        -e "digital_ocean_droplet_prefix=$CI_BRANCH" \
        site.yml
    
    Codeship deployment commands for staging environment

    All commands are logged and reported in the Codeship web interface.

    Screenshot Codeship execution

    With a webhook the status of every build is also displayed in Slack.

    Screenshot notification Slack

    The danger of a race condition from multiple playbooks running at the same time is avoided by the Codeship plan, which allows only one concurrent build.

    DigitalOcean Setup

    The servers for the infrastructure are created by the servers role with Ansible on DigitalOcean. The servers are identified by their name, their IP address is stored for usage in other roles (e.g. digital_ocean_ip_web).

    - name: create servers
      digital_ocean:
        command: droplet
        state: present
        name: "{{ item.key }}" # server name
        size_id: 63 # 1 GB memory
        image_id: 11836690 # Ubuntu 14.04
        unique_name: yes # identify server by name
        client_id: "{{ digital_ocean_client_id }}"
        api_key: "{{ digital_ocean_api_key }}"
      register: droplets_response # store DigitalOcean response
      with_dict: # loop over list of servers
        web: {group: web}
        api: {group: api}
        database: {group: database}
    
    - name: store server ips # store IP addresses in variables
      set_fact: "digital_ocean_ip_{{ item.item.key }}={{ item.droplet.ip_address }}"
      with_items: droplets_response.results # loop over all servers
    
    Role servers

    After running the servers role the servers are visible in the DigitalOcean web interface.

    Screenshot DigitalOcean servers

    Challenges

    During the implementation the following problems occurred and had to be fixed. Outlined below are descriptions of the problems, the steps taken, and the final solutions.

    Hierarchy of group_vars

    Originally it was planned to use group_vars on the playbook level to define default values for various variables and to override those values with environment specific values from the inventory. However, in Ansible group_vars on the playbook level take precedence over inventory variables. This hierarchy was reversed in Ansible 1.7 due to a bug and is still controversial.

    The problem has been solved by moving the default values to the individual roles. This has the disadvantage that default values can only be shared between roles that run on the same server. Hopefully Ansible will introduce default values on the playbook level in the future.
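
    Role defaults are stored in the defaults directory of a role and have the lowest precedence, so values from the inventory reliably override them. A sketch for the php role (the default value shown is an assumption):

    # roles/php/defaults/main.yml
    php_ext_path: /etc/php5/apache2/conf.d

    Role defaults (sketch)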

    Problem SSH connection closed

    A problem that occurred during testing with DigitalOcean was that sometimes the servers couldn't be created properly. When trying to connect to the newly created server via SSH the connection always failed.

    ssh root@46.101.135.160
    Connection closed by 46.101.135.160
    
    SSH error message closed connection

    After resetting the root password of the server via the DigitalOcean web interface and connecting via their remote control interface, it turned out the server was missing the SSH keys required to successfully connect to it. Contacting the DigitalOcean support didn't bring any new information; they assured that they are not aware of any software problem. As this problem only occurs sometimes, it is most likely related to a race condition when deleting a server and creating a new one with the same name. Waiting at least 5 minutes before creating the same servers again solved the problem.

    Problem SSH connection timeout

    Another, similar SSH problem was that sometimes the SSH connection was reset while the Ansible playbook was running. The first part of the playbook was successfully executed on the database servers, but the next part, which was to be executed on the API servers, failed with the following message.

    ssh: connect to host 46.101.255.100 port 22: Connection timed out
    Couldn't read packet: Connection reset by peer
    
    SSH error message timed out connection

    Checking the SSH timeout variables (ClientAliveInterval, ClientAliveCountMax, ServerAliveInterval) showed that both the SSH server and the client are configured correctly. Manual testing of the SSH connection from Codeship to DigitalOcean by keeping the connection idle for some time also worked as expected. The problem could not be reproduced and happened only twice; it was most likely a temporary network problem between Codeship and DigitalOcean.

    Problem MySQL socket

    After running the playbook repeatedly, PHP was suddenly unable to connect to the MySQL database. Investigating the problem showed that the MySQL socket had disappeared. The issue could be tracked down to a problem in Ubuntu when the command /etc/init.d/mysql restart is executed. Due to Ubuntu's switch to upstart scripts this is a legacy command, and service mysql restart should be executed instead.

    The underlying issue is that Ansible doesn't use the correct restart command, so instead of relying on the Ansible service module, the correct command is now executed directly with the shell module. A fix for this bug will be available in Ansible 1.9.2; after that release the service module can be used again.
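
    A handler based on the shell module could look like this sketch:

    - name: restart mysql
      shell: service mysql restart # workaround until Ansible 1.9.2

    MySQL restart handler (sketch)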

    Missed handlers when task failed

    If a task in an Ansible playbook fails, the execution of the whole playbook stops. Queued handlers are lost when this happens, and in the next run the handlers might not get notified again. Because of this a required restart is easily forgotten and can cause unexpected server states (e.g. a configuration file has been updated but a restart is required to load the updated configuration). Luckily, with the setting force_handlers: True, Ansible can be forced to always run handlers, even if a task fails. This solved the problem of missed application restarts.
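
    The setting is applied on the play level; a sketch based on the main playbook:

    - hosts: web
      force_handlers: True # run queued handlers even if a task fails

      roles:
        - apache

    Play with forced handlers (sketch)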

    Test Results

    The whole proof-of-concept infrastructure can be created in 18 minutes. The output of the execution can be found below.

    PLAY [provision] **************************************************************
    TASK: [servers | create droplet]
    TASK: [servers | store ips]
    TASK: [servers | store private ips]
    TASK: [servers | create web domain]
    TASK: [servers | create events domain]
    TASK: [servers | create api domain]
    TASK: [servers | add hosts to the inventory group]
    PLAY [digital_ocean]
    TASK: [Wait for port 22 to become available.]
    
    PLAY [servers] ****************************************************************
    GATHERING FACTS
    TASK: [common | remove Landscape packages (cause unnecessary delay)]
    TASK: [common | update apt cache once a day]
    TASK: [common | install development tools]
    TASK: [common | add public keys]
    TASK: [common | add GitHub to known hosts]
    TASK: [common | create working directory]
    
    PLAY [database] ***************************************************************
    GATHERING FACTS
    TASK: [mysql | install]
    TASK: [mysql | install Python interface]
    TASK: [mysql | install Java interface]
    TASK: [mysql | create configuration file]
    TASK: [mysql | create worldskills_concrete5 database]
    TASK: [mysql | create worldskills_concrete5 user]
    TASK: [mysql | get worldskills_concrete5 dump from prod]
    TASK: [mysql | copy worldskills_concrete5 normalize script]
    TASK: [mysql | create worldskills_auth database]
    TASK: [mysql | create worldskills_auth user]
    TASK: [mysql | get worldskills_auth dump from prod]
    TASK: [mysql | copy worldskills_auth normalize script]
    TASK: [mysql | create worldskills_events database]
    TASK: [mysql | create worldskills_events user]
    TASK: [mysql | get worldskills_events dump from prod]
    TASK: [mysql | copy worldskills_events normalize script]
    TASK: [rabbitmq | import repository key]
    TASK: [rabbitmq | add apt repository]
    TASK: [rabbitmq | install]
    TASK: [rabbitmq | enable plugins]
    TASK: [rabbitmq | run service]
    TASK: [rabbitmq | add users]
    TASK: [rabbitmq | remove default guest user]
    NOTIFIED: [mysql | restart mysql]
    NOTIFIED: [mysql | import worldskills_concrete5 database]
    NOTIFIED: [mysql | run worldskills_concrete5 normalize script]
    NOTIFIED: [mysql | import worldskills_auth database]
    NOTIFIED: [mysql | run worldskills_auth normalize script]
    NOTIFIED: [mysql | import worldskills_events database]
    NOTIFIED: [mysql | run worldskills_events normalize script]
    NOTIFIED: [rabbitmq | restart rabbitmq]
    
    PLAY [api] ********************************************************************
    GATHERING FACTS
    TASK: [tomcat | install Java]
    TASK: [tomcat | install]
    TASK: [tomcat | create configuration file]
    TASK: [tomcat | create logging configuration]
    TASK: [tomcat | run service]
    TASK: [tomcat | get wsideps from prod]
    TASK: [tomcat | add wsideps to classpath]
    TASK: [worldskills_api_auth | get WAR from prod]
    TASK: [worldskills_api_auth | deploy]
    TASK: [worldskills_api_events | get WAR from prod]
    TASK: [worldskills_api_events | deploy]
    NOTIFIED: [tomcat | restart tomcat]
    
    PLAY [web] ********************************************************************
    GATHERING FACTS
    TASK: [apache | install]
    TASK: [apache | run service]
    TASK: [apache | enable mod_rewrite]
    TASK: [apache | add vhosts configuration]
    TASK: [apache | enable vhosts]
    TASK: [php | install]
    TASK: [php | ensure configuration directories exist]
    TASK: [php | put WorldSkills configuration file in place]
    TASK: [php | download Composer]
    TASK: [php | move Composer to global location]
    TASK: [npm | install]
    TASK: [npm | install packages]
    TASK: [worldskills_concrete5 | clone project]
    TASK: [worldskills_concrete5 | create shared directory]
    TASK: [worldskills_concrete5 | get files directory from prod]
    TASK: [worldskills_concrete5 | set release path]
    TASK: [worldskills_concrete5 | export a copy of the repo]
    TASK: [worldskills_concrete5 | check for existing files directory]
    TASK: [worldskills_concrete5 | remove existing files directory]
    TASK: [worldskills_concrete5 | link files directory]
    TASK: [worldskills_concrete5 | install Composer dependencies]
    TASK: [worldskills_concrete5 | create config]
    TASK: [worldskills_concrete5 | create properties file]
    TASK: [worldskills_concrete5 | link release to current]
    TASK: [worldskills_auth | clone project]
    TASK: [worldskills_auth | create shared directory]
    TASK: [worldskills_auth | create logs directory]
    TASK: [worldskills_auth | create log file]
    TASK: [worldskills_auth | set release path]
    TASK: [worldskills_auth | export a copy of the repo]
    TASK: [worldskills_auth | check for existing logs directory]
    TASK: [worldskills_auth | remove existing logs directory]
    TASK: [worldskills_auth | link logs directory]
    TASK: [worldskills_auth | install Composer dependencies]
    TASK: [worldskills_auth | create config]
    TASK: [worldskills_auth | link release to current]
    TASK: [worldskills_events | clone project]
    TASK: [worldskills_events | set release path]
    TASK: [worldskills_events | export a copy of the repo]
    TASK: [worldskills_events | install npm dependencies]
    TASK: [worldskills_events | install bower dependencies]
    TASK: [worldskills_events | create config]
    TASK: [worldskills_events | link release to current]
    NOTIFIED: [apache | restart apache]
    NOTIFIED: [worldskills_events | build worldskills_events]
    
    PLAY RECAP ********************************************************************
    127.0.0.1                  : ok=7    changed=3    unreachable=0    failed=0
    46.101.255.100             : ok=21   changed=15   unreachable=0    failed=0
    46.101.255.101             : ok=54   changed=42   unreachable=0    failed=0
    46.101.255.110             : ok=40   changed=34   unreachable=0    failed=0
    
    Output playbook

    After the playbook has been executed the WorldSkills website can be accessed on the proof-of-concept infrastructure at www.worldskills.ch.

    Screenshot Proof-of-concept

    As the whole playbook is idempotent, it can be executed repeatedly without making any changes. Executing it without making any modifications takes 3 minutes.

    All system tests are executed after the main Ansible playbook. They check content on various URLs to make sure the applications are running as expected.

    PLAY [all] ********************************************************************
    TASK: [check Events API]
    TASK: [verify Events API]
    TASK: [check Auth API]
    TASK: [verify Auth API]
    TASK: [check concrete5]
    TASK: [verify concrete5]
    TASK: [check Auth application]
    TASK: [verify Auth application]
    TASK: [check Events application]
    TASK: [verify Events application]
    
    PLAY RECAP ********************************************************************
    localhost                  : ok=5    changed=0    unreachable=0    failed=0   
    Output system tests

    Verification

    The requirements outlined for the proof-of-concept are verified manually against the running proof-of-concept infrastructure.

    Requirement R01 PHP applications
    Acceptance Criteria Each PHP application is running and can be accessed with a web browser.
    Verification

    The website is running at http://www.worldskills.ch/ and displays information from the Events web service. Clicking on Login redirects to http://auth.worldskills.ch/ where the Auth application is running as expected.

    Verdict Pass

    Requirement R02 Java applications
    Acceptance Criteria Each Java application is running and can be accessed over HTTP/S.
    Verification

    The Auth and the Events web service are running. Events are listed when accessing https://api.worldskills.ch/events and organizational groups are listed on https://api.worldskills.ch/auth/ws_entities/1.

    Verdict Pass

    Requirement R03 Server configuration
    Acceptance Criteria A configuration file gets modified and the change is pushed to the code repository. The new configuration file is automatically deployed to the server and the affected applications load the new configuration.
    Verification

    After increasing the value of query_cache_size in the file roles/mysql/templates/my.cnf.j2 and pushing the change to GitHub, the playbook is executed and the change is deployed to the server. The MySQL server automatically gets restarted because of the configuration file change.

    Verdict Pass

    Requirement R04 Application deployment
    Acceptance Criteria A new version of an application gets pushed to the code repository. Automated tests of the application are executed and if they pass the application gets deployed to the server. The old version keeps responding to requests until the new version is ready.
    Verification

    The favicon.ico of the main website is changed and pushed to GitHub. The unit tests are executed on Codeship and the Ansible playbook is checked out. With a separate playbook the changed file gets deployed to the server.

    The new favicon is only visible once the new version of the website is completely ready.

    Verdict Pass

    Requirement R05 Staging environment
    Acceptance Criteria A new version of a web service is pushed in a separate branch, the functionality is available for testing in a staging environment within 30 minutes from the push.
    Verification

    A new branch called staging-timetable is created for the Events web service and pushed to GitHub. The unit tests are executed on Codeship and the Ansible playbook is checked out. A new staging environment is created on DigitalOcean and all applications are installed.

    The new staging environment can be used at http://staging-timetable-www.worldskills.ch/ after 19 minutes.

    Verdict Pass

    Requirement R11 Configuration files
    Acceptance Criteria All configuration files are stored in a code repository.
    Verification

    Configuration files are part of the Ansible playbook and stored in the code repository.

    Verdict Pass

    Requirement R12 Change history
    Acceptance Criteria Every change must be traceable by a developer. Associated with every change is an explanation.
    Verification

    Every change gets committed to the code repository. An explanation is included in the commit message.

    Verdict Pass

    Requirement R13 Open Source
    Acceptance Criteria All software used for infrastructure has to be built on open-source software. This guarantees that components can easily be ported to different providers or maintainers. It also allows other Members to easily copy parts of the infrastructure.
    Verification

    The playbook can be executed with Ansible which is Open Source. The required Python dependencies dopy and httplib2 are Open Source too.

    Verdict Pass

    Requirement R14 Test environment
    Acceptance Criteria To test configuration changes, the whole infrastructure or parts of it can be started in a test environment. This is different from the staging environment in that the test environment can be local and automated tests are executed against it.
    Verification

    The playbook can be executed locally with Vagrant. Automated tests can be executed locally too.

    Verdict Pass

    Requirement R15 Encrypted passwords
    Acceptance Criteria Server passwords should be stored only encrypted on third-party systems. The advantage of storing encrypted passwords in the code repository and sharing a key file instead of sharing the passwords in a file, is that the file doesn't need to be updated for everyone each time a password is added.
    Verification

    Sensitive data is stored inside the Ansible Vault file.

    Verdict Pass

    Requirement R16 Superuser access
    Acceptance Criteria In case of a problem that only occurs in a certain environment, a developer needs unrestricted access to the server to debug the error and try out different solutions.
    Verification

    The servers on DigitalOcean can be accessed directly with SSH.

    Verdict Pass

    Requirement R17 Custom software
    Acceptance Criteria New software can be installed without restrictions. New features or analytics tools might require the installation of additional software.
    Verification

    Any software for Linux can be installed on the servers. As an example, rsync was installed on the servers to get data files for the website from the existing infrastructure.

    Verdict Pass

    Requirement R18 Learning curve
    Acceptance Criteria How to use the software system to install and configure the infrastructure can be learned quickly so all developers can make changes to the infrastructure without spending weeks studying it.
    Verification

    Ansible can be understood within one day. The development of the Ansible playbook took about two weeks.

    Verdict Pass

    The proof-of-concept passes all 13 requirements that were defined for it. The proof-of-concept covers all Must and Should requirements of the infrastructure.

    Conclusions

    Ansible and Idempotence

    The project has shown that idempotence on a computer system is hard. This applies not only to Ansible but to all server management software. There are always multiple sources of change on a modern computer system, and an almost infinite number of states exists. Reducing the configuration of the whole system to one code repository (the Ansible playbook in this case) helps by reducing dependencies.

    The promise of idempotence also doesn't hold in every case with Ansible, as some attributes require additional manual actions after a change. Changing the physical server location in the digital_ocean module, for example, won't move the server from one location to another: the attribute is only used during the creation of the server. Another example is the copy module, which makes sure that a file exists at the specified location. The module checks this with every execution; however, when the file path is changed (e.g. because it contained a typo) it only makes sure the file exists at the new location. To delete the file at the old location a separate task has to be written in Ansible. So the developers always have to be aware of what a module exactly does and have to take care of cleanup operations themselves.
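
    Such a cleanup can be expressed as an explicit task with the file module; the path in this sketch is an assumption:

    - name: remove configuration file from old location
      file: path=/etc/php5/apache2/conf.d/old.ini state=absent

    Cleanup task (sketch)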

    Given that the whole infrastructure can be duplicated automatically, one could think of using this for automated testing of every change to the Ansible playbook. However, as the creation of such a testing environment takes around 20 minutes, every little change to the infrastructure would also take at least that long, so this option is not considered useful. Instead, the choice of creating a staging environment is left up to the developer if a certain change needs to be tested in more detail.

    In summary the usage of Ansible makes the infrastructure more structured and changes more visible while keeping it flexible for future developments (like the integration of Docker).

    Staging Environment

    The time required to create the staging environment is considered a potential problem, especially as it gets worse with every additional application. Most of the time is spent building the JavaScript and PHP applications. Storing compiled versions of these applications on a central server would improve the build time of a new staging environment but requires development effort outside the scope of this project.

    Another option for speeding up the creation of the staging environment would be to clone the servers completely (on the system level) instead of recreating them every time with Ansible. However, this only works within the same hosting provider (and not even with all hosting providers), and one of the targets of this project was to be independent of the hosting provider. By recreating the servers each time, the developers can be sure that the playbook is ready to be used with a different hosting provider.

    The staging environment is also well suited for end-user tests and automated tests with tools like Selenium.

    Software Containers

    The evaluation showed that software containers have potential in abstracting the deployment by bundling applications into their own units and separating the infrastructure from the applications. However, adopting this technology requires changes to the applications, and the deployment needs to be orchestrated with additional software, which increases the complexity of the system as a whole.

    Applications can be moved between servers more easily and are less prone to configuration mistakes. These advantages need to be carefully weighed in each case, and testing new approaches before implementing them in production is recommended.

    Objectives

    The objectives of creating a versioned, testable and reproducible infrastructure where all changes are visible to the whole IT team have been fully achieved. The combination of multiple technologies and services allows quick modifications, and the results are visible to everyone.

    Comparison before and after infrastructure concept

    Comparing the deployment pipeline before and after the infrastructure concept shows that infrastructure changes and application deployments now follow the same steps. Manual actions on the server and custom application deployment scripts have been replaced with Ansible.

    Handling changes to the infrastructure the same way as changes to applications helps the developers in their daily work, as the processes and feedback loops are aligned. They get notified about modifications to the infrastructure in the chat application Slack, can review the diff on GitHub and give feedback there. The testing and deployment status is displayed in Slack as well, and the correction of failed changes can be organized right there with the developers online.

    Recommendations

    It is recommended to keep an eye on the development of software containers and to re-evaluate their usage in six months. Suitable applications should then be migrated within three months if software containers bring reasonable advantages.

    Using Ansible for provisioning and deployment of the infrastructure makes sense. It is recommended to extend the proof-of-concept playbook with all applications running on the current infrastructure. Once the setup of the new infrastructure is completed a migration to it can be done with Ansible as outlined before. The targeted time for this migration is fall 2015.

    As the setup of a staging environment takes some time, it's questionable how useful it is for small features. Investing in unit tests that can be executed locally is recommended here. Being able to replicate the complete infrastructure is useful for bigger features which span multiple services. It is recommended to implement the staging environment as described in the architecture documentation, but to use it only when needed and to further invest in automated system tests.

    Appendix

    Bibliography

    Agile Orbit (2015): java Cookbook. https://supermarket.chef.io/cookbooks/java (retrieved 21. April 2015)

    Ansible, Inc. (2015): Ansible Documentation. April 2015. http://docs.ansible.com/ (retrieved 20. April 2015)

    Ansible, Inc. (2015): Ansible Tower Pricing. http://www.ansible.com/pricing (retrieved 13. July 2015)

    Ansible, Inc. (2015): Ansible Tower. http://www.ansible.com/tower (retrieved 18. June 2015)

    Ansible, Inc. (2015): Ansible. http://www.ansible.com/ (retrieved 20. April 2015)

    Apache Software Foundation (2013): Apache Tomcat Maven Plugin. November 2013. Version 2.2. https://tomcat.apache.org/maven-plugin-2.2/ (retrieved 18. July 2015)

    Apache Software Foundation (2015): Apache HTTP Server Project. https://httpd.apache.org/ (retrieved 19. July 2015)

    Apache Software Foundation (2015): Apache Tomcat 7 Configuration Reference - The Context Container. June 2015. Version 7.0.63. https://tomcat.apache.org/tomcat-7.0-doc/config/context.html (retrieved 18. July 2015)

    Ben-Kiki O., Evans C., Döt Net I. (2009): YAML Ain’t Markup Language. http://www.yaml.org/spec/1.2/spec.html (retrieved 31. May 2015)

    CFEngine AS (2015): CFEngine. http://cfengine.com/ (retrieved 13. July 2015)

    Chef Software, Inc. (2015): Chef. https://www.chef.io/ (retrieved 20. April 2015)

    Chef Software, Inc. (2015): Supermarket. https://supermarket.chef.io/ (retrieved 21. April 2015)

    Chesne A. (2014): small-n-flat. http://paomedia.github.io/small-n-flat/ (retrieved 3. November 2015)

    Codeship (2015): Codeship. https://codeship.com/ (retrieved 19. July 2015)

    Codeship (2015): Security Guidelines of Codeship. https://codeship.com/security (retrieved 4. June 2015)

    CoreOS, Inc. (2015): rkt Documentation. https://coreos.com/rkt/docs/0.5.6 (retrieved 13. July 2015)

    DeHaan M. (2014): Ansible GitHub. https://github.com/ansible/ansible (retrieved 26. July 2015)

    DeHaan M. (2014): Ansible GitHub issue #9877. https://github.com/ansible/ansible/issues/9877 (retrieved 24. June 2015)

    DeHaan M. (2014): Ansible GitHub issue #999. https://github.com/ansible/ansible/issues/999 (retrieved 24. June 2015)

    DigitalOcean, Inc. (2015): Digital Ocean. https://www.digitalocean.com/ (retrieved 26. July 2015)

    Docker, Inc. (2015): Docker. https://www.docker.com/ (retrieved 20. April 2015)

    Docker, Inc. (2015): Docker Documentation. Version v1.5. https://docs.docker.com/v1.5/ (retrieved 7. April 2015)

    Docker, Inc. (2015): Docker Hub. https://hub.docker.com/ (retrieved 13. July 2015)

    Free Software Foundation, Inc. (2007): GNU General Public License. June 2007. Version 3. https://www.gnu.org/licenses/gpl.html (retrieved 13. July 2015)

    Geerling J. (2015): Ansible for DevOps. May 2015. ISBN 978-0-9863934-0-2

    GitHub, Inc. (2015): GitHub. https://github.com/ (retrieved 19. July 2015)

    GitHub, Inc. (2015): GitHub Security. https://help.github.com/articles/github-security/ (retrieved 4. June 2015)

    Gravi T. (2015): Docker Official Image packaging for PHP. https://github.com/docker-library/php (retrieved 7. April 2015)

    Gregorio J. (2015): httplib2. https://pypi.python.org/pypi/httplib2 (retrieved 19. July 2015)

    HashiCorp (2015): Vagrant. https://www.vagrantup.com/ (retrieved 13. July 2015)

    Hochstein L. (2015): Ansible: Up and Running. March 2015. Early release revision 4. ISBN 063-6-920-03562-6

    International Organization for Standardization (2011): ISO/IEC 25010:2011. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=35733 (retrieved 13. July 2015)

    Jaynes M. (2014): Taste Test: Puppet, Chef, SaltStack, Ansible. June 2014. https://devopsu.com/books/taste-test-puppet-chef-salt-stack-ansible.html (retrieved 7. April 2015)

    Parallels IP Holdings GmbH (2015): Plesk. http://parallels.com/en/products/plesk (retrieved 7. April 2015)

    Puppet Labs (2015): Open Source Puppet. https://puppetlabs.com/puppet/puppet-open-source (retrieved 13. July 2015)

    PyPA (2014): pip. https://pip.pypa.io/en/stable/ (retrieved 13. July 2015)

    (R)?ex - A simple framework to simplify system administration and datacenter automation. http://www.rexify.org/ (retrieved 13. July 2015)

    SaltStack (2015): SaltStack automation for CloudOps, ITOps & DevOps at scale. http://saltstack.com/ (retrieved 13. July 2015)

    SPI (2015): Apt, Advanced Package Tool. https://wiki.debian.org/Apt (retrieved 13. July 2015)

    Starke G., Hruschka P. (2014): arc42 Template. March 2012. Version 6.0. http://www.arc42.de/template/ (retrieved 13. July 2015)

    Stephenson S. (2013): rbenv. July 2014. https://github.com/sstephenson/rbenv (retrieved 26. July 2015)

    Selenium Project (2015): Selenium. http://docs.seleniumhq.org/ (retrieved 19. July 2015)

    Slack Technologies, Inc. (2015): Slack. https://slack.com/ (retrieved 19. July 2015)

    The Apache Software Foundation (2004): Apache License. January 2004. Version 2.0. https://www.apache.org/licenses/LICENSE-2.0.html (retrieved 13. July 2015)

    The PHP Group (2015): PEAR. http://pear.php.net/ (retrieved 13. July 2015)

    The PHP Group (2015): PHP Supported Versions. https://php.net/supported-versions.php (retrieved 7. April 2015)

    Turnbull J. (2015): The Docker Book. February 2015. Version v1.5.0. ISBN 978-0-9888202-3-4

    Van Zoest S. (2015): apache2 Cookbook. https://supermarket.chef.io/cookbooks/apache2 (retrieved 21. April 2015)

    Vasiliev A. (2014): Cooking Infrastructure by Chef. http://chef.leopard.in.ua/ (retrieved 18. April 2015)

    Venezia P. (2013): Puppet vs. Chef vs. Ansible vs. Salt. http://www.infoworld.com/article/2609482/data-center/data-center-review-puppet-vs-chef-vs-ansible-vs-salt.html (retrieved 13. July 2015)

    Viallet V. (2015): dopy. https://pypi.python.org/pypi/dopy (retrieved 19. July 2015)

    YesLogic Pty. Ltd. (2015): Prince. http://www.princexml.com/ (retrieved 25. July 2015)

    Yum Package Manager. http://yum.baseurl.org/ (retrieved 13. July 2015)

    Zürcher Hochschule für Angewandte Wissenschaften (2014): Reglement Bachelorarbeit Studiengang Informatik der ZHAW am Standort Zürich. March 2015. Version 3.3. https://ebs.zhaw.ch/files/documents/informatik/Reglemente/Bachelor/Bachelorarbeit/a_Reglement-Bachelorarbeit_Studiengang-Informatik_V3.3.pdf (retrieved 25. July 2015)

    List of Figures

    Software

    To make the results of this thesis reproducible, the following list shows the version of the software used on the development machine:

    The following versions of software were used on the server:

    Glossary

    Branch
    A method for making parallel code changes in a code repository.
    CI
    Continuous integration: Automated testing and deployment of code changes.
    Cloud
    Running dynamic computer software without detailed knowledge of the physical hardware.
    CMS
    Content management system: Software to edit websites.
    Code repository
    Used to store and manage source code in a version control system.
    Competition
    WorldSkills Competition: Biennial skills competition for youth.
    CSS
    Cascading Style Sheets: Language used to style websites.
    Database fixtures
    Initial data for a database.
    Debian
    Variation of Linux maintained by a group of volunteers.
    Deployment
    Installing or updating a software on a server.
    DNS
    Domain Name System: Software used to resolve domain names.
    Environment
    Logical group of servers.
    Favicon
    Small icon used by web browsers to represent a website.
    Git
    A distributed version control system.
    Horizontal scaling
    Extending the capacity of a service by distributing the load over multiple servers.
    HTML
    Hypertext Markup Language: A markup language used on the world wide web.
    HTTP
    Hypertext Transfer Protocol: A protocol for transferring data on the world wide web.
    HTTPS
    HTTP Secure: Protocol for using HTTP in a secure way with an added encryption layer.
    Idempotence
    Operation that can be executed multiple times without changing the state after an initial change.
    IP address
    Numerical identifier of a computer within a network.
    IT
    Information technology: Management of data using computers.
    Java
    Programming language used for different purposes.
    JavaScript
    Programming language which can be executed in a web browser.
    Kernel
    Component of an operating system responsible for executing system commands.
    Linux
    Open Source operating system.
    Mac OS X
    An operating system by Apple Inc.
    Maven
    Dependency and build automation software for Java applications.
    Member
    Member organization of WorldSkills International.
    mod_php
    Apache web server module for running PHP.
    MySQL
    A relational database management system.
    MySQL Query Cache
    A cache of MySQL used to reduce the time required to parse the query.
    OS
    Operating system: Software running on a computer and providing the possibility to run other software.
    OSS
    Open source software: Software where the source code is licensed in a way that it can be viewed and changed.
    PDF
    Portable Document Format: A portable file format for documents.
    PHP
    A programming language used mostly on the world wide web.
    PHP-FPM
    PHP FastCGI Process Manager: A way to run PHP so that a web server can execute PHP applications.
    Python
    Dynamic programming language used for various purposes.
    RabbitMQ
    Open source message queue software.
    Race condition
    Unintended dependency of the output of a system on timing.
    Ruby
    A dynamic programming language used on the web and on the command line.
    Secretariat
    People working for the organization WorldSkills International.
    Servlet container
    An application to run Java web applications.
    SSH
    Secure Shell: An encrypted protocol for executing commands on a remote server.
    Staging
    A dedicated environment to test software applications.
    Superuser
    Special user account on a system with unrestricted access.
    SUSE Linux
    Variation of Linux supported by Novell.
    Symlink
    Symbolic link: A file or directory pointing to another path.
    Template file
    File with placeholders that gets compiled into its final form by replacing those placeholders with variable values.
    Tomcat
    A web server to run Java applications.
    Ubuntu
    Variation of Linux maintained by Canonical Ltd.
    URL
    Uniform resource locator: Reference to a document in a network.
    Vagrant
    Software for running virtual machines locally for development.
    Virtual machine
    Emulation of a computer system used to run multiple machines on the same hardware.
    Webhook
    A URL that gets called by an online service after a specific action to let another service know about the outcome of the action.
    Web service
    Interface to provide programmatic access to data and functionality over the web.
    YAML
    YAML Ain't Markup Language: File format for writing structured data.