Incremental Development

"Deliver software frequently..." -- The Agile Manifesto

Lesson 1: Why incremental?
"He pushed another commit!"

Why do incremental development?

  • Small batches of work make testing easier.
  • Small batches make it easier to find a bug when you have one.
  • Small batches enable rapid feedback from the users on whether the software is delivering what they want.
  • Small batches allow us to deliver value to the user more rapidly.
  • Small batches are more satisfying to the developers, because they get to see their software being used soon after they write it.

Completing and delivering work in small batches is such an important part of DevOps that we could almost say that every other part of DevOps exists to enable frequent, reliable delivery of small improvements to a piece of software.

In older models of software development, it was thought that rapid delivery of software to users meant buggy systems that crashed a lot.

To complete this section, please read Iterative and incremental development

Lesson 2: Version control everything

Version control is the name for tools that store different versions of project files and let you revert to an earlier version (for instance, when a bug is found in the new version). There are a variety of version control systems in existence, such as RCS, SCCS, Subversion, CVS, and git. You should study the page on version control here.

For the rest of lesson 2, please read:
Version control everything

Lesson 3: git and GitHub
A brief tutorial on using git.

First things first: don't confuse git with GitHub! git is a system for creating and updating a distributed source code control repository. It includes features for adding new files, updating files, deleting files or versions from the repository, branching, resolving version conflicts, and reverting to earlier versions.

GitHub, on the other hand, is a website that allows for free storage of public repositories ("repos" for short). It is not necessary to use GitHub in order to use git: some companies have an "origin" repo they keep internally, while other people use other public sites, such as BitBucket, to hold the origin version of their repo. (We, in fact, will explore moving to BitBucket during our course.)

Basic git

You will only need a small number of git commands for most of our work. (You can also use git from GitHub Desktop, but it is good to be at least a little familiar with command line usage.) In particular, you need to know:

  • git clone https://github.com/gcallah/OnlineDevops.git
    This makes a local copy of a repo from an online copy. The online copy becomes the origin repo for the local copy.
  • git add <filename>
    Adds <filename> to the repo.
  • git commit <filename> -m "Message"
    Commits changes to <filename> into the repo, labeled with "Message".
    Note: Commit messages are important! They let your teammates know why a new version of the file has been created.
  • git push origin master
    This pushes the master branch of your code to the origin repo. (Some projects create multiple branches: following the guidelines for continuous delivery, we will not be doing that.) You run this when you are ready to share your work with your team: that should happen frequently! Several pushes per hour is not unusual.
  • git pull origin master
    This is how you refresh your local repo with the work others have been doing. You should certainly do this to start your work day, and if you know others are working at the same time, perhaps more often.
  • git rm <filename>
    This removes <filename> from the repo and your local file system. You should always delete files through git and not using your native OS's capabilities.
  • git mv <old_filename> <new_filename>
    This renames <old_filename> to <new_filename>. You should always rename files through git and not using your native OS's capabilities.
  • git status
    This shows you the current state of your repo.
  • git branch <branchname>
    This will create a new branch with <branchname>. As mentioned above, our projects generally discourage branching as an anti-CI pattern.
  • git checkout <branchname>
    This will set your working branch to <branchname>.
  • git merge <branchname>
    This will merge <branchname> into the current branch. Therefore, use git status to make sure you are on the branch into which you want to merge before you do the merge.
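
The commands above can be strung together into a small practice session. The following is only a sketch: it assumes a throwaway bare repo in a temporary directory standing in for the online origin repo, and the file names, commit messages, and identity settings are made up for illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Assumption: a local bare repo plays the role of the online origin repo.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Clone it, as you would clone a repo from GitHub.
git clone -q origin.git mywork
cd mywork
git symbolic-ref HEAD refs/heads/master      # make sure the branch is called master
git config user.email "student@example.com"  # hypothetical identity for the demo
git config user.name "Student"

# Add a new file, commit it with a message, and share it.
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add first draft of notes"
git push -q origin master

# Rename the file through git, then commit and push again.
git mv notes.txt README.txt
git commit -q -m "Rename notes.txt to README.txt"
git push -q origin master

# Refresh from origin and check the state of the repo.
git pull -q origin master
git status --short    # prints nothing when there is no uncommitted work
```

Each push here carries one small change, which is the habit the lessons above are encouraging.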

Some files to know about:

  • .git is the actual repository: the other files you see are working copies of the latest version in the repo.
  • .gitignore lists the various types of files you don't want in the repo.
    A sample git ignore file.
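
To see what a .gitignore does, here is a minimal sketch in a scratch repo. The three patterns (compiled Python files, log files, and a build directory) are assumptions chosen for illustration, not the contents of the course's sample file.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

# Hypothetical patterns: one per line.
printf '%s\n' '*.pyc' '*.log' 'build/' > .gitignore

# Create one file we want in the repo and two we do not.
touch app.py app.pyc debug.log

# Only the files that are NOT ignored show up as untracked.
git status --short
# ?? .gitignore
# ?? app.py

# Ask git which pattern is responsible for ignoring a file.
git check-ignore -v debug.log
```

Note that .gitignore itself is an ordinary file: you add and commit it like any other, so the whole team shares the same rules.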

git is a distributed version control system. This is important to understand for a couple of reasons:

  1. Using git one can work entirely offline, and continue checking code into the repo: each person who has cloned the repo has a full copy of it. Whenever you get back online, you can re-sync.
  2. Understanding many of the problems one will hit while using git is easier knowing this: typically, problems arise when two copies of the repo get out of sync, and git has trouble re-syncing them.
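
Point 1 can be tried out with a small experiment. This is a sketch under assumptions: a bare repo in a temporary directory stands in for origin, "offline" simply means that no command talks to origin for a while, and the file name and identity are invented.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Assumption: a local bare repo stands in for the origin repo.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master
git clone -q origin.git laptop
cd laptop
git symbolic-ref HEAD refs/heads/master
git config user.email "student@example.com"  # hypothetical identity
git config user.name "Student"

echo "v1" > plan.txt
git add plan.txt
git commit -q -m "Start the plan"
git push -q origin master

# "Offline": the next two commits never contact origin.
echo "v2" >> plan.txt
git commit -q -m "Expand the plan" plan.txt
echo "v3" >> plan.txt
git commit -q -m "Finish the plan" plan.txt

# Count the commits that exist locally but are not yet on origin.
ahead=$(git rev-list --count origin/master..master)
echo "local commits not yet on origin: $ahead"   # prints 2

# Back "online": re-sync by pushing.
git push -q origin master
```

The two offline commits are full commits in the local copy of the repo; the push at the end only copies them to origin.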

Resolving conflicts

So what to do if your copy of the repo is out of date with the origin copy? Here are some things to look at:

  • Sometimes, the problem is simply that you haven't done a git pull recently enough: try the pull command above, and see if that clears up your problem.
  • But maybe you and someone else have actually been working on the same file at the same time? git does a pretty good job of trying to resolve this: if you have obviously been working on different parts of the file, git can often merge your changes correctly, without human intervention. But if git can't do this, it will produce a marked-up file for you to merge by hand. The file will contain lines like:

        <<<<<<< HEAD:file.txt
        Hello world
        =======
        Goodbye
        >>>>>>> 77976da35a11db4580b80ae27e8d65caf5208086:file.txt

    Your job then is to choose whether you want "Hello world" or "Goodbye" in the file, and delete the one you don't want, as well as deleting those weird lines git added.
  • Finally, you may have a situation where you realize you were just "messing around" and don't want to retain any of what you wrote. (In that case, a branch would have been useful!) You can sometimes fix that up by letting git know you want it to ignore your work:
    git checkout --theirs <filename>
    git add <filename>
    After that, you can perform a commit and a push as above. This will cause git to ignore your changes to that file and just use "theirs": the ones in the origin repo.
    On the other hand, sometimes you want your own version to be used rather than the version from "their" repo; then you can use --ours rather than --theirs:
    git checkout --ours <filename>
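
The whole conflict cycle can be rehearsed safely in a scratch repo. The sketch below makes some assumptions: a branch named teammate stands in for the work someone else pushed to origin (so that everything runs locally), and the file contents mirror the example above.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.email "student@example.com"  # hypothetical identity
git config user.name "Student"

echo "Hello" > file.txt
git add file.txt
git commit -q -m "Add greeting"

# "Their" change (assumption: a branch plays the part of a teammate's repo).
git checkout -q -b teammate
echo "Goodbye" > file.txt
git commit -q -m "Say goodbye" file.txt

# Our own change to the same line.
git checkout -q master
echo "Hello world" > file.txt
git commit -q -m "Greet the world" file.txt

# The merge stops with a conflict; "|| true" lets the script carry on.
git merge teammate || true
cat file.txt    # the marked-up file, with both versions and the marker lines

# Decide to drop our change and keep theirs, then finish the merge.
git checkout --theirs file.txt
git add file.txt
git commit -q -m "Resolve conflict by taking their version"
cat file.txt    # Goodbye
```

Swapping --theirs for --ours in the same place would leave "Hello world" in the file instead.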

Submodules

The final aspect of git we will discuss here is submodules. They are a method of incorporating one repo within another one. A good example of the use of a submodule is our utils repo: it contains various programs that are useful in multiple projects. We don't want to have different copies of these programs: that violates DRY! So what we do is to share that repo among several repos by including it as a submodule.

Submodules are useful, but git's implementation of them is... tricky. First of all, a repo's submodules do not update automatically! This is sensible, because the submodule might be someone else's code, and you don't want a new version to break your program. So you have to cd to the submodule directory, and do a pull there to update it.

First, let's add a submodule: we will add the utils repo we use in all of our projects as a submodule in our practice repo. The first thing to do is to add it:
git submodule add https://github.com/gcallah/utils.git
(The presenter will do this step for the class, since we can only add a submodule once per repo!)
This will create a new file called .gitmodules and a new dir called utils. Together these describe the submodule we have added to the repo: .gitmodules records where the files in this submodule come from (its path and URL), while utils holds the submodule's files themselves.

Now we need to see how to clone a repo that has a submodule and make things work properly. After you clone a repo with a submodule, you will see that you have a directory for the submodule, but that it is empty. You need to do two more steps:
git submodule init
will initialize your local configuration files to "know" about the submodule. And:
git submodule update
will fill the submodule directory with the proper files.
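
The clone, init, update sequence can be practiced without touching the real utils repo. This is a sketch under several assumptions: the repos utils-src and project are invented stand-ins living in a temporary directory, and the setting protocol.file.allow=always is needed only because recent versions of git refuse to clone submodules from local directories by default. With a normal https URL you would not need it.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A tiny repo standing in for the shared utils repo (hypothetical content).
git init -q utils-src
cd utils-src
git symbolic-ref HEAD refs/heads/master
git config user.email "student@example.com"
git config user.name "Student"
echo 'echo "hello from utils"' > hello.sh
git add hello.sh
git commit -q -m "Add hello script"
cd ..

# A project repo that includes it as a submodule.
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "student@example.com"
git config user.name "Student"
git -c protocol.file.allow=always submodule add "$work/utils-src" utils
cat .gitmodules    # shows the path and URL recorded for the submodule
git commit -q -m "Add utils as a submodule"
cd ..

# A fresh clone has the utils directory, but it is empty.
git clone -q project fresh
cd fresh
ls -A utils

# The two extra steps fill it in.
git submodule init
git -c protocol.file.allow=always submodule update
ls utils           # hello.sh
```

Many people combine the last two steps as git submodule update --init, which does the same thing in one command.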

From that point on, it is important to note that the submodule will not update automatically when you do a pull. Given that the submodule might be third-party software, we don't want a new version that might break our code automatically joining our repo! Thus, when you need the new version of a submodule in a repo, you have to update it explicitly. We are working to automatically include such an update in our build files, where appropriate.
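
An explicit update might look like the following sketch. As before, utils-src and project are invented local stand-ins, and protocol.file.allow=always is only there because the sketch uses a local directory in place of a real URL. The key steps are the pull inside the submodule directory and the commit in the outer repo that records the new version.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the shared utils repo (hypothetical content).
git init -q utils-src
cd utils-src
git symbolic-ref HEAD refs/heads/master
git config user.email "student@example.com"
git config user.name "Student"
echo 'echo "hello"' > hello.sh
git add hello.sh
git commit -q -m "Add hello script"
cd ..

# A project that uses it as a submodule.
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "student@example.com"
git config user.name "Student"
git -c protocol.file.allow=always submodule add "$work/utils-src" utils
git commit -q -m "Add utils as a submodule"
cd ..

# Meanwhile, the utils repo gains a new version.
cd utils-src
echo 'echo "goodbye"' > goodbye.sh
git add goodbye.sh
git commit -q -m "Add goodbye script"
cd ..

# The project does not see it until we pull inside the submodule...
cd project/utils
git pull -q origin master
cd ..

# ...and then record the new submodule version in the project itself.
git status --short    # utils shows up as modified
git add utils
git commit -q -m "Update utils submodule"
```

Until that last commit is pushed, teammates who clone the project will still get the older version of utils.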

Submodules sometimes get so seriously out of sync with the origin repo of the submodule that the only solution is to delete the submodule and re-add it. That process is described here. (After deleting it, you simply re-add it with the normal steps.)

Other Readings

Quiz

    git and GitHub are...?

    1. just different names for the same software
    2. very different: git is the basic version control software, while GitHub is a place to store git repos
    3. entirely unrelated
    4. none of the above

    If you type 'git pull origin master' then master is...?

    1. the branch you are pulling
    2. the level of difficulty of the pull
    3. the repo you are pulling from
    4. how much git should try to force the pull

    We use "git clone" to...?

    1. add a clone of a file to a repo
    2. turn one repo into two repos
    3. make a local copy of a repo
    4. all of the above

    If we see text like "<<<<<<< HEAD:file.txt" inserted into a file, that is...?

    1. the header for the file
    2. a UNIX command accidentally inserted in the file
    3. git showing us where a conflict is in file.txt
    4. none of the above

    Automated deployment should be coupled with...?

    1. automated tests
    2. manual steps to make sure the sys admin is on their toes
    3. manual configuration
    4. none of the above

    We develop incrementally because it...?

    1. increases programmer satisfaction
    2. produces less buggy software
    3. delivers value to the customers faster
    4. all of the above

    Incremental and iterative development were a response to failures in...?

    1. functional programming
    2. the waterfall model
    3. test-driven development
    4. resume-driven development

    Version control is an important part of incremental development because...?

    1. it allows us to roll back changes easily
    2. it permits auditors to see how much work has been done
    3. "if you don't know your past you won't know your future."
    4. all of the above

    The problem with describing a server setup in a detailed document is...?

    1. an engineer has to manually do the setup
    2. the document will get out of date
    3. the document is likely to miss some "obvious" step
    4. all of the above

    Commit messages in git are...?

    1. important to communicate to team members what you are up to
    2. unimportant and should be ignored
    3. can be the same for every commit
    4. generated automatically