From charlesreid1

Patching is a useful way to share and manage source code changes, particularly when several people are developing against a main or "trunk" code base. A patch is a file that records the differences between two files or directories. When multiple people modify the same code base, they can exchange patches, which bundle fixes, modifications, or other changes into a single file, rather than exchanging every file that has changed.

Life without patches

Imagine you have a 70,000-line file, and you only changed one line - why share 70,000 lines of code, instead of the 1 line you changed? It gets even worse if the person you're sharing your changes with has changed other lines in that file. Now he or she has to open up the two files, compare them side-by-side, and apply your changes to their code. It's a big hassle, and has a very high potential for someone to unintentionally blow away all their hard work.

Life with patches

You share a small patch file, which will apply the change to the 1 line you modified. The person you share it with need only apply the patch, and only that 1 line is touched.

Creating patches

Creating patches with diff

The Unix utility diff can be used to create patches. It summarizes the differences between two files. As an example, take two files with only minor capitalization/punctuation changes:

hello_world_1.cc
#include <iostream>

using namespace std;
void main()
{
  cout << "Hello World!" << endl;
  cout << "Welcome to C++ Programming!" << endl;
}

hello_world_2.cc
#include <iostream>

using namespace std;
void main()
{
  cout << "Hello world." << endl;
  cout << "Welcome to C++ programming." << endl;
}

Then the output of the diff command will be:

$ diff hello_world_1.cc hello_world_2.cc

6,7c6,7
<       cout << "Hello World!" << endl;
<       cout << "Welcome to C++ Programming!" << endl;
---
>       cout << "Hello world." << endl;
>       cout << "Welcome to C++ programming." << endl;

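This tells me that lines 6 and 7 of the original file were changed into lines 6 and 7 of the new file (in 6,7c6,7 the c stands for "change"). Lines starting with < come from the original file, and lines starting with > come from the new file.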
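To reproduce this comparison, the sketch below recreates both files in a scratch directory and runs diff on them. The scratch directory, the changes.txt file name, and the status variable are illustrative assumptions, not part of the original example. It also shows a detail that matters in scripts: diff normally exits with status 1 when the files differ, which is not an error.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > hello_world_1.cc <<'EOF'
#include <iostream>

using namespace std;
void main()
{
  cout << "Hello World!" << endl;
  cout << "Welcome to C++ Programming!" << endl;
}
EOF

# The second file differs only on lines 6 and 7.
sed -e 's/Hello World!/Hello world./' \
    -e 's/Programming!/programming./' \
    hello_world_1.cc > hello_world_2.cc

# diff exits with status 1 when the files differ, so capture the status
# instead of treating it as a failure.
status=0
diff hello_world_1.cc hello_world_2.cc > changes.txt || status=$?

cat changes.txt
echo "diff exit status: $status"
```

The first line of changes.txt should be the 6,7c6,7 change command discussed above.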

I can put this change into a patch file by running:

$ diff hello_world_1.cc hello_world_2.cc > hello_world_1.patch

But if I share this patch with someone else, they need to know which file to apply the patch to. As the number of changes increases, and as I incorporate more changed files into my patches, I will need to include information about which files must be changed in the patch itself.

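Diff has a -u flag, which outputs the differences between the files in the "unified output format". This format includes a header naming both files, along with a few lines of unchanged context around each change, so the patch itself records which file is being patched.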

$ diff -u hello_world_1.cc hello_world_2.cc

--- hello_world_1.cc       2010-11-24 14:03:25.000000000 -0700
+++ hello_world_2.cc       2010-11-24 14:00:17.000000000 -0700
@@ -3,6 +3,6 @@
 using namespace std;
 void main()
 {
-      cout << "Hello World!" << endl;
-      cout << "Welcome to C++ Programming!" << endl;
+      cout << "Hello world." << endl;
+      cout << "Welcome to C++ programming." << endl;
 }

This can also be put into a patch file via:

$ diff -u hello_world_1.cc hello_world_2.cc > hello_world_1.patch

except now the person applying the patch does not need to know which files the patch must be applied to.

Patches become most useful when they are created from and applied to directories. This allows individuals to share changes to an entire directory tree.

Let's say you're dealing with a complex software project with a directory tree that looks something like this:

src/
  models/
  grid/
  turbulence/

If you want to make a patch file of the entire directory of src/ using diff, and you have a copy of the original src/ directory, you can do this:

$ ls
src/
original_src/

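$ diff -ruPN original_src/ src/ > src.patch
  • The original directory comes first and the modified directory second, so that applying the patch turns the original code into the modified code
  • The -r flag runs diff recursively
  • The -u flag includes information about which changes apply to which file in the patch
  • The -P and -N flags ensure that, if a file is not common to both directories (if you either created or deleted files), these files will either be created or deleted when the patch is applied. In GNU diff, -N treats a file missing from either directory as empty, while -P does so only for files missing from the first directory.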
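The following sketch shows the directory case end to end on a tiny, made-up tree. The file names (params.txt, grid.txt, extra.txt) and their contents are hypothetical stand-ins for real source files. It uses -ruN rather than -ruPN, on the assumption that -N alone already covers files missing from either side.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical miniature tree: original_src/ is the pristine copy,
# src/ is the working copy with one edited file and one new file.
mkdir -p original_src/models original_src/grid
echo "alpha = 1" > original_src/models/params.txt
echo "nx = 10"   > original_src/grid/grid.txt
cp -R original_src src
echo "alpha = 2" > src/models/params.txt
echo "ny = 20"   > src/grid/extra.txt

# Old tree first, new tree second. diff exits with status 1 when it
# finds differences, which is expected here.
status=0
diff -ruN original_src/ src/ > src.patch || status=$?

# List the files the patch will touch.
grep '^+++ ' src.patch
```

The listing should include both the edited params.txt and the newly created extra.txt, while the untouched grid.txt does not appear.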

Another very useful flag is the -x (or -X) flag, which keeps any file whose base name matches a shell-style wildcard pattern (a glob, not a regular expression) from being included in the patch. This would be useful if, say, each person working on the software project had to maintain his or her own Makefile, and you wanted to distribute a patch without messing up everyone else's Makefile. You could run:

$ diff -ruPN -x "Makefile" original_src/ src/ > src.patch

or, if you have a number of different files or patterns that you don't want to include in the patch:

$ diff -ruPN -x "Makefile" -x "pattern2" -x "pattern3" original_src/ src/ > src.patch

or alternatively, create a file with one pattern per line and feed it to the -X flag.

$ cat exclude_patterns
pattern1*
*pattern2
*pattern3*

$ diff -ruPN -X exclude_patterns original_src/ src/ > src.patch
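As a quick check that exclusion behaves as described, here is a minimal sketch in which both a Makefile and an ordinary file differ between the two trees. The notes.txt file, the file contents, and the *.o pattern are assumptions made up for the illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical trees in which both a regular file and the Makefile differ.
mkdir original_src src
echo "CXX = g++"     > original_src/Makefile
echo "CXX = clang++" > src/Makefile
echo "version 1"     > original_src/notes.txt
echo "version 2"     > src/notes.txt

# One pattern per line; these are shell-style wildcards matched against
# file base names.
printf '%s\n' 'Makefile' '*.o' > exclude_patterns

# diff exits with status 1 when it finds differences.
status=0
diff -ruN -X exclude_patterns original_src/ src/ > src.patch || status=$?

# Only notes.txt should be listed; the Makefile change is left out.
grep '^+++ ' src.patch
```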

Creating patches with SVN diff

Large software projects are often managed with Subversion. Patches are easy to make with Subversion, since the svn diff command produces output in the same unified format as diff -ruPN. To make a patch from svn, simply redirect the output of svn diff to your patch file:

$ svn diff src/ > src.patch

Applying patches

Single-file patches

Using the example above, I can share the patch for hello_world_1.cc by sharing hello_world_1.patch with someone else who has hello_world_1.cc. Someone else can apply this patch to their hello_world_1.cc by running

$ patch hello_world_1.cc < hello_world_1.patch

which results in a hello_world_1.cc that is now identical to hello_world_2.cc above.
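The round trip can be tried out in a scratch directory. In the sketch below, the recipient/ directory is a hypothetical stand-in for the other person's machine, which holds only the old file and the patch.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > hello_world_1.cc <<'EOF'
#include <iostream>

using namespace std;
void main()
{
  cout << "Hello World!" << endl;
  cout << "Welcome to C++ Programming!" << endl;
}
EOF
sed -e 's/Hello World!/Hello world./' \
    -e 's/Programming!/programming./' \
    hello_world_1.cc > hello_world_2.cc

# Author's side: record the changes (diff exits with 1 when files differ).
diff -u hello_world_1.cc hello_world_2.cc > hello_world_1.patch || true

# Recipient's side: a directory holding only the old file and the patch.
mkdir recipient
cp hello_world_1.cc hello_world_1.patch recipient/
cd recipient || exit 1
patch hello_world_1.cc < hello_world_1.patch

# cmp is silent and exits with 0 when the two files are identical.
cmp hello_world_1.cc ../hello_world_2.cc && echo "patched file matches"
```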

Directory patches

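When a patch of an entire directory is applied, an additional argument must be fed to patch. Assuming src.patch sits next to the src/ directory:

$ cd src/
$ patch -p1 < ../src.patch

The -pN flag sets how many leading path components patch strips from the file names recorded in the patch. The directory patch created above records names that begin with original_src/ and src/, so when patch is run from inside src/, one component must be stripped.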
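A minimal sketch of applying a directory patch follows. It assumes the patch was created from the parent directory with the original tree first and the modified tree second, so that file names in the patch start with a directory component. The tree contents and the recipient/ directory are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Author's side: hypothetical pristine and modified trees.
mkdir -p original_src/models
echo "alpha = 1" > original_src/models/params.txt
cp -R original_src src
echo "alpha = 2" > src/models/params.txt
echo "beta = 3"  > src/models/new_params.txt
diff -ruN original_src/ src/ > src.patch || true

# Recipient's side: their own src/ still matches the pristine tree.
mkdir recipient
cp -R original_src recipient/src
cd recipient/src || exit 1

# The patch names files as original_src/... and src/..., so strip one
# leading directory to get paths relative to the current directory.
patch -p1 < ../../src.patch

# No output from diff -r means the two trees now match.
diff -r . ../../src && echo "trees match"
```

Because the patch was made with -N, the file that exists only in the modified tree is created on the recipient's side as well.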


To illustrate, consider creating a patch using absolute paths:

$ diff -urNP \
  /home/charles/path/to/hello_world_1.cc \
  /home/charles/new/path/to/hello_world_1.cc \
  > hello_world_using_absolute_paths.patch

which creates the patch file

--- /home/charles/path/to/hello_world_1.cc       2010-11-24 14:03:25.000000000 -0700
+++ /home/charles/new/path/to/hello_world_1.cc   2010-11-24 14:00:17.000000000 -0700
@@ -3,6 +3,6 @@
 using namespace std;
 void main()
 {
-      cout << "Hello World!" << endl;
-      cout << "Welcome to C++ Programming!" << endl;
+      cout << "Hello world." << endl;
+      cout << "Welcome to C++ programming." << endl;
 }

Alternatively, if relative paths are used,

$ diff -urNP path/to/hello_world_1.cc newpath/to/hello_world_1.cc > hello_world_using_relative_paths.patch

this creates the patch file

--- path/to/hello_world_1.cc    2010-11-24 14:03:25.000000000 -0700
+++ newpath/to/hello_world_1.cc 2010-11-24 14:00:17.000000000 -0700
@@ -3,6 +3,6 @@
 using namespace std;
 void main()
 {
-      cout << "Hello World!" << endl;
-      cout << "Welcome to C++ Programming!" << endl;
+      cout << "Hello world." << endl;
+      cout << "Welcome to C++ programming." << endl;
 }

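When applying the patch, the first case (absolute paths) requires 3 levels of stripping:

$ patch -p3 < hello_world_using_absolute_paths.patch

which removes /home/charles/ from the names of the files being patched, leaving path/to/hello_world_1.cc. The leading slash counts as one level: -p1 removes only that slash, -p2 removes /home/, and -p3 removes /home/charles/.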

Alternatively, if using the patch created with relative paths, no level of stripping is needed:

$ patch -p0 < hello_world_using_relative_paths.patch
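The effect of different strip levels can be previewed without touching any files. The helper below, strip_path, is a hypothetical function written for this illustration and is not part of patch. It is a simplified imitation of the documented rule that -pN removes the smallest prefix containing N slashes, so the leading slash of an absolute path counts as one level.

```shell
# strip_path N NAME: print NAME with N leading slash-terminated components
# removed, imitating how patch -pN shortens file names (simplified sketch).
strip_path() {
  n=$1
  name=$2
  while [ "$n" -gt 0 ]; do
    name=${name#*/}   # drop everything up to and including the first slash
    n=$((n - 1))
  done
  printf '%s\n' "$name"
}

strip_path 0 /home/charles/path/to/hello_world_1.cc
strip_path 1 /home/charles/path/to/hello_world_1.cc
strip_path 3 /home/charles/path/to/hello_world_1.cc
strip_path 0 newpath/to/hello_world_1.cc
```

With a strip count of 3, the absolute name is reduced to path/to/hello_world_1.cc, while a count of 0 leaves the relative name untouched.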

More complex scenarios
