From charlesreid1

Installing, Configuring, and Building Software in Linux

If you're too impatient (read: lazy) to read through all this, you can also watch the Scientific Computing Summer Program presentation here: Software Presentation.

Source vs. Binary

Most software available for Linux and Unix is open-source, and is provided for free. This software is often provided in two forms: a source package, and a binary package.

Source packages consist of the source code, makefiles, scripts, and other things required to compile the source code. Because you're compiling the source code, on YOUR machine, with YOUR compiler, it doesn't matter what platform you're running. When you compile the code, you will have an executable file that will run on your computer.

Binaries are pre-compiled executables. A Linux binary is different from a Mac OS X binary, which is different from a Windows binary. Often, there may be different binaries even for a single platform (for example, a Mac OS X 64-bit binary is different from a Mac 32-bit binary, and the 64-bit binary will not run on a 32-bit machine). The binary is pre-compiled, which means the developer has already compiled the source code for that particular platform.
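
To pick the right binary you need to know your own platform. A minimal sketch of one way to check, assuming a Unix-like system; the `file` utility is optional and may not be installed everywhere, so it is only used if present:

```shell
# Kernel name (for example Linux or Darwin) and hardware type (for example x86_64)
os=$(uname -s)
arch=$(uname -m)
echo "This machine runs $os on $arch hardware"

# If file(1) is available, it reports what platform an existing binary was built for.
# -L follows symbolic links, since /bin/sh is often a link to another shell.
if command -v file >/dev/null 2>&1; then
    file -L /bin/sh
fi
```

The architecture reported for /bin/sh should match the kind of binary package you download.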

Downloading and running a binary package for your platform is simple to do - you simply download the binary and run it from the command line. To run a Linux or Unix binary, you would type:

$ /path/to/program/bin/program

(Note that the binary will be located in the "bin" folder 99% of the time.) To run a Mac binary, you would move the .app file into your Applications folder (or wherever you would like to run the binary from). And in Windows, you would double-click the .exe file.


Downloading The Source Package

Source files are available on a software project's website. You'll likely see several different versions, so you'll want to get the latest version. Versions are released when problems with the software are fixed, so the latest version will probably have more features and fewer problems. However, if you see a version that is labeled "alpha" or "beta", that means it has not been completely tested yet, and so it's not stable and may crash or behave strangely.

The packages are usually in a compressed format, like .tar.gz or .tar.bz2. A .tar.gz package can be unpacked using the command below (for a .tar.bz2 package, replace the z with a j):

$ tar xvzf program-version.tar.gz

This creates a folder called 'program-version'. This folder contains all the source code that will be compiled in order to run the program.
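
It can be worth listing what is inside an archive before unpacking it. The sketch below builds a tiny stand-in package (the name program-1.0 and its README are made up for illustration) in a temporary directory, lists it, and then extracts it:

```shell
# Build a tiny stand-in package so the example is self-contained.
work=$(mktemp -d)
mkdir "$work/program-1.0"
echo "Build instructions go here" > "$work/program-1.0/README"
(cd "$work" && tar czf program-1.0.tar.gz program-1.0 && rm -r program-1.0)

# t = list the contents without extracting anything
listing=$(cd "$work" && tar tzf program-1.0.tar.gz)
echo "$listing"

# x = extract; z = gzip (use j instead of z for a .tar.bz2 file)
(cd "$work" && tar xzf program-1.0.tar.gz)
extracted=$(ls "$work/program-1.0")
echo "Extracted files: $extracted"

# Remove the temporary directory again
rm -rf "$work"
```

If the listing shows that every file sits under a single top-level folder, extracting will not scatter files across your current directory.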

When you move into the folder that was created and list the files in the directory, you'll probably see several files with names like README or INSTALL. It is a good idea to read these, as the instructions given here will not work for building every program. The README or INSTALL files will give you some basic instructions on how to build and configure the program.

Configuring the Program for Installation

The first step in building your program is to configure it. This is a way of "customizing" the program to fit your needs. Often, a program's default installation settings are not enough. For example, on some systems a program will not build a 64-bit version by default, even if you have a 64-bit processor. You can specify that you want a 64-bit build when you configure the program. You can also control whether features are enabled or disabled, where the program installs itself, and where it looks for various programs it might depend on.

To configure your program, you'll execute a file called "configure", which is essentially a script that checks to make sure you have everything you need. You run the script with various flags (meaning you add --flag to the end, where "flag" is replaced by some special keyword that tells the program how to behave). The configure script will check that you have all the prerequisite software and libraries, that your hardware matches the configure options you gave it (that is, it makes sure that you don't try to install a 64-bit version on a 32-bit machine), and lots of other things.

Since the options available for "configure" vary greatly from program to program, you need a way of seeing what options you have available to you. To do this, you will run configure with a special flag - the help flag.

$ ./configure --help

The ./ at the beginning tells Unix to run the command named "configure" that is located in the current folder, instead of looking for a program already installed on your computer named configure.
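
The difference between the two forms can be seen with a stand-in script. This is only a sketch: the "configure" file created here is a two-line dummy in a temporary directory, not a real configure script, and it assumes no program named configure is already on your PATH:

```shell
# Create a dummy script named "configure" in a temporary directory.
work=$(mktemp -d)
printf '#!/bin/sh\necho "the local configure script ran"\n' > "$work/configure"
chmod u+x "$work/configure"

(
    cd "$work"
    # Without ./ the shell only searches the directories listed in $PATH.
    configure 2>/dev/null || echo "plain 'configure' was not found on PATH"
)

# With ./ the shell runs the file in the current directory.
local_out=$(cd "$work" && ./configure)
echo "$local_out"

rm -rf "$work"
```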

This will print out a lot of output that will give you a long list of different flags and what each will do. For example, here is the output from running this command for the program Subversion:

$ ./configure --help
`configure' configures subversion 1.5.2 to adapt to many kinds of systems.

Usage: ./configure [OPTION]... [VAR=VALUE]...

To assign environment variables (e.g., CC, CFLAGS...), specify them as
VAR=VALUE.  See below for descriptions of some of the useful variables.

Defaults for the options are specified in brackets.

Configuration:
  -h, --help              display this help and exit
      --help=short        display options specific to this package
      --help=recursive    display the short help of all the included packages
  -V, --version           display version information and exit
  -q, --quiet, --silent   do not print `checking...' messages
      --cache-file=FILE   cache test results in FILE [disabled]
  -C, --config-cache      alias for `--cache-file=config.cache'
  -n, --no-create         do not create output files
      --srcdir=DIR        find the sources in DIR [configure dir or `..']

Installation directories:
  --prefix=PREFIX         install architecture-independent files in PREFIX
                          [/usr/local]
  --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX
                          [PREFIX]

By default, `make install' will install all the files in
`/usr/local/bin', `/usr/local/lib' etc.  You can specify
an installation prefix other than `/usr/local' using `--prefix',
for instance `--prefix=$HOME'.

For better control, use the options below.

Fine tuning of the installation directories:
  --bindir=DIR            user executables [EPREFIX/bin]
  --sbindir=DIR           system admin executables [EPREFIX/sbin]
  --libexecdir=DIR        program executables [EPREFIX/libexec]
  --sysconfdir=DIR        read-only single-machine data [PREFIX/etc]
  --sharedstatedir=DIR    modifiable architecture-independent data [PREFIX/com]
  --localstatedir=DIR     modifiable single-machine data [PREFIX/var]
  --libdir=DIR            object code libraries [EPREFIX/lib]
  --includedir=DIR        C header files [PREFIX/include]
  --oldincludedir=DIR     C header files for non-gcc [/usr/include]
  --datarootdir=DIR       read-only arch.-independent data root [PREFIX/share]
  --datadir=DIR           read-only architecture-independent data [DATAROOTDIR]
  --infodir=DIR           info documentation [DATAROOTDIR/info]
  --localedir=DIR         locale-dependent data [DATAROOTDIR/locale]
  --mandir=DIR            man documentation [DATAROOTDIR/man]
  --docdir=DIR            documentation root [DATAROOTDIR/doc/subversion]
  --htmldir=DIR           html documentation [DOCDIR]
  --dvidir=DIR            dvi documentation [DOCDIR]
  --pdfdir=DIR            pdf documentation [DOCDIR]
  --psdir=DIR             ps documentation [DOCDIR]

System types:
  --build=BUILD     configure for building on BUILD [guessed]
  --host=HOST       cross-compile to build programs to run on HOST [BUILD]
  --target=TARGET   configure for building compilers for TARGET [HOST]

Optional Features:
  --disable-option-checking  ignore unrecognized --enable/--with options
  --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
  --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
  --disable-subdir-config do not reconfigure packages in subdirectories
  --disable-neon-version-check
                          do not check the Neon version
  --enable-experimental-libtool
                          Use APR's libtool
  --enable-shared[=PKGS]  build shared libraries [default=yes]
  --enable-static[=PKGS]  build static libraries [default=yes]
  --enable-fast-install[=PKGS]
                          optimize for fast installation [default=yes]
  --disable-libtool-lock  avoid locking (might break parallel builds)
  --enable-all-static     Build completely static (standalone) binaries.
  --disable-keychain      Disable use of Mac OS KeyChain for auth credentials
  --disable-nls           Disable gettext functionality
  --enable-debug          Turn on debugging
  --enable-maintainer-mode
                          Turn on debugging and very strict compile-time
                          warnings
  --disable-mod-activation
                          Do not enable mod_dav_svn in httpd.conf
  --enable-gprof          Produce gprof profiling data in 'gmon.out' (GCC
                          only).
  --enable-runtime-module-search
                          Turn on dynamic loading of RA/FS libraries
  --enable-javahl         Enable compilation of Java high-level bindings
                          (requires C++)

Optional Packages:
  --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
  --without-PACKAGE       do not use PACKAGE (same as --with-PACKAGE=no)
  --with-apache=DIR       Build static Apache modules. DIR is the path to the
                          top-level Apache source directory. IMPORTANT: Unless
                          you are *absolutely* certain that you want to build
                          the modules *statically*, you probably want
                          --with-apxs, and not this option.
  --with-apxs[=FILE]      Build shared Apache modules. FILE is the optional
                          pathname to the Apache apxs tool; defaults to
                          "apxs".
  --with-apr=PATH         prefix for installed APR, path to APR build tree,
                          or the full path to apr-config
  --with-apr-util=PATH    prefix for installed APU, path to APU build tree,
                          or the full path to apu-config
  --with-neon=PREFIX      Determine neon library configuration based on
                          'PREFIX/bin/neon-config'. Default is to search for
                          neon in a subdirectory of the top source directory
                          and then to look for neon-config in $PATH.
  --with-serf=PREFIX      Serf WebDAV client library
  --with-gnu-ld           assume the C compiler uses GNU ld [default=no]
  --with-pic              try to use only PIC/non-PIC objects [default=use
                          both]
  --with-tags[=TAGS]      include additional configurations [automatic]
  --with-trang=PATH       Specify the command to run the trang schema
                          converter
  --with-berkeley-db=PATH The Subversion Berkeley DB based filesystem library
                          requires Berkeley DB $db_version or newer. If you
                          specify `--without-berkeley-db', that library will
                          not be built. Otherwise, the configure script builds
                          that library if and only if APR-UTIL is linked
                          against a new enough version of Berkeley DB. If and
                          only if you are building APR-UTIL as part of the
                          Subversion build process, you may help APR-UTIL to
                          find the correct Berkeley DB installation by passing
                          a PATH to this option, to cause APR-UTIL to look for
                          the Berkeley DB header and library in `PATH/include'
                          and `PATH/lib'. If PATH is of the form `HEADER:LIB',
                          then search for header files in HEADER, and the
                          library in LIB. If you omit the `=PATH' part
                          completely, the configure script will search for
                          Berkeley DB in a number of standard places.
  --with-sasl=PATH        Compile with libsasl2 in PATH
  --with-ssl              This option does NOT affect the Subversion build
                          process in any way. It enables OpenSSL support in
                          the Neon library. If and only if you are building
                          Neon as an integrated part of the Subversion build
                          process, rather than linking to an already installed
                          version of Neon, you probably want to pass this
                          option so that Neon (and so indirectly, Subversion)
                          will be capable of https:// access.
  --with-editor=PATH      Specify a default editor for the subversion client.
  --with-zlib=PREFIX      zlib compression library
  --with-jdk=PATH         Try to use 'PATH/include' to find the JNI headers.
                          If PATH is not specified, look for a Java
                          Development Kit at JAVA_HOME.
  --with-jikes=PATH       Specify the path to a jikes binary to use it as your
                          Java compiler. The default is to look for jikes
                          (PATH optional). This behavior can be switched off
                          by supplying 'no'.
  --with-swig=PATH        Try to use 'PATH/bin/swig' to build the swig
                          bindings. If PATH is not specified, look for a
                          'swig' binary in your PATH.
  --with-ruby-sitedir=SITEDIR
                          install Ruby bindings in SITEDIR (default is same as
                          ruby's one)
  --with-ruby-test-verbose=LEVEL
                          how to use output level for Ruby bindings tests
                          (default is normal)
  --with-junit=PATH       Specify a path to the junit JAR file.

Some influential environment variables:
  CC          C compiler command
  CFLAGS      C compiler flags
  LDFLAGS     linker flags, e.g. -L<lib dir> if you have libraries in a
              nonstandard directory <lib dir>
  LIBS        libraries to pass to the linker, e.g. -l<library>
  CPPFLAGS    C/C++/Objective C preprocessor flags, e.g. -I<include dir> if
              you have headers in a nonstandard directory <include dir>
  CPP         C preprocessor
  CXX         C++ compiler command
  CXXFLAGS    C++ compiler flags
  CXXCPP      C++ preprocessor
  F77         Fortran 77 compiler command
  FFLAGS      Fortran 77 compiler flags

Use these variables to override the choices made by `configure' or to help
it to find libraries and programs with nonstandard names/locations.

Report bugs to <http://subversion.tigris.org/>.


Most of these options are very specialized and the average user will not need them. In fact, if you don't care about customizing the installation, you could run ./configure without options. When you don't specify a value for a flag, the configure script will use a default value.
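
Because the help output is so long, one approach is to filter it for the handful of options you care about. A rough sketch, assuming you are sitting in an unpacked source directory; the search terms and the $HOME/local prefix are only examples:

```shell
prefix="$HOME/local"    # example install location, pick your own

if [ -x ./configure ]; then
    # Show only the help lines that mention prefixes or SSL.
    ./configure --help | grep -i -e 'prefix' -e 'ssl'
else
    echo "No ./configure here; run this from the unpacked source folder"
fi

# The configure line you would then run with that prefix:
echo "./configure --prefix=$prefix"
```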

Sometimes, however, you will need to configure your program with specific options, so the values for various flags will need to be set. Sometimes it is only one or two flags, but sometimes you may need to change 10 flags or more, and set special environment variables or compiler flags to make your compiler behave correctly. If something goes wrong in the configure script, it can be a hassle to re-type the whole line, or to keep editing the configure line on the command line. You may also lose the configure line you used, which is frustrating if you need to re-install the program later, or install it on another system, and you have forgotten the one special flag that makes everything work.

In this case, you can make your own script that runs configure for you. You can call it anything you'd like, but as an example I will use runconfigure.sh. It is a very simple script that, when run, simply runs your configure line. You can make the file using your favorite text editor, and it will look like this:

#!/bin/sh

./configure \
--flag-1=THIS \
--flag-2=THAT \
--flag-3=THE_OTHER \
VARIABLE=value

The first line tells the system that this is a script, and the script should be run using the program /bin/sh. The next lines run configure with the flags flag-1 set to the value THIS, flag-2 set to THAT, and flag-3 set to THE_OTHER. The backslash \ at the end of each line makes sure that everything is put together as one single command (if there were no backslash, it would run the first line as a command, then run the second line as a separate command, and so on).
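
The effect of the backslash is easy to try on a harmless command. In this small sketch, echo stands in for configure and the three words stand in for flags:

```shell
# The backslashes join these three lines into the single command "echo one two three".
joined=$(echo one \
    two \
    three)
echo "$joined"
```

Note that the backslash has to be the very last character on the line; a space after it breaks the continuation.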

One last step remains before you can run your custom configure script - you have to make it executable. You can do this using chmod, which changes the permissions of a file:

$ chmod u+x runconfigure.sh
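
To see what chmod changes, you can test a file's execute permission before and after. A minimal sketch using a throwaway temporary file in place of runconfigure.sh:

```shell
# mktemp creates an empty file that is readable and writable, but not executable.
script=$(mktemp)
printf '#!/bin/sh\necho "running configure..."\n' > "$script"

# [ -x FILE ] succeeds only if FILE is executable by the current user.
if [ -x "$script" ]; then before=yes; else before=no; fi
chmod u+x "$script"
if [ -x "$script" ]; then after=yes; else after=no; fi

echo "executable before chmod: $before, after chmod: $after"
rm -f "$script"
```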

Your custom configure script can be run just like the normal configure script:

$ ./runconfigure.sh

This will give the same output as running ./configure except that it will run configure with the options you want.

Making Your Program

Once you've configured your program, you have told the program what you want it to install and where. The next step is to compile all of the source code into the executable files. This involves using the GNU make program. The make program uses a file called a Makefile, which is essentially a special script containing detailed instructions about how to compile each file that's part of a project. Each file needs to link to the files it refers to, and include files it uses, and the Makefile ensures each file is linked or included in the instructions given to the compiler.

Running configure typically generates the Makefile and sets the variables used by GNU make. Once you have run configure, you can run GNU make by typing:

$ make

Or, in some cases,

$ make all

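Some projects also provide a help target that shows what is available to build:

$ make help

If the Makefile defines a help target, this will display a list of the different things that can be made (also called "make targets"). Running "make" with no arguments makes the default target, which is the first one in the Makefile. By convention that target is usually named "all", so "make" and "make all" often do the same thing.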

Typing make or make all will compile the source code in-place - so one additional step is necessary to install and use the program.
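
A toy Makefile shows how targets work. The sketch below writes a three-target Makefile (all, hello, clean; the names are made up for this example) into a temporary directory and runs make on it, if make is installed:

```shell
work=$(mktemp -d)

# Recipe lines in a Makefile must start with a tab, written here as \t.
printf 'all: hello\n\nhello:\n\techo "hello from make" > hello\n\nclean:\n\trm -f hello\n' > "$work/Makefile"

if command -v make >/dev/null 2>&1; then
    # No target given, so make builds the first target, "all", which needs "hello".
    (cd "$work" && make)
    # -n prints what a target would do without running it.
    (cd "$work" && make -n clean)
    status="built: $(cat "$work/hello")"
else
    status="make is not installed"
fi
echo "$status"

rm -rf "$work"
```

Everything make produced here ended up next to the Makefile, which is what "in-place" means.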

Installing Your Program

Typing make will only compile each source code file in-place. That is, if you specified that you want the program to install to /path/to/installation, and you are running configure and make in /path/to/source/code, then once make finishes, the finished product will reside in /path/to/source/code, and not in /path/to/installation. To install it, you must issue a second command:

$ make install

This will create a copy of the built executable in /path/to/installation, which you probably specified using the --prefix=/path/to/installation option while configuring. By default, programs are (usually) installed to /usr/local/.
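
If you install under a prefix in your home directory, the shell will not find the new program until that prefix's bin folder is on your PATH. A minimal sketch, assuming a prefix of $HOME/local (an example location, not something the program requires):

```shell
prefix="$HOME/local"    # example prefix, the same value you gave to --prefix

# Put the new bin folder at the front of the search path for this shell session.
PATH="$prefix/bin:$PATH"
export PATH

first_entry=${PATH%%:*}
echo "The shell now searches $first_entry first"
```

This only lasts for the current session; to make it permanent you would add the same lines to your shell's startup file.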

Advanced Topics

Build Systems

See also 2010 Scientific Computing Summer Workshop

Make

CMake

Autotools

Compiler Specifications

Specifying Which Compiler

You can specify compilers when running configure by doing something like this:

#!/bin/sh

./configure \
 --option=something \
 \
 CC="/usr/bin/gcc" \
 CXX="/usr/bin/g++" \
 F77="/usr/bin/gfortran"

This specifies that the C compiler (specified with the CC variable) should be /usr/bin/gcc, the C++ compiler (specified with the CXX variable) should be /usr/bin/g++, and the Fortran compiler (specified with the F77 variable) should be /usr/bin/gfortran.
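
Before hard-coding compiler paths, it helps to check where the compilers actually live on your machine. A small sketch that looks up the three compilers named above; they may sit somewhere other than /usr/bin, or be missing entirely:

```shell
found=0
for tool in gcc g++ gfortran; do
    # command -v prints the full path of a program if it is on PATH.
    if path=$(command -v "$tool"); then
        echo "$tool: $path"
        found=$((found + 1))
    else
        echo "$tool: not found on PATH"
    fi
done
echo "$found of 3 compilers found"
```

The paths printed here are the ones to use for CC, CXX, and F77.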

Specifying Compiler Flags

If you want to specify flags for the compilers, you can use the variables CFLAGS for C compiler flags, CXXFLAGS for C++ compiler flags, and FFLAGS for Fortran compiler flags.

For example, if I wanted to specify that the compiler should use verbose warnings when I am compiling, I would use the -Wall flag for the compilers. To do this when configuring:

#!/bin/sh

./configure \
 --option=something \
 \
 CC="/usr/bin/gcc" \
 CXX="/usr/bin/g++" \
 F77="/usr/bin/gfortran" \
 \
 CFLAGS="-Wall" \
 CXXFLAGS="-Wall"

If I wanted to specify that the compiler should build a 64-bit binary, I would specify this using the -arch flag (this flag belongs to Apple's compilers on Mac OS X; GCC on Linux uses -m64 for the same purpose):

#!/bin/sh

./configure \
 --option=something \
 \
 CC="/usr/bin/gcc" \
 CXX="/usr/bin/g++" \
 F77="/usr/bin/gfortran" \
 \
 CFLAGS="-Wall -arch x86_64" \
 CXXFLAGS="-Wall -arch x86_64"

Conflicts

There are potential conflicts that can come about using the compiler and flags variables. For example, if I want the compiler to build binaries that work for multiple architectures (say, 32-bit and 64-bit), I can specify this with two -arch flags:

CFLAGS="-arch x86_64 -arch i386"

but often the configure script will strip the second -arch flag, and this will cause errors like "File i386 does not exist".

To get around this, I can add flags to the compiler command. The configure script will not strip flags from the compiler command; it leaves the command alone. So I could say:

CC="/usr/bin/gcc -arch x86_64 -arch i386"

and this would work fine.

However, this will create problems when it comes time to run the preprocessor. The preprocessor is the part that deals with all of the

#include <Header1.h>
#include <Header2.h>

#ifdef SOME_VARIABLE
#include <Variable.h>
#endif

stuff. Normally, the C and C++ preprocessors are the C and C++ commands with the -E flag added on. But if you try and run a preprocessor command like

/usr/bin/gcc -arch x86_64 -arch i386 -E

then you're gonna have problems, because the compiler cannot produce preprocessed output for more than one architecture at a time. You can get around this by specifying the preprocessor command, as shown in the next section.
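
To see what the preprocessor does on its own, you can run a compiler with -E on a tiny file. This sketch writes a made-up demo.c to a temporary directory and assumes a C compiler is available under the name cc; if not, it just says so:

```shell
work=$(mktemp -d)

# A tiny C file whose contents depend on whether SOME_VARIABLE is defined.
cat > "$work/demo.c" <<'EOF'
#ifdef SOME_VARIABLE
int enabled = 1;
#else
int enabled = 0;
#endif
EOF

if command -v cc >/dev/null 2>&1; then
    echo "Without -DSOME_VARIABLE:"
    cc -E "$work/demo.c" | grep enabled
    echo "With -DSOME_VARIABLE:"
    cc -E -DSOME_VARIABLE "$work/demo.c" | grep enabled
else
    echo "No C compiler named cc was found"
fi

rm -rf "$work"
```

With -E the compiler stops after preprocessing and prints the result, so each run should show only the branch of the #ifdef that was kept.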

Specifying the Preprocessor

You can specify the C or C++ preprocessor using the CPP and CXXCPP variables, like this:

#!/bin/sh

./configure \
 --option=something \
 \
 CC="/usr/bin/gcc -arch x86_64 -arch i386" \
 CXX="/usr/bin/g++ -arch x86_64 -arch i386" \
 F77="/usr/bin/gfortran" \
 \
 CFLAGS="-Wall" \
 CXXFLAGS="-Wall" \
 \
 CPP="/usr/bin/gcc -E" \
 CXXCPP="/usr/bin/g++ -E"

Example

For an example of all of these pieces coming together when configuring and compiling a piece of software, see the FFTW page. The FFTW libraries required compilation of both 32-bit and 64-bit versions, and configure stripped off the second -arch flag, so I had to use all of the above techniques to get it to compile correctly.