Linux security from the attacker's perspective

In these notes I am going to cover some aspects of security on Linux systems. For this discussion we will find it useful to adopt the perspective of a potential attacker. The attacker's primary goal is

To execute arbitrary code with elevated privileges

These notes will cover some methods that attackers may use to achieve this goal, and some counter-measures that can be employed to thwart these attack methods.

The danger: buffer overflow

Here is a chunk of code taken from the source code for the CGI web server we saw earlier in the course. In these first few lines of code we are reading the first line of the request from the client and starting to do some preliminary processing.

void serveRequest(int fd) {
  char lineBuffer[256];

  // Read the first line of the request
  readLine(fd,lineBuffer,255);
  // Grab the method and URL
  char method[16];
  char url[128];
  // Field widths keep sscanf from writing past the end of method and url
  sscanf(lineBuffer,"%15s %127s",method,url);

This code relies on the readLine() function to read the first line of the user's request into a buffer. Here is the code for that function:

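void readLine(int fd,char* buffer,int maxBytes) {
  // Reads at most maxBytes characters; buffer must hold maxBytes+1 bytes
  char* ptr = buffer;
  int bytesRead = 0;
  while(bytesRead < maxBytes) {
    if(read(fd,ptr,1) != 1)
      break; // End of input or read error
    bytesRead++;
    if(*ptr++ == '\n')
      break;
  }
  *ptr = '\0'; // ptr now points just past the last character stored
}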

In this code I was careful to make sure that when we go to read the first line of the user's input we don't accidentally read more data than our lineBuffer can handle. A less careful programmer might have written code like this:

void readLine(int fd,char* buffer) {
  char* ptr = buffer;
  do {
    read(fd,ptr,1);
    ptr++;
  } while(*(ptr-1) != '\n');
  *ptr = '\0';
}

void serveRequest(int fd) {
  char lineBuffer[256];

  // Read the first line of the request
  readLine(fd,lineBuffer);
  // Grab the method and URL
  char method[16];
  char url[128];
  sscanf(lineBuffer,"%s %s",method,url);

The readLine() function in this version reads one character at a time from the socket until it encounters the '\n' character marking the end of the first line of input. Note that no attempt has been made here to determine whether or not the read will overflow the capacity of the buffer: this is what leads to an exploitable error.

Here is another example that shows how easy it is to introduce a buffer overflow flaw. In this version of the code we use fdopen() to convert the file descriptor to a FILE pointer, which enables us to use the unsafe fscanf() function:

void serveRequest(int fd) {
  char method[16];
  char url[128];
  // Grab the method and URL
  FILE* fp = fdopen(fd,"r");
  fscanf(fp,"%s %s",method,url);

Used this way, fscanf() is unsafe: a bare %s conversion does not check that the string it copies fits in the buffer you provide, so a long enough token from the client will be written past the end of method or url. This makes this code highly vulnerable to a buffer overflow attack.
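One way to keep the convenience of the scanf family without the unbounded copy is to give every %s conversion a field width one less than the size of its destination array. The sketch below is a minimal illustration and not the course server's actual code; the function name parseRequestLine() is an assumption introduced here.

```c
#include <stdio.h>

// Hypothetical helper: splits a request line into method and URL.
// method must have room for 16 bytes and url for 128 bytes.
// Returns 1 if both fields were found, 0 otherwise.
int parseRequestLine(const char* line, char* method, char* url) {
  // %15s and %127s stop copying after 15 and 127 characters,
  // leaving room for the terminating '\0' in each array.
  return sscanf(line, "%15s %127s", method, url) == 2;
}
```

Note that an overlong token is silently cut at the field width rather than rejected, so a real server would probably want to treat such a request as malformed.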

The exploit

The first class of exploit we are going to study goes by the name of "stack smashing", and is based on forcing a local buffer to overflow and subsequently affect the structure of the stack.

To understand how an overflowed buffer can lead to an exploitable error, we have to understand the structure of the stack in a Linux application. Here is a diagram illustrating the structure of the stack and the stack frame associated with the serveRequest() function shown above.

The key features of the stack that matter to us here are the placement of the return address right above the top of the stack frame, and the presence of the local variables at the top of the stack frame. In the simple layout pictured here, the local variables are laid out in sequence from the top of the stack frame on down. Since the lineBuffer array appears first in the list of local variables, that array will be placed at the top of the stack frame.

Since the stack grows downward on x86 Linux systems, when the lineBuffer array overflows it will overflow upward past the top of the stack frame, overwriting first the saved frame pointer, then the return address, and finally the parameter fd and beyond. This matches the 32-bit x86 calling convention, where parameters are passed on the stack. Our attacker is particularly interested in overwriting the return address, and possibly also the value of fd, so they will arrange to send our flawed server a first line that is specially constructed to do just that. The attacker can then arrange to set the first few characters of the first line sent to "POST ", so that the logic in the serveRequest() function will return right after reading the first line and overflowing lineBuffer.
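The claim that the stack grows downward can be checked with a small experiment. The sketch below is an illustration only: the function names are made up for this example, the noinline attribute is a GCC/Clang extension, and comparing the addresses of unrelated objects is only meaningful as an informal experiment. On typical x86 Linux builds stackGrowsDown() is expected to return 1, though instrumented or unusual builds may behave differently.

```c
#include <stdint.h>

// Compares the address of a local in the callee with one in the caller.
// noinline keeps the two locals in separate stack frames.
__attribute__((noinline))
static int calleeIsBelow(volatile char* callerLocal) {
  volatile char calleeLocal = 0;
  return (uintptr_t)&calleeLocal < (uintptr_t)callerLocal;
}

// Returns 1 if a nested call's local sits at a lower address than
// a local in its caller, 0 otherwise.
int stackGrowsDown(void) {
  volatile char callerLocal = 0;
  return calleeIsBelow(&callerLocal);
}
```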

Once the serveRequest() function returns, execution of the program will jump to the location specified in the return address. Since the attacker has taken care to overwrite the return address, execution will now proceed to a point of the attacker's choice.

The next trick the attacker deploys is to route execution back into the memory space occupied by the method and url arrays, which now contains data the attacker has sent to the server. Embedded in the data the attacker placed in these arrays is machine code the attacker wants to force the server application to run. The attacker does not have much space in these arrays in which to place malicious code, but the available space will be adequate for what the attacker plans to do next.

To prepare their malicious code, the attacker has written a small function and compiled it:

void doEvil(int fd)
{
  char *empty[] = { NULL }; /* Empty argument and environment list */
  if (fork() == 0) {
    /* Redirect stdout and stdin to my socket */
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    execve("/bin/sh", empty, empty); /* Start a shell */
  }
}

The attacker extracts the compiled code from their little program, makes some simple modifications to the assembly code to correct for the actual location of fd on the stack, places the executable code in the string to be inserted in the arrays, and arranges for the return address to point to the start of the malicious code in those arrays. Once this code executes, the attacker will have shell access to the compromised machine and can launch further exploits. Further, if the programmer who wrote the flawed server was foolish enough to arrange for the server to run with root privileges (not an uncommon thing!), the process launched by the attacker's code will inherit that level of privilege, and the attacker's shell will run with full root access.

Counter-measures deployed to prevent this exploit

The first and most obvious counter-measure that we can deploy to prevent this exploit is to take care that we never allow unbounded reads in server applications. The first version of the code that I showed above is carefully constructed to not allow unbounded reads into a buffer.

Not all programmers are this careful, so attackers may instead seek out other server applications that are vulnerable to buffer overflow attacks.

To counter this vulnerability, Linux deploys some further counter-measures. One of these is address space layout randomization (ASLR), which randomizes the location of the stack (and of other regions, such as the heap and shared libraries) each time an application launches. This makes it more difficult for an attacker to know precisely where the buffer they plan to overflow will be in memory, which in turn makes it more challenging for them to craft an appropriate return address for use in this exploit.
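On Linux the system-wide ASLR setting is exposed through the file /proc/sys/kernel/randomize_va_space. The sketch below reads it; the function name aslrSetting() is an assumption made for this example, and the code simply reports -1 on systems where the file is missing or unreadable.

```c
#include <stdio.h>

// Returns the kernel's ASLR setting, or -1 if it could not be read.
// 0 = randomization disabled
// 1 = stack, shared libraries, and mmap regions randomized
// 2 = as 1, plus the heap
int aslrSetting(void) {
  FILE* fp = fopen("/proc/sys/kernel/randomize_va_space", "r");
  if (fp == NULL)
    return -1;
  int value = -1;
  if (fscanf(fp, "%d", &value) != 1)
    value = -1;
  fclose(fp);
  return value;
}
```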

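A further counter-measure is to forbid the execution of code found on the stack. By marking the memory pages that contain the stack "no execute", the system makes it impossible for attackers to directly execute malicious code they have placed in a buffer on the stack. Historically this was a system-wide option set by administrators; on current Linux systems with hardware support for non-executable pages (the NX bit), the stack is typically non-executable by default.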
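A process can inspect its own memory map to see whether its stack is executable. The sketch below looks for the [stack] entry in /proc/self/maps and checks its permission field. Both function names are assumptions made for this example, and the code reports -1 if the map cannot be read or parsed.

```c
#include <stdio.h>
#include <string.h>

// Given one line from /proc/<pid>/maps, returns 1 if the mapping is
// executable, 0 if not, and -1 if the line cannot be parsed.
int mappingIsExecutable(const char* line) {
  char perms[5];
  // Line format: start-end perms offset dev inode [path]
  if (sscanf(line, "%*s %4s", perms) != 1 || strlen(perms) < 3)
    return -1;
  return perms[2] == 'x'; // perms looks like "rw-p" or "r-xp"
}

// Scans /proc/self/maps for the [stack] mapping of the current process.
// Returns 1 if it is executable, 0 if not, -1 if it could not be determined.
int stackIsExecutable(void) {
  FILE* fp = fopen("/proc/self/maps", "r");
  if (fp == NULL)
    return -1;
  char line[512];
  int result = -1;
  while (fgets(line, sizeof line, fp) != NULL) {
    if (strstr(line, "[stack]") != NULL) {
      result = mappingIsExecutable(line);
      break;
    }
  }
  fclose(fp);
  return result;
}
```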

Counter-counter-measures employed by attackers

Computer security is an ever-escalating arms race between attackers and system administrators. As attack techniques become widespread, system developers start to employ counter-measures. As those counter-measures become widely employed, attackers are forced to develop new, more sophisticated attack strategies.

For example, as systems began to employ counter-measures that prevented the execution of arbitrary code from the stack, attackers switched to alternative methods for exploiting buffer overflows. One such set of alternative techniques is called "return to libc" methods, which overwrite the return address with an address in the libc library. This technique can be effective because, if the attacker can guess the details of the system they are attacking and that system does not randomize library addresses, they can count on the routines in libc being located in fixed, predictable locations. By routing program execution to known locations in libc and combining this ability with other manipulations of the stack (including placing bogus parameters on the stack), attackers have been able to carry out successful exploits. More recently, attackers have even figured out how to employ return to libc strategies to jump to carefully selected snippets of libc functions in sequence and consequently implement programs of arbitrary complexity by using a technique known as "return-oriented programming."

As these techniques emerged, systems programmers began to employ even more sophisticated counter-measures in turn. Some of these newer techniques include compiler options that place 'secret' values (often called canaries) on the stack and then abort the program if those values have been overwritten by an attacker trying to exploit a buffer overflow. For example, modern versions of gcc include a feature known as "stack smashing protection" (SSP), which can be enabled via the -fstack-protector compiler flag.
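The idea behind these 'secret' values can be illustrated with a small simulation. The sketch below is not what gcc actually emits: it models a stack frame as a struct, places a guard value right after the buffer, and checks the guard after a write. The names GuardedFrame, CANARY_VALUE, and writeAndCheck() are assumptions made for this example, and a real canary would be a random value chosen when the program starts rather than a fixed constant.

```c
#include <stddef.h>
#include <string.h>

#define CANARY_VALUE 0x5AFEC0DEUL // Arbitrary marker chosen for this sketch

// Simulated stack frame: a buffer with a guard value placed right after it.
struct GuardedFrame {
  char buffer[16];
  unsigned long canary;
};

// Copies len bytes of input to the start of the frame, the way an
// unbounded read would, then reports whether the guard survived.
// Returns 1 if the canary is intact, 0 if it was overwritten.
int writeAndCheck(const char* input, size_t len) {
  struct GuardedFrame frame;
  frame.canary = CANARY_VALUE;
  if (len > sizeof(frame))
    len = sizeof(frame);             // Keep the simulation inside the struct
  memcpy((char*)&frame, input, len); // May run past buffer into canary
  return frame.canary == CANARY_VALUE;
}
```

A write of 16 bytes or fewer stays inside buffer and leaves the guard intact, while a longer write clobbers it and is detected before the simulated function would return.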

Privileged programs

If an attacker is able to hijack a running process, they then face an additional impediment. On Unix systems the operating system will use a system of privileges to determine what a process is allowed to do. For example, if a process attempts to open a file but does not have the appropriate privileges to open that file, the system will deny the process access.

Generally speaking, the privilege level of a process is determined by the privileges assigned to the user who launched that process. Every process has associated with it a userid (and a groupid), which associates that process with the user who launched the process (and the group the user belongs to). The operating system will use that combination of userid and groupid to determine what actions the process is allowed to carry out.

One way to grant a process elevated privileges is to launch the process as root. For example, this can be done by invoking the program from the command line via the sudo command:

sudo ./miniweb

Another technique used to grant a program elevated privileges is the "setuid" technique. In this technique we first use the chown command to set the owner of our program to root:

sudo chown root miniweb

We then use the chmod command to set the setuid bit on the executable:

sudo chmod u+s miniweb

Once this bit is set, the program will take on the user id of its owner when launched, rather than taking on the user id of the user who actually launched it.
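Because setuid-root programs are attractive targets, it can be useful to check whether a given file is one. The sketch below does this with stat(); the function names isSetuidRoot() and pathIsSetuidRoot() are assumptions made for this example.

```c
#include <sys/stat.h>
#include <sys/types.h>

// Returns 1 if the stat data describes a file that is owned by root
// and has the setuid bit set, 0 otherwise.
int isSetuidRoot(const struct stat* st) {
  return st->st_uid == 0 && (st->st_mode & S_ISUID) != 0;
}

// Returns 1 or 0 as above for the file at path, or -1 if stat() fails.
int pathIsSetuidRoot(const char* path) {
  struct stat st;
  if (stat(path, &st) != 0)
    return -1;
  return isSetuidRoot(&st);
}
```

After the chown and chmod commands shown above, pathIsSetuidRoot("miniweb") would be expected to return 1.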

Lessening the danger in privileged processes

Although it may be useful for a process to execute with root privileges, doing this can create security problems. If an attacker manages to hijack a process running with root privileges, that attacker can become root and quickly do a great deal of damage.

One way to potentially mitigate this problem is to allow programs to drop down to reduced privileges and only escalate back to root privileges when needed. To make this possible, Unix tracks both an effective user id and a real user id for each process. For a process launched with the setuid bit set and an owner of 0, the effective user id of the process will be 0, which is the user id for root, while the real user id will be the user id of the user who actually launched the application.

A program can lower its effective privileges by setting its effective user id back equal to the real user id.

uid_t orig_euid = geteuid();
seteuid(getuid()); // Sets effective uid = real uid

The program can subsequently regain privileges by doing

seteuid(orig_euid);

You may wonder how a program gets permission to reset its effective user id back to a user id with higher privileges. To manage this process, the system stores a third user id, the saved user id. When calling seteuid to restore privileges, the system will compare the requested effective user id with the saved user id. If they match, the system will grant permission to restore the more powerful effective user id.
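The two seteuid() calls above can fail, and a program that ignores the failure may keep running with privileges it believes it has dropped. The sketch below wraps the drop-and-restore pattern with return value checks. The names runUnprivileged() and runningAsRealUser() are assumptions made for this example.

```c
#include <sys/types.h>
#include <unistd.h>

// Example action: reports whether it is running without elevated privileges.
int runningAsRealUser(void) {
  return geteuid() == getuid();
}

// Runs action() with the effective uid lowered to the real uid, then
// restores the original effective uid. Returns action()'s result, or -1
// if privileges could not be dropped or restored.
int runUnprivileged(int (*action)(void)) {
  uid_t orig_euid = geteuid();
  if (seteuid(getuid()) != 0)
    return -1;               // Could not drop: do not run the action
  int result = action();
  if (seteuid(orig_euid) != 0)
    return -1;               // Could not regain the original euid
  return result;
}
```

In a setuid-root program, runUnprivileged(runningAsRealUser) would be expected to return 1, and the effective user id is back to 0 once the call returns.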

To permanently drop elevated privileges, a program can do

setreuid(getuid(),getuid());

which sets the real, effective, and saved user ids all to the real user id.
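A cautious program also checks that the drop actually worked. The sketch below is one possible way to do this, under some assumptions: the function name dropPrivilegesPermanently() is made up for this example, it also drops group privileges with setregid() (which the discussion above does not cover), and it leaves supplementary groups untouched.

```c
#include <sys/types.h>
#include <unistd.h>

// Permanently gives up setuid privileges.
// Returns 0 on success, -1 on failure.
int dropPrivilegesPermanently(void) {
  uid_t real_uid = getuid();
  uid_t old_euid = geteuid();
  gid_t real_gid = getgid();

  // Drop group privileges first, while we still have the right to do so.
  if (setregid(real_gid, real_gid) != 0)
    return -1;
  if (setreuid(real_uid, real_uid) != 0)
    return -1;

  // Verify: regaining the old effective uid must now be impossible.
  if (old_euid != real_uid && seteuid(old_euid) == 0)
    return -1;
  return (geteuid() == real_uid) ? 0 : -1;
}
```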

Another strategy that a program can use to limit the damage an attacker can do is to intentionally limit the part of the file system a program can touch. The mechanism for doing this is the chroot() system call.

#include <unistd.h>

int chroot(const char *path);

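This system call changes the effective root of the file system to the location described in the path parameter. For example, if the path provided is /var/myapp and your application subsequently tries to open /etc/myapp.conf, the path actually used by open() will be /var/myapp/etc/myapp.conf. This makes it much harder for an attacker who hijacks the process to reach files outside of the /var/myapp directory. Note that chroot() can only be called by a privileged process, that it does not change the current working directory, and that a process that keeps root privileges inside the jail may be able to escape it. Programs therefore usually change directory into the jail and drop root privileges right after the call. This technique is commonly referred to as setting up a chroot jail.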
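A minimal sketch of entering a chroot jail might look like the following. The function name enterJail() is an assumption made for this example, and the code assumes the caller is running with root privileges, since chroot() fails otherwise.

```c
#include <unistd.h>

// Confines the calling process to dir.
// Returns 0 on success, -1 on failure.
int enterJail(const char* dir) {
  if (chroot(dir) != 0)
    return -1;
  // chroot() does not change the working directory, so move inside the jail.
  if (chdir("/") != 0)
    return -1;
  return 0;
}
```

A server would typically call something like enterJail("/var/myapp") early in its startup and then permanently drop its root privileges, so that a hijacked process has neither access to the wider file system nor the privileges needed to break out.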