Fun With Pointers

Pointers are one of the most misunderstood concepts in C, but also one of the most powerful tools it provides. In general they are just numbers in memory like everything else, but their value is interpreted as the address of other data. In this document I’ll attempt to demystify the arcane pointer and show some practical examples of its power.



Everything Lives In Memory

Before I get into pointers, I want to clarify some aspects of how a program runs. This will all be based on my local Linux environment so your mileage may vary, but conceptually it’s all the same.

The C compiler and linker you use take your code, parse it, and convert it to a binary file that works for the targeted processor and operating system. The compiler validates your C code and generates machine code that your processor can work with; the linker does some organizational tasks to allow your program to use libraries and be loaded by the operating system.

Most modern UNIX-like systems, including Linux, the BSDs and Solaris, use the Executable and Linkable Format (ELF) for compiled programs and libraries (macOS is a notable exception, it uses Mach-O). An ELF file has a header that describes what type of binary the file is, along with a lot of other data that tells the operating system where in virtual memory the program’s assets should be loaded.

Here’s a simple program that we’ll inspect a little bit. It spits out the addresses of a few things, the locations in the running virtual memory space of a program. The & operator before a variable will give you the address of that variable.

#include <stdio.h>

int my_global = 123;

int main(void)
{
  int my_local = 456;

  // %p expects a void pointer, so each address is cast before printing
  printf("main has an address, it is %p\n", (void *) main);
  printf("my_global has an address, it is %p\n", (void *) &my_global);
  printf("my_local has an address, it is %p\n", (void *) &my_local);
  printf("printf has an address, it is %p\n", (void *) printf);
  return 0;
}

Tons of crap ends up in the binary for a C program, like all the stuff that provides you the goodness of the standard C library and the stuff the C library itself needs. One of the more interesting things we can look at is the symbol table. Symbols give us names for finding important things in the program. I’ll use readelf to dump the symbol table from the binary and filter it down to just the global symbols.

From this list there are a few things that were defined in my source file. Entries 4 and 53 both refer to the printf call I’m using from glibc, which for this program is located at the memory address 0x400470; entry 59 is the my_global variable from my program at the address 0x601040; and entry 64 is my main function at 0x400596. The only thing missing is the my_local variable that’s inside of my main function.

If I run the program a few times I’ll see where the program says these things are.

As seen here, main, my_global and printf all have consistent locations in memory that match what was listed in the symbol table, while my_local changes between runs of the program. This is because the local variable lives in the call stack, and Linux uses a security feature, address space layout randomization (ASLR), that randomizes where the stack begins. As functions are called in your C program the stack is used for function arguments, function local variables and a pointer to the code that should be run after the function is complete. Each function call gets a new section of the stack, referred to as a stack frame.

Just like a string and everything else on the computer, the executable code resides somewhere in memory. I can use objdump to get the machine code version of main, and it’ll show us each instruction that main is composed of and the addresses of each processor operation.

0000000000400596 <main>:
  400596:       55                      push   rbp
  400597:       48 89 e5                mov    rbp,rsp
  40059a:       48 83 ec 10             sub    rsp,0x10
  40059e:       64 48 8b 04 25 28 00    mov    rax,QWORD PTR fs:0x28
  4005a5:       00 00 
  4005a7:       48 89 45 f8             mov    QWORD PTR [rbp-0x8],rax
  4005ab:       31 c0                   xor    eax,eax
  4005ad:       c7 45 f4 c8 01 00 00    mov    DWORD PTR [rbp-0xc],0x1c8
  4005b4:       be 96 05 40 00          mov    esi,0x400596
  4005b9:       bf b8 06 40 00          mov    edi,0x4006b8
  4005be:       b8 00 00 00 00          mov    eax,0x0
  4005c3:       e8 a8 fe ff ff          call   400470 <printf@plt>
  4005c8:       be 40 10 60 00          mov    esi,0x601040
  4005cd:       bf d8 06 40 00          mov    edi,0x4006d8
  4005d2:       b8 00 00 00 00          mov    eax,0x0
  4005d7:       e8 94 fe ff ff          call   400470 <printf@plt>
  4005dc:       48 8d 45 f4             lea    rax,[rbp-0xc]
  4005e0:       48 89 c6                mov    rsi,rax
  4005e3:       bf 00 07 40 00          mov    edi,0x400700
  4005e8:       b8 00 00 00 00          mov    eax,0x0
  4005ed:       e8 7e fe ff ff          call   400470 <printf@plt>
  4005f2:       be 70 04 40 00          mov    esi,0x400470
  4005f7:       bf 28 07 40 00          mov    edi,0x400728
  4005fc:       b8 00 00 00 00          mov    eax,0x0
  400601:       e8 6a fe ff ff          call   400470 <printf@plt>
  400606:       b8 00 00 00 00          mov    eax,0x0
  40060b:       48 8b 55 f8             mov    rdx,QWORD PTR [rbp-0x8]
  40060f:       64 48 33 14 25 28 00    xor    rdx,QWORD PTR fs:0x28
  400616:       00 00 
  400618:       74 05                   je     40061f <main+0x89>
  40061a:       e8 41 fe ff ff          call   400460 <__stack_chk_fail@plt>
  40061f:       c9                      leave  
  400620:       c3                      ret    

Don’t be too daunted by the assembly here, we’re not here to talk about that. Just grasp that when this program is running, main will be residing here in the memory space of the program.

To memory, there is no difference between program code and bytes that may be strings, integers or pictures of cats. To prove this, I’ll tell the compiler that I want to use main as a string, and I’ll look at the first character of it.

#include <stdio.h>

int main(void)
{
  char *string = (char*) main;

  printf("character: %c\n", string[0]);
  printf("hex: %x\n", string[0]);

  return 0;

}

When I build and run this program, this is what I get:

$ ./test 
character: U
hex: 55

Just as the objdump output from my previous program showed, main starts with the instruction push rbp, which is encoded as the single byte 0x55. Look at that byte as an ASCII character and you get a U.

Why You Confuse Me So?!

The reason I’m pushing this so hard is to really drive home the fact that everything lives in memory and that the only difference between data types is how you treat the data in memory. You can interpret any given chunk of memory as any type; in the case of pointers, we treat a chunk of memory as an address that points to another chunk of memory.

Let’s look at some less esoteric examples of how this can be useful.



The Great Unknown

For someone writing a C application, it would be fantastic if you knew how much memory a program would use before running it. Sadly this is almost never the case, so we need ways to get more memory when we need it. The call stack can suffice to create space for the variables we use inside of a function, but when the function ends we lose scope of those variables. To dynamically allocate space for keeps we use the heap, most commonly via malloc() (memory allocate).

The malloc() function is provided by the C standard library. It hands your program a chunk of memory to work on, asking the operating system for more when it needs to. This is useful for making space as needed to store data in your program.

Here’s an example. This program takes a number of bytes to allocate, requests a chunk of memory of that size, prints out its address, and then releases that chunk of memory back to the system (to avoid memory leaks!).

#include <stdio.h>
#include <stdlib.h>
 
int main(int argc, char *argv[])
{
	void *dynamic_space;
	int size;
 
	if(argc != 2)
	{
		fprintf(stderr, "Usage: %s <number of bytes to allocate>\n", argv[0]);
		return 1;
	}
 
	size = atoi(argv[1]);
	if(size < 1)
	{
		fprintf(stderr, "Number of bytes must be larger than 0\n");
		return 1;
	}
 
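	dynamic_space = malloc(size);
	if(dynamic_space == NULL)
	{
		// malloc() returns NULL when it cannot provide the memory
		fprintf(stderr, "Failed to allocate %d bytes\n", size);
		return 1;
	}
 
	printf("We allocated %d bytes, they are located at %p\n", size, dynamic_space);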
 
	free(dynamic_space);
 
	return 0;
}

I’ll give this program a few runs asking for different amounts of memory.

As you can see, the memory gets allocated in a variety of places even if you ask for the same amount of memory on another run of the program. On my machine, when I asked for a huge allocation it provided me a much higher memory address to use. This is most likely because glibc’s malloc() serves large requests (128 KiB and up by default) with mmap() rather than carving them out of the heap, so they land in a different region of the address space.

It’s easy enough to ask for space dynamically; the new problem this introduces is how to organize it all.

Pointer Uses

There is no one perfect way to organize your pointers, but there are a ton of tried and true methods that are useful for a variety of cases.

Null terminated arrays

One of the most common structures is a null terminated array; in the case of the char type we normally just call this a string. One of the handy things about these types of structures is that they lend themselves to pointer arithmetic. A null terminated array is a contiguous block of memory that contains no information about its size, but you can find the end of it by looking for the zero value at the end of it (written '\0' for characters).

In C the ++ and -- operators increment and decrement values, but in the case of pointers they step by the size of the data type the pointer points to. For character pointers that’s identical to incrementing a number, since characters use a single byte, but for a 4 byte integer it’s more like += 4 on the address. Using this you can pretty easily loop through a null terminated list until the value the pointer points to is zero.

Here’s an example of that:

#include <stdio.h>
 
char my_string[] = "well hello there";
int many_numbers[] = {1, 2, 3, 0};
 
int main(int argc, char *argv[])
{
  char *a_character; 
  int *an_integer;
 
  // Start where my_string points, printing each character until a_character
  // points to a zero value
  for(a_character = my_string; *a_character; a_character++)
  {
    printf("at %p we have '%c'\n", (void *) a_character, *a_character);
  }
  printf("\n");
 
  for(an_integer = many_numbers; *an_integer; an_integer++)
  {
    printf("at %p we have %d\n", (void *) an_integer, *an_integer);
  }
 
  return 0;
}

And here’s that program in action

As you can see the address is incremented by 1 for character pointers and 4 for integer pointers. We can keep reading as long as the dereferenced value (the value in the address the pointer points to) is not zero. When it is zero we’re at the end of the array.

To drive this home further, here’s a program that’ll determine the length of a string given as the first argument.

#include <stdio.h>
 
int main(int argc, char *argv[])
{
	int size = 0;
	char *ptr;
 
	if(argc != 2)
	{
		fprintf(stderr, "Usage: %s <string>\n", argv[0]);
		return 1;
	}
 
	ptr = argv[1];
	while(*ptr)
	{
		size++;
		ptr++;
	}
 
	printf("first argument is %d bytes long\n", size);
 
	return 0;
}

And let’s see how that fares

Passing By Reference

In many languages, we hear about how things are passed by reference, as opposed to things that are passed by value. In many interpreted languages you have little to no control over how that works, but C lets (makes?) you control that completely.

Let’s say I want to make a function that randomizes two numbers for me. In C you can’t return more than one value from a function, so setting two variables with one function call isn’t possible without pointers. I could return a pointer to an array of two random integers, then set my variables to those values, but that means my function doesn’t do everything I want it to on its own.

The easiest way to do this is to pass the addresses (pointers) to the variables I want randomized, then my function can dereference them as if they were ordinary variables with *.

#include <stdio.h>
#include <stdlib.h>


void randomize_them(int *a, int *b) {
  *a = random();
  *b = random();
}

int main(void) {

  int my_a = 1;
  int my_b = 2;

  printf("before\na = %d\nb = %d\n", my_a, my_b);
  randomize_them(&my_a, &my_b);
  printf("after\na = %d\nb = %d\n", my_a, my_b);

  return 0;
}

Now that wasn’t too bad, was it?

That’ll wrap things up for this post, this covers most of the basics about pointers. I will be covering a bit more pointer-fu in the next post about defining and using structures in C, but I think this provides a good start for now. Please leave any questions or feedback you have in the comments!

epoll() Tutorial – epoll() In 3 Easy Steps!

It wasn’t very long ago that it was a feat of greatness to get a single webserver setup to support 10,000 concurrent connections. There were many factors that made it possible to develop webservers, such as nginx, that could handle more connections with greater efficiency than their predecessors. One of the biggest factors was the advent of constant-time (O(1)) polling mechanisms for monitoring file descriptors, introduced into most operating systems.

In the No Starch Press book, The Linux Programming Interface, section 63.4.5 provides a table of observations that describes the time it takes to check different quantities of file descriptors via some of the most common polling methods.

As this shows, the performance benefits of epoll are decent enough to have an impact on even as few as 10 descriptors. As the number of descriptors increases, using regular poll() or select() becomes a very unattractive option compared to epoll().

This tutorial will run through some of the basics of using epoll() on Linux 2.6.27+.

Prerequisite knowledge

This tutorial assumes you’re familiar and comfortable with Linux, the syntax of C and the usage of file descriptors in UNIX-like systems.



Getting started

Make a new directory to work out of for this tutorial, here’s the Makefile we’re using.

all: epoll_example
 
epoll_example: epoll_example.c
	gcc -Wall -Werror -o $@ epoll_example.c
 
clean:
	@rm -v epoll_example

Throughout this post I’ll be using functionality described by the following headers:

#include <stdio.h>     // for fprintf()
#include <unistd.h>    // for close(), read()
#include <sys/epoll.h> // for epoll_create1(), epoll_ctl(), struct epoll_event
#include <string.h>    // for strncmp

Step 1: Create epoll file descriptor

First I’ll go through the process of just creating and closing an epoll instance.

#include <stdio.h>     // for fprintf()
#include <unistd.h>    // for close()
#include <sys/epoll.h> // for epoll_create1()
 
int main()
{
	int epoll_fd = epoll_create1(0);
 
	if(epoll_fd == -1)
	{
		fprintf(stderr, "Failed to create epoll file descriptor\n");
		return 1;
	}
 
	if(close(epoll_fd))
	{
		fprintf(stderr, "Failed to close epoll file descriptor\n");
		return 1;
	}
	return 0;
}

Running this should work and display no output. If you do get errors, you’re probably either running a very old Linux kernel or your system needs real help.

This first example uses epoll_create1() to create a file descriptor to a new epoll instance given to us by the mighty kernel. While it doesn’t do anything with it quite yet we should still make sure to clean it up before the program terminates. Since it’s like any other Linux file descriptor we can just use close() for this.

Level triggered and edge triggered event notifications

Level-triggered and edge-triggered are terms borrowed from electrical engineering. When we’re using epoll the difference is important. In edge triggered mode we will only receive events when the state of a watched file descriptor changes, whereas in level triggered mode we will continue to receive events until the underlying file descriptor is no longer in a ready state. Level triggered is the default, is generally easier to use, and is what I’ll use for this tutorial, though it’s good to know edge triggered mode is available.

Step 2: Add file descriptors for epoll to watch

The next thing to do is tell epoll what file descriptors to watch and what kinds of events to watch for. In this example I’ll use one of my favorite file descriptors in Linux, good ol’ file descriptor 0 (also known as Standard Input).

#include <stdio.h>     // for fprintf()
#include <unistd.h>    // for close()
#include <sys/epoll.h> // for epoll_create1(), epoll_ctl(), struct epoll_event
 
int main()
{
	struct epoll_event event;
	int epoll_fd = epoll_create1(0);
 
	if(epoll_fd == -1)
	{
		fprintf(stderr, "Failed to create epoll file descriptor\n");
		return 1;
	}
 
	event.events = EPOLLIN;
	event.data.fd = 0;
 
	if(epoll_ctl(epoll_fd, EPOLL_CTL_ADD, 0, &event))
	{
		fprintf(stderr, "Failed to add file descriptor to epoll\n");
		close(epoll_fd);
		return 1;
	}
 
	if(close(epoll_fd))
	{
		fprintf(stderr, "Failed to close epoll file descriptor\n");
		return 1;
	}
	return 0;
}

Here I’ve added an instance of an epoll_event structure and used epoll_ctl() to add the file descriptor 0 to our epoll instance epoll_fd. The event structure we pass in for the last argument lets epoll know we’re looking to watch only input events, EPOLLIN, and lets us provide some user-defined data that will be returned for events.



Step 3: Profit

That’s right! We’re almost there. Now let epoll do its magic.

#define MAX_EVENTS 5
#define READ_SIZE 10
#include <stdio.h>     // for fprintf()
#include <unistd.h>    // for close(), read()
#include <sys/epoll.h> // for epoll_create1(), epoll_ctl(), struct epoll_event
#include <string.h>    // for strncmp
 
int main()
{
	int running = 1, event_count, i;
	ssize_t bytes_read; // read() returns ssize_t, which is -1 on error
	char read_buffer[READ_SIZE + 1];
	struct epoll_event event, events[MAX_EVENTS];
	int epoll_fd = epoll_create1(0);
 
	if(epoll_fd == -1)
	{
		fprintf(stderr, "Failed to create epoll file descriptor\n");
		return 1;
	}
 
	event.events = EPOLLIN;
	event.data.fd = 0;
 
	if(epoll_ctl(epoll_fd, EPOLL_CTL_ADD, 0, &event))
	{
		fprintf(stderr, "Failed to add file descriptor to epoll\n");
		close(epoll_fd);
		return 1;
	}
 
	while(running)
	{
		printf("\nPolling for input...\n");
		event_count = epoll_wait(epoll_fd, events, MAX_EVENTS, 30000);
		printf("%d ready events\n", event_count);
		for(i = 0; i < event_count; i++)
		{
			printf("Reading file descriptor '%d' -- ", events[i].data.fd);
			bytes_read = read(events[i].data.fd, read_buffer, READ_SIZE);
			printf("%zd bytes read.\n", bytes_read);
			if(bytes_read <= 0)
			{
				// 0 is end of input and -1 is an error, stop either way
				running = 0;
				break;
			}
			read_buffer[bytes_read] = '\0';
			printf("Read '%s'\n", read_buffer);
 
			if(!strncmp(read_buffer, "stop\n", 5))
				running = 0;
		}
	}
 
	if(close(epoll_fd))
	{
		fprintf(stderr, "Failed to close epoll file descriptor\n");
		return 1;
	}
	return 0;
}

Finally we’re getting down to business!

I added a few new variables here to support and expose what I’m doing. I also added a while loop that’ll keep reading from the file descriptors being watched until one of them says ‘stop’. I used epoll_wait() to wait for events to occur from the epoll instance; the results are stored in the events array, up to MAX_EVENTS at a time, with a timeout of 30 seconds (the 30000 argument is in milliseconds). The return value of epoll_wait() indicates how many members of the events array were filled with event data. Beyond that it’s just printing out what it got and doing some basic logic to close things out!

Here’s the example in action:

$ ./epoll_example 

Polling for input...

hello!

1 ready events
Reading file descriptor '0' -- 7 bytes read.
Read 'hello!
'

Polling for input...

this is too long for the buffer we made

1 ready events
Reading file descriptor '0' -- 10 bytes read.
Read 'this is to'

Polling for input...
1 ready events
Reading file descriptor '0' -- 10 bytes read.
Read 'o long for'

Polling for input...
1 ready events
Reading file descriptor '0' -- 10 bytes read.
Read ' the buffe'

Polling for input...
1 ready events
Reading file descriptor '0' -- 10 bytes read.
Read 'r we made
'

Polling for input...

stop

1 ready events
Reading file descriptor '0' -- 5 bytes read.
Read 'stop
'

First I gave it a small string that fits in the buffer; it works fine and continues iterating over the loop. The second input was too long for the read buffer, and this is where level triggering helped us out: events continued to populate until it had read all of what was left in the buffer. In edge triggered mode we would have received only 1 notification, and the application as-is would not progress until more was written to the file descriptor being watched.

I hope this helped you get some bearings on how to use epoll(). If you have any problems, questions or feedback I’d appreciate you leaving a comment!

Cross Compiling C Code for ARM

Hello! In this brief post I will share how you can cross compile some simple C code for running on an ARM based device. I have a desire to write a few simple things in C for tinkering with OpenBMC on Barreleye G2 and figured I would share my process.



The Build Box

To build this program I’m going to use a freshly built Ubuntu 16.04.3 VM, that way I know for sure what dependencies are needed. My host system is also running Ubuntu 16.04.3 and I’m using Virt Manager as an interface to libvirt that is serving my VMs via QEMU and KVM.

Starting the install

After a few minutes I am ready to go!

The Codes

The code for this is pretty simple, I’ll just build the ol’ hello world!

#include <stdio.h>

int main() {
  printf("Hello, ARM!\n");
  return 0;
}

I’ll SSH into my VM so that my commands are easier to follow. I start by creating the source file with vim.

Building and Testing

Ubuntu offers many pre-built cross compilers. All you need to do is run sudo apt update to ensure your package listing is up to date, then sudo apt install -y build-essential gcc-arm-linux-gnueabi will get you what you need.

To build, I’ll use the arm-linux-gnueabi-gcc compiler instead of just gcc. I’ll verify the result was indeed built for an ARM machine with readelf, then scp it from the VM to my host system.

Next, to test it, I’ll scp it over to my OpenBMC system, ssh into it and give it a go!

There it is! I’ve successfully cross compiled a simple program to run on an ARM machine. If you have any questions about this please let me know in the comments!