Such Programming

Tinkerings and Ramblings

Command Line Arguments in C

Today I’m going to share some tips and tricks for using command line arguments in C programs. First I’ll explore the plain ol’ argc/argv style, followed by a getopt approach.

Let’s jump right to it!


Classic Approach

Most C programmers will be quite familiar with this approach, so I’ll keep it brief. The most common function signature for main is int main(int argc, char *argv[]). In this setup, argc tells you the number of arguments passed to the program at launch, and argv holds an array of string pointers to those arguments.

With a small test program, we can inspect these variables pretty easily. I’ll even look at where the argv pointer is and where the string pointers within it are pointing to, because why not!

My classic.c source:

#include <stdio.h>

int main (int argc, char *argv[]) {
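  int i;
  printf("Argument Count: %d\n", argc);
  printf("argv is at %p\n", (void *)argv);

  /* i <= argc on purpose: also print the NULL entry that terminates argv */
  for (i = 0; i <= argc; i++) {
    /* %s with a NULL pointer is undefined behavior, so substitute a string */
    printf("%d at %p: %s\n", i, (void *)argv[i], argv[i] ? argv[i] : "(null)");
  }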

  return 0;
}

Testing it:

$ ./classic well hello there
Argument Count: 4
argv is at 0x7ffc923396b8
0 at 0x7ffc92339f4d: ./classic
1 at 0x7ffc92339f57: well
2 at 0x7ffc92339f5c: hello
3 at 0x7ffc92339f62: there
4 at (nil): (null)

The first element is the name of the program as it was executed; the remaining argc - 1 are the command line arguments to that program. They are most often accessed with the index operator [i], as was done here, though the for loop condition is usually i < argc so that it skips the null pointer that terminates the array.

Since this is a null terminated array, you could also use pointer arithmetic to process through the list and not use argc at all.

#include <stdio.h>

int main (int argc, char *argv[]) {

  char **arg = argv;
  while (*arg) {
    printf("%s\n", *arg);
    arg++;
  }

  return 0;
}

$ ./classic do it live
./classic
do
it
live


How you interpret the command line arguments from there is entirely up to you, which is why you'll see a lot of variance in command line applications. As with other programming languages, some patterns are very common and there are libraries to assist in implementing them.

# Getopt Basics

The [getopt()](http://man7.org/linux/man-pages/man3/getopt.3.html) function is part of any Standard C Library implementation that follows a [POSIX](https://en.wikipedia.org/wiki/POSIX) standard. Many libraries will also follow the GNU standard for [getopt_long()](https://www.gnu.org/software/libc/manual/html_node/Getopt-Long-Options.html#Getopt-Long-Options).

The general idea is that options use a format starting with `-` followed by a single letter to indicate something about what the user wants the program to do. As an example, many programs on Linux will have a `-v` option you can use to see more verbose console output, or a `-h` option to get help on using the program.

The `getopt` implementation has a few variables that represent the current internal state of its parsing system.

```c
extern int optind, opterr, optopt;
```

The optind variable is the index of the next argv element that getopt() will process. opterr controls whether getopt() prints error messages to standard error; set it to 0 to silence them. If getopt() returns ? because it did not recognize an option, optopt is set to the character it did not recognize.

#include <stdio.h>
#include <unistd.h>


void print_getopt_state(void) { 
  printf("optind: %d\t" "opterr: %d\t" "optopt: %c (%d)\n" ,
    optind, opterr, optopt, optopt
  );
}


int main (int argc, char *argv[]) {
  print_getopt_state();
  return 0;
}

Testing it:

$ ./getopt
optind: 1 opterr: 1 optopt: ? (63)

To use the getopt() function you provide the argc and argv variables along with an optstring variable that contains the list of options it should look for.

#include <stdio.h>
#include <unistd.h>


void print_getopt_state(void) {
  printf("optind: %d\t" "opterr: %d\t" "optopt: %c (%d)\n" ,
    optind, opterr, optopt, optopt
  );
}


int main (int argc, char *argv[]) {
  int character;
  char *options = "v";

  print_getopt_state();

  character = getopt(argc, argv, options);

  printf("getopt returned: '%c' (%d)\n", character, character);
  print_getopt_state();

  return 0;
}

Testing it:

$ ./getopt
optind: 1 opterr: 1 optopt: ? (63)
getopt returned: '�' (-1)
optind: 1 opterr: 1 optopt: (0)

$ ./getopt -v
optind: 1 opterr: 1 optopt: ? (63)
getopt returned: 'v' (118)
optind: 2 opterr: 1 optopt: (0)

$ ./getopt -h
optind: 1 opterr: 1 optopt: ? (63)
./getopt: invalid option -- 'h'
getopt returned: '?' (63)
optind: 2 opterr: 1 optopt: h (104)

Each call to getopt() checks the next option and returns its character, or ? if the option was not recognized. Once there are no more options to process, it returns -1.

#include <stdio.h>
#include <unistd.h>


void print_getopt_state(void) {
  printf("optind: %d\t" "opterr: %d\t" "optopt: %c (%d)\n" ,
    optind, opterr, optopt, optopt
  );
}


int main (int argc, char *argv[]) {
  int character;
  char *options = "abcd";

  print_getopt_state();

  character = getopt(argc, argv, options);

  while(character != -1) {
    printf("getopt returned: '%c' (%d)\n", character, character);
    print_getopt_state();
  
    character = getopt(argc, argv, options);
  }

  printf("getopt returned: '%c' (%d)\n", character, character);
  print_getopt_state();

  return 0;
}

Testing it:

$ ./getopt -d -a -b
optind: 1 opterr: 1 optopt: ? (63)
getopt returned: 'd' (100)
optind: 2 opterr: 1 optopt: (0)
getopt returned: 'a' (97)
optind: 3 opterr: 1 optopt: (0)
getopt returned: 'b' (98)
optind: 4 opterr: 1 optopt: (0)
getopt returned: '�' (-1)
optind: 4 opterr: 1 optopt: (0)

You can also combine multiple options in a single argument by not separating them. An option that appears more than once is returned once per occurrence.

$ ./getopt -baddd
optind: 1 opterr: 1 optopt: ? (63)
getopt returned: 'b' (98)
optind: 1 opterr: 1 optopt: (0)
getopt returned: 'a' (97)
optind: 1 opterr: 1 optopt: (0)
getopt returned: 'd' (100)
optind: 1 opterr: 1 optopt: (0)
getopt returned: 'd' (100)
optind: 1 opterr: 1 optopt: (0)
getopt returned: 'd' (100)
optind: 2 opterr: 1 optopt: (0)
getopt returned: '�' (-1)
optind: 2 opterr: 1 optopt: (0)

Optional and Positional Arguments

A colon after an option in the optstring indicates that the option requires an argument. Two colons, a GNU extension, indicate that the argument is optional; an optional argument must be attached directly to the option, as in -dvalue. In either case, if an argument is given, getopt() sets the optarg pointer to it. When an optional argument is omitted, optarg is set to NULL.

#include <stdio.h>
#include <unistd.h>


void print_getopt_state(void) {
  printf("optind: %d\t" "opterr: %d\t" "optopt: %c (%d)\t" "optarg: %s\n" ,
    optind, opterr, optopt, optopt, optarg
  );
}


int main (int argc, char *argv[]) {
  int character;
  char *options = "abc:d::";

  print_getopt_state();

  character = getopt(argc, argv, options);

  while(character != -1) {
    printf("getopt returned: '%c' (%d)\n", character, character);
    print_getopt_state();
  
    character = getopt(argc, argv, options);
  }

  printf("getopt returned: '%c' (%d)\n", character, character);
  print_getopt_state();

  return 0;
}

Testing it:

$ ./getopt -dwith -dwithout -c
optind: 1 opterr: 1 optopt: ? (63) optarg: (null)
getopt returned: 'd' (100)
optind: 2 opterr: 1 optopt: (0) optarg: with
getopt returned: 'd' (100)
optind: 3 opterr: 1 optopt: (0) optarg: without
./getopt: option requires an argument -- 'c'
getopt returned: '?' (63)
optind: 4 opterr: 1 optopt: c (99) optarg: (null)
getopt returned: '�' (-1)
optind: 4 opterr: 1 optopt: c (99) optarg: (null)

If optstring begins with -, non-option positional arguments can also be handled. In these cases getopt() will return the value 1 to indicate it has found a positional argument and set the optarg pointer to it.

#include <stdio.h>
#include <unistd.h>


void print_getopt_state(void) {
  printf("optind: %d\t" "opterr: %d\t" "optopt: %c (%d)\t" "optarg: %s\n" ,
    optind, opterr, optopt, optopt, optarg
  );
}


int main (int argc, char *argv[]) {
  int character;
  char *options = "-abc:d::";

  print_getopt_state();

  character = getopt(argc, argv, options);

  while(character != -1) {
    printf("getopt returned: '%c' (%d)\n", character, character);
    print_getopt_state();
  
    character = getopt(argc, argv, options);
  }

  printf("getopt returned: '%c' (%d)\n", character, character);
  print_getopt_state();
  return 0;
}

Testing it:

$ ./getopt well -a -b now
optind: 1 opterr: 1 optopt: ? (63) optarg: (null)
getopt returned: '' (1)
optind: 2 opterr: 1 optopt: (0) optarg: well
getopt returned: 'a' (97)
optind: 3 opterr: 1 optopt: (0) optarg: (null)
getopt returned: 'b' (98)
optind: 4 opterr: 1 optopt: (0) optarg: (null)
getopt returned: '' (1)
optind: 5 opterr: 1 optopt: (0) optarg: now
getopt returned: '�' (-1)
optind: 5 opterr: 1 optopt: (0) optarg: (null)

If optstring begins with a +, getopt() stops parsing options and returns -1 at the first non-option argument. The - and + prefixes are only provided by C libraries that implement the GNU extensions.

Long Options

The GNU extensions provide a fancier version of getopt that supports longer, more verbose options that begin with --. It has the same first 3 parameters as getopt(). The 4th parameter is an array, longopts, of struct option structures that describe the longer options. The last parameter is an integer pointer, longindex, that on match will be the index of the matched option from the longopts array. longindex may be set to NULL if you don’t plan to use it.

The struct option has this format:

struct option {
  const char *name;
  int         has_arg;
  int        *flag;
  int         val;
};

The name is the option name that is supported. The has_arg field can be set to no_argument, required_argument or optional_argument, which correlate to the values 0, 1, and 2.

If flag is not NULL, the integer it points to is set to val on a match and getopt_long() returns 0. If flag is NULL, getopt_long() returns val when the long option is matched. Often a program will have a long option that returns the short option's character. This is common for an option like --help: set val to 'h' so that the code that handles -h handles both.

Here’s an example that uses a few different ways of handling the long options.

#include <stdio.h>
#include <unistd.h>
#include <getopt.h>


int main (int argc, char *argv[]) {
  int character;
  char *options = "h";
  int longindex;
  int moartest_flag = 0;

  struct option longopts[] = {
    {"help", no_argument, NULL, 'h'},
    {"echo", required_argument, NULL, 0},
    {"longtest", optional_argument, &moartest_flag, 12},
    {NULL, 0, NULL, 0}
  };

  while((character = getopt_long(argc, argv, options, longopts, &longindex)) != -1) {
    printf("getopt_long returned: '%c' (%d)\n", character, character);
    switch (character) {
      case 'h':
        printf("help!\n");
        break;
      case 0:
        printf("longindex: %d\n", longindex);
        printf("longopts[longindex].name = %s\n", longopts[longindex].name);
        printf("optarg: %s\n", optarg);
        printf("moartest_flag: %d\n", moartest_flag);
        break;
    }
  }

  return 0;
}

Testing it:

$ ./getopt --help --longtest --echo wee
getopt_long returned: 'h' (104)
help!
getopt_long returned: '' (0)
longindex: 2
longopts[longindex].name = longtest
optarg: (null)
moartest_flag: 12
getopt_long returned: '' (0)
longindex: 1
longopts[longindex].name = echo
optarg: wee
moartest_flag: 12

In this code, -h and --help both match the 'h' case and print my help message. Both --longtest and --echo are handled by the 0 case; I can tell them apart by the value that longindex gets set to. When --longtest is given, the moartest_flag integer is set to 12, since I provided a pointer to that integer in the flag field for that option.

That covers what I wanted to share on parsing command line arguments! There are other strategies out there for handling it, but these are the most common and I hope this guide proves helpful.
