In the Unix world there are basically two types of daemons: forking daemons and multi-threaded ones.
The big disadvantage of forking daemons is that each 'fork' creates a new process, and uses system resources.
Let's dive into some code. Here is the simplest version of a forking daemon.
    /* Sample program: forkingdaemon1.c */

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>

    static void event (void);
    static void handle_event (void);

    int main (void) {

        /* In the parent process we simulate an 'event' every 5 seconds. */
        while (1) {
            event();
            sleep (5);
        }

        return (0);
    }

    static void event (void) {
        pid_t pid;

        printf ("Event simulated\n");
        if ( (pid = fork()) == 0 ) {
            /* Child process code */
            handle_event();
        } else if (pid > 0) {
            /* Parent process code */
            printf ("Event is being handled by child pid %ld\n", (long) pid);
        } else {
            fprintf (stderr, "Fork failure: %s\n", strerror(errno));
            exit (1);
        }
    }

    static void handle_event (void) {
        printf ("Event handled in pid %ld\n", (long) getpid());
        exit (0);
    }
The anatomy of this daemon is that upon event simulation, a child
process is started which handles the event. Note the exit(0) in
the event handling code. This is crucial! The child process
must stop, because the parent has already gone back into standby mode.
If the child process didn't terminate, it would return from
handle_event() into event() and then into main(), where it would join
the wait loop and start forking children of its own. The number of
running processes would explode.
There are two fundamental errors in the above daemon program.
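So let's fix the errors. The system call wait4() allows you to
wait for one particular child process or for any child process,
and if a child process has terminated, to collect
relevant data about it. (wait4() stems from BSD; the POSIX-standard
waitpid() does the same without the resource usage argument.)
The following code shows this.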
The wait4() system call is enveloped in a function
report_on_children(). This function is called from three places in
the program:
the event loop in main(), before each new event is simulated;
the end of main(), just before main() stops;
sighandler(), which is called when the main
program receives a signal to stop. Signals will be
explained later in section 2.5.

Incidentally, the event handling in the child processes is made longer in this example, to ensure that more than one event handler will be active at the same time.
    /* Sample program: forkingdaemon2.c */

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/time.h>
    #include <sys/resource.h>
    #include <sys/wait.h>

    static void event (int ev);
    static void report_on_children (void);
    static void sighandler (int sig);

    int main (void) {
        int i;

        /* Make sure that when the parent is interrupted, the signal
         * handler goes off. SIGKILL is not listed here: it can't be
         * caught, so signal() would simply fail for it. */
        signal (SIGINT, sighandler);
        signal (SIGTERM, sighandler);
        /* .. and so on .. */

        /* In the parent process we simulate an 'event' every 5 seconds. */
        for (i = 0; i < 10; i++) {
            report_on_children();
            event(i);
            sleep (5);
        }

        /* Before normal termination, collect children again. */
        report_on_children();
        return (0);
    }

    static void sighandler (int sig) {
        /* printf() and exit() are not async-signal-safe. That is
         * tolerable in a demonstration, but not in production code. */
        printf ("Parent process caught signal %d, stopping\n", sig);
        report_on_children();
        exit (1);
    }

    static void report_on_children (void) {
        pid_t child;
        int status;

        /* Wait for any child to tell us it's finished. We do this in
         * a while loop to collect as many child stops as we can.
         * The arguments to wait4() are: -1 (any child is ok),
         * &status (get exit state into status), WNOHANG (don't block
         * when there are no more children), NULL (don't collect
         * resource usage data). */
        while ( (child = wait4 (-1, &status, WNOHANG, NULL)) > 0 ) {
            printf ("Child %ld exited\n", (long) child);
            if (WIFEXITED(status))
                printf ("  Exit status: %d\n", WEXITSTATUS(status));
            if (WIFSIGNALED(status))
                printf ("  Got signal: %d\n", WTERMSIG(status));
    #ifdef WCOREDUMP    /* not available on every Unix flavor */
            if (WIFSIGNALED(status) && WCOREDUMP(status))
                printf ("  Dumped core\n");
    #endif
            if (WIFSTOPPED(status))
                printf ("  Stopped due to signal: %d\n", WSTOPSIG(status));
        }
    }

    static void event (int ev) {
        pid_t pid;
        int i;

        printf ("Event simulated\n");
        if ( (pid = fork()) == 0 ) {
            /* Child process code: back to the default signal actions. */
            signal (SIGINT, SIG_DFL);
            signal (SIGTERM, SIG_DFL);

            for (i = 0; i < 7; i++) {
                printf ("Event %d handled in pid %ld (loop %d)\n",
                        ev, (long) getpid(), i);
                sleep (1);
            }
            exit (0);
        } else if (pid > 0) {
            /* Parent process code */
            printf ("Event is being handled by child pid %ld\n", (long) pid);
        } else {
            fprintf (stderr, "Fork failure: %s\n", strerror(errno));
            exit (1);
        }
    }
Just for completeness' sake: the 'zombie' problem can also be avoided
by ignoring SIGCHLD signals. These signals are delivered to the
parent process when a child terminates. If the parent process ignores
this system event, then terminated child processes are silently cleaned
up, without bothering anyone.
Therefore, a statement signal(SIGCHLD, SIG_IGN) in the main()
function avoids zombies too. However, in contrast to the above program,
the exit status of the children isn't collected. Also, this method may
not work on all Unix flavors.
This series of lectures won't go into POSIX threads at depth. But here's a sample program:
    /* Sample program: posixthreads.c */

    #include <string.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <errno.h>
    #include <pthread.h>
    #include <stdlib.h>

    /* The counter and the mutex that guards it are shared by all
     * threads. A mutex local to display() would be private to each
     * thread and would protect nothing. */
    static int counter;
    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

    static void *display (void *data);

    int main (void) {
        pthread_t a, b;
        int rc, seen;

        printf ("Main: starting threads.\n");

        /* pthread_create() returns an error number; it doesn't set errno. */
        if ( (rc = pthread_create (&a, 0, display, "A")) != 0 ) {
            fprintf (stderr, "Thread creation failed: %s\n", strerror(rc));
            exit (1);
        }
        if ( (rc = pthread_create (&b, 0, display, "B")) != 0 ) {
            fprintf (stderr, "Thread creation failed: %s\n", strerror(rc));
            exit (1);
        }

        printf ("Main: looping.\n");
        while (1) {
            /* Read the shared counter under the same lock. */
            pthread_mutex_lock(&mutex);
            seen = counter;
            pthread_mutex_unlock(&mutex);
            if (seen >= 20)
                break;
            printf ("Main: counter is %d\n", seen);
            sleep (1);
        }
        printf ("Cancelling threads..\n");
        pthread_cancel (a);
        pthread_cancel (b);

        return (0);
    }

    static void *display (void *data) {
        const char *name = (const char*) data;

        while (1) {
            pthread_mutex_lock(&mutex);
            printf ("Thread %s: increasing counter\n", name);
            counter++;
            pthread_mutex_unlock(&mutex);
            sleep (2);
        }
        return (0);     /* not reached */
    }

The anatomy of this sample program is as follows.
main() creates two threads, identified by a and
b. The threads receive function display() as their
execution point. The identifier is set to "A" and "B"
respectively.
Once the threads have been created, they are
running. main() will simply wait until a variable
counter reaches value 20. After that, the threads are
stopped.
The thread function display() receives its
identifier in a void *data. Since we know that this is
actually a string, it is typecast into name.
Actually, this data identifier is more than just a fancy
string: it is the "thread context". The thread starter can
prepare sets of data that are particular to that thread, and
can pass them as a typeless void* argument.
display() increases the global variable counter upon
each loop run. Note the locking mechanism.

A very fine example of combining forked processes with threads is the Apache HTTP server. As of Apache 2.0, the number of forked processes is configurable, but also the number of threads per process. Here's an example of an Apache configuration:
    ServerLimit 100
    StartServers 25
    ThreadsPerChild 50
Here Apache is configured to start no more than 100 processes, but 25 initially. There are 50 threads per process, resulting in 1250 to 5000 active threads. In Apache each thread serves one browser connection, so this configuration is initially ready to accept 1250 concurrent connections without having to start up any extra processes, and is allowed to serve up to 5000 before turning you down.
A program can be pushed into the background from the shell using
myprog & but that doesn't really do the trick.
This section focuses on how to make a program run in the background by itself. There are a few steps in doing this:
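Fork. The child process will become the daemon. The parent process
should report that the daemon has started, and stop.
In the child process, change to the root directory and become the
leader of a new session.
Close the file handles 0, 1 and 2 (standard input, output and error)
and re-open them on /dev/null. The reason is the
following. If you only close them, then the next opened file will have
handle 0; the file after that 1, and so on. These handles may
be e.g. network connections. Now imagine that some obscure
piece of code in a library that you're linking does a
printf(). This would get sent to file handle 1 -- which by
now is a network connection! If by contrast file handle 1 is
re-opened to write to /dev/null, then a spurious
printf() will silently disappear.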
A sample is shown below. The necessary actions have been collected
into a function daemonize() that returns 0 in the process that now
runs as the daemon, or the daemon's PID in the parent process that
started it. The function daemonize() is structured in such a way that
you might re-use it for your own enjoyment.
    /* Sample program: daemonizer.c */

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>

    static pid_t daemonize (void);

    int main (void) {
        pid_t pid;

        if ( (pid = daemonize()) > 0 ) {
            /* Parent process. */
            printf ("Daemon started as process ID %ld\n", (long) pid);
            exit (0);
        }

        /* Child process, which should perform some extraordinarily
         * interesting task. There's also a printf() for fun that
         * will be routed to /dev/null. */
        printf ("Start of daemon's actions\n");
        sleep (20);
        printf ("End of daemon's actions\n");

        return (0);
    }

    static pid_t daemonize (void) {
        pid_t pid;

        /* Try to fork. */
        if ( (pid = fork()) < 0 ) {
            /* Failed */
            fprintf (stderr, "Fork failure: %s\n", strerror(errno));
            exit (1);
        } else if (pid > 0) {
            /* Parent branch */
            return (pid);
        }

        /* We now must be the child branch. Go to the root dir. */
        if (chdir ("/")) {
            fprintf (stderr, "Cannot chdir to /: %s\n", strerror(errno));
            exit (1);
        }

        /* Become session leader of a new process group. */
        if (setsid() == -1) {
            fprintf (stderr, "Failed to create process group. Already created?\n");
            exit (1);
        }

        /* Close FD's 0,1,2 and reopen them on /dev/null. Each open()
         * returns the lowest free FD, so the order matters. Note that
         * we cannot report on failures of the open() below -- stderr
         * is very likely already gone. */
        close (0);
        close (1);
        close (2);
        open ("/dev/null", O_RDONLY);
        open ("/dev/null", O_WRONLY);
        open ("/dev/null", O_WRONLY);

        /* We're a daemon now! */
        return (0);
    }

The signals that your system knows can be shown with the
kill -l command, which lists signal names
and their numbers. A signal is technically no more or less than an
old-fashioned interrupt.
By convention, most daemons react to different signals in different ways. Most often the following signals are used:
SIGHUP or 1: Most daemons will reload their
configuration upon receipt of this signal, and will continue
processing using the new configuration. This is in fact a
'graceful restart': the daemon continues to serve requests, and
doesn't drop existing actions.
SIGINT, SIGQUIT, SIGTERM: This would be a signal to
gracefully stop, which usually means: finish up the existing
actions, but don't start new ones.
Depending on your daemon you might need to implement complex logic to
make all this possible. Again by convention, signals are sent to your
daemon from the start and stop scripts of the Unix startup and shutdown
routines (e.g. under Linux, see /etc/rc.d/init.d/*). For a
very fine example, see how apachectl signals the Apache HTTP
server httpd for several actions.
Depending on the signal, there will already be a default program behaviour in place:
Some signals are ignored by default. An example is
SIGWINCH, which is delivered to a program when the
terminal window is resized.
Some signals terminate the program by default. An example is
SIGINT, which is sent to a program when an
attempt is made to interrupt it (using ^C). These signals can
however be intercepted by a program so that instead of
terminating, they are 'caught' and handled otherwise. Another
example is SIGSEGV, delivered to a program upon a segment
violation. You might even catch this signal in your program to
try some error recovery. SIGUSR1 and SIGUSR2 belong here too:
they aren't delivered to a program by the kernel on its own account,
but are -- obviously -- user defined, and a program that doesn't
catch them is terminated.
Some signals can't be caught or ignored at all. An example is
SIGKILL, 9. Your program just can't avoid being
terminated by kill -9.

2.5.1.1: Defining a signal handler
A signal handler is a C function that has as prototype: void
handler(int sig). Upon delivery of a signal, the function is
activated, and sig is the number of the caught signal.
The handler is enabled using the system call signal(). This
function has two arguments: the number of the signal for which an
action is specified, and the address of the handler function. There
are two special values for the handler function address:
SIG_IGN specifies to ignore the signal;
SIG_DFL specifies to perform the default action
(e.g., ignore, or terminate the program).
There are a few ways by which a signal can be sent to a process. The
foremost method is to use the kill() system call. The first
argument is usually a process ID, or zero; in the latter case the
signal is sent to every process in the process group of the caller.
The second argument is the signal to send, e.g. SIGUSR1.
Another common way of signal delivery is by using alarm(). This
requests the kernel to deliver signal SIGALRM after a given number
of seconds.
Below is a sample program that forever reads keyboard input. It has
enabled a signal handler function catcher() for four
signals. Pressing ^C twice will terminate the program; and every 5
seconds an alarm signal goes off, which is displayed. When running
this sample program, also try to put the program on hold using
^Z. After re-starting it, either as a background process (using the
command bg) or as a foreground process (using the command fg),
signal SIGCONT will be delivered to the program.
    /* Sample program: sigcatcher.c */

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void catcher (int sig);

    int main (void) {
        signal (SIGTERM, catcher);      /* default 'kill <pid>' */
        signal (SIGINT, catcher);       /* ^C signal */
        signal (SIGALRM, catcher);      /* alarm clock */
        signal (SIGCONT, catcher);      /* continuing after a ^Z stop */

        alarm (5);
        while (1) {
            printf ("Press ENTER..\n");
            getchar ();
        }

        return (0);
    }

    /* Note: printf() is not async-signal-safe. Calling it from a
     * handler is tolerable in a demonstration like this one only. */
    static void catcher (int sig) {
        printf ("Caught signal %d\n", sig);

        switch (sig) {
        case SIGINT:
            /* Upon SIGINT: Enable the normal signal function. */
            printf ("Press ^C again to terminate this program.\n");
            signal (SIGINT, SIG_DFL);
            break;
        case SIGALRM:
            /* Upon SIGALRM: Reset the alarm clock for another 5 sec. */
            printf ("Alarm clock!\n");
            alarm (5);
            break;
        }
    }