...making Linux just a little more fun!
By Vishnu Ram V
Occasionally system administrators can run into situations where the conventional way of troubleshooting an issue may not yield results. The conventional way means using test scripts, observing the log files, tweaking configuration settings and the like. In such cases, one will have to dig deeper into the internals of the server. Strace proves to be a valuable tool in such situations. The strace utility intercepts and records the system calls which are made by a process and the signals which are received by a process. This article shows how to use strace to troubleshoot Apache by illustrating a real-world issue. I will begin with the problem statement and then move on to describe the initial troubleshooting attempts. The inner workings of the Apache server are briefly explained just before examining the strace results. Various options of strace and its internal working is beyond the scope of this article, so please refer to the man pages for the same.
ISSUE: Can't send mail from web pages using PHP 'mail()'
function.
OS: RedHat
SMTP Server: Sendmail
Web Server: Apache/1.3.22 - It's a virtual hosting
environment, there are many sites hosted in the server.
PHP version: 4.0.6
Let me first recreate the issue and see the error for myself. I will create a test script, 'mail.php', that will use the 'mail()' function for sending mail. The test script is as follows.
<?php error_reporting(E_ALL); $to = "Vishnu <we2cares@fastmail.fm>"; $subject = "This is a test email"; $message = "This is a test email, please ignore."; if (mail($to,$subject,$message)) { echo "Thank you for sending email"; } else { echo "Can't send email"; } ?>
I placed mail.php into the web area of the virtual domain in question, and then accessed it through a browser. The resulting page displayed an echoed message "Can't send email", no PHP specific error messages were shown. Analyzing the maillog shows no trace of mail injected from the virtual domain in question. I needed to verify whether the issue is specific to the virtual host in question or whether the issue is server wide. The same test script was used in a few other virtual hosts and produced the same result. That means the issue is not specific to a virtual host, it's a server wide issue. Looks like Apache/PHP is broken.
Let me see whether the mail() function is disabled in the php.ini file using disable_functions.
[root@dns1 root]# grep disable_functions /etc/php.ini disable_functions =
No, mail function is not disabled.
I have turned on display_errors and display_startup_errors in php.ini so that any internal PHP error is displayed on the web page, but that didn't help either. The test PHP page doesn't display any error, there are no error messages in the Apache, sendmail or other system logs. What next?
[ When debugging a production system be sure to use your development system (which, by the way, should be identical to and separate from your production environment). If you cannot avoid using the production system for this, make sure your error messages are never displayed to the browser and redirect them to your logs. This is more secure and doesn't bother your end users with messages they can't understand. -- René ]
As I mentioned earlier, in order to know what's happening at the process level, the strace utility is very useful. Before using strace to troubleshoot the issue, I will give a brief explanation of how Apache serves an incoming request.
Apache starts by creating a parent process with root privileges. This is the main process, it is responsible for forking child processes and maintaining them. The main Apache process doesn't serve any requests, the requests are served by child processes. The number of idle child processes at a given time is determined by the MinSpareServers and MaxSpareServers directives in httpd.conf. When a new request comes in, it is served by one of the idle child processes. If there are no idle child processes, then a new child process is forked by the parent, to serve an incoming request. From the ps result shown below, it's clear the the process with PID 1861 is the Apache parent process. It is running with "root" privileges, all the child processes are running as user "apache".
[root@haqmail ~]# ps aux | grep httpd USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1861 0.0 0.4 25680 1992 ? Ss Sep24 0:02 /usr/sbin/httpd apache 2295 0.0 0.4 25852 2024 ? S Sep24 0:00 /usr/sbin/httpd apache 2296 0.0 0.4 25852 2024 ? S Sep24 0:00 /usr/sbin/httpd apache 2297 0.0 0.4 25852 2024 ? S Sep24 0:00 /usr/sbin/httpd apache 2298 0.0 0.4 25852 2024 ? S Sep24 0:00 /usr/sbin/httpd apache 2299 0.0 0.4 25852 2024 ? S Sep24 0:00 /usr/sbin/httpd apache 2300 0.0 0.4 25852 2024 ? S Sep24 0:00 /usr/sbin/httpd apache 2301 0.0 0.4 25852 2024 ? S Sep24 0:00 /usr/sbin/httpd apache 2302 0.0 0.4 25852 2024 ? S Sep24 0:00 /usr/sbin/httpd
A better view of parent child relationship is available from the pstree result.
[root@haqmail ~]# pstree -p | grep httpd +-httpd(1861)---httpd(2295) | +-httpd(2296) | +-httpd(2297) | +-httpd(2298) | +-httpd(2299) | +-httpd(2300) | +-httpd(2301) | +-httpd(2302)
Now we know that our request for mail.php is served by one of the Apache child processes and that strace can be used to get details of how requests are served, but there is one more problem: Which child process serves the mail.php request? We either know the process ID of the exact child process, or we trace all the child processes and then sort out the output of strace. There is no way to know in advance which child process will serve the mail.php request, so we will have to trace the parent Apache and all its child processes. The "-f" strace option traces child processes as they are created by currently traced processes as a result of the 'fork' system call.
Here we go...
First stop Apache and then restart Apache with strace.
[root@dns1 root]# strace -f -o trace.txt /etc/rc.d/init.d/httpd start
The "-o" option saves the result to "trace.txt" file. Now access the test PHP page through the browser. Stop strace and restart Apache as usual. It may be necessary to send the strace process a SIGKILL signal, because it captures some signals it gets from the terminal session.
Let us now go ahead and examine the strace result in the trace.txt file.
[root@dns1 root]# grep mail.php trace.txt 21837 read(7, "GET /mail.php HTTP/1.1\r\nUser-Age"..., 4096) = 472 21837 stat64("/var/www/virtdomain/mail.php", {st_mode=S_IFREG|0644, st_size=587, ...}) = 0
From the above grep result it's clear that the Apache child process serving our request for mail.php is the one with PID "21837". Now grep trace.txt for "21837". Relevant result is pasted below.
21837 chdir("/var/www/virtdomaint") = 0 21837 open("/var/www/virtdomain/mail.php", O_RDONLY) = 8 . . . . 21837 fork() = 21844
The last line shows that the Apache child process forks another process with PID 21844. Let us grep for 21844 in trace.txt and see what it does.
21844 execve("/bin/sh", ["sh", "-c", "/usr/sbin/sendmail -t -i"], [/* 21 vars */]) = -1 EACCES (Permission denied)
Well, the process is used for sending mail using /usr/sbin/sendmail, but an incorrect permissions prevent it doing so. Sendmail permissions are set correctly, but checking /bin/sh reveals that it is set to "770" with "root.root" ownership. Since the Apache child process is running as user "apache", it doesn't have read and execute permission over /bin/sh, and hence the issue. Changing /bin/sh permission to "755" fixed it.
[ At this point I'd be inclined to wonder why the file permissions were wrong. It could be that someone is tip-toeing through your filesystem. -- Steve Brown ]
With a basic understanding of Apache and using strace, we could find the root cause of the issue and hence fix it. Strace is a general purpose utility and it can be used to troubleshoot any program. Strace and GDB (GNU Debugger) are very useful in system level troubleshooting. Here is a good article discussing both the utilities.
Talkback: Discuss this article with The Answer Gang
I'm an MTech. in Communication Systems from the IIT Madras. I joined Poornam Info Vision Pvt Ltd. in 2003 and have been working for Poornam since then.
My area of interest are performance tuning, server monitoring, and security. In my free time I practice Karate, read books and listen to music.