Troubleshooting Apcupsd

Testing

The first step in troubleshooting apcupsd is to read the Testing Apcupsd section of this manual.

Network Problems with Master/Slave Configurations

When working with a master/slave configuration (one UPS powering more than one computer), the master and slave communicate via the network. In many configurations, apcupsd is started before the network is initialized. In this case, the master may be unable to contact the slave. On apcupsd versions prior to 3.8.0, this could cause apcupsd to exit with an error. The solution is to force apcupsd to start after the network and DNS are up (by adjusting the symbolic links in /etc/rc.d), to put the names of the slave machines in your /etc/hosts file, or, preferably, to use IP addresses rather than machine names. On some configurations, you may need to use fully qualified names (host.domain.xxx) rather than simple host names.
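As a quick check, the names used in the configuration files can be tested for resolution before apcupsd is started. The following is a minimal sketch, assuming a Linux system on which the getent utility is installed; the function name check_resolves is illustrative, and localhost stands in for your own master or slave names.

```shell
#!/bin/sh
# Report whether a host name resolves to an address.
# Assumes getent(1) is available; it consults the sources listed in
# /etc/nsswitch.conf (typically /etc/hosts first, then DNS).
check_resolves() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "$1: resolves"
    else
        echo "$1: does NOT resolve"
        return 1
    fi
}

# Replace localhost with the names from your apcupsd.conf files.
check_resolves localhost
```

If a name fails here but works later, once the system is fully booted, the start-up order is the likely culprit rather than the name itself.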

Error Messages from a Master Configuration

In a master/slave configuration, you can get the following error messages from a master. The error message is followed by a possible explanation:

Cannot resolve slave name XXX

To contact the slave, the slave name given in the configuration file must be resolved to an IP address. In this case, apcupsd could not get the IP address. Either the slave name is incorrect, your DNS is not working, or you have started apcupsd during the boot process before the network is operational.

Got slave shutdown from SSS

This message should not appear, because it is not yet used.

Cannot write to slave SSS

This message occurs when the master attempts to send a message to the slave SSS and gets an error. It indicates that either the slave machine is not responding (apcupsd died, the system crashed, ...) or that the network is down.

Cannot read magic from slave SSS

This message indicates that the master attempted to read the code key from the slave SSS and it did not match the value expected. A common cause of this problem is that the master and slave versions of apcupsd are not the same. Please be sure you are running the same version of apcupsd on all your master and slave machines.

Connect to slave SSS failed

This message is logged when the master attempts to connect to slave SSS and no connection is accepted. The most common cause of this problem is that the slave copy of apcupsd is not yet ready to accept connections or is not running. Generally, apcupsd will retry the connection a bit later. If the problem is persistent, it can indicate a network problem or that the slave name on the SLAVE directive of the master's configuration file is incorrect.

Cannot open stream socket

This indicates a fundamental networking problem on your system: either the system lacks sufficient resources, or TCP/IP has not been configured.

Error Messages from a Slave Configuration

In a master/slave configuration, you can get the following error messages from a slave. The error message is followed by a possible explanation:

Can't resolve master name MMM

This message is logged when the slave attempts to resolve the name given on the MASTER configuration directive to an IP address. It probably means that the master name MMM is not defined, your DNS is not working properly, or you have started apcupsd in the boot process before the network is initialized. Check the name MMM, or use an explicit IP address on the MASTER configuration directive in the slave's configuration file.

Cannot bind local address, probably already in use

This means that the slave attempted to bind its port so that it can listen for messages from the master, and the bind failed. This can occur if you already have a copy of apcupsd running, or if you ran apcupsd within the past 5 to 10 minutes, because the operating system occasionally does not release a port for 5 to 10 minutes after a program exits. In this case, you can either wait a few minutes for the problem to go away, or use a different port in both your master and slave configuration files.
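To see whether something is still holding the port, the output of netstat can be filtered for the port number. This is a minimal sketch, assuming Linux-style "netstat -an" output in which the fourth column is the local address in address:port form; port_lines is an illustrative helper name, and 6666 is the master/slave port mentioned elsewhere in this chapter.

```shell
#!/bin/sh
# Read "netstat -an" style output on standard input and print the lines
# whose local address (fourth column) ends in ":PORT".
port_lines() {
    awk -v port="$1" '$4 ~ (":" port "$")'
}

# Show every socket bound to port 6666, whatever its state.
if command -v netstat >/dev/null 2>&1; then
    netstat -an | port_lines 6666
fi
```

A line in the LISTEN state suggests that a program (possibly another copy of apcupsd) currently owns the port. Lines in other states, such as TIME_WAIT, are connections the operating system has not yet finished closing.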

Socket accept error

The slave got an error waiting on the accept() system call. This is probably due to a fundamental networking problem.

Unauthorised attempt from master MMM

The master named MMM (probably an IP address) contacted the slave but MMM is not the master that was listed on the MASTER configuration directive in /etc/apcupsd.conf, and consequently, it is not authorized to communicate with the slave. Please check that your MASTER and SLAVE names in your slave and master configuration files respectively are correct.

Read failure from socket

The slave got an error reading the socket open to the master. This indicates a fundamental networking problem.

Bad APC magic from master: MMM

The slave received a code key from the master that does not correspond to the one expected by the slave. The most common cause of this problem is that you are running a different version of apcupsd on the master and the slave. Please ensure that you are running the same version of apcupsd on the master and all slaves.

Bad user magic from master: MMM

This message indicates that the master and slave have previously communicated, but that the code key transmitted with the most recent message from the master does not correspond to what the slave expects. This problem is probably due to a network error or some other user or machine contacting the slave on the network port.

Master/Slave Connection Not Working

Master/slave problems are usually related to one of the following items:
  1. Improper apcupsd.conf files. A good starting point is the set of master/slave example files in the examples subdirectory of the source.
  2. Master or slave IP address or name incorrect. Try pinging each machine from the other using the names or addresses that you have put in the respective apcupsd.conf files.
  3. Make sure no other program is using port number 6666, or change the NETPORT directive in both apcupsd.conf files.
  4. Make sure you are using the same version of apcupsd on both the master and slave machines.
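
Part of this checklist can be verified mechanically by comparing directive values taken from the two configuration files. The following is a minimal sketch, assuming each directive sits on its own line in the form "NAME value" and that both files have been copied to one machine; the helper names and the file names master.conf and slave.conf are hypothetical.

```shell
#!/bin/sh
# Print the value of directive NAME from an apcupsd.conf-style FILE.
# Lines beginning with "#" are comments and never match.
get_directive() {
    awk -v name="$2" '$1 == name { print $2; exit }' "$1"
}

# Compare one directive between two files and report the result.
same_directive() {
    a=$(get_directive "$1" "$3")
    b=$(get_directive "$2" "$3")
    if [ "$a" = "$b" ]; then
        echo "$3 matches: $a"
    else
        echo "$3 differs: '$a' vs '$b'"
        return 1
    fi
}

# Hypothetical local copies of the master's and slave's apcupsd.conf.
if [ -r master.conf ] && [ -r slave.conf ]; then
    same_directive master.conf slave.conf NETPORT
fi
```

The same get_directive helper can print the MASTER and SLAVE values so that each can be tried with ping from the other machine.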

CGI Programs Do Not Work

Try checking the following:
  1. Did you successfully compile and link the cgi programs without errors? If you are not sure, cd to the cgi directory and do a "make clean" followed by a "make".
  2. Did you move or copy all the .cgi programs in the cgi directory to your Web server cgi-bin directory on the SAME machine?
  3. Did you verify that the cgi programs located in the cgi-bin all have execute permission?
  4. Have you tried any other cgi programs and proven that they work?
  5. Have you verified that the Network Information Server process of apcupsd is running as described in this manual?
  6. Have you verified that your apcupsd.conf file is properly configured for the Network Information Server and that the port is defined as 7000? I.e. "NETSERVER on" and "SERVERPORT 7000"
  7. If one or more machines do not show up in the multimon output, the most likely cause is a configuration error in the hosts.conf file in your /etc/apcupsd directory.
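
The execute-permission check in item 3 is easy to script. This is a minimal sketch; the function name is illustrative and /var/www/cgi-bin is only a placeholder for wherever your Web server keeps its cgi-bin directory.

```shell
#!/bin/sh
# Print every *.cgi file in directory DIR that the current user
# cannot execute.
non_executable_cgi() {
    for f in "$1"/*.cgi; do
        [ -e "$f" ] || continue     # the pattern matched no files
        [ -x "$f" ] || echo "$f"
    done
}

# Placeholder path; substitute your Web server's cgi-bin directory.
non_executable_cgi /var/www/cgi-bin
```

No output means every .cgi file found is executable by the user running the check. Note that the Web server usually runs as a different user, so it is worth running the check as that user, or inspecting the permissions with "ls -l".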

Battery Problems

Please see the Battery Chapter of this document for more details.

Cable or Connection Problems

Frequently during the initial installation, users don't know what cable they have or have problems connecting to the serial port. If this applies to you, one way to diagnose the problem is to use the apctest program. To do so, you must first build it with:

make apctest

Then, you simply execute it with:

./apctest

and follow the instructions. It will place the output from the session in the file apctest.output. If you are not able to resolve your problem, sometimes we can help if you email us this output file along with your apcupsd.conf file. Please see the Testing Chapter of this document for additional details on how to build and use apctest.

Bizarre Intermittent Behavior

In one case, a user reported that he received random incorrect values from the UPS in the status output. It turned out that gpm, the mouse control program for command windows, was using the serial port without using the standard Unix locking mechanism. As a consequence, both apcupsd and gpm were reading the serial port. If you are running gpm, please ensure that it is not configured with a serial mouse on the serial port used by apcupsd.
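
One way to look for this kind of conflict is to list every process that has the serial device open. The following is a minimal sketch for Linux only, which scans the /proc file system; the function name is illustrative, and /dev/ttyS0 is a placeholder for the serial device apcupsd is configured to use.

```shell
#!/bin/sh
# Print the PIDs of processes that have FILE open, by scanning /proc.
# Linux only; run as root to see processes owned by other users.
pids_with_open() {
    for fd in /proc/[0-9]*/fd/*; do
        if [ "$(readlink "$fd" 2>/dev/null)" = "$1" ]; then
            pid=${fd#/proc/}       # strip the leading "/proc/"
            echo "${pid%%/*}"      # keep only the PID component
        fi
    done | sort -un
}

# /dev/ttyS0 is a placeholder for apcupsd's serial device.
for pid in $(pids_with_open /dev/ttyS0); do
    ps -p "$pid" -o pid= -o comm=
done
```

If anything other than apcupsd appears in the output, two programs are sharing the port. Where they are installed, "fuser /dev/ttyS0" or "lsof /dev/ttyS0" give the same information.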