PingDirectory

Installation and maintenance issues

The following are common installation and maintenance issues and possible solutions.

The setup program will not run

If the setup tool does not run properly, some of the most common reasons include the following.

A Java environment is not available

The server requires that Java be installed on the system prior to running the setup tool.

If there are multiple instances of Java on the server, run the setup tool with an explicitly defined value for the JAVA_HOME environment variable that specifies the path to the Java installation. For example:

$ env JAVA_HOME=/ds/java ./setup

The value of the JAVA_HOME environment variable can also be overridden by another environment variable. If that occurs, use the following command to override any other environment variables:

$ env UNBOUNDID_JAVA_HOME="/ds/java" UNBOUNDID_JAVA_BIN="" ./setup
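Before pointing setup at a Java installation, it can help to confirm that the directory actually contains a java executable. The following is a minimal sketch; check_java_home is a hypothetical helper, not a tool shipped with the server.

```shell
# Hypothetical helper: report whether a candidate directory looks like a
# usable Java installation before passing it to setup.
#   $1 = candidate Java installation directory
check_java_home() {
  candidate="$1"
  if [ -x "$candidate/bin/java" ]; then
    echo "usable: $candidate"
    return 0
  fi
  echo "not usable: $candidate (no executable bin/java)" >&2
  return 1
}
```

For example, `check_java_home /ds/java` could be run first, and setup started with that path only if the check succeeds.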

Unexpected arguments provided to the JVM

If the setup tool attempts to launch the java command with an invalid set of arguments, the JVM may fail to start. By default, no special options are provided to the JVM when running setup, but this might not be the case if either the JAVA_ARGS or UNBOUNDID_JAVA_ARGS environment variable is set. If the setup tool displays an error message indicating that the Java environment could not be started with the provided set of arguments, run the following command:

$ unset JAVA_ARGS UNBOUNDID_JAVA_ARGS

The server has already been configured or started

The setup tool is only intended to provide the initial configuration for the server. It will not run if it detects that it has already been run.

A previous installation should be removed before installing a new one. However, if there is nothing of value in the existing installation, the following steps can be used to run the setup program:

  • Remove the config/config.ldif file and replace it with the config/update/config.ldif.{revision} file containing the initial configuration.

  • If there are any files or subdirectories in the db directory, then remove them.

  • If a config/java.properties file exists, then remove it.

  • If a lib/set-java-home script (or lib\set-java-home.bat file on Microsoft Windows) exists, then remove it.
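The steps above can be sketched as a small shell function. This is an illustration only: reset_for_setup is a hypothetical name, the baseline file must be passed in explicitly because its {revision} suffix differs between installations, and the function deletes data, so it is only appropriate when nothing in the existing installation needs to be kept.

```shell
# Hypothetical helper, not part of the product: reset an installation so
# that setup can be run again. This DELETES data.
#   $1 = installation root
#   $2 = path of the baseline file to restore, that is, the
#        config/update/config.ldif.{revision} file for this installation
reset_for_setup() {
  root="$1"
  baseline="$2"
  [ -d "$root/config" ] || { echo "no config directory under $root" >&2; return 1; }
  [ -f "$baseline" ] || { echo "baseline file not found: $baseline" >&2; return 1; }

  # Replace the active configuration with the initial configuration.
  cp "$baseline" "$root/config/config.ldif" || return 1

  # Remove everything inside db, but keep the directory itself.
  if [ -d "$root/db" ]; then
    find "$root/db" -mindepth 1 -delete
  fi

  # Remove generated Java settings so that setup recreates them.
  # The script name lib/set-java-home is the one used later in this document.
  rm -f "$root/config/java.properties" "$root/lib/set-java-home"
}
```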

The server will not start

If the server does not start, then there are a number of potential causes.

The server or other administrative tool is already running

Only a single instance of the server can run at any time from the same installation root. Other administrative operations can prevent the server from being started. In such cases, the attempt to start the server should fail with a message like:

The <server> could not acquire an exclusive lock on file
/ds/PingData<server>/locks/server.lock:
The exclusive lock requested for file
/ds/PingData<server>/locks/server.lock
was not granted, which indicates that another
process already holds a shared or exclusive lock on
that file. This generally means that another instance
of this server is already running.

If the server is not running (and is not in the process of starting up or shutting down), and there are no other tools running that could prevent the server from being started, it is possible that a previously held lock was not properly released. Try removing all of the files in the locks directory before attempting to start the server.
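Removing leftover lock files might look like the following sketch. The function name clear_stale_locks is hypothetical, and the fuser check is an extra precaution that is not part of the documented procedure: where fuser is installed, any file that a process still has open is skipped. Confirm that the server and all administrative tools are stopped before running it.

```shell
# Hypothetical helper: remove leftover lock files from a stopped server.
#   $1 = installation root
clear_stale_locks() {
  locks="$1/locks"
  [ -d "$locks" ] || { echo "no locks directory under $1" >&2; return 1; }
  for f in "$locks"/*; do
    [ -f "$f" ] || continue
    # If fuser is installed, skip any file that a process still has open.
    if command -v fuser >/dev/null 2>&1 && fuser "$f" >/dev/null 2>&1; then
      echo "in use, not removed: $f" >&2
      continue
    fi
    rm -f "$f" && echo "removed: $f"
  done
}
```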

There is not enough memory available

When the server is started, the JVM attempts to allocate all memory that it has been configured to use. If there is not enough free memory available on the system, the server generates an error message indicating that it could not be started.

There are a number of potential causes for this:

  • If the amount of memory in the underlying system has changed, the server might need to be re-configured to use a smaller amount of memory.

  • Another process on the system is consuming memory and there is not enough memory to start the server. Either terminate the other process, or reconfigure the server to use a smaller amount of memory.

  • The server just shut down and an attempt was made to immediately restart it. If the server is configured to use a significant amount of memory, it can take a few seconds for all of the memory to be released back to the operating system. Run the vmstat command and wait until the amount of free memory stops growing before restarting the server.

  • If the system is configured with one or more memory-backed file systems (such as /tmp), determine if any large files are consuming a significant amount of memory. If so, remove them or relocate them to a disk-based file system.
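For the restart case in the list above, waiting for free memory to stop growing can be automated instead of watching vmstat output by hand. The following is a sketch under stated assumptions: wait_until_stable and free_kb are hypothetical helpers, and free_kb reads MemFree from /proc/meminfo, which exists only on Linux.

```shell
# Hypothetical helper: wait until a sampled value stops growing, then print
# the last sample.
#   $1 = command that prints one integer (for example, free memory in kB)
#   $2 = seconds between samples (default 2)
wait_until_stable() {
  sample_cmd="$1"
  interval="${2:-2}"
  previous=$($sample_cmd) || return 1
  while :; do
    sleep "$interval"
    current=$($sample_cmd) || return 1
    # Stop as soon as a sample is no larger than the one before it.
    if [ "$current" -le "$previous" ]; then
      echo "$current"
      return 0
    fi
    previous="$current"
  done
}

# Linux-only sampler (assumption): MemFree from /proc/meminfo, in kB.
free_kb() {
  awk '/^MemFree:/ {print $2}' /proc/meminfo
}
```

A restart could then be written as `wait_until_stable free_kb 2 && bin/start-server`. On other platforms, free_kb would need to be replaced with a sampler suited to that system.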

An invalid Java environment or JVM option was used

If an attempt to start the server fails with 'no valid Java environment could be found,' or 'the Java environment could not be started,' and memory is not the cause, other causes may include the following:

  • The Java installation that was previously used to run the server no longer exists. Update the config/java.properties file to reference the new Java installation and run the bin/dsjavaproperties command to apply that change.

  • The Java installation has been updated, and one or more of the options that had worked with the previous Java version no longer work. Re-configure the server to use the previous Java version, and investigate which options should be used with the new installation.

  • If an UNBOUNDID_JAVA_HOME or UNBOUNDID_JAVA_BIN environment variable is set, its value may override the path to the Java installation used to run the server (defined in the config/java.properties file). Similarly, if an UNBOUNDID_JAVA_ARGS environment variable is set, then its value might override the arguments provided to the JVM. If this is the case, explicitly unset the UNBOUNDID_JAVA_HOME, UNBOUNDID_JAVA_BIN, and UNBOUNDID_JAVA_ARGS environment variables before starting the server.
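To see whether any of the overriding variables named in the last item are set, a short check such as the following can be used. It is a sketch; show_java_overrides is a hypothetical helper that inspects the current shell.

```shell
# Hypothetical helper: print any UNBOUNDID_JAVA_* overrides that are set in
# the current shell, so they can be reviewed before being unset.
show_java_overrides() {
  for name in UNBOUNDID_JAVA_HOME UNBOUNDID_JAVA_BIN UNBOUNDID_JAVA_ARGS; do
    # eval gives portable indirect expansion; the names are fixed above.
    eval "value=\${$name-}; is_set=\${$name+yes}"
    if [ -n "$is_set" ]; then
      echo "$name=$value"
    fi
  done
}
```

If the function prints anything, running `unset UNBOUNDID_JAVA_HOME UNBOUNDID_JAVA_BIN UNBOUNDID_JAVA_ARGS` in the same shell clears the overrides before the server is started.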

Any time the config/java.properties file is updated, the bin/dsjavaproperties tool must be run to apply the new configuration. If a problem with the previous Java configuration prevents the bin/dsjavaproperties tool from running properly, remove the lib/set-java-home script (or lib\set-java-home.bat file on Microsoft Windows) and invoke the bin/dsjavaproperties tool with an explicitly defined path to the Java environment, such as:

$ env UNBOUNDID_JAVA_HOME=/ds/java bin/dsjavaproperties

An invalid command-line option was used

There are a small number of arguments that can be provided when running the bin/start-server command. If arguments were provided and are not valid, the server displays an error message. Correct or remove the invalid argument and try to start the server again.

The server has an invalid configuration

If a change is made to the server configuration using dsconfig or the administrative console, the server will validate the change before applying it. However, it is possible that a configuration change can appear to be valid, but does not work as expected when the server is restarted.

In most cases, the server displays (and writes to the error log) a message that explains the problem. If the message does not provide enough information to identify the problem, the logs/config-audit.log file records recent configuration changes, and the config/archived-configs directory captures configuration changes that were not made through a supported configuration interface. The server can be started with the last valid configuration using the --useLastKnownGoodConfig option:

$ bin/start-server --useLastKnownGoodConfig

To determine the set of configuration changes made to the server since the installation, use the config-diff tool with the arguments --sourceLocal --targetLocal --sourceBaseline. The dsconfig --offline command can be used to make configuration changes.
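The config-diff invocation described above can be wrapped as follows. This is a sketch: config_changes_since_install is a hypothetical name, and it assumes that config-diff is located in the bin directory of the installation, like the other tools referenced in this section.

```shell
# Hypothetical wrapper: list configuration changes made since installation.
# Assumes config-diff is located in the bin directory of the installation.
#   $1 = installation root
config_changes_since_install() {
  "$1/bin/config-diff" --sourceLocal --targetLocal --sourceBaseline
}
```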

Proper permissions are missing

The server should only be started by the user or role used to initially install the server. If the server was initially installed as a non-root user and then started by the root account, it can no longer be started as a non-root user, because any files created while it ran as root are owned by root.

If the user account used to run the server needs to change, change ownership of all files in the installation to that new user. For example, if the server should be run as the "ds" user in the "other" group, run the following command as root:

$ chown -R ds:other /ds/PingData<server>
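Before or after changing ownership, it can be useful to list the files that are not owned by the expected account, for example files created while the server ran as root. The following sketch uses the standard find command; list_foreign_files is a hypothetical helper name.

```shell
# Hypothetical helper: list files under an installation that are NOT owned
# by the expected account.
#   $1 = installation root
#   $2 = expected owner (user name or numeric user ID)
list_foreign_files() {
  find "$1" ! -user "$2" -print
}
```

An empty result from `list_foreign_files /ds/PingData<server> ds` would indicate that every file already belongs to the "ds" user.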

The server has shut down

Check the current server state by using the bin/server-state command. If the server was previously running but is no longer active, potential reasons may include:

  • Shut down by an administrator – Unless the server was forcefully terminated, messages are written to the error and server logs stating the reason.

  • Shut down when the underlying system crashed or was rebooted – Run the uptime command on the underlying system to see how long it has been running, which indicates whether it was recently restarted.

  • Process terminated by the underlying operating system – If this happens, a message is written to the system error log.

  • Shut down in response to a serious problem – This can occur if the server has detected that the amount of usable disk space is critically low, or if errors have been encountered during processing that left the server without worker threads. Messages are written to the error and server logs (if disk space is available).

  • The JVM has crashed – If this happens, then the JVM should provide a fatal error log (a hs_err_pid<processID>.log file), and potentially a core file.
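For the crash case in the last item, the fatal error logs can be located with a search such as the following sketch. find_jvm_crash_logs is a hypothetical helper, and the directory to search is an assumption: the JVM normally writes these files to the working directory of the process unless configured otherwise.

```shell
# Hypothetical helper: find JVM fatal error logs, newest first.
#   $1 = directory to search (for example, the installation root)
find_jvm_crash_logs() {
  find "$1" -type f -name 'hs_err_pid*.log' -exec ls -t {} +
}
```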

The server will not accept client connections

Check the current server state by using the bin/server-state command. If the server does not appear to be accepting connections from clients, reasons can include the following:

  • The server is not running.

  • The underlying system on which the server is installed is not running.

  • The server is running, but is not reachable as a result of a network or firewall configuration problem. If that is the case, connection attempts should time out rather than be rejected.

  • If the server is configured to allow secure communication through SSL or StartTLS, a problem with the key manager or trust manager configuration can cause connection rejections. Messages are written to the server access log for each failed connection attempt.

  • The server may have reached its maximum number of allowed connections. Messages should be written to the server access log for each rejected connection attempt.

  • If the server is configured to restrict access based on the address of the client, messages should be written to the server access log for each rejected connection attempt.

  • If a connection handler encounters a significant error, it can stop listening for new requests. A message should be written to the server error log with information about the problem. Restarting the server can resolve the issue. Alternatively, create an LDIF file that disables and then re-enables the connection handler, create the config/auto-process-ldif directory if it does not already exist, and then copy the LDIF file into it.
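The LDIF approach in the last item might be scripted along the following lines. This sketch rests on assumptions that must be checked against the actual server configuration: the DN of the connection handler entry is supplied by the caller, ds-cfg-enabled is assumed to be the attribute that controls whether the handler is enabled, and both the function name queue_handler_restart and the file name restart-handler.ldif are hypothetical.

```shell
# Hypothetical helper: queue an LDIF file that disables and then re-enables
# a connection handler.
# ASSUMPTIONS: the handler entry DN passed in $2 and the ds-cfg-enabled
# attribute name must be confirmed against the server configuration.
#   $1 = installation root
#   $2 = DN of the connection handler configuration entry
queue_handler_restart() {
  root="$1"
  handler_dn="$2"
  ldif=$(mktemp) || return 1
  cat > "$ldif" <<EOF
dn: $handler_dn
changetype: modify
replace: ds-cfg-enabled
ds-cfg-enabled: false

dn: $handler_dn
changetype: modify
replace: ds-cfg-enabled
ds-cfg-enabled: true
EOF
  # Create the directory if needed, then copy the LDIF file into it.
  mkdir -p "$root/config/auto-process-ldif" || { rm -f "$ldif"; return 1; }
  cp "$ldif" "$root/config/auto-process-ldif/restart-handler.ldif"
  status=$?
  rm -f "$ldif"
  return "$status"
}
```

The handler DN can be found in the server configuration; review the generated file before relying on it.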

The server is unresponsive

Check the current server state by using the bin/server-state command. If the server process is running and appears to be accepting connections but does not respond to requests received on those connections, potential reasons for this include:

  • If all worker threads are busy processing other client requests, new requests are forced to wait until a worker thread becomes available. A stack trace can be obtained using the jstack command to show the state of the worker threads and the waiting requests.

    If all worker threads are processing the same requests for a long time, the server sends an alert that it might be deadlocked. All threads might be tied up processing unindexed searches.

  • If a request handler is busy with a client connection, other requests sent through that request handler are forced to wait until it is able to read data. If there is only one request handler, all connections are impacted. Stack traces obtained using the jstack command will show that a request handler thread is continuously blocked.

  • If the JVM in which the server is running is not properly configured, it can spend too much time performing garbage collection. The effect on the server is similar to that of a network or firewall configuration problem. A stack trace obtained with the pstack utility will show that most threads are idle except the one performing garbage collection. It is also likely that a small number of CPUs are 100% busy while all other CPUs are idle. The server will also issue an alert, with details, after detecting a long JVM pause.

  • If the JVM in which the server is running has hung, the pstack utility should show that one or more threads are blocked and unable to make progress. In such cases, the system CPUs should be mostly idle.

  • If there is a network or firewall configuration problem, communication attempts with the server will fail. A network sniffer will show that packets sent to the system are not receiving TCP acknowledgment.

  • If the host system is hung or lost power without a graceful shutdown, the server will be unresponsive.
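When diagnosing busy worker threads or a blocked request handler, a single stack trace is often less informative than several taken a few seconds apart, because threads that appear in the same place in every dump are the likely culprits. The following sketch collects such a series with the JDK's jstack command. capture_thread_dumps and the output file naming are hypothetical, and jstack generally needs to be run as the same user that owns the server process.

```shell
# Hypothetical helper: capture several thread dumps a few seconds apart so
# that threads stuck in the same place can be compared between dumps.
#   $1 = server process ID
#   $2 = output directory
#   $3 = number of dumps (default 3)
#   $4 = seconds between dumps (default 5)
# JSTACK can be set to the full path of the JDK's jstack command.
capture_thread_dumps() {
  pid="$1"
  outdir="$2"
  count="${3:-3}"
  interval="${4:-5}"
  mkdir -p "$outdir" || return 1
  n=1
  while [ "$n" -le "$count" ]; do
    "${JSTACK:-jstack}" "$pid" > "$outdir/jstack-$pid-$n.txt" || return 1
    # Pause between dumps, but not after the last one.
    if [ "$n" -lt "$count" ]; then
      sleep "$interval"
    fi
    n=$((n + 1))
  done
}
```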

Problems with the administrative console

If a problem occurs when trying to use the administrative console, reasons might include one of the following:

  • The web application container that hosts the console is not running. If an error occurs while trying to start it, consult the logs for the web application container.

  • If a problem occurs while trying to authenticate, make sure that the target server is online. If it is, the access log might provide information about the authentication failure.

  • If a problem occurs while interacting with the server instance using the administrative console, the access and error logs for that instance might provide additional information.