Page created: 15 Jul 2022 | Page updated: 22 Dec 2022
If the server is running and responding to clients, but clients wait a long time to receive responses, any of several problems could be the cause. In these cases, use the Periodic Stats Logger, a valuable tool for getting per-second monitoring information on the server. The Periodic Stats Logger can save the information in CSV format for easy viewing in a spreadsheet. For more information, see "Profiling Server Performance Using the Periodic Stats Logger". The potential causes of slow responses to client requests are as follows:
- The server is not optimally configured for the type of requests being processed, or clients are requesting inefficient operations. If so, the access log should show that operations are taking a long time to complete, and they are likely to be unindexed. Updating the server configuration to better suit the requests, or altering the requests to make them more efficient, could help alleviate the problem. View the expensive operations access log in logs/expensive-ops, which by default logs operations that take longer than 1 second. You can also run the bin/status command or view the status in the Administrative Console to see the server’s Work Queue information (also see the next bullet point).
- The server is overwhelmed with client requests and has amassed a large backlog of requests in the work queue. This can be the result of a configuration problem (for example, too few worker threads configured), or it might be necessary to provision more systems on which to run the server software. Symptoms of this problem appear similar to those experienced when the server is asked to process inefficient requests, but the details of the requests in the access log show that they are not necessarily inefficient. Run the bin/status command to view the Work Queue information. If everything is performing well, you should not see a large queue size or a server that is near 100% busy. The %Busy statistic is calculated as the percentage of worker threads that are busy.

      --- Work Queue ---
                 : Recent : Average : Maximum
      -----------:--------:---------:--------
      Queue Size : 10     : 1       : 10
      % Busy     : 17     : 14      : 100

  You can also view the expensive operations access log in logs/expensive-ops, which by default logs operations that take longer than 1 second.
- The server is not configured to fully cache all of the data in the server, or the cache is not yet primed. In this case, iostat reports very high disk utilization. This can be resolved by configuring the server to fully cache all data and to load database contents into memory on startup. If the underlying system does not have enough memory to cache the entire data set, it might not be possible to achieve optimal performance for operations that need data not contained in the cache. For more information, see Disk-Bound Deployments.
- If the JVM is not properly configured, then it will need to perform frequent garbage collection and periodically pause execution of the Java code that it is running. In that case, the server error log should report that the server has detected a number of pauses and can include tuning recommendations to help alleviate the problem.
- If the server is configured to use a large percentage of the memory in the system, then the system might have run low on available memory and begun swapping. In this case, iostat should report very high utilization for disks used to hold swap space, and commands like cat /proc/meminfo on Linux can report a large amount of swap memory in use. Swapping can also occur on Linux if swappiness is not set to 0. For more information, see Disable File System Swapping.
- If another process on the system is consuming a significant amount of CPU time, then it can adversely impact the ability of the server to process requests efficiently. Isolating the processes (for example, using processor sets) or separating them onto different systems can help eliminate this problem.
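
Two of the checks above, the Work Queue statistics and swap usage, can be scripted for quick triage. The following Python sketch is illustrative only. It assumes the Work Queue table in the bin/status output has the same layout as the sample shown above, and that the standard SwapTotal and SwapFree fields are present in /proc/meminfo on Linux. The function names and the thresholds in looks_backlogged are hypothetical and are not part of the server tooling.

```python
import os
import re


def parse_work_queue(status_text):
    """Extract the Queue Size and % Busy rows from a Work Queue table.

    Assumes rows look like 'Queue Size : 10 : 1 : 10', with the columns
    Recent, Average, and Maximum, as in the sample shown above.
    """
    stats = {}
    for line in status_text.splitlines():
        parts = [p.strip() for p in line.split(":")]
        if len(parts) == 4 and parts[0] in ("Queue Size", "% Busy"):
            recent, average, maximum = (float(p) for p in parts[1:])
            stats[parts[0]] = {
                "recent": recent,
                "average": average,
                "maximum": maximum,
            }
    return stats


def looks_backlogged(stats, queue_limit=100, busy_limit=90):
    """Flag a large recent queue size or a server that is near 100% busy.

    The default limits are hypothetical placeholders; choose values that
    suit your deployment.
    """
    return (stats["Queue Size"]["recent"] > queue_limit
            or stats["% Busy"]["recent"] >= busy_limit)


def swap_used_kb(meminfo_text):
    """Return swap in use, in kB, computed as SwapTotal minus SwapFree."""
    fields = dict(re.findall(r"^(\w+):\s+(\d+)", meminfo_text, flags=re.MULTILINE))
    return int(fields.get("SwapTotal", 0)) - int(fields.get("SwapFree", 0))


if __name__ == "__main__":
    # Sample text copied from the Work Queue example above.
    sample = """\
    --- Work Queue ---
               : Recent : Average : Maximum
    -----------:--------:---------:--------
    Queue Size : 10     : 1       : 10
    % Busy     : 17     : 14      : 100
    """
    stats = parse_work_queue(sample)
    print("work queue:", stats)
    print("backlogged:", looks_backlogged(stats))

    # /proc/meminfo exists only on Linux, so skip the check elsewhere.
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            print("swap used (kB):", swap_used_kb(f.read()))
```

In practice, you would pass the captured output of bin/status to parse_work_queue in place of the sample text. A nonzero and growing result from swap_used_kb suggests that the system has begun swapping.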