We’ve been experiencing an ongoing performance issue with one of our applications, at roughly the same time each night, for a while. We’ve looked at all of the usual culprits: CPU, memory, disk queue length, and so on. We also moved antivirus scans to less critical times, but end-user satisfaction with the application during that window was still very poor.
We moved our search into the vSphere console to see if there was anything happening at the virtualized hardware level. Using the “Past 24 Hours” view of the performance tab, we were able to “see” the performance degradation: something was causing a significant increase in the number of read requests to disk. We checked again the following day and saw the exact same pattern. Now that we had a definitive time range, we began reviewing what was running at night during that period. What we found was one of our nightly interfaces for transferring data between our application and our ERP software.
Next we dove into what was causing the issue. After a bit of research, we found the suspect select statement. Further analysis showed that the table being selected from did not have the correct indexing configured. We configured the indexes and let the nightly cycle run again. Success!! Happy end users!
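For anyone who wants to see the effect for themselves, here is a minimal sketch of the same kind of check. It is not our actual interface: it uses SQLite (through Python's built-in sqlite3 module) purely as a stand-in database, and the table, column, and index names are made up for illustration. The idea is to ask the database how it plans to run the select statement, add an index on the filtered column, and ask again.

```python
import sqlite3

# Hypothetical schema: the table, column, and index names are illustrative
# only, and SQLite stands in for whatever database the interface really uses.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE interface_queue ("
    "id INTEGER PRIMARY KEY, status TEXT, payload TEXT)"
)
# Most rows are already 'sent'; every 100th row is still 'pending'.
conn.executemany(
    "INSERT INTO interface_queue (status, payload) VALUES (?, ?)",
    [("sent" if i % 100 else "pending", f"row {i}") for i in range(10_000)],
)

QUERY = "SELECT id, payload FROM interface_queue WHERE status = ?"


def query_plan(connection, sql, params):
    """Return the human-readable detail text from EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail string is the last column of each plan row.
    return [row[-1] for row in rows]


plan_before = query_plan(conn, QUERY, ("pending",))

# Add an index on the column used in the WHERE clause.
conn.execute(
    "CREATE INDEX idx_interface_queue_status ON interface_queue (status)"
)
plan_after = query_plan(conn, QUERY, ("pending",))

print("before index:", plan_before)
print("after index: ", plan_after)
```

In SQLite the first plan should report a scan of the whole table, and the second a search that uses the new index. Other databases expose the same information through their own tools, so the exact commands and wording will differ.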
This nightly issue had been getting progressively worse since the day the interface was implemented. When it first went live, there were very few rows in the table, so the need for indexing was minimal. But as time went on and the number of rows increased, the select statement became more and more costly in disk access.
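A rough back-of-the-envelope model shows why this creeps up on you. The sketch below is a simplification, not a measurement from our system: it assumes an unindexed select has to read every row, while an indexed lookup takes roughly log2(n) steps to find the key plus the rows that match. Real indexes are usually B-trees with a much wider fan-out, so the indexed figure here is, if anything, pessimistic. The function names and row counts are invented for illustration.

```python
import math


def rows_examined_full_scan(row_count):
    """Without a usable index, every row is read to evaluate the WHERE clause."""
    return row_count


def rows_examined_indexed(row_count, matches=1):
    """Simplified model: about log2(n) steps to locate the key, plus matches."""
    if row_count == 0:
        return 0
    return math.ceil(math.log2(row_count + 1)) + matches


# Illustrative table sizes, from "just went live" to "a few years later".
for n in (1_000, 100_000, 10_000_000):
    print(
        f"{n:>10} rows: full scan reads {rows_examined_full_scan(n):>10}, "
        f"indexed lookup reads about {rows_examined_indexed(n)}"
    )
```

The full scan grows in step with the table, while the indexed lookup barely moves. That matches what we saw: fine on day one, and a little worse every night after.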
So, lessons learned: something as simple as a small interface to transfer data from one application to another can have long-term implications.