Troubleshooting critical database bugs saves downtime and millions of dollars
Aker BP’s vision is to be the leading exploration and production company within oil and gas. They aim to achieve this by producing natural resources with low emissions and low costs, while creating jobs and value for both their owners and the broader Norwegian society.
Measured in production, they are one of the largest independent listed oil companies in Europe. They have big plans for the future, which include further growth and a desire to improve the oil and gas industry.
The Challenge
Aker BP was experiencing critical issues with its asset management system, which was running on an Oracle database. During heavy workloads the system would freeze, bringing business processes to a standstill. Multiple sessions would start blocking each other, making the asset management system unusable for hours.
Aker BP had engaged one of the larger system integrators (SIs) to manage its database stack, and the SI had been investigating the issue for many months. The only workaround the database team came up with was to kill database sessions one by one, hundreds of them, until almost all were gone and the application started to work again. This caused deep frustration among the business users, who lost their connections and had to redo some of their work. The issue was getting progressively worse as more users accessed the system and the amount of data grew. Something had to be done urgently before the problem triggered an offshore platform shutdown, which could have cost the company up to US$1 million for every hour of downtime.
Even the system supplier, who had written the code, could not identify the root cause. The supplier suggested that the client give the database more CPU and memory, but this had no impact.
The Solution
The IT manager contacted Deverg and asked for help. Deverg assigned one of its veteran troubleshooters, who has a remarkable track record of solving hard-to-diagnose problems with Oracle databases.
Our expert hit the ground running and started looking deeply into the databases, running traces and observing the problem over a short period of time. Within 48 hours he had discovered the root cause of the issue.
The problem originated in the application logic, which was putting a timestamp on some database records but was not releasing those records afterwards, causing a deadlock. Our specialist also provided the code fix that was needed. The application vendor was very happy with the fix, then tested and implemented it, which returned the system to normal operation.
The Result
Aker BP's processes were running smoothly once again. System integrity and confidence were restored across multiple business units. By quickly identifying and resolving the problem, our work not only saved our customer money but also prevented further downtime and considerable frustration.