
Application of Machine Learning to Petroleum Economics



Until recently, machine learning use cases in the Oil & Gas industry were mostly confined to deep learning techniques in seismic interpretation. With the volume of data generated by the industry growing exponentially, novel applications are now being discovered with increasing frequency. One such application in petroleum economics involves estimating future revenue from production sharing contracts (PSCs).

Since the signing of the first PSC by the Indonesian government in 1966, the use of PSCs as a type of resource extraction contract in Oil & Gas producing countries has become widespread, primarily because it allows host governments to retain sovereignty over their petroleum resources.

In the administration of a PSC, an oil producing company is engaged both as an investor and as a contractor to the host government, charged with developing oil properties specified within the contract. Where successful, the contractor is remunerated in kind, meaning in barrels of crude oil rather than in cash, with the remuneration covering both investment costs and profit earned.

Through a process known as oil lifting, both the contractor and host government ship their respective entitlement barrels of crude oil from an oil storage terminal according to a schedule agreed upon a few months in advance by both parties. Each party’s oil lifting entitlement is calculated through economic modeling of the PSC fiscal and commercial terms. 

Because oil liftings are determined by the cargo hold size of the lifting vessel, a party has often over-lifted or under-lifted crude oil relative to its calculated entitlement at a given financial year-end. Both parties generally understand this, and the imbalance is reconciled in subsequent lifting schedules.
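The over-lift or under-lift arithmetic can be illustrated with a minimal sketch. The scheduling rule used here (lift the whole number of cargoes closest to the entitlement) and all volumes are hypothetical assumptions for illustration, not terms taken from any actual PSC.

```python
def lifting_position(entitlement_bbl: int, cargo_size_bbl: int) -> dict:
    """Compare an entitlement with what whole cargoes allow a party to lift.

    Assumes a hypothetical rule: lift the whole number of cargoes
    closest to the calculated entitlement.
    """
    cargoes = round(entitlement_bbl / cargo_size_bbl)
    lifted_bbl = cargoes * cargo_size_bbl
    # Positive = over-lift, negative = under-lift; in practice the balance
    # would be reconciled in a subsequent lifting schedule.
    imbalance_bbl = lifted_bbl - entitlement_bbl
    return {
        "cargoes": cargoes,
        "lifted_bbl": lifted_bbl,
        "imbalance_bbl": imbalance_bbl,
    }


# Illustrative volumes only: 2.6 million bbl entitlement, 950,000 bbl cargoes.
position = lifting_position(entitlement_bbl=2_600_000, cargo_size_bbl=950_000)
print(position)
```

With these illustrative numbers the party lifts three cargoes and ends the year over-lifted, which is the kind of imbalance carried into the next schedule.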

Contractor annual cash flow key performance indicators are usually set against anticipated oil liftings for the coming year and not against calculated future entitlements. These anticipated oil liftings are sometimes based on heuristics, as actual oil production and oil price can differ from forecast. However, for a contractor with mature producing oil properties, there exists a wealth of oil lifting information that lends itself to data-driven insights.

In one such case, a machine learning algorithm implemented in Python using Scikit-Learn was applied to oil lifting data from a contractor to the Nigerian government to predict the contractor’s oil liftings in 2018. This entailed training a parsimonious model using ridge regression.

The model predicted that 72% of oil liftings would go to the contractor, compared with the realized figure of 70%. The heuristic prediction for the same year was 65%. Though the lower heuristic forecast could be seen as a way of managing expectations, it could also have the unintended consequence of the contractor’s management board redirecting capital to seemingly better performing oil projects.

Input variables in the model were the features most impactful in calculating relative oil lifting entitlements, namely oil production volume and oil price. Given that the contractor’s historical oil production has been directionally responsive to oil price over the 20-year sampling period, the two features exhibit a degree of collinearity, which was addressed by choosing ridge regression, a model whose regularization penalty is designed to handle this characteristic.
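A minimal sketch of this kind of setup with Scikit-Learn follows. The data is synthetic and generated only for illustration; the coefficients, units, the `alpha` value, and the scaling step are all assumptions and do not reproduce the contractor's data or the original model.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for 20 annual observations (NOT real lifting data).
n_years = 20
oil_price = rng.uniform(20.0, 110.0, size=n_years)                  # USD/bbl
production = 50.0 + 0.4 * oil_price + rng.normal(0, 5.0, n_years)   # kbbl/d, tracks price
# Assumed relationship: contractor lifting share (%) falls as price rises.
lifting_pct = (85.0 - 0.2 * oil_price + 0.05 * production
               + rng.normal(0, 1.5, n_years))

X = np.column_stack([production, oil_price])
y = lifting_pct

# Correlation between the two features shows the collinearity.
feature_corr = np.corrcoef(production, oil_price)[0, 1]

# Standardize features so the L2 penalty treats them on a common scale,
# then fit ridge regression; alpha=1.0 is an arbitrary illustrative choice.
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X, y)

# Predict the lifting share for a hypothetical planning-year scenario.
next_year = np.array([[80.0, 70.0]])  # [production, oil price]
predicted_pct = model.predict(next_year)[0]

print(f"feature correlation: {feature_corr:.2f}")
print(f"predicted contractor lifting share: {predicted_pct:.1f}%")
```

In practice the penalty strength would normally be chosen by cross-validation, for example with `RidgeCV`, rather than fixed by hand.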

The output (target variable) is the contractor’s oil lifting percentage, which determines the contractor’s annual cash flow. As is typical of the fiscal terms underpinning this type of Nigerian PSC, this oil lifting percentage decreases relative to that of the government as oil price increases, because more rent accrues to the government in the form of royalties and taxes.

The trained model attained a coefficient of determination (R²) of 0.8. Back testing the model on a select sample of observed data produced predictions falling within a 95% prediction interval.
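One way such a check could be sketched is shown below. The data is synthetic, the held-out split is arbitrary, and the interval is a rough normal approximation built from in-sample residuals; the article does not state how its interval was constructed, so this is an assumption rather than the method actually used.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic stand-in for 20 annual observations (NOT real lifting data).
n_years = 20
oil_price = rng.uniform(20.0, 110.0, size=n_years)
production = 50.0 + 0.4 * oil_price + rng.normal(0, 5.0, n_years)
lifting_pct = (85.0 - 0.2 * oil_price + 0.05 * production
               + rng.normal(0, 1.5, n_years))
X = np.column_stack([production, oil_price])

# Hold out the last four observations as the back-test sample.
X_train, y_train = X[:16], lifting_pct[:16]
X_test, y_test = X[16:], lifting_pct[16:]

model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X_train, y_train)

# Coefficient of determination on the training data.
train_pred = model.predict(X_train)
r2 = r2_score(y_train, train_pred)

# Approximate 95% band: prediction +/- 1.96 residual standard deviations.
# ddof accounts for the two slopes and the intercept.
resid_std = np.std(y_train - train_pred, ddof=X.shape[1] + 1)
test_pred = model.predict(X_test)
lower = test_pred - 1.96 * resid_std
upper = test_pred + 1.96 * resid_std

# Flag which held-out observations fall inside the approximate band.
inside = (y_test >= lower) & (y_test <= upper)

print(f"R^2 (train): {r2:.2f}")
print(f"held-out observations inside band: {inside.sum()} of {inside.size}")
```

With only around 20 annual observations, a band like this is itself uncertain, so it is best read as a sanity check rather than a precise coverage guarantee.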

This machine learning approach presents an alternative method for estimating oil liftings when planning future revenue. Provided features are identified rigorously, it can be applied for as long as the boundary conditions, namely the fiscal and commercial terms of the PSC, remain unchanged.