A real-world example of using machine learning in the power sector
Machine learning is a big buzzword within many business communities, and the Nordic power market is no exception. There seems to be a lot of talking and tweeting about machine learning and its potential, but the tweeting is sparser when it comes to practical real-world examples. Through this project update, I will try to provide a concrete example of how machine learning can be used within the power sector, and illustrate how we at Optimeering are currently working together with Statnett and academia to use machine learning to better handle imbalances in the Nordic power system.
Project background – the imbalance problem
Electric power is a perishable commodity – it must be produced the moment it is consumed. For all of us to have a stable and reliable supply of power, production and consumption must always be balanced with great precision.
Statnett has the balancing responsibility in Norway and, together with Svenska Kraftnät, for the Nordic synchronous area. Their starting point is a power system that is planned (by the invisible hand of the day-ahead and intraday markets) to be in balance, but since no one knows exactly how strong the wind will blow or when people will brew their morning coffee, imbalances occur. Their means of dealing with these imbalances are reserves. These can be divided into automatic reserves, which are activated instantaneously in the event of imbalances, and manual reserves, which Statnett activates manually. Because the volume of automatic reserves is limited, the quality and security of power supply depend on how well Statnett can counteract imbalances by activating manual reserves. This requires good short-term predictions of the upcoming imbalance.
Based on experience and internal forecasts, Statnett has a good understanding of the short-term imbalances in today’s system and how to handle them, but also suspects that there is significant potential in using the vast amount of available data to increase prediction accuracy. Furthermore, the power system, and thus the magnitude of imbalances, is changing at a rapid pace as market coupling and the share of variable power generation increase. Understanding how this will affect imbalances, and thus imbalance prediction, is a prerequisite for making use of renewables without degrading the quality and security of supply.
The project’s target model – developing a prototype machine learning model
In this Norwegian Research Council ENERGIX-funded project we are, in close collaboration with Statnett, the University of Liège and NTNU, developing a machine learning model that can predict short-term imbalances within the operating phase of the market, that is, after the market participants’ deadline for updating production and consumption plans has passed. The prediction is intended to be used by the parties responsible for balancing (Statnett and Svenska Kraftnät) as a decision support tool when deciding whether to activate manual reserves, either to free up in-use primary reserves or to prevent large area control errors. The whole Nordic synchronous area has the same frequency, and the aggregate imbalance for the synchronous area is therefore the main focus. However, as those responsible for balancing must also ensure that the manual measures taken are not hindered by bottlenecks in the grid, the imbalance at area level is also important.
Gathering data, pre-processing, feature engineering, selecting and testing a wide range of algorithms, and finding the appropriate way to minimize the cost/learning function are all part of the project “magic” that we are currently working on. Not all our thinking can be explained here, but the following aspects of the problem are worth mentioning:
Data and domain expertise
Whereas machine learning is most often categorized as a data-driven modelling technique, domain knowledge is highly important in this project. Several physical limitations are present and extensive prior knowledge is available. Using this domain knowledge will help the project overcome limitations in the training data and ease the feature selection. To what extent the final model will rely on domain knowledge or on “machine learnt” knowledge will also depend on the test results, as the final model, or ensemble of models, could range from regression-based methods that require extensive feature engineering to “deeper” methods, such as recurrent neural nets, that extract their own features.
Input variables part 1 – complex interdependencies
Gathering and pre-processing data is always a challenge in a machine learning project, but an additional challenge in this project is the complex interdependencies between the input variables. Regressions and other linear models treat input variables as independent of each other, so they do not utilize the information present in these interdependencies (unless such variables are engineered), and they are sensitive to multicollinearity. For this reason, interdependencies between the input variables must be sorted out in the pre-learning phase (again, unless relying purely on “deeper” methods). Furthermore, complex non-linear interdependencies also make statistical pre-processing procedures such as Principal Component Analysis less useful. Overall, handling these interdependencies is far from straightforward when we are working with variables such as temperature, wind speed, and hydrology at a 10- to 60-minute resolution at multiple geographical locations.
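As a small illustration of the multicollinearity issue, the sketch below flags pairs of input variables whose linear (Pearson) correlation is very high. It is not the project's actual pre-processing: the feature names, the synthetic data, and the 0.9 threshold are all assumptions made for the example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(features, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = sorted(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Synthetic data (assumption): two nearby wind sites that move together,
# plus a temperature series generated independently of both.
random.seed(0)
n = 500
wind_site_a = [random.gauss(7.0, 3.0) for _ in range(n)]
features = {
    "temperature": [random.gauss(5.0, 8.0) for _ in range(n)],
    "wind_site_a": wind_site_a,
    "wind_site_b": [w + random.gauss(0.0, 0.5) for w in wind_site_a],
}

flagged = correlated_pairs(features)
for a, b, r in flagged:
    print(f"{a} and {b} are strongly correlated (r = {r:.2f})")
```

A check like this only detects linear, pairwise dependence. The non-linear interdependencies mentioned above would pass through it unnoticed, which is part of why the problem is hard.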
Input variables part 2 – sequential and non-sequential information
Textbook examples of machine learning are arguably often based either on non-sequential variables used to classify a label (for example, predicting whether a tumour is malignant or benign based on a number of input variables) or on univariate time series used to predict the next “H” values based on trend, season, level and a regressive error term.
For the imbalance problem, the input variables include both non-sequential information and sequential patterns that are expected to improve the prediction accuracy. Furthermore, some of these sequential patterns are present in time series whose seasonalities have irregular calendar effects (such as Easter) that also need to be taken care of. Again, relevant approaches can range from “simple” time series models with regressors (for example ARIMAX or Facebook’s newly released Prophet) to “deeper” RNN methods known from speech recognition.
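To make the univariate textbook case concrete, here is a deliberately minimal sketch of forecasting the next H values from the seasonal component alone (a so-called seasonal naive baseline). It ignores trend, level and the error term, and the function name and toy numbers are assumptions for illustration only.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast `horizon` steps ahead by repeating the last full season."""
    if season_length <= 0:
        raise ValueError("season_length must be positive")
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    # Step h ahead reuses the value observed at the same position
    # in the most recent season.
    return [last_season[h % season_length] for h in range(horizon)]


# Toy series with a season of length 3 (assumption, not project data).
history = [10, 20, 30, 11, 21, 31]
forecast = seasonal_naive_forecast(history, season_length=3, horizon=5)
print(forecast)
```

A baseline of this kind is mainly useful as a yardstick: a more elaborate model should at least beat it.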
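One common way to handle an irregular calendar effect such as Easter in a model with regressors is to add an indicator variable that marks the affected days. The sketch below computes Easter Sunday with the anonymous Gregorian algorithm and builds such a flag. The window from Maundy Thursday to Easter Monday and all function names are assumptions for illustration, not the project's feature set.

```python
from datetime import date, timedelta


def easter_sunday(year):
    """Date of Easter Sunday (Gregorian calendar, anonymous algorithm)."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day_index = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day_index + 1)


def easter_flags(days, days_before=3, days_after=1):
    """Return 1 for days inside the Easter window, otherwise 0.

    The default window runs from Maundy Thursday (3 days before
    Easter Sunday) to Easter Monday (1 day after); this choice is
    an assumption.
    """
    flags = []
    for day in days:
        easter = easter_sunday(day.year)
        start = easter - timedelta(days=days_before)
        end = easter + timedelta(days=days_after)
        flags.append(1 if start <= day <= end else 0)
    return flags


# Days around Easter 2017 (Easter Sunday fell on 16 April 2017).
days = [date(2017, 4, 12) + timedelta(days=n) for n in range(7)]
for day, flag in zip(days, easter_flags(days)):
    print(day.isoformat(), flag)
```

The resulting 0/1 column can be passed as an external regressor to a time series model, or as an ordinary input feature to other methods.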
Interested in learning more?
If you would like to learn more about the project, make suggestions, or hear how we use machine learning in other projects, please do not hesitate to contact project leader Karan Kathuria.