Drop shipping and optimisation – how to save retail?

Drop shipping – where retailers don’t stock a product, but rather obtain and ship it directly from the manufacturer only once the customer places an order – can be a boon for on- and offline retailers and manufacturers alike. A recent article suggests that the profitability of manufacturers that use drop shipping is over 18% higher than that of manufacturers who do not. This suggests there is money “on the table” in order fulfilment, and that new approaches to shipping can help get that money into the pockets of both retailers and manufacturers.

Unfortunately, it’s not as simple as saying “yep, let’s use drop shipping”. For one thing, customer service needs to be maintained – not many customers are going to accept long shipping delays, for example, just to increase supplier profitability. Drop shipping needs to deliver a result as good as the alternatives. This comes down essentially to logistics, modelling, machine learning and optimisation – anticipating demand, preparing production and shipping, analysing and predicting customer service requirements, and executing a production and shipping process that delivers fast but at minimum cost. This is where Optimeering can help – drop us a line if that 18% sounds interesting to you…

It also opens up some very interesting new business models. Large manufacturers or wholesalers take on the role of production, storage (if any) and order fulfilment, while (online and offline) retailers focus on their specific customer concept – whether that is built around price, service, products offered or something else entirely. This is already beginning – online supermarkets are moving, and will have to move, in this direction rather than each of them replicating the fulfilment job. And, of course, good algorithms (like those made by Optimeering!) are essential to make it work, and work well.

Speeding up Mosel code – the sum operator

For those of you who don’t know, Mosel is the programming and modelling language that ships with FICO’s Xpress math programming solver. And if you don’t already know what it is, the rest of this blog post won’t be of too much interest – I would advise a cup of coffee, a stretch and a good eye-relaxing look out the window instead. But, for those of you that do use Mosel, keep reading.

At Optimeering, we don’t actually use Mosel much any more – pretty much all our modelling work is now done in python, including all our math programming modelling. However, when we were using Mosel, we noticed a few issues with speed – in particular, how you use the conditional operator has a big impact on execution speed. We first really experienced this in the early prototyping of the PUMA model – the model’s datasets can get pretty big, and issues like this were critical to good model performance.

So, last year we published a working paper within the PUMA project, with code, that explains things in detail and makes some recommendations regarding how you should use the conditional operator. I thought it was about time I made it available here – download and disseminate away!

Data Science Cheat Sheets

We are big fans of the team at DataCamp and the python and data science resources they’ve put together. Their Cheat Sheets are really useful – especially for people like myself who forget the correct syntax all the time. No matter whether you are new to python and data modelling, or have heaps of experience and are a python data science animal (if so, drop us a line…), check them out.

Quota Market Modelling

Quota markets are increasingly used to ration power market and environmental goods, including capacity, emissions permits and renewable subsidies. The best-known and most international of these markets is the EU ETS; however, there are many more, including the Swedish-Norwegian El-Cert market and the increasing number of capacity auctions and certificate markets popping up in Europe and elsewhere.

One characteristic of these markets is that the good being traded is “artificial” – demand for it has typically been legislated by government. This means that demand is defined by rules, not by want or need, and so these markets behave quite differently to “normal” markets. Quota markets are often either long (there are more quotas or certificates than the legislated demand requires) or short (there are not enough). If the market is long, the price is zero. If it is short, the price is set by the legislated penalty for not holding enough quotas. Prices in between reflect uncertainty – you don’t know whether the market will end up long or short. This dynamic produces behaviour like big price swings, or – often – price spikes followed by collapses as the market starts to look long, followed in turn by rule changes to support price levels.
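
To make that price dynamic concrete, here is a minimal, purely illustrative sketch in python – the penalty level and probabilities are invented, not taken from any real market:

```python
# Illustrative only: expected quota price as a function of the probability that
# the market ends up short. The penalty level and probabilities are made up.
penalty = 40.0  # hypothetical legislated penalty per missing certificate

for p_short in (0.0, 0.25, 0.5, 0.75, 1.0):
    # Long market -> price ~ 0; short market -> price set by the penalty.
    # In between, a risk-neutral expected price is simply probability x penalty.
    expected_price = p_short * penalty
    print(f"P(short) = {p_short:.2f}  ->  expected price = {expected_price:5.1f}")
```

As the perceived probability of a shortfall moves around, the price swings between zero and the penalty – which is exactly the behaviour described above.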

This means that such markets cannot be analysed effectively with traditional economic models based on the idea of equilibrium. Quota markets are not in equilibrium, so they should not be analysed as if they were.

Instead, at Optimeering we attack this problem with intelligent agent-based models that combine AI techniques with modelling of actual market actors to simulate market behaviour under realistic conditions. Together with Thema Consulting Group, we developed the MARC model for the Swedish-Norwegian El-Cert market, which has been used by a range of developers, regulators and operators to better understand and predict future market outturns. We have a blog post about the MARC model here (in Norwegian). To learn more about how we can use agent-based modelling to help you better understand the quota markets that drive your bottom line, contact us here.
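
To give a feel for the agent-based skeleton (actors with beliefs, a decision rule, and adaptive updating through a market price), here is a toy sketch in python – it bears no resemblance to the full MARC model, and every number in it is invented:

```python
import random

# Toy agent-based certificate market: each actor holds a subjective belief
# about whether the market ends up short, quotes a price accordingly, and
# updates its belief from a noisy public signal and the observed market price.
PENALTY = 40.0        # hypothetical non-compliance penalty
TRUE_P_SHORT = 0.7    # "true" probability that the market ends up short

class Actor:
    def __init__(self):
        self.belief = random.uniform(0.2, 0.8)   # subjective P(short)

    def quote(self):
        # Risk-neutral willingness to pay for one certificate.
        return self.belief * PENALTY

    def observe(self, signal, price):
        # Blend the old belief, a noisy public signal, and the belief implied
        # by the market price; clamp to a valid probability.
        implied = price / PENALTY
        updated = 0.6 * self.belief + 0.3 * signal + 0.1 * implied
        self.belief = min(1.0, max(0.0, updated))

random.seed(1)
actors = [Actor() for _ in range(100)]
price = 0.0
for period in range(30):
    quotes = sorted(a.quote() for a in actors)
    price = quotes[len(quotes) // 2]              # median quote as the clearing price
    signal = TRUE_P_SHORT + random.gauss(0, 0.2)  # noisy news about the market balance
    for a in actors:
        a.observe(signal, price)

print(f"Simulated certificate price after 30 periods: {price:.1f}")
```

The real models are of course far richer – actual actor types, compliance deadlines, banking, rule changes and learning – but the structure is the same: simulate the actors, not an assumed equilibrium.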

Market Clearing – aFRR

New markets for reserve services and near-real-time power delivery require new, sophisticated tools for calculating optimal market clearing. In theory, clearing a market is simple – you just need to match supply and demand. In practice, though, complex bid and offer structures, as well as integration between reserve services, often mean that things are not quite so straightforward. Getting it right is very important, however – clearing and pricing new markets reliably, transparently and quickly is essential for efficient market operation.

Optimeering has a unique combination of modelling expertise and market know-how that positions us to deliver robust, customised market clearing tools for any type of power market. Recently, we have developed the clearing algorithm for the upcoming pan-Nordic aFRR market for the TSO consortium of SVK, Statnett, Energinet.dk and Fingrid.

The four Nordic TSOs are planning to implement a common Nordic capacity market for aFRR (automatic frequency restoration reserves) in 2018, and our part of this work is developing and implementing the market clearing engine used to select the combination of bids that is most efficient (“maximises social welfare surplus”). Given that the bidding rules are quite complex (bids can be linked upwards, downwards and in time, marked as indivisible, and be asymmetrical, to mention a few) and the requirement to ensure that the market operates in a socio-economically efficient manner, the problem is not easy.

The problem itself is what we call a combinatorial optimization problem or (mixed) integer program. The linking and non-divisibility of bids is a critical and fundamental characteristic of the bid selection problem that means traditional clearing methods based on hourly bid price alone are insufficient. Given the size of the problem (hourly bids in multiple bidding zones), checking every single possible combination of bids is also not a feasible approach. Instead, a solution algorithm is needed to select bids for the aFRR market that accounts for the complex bid structures via the use of advanced mathematical optimization techniques. We have come up with some clever formulation approaches (if we do say so ourselves …) that take advantage of structure in the problem to clear the market optimally and very quickly (here we mean seconds, not minutes or hours). Our approach also works for several alternative pricing mechanisms, including pay-as-bid and marginal-cost pricing.
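
To give a flavour of what such a formulation looks like, here is a deliberately tiny toy built with the open-source PuLP package – it is not our clearing engine, and all bids, volumes and prices are invented – showing bid selection with indivisible and linked bids:

```python
# Toy aFRR-style bid selection as a mixed integer program: indivisible capacity
# bids, one "linked" pair, and a minimum procurement requirement. Numbers are
# made up; the real problem has many zones, periods and linking rules.
import pulp

# bid_id: (MW offered, price per MW)
bids = {"A": (50, 12.0), "B": (30, 9.0), "C": (40, 15.0), "D": (20, 8.0)}
REQUIRED_MW = 90

prob = pulp.LpProblem("afrr_toy_clearing", pulp.LpMinimize)
accept = {b: pulp.LpVariable(f"accept_{b}", cat="Binary") for b in bids}

# Objective: minimise total procurement cost (a proxy for maximising welfare
# when the demand side is a fixed requirement).
prob += pulp.lpSum(accept[b] * mw * price for b, (mw, price) in bids.items())

# Cover the capacity requirement.
prob += pulp.lpSum(accept[b] * mw for b, (mw, _) in bids.items()) >= REQUIRED_MW

# Example of a linked bid: "B" may only be accepted if "A" is accepted.
prob += accept["B"] <= accept["A"]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
selected = [b for b in bids if accept[b].value() > 0.5]
print("Accepted bids:", selected, "total cost:", pulp.value(prob.objective))
```

Even in this four-bid toy, the cheapest bids per MW are not necessarily the ones selected once linking and indivisibility are respected – which is precisely why clearing on hourly bid price alone falls short.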

If you are interested in learning more about the aFRR market, or how we can help in other markets, please contact us.

The PUMA Algorithms

We are getting close to a Beta release of the first member of the PUMA Algorithms family, and I wanted to update everyone on what to expect and on progress so far.

The PUMA Algorithms are advanced models for the analysis and prognosis of power markets that take detailed account of future uncertainty in inputs such as hydro inflow, demand, fuel and emissions prices. The first PUMA Algorithm, PUMA SP, is being developed as part of the PUMA Research Project, sponsored by the Research Council of Norway and paid for by several major actors in the Nordic power market.

PUMA SP is a fundamental, medium-term model that captures the impact of multiple uncertainty drivers such as inflows, availabilities, demand, fuel and CO2 prices. In modelling something as complex as a power market, you have to make trade-offs. Speed versus detail. Detailed modelling of hydro versus detailed modelling of CHP. Uncertainty versus perfect foresight. With PUMA SP we have taken the view that the user is best positioned to make these trade-offs, as they can change from analysis to analysis.

PUMA SP is therefore designed with flexibility and ease-of-use in mind, and can be configured to run at whatever level of detail you need. Want to run deterministically? Just one parameter. Uncertain inflows and fuel prices? The same. Add in uncertain demand? No problem. One nice thing – once you have one uncertain parameter, adding new ones does not add much solve overhead. So, if you need “quick-and-dirty”, that’s what you can have, whilst being able to use the same model and data later on to refine and model a fully detailed market response.

All PUMA Algorithms are fully integrated with the PUMA Analytic Framework, and are provided as python packages with a browser-based front-end. PUMA SP is currently in locked alpha testing with our development license group, but we are planning for it to be available to new customers in Beta at the end of 2017. Drop us a line to find out more.

PUMA Analytic Framework

Most of the models we build use quite a lot of data, especially time series or time-stamped data. However, few of them involve anything approaching petabytes. A data lake, not a data ocean. Handling this can be tricky – spreadsheets are much too limited, and relational databases are generally too slow to read and (especially) write even moderate volumes of time series data. At the other end of the scale, large no-sql data solutions are too complex and unwieldy – like using a sledgehammer to crack a nut.

We needed something that could read and write time series and time-stamped data quickly; could classify, group and tag series (“inflow series”, “cases” or “scenarios”); could extract series even with missing or incomplete data; and had an overhead low enough that it could be installed on a desktop and accessed via the python tools we use every day.

We couldn’t find it, so we decided to build our own – the PUMA Analytic Framework, for the flexible and rapid storage, retrieval, visualisation and manipulation of time series and time-stamped data. Implemented as a python package, the Framework uses a combination of a relational database and fast flat file storage and retrieval to let you easily and cheaply store pretty large data volumes on the desktop and use them in your data analysis and models. And because it is a python package, you have immediate access to all your usual python tools, such as numpy, scipy, pandas and Jupyter Notebooks.
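
The sketch below is a generic illustration of that pattern – catalogue metadata in a small relational database, series values in fast flat files – and is not the Framework’s actual API; all names, tags and paths are made up:

```python
# Generic "catalogue + flat files" pattern for time series storage:
# series metadata (name, tag, file path) in SQLite, values in parquet files.
import sqlite3
from pathlib import Path

import pandas as pd

DATA_DIR = Path("ts_store")
DATA_DIR.mkdir(exist_ok=True)
db = sqlite3.connect(str(DATA_DIR / "catalogue.db"))
db.execute("CREATE TABLE IF NOT EXISTS series (name TEXT PRIMARY KEY, tag TEXT, path TEXT)")

def write_series(name: str, tag: str, series: pd.Series) -> None:
    """Register the series in the catalogue and dump its values to a flat file."""
    path = DATA_DIR / f"{name}.parquet"   # parquet needs pyarrow or fastparquet installed
    series.to_frame("value").to_parquet(path)
    db.execute("INSERT OR REPLACE INTO series VALUES (?, ?, ?)", (name, tag, str(path)))
    db.commit()

def read_by_tag(tag: str) -> dict:
    """Fetch every series registered with a given tag, e.g. 'inflow'."""
    rows = db.execute("SELECT name, path FROM series WHERE tag = ?", (tag,)).fetchall()
    return {name: pd.read_parquet(path)["value"] for name, path in rows}

# Usage: store an hourly inflow series and pull back everything tagged 'inflow'.
idx = pd.date_range("2017-01-01", periods=24, freq="H")
write_series("NO2_inflow", "inflow", pd.Series(range(24), index=idx, dtype=float))
print(read_by_tag("inflow")["NO2_inflow"].head())
```

The Framework itself adds much more (versioning, cases and scenarios, handling of missing data, visualisation), but the basic design choice is the same: keep the metadata relational and the bulk values in fast, simple files.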

The Framework makes developing new models and analytic tools easier and faster, by providing data in a readily accessible, standardised format. We use the Framework to store and prepare data for our own PUMA Algorithms power market models, and it can be integrated in the same way with your existing and third-party power market modelling tools. This is one of the big advantages of the PUMA Analytic Framework in our opinion – it enables you to have a single, consistent database for all your models.

The PUMA Analytic Framework is in locked alpha testing and will be made available in Beta shortly, after which you will be able to download the package and documentation from our website for free. Email us in the meantime if you are interested in finding out more.

Predicting Power System Imbalances with Machine Learning

A real-world example of using machine learning in the power sector

Machine learning is a big buzzword in many business communities, and the Nordic power market is no exception. There is a lot of talking and tweeting about machine learning and its potential, but the tweeting gets sparser when it comes to practical, real-world examples. In this project update, I will try to provide a concrete example of how machine learning can be used in the power sector, and illustrate how we at Optimeering are currently working with Statnett and academia to use machine learning to better handle imbalances in the Nordic power system.

Project background – the imbalance problem

Electric power is a perishable commodity – it must be produced the moment it is consumed. For all of us to have a stable and reliable supply of power, production and consumption must always be balanced with great precision.

Statnett has the balancing responsibility in Norway, and together with Svenska Kraftnät it also has the balancing responsibility for the Nordic synchronous area. Their starting point is a power system that is planned (by the invisible hand of the day-ahead and intraday markets) to be in balance, but since no one really knows exactly how strong the wind will blow or when people brew their morning coffee, imbalances occur. Their means of dealing with these imbalances are reserves. These reserves can be divided into automatic reserves, which are activated instantaneously in the event of imbalances, and manual reserves, which Statnett activates manually.

The volume of automatic reserves is limited, so the quality and security of power supply depend on how well Statnett can counteract imbalances using manual reserve activation. This requires good short-term predictions of the upcoming imbalance. Based on experience and internal prognoses, Statnett has a good understanding of the short-term imbalances in today’s system and how to handle them, but it also suspects that there is significant potential in using the vast amount of available data to increase prediction accuracy. Furthermore, the power system, and thus the magnitude of imbalances, is changing at a rapid pace as market coupling and the share of variable power generation increase. Understanding how this will affect imbalances, and thus imbalance prediction, is a prerequisite for making use of renewables without degrading the quality and security of supply.

The project’s target model – developing a prototype machine learning model

In this Norwegian Research Council ENERGIX-funded project, we are – in close collaboration with Statnett, the University of Liège and NTNU – developing a machine learning model that can predict short-term imbalances within the operating phase of the market, that is, after the market participants’ deadline for updating production and consumption plans has passed. The prediction is intended to be used by the balancing responsible – Statnett and Svenska Kraftnät – as a decision support tool when deciding whether to activate manual reserves, either to free up in-use primary reserves or to prevent large area control errors. The whole Nordic synchronous area has the same frequency, and the aggregate imbalance for the synchronous area is therefore the main focus. However, as the balancing responsible must also ensure that manual measures are not hindered by bottlenecks in the grid, imbalance at an area level is also of importance.

Model development

Gathering data, pre-processing, feature engineering, selecting and testing a wide range of algorithms, and finding the appropriate way to minimize the cost/learning function are all part of the project “magic” that we are currently working on. Not all our thinking can be explained here, but the following aspects of the problem are worth mentioning:

Data and domain expertise

While machine learning is most often categorized as a data-driven modelling technique, domain knowledge is highly important in this project. Several physical limitations are present, and there is extensive prior knowledge available. Using this domain knowledge will help the project overcome limitations in the training data and ease feature selection. The extent to which the final model relies on domain knowledge versus “machine-learnt” knowledge will also depend on the test results, as the final model, or ensemble of models, could range from regression-based methods requiring extensive feature engineering to “deeper” methods such as recurrent neural nets that extract their own features.

Input variables part 1 – complex interdependencies

Gathering and pre-processing data is always a challenge in a machine learning project, but an additional challenge in this project is the complex interdependencies between the input variables. Regressions and other linear models treat input variables as independent of each other and thus do not utilize the information present in these interdependencies (unless such variables are engineered), and are sensitive to multicollinearity. For this reason, interdependencies between the input variables must be sorted out in the pre-learning phase (again, unless purely relying on “deeper” methods). Furthermore, complex non-linear interdependencies also make statistical pre-processing procedures such as Principal Component Analysis less useful. Overall, handling these complex interdependencies is far from straightforward when we are working with variables such as temperature, wind speed, and hydrology with a 10 to 60-minute resolution at multiple geographical locations.
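
As a small, synthetic illustration of the issue (not our project data or features), the sketch below engineers an interaction term and uses variance inflation factors to flag inputs that carry overlapping information:

```python
# Synthetic example: correlated weather-type inputs, an engineered interaction
# term, and variance inflation factors (VIF) to flag multicollinearity.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 500
temperature = rng.normal(5, 8, n)
# Wind speed and inflow loosely tied to temperature, as weather systems would imply.
wind_speed = 6 + 0.3 * temperature + rng.normal(0, 2, n)
inflow = 40 - 1.5 * temperature + rng.normal(0, 5, n)

X = pd.DataFrame({"temperature": temperature, "wind_speed": wind_speed, "inflow": inflow})
X["temp_x_wind"] = X["temperature"] * X["wind_speed"]   # engineered interaction feature

vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vif.round(1))   # large values signal variables carrying overlapping information
```

In the real project the inputs number in the hundreds, at 10 to 60-minute resolution and multiple locations, which is exactly why this kind of screening and engineering cannot be skipped unless a “deeper” model is trusted to learn the interdependencies itself.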

Input variables part 2 – sequential and non-sequential information

Textbook examples of machine learning are arguably often based either on non-sequential variables used to classify a label – e.g. predicting whether a tumour is malignant or benign based on a number of input variables – or on a univariate time series used to predict the next H values based on trend, season, level and an autoregressive error term.

For the imbalance problem, the input variables include both non-sequential information and sequential patterns that are expected to improve prediction accuracy. Furthermore, some of these sequential patterns are present in time series with seasonalities and irregular calendar effects (such as Easter) that also need to be taken care of. Again, relevant approaches range from “simple” time series models with regressors (for example ARIMAX or Facebook’s newly released Prophet) to “deeper” RNN methods known from speech recognition.
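
Purely to illustrate the “simple” end of that spectrum (with synthetic data, not our project data), here is a sketch using Prophet with an explicit, irregular Easter holiday effect:

```python
# Minimal Prophet sketch: an hourly series with a crude daily pattern, plus an
# Easter holiday whose dates shift from year to year. Data is entirely synthetic.
import pandas as pd
from fbprophet import Prophet   # newer releases ship as the 'prophet' package

# Hourly "imbalance" history in Prophet's expected ds/y layout.
history = pd.DataFrame({
    "ds": pd.date_range("2016-01-01", periods=24 * 365, freq="H"),
})
history["y"] = (history["ds"].dt.hour - 12).abs() * 10.0   # made-up daily shape

# Irregular calendar effect: Easter does not fall on fixed calendar days.
easter = pd.DataFrame({
    "holiday": "easter",
    "ds": pd.to_datetime(["2016-03-27", "2017-04-16"]),
    "lower_window": -3,
    "upper_window": 1,
})

model = Prophet(holidays=easter)
model.fit(history)
future = model.make_future_dataframe(periods=48, freq="H")
forecast = model.predict(future)
print(forecast[["ds", "yhat"]].tail())
```

Whether something this simple, an ARIMAX-style model with external regressors, or a recurrent neural net ends up performing best is exactly what the testing programme is meant to settle.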

Interested in learning more?

If you are interested in learning more about the project, making suggestions, or hearing how we use machine learning in other projects, please do not hesitate to contact project leader Karan Kathuria.