Thursday, 25 November 2010

Analysing Betting Exchange markets with Complex Event Processing, Esper and Scala

Recently, I posted on analysing betting exchange markets with the betting market simulator and time line charts (read more...). At that time, I demonstrated charts of runner price, which have little analytical value beyond being a good example to introduce the charting feature of the betting market simulator. Today, I present how to analyse betting exchange markets with the Esper complex event processing tool and Scala. This should be more useful in practice than studying simple metrics such as runner price or traded volume.

Complex event processing is a broad concept. I use it for analysing different types of market data such as price, probability, traded volume and risk metrics. It allows me to combine all those metrics and prepare custom analyses quickly and with minimal programming effort. I have found it useful both for real-time event processing and for historical analysis. There are many complex event processing tools on the market, both commercial and open source. The one I use is Esper, an open source tool that makes complex event processing simple in the Java programming language.

One of the core features of Esper is the ability to create sliding windows and then, on top of them, to calculate metrics such as avg, min and max, or to recognise patterns such as a correlation between runner price and runner traded volume. Sliding windows may also be used to calculate derivatives, which is the example I present in this post: I demonstrate how to calculate the first derivative of runner traded volume with respect to time, and then the second derivative. I find these metrics useful for analysing the correlation between a runner's implied probability and its traded volume in betting markets.

Below I present three charts for a horse racing market during the 10 minutes before the market is turned in play. The first chart shows the implied probability (1/runner price) for all market runners; no complex event processing is required for that. The second chart presents the first derivative of traded volume with respect to time for all market runners, whereas the last one displays the second derivative of runner traded volume.

To create a chart with implied probabilities for all market runners, we iterate through the market runners, calculate each runner's implied probability and add it to the chart values that are used to generate the time line chart once the simulation with the Market Simulator tool has finished:

-------------------------------------------------------------------------------------------------------------
// For all market runners add implied probability to time line chart
for(runnerId <- ctx.runners.map(_.runnerId)) {

  //get best back and lay prices for market runner
  val bestPrices = ctx.getBestPrices(runnerId)
  
  // calculate average price between best back and lay prices
  val avgPrice = PriceUtil.avgPrice(bestPrices._1.price -> bestPrices._2.price)

  // add implied probability (1 / average price) to chart values
  ctx.addChartValue(runnerId.toString, 1/avgPrice)
}
-------------------------------------------------------------------------------------------------------------

Figure 1 Implied probability for all market runners, y_axis - implied probability, x_axis - time
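
As a quick worked illustration of the conversion used above, here is a minimal plain-Scala sketch. It assumes decimal prices and that the average price is the plain midpoint of the best back and lay prices; PriceUtil.avgPrice in the simulator may compute it differently. The object ImpliedProbability and its method names are hypothetical and not part of the simulator.

```scala
object ImpliedProbability {

  // Assumption: the "average price" is the arithmetic midpoint of the
  // best back and best lay decimal prices.
  def midPrice(bestBack: Double, bestLay: Double): Double =
    (bestBack + bestLay) / 2

  // Implied probability is the reciprocal of the decimal price.
  def fromPrice(price: Double): Double = {
    require(price > 0, "price must be positive")
    1.0 / price
  }
}
```

For example, a runner with a best back price of 3.9 and a best lay price of 4.1 has a midpoint price of about 4.0, which corresponds to an implied probability of about 0.25.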

To display the derivative of runner traded volume with respect to time, we use the following formula:

y(v1,v0,t1,t0) = (v1 - v0) / (t1 - t0) where
  • y - derivative of runner traded volume with respect to time
  • v1 - runner traded volume at time t1
  • v0 - runner traded volume at time t0
  • t1 - end of the time interval; in this example t1 - t0 = 120 sec
  • t0 - start of the time interval

A sliding window is the way to calculate this function for every second during the last 10 minutes before the horse racing market is turned in play. With a 120 sec sliding window evaluated every second, we can obtain the first and the last values of traded volume in that window. The last value is the traded volume at time t1, whereas the first value is the traded volume at time t0, which is t1 - 120 sec. Making all that work requires just a bit of Esper declarative configuration.
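
To make the windowed calculation concrete before wiring it into Esper, here is a minimal plain-Scala sketch of the same formula. It is only an illustration, not how Esper implements time windows: Sample, WindowedDerivative and derivatives are hypothetical names, samples are assumed to be sorted by time, and a sample that is exactly windowSec old is assumed to still be inside the window.

```scala
object WindowedDerivative {

  // One observation: timestamp in seconds and cumulative traded volume.
  final case class Sample(timeSec: Long, volume: Double)

  // For each sample, take the oldest sample that is at most windowSec older
  // and compute (v1 - v0) / (t1 - t0) between the two.
  // The value is None while the window spans zero time (e.g. the first sample).
  // Assumption: samples are sorted by timeSec in ascending order.
  def derivatives(samples: Seq[Sample], windowSec: Long): Seq[(Long, Option[Double])] =
    samples.indices.map { i =>
      val last = samples(i)
      // The newest sample always qualifies, so find never fails here.
      val first = samples.take(i + 1).find(s => last.timeSec - s.timeSec <= windowSec).get
      val dt = last.timeSec - first.timeSec
      val value = if (dt == 0) None else Some((last.volume - first.volume) / dt)
      (last.timeSec, value)
    }
}
```

With samples at 0, 60, 120 and 180 seconds carrying volumes 0, 600, 1800 and 1800, and a 120 sec window, the derivative is undefined for the first sample and then 10, 15 and 10 units of volume per second.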

First of all, we need to send a timestamped event to the Esper engine with the traded volume for a market runner:

-------------------------------------------------------------------------------------------------------------
//Create market event as a map with three attributes: runnerId, traded volume and timestamp
val event = Map(
  "runnerId" -> runnerId,
  "tradedVolume" -> ctx.getRunnerTradedVolume(runnerId).totalTradedVolume,
  "timestamp" -> ctx.getEventTimestamp/1000
)

//Send event to Esper
epService.getEPRuntime().sendEvent(event, "TradedVolumeEvent")
-------------------------------------------------------------------------------------------------------------

Then we create an SQL-like query to calculate the derivative value and register it with the Esper engine. One thing worth mentioning is the last part of this query, 'group by runnerId'. Events have to be grouped by runner id because events for many market runners are processed, and each runner's derivative must be calculated from that runner's events only.

-------------------------------------------------------------------------------------------------------------
val expression = "select runnerId, (last(tradedVolume)-first(tradedVolume)) / (last(timestamp)-first(timestamp)) as delta from TradedVolumeEvent.win:time(120 sec) group by runnerId"

val statement = epService.getEPAdministrator().createEPL(expression)
-------------------------------------------------------------------------------------------------------------
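
To illustrate what the grouping achieves, here is a minimal plain-Scala sketch of the value the query produces for the events currently inside one window. It is an approximation for illustration only: the case class mirrors the three event attributes, GroupedDelta and deltaByRunner are hypothetical names, and runners whose events span zero time are simply skipped, which is an assumption rather than Esper's behaviour.

```scala
object GroupedDelta {

  // Mirrors the three attributes of the event sent to Esper.
  final case class TradedVolumeEvent(runnerId: Long, tradedVolume: Double, timestamp: Long)

  // For the events inside one window (in arrival order), compute
  // (last(tradedVolume) - first(tradedVolume)) / (last(timestamp) - first(timestamp))
  // separately for every runner.
  // Assumption: runners whose events span zero time are left out of the result.
  def deltaByRunner(window: Seq[TradedVolumeEvent]): Map[Long, Double] =
    window.groupBy(_.runnerId).collect {
      case (runnerId, events) if events.last.timestamp != events.head.timestamp =>
        val first = events.head
        val last = events.last
        val delta = (last.tradedVolume - first.tradedVolume) / (last.timestamp - first.timestamp)
        (runnerId, delta)
    }
}
```

Without the grouping, the first and last events in the window could belong to different runners, and the difference between their traded volumes would be meaningless.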

The value of this query is recalculated every time a new event is sent to the Esper engine. A simple listener updates the time line chart with the derivative value:

-------------------------------------------------------------------------------------------------------------
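import com.espertech.esper.client.{EventBean, UpdateListener}

statement.addListener(new UpdateListener {
  def update(newEvents: Array[EventBean], oldEvents: Array[EventBean]): Unit = {
    // Read the runner id and the derivative value from the first new event
    val runnerId = newEvents(0).get("runnerId")
    val delta = newEvents(0).get("delta").asInstanceOf[Double]
    // Add the derivative to the runner's "tv" series on the time line chart
    ctx.addChartValue("tv" + runnerId, delta)
  }
})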
-------------------------------------------------------------------------------------------------------------

The Scala class that calculates the derivative of traded volume is here.

Figure 2 Derivative of traded volume for all market runners with respect to time

Calculating the second derivative of runner traded volume requires two phases. First, the first derivative of traded volume is calculated and inserted back into the Esper engine as a new event:

-------------------------------------------------------------------------------------------------------------
insert into TradedVolumeDeltaEvent select runnerId,timestamp,(last(tradedVolume)-first(tradedVolume))/(last(timestamp)-first(timestamp)) as tradedVolumeDelta from TradedVolumeEvent.win:time(120 sec) group by runnerId
-------------------------------------------------------------------------------------------------------------

Then the second derivative is calculated from those events using an SQL-like query:

-------------------------------------------------------------------------------------------------------------
select runnerId,(last(tradedVolumeDelta)-first(tradedVolumeDelta))/(last(timestamp)-first(timestamp)) as tradedVolumeDeltaPrim from TradedVolumeDeltaEvent.win:time(120 sec) group by runnerId
-------------------------------------------------------------------------------------------------------------

The Scala class that calculates the second derivative of traded volume is here.

Figure 3 Second derivative of traded volume for all market runners with respect to time.
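
The two-phase idea can also be sketched in plain Scala by applying the same windowed difference quotient twice. This is a hedged illustration, not the Esper pipeline itself: SecondDerivative, Point and windowedSlope are hypothetical names, points are assumed to be sorted by time, a point exactly windowSec old is assumed to be inside the window, and points whose window spans zero time are skipped.

```scala
object SecondDerivative {

  // (timestamp in seconds, value)
  type Point = (Long, Double)

  // For each point, compute (v1 - v0) / (t1 - t0) against the oldest point
  // that is at most windowSec older. Points with a zero time span are skipped.
  // Assumption: points are sorted by timestamp in ascending order.
  def windowedSlope(points: Seq[Point], windowSec: Long): Seq[Point] =
    points.indices.flatMap { i =>
      val (t1, v1) = points(i)
      // The newest point always qualifies, so find never fails here.
      val (t0, v0) = points.take(i + 1).find { case (t, _) => t1 - t <= windowSec }.get
      if (t1 == t0) None else Some((t1, (v1 - v0) / (t1 - t0)))
    }

  // Phase one produces the first derivative series,
  // phase two applies the same calculation to that series.
  def secondDerivative(volume: Seq[Point], windowSec: Long): Seq[Point] =
    windowedSlope(windowedSlope(volume, windowSec), windowSec)
}
```

For a traded volume that grows as t squared (0, 3600, 14400, 32400, 57600 at 0, 60, 120, 180, 240 seconds) and a 120 sec window, the sketch yields 1.0, 1.5 and 2.0 at 120, 180 and 240 seconds. The estimate reaches the exact second derivative of 2 only once both windows are fully populated, which is worth keeping in mind when reading the first minutes of such a chart.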

There are JUnit tests for all three Scala classes that I refer to (here), so you can run them yourself if you want to play with the complex event processing examples presented in this post. Those JUnit tests generate the same HTML graphs as above into a target directory. Alternatively, use the Maven command 'mvn clean install' inside http://code.google.com/p/betting-ai/source/browse/trunk/trader-examples/ to run all JUnit tests with one call.

References:
  1. Complex Event Processing - http://en.wikipedia.org/wiki/Complex_event_processing
  2. Esper - http://esper.codehaus.org/
  3. Scala - http://www.scala-lang.org/
  4. Betting Market Simulator - http://code.google.com/p/betting-ai/