Quantitative Developers

This is a short and completely non-technical post that outlines our view on how to hedge a portfolio.

Our view is very simple and has 5 parts to it:

      • A portfolio manager takes 2 kinds of risk: stock risk, which is when your stocks go down for reasons of their own, and factor risk, which is when you lose money because the whole market moves, for example in a crash.
      • Stock risk is minimized by having a well diversified portfolio with not too much weight in any one stock.
      • A lot of the factor risk can be eliminated by hedging the market using an ETF or index future.
      • If you don’t have access to a risk-management utility, then we’d recommend hedging about 70% of your net position. For example, a $75M long and $25M short portfolio would be hedged with roughly $35M short in a market future.
      • Do not be tempted to completely net out your position: $75M long, $25M short, with a $50M short hedge in a market index is not a good idea at all (unless your position consists entirely of large-cap names that are actually constituents of the index). If you hedge dollar-neutral, you’ll probably lose money when the stock market goes up, because your stocks won’t go up as much as the market does.

We have checked that this rule holds both when the market is calm and during crashes. There is some variation in the ideal way to hedge from year to year, but it’s not true that during a crash you should make sure you’re dollar neutral.

The value of 70% comes from lots of what-if analysis and beta calculations, and seems pretty robust. It does vary slightly depending on what’s in your portfolio, the investment horizon, and what you’re using to hedge, but as a rule of thumb it appears to work well in a lot of situations.



This summarises the general view that OTAS Technologies has toward hedging.


A previous post (https://blog.otastech.com/2015/11/the-dollar-hedge-part-1/) discussed the problem that the average beta of stocks in a typical portfolio is less than 1.

The consequence of this was that if you try to use a dollar hedge, you frequently end up with an overall short position — in other words, the dollar-hedged portfolio should be expected to lose money if the market goes up. The conclusion is that you should typically only hedge about 60%-70% of your dollar position, depending on the exact makeup of your portfolio.

However, you could worry that in times of crisis, the stocks might move together much more — if they’re driven by large-scale macro forces, then perhaps their correlations might go up.

I attempt to answer this question here by plotting average beta vs time. The article is somewhat technical, but the interesting result is in Figure 1, so feel free to scroll to that.




I picked the 1000 largest Euro-denominated (so that we don’t just measure currency vol) stocks and worked with their beta against the Eurostoxx-50 index, for the last 10 years. Each stock has its beta calculated in a moving (Gaussian) window with width 100 trading days, and the mean is taken.

Beta is hard to estimate based on too little data. This is because it involves dividing two quantities which are both products of returns. Given that stock returns tend to have quite noticeable outliers, the product of two returns (for instance the stock’s return multiplied by the market’s return), can vary wildly from day to day. If you keep outliers in the calculation, then the resulting beta is weighted heavily towards what happened to stocks on just one or two high volatility days, but if you take the outliers out (or clip them to a permissible range), then you don’t answer the question “what happens in high volatility conditions”.
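To make the estimator concrete, here is a minimal sketch of the quantity being computed, together with a clipping step of the kind discussed above (the function names and the clip threshold are mine, not OTAS code):

```haskell
-- Beta of a stock against the market, from paired daily returns:
-- beta = cov(stock, market) / var(market).
beta :: [Double] -> [Double] -> Double
beta stock market
  = covar stock market / covar market market
  where
    covar xs ys
      = mean (zipWith (*) (centre xs) (centre ys))
    centre xs
      = map (subtract (mean xs)) xs
    mean xs
      = sum xs / fromIntegral (length xs)

-- Optionally clip returns to a permissible range first, at the cost of
-- discarding exactly the high-volatility behaviour we may care about.
clip :: Double -> [Double] -> [Double]
clip limit
  = map (max (-limit) . min limit)
```

A 100-day Gaussian window, as used below, amounts to computing weighted versions of these sums rather than plain means.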

So realistically we need several months’ worth of data to get a handle on whether beta is currently high or low. There’s also a risk that the numbers we get will be specific to the methodology we choose. For instance, if, during a crash, all the stocks were to crash, but not all on the same day, then the 1-day returns might show low beta while the 5-day returns might show a much higher beta.

The only way round this is to try lots of methodologies and see if they agree.



The average beta has been fairly constant over the last 10 years, and seems not to be particularly correlated to market volatility:


The beta fluctuates from year to year, but seems not to be directly related to volatility.

The uncertainty comes from assuming that the beta variation for a 10-day window is completely noise, and scaling the observed noise to the window used here (100 days).

If we try varying the return time, we get a similar shape, but shorter timescales have lower betas. This is exactly what we’d expect once short-term mean reversion is taken into account: there is a slight tendency for stocks to revert from one day to the next, and although it is difficult to profit from, the effect is strong enough to increase correlations over longer timescales:



Shorter return periods give lower betas because of short-term mean-reverting price fluctuations.


Then, trying median beta rather than mean, and trying a less aggressive outlier reducing process:


Different methods give vaguely similar betas.

So, at least for these 1000 stocks, beta seems to be fairly unconnected to volatility.

One last test was to try the beta vs a home-made market (the largest 100 stocks in the same universe):


When a different definition of ‘Market’ is used, you get a comparable beta time series, but the details are different.



Beta did vary from year to year, and the variation appears significant – but the uncertainty in the beta estimate is itself difficult to quantify. There seems not to be a strong link between mean beta and volatility, though.


Underlying data courtesy of Stoxx. The Stoxx indices are the intellectual property (including registered trademarks) of STOXX Limited, Zurich, Switzerland and/or its licensors (“Licensors”), which is used under license. None of the products based on those Indices are sponsored, endorsed, sold or promoted by STOXX and its Licensors and neither of the Licensors shall have any liability with respect thereto.


OTAS stamps are one of the most recognisable ways in which we visually represent our data. We generate stamps for both single-stock and list data across many metrics, always seeking to present the most important information clearly and concisely. Currently, our stamps are handwritten in the SVG vector graphic format. In this post, I’m going to explore recreating one of our stamps in Haskell using the diagrams framework, leveraging the expressiveness and reuse that come with the abstraction as much as possible. This is not a diagrams tutorial; please refer to the manual for a comprehensive introduction.

OTAS single-stock stamps


I’m going to recreate one of our simpler stamps, the Signals stamp. This is for simplicity: I am confident diagrams is expressive enough to recreate all of our stamps. The Signals stamp displays the most recent close-to-close technical signal which has fired for the stock, showing the number of days ago the signal fired and the average one-month return the stock has experienced when this signal has fired in the past. Here’s an enlarged image of the stamp:

diagrams provides a very general Diagram type which can represent, amongst others, two-dimensional and three-dimensional objects. I’ll build up the stamp by combining smaller elements into a larger whole, focusing first on the core details contained in the inner rectangle: the row of boxes and lines of text. As we’ll see, the framework provides a rich array of primitive two-dimensional shapes with which to begin our diagram, and a host of combinators for combining them intuitively. Bear in mind that the Diagram type is wholly independent of the back-end used to render it: back-ends exist for software such as Cairo and PostScript as well as a first-class Haskell SVG representation provided by the diagrams-svg library.

First, let’s get some imports and definitions out of the way:

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE OverloadedStrings #-}

module Stamps where

import Diagrams.Prelude
import Diagrams.Backend.SVG
import Text.Printf

data SignalsFlag = SignalsFlag
  { avgHistReturn :: Double
  , daysAgo :: Int
  }

-- Colours used throughout. The binder names were lost in extraction;
-- the name-to-hex pairing below is reconstructed to match their uses.
lightGrey, mediumGrey, darkGrey, brightGreen, steel, mud :: Colour Double
lightGrey
  = sRGB24read "#BBBBBB"
mediumGrey
  = sRGB24read "#AAAAAA"
darkGrey
  = sRGB24read "#999999"
brightGreen
  = sRGB24read "#40981C"
steel
  = sRGB24read "#101010"
mud
  = sRGB24read "#2F2F2F"

Our point of reference is Diagrams.Prelude in the diagrams-lib package (I’m using version 1.3), which exports the core types and combinators together with handy tools from lens and base. Additionally, we import Diagrams.Backend.SVG from diagrams-svg as our back-end. As I explained above, Diagrams is back-end agnostic: we import one here only to simplify the type signatures. The remainder of the above snippet defines a representation of a signal and a set of colours, read from a hexadecimal colour code.

Primitive shapes such as squares and rectangles, as well as combinations of these, have type Diagram B (the B type is provided by the back-end). Crucially, each diagram has a local origin which determines how it composes with other diagrams. As you may guess, regular polygons place their local origin at their centre by default, but the origin may not be as easy to determine for more complex structures.

The most basic combinators are (<>), (|||) and (===), and their list equivalents mconcat, hcat and vcat. (<>) (monoidal append) superimposes the first diagram atop the second (diagrams form a monoid under this operation), so circle 5 <> square 3 is a circle of radius 5 overlaid on a 3 by 3 square, such that their origins coincide (the resulting diagram has the same origin). (|||) combines two diagrams side by side, with the resulting origin equal to the origin of the first shape, while (===) performs vertical composition. These combinators are specialisations of the more general beside function, but they are sufficient here. Additionally, beneath is a handy synonym for flip (<>), and # is reverse function application, provided to aid readability.

Here’s the function to generate the internals of the signals stamp. Remember, we’re using a SignalsFlag to generate a Diagram B, hence the type signature of signalsStamp.

signalsStamp :: SignalsFlag -> Diagram B
signalsStamp flag
  = vcat $
      [ strutY 1.2 `beneath` text perf
          # fontSizeL 1.2
          # fc white
      , strutY 1 `beneath` text "20d avg. perf."
          # fontSizeL 0.8
          # fc mediumGrey
      , strutY 0.5
      , boxRow
          # alignX 0
      , strutY 1 `beneath` text "days ago"
          # fontSizeL 0.8
          # fc mediumGrey
      , strutY 1 -- Padding for aesthetics
      ]
  where
    perf :: String
    perf
      = printf "%+.2f%%" (avgHistReturn flag)

    boxRow :: Diagram B
    boxRow
      = hcat boxes
      where
        boxes
          = map (\k -> box k (k == daysAgo flag)) [5, 4 .. 1]
        box n hasSignal
          = blank `beneath` text (show n)
              # fc white
          where
            blank
              = unitSquare
                  # lc lightGrey
                  # if hasSignal then fc brightGreen else id

Before I explain the code, here’s the resulting SVG:

The centre of the signals stamp


The top level of the function vertically concatenates (vcat) the four components of the diagram. Text is enclosed in strutYs, invisible constructs which represent empty space in the diagram, and which are also used for padding. Font size is adjusted relative to the enclosing strut with the fontSizeL function, so the size of the rendered text is a function both of the size of the strut and the font size. boxRow is a Diagram B representing the row of boxes and needs to be aligned with alignX to recentre the origin horizontally prior to vertical concatenation.

Constructing boxRow requires horizontally concatenating (hcat) five boxes labelled 5 through 1, with the box corresponding to the signal’s daysAgo field shaded green. The unitSquare primitive constructs a 1 by 1 square, which we fill with fc if the boolean hasSignal is true. Remember, to insert text into the square we superimpose the string onto it with (<>). A simple map generating the box indices and hasSignal values finishes the construction of the row.

From this small example, I hope it’s clear that simple diagrams, at least, may be constructed in a clear, declarative way. Next, I want to focus on constructing the series of rectangles which will surround the above image, the bezel. This is where the higher-level representation of diagrams really shines: reusability. We can construct a function which will take any diagram, scale it to a uniform size, and then enclose it in our bezel. This could be reused for every stamp we create, with no code duplication. To implement this, we’ll create a function bezel :: String -> Diagram B -> Diagram B which will wrap the given diagram in a series of rectangles and add the given title:

bezel :: String -> Diagram B -> Diagram B
bezel title stamp
  = mconcat
      [ (title' === centre)
          # alignY 0
      , roundedRect (r + innerWidth + outerWidth + outerWidth') (1 + innerWidth + outerWidth + titleHeight + outerWidth') curvature
          # fc mud
          # font "Arial"
      ]
  where
    -- Ratio between width and height of stamp
    r = 0.8

    stamp'
      = stamp
          # scaleToY 1
          # scaleToX r

    -- Dimensions (the binder names were lost in extraction and are
    -- reconstructed here; the values are original)
    curvature   = 0.1
    outerWidth' = 0.025
    innerWidth  = 0.075
    outerWidth  = 0.075
    titleHeight = 0.15

    title'
      = strutY titleHeight `beneath` text title
          # fc darkGrey
          # fontSizeL titleHeight
          # font "Arial"

    centre
      = mconcat
          [ stamp'
              # alignY 0
          , roundedRect (r + innerWidth) (1 + innerWidth) curvature
              # fc steel
          , rect (r + innerWidth + outerWidth) (1 + innerWidth + outerWidth)
              # fc black
          ]
The core specification is given at the top level and in the definition of centre. centre encloses the scaled stamp (of aspect ratio r) in first a steel-coloured rounded rectangle and then a black regular rectangle. The scaling is important, as it allows the function to operate correctly regardless of the width and height of the diagram argument. Then, the title is vertically concatenated (===) to this object and superimposed onto a final mud-coloured rounded rectangle.

Here’s the result of bezel "Signals" mempty, the bezel enclosing the empty diagram:

The empty bezel


And here’s the final result, bezel "Signals" (signalsStamp $ SignalsFlag 2.06 5):

The final stamp


I’m very happy with this result, especially for such straightforward code. I haven’t been truly faithful to the dimensions of the original image, but they could be recreated with a little more effort. As I mentioned, our bezel definition is reusable across other diagrams, so to recreate our other stamps I would only need to focus on the core structure, logic which I’m confident would be straightforward with diagrams. But diagrams is capable of far more than this, and I encourage you to explore the manual for an in-depth introduction.

Free monads are quite a hot topic amongst the Haskell community at the moment. With a category-theoretic derivation, getting to grips with them can be rather daunting, and I have found it difficult to understand the engineering benefit they introduce. However, having finally come to grips with the fundamentals of free monads, I thought I’d write a brief introduction in my own terms, focussing on the benefit when designing interpreters and domain specific languages (DSLs). Certainly, the information described below can be found elsewhere, such as this Stack Exchange question or Gabriel Gonzalez’ excellent blog post.

The free monad interpreter pattern allows you to generate a DSL program as pure data with do-notation, entirely separate from the interpreter which runs it. Indeed, you could construct the DSL and generate DSL programs without any implementation of the semantics. Traditionally, your DSL would be constructed from monadic functions hiding the underlying interpreter, with these functions implementing the interpreter logic. For example, a DSL for a stack-based calculator might implement functionality to push an element to the stack as a function push :: Double -> State [Double] (). Our free monad pattern separates the specification from the implementation, so our calculator program is represented as a first-class value which can be passed to the ‘true’ interpreter, as well as to any other we wish to construct.

Let’s stick with the example of constructing a DSL for a stack-based calculator. To begin with, we need to construct a functor representing the various operations we can perform.

{-# LANGUAGE DeriveFunctor #-}

data Op next
  = Push Double next -- ^ Push element to the stack
  | Pop (Maybe Double -> next) -- ^ Pop element and return it, if it exists
  | Flip next -- ^ Flip top two elements of stack, if they exist
  | Add next -- ^ Add top two elements, if they exist
  | Subtract next
  | Multiply next
  | Divide next
  | End -- ^ Terminate program
  deriving (Functor)

Our calculator programs will be able to push and pop elements from the stack and flip, add, subtract, multiply and divide the top two elements, if they exist. This definition may seem slightly strange: the datatype has a type parameter next which is included in each non-final instruction. In this way, each operation holds the following instructions in the program. The End constructor is special in that no instructions may follow it. Pop is special too: its field is a function, indicating that Maybe Double will be returned as a value in the context of the program (see below).

As I mentioned, Op needs to be a functor, which we derive automatically with the DeriveFunctor pragma. This will always be possible with a declaration of this style because there is exactly one way to define a valid functor instance.

As it stands, programs of different lengths will have different types, due to the type parameter next. For example, the program End has type Op next (next is a free type parameter here) while Push 4 (Push 5 (Add End)) has type Op (Op (Op (Op next))). Note that these look suspiciously like lists, the elements of which are Op’s constructors. The free monad of a functor encodes this list-like representation while also providing a way to lift a value to the monadic context. For any functor f, the free monad Free f is the datatype data Free f a = Pure a | Free (f (Free f a)). Notice the similarity to the definition of list.
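Written out in full, the definition and its instances look like this; binding with (>>=) simply grafts the continuation onto every Pure leaf at the end of the ‘list’:

```haskell
-- The free monad over a functor f: either a pure value, or one layer of
-- f wrapped around the rest of the structure (compare  data [a] = [] | a : [a]).
data Free f a
  = Pure a
  | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  return = pure
  Pure a  >>= g = g a
  Free fa >>= g = Free (fmap (>>= g) fa)
```

The free package provides this type (together with liftF, used below), so we never define it by hand in practice.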

Suffice to say, Free f is a monad. I don’t want to dwell on the mathematical significance of Free or its application beyond interpreters: I’d rather focus on how they are used to define DSL programs using do-notation.

import Control.Monad.Free -- from the 'free' package

type Program = Free Op

push :: Double -> Program ()
push x
  = liftF $ Push x ()

pop :: Program (Maybe Double)
pop
  = liftF $ Pop id

-- Note: 'flip' and 'subtract' shadow the Prelude functions of the same name
flip :: Program ()
flip
  = liftF $ Flip ()

add :: Program ()
add
  = liftF $ Add ()

subtract :: Program ()
subtract
  = liftF $ Subtract ()

multiply :: Program ()
multiply
  = liftF $ Multiply ()

divide :: Program ()
divide
  = liftF $ Divide ()

end :: forall a. Program a
end
  = liftF End

Program is the monad representing our DSL program, and is defined as Free Op (Free is defined in the free package). We need to provide functions to lift each constructor into Program, using liftF. Notice that almost all of the functions are of type Program (), except for pop, which yields the top of the stack as a Maybe Double in the monadic context, and end, whose existential type indicates the end of a program (this requires the RankNTypes language extension). With these, we can start defining some calculator programs. Remember, we’re simply defining data here: we haven’t defined an interpreter yet!

-- Compute 2 * 2 + 3 * 3, leaving result on top of stack
prog :: forall a. Program a
prog = do
  push 2
  push 2
  multiply
  push 3
  push 3
  multiply
  add
  end

pop allows us to read values into the Program context, meaning we can use pure Haskell functions in our programs:

-- Double the top element of the stack.
double :: Program ()
double = do
  x <- pop
  case x of 
    Nothing ->
      return ()
    Just x' ->
      push $ x' * 2

Finally, let’s define an interpreter for our DSL, which returns the state of the stack after program execution.

import Control.Monad.State -- execState, get, put, modify

modStack :: (a -> a -> a) -> [a] -> [a]
modStack f (x : x' : xs)
  = f x x' : xs
modStack _ xs
  = xs

-- The existential type of the parameter forces the program to be
-- terminated with 'end'.
interpret :: (forall a. Program a) -> [Double]
interpret prog
  = execState (interpret' prog) []
  where
    interpret' :: Program a -> State [Double] ()
    interpret' (Pure _) -- Unreachable when the program ends with 'end'
      = return ()
    interpret' (Free (Push x next)) = do
      modify (x :)
      interpret' next
    interpret' (Free (Pop next)) = do
      stack <- get
      case stack of
        x : xs -> do
          put xs
          interpret' $ next (Just x)
        [] ->
          interpret' $ next Nothing
    interpret' (Free (Flip next)) = do
      stack <- get
      case stack of
        x : x' : xs ->
          put $ x' : x : xs
        _ ->
          return ()
      interpret' next
    interpret' (Free (Add next)) = do
      modify $ modStack (+)
      interpret' next
    interpret' (Free (Subtract next)) = do
      modify $ modStack (-)
      interpret' next
    interpret' (Free (Multiply next)) = do
      modify $ modStack (*)
      interpret' next
    interpret' (Free (Divide next)) = do
      modify $ modStack (/)
      interpret' next
    interpret' (Free End)
      = return ()

The interpreter logic is very simple: we pattern match against each Op constructor, mutating an underlying list of doubles where necessary, being sure to recurse on the remainder of the program represented by next. interpret' accepts an arbitrary Program a parameter which doesn’t necessarily have to be terminated by end, although we insist on this by wrapping the function in interpret, which demands an existential type. This detail is not important here, although it may be for more complex interpreters which require resource clean-up once interpretation has terminated. Alternatively, end could be appended automatically when interpreting.

The beauty of this programming pattern is the separation of concerns: we can freely define a different interpreter which can operate on the same DSL program. It is often very natural to want to define more than one interpreter: in this case, I may want to execute the program, additionally warning the user when illegal pops or flips are made. This is possible because the interpreter logic does not define the embedded language, unlike in the traditional approach, making this a very attractive design pattern for DSL and interpreter construction.
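To make that concrete, here is a self-contained sketch of a second interpreter, a pretty-printer, over a cut-down version of the same Op functor (Free and Op are restated so the snippet stands alone, and render is my name for the new interpreter, not something from the post above):

```haskell
-- Restated so this compiles standalone.
data Free f a = Pure a | Free (f (Free f a))

data Op next
  = Push Double next
  | Pop (Maybe Double -> next)
  | End

-- A second interpreter over the same instruction set: render the program
-- as text instead of executing it. Pop's continuation is fed Nothing,
-- since no stack exists at pretty-printing time.
render :: Free Op a -> [String]
render (Pure _)             = []
render (Free (Push x next)) = ("push " ++ show x) : render next
render (Free (Pop next))    = "pop" : render (next Nothing)
render (Free End)           = ["end"]
```

The same Program value can be handed to interpret to run it and to render to display it, with neither interpreter knowing about the other.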

OTAS Lingo is our natural language app to summarise many of our analyses in a textual report. Users can subscribe to daily, weekly or monthly Lingo emails for a custom portfolio of stocks or a public index, and receive a paragraph for each metric we offer, highlighting important changes across the time period. Our method of natural language generation is simple: given the state of the underlying data, we generate at least one sentence for each of the metric’s sub-topics, and concatenate the results to form a paragraph. This method of text generation has proved to be very customisable and extensible, despite being time-consuming to implement, and the results are very readable. However, given that we specify hundreds of these sentences in the code base, some of which are rendered very infrequently, how do we leverage compile-time checking to overcome the runtime issues of tools such as printf?

First, a little background. We render our natural language both as HTML and plain text, so we abstract the underlying structure with a GeneratedText type. This type represents dates, times and our internal security identifiers (OtasSecurityIds), in addition to raw text values. HTML and plain text are rendered with two respective functions, which use additional information, such as security identifier format (name, RIC code etc.), to customise the output. Here’s a simplified representation of the generated text type.

type HtmlClass = T.Text

data Token 
  = Text T.Text -- ^ Strict Text from text package
  | Day Day -- ^ From time package
  | OtasId OtasSecurityId 
  | Span [HtmlClass] [Token] -- ^ Wrap underlying tokens in span tag, with given classes, when rendering as HTML

newtype GeneratedText = GeneratedText [Token]
  deriving (Monoid)

We expose GeneratedText as a newtype, and not a type synonym, to hide the internal implementation as much as possible. Users are provided with functions such as liftOtasId :: OtasSecurityId -> GeneratedText, which generates a single-token text value, an IsString instance for convenience and a Monoid instance for composition. With this, we can also express useful combinators such as list :: [GeneratedText] -> GeneratedText, which will render a list more naturally as English. These combinators may be used to construct sentences such as the following.

-- | List stocks with high volume.
-- E.g. High volume was observed in Vodafone, Barclays and Tesco.
volumeText :: [OtasSecurityId] -> GeneratedText
volumeText sids
  = "High volume was observed in " <> list (map liftOtasId sids) <> "."
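For illustration, here is how a combinator like list might look. I use plain String in place of the real GeneratedText, so this is a sketch rather than the OTAS implementation:

```haskell
import Data.List (intersperse)

-- Stand-in for GeneratedText, purely for this sketch.
type GeneratedText = String

-- Render a list of fragments as natural English:
-- [] -> "", [a] -> "a", [a,b] -> "a and b", [a,b,c] -> "a, b and c".
list :: [GeneratedText] -> GeneratedText
list []  = mempty
list [x] = x
list xs  = mconcat (intersperse ", " (init xs)) <> " and " <> last xs
```

Because the real combinator works over tokens rather than rendered text, the same list value can still be rendered as either HTML or plain text afterwards.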

This definition is okay, but it is helpful to separate the format of the generated text from the underlying data as much as possible. Think about C-style format strings, which completely separate the format from the arguments to the format. Our code base contains hundreds of text sentences, so the code is more readable with this distinction in place. Additionally, there’s value in being able to treat formats as first class, allowing the structure of a sentence to be built up from sub-formats before being rendered to a generated text value. Enter formatting.

The formatting package is a fantastic type-safe equivalent to printf, allowing construction of typed format strings which can be rendered to strict or lazy text values. The type and number of expected arguments are encoded in the type of the formatter, so everything is checked at compile time: there are no runtime exceptions with formatting. The underlying logic is built upon the HoleyMonoid package, and many formatters are provided to render, for example, numbers with varying precision and in different bases. Formatting uses a text Builder under the hood, but the types will work with any monoid, such as our GeneratedText type, although we do need to reimplement the underlying functionality and formatters. However, this gives us a very clean way of expressing the format of our text:

volumeFormat :: Format r ([OtasSecurityId] -> r)
volumeFormat
  = "High volume was observed in " % list sid % "."

volumeText :: [OtasSecurityId] -> GeneratedText
volumeText
  = format volumeFormat

The Format type expresses the type and number of arguments expected by the format string, in the correct order. volumeFormat above expects a list of OtasSecurityIds, so has type Format r ([OtasSecurityId] -> r), whereas a format string expecting an OtasSecurityId and a Double would have type Format r (OtasSecurityId -> Double -> r). The r parameter represents the result type, which, in our case, will always be a GeneratedText. Formatters are composed with (%), which keeps track of the expected argument, and a helpful IsString instance exists to avoid lifting text values. Finally, format will convert a format string to a function expecting the correct arguments, and yielding a GeneratedText value.

The format string style is especially beneficial as the text becomes longer, as the format is described without any noise from the data being textualised. We use a range of formatters for displaying dates, times, numeric values and security IDs, but we can also encode more complex behaviours as higher-order formatters, such as list above. Users can also define custom formatters to abstract more specialised representations. In practice, we specify all our generated text in Lingo in this way: it’s simply a clearer style of programming.

This will be a short post discussing how risk is normally defined in finance, how risk models work, and what OTAS uses them for. Today I’ve been improving code that deals with risk models, so I thought I would write a bit about them.

The audience for this is either someone from outside finance who isn’t familiar with finance norms, or someone in finance who has not had the chance to study risk models in detail.

The definition of risk

Intuitively, the term ‘risk’ should be something to do with the chances that you will lose an uncomfortable amount of money. In the equities business it is normally defined to be the standard deviation of returns. So if in a given year, your portfolio makes perhaps on average £500k, but fluctuating so that perhaps on a bad year it loses £500k, or on a good year it makes £1.5M, your risk is probably about £1M.

This can catch people out: the definition that is almost universally used (for equities) includes the chance of making lots of money as well as the chance of losing lots of money. You could make the argument that if a stock can go up by 10%, then it could go down 10% just as easily. You can imagine situations where that’s not true, though: if 10 companies were bidding for an amazing contract that only one of them would win, then you’re more likely to make lots of money than lose it (if you buy shares in one company).

In fact, the reason that standard deviation of returns is used is that it’s simple to calculate. That might sound as if the technical teams are deciding to be lazy in order to make life easy, but actually trying to estimate risk in a better way is nightmarishly difficult – it’s not that the quant team would have to sit and think about the problem for *ages*, it’s that the problem becomes guesswork. Getting the standard deviation of a portfolio’s returns takes a surprisingly large number of data points in finance (because fat tails make the calculation converge more slowly than expected), but getting a complete picture of how the risk works, including outliers, catastrophic events, bidding wars and so on, takes far, far more data.

Since there isn’t enough data out there, the missing gaps would have to be filled by guesswork. And so most people stick to a plain standard-deviation-based risk model.

Having a simple definition means that everyone agrees on what risk numbers are: If someone asks you to keep your risk to less than 5% per year, they and you can look at your portfolio and largely agree that a good estimate of risk would be under the threshold. Then most people can look at the actual returns at the end of the year and say whether or not the risk was under the 5% target.

How risk models work

Let’s accept that risk is modelled in terms of the standard deviation of portfolio returns. To estimate your realised risk, you just take how many dollars you made each day for the last year or so, and take a standard deviation. The risk model, though, is used to make predictions for the future.
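As a minimal sketch of that realised-risk calculation (the function name and the 252-trading-day annualisation convention are mine):

```haskell
-- Realised risk: the sample standard deviation of daily P&L, annualised
-- by sqrt(252) on the usual assumption of ~252 trading days per year.
annualisedRisk :: [Double] -> Double
annualisedRisk pnl
  = sqrt 252 * sqrt (sum [(x - mu) ^ 2 | x <- pnl] / fromIntegral (n - 1))
  where
    n  = length pnl
    mu = sum pnl / fromIntegral n
```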

The risk model could just contain a table of every possible or probable portfolio and the predicted risk for that portfolio, but it would be a huge table. On the other hand, that is a complete description of what a risk model does: it just tells you the risk for any hypothetical portfolio. We can simplify this a bit by noting that if you double a portfolio’s positions, the risk must double, so we don’t have to store every portfolio. In fact, similar reasoning means that if we have the standard deviation for N(N+1)/2 well-chosen portfolios (that is as many numbers as a covariance matrix contains), we can work out the standard deviation of every portfolio.

Another way of saying the same is that all we need is the standard deviation for each stock, and the correlation between every stock and every other: If we know how volatile Vodafone is, and how volatile Apple is, and the correlation between them, then we can work out the volatility of any portfolio just containing Vodafone and Apple.
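For two stocks this is just the textbook two-asset formula; here is a sketch (weights are dollar positions, vols and correlation are whatever horizon you work at):

```haskell
-- Portfolio volatility from per-stock volatilities and their correlation:
--   var = w1^2 s1^2 + w2^2 s2^2 + 2 w1 w2 rho s1 s2
portfolioVol :: Double -> Double  -- ^ positions w1, w2
             -> Double -> Double  -- ^ volatilities s1, s2
             -> Double            -- ^ correlation rho
             -> Double
portfolioVol w1 w2 s1 s2 rho
  = sqrt (w1 ^ 2 * s1 ^ 2 + w2 ^ 2 * s2 ^ 2 + 2 * w1 * w2 * rho * s1 * s2)
```

With perfect correlation the volatilities simply add; with zero correlation they add in quadrature, which is why diversification reduces risk.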

In the first instance, all you can do to predict the future correlation between two stocks is to look at their history – if they were historically correlated, we can say that they probably will be correlated in the future. However, we can probably do slightly better than that, and simplify the risk model at the same time using the following trick:

We make the assumption that the only reason two stocks are correlated is that they share some factor in common: If a little paper manufacturer in Canada is highly correlated to a mid-sized management consultancy in Australia, we might say that it’s only because they’re both correlated to the market. Basically you have, say, 50 hypothetical influences (known as “factors”), such as “telecoms”, “large-cap stocks” or “the market”, and you say that stocks can be correlated to those factors. You then ban the risk model from having any other view of correlation: The risk model won’t accept that two stocks are simply correlated to each other – it will only say that they’re both correlated to the same factors.

This actually helps quite a bit because the risk model ends up being much smaller – this step reduces the size of the risk model on the hard drive by up to 100 times, and it also speeds up most calculations that use it. If the factors are chosen carefully, it can also improve accuracy – the approximation that stocks are only correlated via a smallish number of factors can theoretically end up averaging out quite a lot of noise that would otherwise make the risk model less accurate.
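A hedged sketch of the idea, shrunk down to a single “market” factor (real models have dozens of factors, and all the numbers below are illustrative): each stock gets a beta to the factor plus some independent stock-specific volatility, and any correlation between two stocks must flow through the shared factor.

```haskell
-- One-factor toy risk model: stocks may only be correlated via their
-- betas to a shared factor. Illustrative numbers only.

-- Covariance between two distinct stocks comes only from the factor:
factorCov :: Double -> Double -> Double -> Double
factorCov beta1 beta2 factorVol = beta1 * beta2 * factorVol ^ 2

-- A stock's own variance adds back its specific (idiosyncratic) variance:
totalVar :: Double -> Double -> Double -> Double
totalVar beta factorVol specificVol =
  beta ^ 2 * factorVol ^ 2 + specificVol ^ 2

-- The correlation the model implies between two stocks,
-- each given as a (beta, specific vol) pair:
impliedCorr :: (Double, Double) -> (Double, Double) -> Double -> Double
impliedCorr (b1, s1) (b2, s2) fv =
  factorCov b1 b2 fv / sqrt (totalVar b1 fv s1 * totalVar b2 fv s2)

main :: IO ()
main =
  -- Both stocks have beta 0.8 to a 20%-vol market,
  -- with specific vols of 25% and 30%:
  print (impliedCorr (0.8, 0.25) (0.8, 0.30) 0.20)
```

The storage saving follows directly: instead of a full N-by-N correlation matrix, the model keeps one row of factor betas per stock plus a specific vol.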

What OTAS Tech does with them

OTAS Technologies uses risk models for correlation calculations, for estimating clients’ portfolio risk, and for coming up with hedging advice. Risk models are also useful for working out whether a price movement was due to something stock-specific or to a factor move.


Several months ago, Stack added support for building and running Haskell modules on-the-fly. This means that you no longer need to include a Cabal file to build and run a single-module Haskell application. Here’s a minimal example, HelloWorld.hs.

#!/usr/bin/env stack
-- stack --resolver lts-3.12 --install-ghc runghc

main :: IO ()
main = putStrLn "Hello, world"

Given HelloWorld.hs has executable permissions, ./HelloWorld.hs will produce the expected output!

Stack can be configured in the second line of the file, with general form -- stack [options] runghc [options]. Above, the --install-ghc switch instructs Stack to download and install the appropriate version of GHC if it’s not already present. Specifying the resolver version is also crucial, as it enables the build to be forever reproducible, regardless of updates to any package dependencies. Happily, no extra work is required to import modules from other packages included in the resolver’s snapshot, although non-resolver packages can be loaded with the --package switch.

This feature finally allows programmers to leverage the advantages of Haskell in place of traditional scripting languages such as Python. This would be especially useful for ‘glue’ scripts, transforming data representations between programs, given Haskell’s type safety and excellent support for a myriad of data formats.

The Turtle Library

Gabriel Gonzalez’ turtle library attempts to leverage Haskell in place of Bash. I write Bash scripts very occasionally and am prone to forgetting the syntax, so immediately this is an interesting proposition to me. However, does the flexibility of the command line utilities translate usefully to Haskell?

The turtle library serves two purposes. Firstly, it re-exports commonly-used utilities provided by other packages, reducing the number of dependencies in your scripts. Some of these functions are renamed to match their UNIX equivalents. Secondly, it exposes the Shell type, which encapsulates streaming the results of running some of these utilities. For example, stdin :: Shell Text streams all lines from the standard input, while ls :: FilePath -> Shell FilePath streams all immediate children of the given directory.

The Shell type is similar to ListT IO, and shares its monadic behaviour. Binding to a Shell computation extracts an element from the computation’s output stream, and mzero terminates the current stream of computation. Text operations are line-oriented, so the monadic instance is all about feeding a line of output into another Shell computation. Putting this together lets us write really expressive operations, such as the following.

{-# LANGUAGE OverloadedStrings #-}
import Turtle

readCurrentDir :: Shell Text
readCurrentDir = do
  file <- ls "."
  isFile <- testfile file
  guard isFile  -- Ensure 'file' is a regular file!
  input file    -- Read the file

readCurrentDir is a Shell which streams each line of text of each file in the current directory. Remember, guard will terminate the computation when file isn’t a regular file, so its contents will never be read. Without it, we risk an exception being thrown when trying to read a non-regular file.

The library exposes functions for aggregating and filtering the output of a Shell. grep :: Pattern a -> Shell Text -> Shell Text uses a Pattern to discard non-conforming lines, identical to its namesake. The Pattern type is a monadic backtracking parser capable of both matching and parsing values from Text, and is also used in the sed :: Pattern Text -> Shell Text -> Shell Text utility. The Fold interface collapses the Shell to a single result, and sh and view run a shell to completion in the IO monad.

Perhaps unsurprisingly, Turtle’s utilities are not as flexible as their command-line equivalents, as no variants exist for different command-line arguments. Of course, it would be possible for the library to expose a ‘Grep’ module with a typed representation of the program’s numerous optional parameters, but this seems overkill when I can execute the real thing and stream the results with inshell. This, then, informs my use of Turtle in the future: I see myself using it in place of shell scripts, not for typed representations of UNIX utilities. The library lifts shell output into an environment which I, as a Haskell programmer, can manipulate efficiently and expressively. It serves as the glue whose syntax and semantics I already reason with every day.

See Also

Stack script interpreter documentation

This post is less finance focused and more software-development focused.

At OTAS Technologies, we invest considerable time into making good software-development decisions. A poor choice can lead to thousands of man-hours of extra labour. We make a point of having an ecosystem that can accommodate several languages, even if one of them has to be used to glue the rest together. Our current question is how much Haskell to embed into our ecosystem. We have some major components (Lingo, for example) in Haskell, and we will continue to use and enjoy it. However, it is unlikely to be our “main” language for some time. So Haskell is largely guaranteed a place in our future, but we are still pacing out the boundaries of its position with us.

We started off assuming that Python was going to simplify life. It was widely used and links into fast C libraries reasonably well. This worked well at first, but there were significant disadvantages which led us away from Python.


Python’s first disadvantage was obvious from the start: Python is slow, several hundred times slower than C#, Haskell, etc., unless you can vectorise the entire calculation. People coming from a Matlab or Python background would say that you can vectorise almost everything. In my opinion, if you’ve only used Matlab or Python, you shy away from the vast number of useful algorithms that you can’t vectorise. We find that vectorising works often, but in any difficult task, there’ll be at least something that has to be redone with a for-loop. It is true, however, that you can work around this problem by vectorising what you can, and using C (or just learning patience) for what you can’t.

Python’s second disadvantage, though, is that large projects become unwieldy because the language isn’t statically typed. If you change one object, you can’t know that the whole program still works. And when you look at a difficult piece of code, there’s often nowhere you can go to see what an object is made of. An extreme example came from code that went to a database, got an object and worked on that object. The object’s structure was determined by the table structure in the database, which makes it impossible to look at the code and see if it’s correct without comparing it to the tables in the database. This unwieldiness gets worse if you have C modules: Having some functionality separated from the rest of the logic simply because it needs a for-loop makes it hard to read through the code.

Consequently, we now only use Python for really quick analysis. An example is loading JSON from a file and quickly plotting a histogram of market cap. For this task, it’s still good: There’s no danger of it getting too complicated, nobody needs to read the code later, and speed generally isn’t a problem.

The comparison process

The last week, though, has been spent comparing Haskell to C#. Haskell is an enormously sophisticated language that has, somewhat surprisingly, become easy to use and set up, thanks to the efforts of the Haskell community (special thanks to FPComplete there). However, it’s a very different language from C#, and this makes it hard to compare. There is an impulse not to compare languages at all, because the comparison can decay into a flame war: If you google X vs Y, where X and Y are languages, you come across a few well-informed opinions and a lot of badly informed wars.

There are several reasons for this, in my opinion. Often a language’s advantages are too subtle for people to see clearly what they are and how important they are, and each language has some advantages and some disadvantages. A lot of the comparisons are entirely subjective – two people can have the same information and disagree about it. The choice is often a psychology question regarding which is clearer (for both the first developer and the year-later maintainer) and a sociology question about how the code will fit a team’s development habits.

Another difficulty is that most evaluators are invested in one language or the other – nobody wants to be told that their favourite language, which they have spent 5 years using, is worse. We even evaluate ourselves by how good we are at language X and how useful that is overall. A final difficulty is the peer pressure, stigma and kudos attached to various languages.

So what’s the point? The task seems so difficult that perhaps it’s not worth attempting.

The point is that it’s the only way to decide between languages. Throughout the history of science and technology, competing technologies have been compared, and sure, the comparison is difficult, but you can’t refuse to compare things, because the conclusions are valuable. The important thing is to remember that the comparison won’t be final, that mistakes can be made, and that it will be somewhat subjective – but that doesn’t make it pointless.


As a disclaimer, I am not a Haskell expert. I have spent a couple of months using Haskell, and I gravitate towards the lens and HMatrix libraries. I have benefited from sitting beside Alex Bates, who is considerably better at Haskell than I am.

Haskell has a very sophisticated type system, and the language is considered to be safe, concise and fast. I suspect that in Haskell I could write a lot of code in a way that makes it more general than I could in C# – often I find that my C# code is tied to a specific implementation. It doesn’t have to be: you could make heavy use of interfaces, but my impression is that Haskell is better for writing very general code.

Haskell also has an advantage when you wish to use only pure functions – its type system makes it explicit when you are using state and when you are not. However, in my experience, unintended side effects are actually a very minor problem in C#. Sure, a function that says it’s estimating the derivative of a spline curve *might* be using system environment variables as temporary storage. But probably not. If you code badly, you can get into a mess, but normal coding practices make it rare (in my experience) that state actually causes problems. Haskell’s purity guarantees can theoretically help the compiler, though, and may pay dividends when parallelising code – but I personally do not know. I personally reject the idea that state in itself is inherently bad – a lot of calculations are simpler when expressed using state. Of course, Haskell can manipulate state if you want it to, but at the cost of slightly more complexity (in my opinion) and, in the 3 or so examples that I’ve looked at, slightly longer code too.

Haskell is often surprisingly fast – something that tends to surprise non-Haskell users (and the surprise is often accompanied by having to admit that the Haskell way is, in fact, faster). The extra complexity and loss of conciseness mentioned above are things that better-designed libraries might overcome. The speed advantage is possibly accentuated for the highest-level code: My impression is that lambda functions and the Haskell equivalent of LINQ expressions incur less speed overhead in Haskell than in C#.

Another advantage of Haskell is safety – null-reference exceptions are hard to come by in Haskell, and some other runtime errors are eliminated. However, you can still get a runtime exception in Haskell from other things (like taking the head of an empty list, or asking for the minus-1st element of an HMatrix array). On the other hand, exception handling is currently less uniform (I think) than in C#, and possibly less powerful, so again, we have room for uncertainty.

Some disadvantages of Haskell are that you can accidentally do things that ruin performance (for instance, using lists in a way that turns an O(N) algorithm into an O(N^2) one) or give yourself stack overflows (by doing recursion the wrong way). I know that these are considered newbie problems rather than serious ones, and when I was using Haskell for backtests, I quickly settled on a numerics library and a way of doing things that had neither of these problems. Still, when I look over online tutorials, it’s remarkable how many tutorial examples don’t scale up to N=1e6 because of stack overflows.

For me, perhaps the most unambiguously beneficial Haskell example was to produce HTML in a type-safe way. A set of functions are defined that form a domain-specific language. This library is imported, and the remaining Haskell code just looks essentially like a variant of HTML – except that it gets checked at compile time, and you have access to all the tools at Haskell’s disposal. You could do that in C#, but it would not look pretty, and as far as I know, it’s not how many people embed HTML in C#.
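To illustrate the flavour of the idea, here is a toy HTML DSL. This is emphatically not the library referred to above (real choices such as blaze-html are far richer, and handle escaping properly); it is only a sketch of how embedded HTML can be ordinary, compile-time-checked Haskell.

```haskell
-- Toy HTML DSL sketch: illustrative only, not a production library.
newtype Html = Html { render :: String }

tag :: String -> [Html] -> Html
tag name children =
  Html ("<" ++ name ++ ">" ++ concatMap render children ++ "</" ++ name ++ ">")

text :: String -> Html
text = Html  -- NB: a real library would escape special characters here

html_, body_, h1_, p_ :: [Html] -> Html
html_ = tag "html"
body_ = tag "body"
h1_   = tag "h1"
p_    = tag "p"

-- The "HTML" below is plain Haskell, so typos in structure
-- (a missing bracket, an undefined element) fail at compile time.
page :: Html
page = html_
  [ body_
      [ h1_ [text "Hedging"]
      , p_  [text "Hedge beta, not dollars."]
      ]
  ]

main :: IO ()
main = putStrLn (render page)
```

The point is that the page description is a first-class value: you can map over it, generate it from data, and refactor it with the compiler checking every step.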

But we are just starting the comparison process, and it will be interesting in the coming days, months and years to find out exactly what the strengths and weaknesses of this emerging technology are.

In a future blog post, we’ll write about the areas in which each language wins and loses. But for now, it’s better to leave it unconcluded.

Here at OTAS, we run a private, internal instance of Hackage to easily share our Haskell packages and documentation amongst our developers. It’s really a core part of our Haskell development workflow and, once fully set-up, integrates seamlessly with Stack. Unfortunately, I’ve found deploying the server to be tricky and poorly-documented, so I’ve written this guide to help make the process easier.

We use Stack for building all our Haskell projects, so we do not explicitly install either GHC or cabal-install on our machines. I’ll assume you have a working installation of Stack on your machine before you begin. Cabal-install users should be able to follow easily enough.

Building the Hackage Server

I’ve found it very frustrating to build the hackage-server packages listed on Hackage. At the time of writing, two versions, 0.4 and 0.5.0, are listed, both of which have an extended list of dependencies which Cabal’s solver isn’t able to resolve for me. As such, I went directly to the GitHub source and decided to build the most recent commit at the time of writing, e0bd7fc2a8701a807a5551afc1ab1e9df4179e81.

git clone git@github.com:haskell/hackage-server.git 
cd hackage-server
git checkout e0bd7fc2a8701a807a5551afc1ab1e9df4179e81

Stack successfully generated this stack.yaml file (using stack init --solver) which you can copy into your working directory to reproduce the build. Simply run stack build to generate the executables.

Running Hackage

Before we’re ready to initialise the server, we need to generate private keys using hackage-repo-tool. First build and copy this program to your path with Stack, then generate and install the keys.

hackage-repo-tool create-keys --keys keys
cp keys/timestamp/.private datafiles/TUF/timestamp.private
cp keys/snapshot/.private datafiles/TUF/snapshot.private
hackage-repo-tool create-root --keys keys -o datafiles/TUF/root.json
hackage-repo-tool create-mirrors --keys keys -o datafiles/TUF/mirrors.json

Now we’re ready to initialise the server! In the following command, substitute the administrator username and password with your chosen credentials.

./hackage-server init --admin=user:password --static-dir=datafiles/

Finally, run the server on your chosen port.

./hackage-server run --port=4050 --static-dir=datafiles/

Now that your server’s up and running (and assuming your firewall’s correctly configured) you’ll be able to access the local Hackage instance through your browser. There you’ll find documentation for administrative functions such as user management. This wiki page may be helpful.

Automatically Building Documentation

We use hackage-build to automatically generate documentation for uploaded packages. This program requires cabal-install, its dependencies and GHC to be accessible on your PATH. Fortunately, we can use Stack to install cabal-install for us!

stack install alex happy cabal-install

Here’s where things get a little tricky. You’ll need to use a different instance of hackage-build for each version of GHC that your packages require to build. In my case, I have packages depending on different versions of base, so I require GHC versions 7.8.4 and 7.10.2. I’ve installed both of these locally, off-path, at $HOME/.local/bin/ghc-7.8.4 and $HOME/.local/bin/ghc-7.10.2 respectively.

Of course, Hackage will document your packages by building them with cabal-install, which may well fail to find a successful build plan. Before switching over to Stack, I used Stackage snapshots to constrain the range of packages cabal-install considered. Therefore, all my older packages build with lts-1.0, and all my newer packages with lts-3.12. In the future, I’ll release packages using a more recent snapshot, but I’ll always control when this happens.

For me, then, the best course is to run a hackage-build instance for each snapshot I’ve used. Documentation for a package will then certainly be built by the instance whose snapshot that package is known to build with. For each snapshot, I initialise hackage-build in a separate directory using the username and password of the account with which I wish to upload the documentation. Note that the account needs to be enabled as a trustee.

./hackage-build init http://localhost:4050 --init-username=user --init-password=password --cache-dir=build-cache-lts-1.0 # For lts-1.0

In the newly created build-cache-lts-1.0 directory I edit the cabal-config file to include both Hackage and the local Hackage as remote-repos, and I add the constraints specified by the corresponding snapshot. The first few lines of the file look like this.

remote-repo: localhost:http://localhost:4050
remote-repo: hackage.haskell.org:http://hackage.haskell.org/packages/archive
remote-repo-cache: build-cache-lts-1.0/cached-tarballs
constraint: AC-Vector ==2.3.2
constraint: BlastHTTP ==1.0.1
constraint: BlogLiterately ==

Remember, I perform the above sequence for each of my two snapshots. Finally, I can run each build client with its corresponding GHC version. For the lts-1.0 process, I use the following command.

PATH=$HOME/.local/bin/ghc-7.8.4/bin:$PATH ./hackage-build build --cache-dir=build-cache-lts-1.0

From here, it’s simple to script the builder to run continuously. Also check out the output of ./hackage-build --help for more options.

Enabling access through Stack

To alert Stack to your new internal Hackage, include its details in the package-indices section of your global config file (~/.stack/config.yaml). Instructions are provided here.

See Also

hackage-server wiki

Edsko’s commit on setting up TUF for the server

The dollar hedge is used throughout finance, and it is a bad idea.

To see why dollar hedges are used so widely, consider the reasons people use them in the first place.

An uninformed investor might decide to invest in a stock partly because they like the stock itself, and partly because they have an understanding that the equity market as a whole should increase in value, so they are quite happy to have exposure to both stock and market.

The dollar hedge is where you make sure that the total net dollar value of your portfolio is zero, by shorting or buying a hedging instrument such as a market future in order to balance your other positions.


As the investor’s portfolio widens to more stocks, the market exposure adds up steadily, whereas the exposure to individual stocks tends to be diluted by diversification: If you have 1000 bets on 1000 stocks, the chances of losing money because they all happen to be under-performing stocks is small, but the chance of losing money because you’re long on all of them and the market crashes is very high. So if the aim is not to get market exposure, the investor can limit their risk by taking an opposing position in something like an index future or ETF. This is particularly important if there is a net long or short bias in their stock-positions.
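The arithmetic behind this dilution can be sketched in a few lines. The sketch below assumes an equal-weight portfolio where every stock has beta 1 to the market and independent stock-specific risk, with illustrative volatilities: the stock-specific variance shrinks as 1/n while the market part does not diversify at all.

```haskell
-- Sketch of the diversification argument, illustrative numbers only.
-- Equal-weight portfolio of n stocks, each with beta 1 to a market of
-- volatility sigmaM plus independent stock-specific volatility sigmaS.
portVol :: Int -> Double -> Double -> Double
portVol n sigmaM sigmaS =
  sqrt (sigmaM ^ 2 + sigmaS ^ 2 / fromIntegral n)
  -- specific variance falls as 1/n; market variance is unaffected

main :: IO ()
main = mapM_ report [1, 10, 100, 1000]
  where
    -- 15% market vol, 30% stock-specific vol per name
    report n = putStrLn (show n ++ " stocks: " ++ show (portVol n 0.15 0.30))
```

With 1000 stocks the portfolio volatility is essentially the 15% market vol: almost all the remaining risk is the factor risk the hedge is meant to remove.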

It seems intuitive that if the investor has $1M long in equities, then they should have roughly $1M short in the index future. After all, the market future is meant to represent the market as a whole. But there is a problem: This almost always over-hedges. The reason can be explained in terms of the average beta of your portfolio.

The beta of a stock to an index is how much you expect it to respond to movements in that index. A beta of 100% means that if the index goes up 5%, then the stock will also go up by 5% on average. A beta of 50% would mean only a 2.5% expected rise in that stock when the index goes up 5%. On average the beta should be 100% – but *only* if the stock is one of the index constituents. In fact, the most traded indices (and futures or ETFs on those indices) have quite a small list of constituents (for instance the FTSE 100 or Eurostoxx 50), and if you’re trading outside that small list (which you typically will be), the average beta does *not* have to equal 100%, and in fact is normally lower.
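To make the consequence concrete, here is a sketch of the two hedges side by side. The positions and the uniform beta of 0.7 are illustrative assumptions (a real book has a beta per stock): a beta hedge shorts the beta-weighted exposure rather than the net dollar exposure.

```haskell
-- Dollar hedge vs beta hedge for a book of positions.
-- Each position is (dollar notional, beta to the hedge instrument);
-- longs are positive, shorts negative. Illustrative numbers only.
dollarHedge :: [(Double, Double)] -> Double
dollarHedge book = negate (sum (map fst book))

betaHedge :: [(Double, Double)] -> Double
betaHedge book = negate (sum [pos * beta | (pos, beta) <- book])

main :: IO ()
main = do
  -- $75M long and $25M short, all with beta 0.7 to the future:
  let book = [(75e6, 0.7), (-25e6, 0.7)]
  putStrLn ("Dollar hedge: " ++ show (dollarHedge book))
  putStrLn ("Beta hedge:   " ++ show (betaHedge book))
```

Here the dollar hedge shorts $50M of futures while the beta hedge shorts only $35M – i.e. 70% of the net position, matching the rule of thumb above.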

OTAS Technologies makes extensive use of in-house risk models. We noticed this effect when we found that the average beta of a range of sensible portfolios to the Eurostoxx 50 was significantly less than 100%. We spent some time checking for errors, putting checks in place, and analysing our smoothing and data-cleaning processes. Useful though that was, ultimately the effect proved real.

We suspect that the effect is partly due to simple maths: different things drive different stocks, and a random stock outside the FTSE 100 will not necessarily be pushed around by the same things as the FTSE 100. Then there’s a capitalisation bias: These indices focus on large-cap names, whereas the average portfolio might not. There’s also the possibility that the market indices drive themselves: Because they’re considered a proxy for the market, they get traded by people taking large macro views, and perhaps that causes their constituents to behave subtly differently from the average stock.

The effect on the average portfolio manager of getting this wrong can be stark. The beta hedge is guaranteed (if the beta is calculated correctly) to reduce the risk of the portfolio, but the dollar-neutral hedge is not. We have seen extremely plausible portfolios where in fact the dollar-neutral hedge *increases* risk by even more than the beta-neutral hedge decreases it. The most important effect, though, is for portfolio managers who tend to have long ideas and so end up with a short hedge. If they pick a dollar-neutral hedge, they will have an overall short exposure to the market. This will increase their risk, and get them negative drift (assuming that the market has a slight long-term upward drift). This is frequently the cause of the complaint that “We had good positions today, but the market went up and we lost money overall on the hedge”. It’s certainly the case that a well-hedged book can lose money if the market goes up. But on average, if the portfolio is well balanced, using beta- not dollar-neutral hedging, the portfolio will not normally have down days simply because the market went up.

To summarise, there’s good news and there’s bad news. The good news is that you don’t need to hedge as much as dollar-neutral. The bad news is that you might currently be short.


A plot of all the euro-denominated stocks in the OTAS universe, sorted by market cap. The y-axis is their beta to the Eurostoxx 50 index future. You can see that some stocks have a beta well above 100%, but the majority are significantly below 100%, as shown by the red and blue averaging lines (betas to the future and to the ETF respectively). (The calculation used almost 6 years of returns from 2010, 5-day returns, and some winsorising.)

In part 2, coming soon, we’ll hopefully look at some specific examples of when the dollar hedge has messed up a portfolio’s risk profile and performance.



Underlying data courtesy of Stoxx. The Stoxx indices are the intellectual property (including registered trademarks) of STOXX Limited, Zurich, Switzerland and/or its licensors (“Licensors”), which is used under license. None of the products based on those Indices are sponsored, endorsed, sold or promoted by STOXX and its Licensors and neither of the Licensors shall have any liability with respect thereto.