Domain Specific Language (DSL) - Overview


The C and C++ programming languages weren't designed specifically for trading system automation and timeseries modelling. Implementing even relatively basic financial indicators like Welles Wilder's Relative Strength Index (RSI) can be very frustrating when starting from scratch. Fortunately, C++ is a general purpose programming language that supports both operator overloading and template meta-programming. These features allow C++ to act as a 'host language' for more specialized languages, each designed to solve a particular domain of problems. Such languages are called Domain Specific Languages (DSLs). Numerous C++ hosted DSLs already exist, covering a variety of domains.

Some of the more popular ones mimic the capabilities of SQL (Structured Query Language; itself a domain specific language). Such a C++ hosted DSL extension could allow the creation of SQL statements in C++ without having to directly write and modify strings of SQL commands. A sample SQL statement in DSL could look something like:

 

          db( select( count( table.id)).from(table).where(table.name == "customers"));
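
To make the mechanism less abstract, here is a minimal sketch of how operator overloading can turn a C++ comparison into a piece of SQL text. Column, Condition and the overloaded operator== are hypothetical names invented for this illustration; they are not taken from any particular SQL DSL library, and the sketch does no escaping of the quoted value.

```cpp
#include <string>

// Hypothetical stand-in for a table column known to the DSL.
struct Column {
    std::string name;
};

// Hypothetical result of a comparison: a fragment of SQL text.
struct Condition {
    std::string sql;
};

// Comparing a column with a string does not yield a bool here.
// Instead it builds the text of a WHERE condition.
inline Condition operator==(const Column& column, const std::string& value) {
    return Condition{column.name + " = '" + value + "'"};
}
```

With these definitions, an expression such as name_column == "customers" produces a Condition holding the text name = 'customers', which a where() function could then splice into the final statement.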

 

As was demonstrated in the previous chapter, when working with series (sequences) of data, writing basic functions such as 'average()' is easy. But complexity quickly grows when functions need to be defined in terms of other functions where interim results need to be stored as series. A simple stateless framework is not suitable for such cases. Function objects (functors) are more suitable, but users would still be confronted with certain shortcomings such as series-bounds-error exceptions which were discussed in detail in the previous chapter.

Presenting a Domain Specific Language (DSL) for Series Modelling


TS-API offers its own DSL extension for modelling timeseries. The DSL extension is quite extensive and consists of various groups of functions. In order to tell them apart from their regular non-DSL counterparts, all DSL function names are CAPITALIZED and thus easily recognizable. In many cases a capitalized acronym is used (e.g.  EMA, SMA, WMA, RSI, CCI) to keep the code concise.

 

The DSL concept is not difficult to grasp but takes a little getting used to. There is much that happens behind the scenes to keep the programming experience intuitive and predictable. As mentioned earlier, DSL functions can be nested, where the value returned from a function can immediately be used as input to another function. For example, an average of an average can be written simply as:

 

SMA( SMA(data,10), 10);

 

This immediately conveys the feeling of a dedicated language. Another major DSL benefit is that many functions have constant evaluation time regardless of period! Calculating the correlation between two series over a 100 bar period has the same overhead as calculating the correlation over a 10000 bar period! That's a nice bonus when working with short timescales and millions of data items. Constant evaluation time is possible when a DSL function definition only relies on functions that themselves have constant evaluation time. Such functions are defined using an optimized algorithm, where the algorithm only needs to look at the most recent value (and sometimes the oldest value) in a series to update internal algorithm state.

 

Another important bonus of using the DSL extension is that there are no series-bounds-error exceptions to take into consideration. Each DSL function is evaluated only when the series it takes as argument has sufficient data. The DSL programming experience is thus quite pleasant and unencumbered.
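
The two properties described above (constant evaluation time, and no result until enough data is available) can be illustrated with a small stand-alone class. This is only a sketch under stated assumptions: RollingSma is a hypothetical name, and nothing here is meant to describe how TS-API implements SMA() internally.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper: a simple moving average whose per-bar cost
// does not depend on the period.
class RollingSma {
public:
    explicit RollingSma(std::size_t period) : period_(period) {
        assert(period > 0);  // a zero period has no meaningful average
    }

    // Feed one new value. Returns true once `period` values are available,
    // i.e. once value() is meaningful.
    bool push(double value) {
        window_.push_back(value);
        sum_ += value;                 // add the most recent value
        if (window_.size() > period_) {
            sum_ -= window_.front();   // remove the oldest value
            window_.pop_front();
        }
        return window_.size() == period_;
    }

    // Average of the last `period` values. Only valid after push() returned true.
    double value() const { return sum_ / static_cast<double>(period_); }

private:
    std::size_t period_;
    std::deque<double> window_;
    double sum_ = 0.0;
};
```

Each call to push() touches only the newest and the oldest value, so a 10000 bar period costs the same per bar as a 10 bar period. A production version would also have to consider rounding drift in the running sum over very long series.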

'Preparing' DSL


DSL functions are defined either in terms of other existing DSL functions, or, alternatively, are implemented as function objects. We'll take a closer look at this mechanism shortly. For now, we are more interested in how DSL functions are actually used by the end user.

 

A strategy's DSL code is evaluated only once! It is evaluated as part of the strategy's strategy::on_prepare_dsl() member, which is invoked when the strategy begins evaluation, just after the call to strategy::on_start(). DSL functions create and bind 'function objects' that will, once the strategy begins its evaluation loop (see strategy::run()), evaluate the required algorithms iteratively over all strategy bars. Here is an example:

 

          class my_strategy : public strategy
          {
              in_stream mkt;
              series<double> a, b, c;

              // Invoked once, just after on_start(); sets up the DSL bindings.
              void on_prepare_dsl(void) override
              {
                  a = SMA( mkt.close, 5);
                  b = SMA(SMA( mkt.close, 5), 5);
                  c = SMA(SMA(SMA( mkt.close, 5), 5), 5);
              }
          };

 

You can think of what happens in on_prepare_dsl() as a form of 'compilation'. Going back to the SQL analogy, the process is similar to the concept of 'stored procedures' and 'prepared statements' as used by Relational Database Management Systems (RDBMS). When a stored procedure is created in an SQL database, the database first parses the statement and then finds an optimized execution path. Subsequent calls simply skip all this overhead.

 

DSL as a 'Functional' Language:

TS-API's DSL is 'functional'. That is, the code looks like functions calling other functions and so forth. This is similar to formulas as used in spreadsheet applications. Spreadsheet 'data columns' and 'calculated columns' are a natural way of working with series, timeseries and series functions. You can think of DSL as a large spreadsheet application where new data is automatically added to the bottom row and all formulas are automatically recalculated.

The primary benefit of using DSL is that very little can go wrong! There is thus very little to debug! Almost all DSL errors are caught at compile time! Run-time errors are essentially limited to 'division by zero' exceptions. Furthermore, the DSL code is only evaluated once! You step through on_prepare_dsl() once, and you are essentially done! No debugging of complex loops necessary!

DSL Returns series<T> References


DSL functions return references to series objects! As an example, the AVERAGE() function does not return a single value, like its non-DSL average() counterpart, but rather returns an entire series of average values. This is semantically very different! This is why we are able to assign the value returned by AVERAGE() directly to a series<double> object, or nest it inside another call to AVERAGE(), or any other DSL function.

 

on_prepare_dsl() does not evaluate any financial indicators but rather puts in place the objects, structures and bindings required to perform the required calculations at 'strategy run-time', when strategy::run() is invoked. At 'strategy run-time' the underlying logic is then evaluated iteratively, one bar at a time. Stepping through DSL code is really an illusion of sorts since the series references returned by DSL functions still reference empty series. The series are only filled with data during 'strategy run-time'.
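
The split between preparing and running can be modelled in a few lines of plain C++. The following is a sketch only: Series, Functor, SmaFunctor and MiniStrategy are hypothetical names made up for this illustration and are not TS-API types, and the averaging loop is deliberately naive to keep the example short.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

using Series = std::vector<double>;  // hypothetical stand-in for series<double>

// Base class for the nodes created during the prepare step.
struct Functor {
    Series output;                   // the functor owns its output series
    virtual ~Functor() = default;
    virtual void evaluate() = 0;     // invoked once per bar
};

// Averages the last `period` values of `input` once enough values exist.
struct SmaFunctor : Functor {
    const Series& input;
    std::size_t period;
    SmaFunctor(const Series& in, std::size_t p) : input(in), period(p) {}
    void evaluate() override {
        if (period == 0 || input.size() < period) return;  // not enough data yet
        double sum = 0.0;
        for (std::size_t i = input.size() - period; i < input.size(); ++i)
            sum += input[i];
        output.push_back(sum / static_cast<double>(period));
    }
};

// Hypothetical stand-in for the strategy: owns the functors, drives the loop.
struct MiniStrategy {
    Series close;                                    // input data
    std::vector<std::unique_ptr<Functor>> functors;  // evaluated in creation order

    // Prepare step: creates a node and returns a reference to its
    // output series, which is still empty at this point.
    const Series& SMA(const Series& in, std::size_t period) {
        functors.push_back(std::make_unique<SmaFunctor>(in, period));
        return functors.back()->output;
    }

    // Run step: append one bar, then evaluate every functor once.
    void on_bar(double price) {
        close.push_back(price);
        for (auto& f : functors) f->evaluate();
    }
};
```

Calling SMA(SMA(close, 2), 2) on such an object only wires two nodes together; both returned references point at empty series until on_bar() has been called often enough. Because the inner node is created first, it is also evaluated first on every bar, so the outer node always sees up-to-date input.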

 

series_cref<T> and sref<T>


A reference to a series<T> is represented by class series_cref<T>.  Since many functions declare arguments as well as return values of type series_cref<T> a shorter alias was introduced. This alias is sref<T>. The following is a simple DSL function that calculates the inner-product between two series over a given period.

 

          sref<double> INNER_PRODUCT(sref<double> x, sref<double> y, size_t period)
          {
              return SUM(x * y, period);
          }

 

Using the 'auto' Keyword


 

DSL supports series of multiple types. This is why series<T> and series_cref<T> are declared as template classes! Since most applications built with the TS-API library are numerical in nature, the most commonly used types will be series<double>, series_cref<double> and sref<double>. To keep the code concise, a convention was introduced that replaces sref<double> and series_cref<double> with the C++ auto keyword. If you see auto in DSL code, you can therefore rely on it being a placeholder for series_cref<double> and its sref<double> alias.

 

The following is a DSL implementation of the algorithm calculating 'variance'. Note the use of auto as a placeholder for series_cref<double>.

 

          sref<double> VARIANCE(sref<double> ser, size_t period)
          {
              verify_non_zero(period);
              auto s_of_sq = SUM(POW(ser, 2.0), period);     // sum of squares
              auto sum_sq  = POW(SUM(ser, period), 2.0);     // square of the sum
              auto vari    = (s_of_sq - (sum_sq / period)) / period;
              return vari;
          }
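
The expression used here is the population variance written in terms of two running sums: (sum of squares - (sum)^2 / n) / n. As a sanity check, the same arithmetic can be applied to a plain window of values outside the DSL. The helper below, variance_last, is a hypothetical name used only for this sketch and is not part of TS-API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Population variance of the last `period` values of `data`, computed as
// (sum of squares - (sum)^2 / n) / n.
double variance_last(const std::vector<double>& data, std::size_t period) {
    assert(period > 0 && period <= data.size());
    double sum = 0.0;
    double sum_of_squares = 0.0;
    for (std::size_t i = data.size() - period; i < data.size(); ++i) {
        sum += data[i];
        sum_of_squares += data[i] * data[i];
    }
    const double n = static_cast<double>(period);
    return (sum_of_squares - (sum * sum) / n) / n;
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the sum is 40 and the sum of squares is 232, so the variance over all eight values is (232 - 1600 / 8) / 8 = 4.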

 

Auto Casting of Constants to Series


DSL does not operate on individual values - only on series of values. You cannot access or operate on individual values using DSL functions. Nonetheless, the above VARIANCE function contained the following line:

 

          auto vari = (s_of_sq - (sum_sq / period)) / period;

 

Here the 'period' argument is neither a series nor a reference to a series. This is allowed because this period is treated as a constant value and is automatically cast to a series of constants by operator/. More explicitly the line could have been written as:

 

          auto vari = (s_of_sq - (sum_sq / CONST(period))) / CONST(period);

 

where CONST<T>() is a template function that casts a given value to a series of constants. As a beginning DSL user you may want to use CONST<T>() explicitly. Occasionally you will have to use CONST<T>() explicitly regardless, most notably when passing constants to the IF() function, which is not able to perform an automatic cast of constants.
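
How an operator can perform such a cast on the caller's behalf is easiest to see in a reduced model. The sketch below is an assumption-laden illustration, not TS-API code: Expr is a hypothetical stand-in for a series reference, and this CONST() is a simplified, non-template version written only for the example.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical stand-in for a series reference: maps a bar index to a value.
struct Expr {
    std::function<double(std::size_t)> at;
};

// Explicit cast: a series holding the same value at every bar.
inline Expr CONST(double v) {
    return Expr{[v](std::size_t) { return v; }};
}

// Series divided by series, element by element.
inline Expr operator/(const Expr& a, const Expr& b) {
    return Expr{[a, b](std::size_t i) { return a.at(i) / b.at(i); }};
}

// Series divided by a plain number: the overload casts the number to a
// series of constants and forwards to the series / series version.
inline Expr operator/(const Expr& a, double k) {
    return a / CONST(k);
}
```

In this model x / 4.0 and x / CONST(4.0) build the same expression. A function that lacks the second kind of overload can only accept the explicit form, which is one plausible reading of why IF() requires CONST<T>().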

 

Series Lifetime


In DSL, references to series are passed around everywhere. But where do the series objects actually live? Who owns them? Unless a series is explicitly created as a strategy member, it is created behind the scenes by a function object, which is itself owned by the strategy object. As a DSL user, you are not allowed to instantiate series<T> objects directly inside DSL functions. Such series objects would simply go out of scope when the DSL function returns, without ever having served a useful purpose.

While DSL functions are usually defined in terms of other DSL functions, many basic library DSL functions are defined in terms of a function object, also known as a functor. Such functors encapsulate both the DSL function's state and its logic. For example, the RAND_INT(int low, int high) function returns a reference to a series of random values. It is implemented as follows:

 

          sref<int> RAND_INT(int low, int high) {
              class local_class : public functor::parent<int> {   // Define functor class
                  tsa::rand_gen rng;
              public:
                  int m_low, m_high;
                  local_class(void) : functor::parent<int>(1) {}  // One output series
                  virtual void evaluate(void) override {
                      parent::push(rng.int32(m_low, m_high));
                  }
              };
              auto ftor_ptr = new local_class();                  // Create functor object
              ftor_ptr->m_low = low;                              // Set properties
              ftor_ptr->m_high = high;

              strategy_context::strategy_ptr()->assume_functor_ownership(ftor_ptr); // Strategy takes ownership
              return ftor_ptr->output(0).plot_as("RAND-INT", { low, high },   // Optional plot instructions
                  plot_type::line, color::auto_color, 2);
          }

 

As you can see, the RAND_INT() function declares and defines a local class which inherits from functor::parent<int>. This functor parent is initialized in the class constructor as  functor::parent<int>(1). The 1 represents the number of output series the functor will have, in this case just a single one. The reference returned by RAND_INT()  is thus a reference to the series created and owned by the functor::parent. The functor itself is owned by the strategy which takes ownership via the call to assume_functor_ownership(ftor_ptr). The strategy pointer itself comes from a 'thread local' variable which greatly simplifies the entire library.

 

There you have it! This is how DSL works at the fundamental level. The strategy DSL part is thus a more or less large collection of functors, each with its own evaluate() member, all of which get invoked once at every strategy interval. Most basic DSL functions were implemented this way.

 

Returning Series Tuples


DSL functions are able to return multiple series references as a tuple (of type series_tuple<T>). This allows a single DSL function to return a number of related series. Consider the 'Bollinger-Bands' indicator, which consists of a centre moving average as well as two bands, one above and one below the centre line (see Tutorial 211). The function is able to return all three series as a single tuple.

 

          series_tuple<double> BOLLINGER_BANDS(sref<double> hi, sref<double> lo, sref<double> cl,
              size_t period, double std_dev_mult)
          {
              auto tp     = TYPICAL_PRICE(hi, lo, cl);
              auto avg    = SMA(tp, period);
              auto offset = STDEVP(tp, period) * std_dev_mult;
              auto upper  = avg + offset;
              auto lower  = avg - offset;

              series_tuple<double> tuple = {
                  upper.name("upper"),
                  avg.name("avg"),
                  lower.name("lower")
              };
              return tuple;
          }

 

From Tutorial 211:

Control of Flow in DSL


TS-API's functional DSL does not have control of flow because control of flow constructs like switch and if/else if/else cannot be overloaded in C++. This is not a problem in C++11 and later, because DSL functions like TRANSFORM() take a std::function argument, which means you can pass in a lambda function or function object, which, in turn, can define all the control of flow you need.

 

For example, an ABS() function that returns a series of absolute values could be simply implemented as:

 

          sref<double> ABS(sref<double> ser) {
              return TRANSFORM(ser, [](double a) {
                  if (a >= 0.0) return a;
                  else return a * (-1.0);
              });
          }

 

The above code passes a lambda that performs the required calculation to the TRANSFORM() function. In this particular case, a DSL based solution would have been possible via the conditional IF function:

 

          series_cref<double> ABS(const series<double>& ser) {
              return IF(ser >= CONST(0.0), ser, ser * CONST(-1.0));
          }

 

This produces the same output as the first version above. Nonetheless, in the case of complex expressions featuring many operators, a lambda based implementation will give improved performance!

 

Just keep in mind that if you cannot easily implement an idea using DSL, you can always fall back to a lambda or function object based solution. We'll cover various examples in later tutorials.
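
As a rough illustration of that fallback, the sketch below applies a lambda with a three-way branch to every element of a series. It is a plain C++ model under stated assumptions: transform_series is a hypothetical, eagerly evaluated stand-in for TRANSFORM(), and sign_of is an example function invented here.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical eager stand-in for TRANSFORM(): applies `fn` to every element.
std::vector<double> transform_series(const std::vector<double>& in,
                                     const std::function<double(double)>& fn) {
    assert(fn);  // an empty std::function cannot be called
    std::vector<double> out;
    out.reserve(in.size());
    for (double v : in) out.push_back(fn(v));
    return out;
}

// A three-way branch written as ordinary C++ inside a lambda:
// +1 for positive values, -1 for negative values, 0 otherwise.
std::vector<double> sign_of(const std::vector<double>& in) {
    return transform_series(in, [](double v) -> double {
        if (v > 0.0) return 1.0;
        else if (v < 0.0) return -1.0;
        return 0.0;
    });
}
```

Expressing the same logic with nested conditional DSL calls is possible, but the lambda keeps all branches in one place and reads like regular C++.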

Plot Formatting


One last benefit of DSL based functions is that logic and formatting can be defined in the same scope. Consider the following DSL implementation of the well known MACD indicator. The following function returns a tuple of series<double> objects. At the point where we return the tuple, we use the series's plot_as() member to set information about how the series should be plotted and labeled.

 

 

          series_tuple<double> MACD_DEMO(sref<double> close,
              size_t period_short, size_t period_long, size_t period_signal)
          {
              auto ma_short  = EMA(close, period_short);
              auto ma_long   = EMA(close, period_long);
              auto macd_line = ma_short - ma_long;
              auto signal    = EMA(macd_line, period_signal);
              auto histo     = macd_line - signal;   // Histogram: MACD line minus signal line

              // Create tuple return value.
              series_tuple<double> tuple = {
                  macd_line.name("macd").plot_as("MACD", { period_short, period_long },
                      plot_type::line, color::blue),
                  signal.name("signal").plot_as("MACD-SIG", period_signal, plot_type::line, color::white),
                  histo.name("hist").plot_as("MACD-HIST", 0, plot_type::bar, color::green),
              };
              return tuple;
          }

 

The MACD indicator, as defined above, when plotted: