DSL - The bridge between productivity and performance

From July 12th to July 17th I was attending Summer School on DSL Design and Implementation at EPFL. This summer school completely changed my view about DSL.

Domain Specific Languages

Generally speaking, there are two categories of DSLs, external DSL and internal DSL. External DSLs are standalone languages, which don’t depend on other languages. For example, SQL is one of the most successful external DSLs. Other commonly used external DSLs include make, cron, etc.

Internal DSLs are also called embedded DSLs(EDSL), which are embedded in a host language. The DSL itself is usually called the object language. Most researches are around embedded DSL, because they are much easier to implement by reusing the syntactic structures and type system of the host language. Examples of embedded DSLs include Actor, Slick and Parser Combinator in Scala, Parsec in Haskell, etc.

There are two flavors of embedded DSL: shallow embbedding and deep embedding. This distinction was first put forward by Boulton in Experience with embedding hardware description languages in HOL(1992).

We can think shallow embbedding as some kind of syntactic sugar, which usually only uses macros, metaprogramming or just borrow the syntax and type system from the host language. Rakefile and Gemfile are examples of shallow embedding in Ruby, Actor and Parser combinator are shallow embedding in Scala.

Deep embedding usually involves syntactic or semantic representations of the DSL. These representations can then be transformed, optimised and evaluated. For example, Rails is considered as deep embedding in Ruby, because it has a lot of semantic representations for tables and queries. Deep embedding is the focus of DSL research, because it enables the development of powerful DSLs and optimisations that are impossible in shallow embedding.

Host Languages

Languages that have flexible syntatic structures, expressive type system and support of macros, reflections and manipulations of syntactic structures are good candidates for the development of EDSLs. In this regard, Scala, Haskell, Racket and Ruby are the best candidates for embedding.

The idea of code as data from Lisp is the key feature for implementing EDSLs. Both Scala and Haskell have support for quasi-quoting, and they go even furthur by supporting type-safe manipulation of ASTs.

Scala has very flexible syntactic structures, expressive type system and support of macros, language virtualization, reflection and AST manipulation which makes it ideal for development of embedded DSLs. The framework Lightweight Modular Staging in Scala is used in some successful DSL projects, such as Delite, LegoBase and Spiral.

Racket supports quoting, macros, manipulation of syntatic structures and custom syntactic parsers, which makes development of new DSLs in Racket super easy.

Haskell has a GHC extension called Template Haskell, which adds compile-time metaprogramming facilities to Haskell. Intuitively, Template Haskell enables manipulation of AST at compile-time in a type-safe way.

Ruby has a very natural syntax, and it supports a lot of metaprogramming techniques, such as extending any class or object at runtime, which makes Ruby an ideal language for developing DSLs. Rails is the most successful DSL for web development in Ruby.

Abstraction without Regret

Previously, I thought DSLs are only good for productivity, as they can benefit from the domain knowledge to improve the expressiveness of the language. However, that is only part of the story. In the summer school, most of the presentations were about using DSL for high-performance computing. DSL can be used for high-performance computing, because it enables domain-specific optimization, which is impossible in generic compiler optimisations. For example, in the project Spiral, the researchers use DSL to generate code directly from mathematical formulas, the generated C code beats the best benchmarks in the industry.

The way how it works is to use high-level abstractions in the host language to encode the function f: Rep A -> B -> Rep C, with specialized parameters represented by native type B and delayed parameters represented by representation type Rep A. During generation time the program representation is optimized by utilizing domain knowledge and simplified by specializing parameter type B with concrete values. The intermediate high-level abstractions such as high-level functions, option types, type classes are eliminated away in the process. The resulting representation only involves types that are equal to or simpler than A or C.

A theoretical explanation of the power of this approach is provided by Philip Wadler et al [2015]. The key of the explanation is to relate program to logic through the Curry-Howard Correspondance. The fact that high-level abstractions can be eliminated means that the intermediate high-level types can be eliminated. From the perspective of logic, it implies in the proof of A -> C, all intermediate complex propositions can be eliminated. This is a known theorem called sub-formula property in logic which was proved by Gentzen in 1935.

Broadly speaking, this approach of computation is called staged computation. The most widely used staged computation in real world is the compilation of regular expressions to speed up matching of strings.

Davies and Pfenning (2001) provided a theoretical foundation for staged computation based on intuitionistic modal logic.

Partial Evaluation

The idea of staged computation is closely related to another old idea: partial evaluation or program specialization. The idea of partial evaluation is to simplify a program by evaluating the parameters that are already known.

The idea of partial evaluation leads to enlightening perspective about compilation and compilers.

We can think compilation as partial evaluation of an interpreter with respect to a source program. Similarly, we can think compiler as partial evaluation of the partial evaluator itself with respect to a fixed interpreter.

Based on the idea of partial evaluation, we can easily turn an interpreter to a compiler.