Embedding Lisp in C++ – A Recipe

Challenging Clojure’s Integration with Java in Lisp with C++

Preamble – An uncommonly common language

Lisp may be said to be simultaneously the most common and near enough most uncommon programming language in the world. We can quantify this. Head over to the Tiobe Index of Programming Languages at http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html .

As of August 2014, the lead of the pack is populated by the usual suspects…  C, Java, Objective-C & C++

TIOBE Index Top 20 Lead


Lisp is barely number 19 on the top-20 index.


TIOBE Index Top 20

This of course proves nothing besides popularity. A single-lens reflex camera, while far more capable for photography, is still less common than the cameras built into smart phones.

Lingua Franca of your Compiler

When I say Lisp is more common than any other programming language I mean this: no matter whether you are programming in Python or C++, Lisp is invariably what holds up the scaffolding behind the scenes. This is true because Lisp is the lingua franca of your compiler. Your compiler does not work with the syntax that you see, such as semicolons in C++ or significant white space in Python. Rather, it discards this at the earliest possible moment and converts your code into an abstract syntax tree, or AST. That AST is composed of lists of lists containing statements and expressions.

Your compiler prefers this format because it needs a representation suitable for treating code as data, one of the core tenets of Lisp. It needs this because it must be able to both transform and optimise your code. For example, it may want to elide, re-arrange, or parallelise. Yet it must reason about the equivalence of the transformations it makes. In other words, your compiler needs to “calculate code,” precisely what Lisp is great at by way of its homoiconic syntax. So your compiler borrows this concept to build one or more Intermediate Representations (IR) using abstract syntax trees. Being a list processor that works with abstract syntax trees, your compiler may essentially be regarded as a Lisp engine. We can extract this intermediate representation from compilers like GCC and CLANG.
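To make "code as data" concrete, here is a minimal sketch in plain C++ of an expression held as nested lists, together with one transformation (constant folding) of the kind a compiler might apply. The names Node, num, sym, form, show and fold are hypothetical and exist only for this illustration; a real compiler IR is far richer.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

// A node is a number, a symbol, or a list (an operator followed by operands).
struct Node {
    enum class Kind { Number, Symbol, List };
    Kind kind;
    double value;               // used by Number
    std::string text;           // symbol name, or the operator of a List
    std::vector<NodePtr> args;  // operands of a List
};

NodePtr num(double v) {
    return std::make_shared<Node>(Node{Node::Kind::Number, v, "", {}});
}

NodePtr sym(std::string name) {
    return std::make_shared<Node>(Node{Node::Kind::Symbol, 0.0, std::move(name), {}});
}

NodePtr form(std::string op, std::vector<NodePtr> args) {
    return std::make_shared<Node>(Node{Node::Kind::List, 0.0, std::move(op), std::move(args)});
}

// Print the tree in Lisp notation.
std::string show(const NodePtr& n) {
    std::ostringstream out;
    switch (n->kind) {
    case Node::Kind::Number: out << n->value; break;
    case Node::Kind::Symbol: out << n->text; break;
    case Node::Kind::List:
        out << "(" << n->text;
        for (const auto& a : n->args) out << " " << show(a);
        out << ")";
        break;
    }
    return out.str();
}

// Constant folding: replace any "+" or "*" list whose operands are all
// numbers with the computed number. Everything else is rebuilt unchanged.
NodePtr fold(const NodePtr& n) {
    if (n->kind != Node::Kind::List) return n;
    std::vector<NodePtr> folded;
    bool all_numbers = true;
    for (const auto& a : n->args) {
        folded.push_back(fold(a));
        all_numbers = all_numbers && folded.back()->kind == Node::Kind::Number;
    }
    if (!all_numbers || (n->text != "+" && n->text != "*"))
        return form(n->text, std::move(folded));
    double acc = (n->text == "*") ? 1.0 : 0.0;
    for (const auto& a : folded)
        acc = (n->text == "*") ? acc * a->value : acc + a->value;
    return num(acc);
}
```

Building form("+", {sym("x"), form("*", {num(2), num(3)})}) and passing it through fold() rewrites the inner multiplication while leaving the symbol x alone. The program is manipulated as an ordinary data structure, which is the point the paragraph above makes about compilers.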

Let’s take the very trivial example of a C++ function square() being called from a function main.

Simple function square in C++


CLANG generates this AST for the function square() using the command “clang++ -cc1 -ast-dump hellofun.cpp”.


Square function AST

CLANG AST

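Immediately evident is the canonically correct Lisp indentation and code layout. This format is actually very illuminating because it makes it obvious, for example, where type casts have been inserted by the compiler. Similarly, we may elicit an intermediate representation from GCC as shown below. We use “g++ -fdump-rtl-dfinish hellofun.cpp”. Strictly speaking, this dumps GCC’s lower-level Register Transfer Language (RTL) rather than an AST, but RTL too is printed as nested Lisp-style lists.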

GCC AST


Returning to our analogy from photography, less capable cameras built into smart phones are more common than professional single-lens reflex cameras. But for the average professional photographer this is of no consequence. A professional wedding photographer capturing life’s shiny moments in glamorous portraits will not find himself compelled to reach for a smart phone camera. But how would this be different if wedding photography took months and could avail itself of pre-existing work done by multitudes of smart phone users the world over? Assume further that re-use depended on compatibility of the photographic material.

And here lies the problem of programming languages like Haskell, OCaml & Lisp. They are extremely expressive. But they require a certain mathematical acumen that eludes mainstream IT. Consequently the majority of problems solved in IT are expressed in less expressive languages. So while Haskell, OCaml & Lisp are more expressive, what is the use of being more expressive if you have to express most everything yourself? Being pragmatic means realising that re-using the wealth of very mature Java, C++ & Python libraries can be just as useful as, or more useful than, writing such library support yourself in a more expressive language. Of course this consideration is subject to other factors, such as whether you require proof of correctness, or the ability to evolve third-party libraries yourself. Understanding your requirements will go a long way here.

Expressiveness vs Mainstream Re-Use Cost Benefit Analysis

What trend can be expected in the future? We have highlighted a convergence of C++ on Lisp as well as the emergence of functional programming in C++ in previous articles. For C++ this is new. In Java this trend is old and open by admission. Guy Steele, co-author of the Java language specification at Sun Microsystems, is quoted as saying “We were after the C++ programmers. We managed to drag a lot of them about halfway to Lisp.” The original quote can be read in context here: http://people.csail.mit.edu/gregs/ll1-discuss-archive-html/msg04045.html. Guy Steele is also known as a co-creator of the Lisp dialect Scheme.

Yet the irony is this: each time another programming language adopts yet more features from Lisp (the same is true for functional programming and Haskell, OCaml etc.), this detracts from Lisp itself, or Haskell or OCaml. Why? Because the cost benefit analysis tips in favour of the less expressive, more mainstream language. The mainstream language will always have superior library support, yet the list of benefits bestowed exclusively by Lisp (or Haskell, OCaml, etc.) has just diminished. So rather than being a case of advocacy, the trend becomes self-defeating.

Programming languages are most often thought of as a man-machine interface. They are just as much a medium of communication between programmers and software engineers. It stands to reason then that the trend will always be towards mainstream. As university curricula target a wider audience, and IT becomes less a matter for computer scientists and mathematicians, so too mainstream IT will trend towards programming languages that require less mathematical acumen than is demanded by languages such as OCaml, Haskell or Lisp.

Rich Hickey understood this when he devised Clojure. There is only one thing more powerful than having extreme expressiveness or mature library re-use at your fingertips, and that is to have them both. Yet again, there is precedent. Common Lisp has been embeddable in C for some time by way of ECL, Embeddable Common Lisp. ECL has traditionally focused on embedding in C but, less well known, it also works with C++, including the more recent C++11. Coincidentally, as C++ tends more towards modelling state in closures and functions, the lack of emphasis on object orientation in ECL will become less of an issue.

A Recipe

The recipe we will present here will support the features shown below:

1)  The extreme expressiveness of Lisp embedded in C++, not just C

2) “Live programming” via a Python style REPL directly in a C++ process

3)  Support for bidirectional calls from Lisp to C++ and C++ to Lisp

4) Variable support for interpreted, byte-compiled and natively compiled operation

5) The ability to re-use not only C++ libraries from Lisp but also re-use all of Lisp’s libraries

     People have been writing the latter since around 1958. Why not use them?

6) A means of configuration management via Lisp to replace INI files or  XML

ECL Configuration

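If you are building ECL from source, make sure you are building with C++ support. See the screenshot below for an example. Note that one reader reports that the configure switch that worked for them is --with-cxx rather than the --with-c++ shown in the screenshot.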

ECL Configuration


The Source Code

Shown below is the C++ source code for our recipe. Assuming your ECL installation is to be found under the /usr/local prefix and further assuming you have saved the source code in the single file main.cpp, you might compile this example as follows:

g++ -std=c++11 main.cpp -I/usr/local/include -L/usr/local/lib -lecl -stdlib=libstdc++

Note that this example assumes OSX and g++ with an LLVM backend that requires the -stdlib=libstdc++ flag. On Linux, this would not be required.  Please refer to comments in the code for explanations.

main.cpp


/*
 "Example of a C++ program embedding ECL with two-way calls."
  Copyright (C) 2014 Chris Kohlhepp
*/

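#include <iostream>
#include <string>    // std::string, used by lisp()
#include <cstdlib>
#include <unistd.h>  // sleep(), used in the main loop
#include <ecl/ecl.h>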

// A macro to create a DEFUN abstraction in C++
// Credit: https://gist.github.com/vwood/662109
#define DEFUN(name,fun,args) \
 cl_def_c_function(c_string_to_object(name), \
 (cl_objectfn_fixed)fun, \
 args)

// Define some variables in C++ that we might wish to access from Lisp
auto elapsed = 0; // seconds elapsed
auto maxtime = 3600; // one hour

// Define some accessors.
cl_object runtime() {
 return ecl_make_integer(elapsed);
}

cl_object set_runtime(cl_object i) {
 auto seconds = fix(i);
 elapsed = seconds;
 return ecl_make_integer(elapsed);
}

// Define a function to run arbitrary Lisp expressions
cl_object lisp(const std::string & call) {
 return cl_safe_eval(c_string_to_object(call.c_str()), Cnil, Cnil);
}

// Initialisation does the following
// 1) "Bootstrap" the lisp runtime
// 2) Load an initrc to provide initial
//    configuration for our Lisp runtime
// 3) Make our accessors available to Lisp
// 4) Any In-line Lisp functions for later reference
void initialize(int argc, char **argv) {

 // Bootstrap
 cl_boot(argc, argv);
 atexit(cl_shutdown);

 // Run initrc script
 lisp("(load \"initrc.lisp\")");

 // Make C++ functions available to Lisp
 DEFUN("runtime", runtime, 0);
 DEFUN("set_runtime", set_runtime, 1);

 // Define some Lisp functions to call from C++
 lisp("(defun header () (format t \"Starting program...~%\"))");
 lisp("(defun makeanumber () 3.2)");
}

int main(int argc, char* argv[]) {

 // Bootstrap Lisp
 initialize(argc,argv);

 // Run some Lisp functions...
 // Demonstrates calling Lisp from C++
 lisp("(header)");

 // Demonstrate calling Lisp from C++ and
 // return its value to C++ using C++11 style syntax.
 auto x = ecl_to_float(lisp("(makeanumber)"));
 std::cout << "A number is " << x << std::endl;

 // Main loop
 // Do something "not so useful."
 for (; elapsed < maxtime; elapsed++){
    sleep(1);
    std::cout << "Time elapsed " << elapsed << std::endl;
 }

 return EXIT_SUCCESS;
}

initrc.lisp

(format t "I've run the contents of init.lisp~%")

(defun foo ()
    (format t "We called foo...~%"))

Putting It All Together

If you ran the above g++ command line, you will have a binary called a.out.  Let’s start this.

ECL C++ Sample Run


So what happened? We initialised our Lisp engine within C++, then performed any relevant initialisation via initrc.lisp. This will prove incredibly useful later. We then evaluated a Lisp function (makeanumber) and streamed its output to cout. Noteworthy here is that C++11 was happy to infer the type from ECL’s ecl_to_float() function, eliminating any redundancy in type declarations. Incidentally (makeanumber) has been byte-compiled. Subsequently we entered our program’s main loop.
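The type inference mentioned above can be checked in isolation. The following is a small sketch that does not link against ECL: to_float_stub is a hypothetical stand-in for a C function that returns float, as ecl_to_float() does.

```cpp
#include <type_traits>

// Hypothetical stand-in for a C API returning float (not part of ECL).
float to_float_stub() { return 3.2f; }

// Returns true when `auto` deduced float from the call's return type.
bool auto_deduces_float() {
    auto x = to_float_stub();
    return std::is_same<decltype(x), float>::value;
}
```

The variable x takes its type from the function’s declared return type, so the call site never has to repeat it.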

Now to make this slightly more interesting. We hit CTRL-C.

Lisp REPL in C++


We now have a REPL inside our C++ process using Lisp’s excellent exception handling and restarts system. Restarts are one of the finer points of Lisp, one yet to find its way into C++. Having a REPL means we can go and poke around. Whatever we do here will be interpreted. One of the functions we defined was (runtime); it returns our loop variable. Let’s try that.

C++ Lisp Repl - 1


Ok, so our C++ loop variable has the value 6. But really we can run anything that Lisp has scope to… arithmetic, anything.  This is really useful, because we might, for example, interactively redefine a Lisp function subsequently called as part of the regular execution of our C++ program.  This gives rise to an entire style of programming otherwise alien to C++: Live Programming.

In fact, let’s do this right now. The Lisp function (makeanumber) we called from C++ evaluates to the constant 3.2. We can verify this by re-evaluating it in the REPL. Let’s change it. We’ll redefine the function to return something else: 6.4.

Live Programming


There is nothing inherent about using constants here. This could be an arbitrarily complex operation. Indeed we might find other ways to inject the operation into our program, apart from hitting CTRL-C and getting a REPL. We might, for instance, inject this logic via something like Zero-MQ, a popular message bus technology that abstracts a range of architectural patterns. Of course, our C++ program does not call (makeanumber) again, but if it did, you get the idea… immediate feedback without the edit-compile-debug cycle. Hence the name Live Programming.
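What makes redefinition work is late binding: the caller looks the function up by name at each call rather than being bound to one body at compile time. The sketch below imitates this in plain C++ with a name-to-function table. It is an analogy only, not how ECL is implemented, and registry, define and call are hypothetical names.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the Lisp function namespace.
std::map<std::string, std::function<double()>>& registry() {
    static std::map<std::string, std::function<double()>> table;
    return table;
}

// Define or redefine a named function while the program is running.
void define(const std::string& name, std::function<double()> body) {
    registry()[name] = std::move(body);
}

// Look the function up by name at the moment of the call.
// Throws std::out_of_range if the name was never defined.
double call(const std::string& name) {
    return registry().at(name)();
}
```

After define("makeanumber", [] { return 3.2; }), a later define("makeanumber", [] { return 6.4; }) changes what every subsequent call("makeanumber") returns, without recompiling the caller. With ECL the lookup is done by the Lisp runtime each time lisp("(makeanumber)") is evaluated.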

Now let’s confuse our C++ runtime a bit. Say we want our loop variable to assume the value 60 instead and proceed from there. The (set_runtime) accessor we exposed to Lisp exists for exactly this purpose. Remember those restarts? Interrupts such as CTRL-C are “restartable.” Just tell Lisp to (continue).

C++ Lisp REPL Restart


Iterations 7..59 were skipped and C++ continued with iteration 60.
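The effect on the loop can be simulated without ECL or signals. In this rough sketch the hypothetical on_tick callback stands in for the interrupt and REPL session, and the returned vector stands in for the "Time elapsed" lines that would have been printed. The exact iteration at which the interrupt lands is an assumption.

```cpp
#include <functional>
#include <vector>

// Same loop shape as in main.cpp, minus sleep() and printing.
// `on_tick` may modify the loop variable, as (set_runtime) does from the REPL.
std::vector<int> run_loop(int& elapsed, int maxtime,
                          const std::function<void(int&)>& on_tick) {
    std::vector<int> printed;
    for (; elapsed < maxtime; elapsed++) {
        on_tick(elapsed);            // stands in for CTRL-C, REPL, (continue)
        printed.push_back(elapsed);  // stands in for "Time elapsed N"
    }
    return printed;
}
```

If the callback sets the variable to 60 when it sees 7, the recorded sequence runs 0 through 6 and then resumes at 60. The loop itself is unchanged; only the state it reads was altered from outside.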

Beyond Live Programming and the REPL, it is easy to see how this paradigm might be extended to provide configuration management. If we can set application parameters and script this in a file without having to compile and link a new C++ binary, then we have a means of configuration management. But don’t we have XML for this today? We do. And a one-to-one comparison of XML vs Lisp based configuration could fill pages and start several flame wars. Yet this is not the goal.

We do observe that we have included but one single header file, ecl.h, and in turn ended up with a REPL in C++, Live Programming and Configuration Management, all in one. A key aim of software engineering is to manage and reduce complexity. The astute reader will observe that all our boilerplate code so far fits in about 50 lines of code, excluding comments. A paradigm that solves a problem in 500 lines of code is inferior to a paradigm that solves the same problem in 50 lines of code. A paradigm that solves 3 or more problems in 50 lines of code…

More than just a REPL, this so-called BREAK-LOOP hides a full featured symbolic debugger.

Break Loop Symbolic Debugger


Just to recap, so far we have seen C++ calling in-line Lisp; Lisp calling C++; a Lisp REPL inside of a C++ process; a full symbolic Lisp debugger inside of C++; byte compiled and interpreted mode of execution; as well as trivial Live-Programming. We are yet to see full integration with Lisp’s package management system and fully compiled Lisp code inside of C++.

For more information about package management, you might wish to read up on ASDF and Quicklisp.  There are some 1000+ libraries available under Quicklisp. We will skip the detail, but think cmake-and-Python-PIP combined. Imagine I wanted to use sqlite – how would I make this available to my application ?  Like so:

(ql:quickload 'sqlite)

Quicklisp


This achieves the equivalent of Python’s PIP install.  How do we make this available within a Lisp application? We “require” it.

(require 'sqlite)

Pythonistas know this as “import.” But this is fully compiled code. No interpreter, no GIL ( Python global interpreter lock limiting concurrency ). Just the same convenience as Python.

Require Import


The real question is: how do I make this available inside of C++ ? Well, essentially the same way we demonstrated above in the REPL. What works in the REPL, works the same if byte-compiled or fully compiled. When ECL starts, it loads a bootstrap file called .eclrc from the user’s home directory. My .eclrc file has three lines. The first two are:

   (require :asdf)
   (require :ecl-quicklisp)

The first imports ASDF, the second imports Quicklisp.  An embedded ECL instance does not load .eclrc by default since there is an expectation that the application might be deployed outside of the context of the developer’s home directory. But our recipe already envisages its own bootstrap file called initrc.lisp — associated specifically with our C++ application.  Loading sqlite support from within our embedded Lisp C++ application is thus essentially reduced to :

(require :asdf)
(require :ecl-quicklisp)
(require 'sqlite)

But we call this from the application initrc.lisp rather than the default bootstrap file.

This brings us to our final point: fully compiling our Lisp code for better performance inside of our C++ application. What we are after is the expressiveness of Python without its lacklustre performance. To make this a little more interesting, we will inline C++ directly inside Lisp. Matthias Benkard’s journal has a great post on how to inline C++ in ECL.  The (c-inline) macro can be persuaded to inline C++ as well as C. What is not immediately obvious from the posting is that the code presented is not immediately usable. Rather inlining C++ presumes static compilation.  Matthias gives the following example:

C++ Inlined in ECL


To use Matthias’ code we must first compile it — as we might rather expect with any C++ source.
We do this simply via (compile-file) and (load) directly from within Lisp. Now executing the function (c++hex) works as expected.

ECL C++ Inlining


Here again, if we want to avail ourselves of this technique in our C++ recipe, we require but one small modification to our initialize() function – two lines of code.  We replace (load) with (compile-file) and a subsequent (load) with the latter eliding the file name extension.

void initialize(int argc, char **argv) {

...

// Run initrc script
// lisp("(load \"initrc.lisp\")");
lisp("(compile-file \"initrc.lisp\")");
lisp("(load \"initrc\")");

...

}

This produces an interesting JIT style behaviour when running our C++ application. We can even observe the system CLANG compiler doing its magic because CLANG warnings are finding their way to standard out.

CLANG JIT of ECL


To conclude, we have changed but one line of Lisp code and added another to our C++ recipe and have in effect added both Lisp JIT and C++ JIT capabilities — prototyped and demonstrated working all in under one minute of coding. Solving complexity with the smallest number of moving parts: this is what software engineering is all about!

Like Clojure to Java, ECL can be used to host Lisp within C++. Head over to Meta-Circular Adventures in Functional Abstraction on how to leverage this capability for full featured functional programming. One key difference with ECL and C++ is that we have simple yet effective control over whether our code is compiled ahead of time (AOT) or just in time (JIT). We may choose to interpret, byte-compile or fully AOT/JIT compile at a point of our choosing.

14 responses to “Embedding Lisp in C++ – A Recipe”

  1. Very interesting approach. Thank you.

  2. Greg Helton

    Thanks. This is great.

    I think an ‘n’ is missing … “We initialised our Lisp engine within C++, the performed …” should be “… then performed …”?

  3. Hal

    GPLv2 not playing nicely with every other license means, sadly, this is likely to fail further investigation at that hurdle. Not bashing, GPLv2 is a fine license but there are plenty of good reasons for not using it which preclude also using ECL.

    • My understanding is that the license is LGPL which means there is a linking exception – precisely the sort of thing you’d want for embedding. In short, if you change the ECL runtime itself, you need to contribute your changes back. Your client code is unaffected. There is an FAQ on this: http://ecls.wikispaces.com/FAQ. Check section 1.5 entitled “What about the license.” I think this is reasonable for a compiler license. GCC too is part of GNU and unless you happen to modify the compiler itself AND distribute it, your C/C++ client code and artifacts generated by GCC are unaffected. Unless I’ve misread the license, ECL is rather the same.

  4. No One

    How does the ecl ctrl^c signal handler know it is safe to call the C routines? Either it’s unsafe or there’s some black magic you are not telling us about ….

  5. Spectacular post. Very much appreciated!

  6. Interesting! In this case you’re explicitly exposing only certain C functions as Lisp functions, and calling Lisp using a special call where you pass a string. But if the goal is to challenge Clojure’s integration with Java, it should be transparent so all Lisp functions and variables are visible in C++ using C++ syntax, and all C++ functions and variables are visible in Lisp using Lisp syntax.

    • Hello BRobinson, you are right. This approach “thin slices” the problem by selectively exposing parts, not the whole of one language to the other. Clojure goes one “step up” from this by providing universal access. As a point of interest, I should write a follow-up to this article using Clasp Lisp, a new Lisp for the LLVM. It is more along the lines of what you discussed. That said, I wonder if Clasp and Clojure are doing the right thing here. The aim, ostensibly, is re-use of existing software with ease while allowing Lisp style meta programming to attain a higher level of abstraction. Hylang does the same with Python. Yet, given that programming paradigms are so vastly different between functional and imperative, between expression based and statement based languages, I would tend to think that the resulting artifacts would form subsystems that evolve largely independently. If that’s the case, are we better off with loose rather than tight coupling and an explicit interface between the two, one that avoids the hassle normally associated with FFI based interaction, but that nonetheless keeps dependencies from one side of the fence spilling to the other? I’m thinking along the lines of Google Protocol Buffers. Yet, I suspect that while this might make sense from a large scale systems point of view, it would confine meta programming solely to whatever the Lisp dialect is that sits on top of its imperative cousin. Feedback welcome…

  7. Pingback: Domain Specific Languages in C++ – Part 2: External DSLs | Simplify C++!

  8. This is brilliant and I want to get it ALL to work. I am sure I configured --with-c++ and got everything going up to and not including compiling benkard.lisp (mac yosemite). It generates a .c file, and then does (RUN-PROGRAM “gcc” …), which can’t find it because it’s not a .cpp file. My attempt to call (RUN-PROGRAM “g++” …) fails because RUN-PROGRAM is not known at the REPL. I am now looking at going down the FFI rabbit hole to debug this, but perhaps you can see it right away. There is a transcript here if you’re willing and able to help http://pastebin.com/4LKA7mW6.

  9. SOLVED: to get the benkard example running, looks like the required switch to ./configure is --with-cxx and not --with-c++ (which your screenshot shows). There is another typo in your blog: you say (ql:quicklisp 'sqlite), but it’s (ql:quickload 'sqlite). This time, your screenshot is right and the text is wrong 🙂 This is brilliant stuff!
