API calls in terralib that return arrays always return a List object, a more complete list data type for use inside Lua code.
The List type is a plain Lua table with additional methods that make it easier to meta-program Terra objects.
local List = require("terralist")
List() -- empty list
List { 1,2,3 } -- 3 element list
Creates a new list, possibly initialized by a table.
List also has the following functions:
-- Lua's string.sub, but for lists
list:sub(i,j)
-- reverse list
list:rev() : List[A]
-- apply fn to every element
list:app(fn : A -> B) : {}
-- apply map to every element resulting in new list
list:map(fn : A -> B) : List[B]
-- new list with elements where fn(e) is true
list:filter(fn : A -> boolean) : List[A]
-- apply map to every element, resulting in lists which are all concatenated together
list:flatmap(fn : A -> List[B]) : List[B]
-- find the first element in list satisfying condition
list:find(fn : A -> boolean) : A?
-- apply k,v = fn(e) to each element and group the values 'v' into bin of the same 'k'
list:partition(fn : A -> {K,V}) : Map[ K,List[V] ]
-- recurrence fn(a[2],fn(a[1],init)) ...
list:fold(init : B,fn : {B,A} -> B) -> B
-- recurrence fn(a[3],fn(a[2],a[1]))
list:reduce(fn : {B,A} -> B) -> B
-- is any fn(e) true in list
list:exists(fn : A -> boolean) : boolean
-- are all fn(e) true in list
list:all(fn : A -> boolean) : boolean
Every function that takes a higher-order function also has an i variant that additionally passes the list index to the function:
list:mapi(fn : {int,A} -> B) -> List[B]
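The signatures above can be read against a small plain-Lua model. This is only a sketch of the documented behavior on ordinary tables, not the code in terralist; the free functions `map`, `filter`, `fold`, and `mapi` are hypothetical stand-ins for the methods of the same names, and the accumulator-first argument order of `fold` is assumed from the signature `fn : {B,A} -> B`.

```lua
-- Plain-Lua model of a few List methods (illustration only, not terralist's source).

-- map: build a new list from fn(e) for every element e.
local function map(list, fn)
  local out = {}
  for i, e in ipairs(list) do out[i] = fn(e) end
  return out
end

-- filter: keep only the elements for which fn(e) is true.
local function filter(list, fn)
  local out = {}
  for _, e in ipairs(list) do
    if fn(e) then out[#out + 1] = e end
  end
  return out
end

-- fold: thread an accumulator through the list, starting from init.
-- Assumption: the accumulator is the first argument, as in fn : {B,A} -> B.
local function fold(list, init, fn)
  local acc = init
  for _, e in ipairs(list) do acc = fn(acc, e) end
  return acc
end

-- mapi: like map, but fn also receives the 1-based index.
local function mapi(list, fn)
  local out = {}
  for i, e in ipairs(list) do out[i] = fn(i, e) end
  return out
end

local squares = map({1, 2, 3}, function(x) return x * x end)
local evens   = filter({1, 2, 3, 4}, function(x) return x % 2 == 0 end)
local sum     = fold({1, 2, 3}, 0, function(acc, x) return acc + x end)
local indexed = mapi({"a", "b"}, function(i, e) return i .. e end)
```

With a real List the same calls would be written as methods, for example `mylist:map(fn)`.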
List functions like map are higher-order functions that take a function as an argument. Each function argument of a higher-order List function can be a real Lua function, a string naming an operator (see the op table in src/terralist.lua), or a string naming a field or method to look up on each element.
Example:
local mylist = List { a,b,c }
mylist:map("foo") -- selects the fields: a.foo, b.foo, c.foo, etc.
                  -- if a.foo is a function it will be treated as a method a:foo()
Extra arguments to the higher-order function are passed through to these functions. Rationale: Lua’s inline function syntax is verbose, and this functionality avoids inline functions in many cases.
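The field-or-method string form can be modeled in plain Lua. This is a sketch under assumptions, not the code in src/terralist.lua: the helper `selector` and the free function `map` are hypothetical names, and operator strings are left out.

```lua
-- Hypothetical helper: turn a string into a function that selects that
-- field on an object, or calls it as a method if the field is a function.
local function selector(key)
  if type(key) ~= "string" then return key end  -- already a function
  return function(obj, ...)
    local v = obj[key]
    if type(v) == "function" then
      return v(obj, ...)   -- treated as the method call obj:key(...)
    end
    return v               -- plain field access obj.key
  end
end

-- Model of map that accepts either a function or a string, and passes
-- any extra arguments through to the selected function.
local function map(list, fn, ...)
  fn = selector(fn)
  local out = {}
  for i, e in ipairs(list) do out[i] = fn(e, ...) end
  return out
end

local items = {
  { name = "a", scaled = function(self, k) return k * 2 end },
  { name = "b", scaled = function(self, k) return k * 3 end },
}

local names  = map(items, "name")        -- field selection
local scaled = map(items, "scaled", 10)  -- method call, extra argument 10
```

The string form replaces an inline function such as `function(e) return e.name end`, which is the verbosity the rationale refers to.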
local List = require("terralist")
List:isclassof(exp)
True if exp is a list.
terralib.israwlist(l)
Returns true if l is a table that has no keys or has a contiguous range of integer keys from 1 to N for some N, and contains no other keys.
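The definition can be checked against a plain-Lua model. The function `is_raw_list` below is a hypothetical name and only mirrors the wording above; it is not terralib's implementation, which may apply additional checks.

```lua
-- Model of the documented rule: a table whose keys are exactly 1..N.
local function is_raw_list(t)
  if type(t) ~= "table" then return false end
  -- Count every key in the table, integer or not.
  local count = 0
  for _ in pairs(t) do count = count + 1 end
  -- If there are `count` keys and 1..count are all present,
  -- then no other keys can exist.
  for i = 1, count do
    if t[i] == nil then return false end
  end
  return true
end

local empty_ok  = is_raw_list({})                    -- no keys at all
local dense_ok  = is_raw_list({ 10, 20, 30 })        -- keys 1..3
local mixed_bad = is_raw_list({ 10, 20, x = 30 })    -- extra string key
local holes_bad = is_raw_list({ [1] = 1, [3] = 3 })  -- key 2 is missing
```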
Every Terra entity is also a first-class Lua object. These include Terra Functions, Terra Types, and Terra Global Variables. Quotes are the objects returned by Terra’s quotation syntax (backtick and quote), representing a fragment of Terra code not yet inside a Terra function. Symbols represent a unique name for a variable and are used to define new parameters and locals.
When a Terra function returns a value that cannot be converted into an equivalent Lua object, it turns into a Terra Value, which is a wrapper that can be accessed from Lua (internally this is a LuaJIT "cdata" object).
Each object provides a Lua API to manipulate it. For instance, you can disassemble a function (terrafn:disas()), or query properties of a type (typ:isarithmetic()).
tostring(terraobj)
print(terraobj)
All Terra objects have a string representation that you can use for debugging.
terralib.islist(t)
terralib.isfunction(t)
terralib.types.istype(t)
terralib.isquote(t)
terralib.issymbol(t)
terralib.ismacro(t)
terralib.isglobalvar(t)
terralib.islabel(t)
terralib.isoverloadedfunction(t)
Checks whether an object is an instance of a particular Terra class.
terralib.type(o)
Extended version of type(o) with the following definition:
function terralib.type(t)
if terralib.isfunction(t) then return "terrafunction"
elseif terralib.types.istype(t) then return "terratype"
elseif terralib.ismacro(t) then return "terramacro"
elseif terralib.isglobalvar(t) then return "terraglobalvariable"
elseif terralib.isquote(t) then return "terraquote"
elseif terralib.istree(t) then return "terratree"
elseif terralib.islist(t) then return "list"
elseif terralib.issymbol(t) then return "terrasymbol"
elseif terralib.isfunction(t) then return "terrafunction"
elseif terralib.islabel(t) then return "terralabel"
elseif terralib.isoverloadedfunction(t) then return "overloadedterrafunction"
else return type(t) end
end
memoized_fn = terralib.memoize(function(a,b,c,...) ... end)
Memoize the result of a function. The first time a function is called with a particular set of arguments, it calls the function to calculate the return value and caches it. Subsequent calls with the same arguments (using Lua equality) will return that value. Useful for generating templated values, such as Vector(T), where the same vector type should be returned every time for the same T.
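The caching behavior can be sketched in plain Lua. This is a minimal model, not terralib.memoize itself: `memoize` here is a hypothetical stand-in that caches only the first return value, and `make_vector` returns a plain table in place of a real Terra type.

```lua
-- Minimal memoizer sketch: walks a tree of tables keyed on each argument.
local RESULT = {}  -- unique key under which a node stores its cached result
local NIL = {}     -- stand-in so that nil arguments can be used as table keys

local function memoize(fn)
  local root = {}
  return function(...)
    local node = root
    for i = 1, select("#", ...) do
      local arg = select(i, ...)
      if arg == nil then arg = NIL end
      local next_node = node[arg]
      if not next_node then
        next_node = {}
        node[arg] = next_node
      end
      node = next_node
    end
    if node[RESULT] == nil then
      -- First call with these arguments: compute and cache.
      node[RESULT] = { value = fn(...) }
    end
    return node[RESULT].value
  end
end

-- Stand-in for a type constructor such as Vector(T).
local calls = 0
local make_vector = memoize(function(elem, n)
  calls = calls + 1
  return { elem = elem, n = n }
end)

local v1 = make_vector("float", 4)
local v2 = make_vector("float", 4)  -- same arguments: cached object returned
local v3 = make_vector("float", 8)  -- different arguments: new object
```

Returning the identical object for identical arguments is what makes this pattern suitable for type constructors, where two separately built types would otherwise be distinct.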
Terra functions are entry-points into Terra code. Functions can be either defined or undefined (myfunction:isdefined()). An undefined function has a known type but its implementation has not yet been provided. The definition of a function can be changed via myfunction:resetdefinition(another_function) until it is first run.
[local] terra myfunctionname :: type_expression
[local] terra myfunctionname :: {int,bool} -> {int}
Terra function declaration. It creates a new undefined function and stores it in the Lua variable myfunctionname.
If the optional local keyword is used, then myfunctionname is first defined as a new local Lua variable. When used without the local keyword, myfunctionname can be a table specifier (e.g. a.b.c).
[local] terra myfunctionname(arg0 : type0,
...
argN : typeN)
[...]
end
Terra function definition. Defines myfunctionname using the body of code specified. If myfunctionname already exists and is undefined, then it adds the definition to the existing function declaration. Otherwise it first creates a new function declaration and then adds the definition.
local func = terralib.externfunction(function_name,function_type)
Create a Terra function bound to an externally defined function. Example:
local atoi = terralib.externfunction("atoi",{rawstring} -> {int})
myfunction(arg0,...,argN)
myfunction is a Terra function. Invokes myfunction from Lua. It is an error to call this on undefined functions. Arguments are translated to Terra using the rules for translating Lua values to Terra and return values are translated by using the rules for translating Terra values to Lua.
local b = func:isdefined()
true if the function has a filled-in definition. To define a function, use func:adddefinition, func:resetdefinition, or the function definition syntax terra func(...) ... end.
local b = func:isextern()
true if this function is bound to an external symbol like libc’s printf. External functions are created either by importing C functions via terralib.includec, or by calling terralib.externfunction.
func:adddefinition(another_function)
Sets the definition of func to the current definition of another_function. another_function must be defined and func must be undefined. The types of func and another_function must match.
func:resetdefinition(another_function)
Sets (or resets) the definition of func to the current definition of another_function. another_function must be defined. func may or may not be defined. It is an error to call this on a function that has already been compiled.
func:printstats()
Prints statistics about how long this function took to compile and JIT. Will cause the function to compile.
func:disas()
Disassembles all of the function definitions into x86 assembly and optimized LLVM, and prints them out. Useful for debugging performance. Will cause the function definition to compile.
func:printpretty([quote_per_line=true])
Print out a visual representation of the code in this function. By default, this prints each part of the code that was originally specified on a separate line as an individual line. If quote_per_line is false, it will print a more collapsed representation that may be easier to read.
r0, ..., rn = myfunction(arg0, ... argN)
Invokes myfunction from Lua. Arguments are converted into the expected Terra types using the rules for converting between Terra values and Lua values. Return values are converted back into Lua values using the same rules. Causes the function to be compiled to machine code.
func:compile()
Compile the function into machine code. Ensures that every function and global variable needed by the function is also defined.
function_type = func:gettype()
Return the type of the function. function_type.parameters is a list of the parameter types. function_type.returntype is the return type. If the function returns multiple values, this return type will be a tuple.
func:getpointer()
Return the LuaJIT ctype object that points to the machine code for this function. Will cause the function to be compiled.
str = func:getname()
func:setname(str)
Get or set the pretty name for the function. This is useful when viewing generated code but does not otherwise change the behavior of the function.
func:setinlined(bool)
When true, the function will always be inlined. When false, the function will never be inlined. By default, functions will be inlined at the discretion of LLVM’s function inliner.
func:setoptimized(bool)
All Terra functions are optimized by default (equivalent of Clang -O3). Pass false to this method to disable optimization (equivalent of Clang -O0).
func:setcallingconv(string)
Set the calling convention of the function. LLVM’s default calling convention is used by default. Valid values are the same as can be specified in LLVM’s text-based assembly language. (Note that, as of the time of writing, the official LLVM documentation is incomplete, particularly for target-specific calling conventions. For additional calling conventions, it may be necessary to consult the source code directly.)
Type objects are first-class Lua values that represent the types of Terra objects. Terra’s built-in type system closely resembles that of low-level languages like C. Type constructors (like &int) are valid Lua expressions that return Terra type objects. To support recursive types like linked lists, structs can be declared before their members and methods are fully specified. When a struct is declared but not defined, it is incomplete and cannot be used as a value. However, pointers to incomplete types can be used as long as no pointer arithmetic is required. A type will become complete when it needs to be fully specified (e.g. we are using it in a compiled function, or we want to allocate a global variable with the type). At this point a full definition for the type must be available.
int int8 int16 int32 int64
uint uint8 uint16 uint32 uint64
bool
float double
Primitive types.
&typ
Constructs a pointer to typ.
typ[N]
Constructs an array of N instances of type typ. N must be a positive integer.
vector(typ,N)
Constructs a vector of N instances of type typ. N must be an integer and typ must be a primitive type. These types are abstractions of vector instruction sets like SSE.
parameters -> returntype
Constructs a function pointer. Both parameters and returntype can be lists of types (e.g. {int,int}) or a single type like int. If returntype is a list, a tuple of the values in the list is the type returned from the function.
To specify a void return type, use the empty tuple {}.
struct { field0 : type0, ..., fieldN : typeN }
Constructs a user-defined type, or exotype. Each call to struct creates a unique type since we use a nominative type system. See Exotypes for more information.
tuple(type0,type1,...,typeN)
Constructs a tuple, which is a special kind of struct that contains the values type0 … typeN as fields obj._0 … obj._N. Unlike normal structs, each call to tuple with the same arguments will return the same type.
terralib.types.istype(t)
True if t is a type.
type:isprimitive()
True if type is a primitive type (see above).
type:isintegral()
True if type is any integer type.
type:isfloat()
True if type is float or double.
type:isarithmetic()
True if type is integral or float.
type:islogical()
True if type is bool (we might eventually support sized boolean types that are closer to the machine representation of flags in vector instructions).
type:canbeord()
True if the type can be used in 'or' and 'and' expressions (i.e. integral and logical but not float).
type:ispointer()
True if type is a pointer. type.type is the type pointed to.
type:isarray()
True if type is an array. type.N is the length. type.type is the element type.
type:isfunction()
True if type is a function (not a function pointer). type.parameters is a list of parameter types. type.returntype is the return type. If a function returns multiple values, this type will be a tuple of the values.
type:isstruct()
True if type is a struct.
type:ispointertostruct()
True if type is a pointer to a struct.
type:ispointertofunction()
True if type is a pointer to a function.
type:isaggregate()
True if type is an array or a struct (any type that can hold arbitrary types).
type:iscomplete()
True if the type is fully defined and ready to use in code. This is always true for non-aggregate types. For aggregate types, this is true if all types that they contain have been defined. Call type:complete() to force a type to become complete.
type:isvector()
True if the type is a vector. type.N is the length. type.type is the element type.
type:isunit()
True if the type is the empty tuple. The empty tuple is also the return type of functions that return no values.
type:(isprimitive|isintegral|isarithmetic|islogical|canbeord)orvector()
True if the type is a primitive type with the requested property, or if it is a vector of a primitive type with the requested property.
type:complete()
Forces the type to be complete. For structs, this will calculate the layout of the struct (possibly calling __getentries and __staticinitialize if defined), and recursively complete any types that this type references.
type:printpretty()
Print the type, including its members if it is a struct.
terralib.sizeof(terratype)
Wrapper around ffi.sizeof. Completes the terratype and returns its size in bytes.
terralib.offsetof(terratype,field)
Wrapper around ffi.offsetof. Completes the terratype and returns the offset in bytes of field inside terratype.
terralib.types.pointer(typ, [addrspace])
Experimental. Alternative spelling for &typ that allows an LLVM address space to be specified. Note that the semantics of non-zero address spaces are target-specific.
Quotes are the Lua objects that get returned by terra quotation operators (backtick and quote ... in ... end). They represent a fragment of Terra code (a statement or expression) that has not been placed into a function yet. The escape operators ([...] and escape ... emit ... end) splice quotes into the surrounding Terra code. Quotes have a short form for generating just one expression and a long form for generating statements and expressions.
quotation = `terraexpr
-- create a quotation
The short form of a quotation. The backtick operator creates a quotation that contains a single terra expression. terraexpr can be any Terra expression. Any escapes that terraexpr contains will be evaluated when the expression is constructed.
quote
terrastmts
end
The long form of a quotation. The quote operator creates a quotation that contains a list of terra statements. This quote can appear where an expression or a statement would be legal in Terra code. If it appears in an expression context, its type is the empty tuple.
quote
terrastmts
in
terraexp1,terraexp2,...,terraexpN
end
The long quote operation can also include an optional in statement that creates several expressions. When this quote is spliced into Terra code where an expression would normally appear, its value is the tuple constructed by those expressions.
local a = quote
var a : int = foo()
var b : int = bar()
in
a + b + b
end
terra f()
var c : int = [a] -- 'a' has type int.
end
terralib.isquote(t)
Returns true if t is a quote.
typ = quoteobj:gettype()
Return the Terra type of this quotation.
typ = quoteobj:astype()
Try to interpret this quote as if it were a Terra type object. This is normally used in macros that expect a type as an argument (e.g. sizeof([&int])). This function converts the quote object to the type (e.g. &int).
bool = quoteobj:islvalue()
true if the quote can be used on the left-hand side of an assignment (i.e. it is an l-value).
luaval = quoteobj:asvalue()
Try to interpret this quote as if it were a simple Lua value. This is normally used in macros that expect constants as an argument. Only works for a subset of values (anything that can be a Constant expression). Consider using an escape rather than a macro when you want to pass more complicated data structures to generative code.
quoteobj:printpretty()
Print out a visual representation of the code in this quote. Because quotes are not type-checked until they are placed into a function, this will print an untyped representation of the code.
Symbols are abstract representations of Terra identifiers. They can be used in Terra code where an identifier is expected, e.g. a variable use, a variable definition, a function argument, a field name, a method name, a label (see also Escapes). They are similar to the symbols returned by LISP’s gensym function.
terralib.issymbol(s)
True if s is a symbol.
symbol(typ,[displayname])
Construct a new symbol. This symbol will be unique from any other symbol. typ is the type for the symbol. displayname is an optional name that will be printed out in error messages when this symbol is encountered.
We provide wrappers around LuaJIT’s FFI API that allow you to allocate and manipulate Terra objects directly from Lua.
terralib.typeof(obj)
Return the Terra type of obj. Object must be a LuaJIT ctype that was previously allocated using calls into the Terra API, or as the return value of a Terra function.
terralib.new(terratype,[init])
Wrapper around LuaJIT’s ffi.new. Allocates a new object with the type terratype. init is an optional initializer that follows the rules for converting between Terra values and Lua values. This object will be garbage collected if it is no longer reachable from Lua.
terralib.cast(terratype,obj)
Wrapper around ffi.cast. Converts obj to terratype using the rules for converting between Terra values and Lua values.
Global variables are Terra values that are shared among all Terra functions.
global(type,[init,name,isextern,isconstant,addrspace])
global(init,[name,isextern,isconstant,addrspace])
Creates a new global variable of type type given the initial value init. Either type or init must be specified. If type is not specified we attempt to infer it from init. If init is not specified the global is left uninitialized. init is converted to a Terra value using the normal conversion rules. If init is specified, this completes the type.
init can also be a Quote, which will be treated as a constant expression used to initialize the global.
name is used as the debugging name for the global.
If isextern is true, then this global is bound to an externally defined variable with the name name.
If isconstant is true, then the contents of the global are considered to be constant.
If addrspace is not nil, then the global is placed in the corresponding LLVM address space. Note that the semantics of non-zero address spaces are target-specific.
globalvar:getpointer()
Returns the ctype object that is the pointer to this global variable in memory. Completes the type.
globalvar:get()
Gets the value of this global as a LuaJIT ctype object. Completes the type.
globalvar:set(v)
Converts v to a Terra value using the normal conversion rules, and sets the global variable to this value. Completes the type.
globalvar:setname(str)
str = globalvar:getname()
Set or get the debug name for this global variable. This can help with debugging but does not otherwise change the behavior of the global.
typ = globalvar:gettype()
Get the terra type of the global variable.
globalvar:setinitializer(init)
Set or change the initializer expression for this global. Only valid before the global is compiled. This can be used to update the value of a globalvar as you add more code to the system. For instance, if you have a global variable storing the vtable for your class, you can add more values to it as you add methods to the class.
Terra constants represent constant values used in Terra code. For instance, if you want to create a lookup table for the sin function, you might first use Lua to calculate the values and then create a constant Terra array of floating point numbers to hold the values. Since the compiler knows the array is constant (as opposed to a global variable), it can make more aggressive optimizations.
constant([type],init)
Create a new constant. init is converted to a Terra value using the normal conversion rules. If the optional type is specified, then init is converted to that type explicitly. Completes the type.
init can also be a Terra quote object. In this case the quote is treated as a constant initializer expression:
local complexobject = constant(`Complex { 3, 4 })
Constant expressions are a subset of Terra expressions whose values are guaranteed to be constant and correspond roughly to LLVM’s concept of a constant expression. They can include things whose values will be constant after compilation but whose value is not known beforehand such as the value of a function pointer:
terra a() end
terra b() end
terra c() end
-- array of function pointers to a, b, and c.
local functionarray = constant(`array(a,b,c))
terralib.isconstant(obj)
True if obj is a Terra constant.
Labels are abstract code locations that can be used, e.g., with the goto statement. Like symbols, label values allow programmatic generation of code locations.
terralib.islabel(l)
True if l is a label.
label([displayname])
Construct a new label. This label will be unique from any other label, even if it has the same displayname. displayname is an optional name that will be printed out in error messages when this label is encountered.
Macros allow you to insert custom behavior into the compiler during type-checking. Because they run during compilation, they should be aware of asynchronous compilation when calling back into the compiler.
macro(function(arg0,arg1,...,argN) [...] end)
Create a new macro. The function will be invoked at compile time for each call in Terra code. Each argument will be a Terra quote representing the argument. For instance, the call mymacro(a,b,foo()) will result in three quotes as arguments to the macro. The macro must return a single value that will be converted to a Terra object using the compilation-time conversion rules.
terralib.ismacro(t)
True if t is a macro.
The following macros are built in to Terra.
terralib.intrinsic(name, type)
Returns a Terra function that calls the LLVM intrinsic corresponding to name, with the type type. For example, LLVM provides the following intrinsic for sqrt:
local sqrt = terralib.intrinsic("llvm.sqrt.f32", float -> float)
Now sqrt can be called, and this should generate efficient code for the target platform.
Please note that the precise set of available intrinsics depends on the LLVM version and the target platform, and is not under Terra’s control.
terralib.attrload(addr, attrs)
Performs a load on the address addr with the attributes attrs. The attributes must be a literal table with one or more of the following keys:
nontemporal (optional): if true, the load is non-temporal.
align (optional): specifies the alignment of addr.
isvolatile (optional): if true, the contents of addr are considered volatile.
For example, the following attrload returns 123:
var i = 123
terralib.attrload(&i, { align = 1 })
terralib.attrstore(addr, value, attrs)
Performs a store on the address addr with the value value and attributes attrs. The attributes are the same as for attrload, above.
terralib.fence(attrs)
Experimental. Issues a fence operation. Depending on the attributes specified, prevents reordering of atomic instructions around the fence. The semantics of this operation are determined by LLVM.
The following attributes may be specified (note that the list of allowed attributes is specific to each kind of atomic operation):
syncscope (optional): an LLVM syncscope. Note that many of these values are target-specific.
ordering (required): an LLVM memory ordering.
terralib.cmpxchg(addr, cmp, new, attrs)
Experimental. Performs an atomic compare-and-exchange (cmpxchg) operation on the address addr. If the value at addr is the same as cmp, writes the value new at the address, otherwise the value at the address is unmodified. Returns a tuple containing the original value at addr (regardless of whether the exchange succeeds), as well as a boolean that specifies whether the exchange succeeded or not.
The following attributes may be specified (note that the list of allowed attributes is specific to each kind of atomic operation):
syncscope (optional): an LLVM syncscope. Note that many of these values are target-specific.
success_ordering (required): an LLVM memory ordering that applies if the exchange is successful.
failure_ordering (required): an LLVM memory ordering that applies if the exchange fails.
align (optional): specifies the alignment of addr. Note that unlike attrload, the value of align must be greater than or equal to the size of the contents of addr.
isvolatile (optional): if true, the contents of addr are considered volatile.
isweak (optional): if true, then spurious failure is allowed. The operation may not write even if the value at addr matches cmp.
For example, in the following example code, the first cmpxchg fails (assuming a single thread of execution), returning {1, false}, while the second succeeds with {1, true}. The final value of i is 4.
var i = 1
terralib.cmpxchg(&i, 2, 3, {success_ordering = "acq_rel", failure_ordering = "monotonic"})
terralib.cmpxchg(&i, 1, 4, {success_ordering = "acq_rel", failure_ordering = "monotonic"})
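The result values quoted above can be traced with a single-threaded plain-Lua model. This is only a sketch of the compare-and-exchange rule: the function `cmpxchg` and the one-slot table `cell` are hypothetical stand-ins for terralib.cmpxchg and a memory address, and memory orderings, alignment, and weak failure are not modeled.

```lua
-- Single-threaded model of compare-and-exchange on a one-slot "cell".
-- Returns the original value and whether the exchange happened.
local function cmpxchg(cell, cmp, new)
  local old = cell.value
  local succeeded = (old == cmp)
  if succeeded then
    cell.value = new   -- write only when the current value matches cmp
  end
  return old, succeeded
end

local i = { value = 1 }
local old1, ok1 = cmpxchg(i, 2, 3)  -- value is 1, not 2: no write
local old2, ok2 = cmpxchg(i, 1, 4)  -- value is 1: write 4
```

Both calls report the original value 1; only the second one changes the cell, which is why the final value is 4 rather than 3.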
terralib.atomicrmw(op, addr, value, atomicattrs)
Experimental. Performs an atomic read-modify-write (RMW) operation on the address addr with the value value and operator op. The operation is performed atomically. Returns the original value at addr.
The valid operations that can be performed are specified in the LLVM documentation. Note that fadd and fsub operations require floating-point types; most other operations require integer (or pointer) types. The specific set of available operations may depend on the LLVM version and target platform.
The following attributes may be specified (note that the list of allowed attributes is specific to each kind of atomic operation):
syncscope (optional): an LLVM syncscope. Note that many of these values are target-specific.
ordering (required): an LLVM memory ordering.
align (optional): specifies the alignment of addr. Note that unlike attrload, the value of align must be greater than or equal to the size of the contents of addr.
isvolatile (optional): if true, the contents of addr are considered volatile.
For example, the following atomicrmw writes 21 into i and returns 1 (assuming a single thread of execution):
var i = 1
terralib.atomicrmw("add", &i, 20, {ordering = "acq_rel"})
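The read-modify-write rule can likewise be traced with a single-threaded plain-Lua model. This is a sketch, not terralib.atomicrmw: the function `atomicrmw`, the table `cell`, and the `ops` table are hypothetical, only a handful of operation names are modeled, and orderings and alignment are ignored.

```lua
-- A few read-modify-write operators, keyed by name.
-- Assumption: this small set is for illustration; it is not the full LLVM list.
local ops = {
  add  = function(old, v) return old + v end,
  sub  = function(old, v) return old - v end,
  xchg = function(old, v) return v end,
  max  = function(old, v) return math.max(old, v) end,
  min  = function(old, v) return math.min(old, v) end,
}

-- Single-threaded model: apply op to the stored value, return the original.
local function atomicrmw(op, cell, value)
  local old = cell.value
  cell.value = ops[op](old, value)
  return old
end

local i = { value = 1 }
local old = atomicrmw("add", i, 20)  -- stores 1 + 20, returns the original 1
```

Returning the original value rather than the updated one is what lets callers build counters and locks on top of this primitive.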
We refer to Terra’s way of creating user-defined aggregate types as exotypes because they are defined external to Terra itself, using a Lua API.
The design tries to provide the raw mechanisms for defining the behavior of user-defined types without imposing any language-specific policies. Policy-based class systems such as those found in Java or C++ can then be created as libraries on top of these raw mechanisms. For conciseness and familiarity, we use the keyword struct to refer to these types in the language itself.
We also provide syntax sugar for defining exotypes for the most common cases. This section first discusses the Lua API itself, and then shows how the syntax sugar translates into it.
More information on the rationale for this design is available in our publications.
A new user-defined type is created with the following call:
mystruct = terralib.types.newstruct([displayname])
displayname is an optional name that will be displayed by error messages, but each call to newstruct creates a unique type regardless of name (we use a nominative type system). The type can then be used in Terra programs:
terra foo()
var a : mystruct --instance of mystruct type
end
The memory layout and behavior of the type when used in Terra programs is defined by setting property functions in the type’s metamethods table:
mystruct.metamethods.myproperty = function ...
When the Terra typechecker needs to know information about the type, it will call the property function in the metamethods table of the type. If a property is not set, it may have a default behavior which is discussed for each property individually.
The following fields in metamethods are supported:
entries = __getentries(self)
A Lua function that determines the fields in a struct computationally. The __getentries function will be called by the compiler once when it first requires the list of entries in the struct. Since the type is not yet complete during this call, doing anything in this method that requires the type to be complete will result in an error. entries is a List of field entries. Each field entry is one of:
{ field = stringorsymbol, type = terratype }, specifying a named field.
{stringorsymbol,terratype}, also specifying a named field.
By default, __getentries just returns the self.entries table, which is set by the struct definition syntax.
method = __getmethod(self,methodname)
A Lua function that looks up a method for a struct when the compiler sees a method invocation mystruct:mymethod(...) or a static method lookup mystruct.mymethod. mymethod may be either a string or a symbol. This metamethod will be called by the compiler for every static invocation of methodname on this type. Since it can be called multiple times for the same methodname, any expensive operations should be memoized across calls.
method may be a Terra function, a Lua function, or a macro which will run during typechecking.
Assuming that __getmethod returns the value method, then in Terra code the expression myobj:mymethod(arg0,...argN) turns into [method](myobj,arg0,...,argN) if the type of myobj is T.
If the type of myobj is &T then it desugars to [method](@myobj,arg0,...,argN).
If, when a method is invoked, myobj has type T but the formal parameter has type &T then the argument will be automatically converted to a pointer by taking its address. This method receiver cast allows method calls on objects to modify the object.
By default, __getmethod(self,methodname) will return self.methods[methodname], which is set by the method definition syntax sugar. If the table does not contain the method, then the typechecker will call __methodmissing as described below.
__staticinitialize(self)
A Lua function called after the type is complete but before the compiler returns to user-defined code. Since the type is complete, you can now do things that require a complete type such as create vtables, or examine offsets using terralib.offsetof. The static initializers for entries in a struct will run before the static initializer for the struct itself.
castedexp = __cast(from,to,exp)
A Lua function that can define conversions between your type and another type. from is the type of exp, and to is the type that is required. For type mystruct, __cast will be called when either from or to is of type mystruct or type &mystruct. If there is a valid conversion, then the method should return castedexp where castedexp is the expression that converts exp to to. Otherwise, it should report a descriptive error using the error function. The Terra compiler will try any applicable __cast metamethod until it finds one that works (i.e. does not call error).
__for(iterable,body)
Experimental. A Lua function that generates the loop to iterate over the specified type. The value of iterable will be an expression that generates a value of the specified type. The body is a Lua function that, when called with the loop iterator variable, executes one iteration of the loop. Note that both iterable and the argument to body must be protected from multiple evaluation. The result of the __for metamethod must be a quote.
For example, an implementation of a simple Range type might look like:
struct Range {
    a : int
    b : int
}
Range.metamethods.__for = function(iter,body)
    return quote
        var it = iter
        for i = it.a,it.b do
            [body(i)]
        end
    end
end
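A sketch of how such a Range type might be used in a Terra for loop follows. It repeats the definition so that it is self-contained; the function sumrange is a hypothetical name, and since __for is experimental the exact behavior may differ between Terra versions.

```lua
local struct Range {
    a : int
    b : int
}
Range.metamethods.__for = function(iter, body)
    return quote
        var it = iter            -- evaluate the iterable exactly once
        for i = it.a, it.b do    -- Terra's numeric for excludes the upper bound
            [body(i)]
        end
    end
end

-- Hypothetical function that iterates over a Range value.
local terra sumrange(lo : int, hi : int)
    var total = 0
    for i in Range { lo, hi } do
        total = total + i
    end
    return total
end

print(sumrange(0, 5))
```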
__methodmissing(mymethod,myobj,arg0,...,argN)
When a method is called as myobj:mymethod(arg0,...,argN) and __getmethod is not set, the macro __methodmissing will be called if mymethod is not found in the method table of the type. It should return a Terra quote to use in place of the method call.
__entrymissing(entryname,myobj)
If myobj does not contain the field entryname, then __entrymissing will be called whenever the typechecker sees the expression myobj.entryname. It must be a macro and should return a Terra quote to use in place of the field.
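The following sketch shows one way __entrymissing might provide a computed field. The struct Vec2 and the field name sum are hypothetical, and the example assumes that entryname reaches the macro as a Lua string while myobj arrives as a quote.

```lua
-- Hypothetical struct with a computed 'sum' field supplied by __entrymissing.
local struct Vec2 { x : float; y : float }

Vec2.metamethods.__entrymissing = macro(function(entryname, obj)
    -- Assumption: entryname is a plain Lua string naming the missing field.
    if entryname == "sum" then
        return `obj.x + obj.y
    end
    error("Vec2 has no field " .. tostring(entryname))
end)

local terra total()
    var v = Vec2 { 1.f, 2.f }
    return v.sum     -- not a real entry, so the macro is invoked
end

print(total())
```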
Custom operators:
__sub, __add, __mul, __div, __mod, __lt, __le, __gt, __ge,
__eq, __ne, __and, __or, __not, __xor, __lshift, __rshift,
__select, __apply
Can be either a Terra method, or a macro. These are invoked when the type is used in the corresponding operator. __apply
is used for function application, and __select
for terralib.select
. In the case of binary operators, at least one of the two arguments will have type mystruct
. The interface for custom operators hasn’t been heavily tested and is subject to change.
__typename(self)
A Lua function that generates a string that names the type. This name will be used in error messages and tostring
.
[local] struct mystruct
Struct declaration. If mystruct
is not already a Terra struct, it creates a new struct by calling terralib.types.newstruct("mystruct")
and stores it in the Lua variable mystruct
. If mystruct
is already a struct, then it does not modify it. If the optional local
keyword is used, then mystruct
is first defined as a new local Lua variable. When used without the local
keyword, mystruct
can be a table specifier (e.g. a.b.c
).
[local] struct mystruct {
field0 : type0;
...
union {
fieldUnion0 : type1;
fieldUnion1 : type2;
}
...
fieldN : typeN;
}
Struct definition. If mystruct
is not already a Struct, then it creates a new struct with the behavior of struct declarations. It then fills in the entries
table of the struct with the fields and types specified in the body of the definition. The union
block can be used to specify that a group of fields should share the same location in memory. If mystruct
was previously given a definition, then defining it again will result in an error.
terra mystruct:mymethod(arg0 : type0,..., argN : typeN)
...
end
Method definition. If mystruct.methods.mymethod
is not a Terra function, it creates one. Then it adds the method definition. The formal parameter self
with type &mystruct
will be added to the beginning of the formal parameter list.
Overloaded functions are separate objects from normal Functions and are created using an API call:
local addone = terralib.overloadedfunction("addone",
{ terra(a : int) return a + 1 end,
terra(a : double) return a + 1 end })
You can also add definitions later:
addone:adddefinition(terra(a : float) return a + 1 end)
Unlike normal functions, overloaded functions cannot be called directly from Lua.
overloaded_func:getdefinitions()
Returns the List of definitions for this function.
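Since overloaded functions cannot be called from Lua, a common pattern is to call them from a Terra function, where the definition is selected by argument type. The wrapper names useint and usedouble below are hypothetical.

```lua
local addone = terralib.overloadedfunction("addone",
    { terra(a : int) return a + 1 end,
      terra(a : double) return a + 1 end })

-- Hypothetical wrappers: the overload is resolved inside Terra code.
local terra useint() return addone(1) end        -- selects the int definition
local terra usedouble() return addone(1.5) end   -- selects the double definition

print(useint(), usedouble())
print(#addone:getdefinitions())   -- number of definitions registered so far
```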
Escapes are a special construct adapted from multi-stage programming that allow you to use Lua to generate Terra expressions. Escapes are created using the bracket operator and contain a single Lua expression (e.g. [ 4 + 5 ]
) that is evaluated when the surrounding Terra code is defined (note: this is different from macros, which run when a function is compiled). Escapes are evaluated in the lexical scope of the Terra code. In addition to the identifiers in the surrounding Lua scope, this scope includes any identifiers defined in the Terra code. In Lua code these identifiers are represented as symbols. For example, in the following escape:
terra foo(a : int)
var b = 4
return [dosomething(a,b)]
end
The arguments a
and b
to dosomething
will be symbols that are references to the variables defined in the Terra code.
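To make this concrete, here is a sketch with a definition for the Lua function called from the escape. The helper name add is hypothetical; it receives references to the Terra variables and builds a quote from them.

```lua
-- Hypothetical Lua helper: a and b refer to the Terra variables,
-- so the quote below expands to the Terra expression 'a + b'.
local function add(a, b)
    return `a + b
end

local terra foo(a : int)
    var b = 4
    return [add(a, b)]
end

print(foo(3))   -- 7
```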
We also provide syntax sugar for escapes of identifiers and table selects when they are used in expressions or statements. For instance the Terra expression ident
is treated as the escape [ident]
, and the table selection a.b.c
is treated as the escape [a.b.c]
when both a
and b
are Lua tables.
terra foo()
return [luaexpr],4
end
[luaexpr]
is a single-expression escape. luaexpr
is a single Lua expression that is evaluated to a Lua value when the function is defined. The resulting Lua value is converted to a Terra object using the compilation-time conversion rules. If the conversion results in a list of Terra values, it is truncated to a single value.
terra foo()
bar(3,4,[luaexpr])
end
[luaexpr]
is a multiple-expression escape since it occurs as the last expression in a list of expressions. It has the same behavior as a single expression escape, except when the conversion of luaexpr
results in multiple Terra expressions. In this case, the values are appended to the end of the expression list (in this case, the list of arguments to the call to bar
).
terra foo()
[luaexpr]
return 4
end
[luaexpr]
is a statement escape. This form has the same behavior as a multiple-expression escape but is also allowed to return quotes of Terra statements. If the conversion from luaexpr
results in a list of Terra values, then they are all inserted into the current block.
terra foo([luaexpr] : int)
var [luaexpr] = 4
mystruct.[luaexpr]
end
Each [luaexpr]
is an example of an escape of an identifier. luaexpr
must result in a symbol. For field selectors (a.[luaexpr]
), methods (a:[luaexpr]()
) or labels (goto [luaexpr]
), luaexpr
can also result in a string. This form allows you to define identifiers programmatically. When a symbol with an explicitly defined type is used to define a variable, then the variable will take the type of the symbol unless the type of the variable is explicitly specified. For instance if we construct a symbol (foo = symbol(int)
), then var [foo]
will have type int
, and var [foo] : float
will have type float
.
terra foo(a : int, [luaexpr])
end
[luaexpr]
is an escape of a list of identifiers. In this case, it behaves similarly to an escape of a single identifier, but may also return a list of explicitly typed symbols which will be appended as parameters in the parameter list.
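As a sketch of an escape of a list of identifiers, the following builds a parameter list from explicitly typed symbols. The names params, x, y, and addpair are hypothetical.

```lua
-- Explicitly typed symbols that will become the function's parameters.
local x, y = symbol(int, "x"), symbol(int, "y")
local params = terralib.newlist { x, y }

local terra addpair([params])
    -- x and y are Lua variables holding symbols, so they refer
    -- to the parameters that were spliced in above.
    return x + y
end

print(addpair(2, 3))   -- 5
```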
Terra uses the Clang frontend to allow Terra code to be backwards compatible with C. The current implementation supports importing all functions, types, and enums from C header files. It will also import any macros whose definitions are a single number representable in a double, such as:
#define FOO 1
However, we currently do not support importing global variables or constants. This will be improved in the future.
table = terralib.includecstring(code,[args,target])
Import the string code
as C code. Returns a Lua table mapping the names of included C functions to Terra function objects, and names of included C types (e.g. typedefs) to Terra types. The Lua variable terralib.includepath
can be used to add additional paths to the header search. It is a semi-colon separated list of directories to search. args
is an optional list of strings that are flags to Clang (e.g. includecstring(code,{"-I",".."})
). target
is a target object that makes sure the headers are imported correctly for the target desired.
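A minimal sketch of includecstring follows. The C function twice and the macro ANSWER are hypothetical and defined inline so the example does not depend on any header on disk.

```lua
-- Import a small piece of C code; C maps names to Terra functions and values.
local C = terralib.includecstring [[
    #define ANSWER 42
    int twice(int x) { return 2 * x; }
]]

local terra f()
    -- C.ANSWER is a single-number macro, so it is imported as a number.
    return C.twice(C.ANSWER)
end

print(f())   -- 84
```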
table = terralib.includec(filename,[args,target])
Similar to includecstring
except that C code is loaded from filename
. This uses Clang's default path for header files. args
allows you to pass additional arguments to Clang (including more directories to search).
terralib.linklibrary(filename)
Load the dynamic library in file filename
. If header files imported with includec
contain declarations whose definitions are not linked into the executable in which Terra is run, then it is necessary to dynamically load the definitions with linklibrary
. This situation arises when using external libraries with the terra
REPL/driver application.
local llvmobj = terralib.linkllvm(filename)
local sym = llvmobj:extern(functionname,functiontype)
Link an LLVM bitcode file filename
with extension .bc
generated with clang
or clang++
:
clang++ -O3 -emit-llvm -c mycode.cpp -o mybitcode.bc
The code is loaded as bitcode rather than machine code. This allows for more aggressive optimization (such as inlining the function calls) but will take longer to initialize in Terra since it must be compiled to machine code. To extract functions from this bitcode file, call the llvmobj:extern
method providing the function’s name in the bitcode and its Terra-equivalent type (e.g. int -> int
).
When compiling or invoking Terra code, it is necessary to convert values between Terra and Lua. Internally, we implement this conversion on top of LuaJIT’s foreign-function interface, which makes it possible to call C functions and use C values directly from Lua. Since Terra’s type system is similar to C’s, we can reuse most of this infrastructure.
When converting Lua values to Terra, we sometimes know the expected type (e.g. when the type is specified in a terralib.cast
or terralib.constant
call). In this case, we follow LuaJIT’s conversion semantics, substituting the equivalent C type for each Terra type.
When a Lua value is used directly from Terra code through an escape, or a Terra value is created without specifying the type (e.g. terralib.constant(3)
), then we attempt to infer the type of the object. If successful, then the standard conversion is applied. If the type(value)
is:
cdata – If it was previously allocated from the Terra API, or returned from Terra code, then it is converted into the Terra type equivalent to the ctype of the object.
number – If floor(value) == value and value can fit into an int then the type is an int, otherwise it is double.
boolean – the type is bool.
string – converted into a rawstring (i.e. a &int8). We may eventually add a special string type.
Otherwise the type cannot be inferred; use the terralib.cast function to specify it.
When a Lua value is used as the result of an escape operator in a Terra function, additional conversions are allowed:
Lua function – If used in a function call, the function is converted with terralib.cast to the Terra function type that has no return values, and whose parameters are the Terra types of the actual parameters of the function call. If not used in a function call, results in an error.
Terra type – If passed as an argument to a macro, arg:astype() will return the value. If used as a function call (e.g. [&int](v)), it acts as an explicit cast to that type.
List or raw list (as classified by terralib.israwlist) – Each member of the list is recursively converted to a Lua value using compile-time conversions (excluding the conversions for Lists). If used as a statement or where multiple expressions can appear, all values of the list are spliced in place. Otherwise, if used where only a single expression can appear, the list is truncated to 1 value.
cdata aggregates (structs and arrays) – If a Lua cdata aggregate of Terra type T is referenced directly in Terra code, the value in Terra code will be an lvalue reference of type T to the Lua-allocated memory that holds that aggregate.
When converting Terra values back into Lua values (e.g. from the results of a function call), we follow LuaJIT’s conversion semantics from C types to Lua objects, substituting the equivalent C type for each Terra type. If the result is a cdata object, it can be used with the Terra Value API.
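The inference rules for plain Lua values (number, boolean, string) described earlier in this section can be mirrored in ordinary Lua. The function inferterratype below is a hypothetical illustration that returns type names as strings; it is not part of terralib, and it leaves out the cdata case, which depends on LuaJIT's FFI.

```lua
-- Hypothetical sketch of the type-inference rules; not a terralib function.
local INT_MIN, INT_MAX = -2^31, 2^31 - 1

local function inferterratype(value)
    local t = type(value)
    if t == "number" then
        -- whole numbers that fit in a 32-bit int become int, the rest double
        if math.floor(value) == value and value >= INT_MIN and value <= INT_MAX then
            return "int"
        end
        return "double"
    elseif t == "boolean" then
        return "bool"
    elseif t == "string" then
        return "rawstring"
    end
    return nil   -- cannot be inferred; terralib.cast would be needed
end

print(inferterratype(3), inferterratype(3.5), inferterratype(2^40))
print(inferterratype(true), inferterratype("hi"), inferterratype({}))
```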
These functions allow you to load chunks of mixed Terra-Lua code at runtime.
terralib.load(readerfn)
Lua equivalent of C API call terra_load
. readerfn
behaves the same as in Lua’s load
function.
terralib.loadstring(s)
Lua equivalent of C API call terra_loadstring
.
terralib.loadfile(filename)
Lua equivalent of C API call terra_loadfile
.
require(modulename)
Load the terra code modulename
. Terra adds an additional code loader to Lua’s package.loaders
to handle the loading of Terra code as a module. require
first checks if modulename
has already been loaded by a previous call to require
, returning the previously loaded results if available. Otherwise it searches package.terrapath
for the module. package.terrapath
is a semi-colon separated list of templates, e.g.:
"lib/?.t;./?.t"
The modulename
is first converted into a path by replacing any .
with a directory separator, /
. Then each template is tried until a file is found. For instance, using the example path, the call require("foo.bar")
will try to load lib/foo/bar.t
or foo/bar.t
. If a file is found, then require
will return the result of calling terralib.loadfile
on the file. By default, package.terrapath
is set to the environment variable TERRA_PATH
. If TERRA_PATH
is not set then package.terrapath
will contain the default path (./?.t
). The string ;;
in TERRA_PATH
will be replaced with this default path if it exists.
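The template search can be sketched in plain Lua. The function candidates is a hypothetical helper that only computes the file names that would be tried, in order; it does not touch the file system and is not how require is implemented internally.

```lua
-- Hypothetical helper: list the paths tried for a module name.
local function candidates(modulename, terrapath)
    -- replace every '.' in the module name with a directory separator
    local path = modulename:gsub("%.", "/")
    local result = {}
    for template in terrapath:gmatch("[^;]+") do
        -- substitute the converted name for each '?' in the template
        local filename = template:gsub("%?", function() return path end)
        result[#result + 1] = filename
    end
    return result
end

for _, filename in ipairs(candidates("foo.bar", "lib/?.t;./?.t")) do
    print(filename)
end
```

With the example path this prints lib/foo/bar.t followed by ./foo/bar.t.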
Note that normal Lua code is also imported using require
. There are two search paths: package.path (env: LUA_PATH), which will load code as pure Lua, and package.terrapath (env: TERRA_PATH), which will load code as Lua-Terra code.
terralib.saveobj(filename [, filetype], functiontable[, arguments, target, optimize])
Save Terra code to an external representation such as an object file, or executable. filetype
can be one of "object"
(an object file *.o
), "asm"
(an assembly file *.s
), "bitcode"
(LLVM bitcode *.bc
), "llvmir"
(LLVM textual IR *.ll
), or "executable"
(no extension).
If filetype
is missing then it is inferred from the extension. functiontable
is a table from strings to Terra functions. These functions will be included in the code that is written out with the name given in the table.
arguments
is an additional list that can contain flags passed to the linker when filetype
is "executable"
. If filename
is nil
, then the file will be written in memory and returned as a Lua string.
To cross-compile objects for a different architecture, you can specify a target object, which describes the architecture to compile for. Otherwise saveobj
will use the native architecture.
By default, saveobj
compiles code with the equivalent of Clang -O3
. This optimization profile can be customized to either disable optimizations, or to enable additional, potentially unsafe fast-math optimizations. The possible values of optimize
are:
true or false: Enable or disable optimizations (equivalent of -O3). Default is enabled. Does not include any fast-math optimizations.
{optimize = ..., fastmath = ...}: A table specifying an optimization profile. The optimize key takes boolean values true or false as described above (default true if left unspecified). The possible values for fastmath are described below.
The fastmath key in an optimization profile may take any of the following values:
true or false: Enable or disable all LLVM fast-math flags. Default is false if unspecified.
"flag": A string specifying a single fast-math flag enables just that one flag. All other flags are disabled.
{"flag1", "flag2"}: A list of strings specifying zero or more fast-math flags enables all of the listed flags. All other flags are disabled.
The list of valid LLVM fast-math flags is given in the LLVM Language Reference. Note that the precise set of available flags may depend on the LLVM version, and is outside of Terra’s control.
Examples:
terralib.saveobj("a.o", {main=main}, nil, nil, false) -- Disable optimizations.
terralib.saveobj("a.o", {main=main}, nil, nil, {fastmath=true}) -- Enable all fast-math optimizations.
terralib.saveobj("a.o", {main=main}, nil, nil, {fastmath={"contract", "nnan"}}) -- Enable contract and nnan.
The functions terralib.saveobj
and terralib.includec
take an optional target object that tells the compiler to compile the code for a different architecture. These targets can be used for cross-compilation. For example, to use an x86 machine to compile ARM code for a Raspberry Pi, you can create the following target object:
local armtarget = terralib.newtarget {
    Triple = "armv6-unknown-linux-gnueabi"; -- LLVM target triple
    CPU = "arm1176jzf-s"; -- LLVM CPU name
    Features = ""; -- LLVM feature string
    FloatABIHard = true; -- For ARM, use floating point registers
}
All entries in the table except the Triple
field are optional. Documentation for clang
includes more information about what these strings should be set to.
Terra provides a few library functions to help debug and performance tune code. Except for currenttimeinseconds
,
these debugging facilities are only available on OSX and Linux.
terralib.currenttimeinseconds()
A Lua function that returns the current time in seconds since some fixed time in the past. Useful for performance tuning Terra code.
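A sketch of timing a Terra function from Lua follows. The function sumto is a hypothetical workload; it is called once beforehand so that compilation time is not counted in the measurement.

```lua
-- Hypothetical workload: sum the integers in [0, n).
local terra sumto(n : int) : double
    var s : double = 0
    for i = 0, n do
        s = s + i
    end
    return s
end

sumto(1)   -- warm-up call so that compilation is excluded from the timing

local start = terralib.currenttimeinseconds()
local result = sumto(1000000)
local elapsed = terralib.currenttimeinseconds() - start

print(string.format("result = %.0f, elapsed = %.6f s", result, elapsed))
```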
terra terralib.traceback(uctx : &opaque)
A Terra function that can be called from Terra code to print a stack trace. If uctx
is nil
then this will print the current stack. uctx
can also be a pointer to a ucontext_t
object (see ucontext.h
) and will print the stack trace for that context.
By default, the interpreter will print this information when a program segfaults.
terra terralib.backtrace(addresses : &&opaque, naddr : uint64, ip : &opaque, frameaddress : &opaque)
A low-level interface used to get the return addresses from a machine stack. addresses
must be a pointer to a buffer that can hold at least naddr
pointers.
ip
should be the address of the current instruction and will be the first entry in addresses
, while frameaddress
should be the value of the base pointer.
addresses
will be filled with the return addresses on the stack. Requires debugging mode to be enabled (-g
) for it to work correctly.
terra terralib.disas(addr : &opaque, nbytes : uint64, ninst : uint64)
A low-level interface to the disassembler. Print the disassembly of instructions starting at addr
. Will print nbytes
of instructions or ninst
instructions, whichever causes more instructions to be printed.
terra terralib.lookupsymbol(ip : &opaque, addr : &&opaque, size : &uint64, name : &rawstring, namelength : &uint64) : bool
Attempts to look up information about a Terra function given a pointer ip
to any instruction in the function. Returns true
if successful,
filling in addr
with the start of the function and size
with the size of the function in bytes. Fills in name
with a pointer to a fixed-width string of up to namemax
characters holding the function name.
terra terralib.lookupline(fnaddr : &opaque, ip : &opaque, filename : &rawstring, namelength : &uint64, line : &uint64) : bool
Attempts to look up information about a Terra instruction given a pointer ip
to the instruction and a pointer fnaddr
to the start of the function containing it.
Returns true
if successful, filling in line
with the line on which the instruction occurred and filename
with a pointer to a fixed-width string of up to namemax
characters holding the filename.
Fills up to namemax
characters of the function’s name into name
.
Like Lua, Terra is designed to be embedded into existing code.
The C API for Terra serves as the entry-point for running Terra-Lua programs.
In fact, the terra
executable and REPL are just clients of the C API. The Terra C API extends Lua’s API with a set of Terra-specific functions. A client first creates a lua_State
object and then calls terra_init
on it to initialize the Terra extensions. Terra provides equivalents to the lua_load
set of functions (e.g. terra_loadfile
), which treat the input as Terra-Lua code.
int terra_init(lua_State * L);
Initializes the internal Terra state for the lua_State
L
. L
must be an already initialized lua_State
.
typedef struct { /* default values are 0 */
int verbose; /* Sets verbosity of debugging output.
Valid values are 0 (no debug output)
to 2 (very verbose). */
int debug; /* Turns on debug information in Terra compiler.
Enables base pointers and line number
information in stack traces. */
} terra_Options;
int terra_initwithoptions(lua_State * L, terra_Options * options);
Initializes the internal Terra state for the lua_State
L
. L
must be an already initialized lua_State
. terra_Options
holds additional configuration options.
int terra_load(lua_State *L,
lua_Reader reader,
void *data,
const char *chunkname);
Loads a combined Terra-Lua chunk. Terra equivalent of lua_load
. This function takes the same arguments as lua_load
and performs identically except it parses the input as a combined Terra-Lua program (i.e. a Lua program that has Terra extensions). Currently there is no binary format for combined Lua-Terra code, so the input must be text.
int terra_loadfile(lua_State * L, const char * file);
Loads the file as a combined Terra-Lua chunk. Terra equivalent of luaL_loadfile
.
int terra_loadbuffer(lua_State * L,
const char *buf,
size_t size,
const char *name);
Loads a buffer as a combined Terra-Lua chunk. Terra equivalent of luaL_loadbuffer
.
int terra_loadstring(lua_State *L, const char *s);
Loads string s
as a combined Terra-Lua chunk. Terra equivalent of luaL_loadstring
.
terra_dofile(L, file)
Loads and runs the file file
. Equivalent to
(terra_loadfile(L, file) || lua_pcall(L, 0, LUA_MULTRET, 0))
terra_dostring(L, s)
Loads and runs the string s
. Equivalent to
(terra_loadstring(L, s) || lua_pcall(L, 0, LUA_MULTRET, 0))
Language extensions in the Terra system allow you to create custom Lua statements and expressions that you can use to implement your own embedded language. Each language registers a set of entry-point keywords that indicate the start of a statement or expression in your language. If the Terra parser sees one of these keywords at the beginning of a Lua expression or statement, it will switch control of parsing over to your language, where you can parse the tokens into an abstract syntax tree (AST), or other intermediate representation. After creating the AST, your language then returns a constructor function back to the Terra parser. This function will be called during execution when your statement or expression should run.
This guide introduces language extensions with a simple stand-alone example, and shows how to register the extension with Terra. We then expand on this example by showing how it can interact with the Lua environment. The end of the guide documents the language extension interface, and the interface to the lexer in detail.
To get started, let’s add a simple language extension to Lua that sums up a list of numbers. The syntax will look like sum 1,2,3 done
, and when run it will sum up the numbers, producing the value 6
. A language extension is defined using a Lua table. Here is the table for our language:
local sumlanguage = {
    name = "sumlanguage"; --name for debugging
    -- list of keywords that will start our expressions
    entrypoints = {"sum"};
    keywords = {"done"}; --list of keywords specific to this language
    --called by Terra parser to enter this language
    expression = function(self,lex)
        --implementation here
    end;
}
We list "sum"
in the entrypoints
list since we want Terra to hand control over to our language when it encounters this token at the beginning of an expression. We also list "done"
as a keyword since we are using it to end our expression. When the Terra parser sees the sum
token it will call the expression
function passing in an interface to the lexer, lex
. Here is the implementation:
expression = function(self,lex)
    local sum = 0
    lex:expect("sum") --first token should be "sum"
    if not lex:matches("done") then
        repeat
            --parse a number, return its value
            local v = lex:expect(lex.number).value
            sum = sum + v
        --if there is a comma, consume it and continue
        until not lex:nextif(",")
    end
    lex:expect("done")
    --return a function that is run
    --when this expression would be evaluated by Lua
    return function(environment_function)
        return sum
    end
end
We use the lex
object to interact with the tokens. The interface is documented below. Since the statement only allows numeric constants, we can perform the summation during parsing. Finally, we return a constructor function that will be run every time this statement is executed. We can use it in Lua code like so:
print(sum 1,2,3 done) -- prints 6
The file tests/lib/sumlanguage.t
contains the code for this example, and tests/sumlanguage1.t
has an example of its use.
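The parsing loop can also be exercised in plain Lua, outside the Terra parser, with a stand-in lexer. The mocklexer function below is a hypothetical test double that models only the calls used by the sum parser (matches, next, nextif, expect) over a fixed token list; the real lexer object is supplied by Terra and has a larger interface.

```lua
-- Hypothetical stand-in for the lexer; it walks a fixed list of tokens.
local NUMBER = "<number>"   -- plays the role of lex.number

local function mocklexer(tokens)
    local lex = { number = NUMBER, pos = 1 }
    function lex:cur() return tokens[self.pos] end
    function lex:matches(tokentype) return self:cur().type == tokentype end
    function lex:next()
        local tok = self:cur()
        self.pos = self.pos + 1
        return tok
    end
    function lex:nextif(tokentype)
        if self:matches(tokentype) then return self:next() end
        return false
    end
    function lex:expect(tokentype)
        if not self:matches(tokentype) then
            error("expected " .. tostring(tokentype))
        end
        return self:next()
    end
    return lex
end

-- The expression function from the text, unchanged in behavior.
local function expression(self, lex)
    local sum = 0
    lex:expect("sum")
    if not lex:matches("done") then
        repeat
            sum = sum + lex:expect(lex.number).value
        until not lex:nextif(",")
    end
    lex:expect("done")
    return function(environment_function) return sum end
end

local function num(v) return { type = NUMBER, value = v } end
local function kw(s) return { type = s } end

-- Token stream for: sum 1,2,3 done
local tokens = { kw("sum"), num(1), kw(","), num(2), kw(","), num(3),
                 kw("done"), kw("<eof>") }
local constructor = expression(nil, mocklexer(tokens))
print(constructor())   -- 6
```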
In order to use our language extension, it needs to be imported.
The language extension mechanism includes an import
statement to load the language extension:
import "lib/sumlanguage" --activate the new parsing rules
result = sum 1,2,3 done
Since import
statements are evaluated at parse time, the argument must be a string literal.
The parser will then call require
on the string literal to load the language extension file.
The file specified should return the Lua table describing your language:
local sumlanguage = { ... } --fill in your table
return sumlanguage
The imported language will be enabled only in the local scope where the import statement occurred:
do
    import "lib/sumlanguage"
    result = sum 1,2,3 done --ok, in scope
    if result == 6 then
        result = sum 4,5 done -- ok, still in scope
    end
end
result = sum 6,7 done --error! sumlanguage is not in scope
Multiple languages can be imported in the same scope as long as their entrypoints
do not overlap.
If their entrypoints do overlap, the languages can still be imported in the same file as long as the import
statements occur in different scopes.
One of the advantages of Terra is that it shares the same lexical scope as Lua, making it easy to parameterize Terra functions. Extension languages can also access Lua’s static scope. Let’s extend our sum language so that it supports both constant numbers, as well as Lua variables:
local a = 4
print(sum a,3 done) --prints 7
To do this we need to modify the code in our expression
function:
expression = function(self,lex)
    local sum = 0
    local variables = terralib.newlist()
    lex:expect("sum")
    if not lex:matches("done") then
        repeat
            if lex:matches(lex.name) then --if it is a variable
                local name = lex:next().value
                --tell the Terra parser
                --we will access a Lua variable, 'name'
                lex:ref(name)
                --add its name to the list of variables
                variables:insert(name)
            else
                sum = sum + lex:expect(lex.number).value
            end
        until not lex:nextif(",")
    end
    lex:expect("done")
    return function(environment_function)
        --capture the local environment
        --a table from variable name => value
        local env = environment_function()
        local mysum = sum
        for i,v in ipairs(variables) do
            mysum = mysum + env[v]
        end
        return mysum
    end
end
Now an expression can be a variable name (lex.name
). Unlike constants, we don’t know the value of this variable at parse time, so we cannot calculate the entire sum before execution. Instead, we save the variable name (variables:insert(name)
) and tell the Terra parser that we will need the value of this variable at runtime (lex:ref(name)
). In our constructor we now capture the local lexical environment by calling the environment_function
parameter, and look up the values of our variables in the environment to compute the sum. It is important to call lex:ref(name)
. If we had not called it, then this environment table would not contain the variables we need.
Sometimes in the middle of your language you may want to call back into the Lua parser to parse an entire Lua expression. For instance, Terra types are Lua expressions:
var a : int = 3
In this example, int
is actually a Lua expression.
The method lex:luaexpr()
will parse a Lua expression. It returns a Lua function that implements the expression. This function takes the local lexical environment, and returns the value of the expression in that environment. As an example, let’s add a concise way of specifying a single argument Lua function, def(a) exp
, where a
is a single argument and exp
is a Lua expression. This is similar to Python’s lambda
expression. Here is our language extension:
{
    name = "def";
    entrypoints = {"def"};
    keywords = {};
    expression = function(self,lex)
        lex:expect("def")
        lex:expect("(")
        local formal = lex:expect(lex.name).value
        lex:expect(")")
        local expfn = lex:luaexpr()
        return function(environment_function)
            --return our result, a single argument lua function
            return function(actual)
                local env = environment_function()
                --bind the formal argument
                --to the actual one in our environment
                env[formal] = actual
                --evaluate our expression in the environment
                return expfn(env)
            end
        end
    end;
}
The full code for this example can be found in tests/lib/def.t
and tests/def1.t
.
In addition to extending the syntax of expressions, you can also define new syntax for statements and local variable declarations:
terra foo() end -- a new statement
local terra foo() end -- a new local variable declaration
This is done by specifying the statement
and localstatement
functions in your language table. These functions behave the same way as the expression
function, but they can optionally return a list of names that they define. The file tests/lib/def.t
shows how this would work for the def
constructor to support statements:
def foo(a) luaexpr --defines global variable foo
local def bar(a) luaexpr --defines local variable bar
Writing a parser that directly uses the lexer interface can be tedious. One simple approach that makes parsing easier (especially for expressions with multiple precedence levels) is Pratt parsing, or top-down precedence parsing (for more information, see http://javascript.crockford.com/tdop/tdop.html). We’ve provided a library built on top of the Lexer interface to help do this. It can be found, along with documentation of the API in tests/lib/parsing.t
. An example extension written using this library is found in tests/lib/pratttest.t
and an example program using it in tests/pratttest1.t
.
This section describes the API for defining languages and interacting with the lexer
object in detail.
A language extension is defined by a Lua table containing the following fields.
name
a name for your language used for debugging
entrypoints
A Lua list specifying the keywords that can begin a term in your language. These keywords must not be a Terra or Lua keyword and cannot overlap with entry-points for other loaded languages (In the future, we may allow you to rename entry-points when you load a language to resolve conflicts). These keywords must be valid Lua identifiers (i.e. they must be alphanumeric and cannot start with a number). In the future, we may expand this to allow arbitrary operators (e.g. +=
) as well.
keywords
A Lua list specifying any additional keywords used in your language. Like entry-points, these also must be valid identifiers. A keyword in Lua or Terra is always considered a keyword in your language, so you do not need to list them here.
expression
(Optional) A Lua method function(self,lexer)
that is called whenever the parser encounters an entry-point keyword at the beginning of a Lua expression. self
is your language object, and lexer
is a Lua object used to interact with Terra’s lexer to retrieve tokens and report errors. Its API is described below. The expression
method should return a constructor function function(environment_function)
. The constructor is called every time the expression is evaluated and should return the value of the expression as it should appear in Lua code. Its argument, environment_function
, is a function that when called, returns the local lexical environment as Lua table from variable names to values.
statement
(Optional) A Lua method function(self,lexer)
called when the parser encounters an entry-point keyword at the beginning of a Lua statement. Similar to expression
, it returns a constructor function. Additionally, it can return a second value that is a list of assignments that the statement performs to variables. For instance, the value { "a", "b", {"c","d"} }
will behave like the Lua statement a,b,c.d = constructor(...)
localstatement
(Optional) A Lua method function(self,lexer)
called when the parser encounters an entry-point keyword at the beginning of a local
statement (e.g. local terra foo() end
). Similar to statement
this method can also return a list of names (e.g. {"a","b"}
). However, in this case, these names will be defined as local variables local a, b = constructor(...)
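As a rough sketch of how these methods fit together, the table below defines a made-up language with the entry-point `answer`, so that `answer x` behaves like `x = 42` and `local answer x` behaves like `local x = 42`. The language, the helper `parseanswer`, and the constant are hypothetical. The `name` field is assumed to be a debugging label, and the sketch assumes the lexer is positioned on the entry-point keyword in both the statement and the local-statement case.

```lua
-- Hypothetical language table; `answer` is an invented entry-point keyword.
local answerlang = {
    name = "answerlang",         -- assumed: a label used for debugging
    entrypoints = { "answer" },
    keywords = {},               -- no additional keywords
}

-- Hypothetical shared helper: consume `answer <name>` and build the results.
local function parseanswer(lexer)
    lexer:expect("answer")
    local name = lexer:expect(lexer.name).value
    -- The constructor runs every time the statement is executed.
    local function constructor(environment_function)
        return 42
    end
    -- The second return value lists the variables to assign.
    return constructor, { name }
end

function answerlang.statement(self, lexer)
    return parseanswer(lexer)    -- behaves like: x = constructor(...)
end

function answerlang.localstatement(self, lexer)
    return parseanswer(lexer)    -- behaves like: local x = constructor(...)
end
```

Both methods can share one parser here because the only difference lies in how the returned names are bound.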
The methods in the language are given an interface lexer
to Terra's lexer, which can be used to examine the stream of tokens and to report errors. A token is a Lua table with fields:
token.type
The token type. For keywords and operators this is just a string (e.g. "and"
, or "+"
). The values lexer.name
, lexer.number
, lexer.string
indicate the token is respectively an identifier (e.g. myvar
), a number (e.g. 3), or a string (e.g. "my string"
). The type lexer.eof
indicates the end of the token stream.
token.value
For names, strings, and numbers this is the specific value (e.g. 3.3
). Numbers are represented as Lua numbers when they fit (floating point or 32-bit integers) and as [u]int64_t cdata types for 64-bit integers.
token.valuetype
For numbers this is the Terra type of the literal parsed. 3
will have type int
, 3.3
is double
, 3.f
is float
, 3ULL
is uint64
, 3LL
is int64
, and 3U
is uint
.
token.linenumber
The line number on which this token occurred (not available for lookahead tokens).
token.offset
The offset in characters from the beginning of the file where this token occurred (not available for lookahead tokens).
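To illustrate how the token fields are typically inspected, here is a small sketch of a helper that turns a token into a readable description. The function name `describetoken` is hypothetical. It relies only on the `type`, `value`, and `valuetype` fields and on the `lexer.name`, `lexer.number`, `lexer.string`, and `lexer.eof` values described above.

```lua
-- Hypothetical helper: summarize a token obtained from the lexer.
local function describetoken(lexer, token)
    local t = token.type
    if t == lexer.name then
        return "identifier " .. token.value
    elseif t == lexer.number then
        -- valuetype is the Terra type of the literal (e.g. int or double)
        return "number " .. tostring(token.value) .. " : " .. tostring(token.valuetype)
    elseif t == lexer.string then
        return "string " .. token.value
    elseif t == lexer.eof then
        return "end of input"
    else
        -- keywords and operators use their own text as the type
        return "keyword or operator " .. t
    end
end
```

A helper like this can be handy when building messages to pass to the lexer's error-reporting functions.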
The lexer
object provides the following fields and methods. The lexer
itself is only valid during parsing. For instance, it should not be called from the constructor function.
lexer:cur()
Returns the current token. Does not modify the position.
lexer:lookahead()
Returns the token following the current token. Does not modify the position. Only 1 token of lookahead is allowed to keep the implementation simple.
lexer:matches(tokentype)
Shorthand for lexer:cur().type == tokentype
lexer:lookaheadmatches(tokentype)
Shorthand for lexer:lookahead().type == tokentype
lexer:next()
Returns the current token, and advances to the next token.
lexer:nextif(tokentype)
If tokentype
matches the type
of the current token, it returns the token and advances the lexer. Otherwise, it returns false
and does not advance the lexer. This function is useful when you want to try to parse many alternatives.
lexer:expect(tokentype)
If tokentype
matches the type of the current token, it returns the token and advances the lexer. Otherwise, it stops parsing and emits an error. This is useful when you know which token must appear next.
lexer:expectmatch(tokentype,openingtokentype,linenumber)
Same as expect
but provides better error reporting for matched tokens. For instance, to parse the closing brace }
of a list you can call lexer:expectmatch('}','{',lineno)
. It will report a mismatched bracket as well as the opening and closing lines.
lexer.source
A string containing the filename, or an identifier for the stream (useful for later error reporting).
lexer:error(msg)
Report a parse error and give up. msg
is a string. Does not return.
lexer:errorexpected(msg)
Report that the string msg
was expected but did not appear. Does not return.
lexer:ref(name)
name
is a string. Indicates to the Terra parser that your language may refer to the Lua variable name
. This function must be called for any free identifiers that you are interested in looking up. Otherwise, the identifier may not appear in the lexical environment passed to your constructor functions. It is safe (though less efficient) to call it for identifiers that your language may not end up referencing.
lexer:luaexpr()
Parses a single Lua expression from the token stream. This can be used to switch back into the Lua language for expressions in your language. For instance, Terra uses this to parse its types (which are just Lua expressions): var a : aluaexpression(4) = 3
. It returns a function function(lexicalenv)
that takes a table of the current lexical scope (such as the one returned from environment_function
in the constructor) and returns the value of the expression evaluated in that scope. This function is not intended to be used to parse a Lua expression into an AST. Currently, parsing a Lua expression into an AST requires you to write the parser yourself. In the future, we plan to add a library that will let you pick and choose pieces of Lua/Terra’s grammar to use in your language.
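A minimal sketch of this pattern, assuming a made-up entry-point keyword `negate` followed by an arbitrary Lua expression: the expression is parsed once with `lexer:luaexpr()`, and the resulting function is called inside the constructor with the table returned by `environment_function`. The name `negateexpression` is hypothetical.

```lua
-- Hypothetical expression method: `negate <lua expression>`.
local function negateexpression(self, lexer)
    lexer:expect("negate")
    -- Parse the embedded Lua expression now; evaluate it later.
    local expr = lexer:luaexpr()
    return function(environment_function)
        -- Evaluate the Lua expression in the caller's lexical scope.
        return -expr(environment_function())
    end
end
```

Parsing happens while the lexer is still valid, while evaluation is deferred to the constructor, where the lexical environment is available.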
lexer:luastats()
Parses a set of Lua statements from the token stream until it reaches an end-of-block keyword (end
, else
, elseif
, etc.). This can be used to help build domain specific languages that are supersets of Lua without having to reimplement all of the Lua parser.
lexer:terraexpr()
Parses a single Terra expression from the token stream. This can be used to help build domain specific languages that are supersets of Terra without having to reimplement all of the Terra parser.
lexer:terrastats()
Parses a set of Terra statements from the token stream until it reaches an end-of-block keyword (end
, else
, elseif
, etc.). This can be used to help build domain specific languages that are supersets of Terra without having to reimplement all of the Terra parser.
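The lexer methods above are usually combined in a small recursive-descent parser. The following sketch assumes a made-up entry-point keyword `total`, so that `total { a, 2, b }` evaluates to the sum of the listed numbers and Lua variables. The function name `totalexpression` and the syntax are hypothetical. Only `expect`, `matches`, `nextif`, `expectmatch`, and `ref` from the list above are used.

```lua
-- Hypothetical expression method: `total { a, 2, b }`.
local function totalexpression(self, lexer)
    lexer:expect("total")
    local open = lexer:expect("{")
    local constant, names = 0, {}
    if not lexer:matches("}") then
        repeat
            -- Try the first alternative: an identifier.
            local name = lexer:nextif(lexer.name)
            if name then
                -- Ask for this variable to be present in the environment.
                lexer:ref(name.value)
                names[#names + 1] = name.value
            else
                -- Otherwise a number must appear here.
                constant = constant + lexer:expect(lexer.number).value
            end
        until not lexer:nextif(",")
    end
    -- Report a mismatched brace relative to the opening line.
    lexer:expectmatch("}", "{", open.linenumber)

    -- The lexer is not used below; only the data captured above is.
    return function(environment_function)
        local env = environment_function()
        local sum = constant
        for _, name in ipairs(names) do
            sum = sum + env[name]
        end
        return sum
    end
end
```

Numeric literals are folded into `constant` at parse time, while variables are looked up each time the constructor runs, so the result follows the variables' current values.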
Abstract Syntax Description Language (ASDL) is a way of describing compiler intermediate representations (IR) and other tree- or graph-based data structures in a concise way. It is similar in many ways to algebraic data types, but offers a consistent cross-language specification. ASDL is used in the Python compiler to describe its abstract syntax tree, and is also used internally in Terra to represent Terra code.
We provide a Lua library for parsing ASDL specifications that can be used to implement IR and other data-structures that are useful when building domain-specific languages. It allows you to parse ASDL specifications to create a set of Lua classes (actually specially defined meta-tables) for building IR. The library automatically sets up the classes with constructors for building the IR, and additional methods can be added to the classes using standard Lua method definitions.
local asdl = require 'asdl'
The ASDL package comes with Terra.
context = asdl.NewContext()
ASDL classes are defined inside a context. Different contexts do not share anything. Each class inside a context must have a unique name.
local Types = asdl.NewContext()
Types:Define [[
    # define a simple record type with two members
    Real = (number mantissa, number exp)
    #       ^~~~ field type         ^~~~~ field name
    # define a tagged union (aka a variant, discriminated union, sum type)
    # with several optional data types.
    # Here the type Stm has three sub-types
    Stm = Compound(Stm head, Stm next)
        | Assign(string lval, Exp rval)
    # '*' specifies that a field is a List object
    # '?' marks a field optional (may be nil as well as the type)
        | Print(Exp* args, string? format)
    Exp = Id(string name)
        | Num(number v)
        | Op(Exp lhs, BinOp op, Exp rhs)
    # Omitting () on a tagged union creates a singleton value
    BinOp = Plus | Minus
]]
Types can be Lua primitives returned by type(v)
(e.g. number, table, function, string, boolean), other ASDL types, or external types checked by arbitrary functions registered with context:Extern
.
External types can be used by registering a name for the type and a function that returns true for objects of that type:
Types:Extern("File",function(obj)
    return io.type(obj) == "file"
end)
local exp = Types.Num(1)
local assign = Types.Assign("x",exp)
local real = Types.Real(3,4)
local List = require 'terralist'
local p = Types.Print(List {exp})
Values are created by calling the class as a function. Arguments are checked to be the correct type on construction. Helpful warnings are emitted when the types are wrong.
Fields are initialized by the constructor:
print(exp.v) -- 1
By default, classes have a string representation:
print(assign) -- Assign(lval = x,rval = Num(v = 1))
And you can check for membership using :isclassof:
assert(Types.Assign:isclassof(assign))
assert(Types.Stm:isclassof(assign))
assert(Types.Exp:isclassof(assign) == false)
Singletons are not classes but values:
assert(Types.BinOp:isclassof(Types.Plus))
Classes are the metatables of their values and have Class.__index = Class:
assert(getmetatable(assign) == Types.Assign)
Tagged unions have a string field .kind that identifies which variant in the union the value is:
assert(assign.kind == "Assign")
You can define additional methods on the classes to add additional behavior:
function Types.Id:eval(env)
    return env[self.name]
end
function Types.Num:eval(env)
    return self.v
end
function Types.Op:eval(env)
    local op = self.op
    local lhs = self.lhs:eval(env)
    local rhs = self.rhs:eval(env)
    if op.kind == "Plus" then
        return lhs + rhs
    elseif op.kind == "Minus" then
        return lhs - rhs
    end
end
local s = Types.Op(Types.Num(1),Types.Plus,Types.Num(2))
assert(s:eval({}) == 3)
You can also define methods on the super classes which will be defined for sub-classes as well:
function Types.Stm:foo()
    print("foo")
end
assign:foo()
WARNING: To keep the metatable structure simple, this is not implemented with chained tables. Instead, a definition on the superclass is also copied to each subclass. Because of this design, YOU MUST DEFINE PARENT METHODS BEFORE CHILD METHODS. Otherwise, the parent method will clobber the child.
IF YOU NEED TO OVERRIDE AN ALREADY DEFINED METHOD LIKE __tostring, SET IT TO NIL FIRST IN THE SUPERCLASS:
Types.Stm.__tostring = nil
function Types.Stm:__tostring()
    return "<Stm>"
end
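One way to picture why the order of definitions matters is a plain-Lua model in which assigning a new key on a parent class copies the value into every subclass. The sketch below is an illustration under that assumption, not the asdl library's actual code. The helper `newclass`, the `members` table, and the method `describe` are all hypothetical.

```lua
-- Simplified model (an assumption, not asdl's source): assigning a key that
-- a class does not have yet copies the value into the class and its subclasses.
local function newclass(parent)
    local class = { members = {} }
    class.members[class] = true
    class.__index = class
    if parent then parent.members[class] = true end
    return setmetatable(class, {
        -- Lua only calls __newindex when the key is absent from the table.
        __newindex = function(self, key, value)
            for member in pairs(self.members) do
                rawset(member, key, value)
            end
        end,
    })
end

local Stm = newclass()
local Assign = newclass(Stm)
local a = setmetatable({}, Assign)

function Assign:describe() return "assign" end   -- child defined first
function Stm:describe() return "stm" end         -- parent copy clobbers it
print(a:describe())   --> stm

Stm.describe = function() return "stm v2" end    -- key exists: nothing is copied
print(a:describe())   --> stm

Stm.describe = nil                               -- clear the key first
function Stm:describe() return "stm v3" end      -- key absent again: copied down
print(a:describe())   --> stm v3
```

In this model, the copy is triggered only when the key is missing from the parent, which is consistent with the advice to set an existing method to nil before redefining it.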
As an extension to ASDL, you can use the module keyword to define a namespace. This helps when you have many different kinds of Exp and Type in your compiler.
Types:Define [[
    module Foo {
        Bar = (number a)
        Baz = (Bar b)
    }
    Outside = (Foo.Baz x)
]]
local a = Types.Foo.Bar(3)
Another extension allows you to mark any concrete type as unique. Unique types are memoized on construction so that if constructed with the same arguments (under Lua equality), the same Lua object is returned again. This works for types containing Lists (*) and Options (?) as well:
Types:Define [[
    module U {
        Exp = Id(string name) unique
            | Num(number v) unique
    }
]]
assert(Types.U.Id("foo") == Types.U.Id("foo"))