wwlisp Syntax

Atoms

Atoms are the most elementary things manipulated by the language. An atom can be:

a symbol: car, need-space, | h e l l o |,
an integer number: 1, 2000, -32768,
a floating point real number: 1.0, 1.0e+32,
a character string: "the quick brown fox...",
a memory segment,
any object resulting from the instantiation of a class.

Symbols, or symbolic atoms, are just names to which a value and properties can be bound. All the other atom types result from the instantiation of a specific class (either explicitly defined or implicit), and as such they are also objects. Symbolic atoms are the only atoms which are not objects instantiated from a class.

Properties

Any atom (not only symbolic atoms) can have properties. A property is defined by a property-name and a property-value. The property-name is another symbolic atom; the property-value can be anything, even nothing. The value of a symbolic atom is actually the property-value of its property named 'value'. The properties are manipulated with the functions defprop, putprop, getprop, remprop.
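
The relationship between the value of a symbol and its property list can be pictured with a small Python model. This is only a sketch and not wwlisp code: the Atom class is hypothetical, the Python helpers merely borrow the names putprop, getprop and remprop, and their argument order and return values are assumptions rather than the wwlisp signatures.

```python
# Hypothetical Python model of atoms with property lists; not wwlisp code.

class Atom:
    """An atom carrying a property list (property-name -> property-value)."""

    def __init__(self, name=None):
        self.name = name   # None models a nameless atom, i.e. an object
        self.plist = {}    # the property list


def putprop(atom, name, value):
    """Attach or replace a property and return the stored value."""
    atom.plist[name] = value
    return value


def getprop(atom, name):
    """Return the property-value, or None (standing in for nil) if absent."""
    return atom.plist.get(name)


def remprop(atom, name):
    """Remove a property if it is present."""
    atom.plist.pop(name, None)


def symbol_value(atom):
    """The value of a symbol is simply its property named 'value'."""
    return getprop(atom, "value")


speed = Atom("speed")
putprop(speed, "value", 42)
putprop(speed, "unit", "km/h")
```

In this model, asking for the value of speed and asking for its 'value' property are the same operation, which is the point made in the paragraph above.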

Lists

Atoms can be grouped in lists. A list begins with an opening parenthesis and ends with a closing parenthesis:

(one two three four)

A list can itself be composed of other lists. There is no limit to the level of embedding of lists in other lists:

(defun find(crit arg)
    (cond((= crit arg)t)
        ((listp arg)
            (remove(mapcar '(lambda fexpr(a)(find crit(car a))) arg)
                nil
                )
            )
        )
    )

A list can contain objects, but a list is never itself an object.

Forms

Atoms and lists make up expressions, called symbolic expressions. Evaluable symbolic expressions are called forms. The role of the interpreter is to evaluate each form that is presented to it, one after the other. Evaluating a form consists of finding the value of the form and returning it. For a form to be evaluated, it must first be read, i.e. typed on the keyboard or fetched from a file, and translated to an internal representation usable by the interpreter. This is done by the function read.

The read function can only convert forms, and more generally symbolic expressions, which are well-formed, i.e. which obey the conventions of the language. Once read, the form is evaluated by the function eval, and the result is shown on the console or stored in a file by the function print.

The following loop is the somewhat simplified fundamental structure of the interpreter:

(loop(print(eval(read))))

When a symbolic atom is evaluated, its value is returned (i.e. the content of its value property). All the other types of atoms are self-evaluating, i.e. they are their own value and the result of the evaluation is the atom itself. This is the case for numbers, strings, binary blocks and other instantiated objects.

When a list is evaluated, the first element is regarded as a function, and the following elements as arguments to that function. Arguments which are not self-evaluating and are correct forms are first evaluated themselves and replaced by their respective values before the function itself is evaluated. The evaluation proceeds from left to right, deepest first.
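
The evaluation order described above can be illustrated by a very small evaluator written in Python. It is a model under stated assumptions, not the wwlisp eval: the Symbol class and the FUNCTIONS and VALUES tables are hypothetical, only ordinary argument-evaluating functions are covered, and nested Python lists stand in for wwlisp lists.

```python
# Minimal Python sketch of the evaluation order; a model, not the wwlisp eval.
import operator


class Symbol(str):
    """A symbolic atom; a plain str models a self-evaluating wwlisp string."""


FUNCTIONS = {Symbol("+"): operator.add, Symbol("*"): operator.mul}
VALUES = {Symbol("x"): 10}
trace = []  # records the order in which list forms finish evaluating


def evaluate(form):
    if isinstance(form, Symbol):
        return VALUES[form]               # a symbol evaluates to its value
    if not isinstance(form, list):
        return form                       # numbers and strings self-evaluate
    head, *args = form
    values = [evaluate(a) for a in args]  # arguments first, left to right
    result = FUNCTIONS[head](*values)     # then the function itself
    trace.append((head, result))
    return result


S = Symbol
# Models the form (+ (* 2 3) (* x 4))
result = evaluate([S("+"), [S("*"), 2, 3], [S("*"), S("x"), 4]])
```

The trace list shows that both inner products are computed, leftmost first, before the outer sum is applied to their values.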

Functions

A function is a form which uses independent variables bound to something at the entry of the evaluation of the form and undefined outside.

A function can be defined by the function defun; in that case it has a name, is callable from everywhere and persists in the current environment. A function can also be anonymous and defined in the very place where it is to be used, i.e. as the first element of a list. In that case, the first element of the anonymous function definition itself must be the symbol lambda.

A function name is just a symbolic atom which has a function property. For a compiled function, this property has as value a pointer to somewhere in memory where the executable code has been loaded, and for an interpreted function (like one created with defun), the function property has as value a lambda function definition, i.e. a list beginning with lambda.

For example, the value of the function property of the symbol find from a previous example is:

(lambda(crit arg)
    (cond((= crit arg)t)
        ((listp arg)
            (remove(mapcar '(lambda fexpr(a)(find crit(car a))) arg)
                nil
                )
            )
        )
    )

There are several sorts of functions: subr, fsubr, expr, fexpr and macro. subr and fsubr functions are compiled; they are part of the interpreter or are loaded from compiled libraries such as shared objects or DLLs. expr, fexpr and macro functions are interpreted and defined with the help of defun.

subr and expr functions evaluate each of their arguments before binding the resulting value respectively to each independent variable at the entry into the body of the function. An expr function has a well-defined number of formal arguments, and expects as many actual arguments. Missing actual arguments are replaced by nil, the null value, and superfluous actual arguments are ignored and unreachable from inside the function.

fsubr and fexpr functions do not evaluate their arguments, and bind them all in a list as the value of one independent variable at the entry of the body of the function. So a fexpr function always has exactly one formal argument, which is bound to the list of all the actual arguments.

macro functions bind their arguments in the same way: unevaluated, and all in a list as the value of a single formal argument. Furthermore, a macro performs a second round of evaluation on the result of the first evaluation, so this first result must be a valid evaluable form.
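
The two binding conventions can be contrasted with a short Python sketch. It is illustrative only: bind_expr and bind_fexpr are hypothetical helper names, None stands in for nil, and the actual arguments handed to bind_expr are assumed to have been evaluated already.

```python
# Hypothetical model of argument binding for expr and fexpr/macro functions.

NIL = None  # stands in for nil


def bind_expr(formals, actuals):
    """expr binding: pad missing actuals with nil, drop superfluous ones."""
    padded = list(actuals[:len(formals)])
    padded += [NIL] * (len(formals) - len(padded))
    return dict(zip(formals, padded))


def bind_fexpr(formal, actual_forms):
    """fexpr/macro binding: one formal receives all the unevaluated forms."""
    return {formal: list(actual_forms)}


too_few = bind_expr(["a", "b", "c"], [1, 2])
too_many = bind_expr(["a"], [1, 2, 3])
unevaluated = bind_fexpr("args", [["+", 1, 2], "x"])
```

Here too_few binds c to nil, too_many silently loses 2 and 3, and unevaluated keeps the form (+ 1 2) as a list instead of the number 3.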

When the body of a function is about to be entered, the current properties of the symbolic atoms used as formal arguments are disconnected and stored in a cache, and the actual arguments are bound in their places. When leaving the body of a function, the temporary properties of those symbolic atoms are simply forgotten and the previous properties which were valid outside the body of the function are restored.

Classless Functions

A classless function is a function which is not a method specific to a class.

Generic Classless Functions

A generic function is a function which can be applied to several classes of objects.

Immediate Functions

An immediate function is a function which operates on one specific object or symbolic atom, or with a fixed value for an implicit argument, and is therefore less general than an ordinary function.

Classes

As explained above, everything except symbolic atoms and lists is an object. Besides the simple types of objects like integer, unsignedinteger, float, string or binary, more complicated objects can be built. In order to create such an object, its class must first be defined. The class determines an internal structure, properties and methods.

(defclass drawing(class))
(defclass schematic(drawing))
(defclass logicgate(schematic))

The function defclass defines a new class. The arguments are the name of the new class and a list of the parent classes. The text of the definition is checked for syntactical correctness, as defun does for a function.

The class definition is then attached as the class property to the symbolic atom chosen as class name.

A symbolic atom can have simultaneously a function property, a class property, a value property, and a lot of other different properties as well.

The list containing the lineage is constructed automatically by exploring all the parent classes, from the most specific to the most general, and recursively for every list of super-classes encountered, excluding redundant and circular references. The result is stored as the ancestry property of the symbolic atom naming the new class. It is therefore mandatory that all the parent classes are already defined when a new definition referring to them is evaluated; otherwise the still unknown ancestors cannot appear in the ancestry of the new class.
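
The construction of the lineage can be sketched in Python as follows. This is a model, not the interpreter's code: the PARENTS and ANCESTRY tables and the Python function named defclass are hypothetical, the depth-first traversal order is an assumption (the text does not say how several parents are interleaved), and the lineage is assumed not to include the class itself.

```python
# Hypothetical model of how the ancestry of a class could be computed.

PARENTS = {"class": []}    # class name -> direct parent class names
ANCESTRY = {"class": []}   # class name -> lineage fixed at definition time


def defclass(name, parents):
    """Register a class and freeze its lineage, most specific first."""
    PARENTS[name] = list(parents)
    lineage = []

    def visit(cls):
        for parent in PARENTS.get(cls, []):
            # Skip circular references, repeats and not-yet-defined parents.
            if parent == name or parent in lineage or parent not in PARENTS:
                continue
            lineage.append(parent)
            visit(parent)

    visit(name)
    ANCESTRY[name] = lineage
    return lineage


defclass("drawing", ["class"])
defclass("schematic", ["drawing"])
defclass("logicgate", ["schematic"])
```

Because the lineage is computed once, when the definition is evaluated, a parent defined afterwards never shows up in it, which is why the parents have to exist first.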

The inheritance diagram of the basic classes.

These classes are the basic classes defined before the startup of the interpreter. All the basic classes can be used as ancestor classes for further inheritance, and can be combined with new classes to define other inheriting classes. For example, one can define a new pressure class which inherits from float. All the operations available on float will be available on the new pressure type, besides the new operations that one can add explicitly to the new class. The interpreter can also be taught to read, manipulate and print objects of the new pressure class directly. To do that, a recognizer and a formatter for the new class (see the readerprinter class) must be added.

Objects

An object is an atom whose type is a basic class (like integer, unsignedinteger, float, string, binary, etc.) or any user-defined class. The function make-instance creates and returns a new object of any class.

If the definition of a class contains a list of super-classes, this list is walked through from the most specific to the most general class until a constructor method is found. That method is then invoked and the list of arguments is passed to it. Any class thus inherits all the constructors of its ancestors, and only the most specific is executed by default. If a class needs more than one constructor, it has to supply a wrapper constructor which calls all the other constructors explicitly. This is needed especially in the case of multiple inheritance.
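
The search for the most specific constructor can be pictured with the following Python sketch. It is only a model: find_constructor is a hypothetical helper, the set of defined function names is invented for the example, and checking the class itself before its ancestors is an assumption.

```python
# Hypothetical model of the constructor lookup performed at instantiation.

def find_constructor(cls, lineage, defined):
    """Return the name of the most specific '<class>-constructor' defined."""
    for candidate in [cls] + list(lineage):
        name = candidate + "-constructor"
        if name in defined:
            return name
    return None  # no constructor anywhere in the lineage


# Invented example: only two of the four classes supply a constructor.
defined = {"drawing-constructor", "class-constructor"}
chosen = find_constructor("logicgate", ["schematic", "drawing", "class"], defined)
```

In this example chosen is drawing-constructor: the walk stops at the first match, so class-constructor would run only if drawing-constructor called it explicitly, as a wrapper constructor does.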

The object returned by make-instance, in order to be manipulated further, should be stored somewhere, for example as the value of a symbolic atom.

Once an object is no longer useful, it can be disposed of by calling destroy-instance, which invokes the most specific destructor for the class.

Methods

Normally, all the interactions with an object and with the internal properties of an object must occur through functions specially defined for the class of that object, which are the only pieces of code aware of the internal structure of the object. Those functions are the methods of the class. A method is a function of any type whose name is composed of the class name, a hyphen, and another name.

<classname>-<methodname>

For example, for the class stream:

stream-constructor

A method is defined by defun. For instance, the definition of the method pretty-print for the class readerprinter is:

(defun readerprinter-pretty-print(sexpr)
    (prog(tab depth lasttoken switchstack flat)
        (setq tab 4 depth 0 switchstack nil)
        (if(catch 'error-*(eval(cons 'pprint(cons this(list sexpr))))
                t
                )
            nil
            (print this sexpr)
            )
        )
    )

Like a symbolic atom, an object needs a property list. Actually, objects are stored as nameless symbols. So, as for a symbolic atom, the properties of an object can be accessed by set, defprop, putprop, getprop, remprop and plist.

The form for invoking a method of an object is a bit special: the first element of the form must be the method name (just the method part of it, without class name or hyphen) and the second element must be a form which evaluates to an object of the expected class. The next arguments are those that must be passed to the method. The fact that in a method invocation the object must always be the second element of the form has led to a peculiarity of wwlisp: the arguments of some classical sequence-handling Lisp functions have been reordered to some extent in order to get the object in the right place.

For example, to pretty-print a function definition on stdout:

(pretty-print stdout(getprop 'find 'function))

where pretty-print is actually the invocation of the method readerprinter-pretty-print of the class readerprinter, which is one of the ancestor classes of the class symbolicstream, which is the class of the object stored as value of the symbolic atom stdout.

Evaluation

An essential part of the interpreter is the eval function. The object-oriented nature of wwlisp is actually implemented inside eval. The following description details the functioning of eval, according to the kind of form which is submitted to it.

Symbolic Atom

eval just returns the value of the symbolic atom, i.e. the value of the value property (this seems recursive but is not).

List

Given a list s:

1. if the first element of the list s is a symbolic atom which is bound to a function of type fsubr, fexpr or macro, then the function is applied to the following arguments without further processing and the evaluation is finished;
2. if the first element of the list s is a symbolic atom not bound to a function of type fsubr, fexpr or macro, then the following element of s (thus the second element) is evaluated and cached;
3. if this value is an object o, then a function is looked up whose name is the object class name + hyphen + the symbolic atom (first element of s); the lineage of the object is walked through to find the function;
4. if a method is found, it is called; before the call, its arguments (the rest of s from the third on) are evaluated or not depending on the method being subr/expr or fsubr/fexpr; before entering the body of the function, besides the binding of the formal arguments, the global symbol this is unbound from its properties, which are cached, and the cached value of the object o is temporarily bound to it; at the exit of the body, the initial properties of this are restored at the same time as those of the formal arguments; the evaluation is then finished;
5. if conditions 3 and 4 are false, i.e. the second element of the list s, after evaluation, is not an object, or there is no method defined in the lineage for this class of object, then a classless function of the same name as the method is invoked (thus without class and hyphen prepended), giving it all the arguments from the second (already evaluated) on; the evaluation is then finished;
6. if no classless function is found in step 5, an error is thrown;
7. if the first element of the list s, as tested in condition 1, is not a symbol but a list, this list must be a lambda-definition; this lambda-definition is executed with the evaluated arguments, and the evaluation is finished;
8. if the first element of s is not a lambda-definition, then an error is thrown.

So, when a classless fsubr, fexpr or macro is defined with the same name as a class method, the method will always be overridden by the classless fsubr, fexpr or macro and will never be called. By contrast, a method can have the same name as a classless subr or expr without this side effect. Furthermore, a method can itself be a fsubr, fexpr or macro. This asymmetry is necessary because the arguments of a classless fsubr or fexpr must not be evaluated: eval cannot evaluate the second element to look for an object without breaking that guarantee.
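
The order in which eval chooses between a classless function and a method can be modelled in a few lines of Python. The sketch covers only the lookup (conditions 1, 3, 4, 5 and 6 above) and not the calls themselves; Obj, resolve, the class names and the function table are all hypothetical, and checking the class of the object before its lineage is an assumption.

```python
# Hypothetical model of how eval selects the function for (head second ...).

SPECIAL = {"fsubr", "fexpr", "macro"}


class Obj:
    """A stand-in for an instantiated object; only its class name matters."""

    def __init__(self, cls):
        self.cls = cls


def resolve(head, second, functions, lineage_of):
    """Return the name of the function selected for the form.

    functions maps a function name to its kind; lineage_of maps a class
    name to its ancestors, most specific first; second is the already
    evaluated second element (unused when condition 1 applies).
    """
    # Condition 1: a classless fsubr, fexpr or macro always wins.
    if functions.get(head) in SPECIAL:
        return head
    # Conditions 3 and 4: search the lineage of the object for a method.
    if isinstance(second, Obj):
        for cls in [second.cls] + lineage_of.get(second.cls, []):
            name = cls + "-" + head
            if name in functions:
                return name
    # Condition 5: fall back to a classless function of the same name.
    if head in functions:
        return head
    # Condition 6: nothing found.
    raise LookupError("no function or method named " + head)


functions = {
    "shape-area": "expr",
    "area": "expr",
    "describe": "fexpr",
    "circle-describe": "expr",
}
lineage_of = {"circle": ["shape"]}

as_method = resolve("area", Obj("circle"), functions, lineage_of)
as_classless = resolve("area", 3, functions, lineage_of)
shadowed = resolve("describe", Obj("circle"), functions, lineage_of)
```

The last call shows the shadowing effect: because a classless fexpr named describe exists, circle-describe is never reached through the short method name.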

Integer, Unsigned Integer, Real, String, Binary and Other Object

When an atom of type integer, unsignedinteger, float, string or binary, or an instantiated object of a non-base class is evaluated, the result of the evaluation is the object itself, unmodified. For those types, the evaluation is a neutral operation.

Specific Case of the Macro

In the case of a macro, the generated form does not inherit the context of the body of the macro (the this variable and the independent variables), and thus the form generated by the first pass of the macro must use the values directly instead of the symbols.

Natural Syntax

Thanks to the rules and priorities explained above, the interpreter can correctly understand problems like the following: in this file, the classes point, line, plane, vector have been implemented. With those classes, objects can be created in order to do some computational geometry, for example calculating the coordinates of the intersection between a plane and a line.

Scope

Environment

The whole of the symbols, objects and functions defined when an evaluation occurs forms the environment for that evaluation. At the entry of a function or of a prog block, the formal arguments (for a function) or the local variables (for a prog block) temporarily lose all the properties which were attached to them and receive new values. The former properties are restored automatically at the exit of the function or prog block.

Dynamic Scoping

The language uses dynamic scoping by default. This means that the properties of a symbolic atom, be it a variable or a function, which is defined as a formal argument of a function or a local variable of a prog, can be seen and modified from any depth further down the stack of executing functions. A called function can use variables which seem global to it, but are in reality local to a calling function. The closure block allows defining variables and functions which do not take part in dynamic scoping, but are seen only by the functions defined in the same lexical closure.
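
Dynamic scoping with save and restore of the previous bindings can be imitated in Python with one global table. This is a simplified sketch: values, call_with_bindings, caller and callee are hypothetical names, and only the value of each symbol is saved, whereas the interpreter is described as caching the whole property list.

```python
# Hypothetical model of dynamic scoping by saving and restoring bindings.

MISSING = object()  # marks a symbol that had no value before the call
values = {}         # symbol name -> current value (one global environment)


def call_with_bindings(bindings, body):
    """Bind symbols for the duration of body(), then restore the old values."""
    saved = {name: values.get(name, MISSING) for name in bindings}
    values.update(bindings)
    try:
        return body()
    finally:
        for name, old in saved.items():
            if old is MISSING:
                values.pop(name, None)
            else:
                values[name] = old


def callee():
    # 'x' looks global here but is really the caller's local binding.
    values["x"] += 1
    return values["x"]


def caller():
    # Binds x to 10, calls callee, then reads x again inside the same scope.
    return call_with_bindings({"x": 10}, lambda: (callee(), values["x"]))


values["x"] = 1
seen = caller()
```

Inside caller, the increment made by callee is visible (both elements of seen are 11), while the outer value of x is 1 again once caller has returned.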

Visibility in a Method

When the execution has entered the body of a method, the current object is automatically set as the value of the this symbol. this has the same meaning and purpose as in C++ and allows calling other methods on the object from within a method. The catch is that the properties of this are switched each time the evaluation enters and leaves a called sub-method, so it is not possible to attach new properties to this in a method and use them in a called sub-method. As it is the object-oriented eval which performs that switch, it is always possible to call a sub-method directly by its full name class+hyphen+method, without the object argument; the class-seeking mechanism is then bypassed, the function call occurs directly, and the called function works on the same instance of this as the calling function.

Flow Control

Basic Block Constructs and Non-Local Jumps

Basic blocks:

(prog ())
(prog1)
(prog2)
(progn)

Non-local jumps:

(go <label>) 
(return <optional value>)
(return-from '<from-function> <optional value>)
(throw '<label> <optional value> '<optional from-function>)

PROGN Basic Block

(progn
    <form 1>
    <form 2>
    ...
    <form n>
    )

returns the result of <form n>

Any non-local jump function evaluated in the dynamic scope of a progn block forces the progn sequence of execution to be abandoned and the stack to be unrolled up to the designated target. progn is transparent to go, return and throw, which jump through it without stopping, but return-from can be made to stop jumping just after unrolling the progn, which allows returning from a progn construct. return-from is the only non-local jump which can be used to return prematurely from an expr or fexpr function.

A progn construct is implicitly found in:

do: in the stopping sequence
cond: in each clause
defun: in the function body
lambda: in the function body

Blocks Derived from PROGN

(prog1
    <form 1>
    <form 2>
    ...
    <form n>
    )

returns the result of <form 1>

(prog2
    <form 1>
    <form 2>
    ...
    <form n>
    )

returns the result of <form 2>

prog1 and prog2 behave like progn with respect to non-local jumps. If return-from is used, it overrides the default return value of the progX form.
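
The difference between the three sequencing blocks is only in which result is returned, as the following Python model shows. It is a sketch, not wwlisp: each form is represented by a zero-argument Python function, step is a hypothetical helper that logs its evaluation, and returning None (for nil) when the selected form is missing is an assumption.

```python
# Hypothetical model of progn, prog1 and prog2 using zero-argument functions.

log = []  # records every form evaluation, in order


def step(value):
    """Build a 'form' that logs its own evaluation and returns value."""
    def form():
        log.append(value)
        return value
    return form


def progn(*forms):
    """Evaluate all forms in order; return the result of the last one."""
    result = None
    for form in forms:
        result = form()
    return result


def prog1(*forms):
    """Evaluate all forms in order; return the result of the first one."""
    results = [form() for form in forms]
    return results[0] if results else None


def prog2(*forms):
    """Evaluate all forms in order; return the result of the second one."""
    results = [form() for form in forms]
    return results[1] if len(results) > 1 else None


last = progn(step("a"), step("b"), step("c"))
first = prog1(step("a"), step("b"), step("c"))
second = prog2(step("a"), step("b"), step("c"))
```

In all three cases every form is evaluated, in order, as the log shows; only the value handed back differs.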

PROG Basic Block

(prog (var1 var2 ... varn)
    <form 1>
    <form 2>
    ...
    tag1
    ...
    <form n>
    )

returns nil

A go form evaluated in the dynamic scope of the prog can cause a jump to a tag defined in the sequence of the prog. A go form which does not find a corresponding tag in an enclosing prog will jump down through as many levels of either lexically or dynamically nested forms, unrolling the stack altogether, until either one corresponding tag is found in an enclosing prog, or toplevel is reached; in the latter case an error is thrown.

A return form evaluated in the dynamic scope of the prog shall cause a jump to just after the first enclosing prog. A return form which does not find an enclosing prog will jump down through as many levels of either lexically or dynamically nested forms, unrolling the stack altogether, until either one enclosing prog is found, or toplevel is reached; in the latter case an error is thrown.

A return-from form evaluated in the dynamic scope of the prog shall cause a jump to just after any enclosing block construct of which the name is given as first argument to the return-from. A return-from form which does not find an enclosing block construct with the correct name will jump through as many levels of nested forms either lexically or dynamically, unrolling the stack altogether, until either one enclosing block construct with the correct name is found, or toplevel is reached; in the latter case an error is thrown.

A throw form evaluated anywhere shall cause a jump to the nearest either lexically or dynamically enclosing catch, which has the same label; if no catch with that label is found, an error is thrown; the stack is not unrolled and the error stops the evaluation at the throw.
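
The label matching between throw and catch can be approximated in Python with exceptions and a list of active labels. This is a rough model under stated assumptions: catch, throw, Throw and active_labels are hypothetical Python names, the optional from-function argument of throw is left out, and an unmatched throw is reported by a Python exception, which cannot reproduce the behaviour of stopping at the throw without unrolling the stack.

```python
# Hypothetical model of catch/throw label matching using Python exceptions.

class Throw(Exception):
    """Carries the label and value of a throw up to the matching catch."""

    def __init__(self, label, value):
        super().__init__(label)
        self.label = label
        self.value = value


active_labels = []  # labels of the catch forms currently being evaluated


def catch(label, body):
    """Evaluate body(); a throw to this label returns its value from here."""
    active_labels.append(label)
    try:
        return body()
    except Throw as jump:
        if jump.label != label:
            raise          # not ours: let an enclosing catch handle it
        return jump.value
    finally:
        active_labels.pop()


def throw(label, value=None):
    """Jump to the nearest enclosing catch with the same label."""
    if label not in active_labels:
        # No matching catch: the error is detected here, before any jump.
        raise RuntimeError("no catch for label " + repr(label))
    raise Throw(label, value)


direct = catch("done", lambda: throw("done", 5))
nested = catch("outer", lambda: catch("inner", lambda: throw("outer", "x")))
```

The nested example shows a throw passing through an inner catch with a different label and being received by the outer one.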

A prog construct is implicitly found in:

do: in the do body sequence

A prog construct without local variables is implicitly found in:

loop: in the body of the loop