by

Peter Fritzson,

PELAB - Programming Environment Laboratory

Dept. of Computer and Information Science,

Linköping University, Sweden

Environments (section 7.5)

Formal semantics (sections 12.1-12.2)

Natural semantics and RML

Abstract data types

Chapter 7, section 7.5, p 220

The environment contains bindings of declared variables, types, etc.

See e.g. the RML tutorial, page 118.

program fibonacci;

var

res : integer;

function fib(x : integer) : integer;

...

end

[Figure: the environment after entering the scope level of function fib]

The above is an environment example for the language Petrol, whose semantics has been specified using Natural Semantics and RML.
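As a rough illustration of how such an environment can be organised, here is a minimal Python sketch of a scoped table of bindings. The class name `Env`, its methods, and the shape of the binding records are assumptions made for this example; they are not taken from the Petrol specification.

```python
class Env:
    """A stack of scopes; each scope maps an identifier to its binding."""

    def __init__(self):
        self.scopes = [{}]              # scope level 0: the program level

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, binding):
        self.scopes[-1][name] = binding

    def lookup(self, name):
        # Search the innermost scope first, then the enclosing ones.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise KeyError(name)


# Hypothetical binding records for the fibonacci program above.
env = Env()
env.declare("res", ("var", "integer"))
env.declare("fib", ("function", ["integer"], "integer"))
env.enter_scope()                       # entering the scope level of fib
env.declare("x", ("var", "integer"))    # the formal parameter x
```

Inside the scope of fib, both `x` and the outer `res` are visible; after `exit_scope()` only the program-level bindings remain.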

Chapter 12, p 460-473

Programming languages (and human languages) can be defined in stages

Symbols are primitives in the language, e.g. 5, "abc", foo

Syntax defines which combinations of symbols are legal,

e.g. x := 35, if x<5 then ...

Semantics defines the meaning of constructs in the language

*Informal semantics*.

This expresses the meaning of language constructs in human languages like Swedish or English.

*Formal semantics*.

This defines the meaning in terms of some formal, exact, "mathematical" notation with well-defined meaning.

Operational semantics -- use an abstract machine

Denotational semantics -- map to abstract functions

Axiomatic semantics -- logic, preconditions/postconditions

An earlier formalism called attribute grammars is an extension of BNF grammars

definition using an abstract machine

logical inference rules -- premises, conclusion

reduction machine, based on logical inference rules

**RML is a System for Practical Language Implementation from Natural Semantics -- a form of operational semantics based on natural deduction**

See the RML tutorial, chapters 1,2 and 9.

A meta language for Natural Semantics

Efficiency

Separation of input and output arguments/results

Statically strongly typed

Polymorphic type inference

Efficient compilation of pattern-matching

Natural semantics specifications consist of rules inspired by natural deduction and structured operational semantics.

The general syntactic form of rules, as they appear in most of the literature, is approximately:

H1 ⊢ T1 : R1 . . . Hn ⊢ Tn : Rn

--------------------------------- if <cond>

H ⊢ T : R

The clauses H1 ⊢ T1 : R1 . . . Hn ⊢ Tn : Rn are premises, which need to be satisfied in order for the conclusion H ⊢ T : R to be drawn.

RML is a strongly typed, deterministic implementation of a meta language for specifications in Natural Semantics.

The above general form of rule would appear approximately as follows in RML:

rule RelNameX(H1,T1) => R1 &

...

RelNameY(Hn,Tn) => Rn &

...

<cond>

-----------------------------

ThisRelationName(H,T) => R

A rule consists of premises and a conclusion:

rule premise1 & premise2

-------------------

conclusion

Define a minimal evaluator for negating numbers.

Examples from this language:

5

-9

The RML abstract syntax of integers and negative integers:

datatype Exp = INTconst of int

| NEGop of Exp

The RML semantic rules within a relation eval:

relation eval: Exp => int =

axiom eval( INTconst(ival) ) => ival

rule eval(e) => v1 & int_neg(v1) => v2

-----------------------------------

eval( NEGop(e) ) => v2

end

Then an approximate *Mathematica* form, where the premises never fail:

ClearAll[eval];

SetAttributes[eval,HoldAll];

eval[INTconst[ival_]] := ival;

eval[NEGop[e_]] /;

(v1= eval[e]; v2= -v1; True) :=

v2;

Evaluate an INTconst leaf node:

eval[INTconst[34]]

Trace the rule calls:

TracePrint[eval[INTconst[33]], eval[___] | Set[__]]

Evaluate the negation of an INTconst node:

eval[NEGop[INTconst[33]]]

Trace the rule and assignment calls:

TracePrint[eval[NEGop[INTconst[33]]], eval[___] | Set[__]]

The RML abstract syntax definition for Exp1:

datatype Exp = INTconst of int

| ADDop of Exp * Exp

| SUBop of Exp * Exp

| MULop of Exp * Exp

| DIVop of Exp * Exp

| NEGop of Exp

An example expression, 12+5*13, represented as an abstract syntax tree: ADDop(INTconst(12), MULop(INTconst(5), INTconst(13)))

The semantic rules for the Exp1 mini-language are defined in RML by the eval relation below.

For example, evaluation of an integer constant node is the integer itself.

Evaluation of an addition node ADDop is v3, if v3 is the result of adding the evaluated results of its children e1 and e2.

Subtraction, multiplication, division operators have similar rules.

relation eval: Exp => int =

axiom eval( INTconst(ival) ) => ival

rule eval(e1) => v1 & eval(e2) => v2 & int_add(v1,v2) => v3

----------------------------------------------------------

eval( ADDop(e1,e2) ) => v3

rule eval(e1) => v1 & eval(e2) => v2 & int_sub(v1,v2) => v3

----------------------------------------------------------

eval( SUBop(e1,e2) ) => v3

rule eval(e1) => v1 & eval(e2) => v2 & int_mul(v1,v2) => v3

----------------------------------------------------------

eval( MULop(e1,e2) ) => v3

rule eval(e) => v1 & int_neg(v1) => v2

-----------------------------------

eval( NEGop(e) ) => v2

end
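For comparison, the eval relation above can be sketched in Python as a recursive function over a small set of node classes. The class layout, the function name `eval_exp`, and the use of an exception to model a relation that has no applicable rule are assumptions of this sketch, not part of RML. As in the listing above, no rule is given for DIVop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class INTconst:
    ival: int


@dataclass(frozen=True)
class ADDop:
    e1: object
    e2: object


@dataclass(frozen=True)
class SUBop:
    e1: object
    e2: object


@dataclass(frozen=True)
class MULop:
    e1: object
    e2: object


@dataclass(frozen=True)
class NEGop:
    e: object


class RelationFailure(Exception):
    """Raised when no rule of the relation matches (models failure in RML)."""


def eval_exp(e):
    if isinstance(e, INTconst):          # axiom: a constant evaluates to itself
        return e.ival
    if isinstance(e, ADDop):             # premises: evaluate both children, then add
        return eval_exp(e.e1) + eval_exp(e.e2)
    if isinstance(e, SUBop):
        return eval_exp(e.e1) - eval_exp(e.e2)
    if isinstance(e, MULop):
        return eval_exp(e.e1) * eval_exp(e.e2)
    if isinstance(e, NEGop):
        return -eval_exp(e.e)
    raise RelationFailure(repr(e))


# The expression 12+5*13 as an abstract syntax tree.
tree = ADDop(INTconst(12), MULop(INTconst(5), INTconst(13)))
print(eval_exp(tree))
```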

An approximate version of the semantics for Exp1 in *Mathematica*, where we assume that premises to the rules never fail (i.e. use True):

ClearAll[eval];

SetAttributes[eval,HoldAll];

eval[INTconst[ival_]] := ival;

eval[ADDop[e1_,e2_]] /; Module[{v1,v2},

(v1=eval[e1]; v2=eval[e2]; v3=v1+v2; True)] :=

v3;

eval[SUBop[e1_,e2_]] /; Module[{v1,v2},

(v1=eval[e1]; v2=eval[e2]; v3=v1-v2; True)] :=

v3;

eval[MULop[e1_,e2_]] /; Module[{v1,v2},

(v1=eval[e1]; v2=eval[e2]; v3=v1*v2; True)] :=

v3;

eval[NEGop[e_]] /; Module[{v1},

(v1= eval[e]; v2= -v1; True)] :=

v2;

A simple expression: 12+5*13

Standard *Mathematica* abstract syntax, i.e. FullForm, for this expression:

Hold[12+5*13]//FullForm

Head[12]

Head[Plus[aa,bb]]

Our own abstract syntax:

ADDop[INTconst[12],MULop[INTconst[5],INTconst[13]]]

Call the Exp1 evaluator on the above example:

eval[ADDop[INTconst[12],MULop[INTconst[5],INTconst[13]]]]

Using *Mathematica*'s standard evaluator gives the same result:

12+5*13

Trace the rule calls and premise assignment calls:

TracePrint[eval[ADDop[INTconst[12],MULop[INTconst[5],INTconst[13]]]],

eval[___] | Set[__]]

relation lookup: (Env,Ident) => Value =

lookup returns the value associated with an identifier. If no association is present, lookup will fail.

Identifier id is found in the first pair of the list, and value is returned.

rule id = id2

------------------------------

lookup((id2,value) :: _, id) => value

id is not found in the first pair of the list, and lookup will recursively search the rest of the list. If found, value is returned.

rule not id=id2 & lookup(rest, id) => value

-------------------------------------

lookup((id2,_) :: rest, id) => value

end

Below is a *Mathematica* version of the Natural Semantics RML rules that specify lookup. Note that we have to handle Fail explicitly here by extra rules. The Fail mechanism is already built into RML.

ClearAll[lookup];

lookup[{{id2_,value_},___},id_] /;

(id2===id) :=

value;

lookup[{{id2_,_},rest___},id_] /;

(id2=!=id && (value=lookup[{rest},id]; value=!=Fail)) :=

value;

lookup[{},id_] := Fail;

lookup[___] := Fail;

Looking up the value of bb in the environment {{aa,10},{bb,37}}, which is a list of pairs of identifier and value:

lookup[{{aa,10},{bb,37}},bb]

Lookup of xx fails since it is not present in the environment:

lookup[{{aa,10},{bb,35}},xx]
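The same two rules can be sketched in Python, with the environment as a list of (identifier, value) pairs. Modelling failure with an exception named `Fail` is an assumption of this sketch; in RML the failure mechanism is built in.

```python
class Fail(Exception):
    """Models failure of the lookup relation."""


def lookup(env, ident):
    if not env:
        # No rule matches an empty environment, so the relation fails.
        raise Fail(ident)
    (id2, value), rest = env[0], env[1:]
    if id2 == ident:
        # First rule: the identifier is found in the first pair.
        return value
    # Second rule: search the rest of the list recursively.
    return lookup(rest, ident)


env = [("aa", 10), ("bb", 37)]
print(lookup(env, "bb"))
```

Looking up an identifier that is not bound, such as `xx`, raises `Fail` once the recursion reaches the empty list.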

In *Mathematica* it is more efficient to use the built-in indexing operation, e.g. by defining lookup directly for each binding:

ClearAll[lookup];

lookup[env,aa]=10;

lookup[env,bb]=37;

lookup[env,bb]

Machine code:

LOAD Load accumulator

STO Store

ADD Add

SUB Subtract

MULT Multiply

DIV Divide

GET Input a value

PUT Output a value

J Jump

JN Jump on negative

JP Jump on positive

JZ Jump on zero

JNZ Jump on negative or zero

JPZ Jump on positive or zero

JNP Jump on negative or positive

LAB Label (no operation)

HALT Halt execution

PAM code:

read x,y;

while x<> 99 do

ans := (x+1) - (y / 2);

write ans;

read x,y

end

Translated machine code:

GET x

GET y

L1 LAB

LOAD x

SUB 99

JZ L2

LOAD x

ADD 1

STO T1

LOAD y

DIV 2

STO T2

LOAD T1

SUB T2

STO ans

PUT ans

GET x

GET y

J L1

L2 LAB

HALT
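To make the behaviour of the translated program concrete, the following Python sketch interprets it on a single-accumulator machine. The jump conditions follow the instruction table above; the memory model, the treatment of numeric operands as literals, and integer division for DIV are assumptions of this sketch, and the names `parse` and `run` are hypothetical.

```python
SOURCE = """
GET x
GET y
L1 LAB
LOAD x
SUB 99
JZ L2
LOAD x
ADD 1
STO T1
LOAD y
DIV 2
STO T2
LOAD T1
SUB T2
STO ans
PUT ans
GET x
GET y
J L1
L2 LAB
HALT
"""


def parse(source):
    """Turn the listing into (label, opcode, operand) triples."""
    program = []
    for line in source.split("\n"):
        words = line.split()
        if not words:
            continue
        if words[-1] == "LAB":
            program.append((words[0], "LAB", None))
        elif len(words) == 1:
            program.append((None, words[0], None))
        else:
            op, arg = words
            program.append((None, op, int(arg) if arg.isdigit() else arg))
    return program


def run(program, inputs):
    """Execute the program and return the list of values written by PUT."""
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    arith = {
        "ADD": lambda a, b: a + b,
        "SUB": lambda a, b: a - b,
        "MULT": lambda a, b: a * b,
        "DIV": lambda a, b: a // b,      # assumption: integer division
    }
    jumps = {
        "J": lambda a: True,
        "JN": lambda a: a < 0,
        "JP": lambda a: a > 0,
        "JZ": lambda a: a == 0,
        "JNZ": lambda a: a <= 0,         # negative or zero, as in the table
        "JPZ": lambda a: a >= 0,
        "JNP": lambda a: a != 0,
    }
    memory, acc, pc, output = {}, 0, 0, []
    pending = iter(inputs)

    def value(operand):
        # Numeric operands are literals; names refer to memory cells.
        return operand if isinstance(operand, int) else memory[operand]

    while True:
        _, op, arg = program[pc]
        pc += 1
        if op == "HALT":
            return output
        if op == "LAB":
            continue
        if op == "LOAD":
            acc = value(arg)
        elif op == "STO":
            memory[arg] = acc
        elif op == "GET":
            memory[arg] = next(pending)
        elif op == "PUT":
            output.append(memory[arg])
        elif op in arith:
            acc = arith[op](acc, value(arg))
        elif op in jumps:
            if jumps[op](acc):
                pc = labels[arg]
        else:
            raise ValueError("unknown opcode: " + op)


PROGRAM = parse(SOURCE)
print(run(PROGRAM, [5, 4, 99, 0]))
```

With the inputs 5, 4, 99, 0 the loop body runs once, computing (5+1) - (4/2), and the second read sets x to 99, which ends the loop.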

Representation:

MGET( I(x) )

MGET( I(y) )

MLABEL( L(1) )

MLOAD( I(x) )

MB(MSUB,N(99) )

MJ(MJZ, L(2) )

MLOAD( I(x) )

MB(MADD,N(1) )

MSTO( T(1) )

MLOAD( I(y) )

MB(MDIV,N(2) )

MSTO( T(2) )

MLOAD( T(1) )

MB(MSUB,T(2) )

MSTO( I(ans) )

MPUT( I(ans) )

MGET( I(x) )

MGET( I(y) )

MJMP( L(1) )

MLABEL( L(2) )

MHALT

Relation trans_expr

relation trans_expr: Exp => Mcode list =

axiom trans_expr(INT(v)) => [MLOAD( N(v))]

axiom trans_expr(IDENT(id)) => [MLOAD( I(id))]

....

Simple case: when e2 translates to a single load, the operation is applied directly to its operand, e.g. MB(MSUB, e2) for the instruction SUB e2:

rule trans_expr(e1) => cod1 &

trans_expr(e2) => [MLOAD(operand2)] &

trans_binop(binop) => opcode &

list_append(cod1, [MB(opcode,operand2)]) => cod3

-----------------------------------

trans_expr(BINARY(e1,binop,e2)) => cod3

relation trans_expr: Exp => Mcode list =

(* Evaluation of expressions in the current environment *)

axiom trans_expr(INT(v)) => [MLOAD( N(v))] (* integer constant *)

axiom trans_expr(IDENT(id)) => [MLOAD( I(id))] (* identifier id *)

(* Arith binop: simple case, expr2 is just an identifier or constant *)

rule trans_expr(e1) => cod1 &

trans_expr(e2) => [MLOAD(operand2)] & (* expr2 simple *)

trans_binop(binop) => opcode &

list_append(cod1, [MB(opcode,operand2)]) => cod3

----------------------------------- (* expr1 binop expr2 *)

trans_expr(BINARY(e1,binop,e2)) => cod3

(* Arith binop: general case, expr2 is a more complicated expr *)

rule trans_expr(e1) => cod1 &

trans_expr(e2) => cod2 &

trans_binop(binop) => opcode &

gentemp => t1 &

gentemp => t2 &

list_append6(

cod1, (* code for expr1 *)

[MSTO(t1)], (* store expr1 *)

cod2, (* code for expr2 *)

[MSTO(t2)], (* store expr2 *)

[MLOAD(t1)], (* load expr1 value into Acc *)

[MB(opcode,t2)] ) => cod3 (* Do arith operation *)

----------------------------------- (* expr1 binop expr2 *)

trans_expr(BINARY(e1,binop,e2)) => cod3

end (* trans_expr *)

relation trans_binop: BinOp => MBinOp =

axiom trans_binop(PLUS) => MADD

axiom trans_binop(SUB) => MSUB

axiom trans_binop(MUL) => MMULT

axiom trans_binop(DIV) => MDIV

end

relation gentemp: () => MTemp =

rule tick => no

----------

gentemp => T(no)

end (* gentemp *)
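The three relations above can be sketched together in Python. Expressions and machine instructions are represented as plain tuples, and temporaries come from a counter passed in explicitly; both choices, and starting the counter at 1, are assumptions of this sketch rather than properties of the RML specification.

```python
from itertools import count

# trans_binop: source operators to machine operators.
BINOP = {"PLUS": "MADD", "SUB": "MSUB", "MUL": "MMULT", "DIV": "MDIV"}


def gentemp(ticks):
    """Return a fresh temporary T(no), where no is the next tick."""
    return ("T", next(ticks))


def trans_expr(e, ticks):
    tag = e[0]
    if tag == "INT":
        return [("MLOAD", ("N", e[1]))]
    if tag == "IDENT":
        return [("MLOAD", ("I", e[1]))]
    if tag == "BINARY":
        _, e1, binop, e2 = e
        cod1 = trans_expr(e1, ticks)
        cod2 = trans_expr(e2, ticks)
        opcode = BINOP[binop]
        # Simple case: expr2 is a single load, so operate on its operand directly.
        if len(cod2) == 1 and cod2[0][0] == "MLOAD":
            return cod1 + [("MB", opcode, cod2[0][1])]
        # General case: save both results in temporaries, then combine them.
        t1 = gentemp(ticks)
        t2 = gentemp(ticks)
        return (cod1 + [("MSTO", t1)] + cod2 + [("MSTO", t2)]
                + [("MLOAD", t1), ("MB", opcode, t2)])
    raise ValueError("no rule for " + repr(e))


# (x+1) - (y / 2), the right-hand side of the assignment in the PAM example.
expr = ("BINARY",
        ("BINARY", ("IDENT", "x"), "PLUS", ("INT", 1)),
        "SUB",
        ("BINARY", ("IDENT", "y"), "DIV", ("INT", 2)))

for instruction in trans_expr(expr, count(1)):
    print(instruction)
```

For this expression the sketch produces the same eight instructions as the middle of the translated program: load x, add 1, store T1, load y, divide by 2, store T2, load T1, subtract T2.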

Small functional language with call-by-name semantics (mini-Freja)

Interpreter performance compared to Centaur/Typol:

Almost full Pascal with some C features (Petrol)

(specification size around 2000 lines)

Generated compiler ran 39% faster than hand-written compiler for "similar" language

Mini-ML including type inference

Ongoing application work:

Specification of Erlang

Specification of HLPLEX

Specification of Java

Specification of Modelica

Chapter 8, p 250

*Data Type & operations*.

A method for defining a data type and operations on that type

*Information Hiding*.

Collecting implementation details of type+operations in one place; restricting access to those details

*Syntactic specification*:

Operations, their names and arguments

*Semantic specification*:

Actual properties of the operation.

Often described using axioms

**type** complex **imports** real

**operations**:

+ : complex × complex -> complex

- : complex × complex -> complex

* : complex × complex -> complex

/ : complex × complex -> complex

- : complex -> complex

makecomplex : real × real -> complex

realpart : complex -> real

imaginarypart : complex -> real

...

**variables**: x,y,z: complex; r,s: real

**axioms**:

realpart(makecomplex(r,s)) = r

imaginarypart(makecomplex(r,s)) = s

realpart(x+y) = realpart(x) + realpart(y)

imaginarypart(x+y) = imaginarypart(x)+imaginarypart(y)

...

Example: Complex numbers in *Mathematica*

Complex[5,2]

Adding two complex numbers:

Complex[5,2]+Complex[10,3]

Adding two integers:

5+10

The + operator is overloaded! This means that it has different definitions for integers and complex numbers, respectively.
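The abstract data type and the overloading of + can also be sketched in Python. The class below hides its representation behind the operations named in the specification and checks a few of the axioms on sample values; the class name `Complex` and the underscore-prefixed fields are assumptions of this sketch, and only + is implemented.

```python
class Complex:
    """Abstract data type: clients use the operations, not the fields."""

    def __init__(self, re, im):
        self._re = re        # hidden representation
        self._im = im

    def __add__(self, other):
        # Overloads + for complex operands.
        return Complex(self._re + other._re, self._im + other._im)

    def __eq__(self, other):
        return (self._re, self._im) == (other._re, other._im)

    def __repr__(self):
        return "Complex(%r, %r)" % (self._re, self._im)


# The operations of the specification; together with the class they are
# the one place that knows the representation.
def makecomplex(r, s):
    return Complex(r, s)


def realpart(x):
    return x._re


def imaginarypart(x):
    return x._im


x = makecomplex(5.0, 2.0)
y = makecomplex(10.0, 3.0)

# Axioms from the semantic specification, checked on sample values.
assert realpart(makecomplex(5.0, 2.0)) == 5.0
assert imaginarypart(makecomplex(5.0, 2.0)) == 2.0
assert realpart(x + y) == realpart(x) + realpart(y)
assert imaginarypart(x + y) == imaginarypart(x) + imaginarypart(y)

print(x + y)      # + on complex numbers
print(5 + 10)     # the same operator on integers
```

The same symbol + selects a different definition depending on the operand types, which is the overloading described above.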