Principles of Programming Languages and Systems

by
Peter Fritzson,

PELAB - Programming Environment Laboratory
Dept. of Computer and Information Science,
Linköping University, Sweden

Lecture 5

Environments (section 7.5)

Formal semantics (sections 12.1-12.2)

Natural semantics and RML

Abstract data types

Environments

Chapter 7, section 7.5, p 220

Environment

The environment contains bindings of declared variables, types, etc.

Fibonacci example

See e.g. the RML tutorial, page 118.

program fibonacci;
var
res : integer;
function fib(x : integer) : integer;
...
end

The environment, represented as a stack that grows downwards,
after entering the scope level of function fib

The above is an environment example for the language Petrol, whose semantics has been specified using Natural Semantics and RML.
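
As a rough illustration of the idea (not Petrol's actual representation), an environment can be modelled as a stack of scopes, each mapping a declared name to its binding. The class name `Environment`, its methods, and the tuple format of the bindings are assumptions made for this sketch; only the names res, fib and x come from the program fragment above.

```python
class Environment:
    """A stack of scopes; each scope maps a declared name to its binding."""

    def __init__(self):
        self.scopes = [{}]              # start with the global scope

    def enter_scope(self):
        self.scopes.append({})          # push a new innermost scope

    def leave_scope(self):
        self.scopes.pop()               # discard the innermost scope

    def bind(self, name, info):
        self.scopes[-1][name] = info    # declare in the innermost scope

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise KeyError(name)


env = Environment()
env.bind("res", ("var", "integer"))
env.bind("fib", ("function", "integer -> integer"))
env.enter_scope()                       # entering the scope level of fib
env.bind("x", ("param", "integer"))
```

After `enter_scope`, a lookup of x succeeds in the innermost scope, while res and fib are still found in the enclosing scope.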

Formal Semantics

Chapter 12, p 460-473

What is semantics?

Programming languages (and human languages) can be defined in stages

Symbols are primitives in the language, e.g. 5, "abc", foo

Syntax defines which combinations of symbols are legal,
e.g. x := 35, if x<5 then ...

Semantics defines the meaning of constructs in the language

Kinds of semantics

Informal semantics.
This expresses the meaning of language constructs in human languages like Swedish or English.

Formal semantics.
This defines the meaning in terms of some formal, exact, "mathematical" notation with well-defined meaning.

Three principal formalisms to express formal semantics

Operational semantics -- use an abstract machine

Denotational semantics -- map to abstract functions

Axiomatic semantics -- logic, preconditions/postconditions

An earlier formalism called attribute grammars is an extension of BNF grammars

Operational semantics (p 465)

definition using an abstract machine

logical inference rules -- premises, conclusion

reduction machine, based on logical inference rules

Natural Semantics and RML - Relational Meta Language

RML is a System for Practical Language Implementation
from Natural Semantics -- a form of operational semantics based
on natural deduction

See the RML tutorial, chapters 1, 2 and 9.

Some Properties of RML

A meta language for Natural Semantics

Efficiency

Separation of input and output arguments/results

Statically strongly typed

Polymorphic type inference

Efficient compilation of pattern-matching

Generating an interpreter implemented in C, using rml2C
Generating a compiler implemented in C

Natural semantics

Natural semantics specifications consist of rules inspired by natural deduction and structured operational semantics.
The general syntactic form of rules as they appear in most literature is approximately as below:

H1 ⊢ T1 : R1   ...   Hn ⊢ Tn : Rn
-------------------------------------  if <cond>
H ⊢ T : R

The clauses:
H1 ⊢ T1 : R1 . . . Hn ⊢ Tn : Rn
are premises, and need to be satisfied in order for a conclusion H ⊢ T : R to be drawn.

Natural semantics using RML

RML is a strongly typed, deterministic implementation of a meta language for specifications in Natural Semantics.
The above general form of rule would appear approximately as follows in RML:


rule RelNameX(H1,T1) => R1 &
...
RelNameY(Hn,Tn) => Rn &
...
<cond>
-----------------------------
ThisRelationName(H,T) => R

A rule consists of premises and a conclusion:

rule premise1 & premise2
-------------------
conclusion

A minimal evaluator for positive or negative constants

Define a minimal evaluator for negating numbers.

Examples from this language:

5

-9

The RML abstract syntax of integers and negative integers:

datatype Exp = INTconst of int
| NEGop of Exp


The RML semantic rules within a relation eval:

relation eval: Exp => int =
axiom eval( INTconst(ival) ) => ival

rule eval(e) => v1 & int_neg(v1) => v2
-----------------------------------
eval( NEGop(e) ) => v2
end
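
For readers more familiar with mainstream languages, the same two rules can be sketched in Python. This is only an illustration under assumptions: the class names mirror the RML constructors, while `eval_exp` and the field names are invented here.

```python
from dataclasses import dataclass


@dataclass
class INTconst:
    value: int


@dataclass
class NEGop:
    operand: object                 # an INTconst or another NEGop


def eval_exp(e):
    if isinstance(e, INTconst):     # axiom: a constant evaluates to itself
        return e.value
    if isinstance(e, NEGop):        # rule: evaluate the operand, then negate
        v1 = eval_exp(e.operand)
        v2 = -v1
        return v2
    raise ValueError("no rule applies")


print(eval_exp(INTconst(5)))            # 5
print(eval_exp(NEGop(INTconst(9))))     # -9
```

Each `isinstance` test plays the role of the pattern in the conclusion of an RML rule, and the local variables v1 and v2 correspond to the results of the premises.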

Then an approximate Mathematica form, where the premises never fail:

ClearAll[eval];
SetAttributes[eval,HoldAll];

eval[INTconst[ival_]] := ival;

eval[NEGop[e_]] /;
(v1= eval[e]; v2= -v1; True) :=
v2;

Evaluate an INTconst leaf node:

eval[INTconst[34]]
34

Trace the rule calls:

TracePrint[eval[INTconst[33]], eval[___] | Set[__]]

Evaluate the negation of an INTconst node:

eval[NEGop[INTconst[33]]]
-33

Trace the rule and assignment calls:

TracePrint[eval[NEGop[INTconst[33]]], eval[___] | Set[__]]

The Exp1 evaluator for simple integer expressions

The RML abstract syntax definition for Exp1:

datatype Exp = INTconst of int
| ADDop of Exp * Exp
| SUBop of Exp * Exp
| MULop of Exp * Exp
| DIVop of Exp * Exp
| NEGop of Exp

An example expression, 12+5*13, represented as an abstract syntax tree:

ADDop(INTconst(12), MULop(INTconst(5), INTconst(13)))


The semantic rules for the Exp1 mini-language are defined in RML by the eval relation below.
For example, evaluation of an integer constant node is the integer itself.
Evaluation of an addition node ADDop is v3, if v3 is the result of adding the evaluated results of its children e1 and e2.
Subtraction and multiplication have similar rules. A rule for division (DIVop) would follow the same pattern but is not shown here.

relation eval: Exp => int =

axiom eval( INTconst(ival) ) => ival

rule eval(e1) => v1 & eval(e2) => v2 & int_add(v1,v2) => v3
----------------------------------------------------------
eval( ADDop(e1,e2) ) => v3

rule eval(e1) => v1 & eval(e2) => v2 & int_sub(v1,v2) => v3
----------------------------------------------------------
eval( SUBop(e1,e2) ) => v3

rule eval(e1) => v1 & eval(e2) => v2 & int_mul(v1,v2) => v3
----------------------------------------------------------
eval( MULop(e1,e2) ) => v3

rule eval(e) => v1 & int_neg(v1) => v2
-----------------------------------
eval( NEGop(e) ) => v2
end
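
A compact Python sketch of the same evaluator is given below, this time with tagged tuples for the abstract syntax and a table for the binary operators. The function name `eval_exp1` and the tuple encoding are assumptions of this sketch. As in the relation above, there is no rule for DIVop, so evaluating such a node fails.

```python
import operator

# One entry per binary rule in the eval relation above.
BINARY_RULES = {
    "ADDop": operator.add,
    "SUBop": operator.sub,
    "MULop": operator.mul,
}


def eval_exp1(e):
    tag = e[0]
    if tag == "INTconst":               # axiom: the constant itself
        return e[1]
    if tag in BINARY_RULES:             # premises: evaluate both children
        v1 = eval_exp1(e[1])
        v2 = eval_exp1(e[2])
        return BINARY_RULES[tag](v1, v2)
    if tag == "NEGop":
        return -eval_exp1(e[1])
    raise ValueError("no rule applies to " + tag)


# The example expression 12+5*13 as an abstract syntax tree.
tree = ("ADDop", ("INTconst", 12),
        ("MULop", ("INTconst", 5), ("INTconst", 13)))
print(eval_exp1(tree))                  # 77
```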

An approximate version of the semantics for Exp1 in Mathematica, where we assume that premises to the rules never fail (i.e. use True):

ClearAll[eval];
SetAttributes[eval,HoldAll];

eval[INTconst[ival_]] := ival;

eval[ADDop[e1_,e2_]] /; Module[{v1,v2},
(v1=eval[e1]; v2=eval[e2]; v3=v1+v2; True)] :=
v3;

eval[SUBop[e1_,e2_]] /; Module[{v1,v2},
(v1=eval[e1]; v2=eval[e2]; v3=v1-v2; True)] :=
v3;

eval[MULop[e1_,e2_]] /; Module[{v1,v2},
(v1=eval[e1]; v2=eval[e2]; v3=v1*v2; True)] :=
v3;

eval[NEGop[e_]] /; Module[{v1},
(v1= eval[e]; v2= -v1; True)] :=
v2;

A simple expression: 12+5*13

Standard Mathematica abstract syntax, i.e. FullForm, for this expression:

Hold[12+5*13]//FullForm
Hold[Plus[12, Times[5, 13]]]

Head[12]
Integer

Head[Plus[aa,bb]]
Plus

Our own abstract syntax:

ADDop[INTconst[12],MULop[INTconst[5],INTconst[13]]]

Call the Exp1 evaluator on the above example:

eval[ADDop[INTconst[12],MULop[INTconst[5],INTconst[13]]]]
77

Using Mathematica's standard evaluator gives the same result:

12+5*13
77

Trace the rule calls and premise assignment calls:

TracePrint[eval[ADDop[INTconst[12],MULop[INTconst[5],INTconst[13]]]],
eval[___] | Set[__]]

Simple lookup in environments represented as lists


relation lookup: (Env,Ident) => Value =
(* lookup returns the value associated with an identifier.
   If no association is present, lookup will fail. *)

(* Identifier id is found in the first pair of the list, and value is returned. *)
rule id = id2
------------------------------
lookup((id2,value) :: _, id) => value

(* id is not found in the first pair of the list, and lookup recursively
   searches the rest of the list. If found, value is returned. *)
rule not id=id2 & lookup(rest, id) => value
-------------------------------------
lookup((id2,_) :: rest, id) => value
end
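
The two rules can also be read as a recursive function over a list of pairs. The Python sketch below makes the failure case explicit with an exception. The name `LookupFailure` is an assumption of this sketch, standing in for RML's built-in failure.

```python
class LookupFailure(Exception):
    """Raised when no association for the identifier is present."""


def lookup(env, ident):
    if not env:                         # empty list: no rule applies
        raise LookupFailure(ident)
    (id2, value), rest = env[0], env[1:]
    if ident == id2:                    # first rule: found in the first pair
        return value
    return lookup(rest, ident)          # second rule: search the rest


env = [("aa", 10), ("bb", 37)]
print(lookup(env, "bb"))                # 37
```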

Below is a Mathematica version of the Natural Semantics RML rules that specify lookup. Note that we have to handle Fail explicitly here by extra rules. The Fail mechanism is already built into RML.

ClearAll[lookup];

lookup[{{id2_,value_},___},id_] /;
(id2===id) :=
value;

lookup[{{id2_,_},rest___},id_] /;
(id2=!=id && (value=lookup[{rest},id]; value=!=Fail)) :=
value;

lookup[{},id_] := Fail;

lookup[___] := Fail;

Looking up the value of bb in the environment {{aa,10},{bb,37}}, which is a list of pairs of identifier and value:

lookup[{{aa,10},{bb,37}},bb]
37

Lookup of xx fails since it is not present in the environment:

lookup[{{aa,10},{bb,35}},xx]
Fail

In Mathematica it is more efficient to use the built-in indexing mechanism, i.e. to store each binding directly as a definition of lookup:

ClearAll[lookup];

lookup[env,aa]=10;
lookup[env,bb]=37;

lookup[env,bb]
37

Translational semantics of the PAM language
-- abstract syntax to machine code

Machine code:
LOAD   Load accumulator
STO    Store
ADD    Add
SUB    Subtract
MULT   Multiply
DIV    Divide
GET    Input a value
PUT    Output a value
J      Jump
JN     Jump on negative
JP     Jump on positive
JZ     Jump on zero
JNZ    Jump on negative or zero
JPZ    Jump on positive or zero
JNP    Jump on negative or positive
LAB    Label (no operation)
HALT   Halt execution

Example

PAM code:
read x,y;
while x<> 99 do
ans := (x+1) - (y / 2);
write ans;
read x,y
end

Translated machine code:
        GET   x
        GET   y
L1      LAB
        LOAD  x
        SUB   99
        JZ    L2
        LOAD  x
        ADD   1
        STO   T1
        LOAD  y
        DIV   2
        STO   T2
        LOAD  T1
        SUB   T2
        STO   ans
        PUT   ans
        GET   x
        GET   y
        J     L1
L2      LAB
        HALT

Representation:
MGET( I(x) )
MGET( I(y) )
MLABEL( L(1) )
MLOAD( I(x) )
MB(MSUB,N(99) )
MJ(MJZ, L(2) )
MLOAD( I(x) )
MB(MADD,N(1) )
MSTO( T(1) )
MLOAD( I(y) )
MB(MDIV,N(2) )
MSTO( T(2) )
MLOAD( T(1) )
MB(MSUB,T(2) )
MSTO( I(ans) )
MPUT( I(ans) )
MGET( I(x) )
MGET( I(y) )
MJMP( L(1) )
MLABEL( L(2) )
MHALT
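
To see what the translated program does, it can be run on a small interpreter for the machine code. The Python sketch below is an assumption-laden illustration, not part of the PAM definition: it treats the machine as a single-accumulator machine, reads operands as either integer literals or variable names, and uses floor division for DIV. The names `run_pam`, `JUMPS` and `PROGRAM` are invented here.

```python
# Jump conditions on the accumulator, following the opcode table above.
JUMPS = {
    "J":   lambda acc: True,
    "JN":  lambda acc: acc < 0,
    "JP":  lambda acc: acc > 0,
    "JZ":  lambda acc: acc == 0,
    "JNZ": lambda acc: acc <= 0,
    "JPZ": lambda acc: acc >= 0,
    "JNP": lambda acc: acc != 0,
}


def run_pam(program, inputs):
    """Run a list of (label, opcode, operand) triples; return the PUT outputs."""
    labels = {lab: i for i, (lab, _, _) in enumerate(program) if lab}
    inputs = iter(inputs)
    memory, outputs = {}, []
    acc, pc = 0, 0

    def value(operand):
        # Integer operands are literals; strings name variables or temporaries.
        return operand if isinstance(operand, int) else memory[operand]

    while True:
        _, op, arg = program[pc]
        pc += 1
        if op == "GET":
            memory[arg] = next(inputs)
        elif op == "PUT":
            outputs.append(memory[arg])
        elif op == "LOAD":
            acc = value(arg)
        elif op == "STO":
            memory[arg] = acc
        elif op == "ADD":
            acc += value(arg)
        elif op == "SUB":
            acc -= value(arg)
        elif op == "MULT":
            acc *= value(arg)
        elif op == "DIV":
            acc //= value(arg)          # assumption: floor division
        elif op in JUMPS:
            if JUMPS[op](acc):
                pc = labels[arg]
        elif op == "LAB":
            pass                        # label: no operation
        elif op == "HALT":
            return outputs
        else:
            raise ValueError("unknown opcode " + op)


# The translated example program from above.
PROGRAM = [
    (None, "GET", "x"), (None, "GET", "y"),
    ("L1", "LAB", None),
    (None, "LOAD", "x"), (None, "SUB", 99), (None, "JZ", "L2"),
    (None, "LOAD", "x"), (None, "ADD", 1), (None, "STO", "T1"),
    (None, "LOAD", "y"), (None, "DIV", 2), (None, "STO", "T2"),
    (None, "LOAD", "T1"), (None, "SUB", "T2"),
    (None, "STO", "ans"), (None, "PUT", "ans"),
    (None, "GET", "x"), (None, "GET", "y"),
    (None, "J", "L1"),
    ("L2", "LAB", None),
    (None, "HALT", None),
]

print(run_pam(PROGRAM, [5, 4, 99, 0]))  # [4]
```

With inputs x=5, y=4 the loop body computes (5+1) - (4/2) = 4, and the second pair starts with x=99, which ends the loop.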

Arithmetic expression translation


Relation trans_expr
relation trans_expr: Exp => Mcode list =

axiom trans_expr(INT(v)) => [MLOAD( N(v))]
axiom trans_expr(IDENT(id)) => [MLOAD( I(id))]
....


Code template for simple subtraction expression:
<code for expression e1>
MB(MSUB, e2)


and in assembly text form:
<code for expression e1>
SUB e2


RML rule for simple (expr1 binop expr2):

rule trans_expr(e1) => cod1 &
trans_expr(e2) => [MLOAD(operand2)] &
trans_binop(binop) => opcode &
list_append(cod1, [MB(opcode,operand2)]) => cod3
-----------------------------------
trans_expr(BINARY(e1,binop,e2)) => cod3

The complete trans_expr relation


relation trans_expr: Exp => Mcode list =
(* Evaluation of expressions in the current environment *)

axiom trans_expr(INT(v)) => [MLOAD( N(v))] (* integer constant *)

axiom trans_expr(IDENT(id)) => [MLOAD( I(id))] (* identifier id *)

(* Arith binop: simple case, expr2 is just an identifier or constant *)
rule trans_expr(e1) => cod1 &
trans_expr(e2) => [MLOAD(operand2)] & (* expr2 simple *)
trans_binop(binop) => opcode &
list_append(cod1, [MB(opcode,operand2)]) => cod3
----------------------------------- (* expr1 binop expr2 *)
trans_expr(BINARY(e1,binop,e2)) => cod3

(* Arith binop: general case, expr2 is a more complicated expr *)
rule trans_expr(e1) => cod1 &
trans_expr(e2) => cod2 &
trans_binop(binop) => opcode &
gentemp => t1 &
gentemp => t2 &
list_append6(
cod1, (* code for expr1 *)
[MSTO(t1)], (* store expr1 *)
cod2, (* code for expr2 *)
[MSTO(t2)], (* store expr2 *)
[MLOAD(t1)], (* load expr1 value into Acc *)
[MB(opcode,t2)] ) => cod3 (* Do arith operation *)
----------------------------------- (* expr1 binop expr2 *)
trans_expr(BINARY(e1,binop,e2)) => cod3

end (* trans_expr *)


relation trans_binop: BinOp => MBinOp =
axiom trans_binop(PLUS) => MADD
axiom trans_binop(SUB) => MSUB
axiom trans_binop(MUL) => MMULT
axiom trans_binop(DIV) => MDIV
end


relation gentemp: () => MTemp =

rule tick => no
----------
gentemp => T(no)

end (* gentemp *)
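
The translation scheme can be tried out with a small Python sketch. It is an illustration under assumptions: expressions and machine instructions are encoded as tagged tuples, the simple case is detected with an explicit test instead of RML's pattern matching on [MLOAD(operand2)], and temporaries are numbered from 1 by a counter standing in for tick.

```python
import itertools

_tick = itertools.count(1)      # assumption: temporaries are numbered from 1

BINOPS = {"PLUS": "MADD", "SUB": "MSUB", "MUL": "MMULT", "DIV": "MDIV"}


def gentemp():
    return ("T", next(_tick))


def trans_expr(e):
    tag = e[0]
    if tag == "INT":                        # integer constant
        return [("MLOAD", ("N", e[1]))]
    if tag == "IDENT":                      # identifier
        return [("MLOAD", ("I", e[1]))]
    if tag == "BINARY":
        _, e1, binop, e2 = e
        cod1 = trans_expr(e1)
        cod2 = trans_expr(e2)
        opcode = BINOPS[binop]
        if len(cod2) == 1 and cod2[0][0] == "MLOAD":
            # Simple case: expr2 is just an identifier or constant.
            return cod1 + [("MB", opcode, cod2[0][1])]
        # General case: store both values in temporaries first.
        t1 = gentemp()
        t2 = gentemp()
        return (cod1 + [("MSTO", t1)] + cod2 + [("MSTO", t2)]
                + [("MLOAD", t1)] + [("MB", opcode, t2)])
    raise ValueError("no rule applies")


# (x+1) - (y/2), the right-hand side of the assignment in the PAM example.
example = ("BINARY",
           ("BINARY", ("IDENT", "x"), "PLUS", ("INT", 1)),
           "SUB",
           ("BINARY", ("IDENT", "y"), "DIV", ("INT", 2)))
code = trans_expr(example)
for instruction in code:
    print(instruction)
```

For this expression the sketch produces the same eight instructions as the corresponding part of the translated machine code shown earlier: LOAD x, ADD 1, STO T1, LOAD y, DIV 2, STO T2, LOAD T1, SUB T2.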

Some applications of RML

Small functional language with call-by-name semantics (mini-Freja)

Interpreter performance compared to Centaur/Typol: [comparison table not reproduced]

Almost full Pascal with some C features (Petrol)
(specification size around 2000 lines)
Generated compiler ran 39% faster than hand-written compiler for "similar" language

Mini-ML including type inference

Ongoing application work:
Specification of Erlang
Specification of HLPLEX
Specification of Java
Specification of Modelica

Abstract Data Types

Chapter 8, p 250

What is an Abstract Data Type?

Data Type & operations.
A method for defining a data type and operations on that type

Information Hiding.
Collecting implementation details of type+operations in one place; restricting access to those details

Algebraic specification of abstract data types (p 253)

Syntactic specification:

Operations, their names and arguments

Semantic specification:

Actual properties of the operation.
Often described using axioms

Complex number abstract data type

type complex imports real
operations:
+ : complex × complex -> complex
- : complex × complex -> complex
* : complex × complex -> complex
/ : complex × complex -> complex
- : complex -> complex
makecomplex : real × real -> complex
realpart : complex -> real
imaginarypart : complex -> real
realpart : complex -> real
imaginarypart : complex -> real
...

variables: x,y,z: complex; r,s: real

axioms:
realpart(makecomplex(r,s)) = r
imaginarypart(makecomplex(r,s)) = s
realpart(x+y) = realpart(x) + realpart(y)
imaginarypart(x+y) = imaginarypart(x)+imaginarypart(y)
...
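
One way to make the specification concrete is to implement the type and check the axioms on sample values. The Python sketch below is only one possible implementation. The class `Complex` and its hidden fields are assumptions of this sketch, while the operation names follow the specification. Client code is meant to use only makecomplex, realpart, imaginarypart and +.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complex:
    _re: float      # representation details, hidden from clients by convention
    _im: float

    def __add__(self, other):
        return Complex(self._re + other._re, self._im + other._im)


def makecomplex(r, s):
    return Complex(r, s)


def realpart(x):
    return x._re


def imaginarypart(x):
    return x._im


# Check the four axioms on sample values.
r, s = 1.5, 2.0
x, y = makecomplex(1.5, 2.0), makecomplex(0.5, -4.0)
assert realpart(makecomplex(r, s)) == r
assert imaginarypart(makecomplex(r, s)) == s
assert realpart(x + y) == realpart(x) + realpart(y)
assert imaginarypart(x + y) == imaginarypart(x) + imaginarypart(y)
```

Because the axioms mention only the operations, the representation could be changed (for example to polar form) without affecting clients, as long as the axioms still hold.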

Overloading and polymorphism, p 267
Example: Complex numbers in Mathematica:

Complex[5,2]
5 + 2 I

Adding two complex numbers:

Complex[5,2]+Complex[10,3]
15 + 5 I

Adding two integers:

5+10
15

The + operator is overloaded! This means that it has different definitions for integers and complex numbers, respectively.
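
Overload resolution can be pictured as selecting a definition from the types of the operands. The Python sketch below is a simplified model, not how Mathematica is implemented. The table `OVERLOADS`, the function `plus`, and the encoding of a complex number as a (real, imaginary) pair are all assumptions of this sketch.

```python
def add_int(a, b):
    return a + b


def add_complex(a, b):
    # A complex number is modelled as a (real, imaginary) pair.
    return (a[0] + b[0], a[1] + b[1])


# One definition of "+" per combination of operand types.
OVERLOADS = {
    (int, int): add_int,
    (tuple, tuple): add_complex,
}


def plus(a, b):
    impl = OVERLOADS.get((type(a), type(b)))
    if impl is None:
        raise TypeError("no definition of + for these operand types")
    return impl(a, b)


print(plus(5, 10))              # 15
print(plus((5, 2), (10, 3)))    # (15, 5)
```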