LIGHTMAT USER MANUAL
by Anders Gertz and Vadim Engelson
What is LightMat
LightMat is a collection of C++ classes that provide storage, access and
mathematical operations for arrays.
It supports arrays of rank 1, 2, 3 and 4. Every array is stored
in a separate object, which you define and manipulate through the corresponding
C++ methods.
To use LightMat you only need to write #include "lightmat.h"
at the start of your code and add some object files when linking
the application.
If you are interested in more details, read the Implementation
notes and Optimization.
Some options are not implemented yet, but they will be available in
the next versions.
Installation notes
Available platforms and compilers
Compiler | Preprocessor definition used by the compiler
WorkShop Compilers 4.2 C++ 4.2 (Sun / Solaris 5.6) | -
SC4.0 / C++ 4.1 (Sun / Solaris 5.5.1) | -
GNU C++ V.2.6.3 (Sun / Solaris 5.5.1, 5.6) | __GNUG__
GNU C++ 2.7.2.1 (Sun / Solaris 5.5.1, 5.6, Linux 2.0.30) | __GNUG__
GNU C++ 2.7.2 (Digital UNIX V3.2) | __GNUG__
DEC C++ V1.3B-0 DEC OSF/1 (Alpha) | __alpha
Microsoft Visual C++ 4.0 | _WIN32
Microsoft Visual C++ 5.0 | _WIN32
If you use one of these compilers, there should be no problems with compilation.
Otherwise, some small modifications to the code may be needed.
Please let us know about your modifications.
Download and compilation
You can download the library from the LightMat home page at
http://www.ida.liu.se/~pelab/lightmat.
You receive:
- for UNIX: a gzipped tar file, like this: lightmat077.tar.gz
- for Windows: a zip file, like this: lightmat077.zip
After unpacking the archive, go to the "src" subdirectory.
Compilation can take some 4-5 minutes on a 125MHz processor.
On Windows you must have 70MB free on the disk where your swap file is
located (usually in C:\WINDOWS).
- On UNIX you compile with "make -f make.unx"
- On Windows with Visual C++ you compile with "nmake -f make.win"
The code will not work on 8.3 file systems like that of Windows 3.1, because
it uses long file names.
As a result of compilation, all necessary object files are created.
Testing
It is important to test the library.
- On UNIX you compile with "make -f make.unx all"
- On Windows with Visual C++ you compile with "nmake -f make.win all"
Several executable files are then created. This can take some 4-5 minutes
on a 125MHz processor.
The make run automatically executes the tests.
If there are no assertion violations or core dumps, you can
presumably go on to use the code, either writing from scratch or
modifying the tests.
How to use the library
The tests show two different ways of using the library: using inlined
functions and using non-inlined functions.
The execution is identical, but the compilation is different.
Using inlined functions
- You specify
  #define LM_ALL
  #include <lightmat.h>
  in your files.
- You compile your files with the usual compilation flags. Compilation usually takes several minutes.
- You link your object files with cstring.o and extwin.o.
Using non-inlined functions
- You specify
  #define LM_ALL
  #include <lightmat.h>
  in your files.
- You compile your files with the usual flags and -DNOT_INLINED.
- You link your files with cstring.o, extwin.o and not_inlined.o.
- All the library functions are non-inlined. This gives reasonable, but not the highest, performance.
- Every time you compile your file, it will take less than 1 minute.
Reducing compilation time for non-inlined functions
If you use only some of the LightMat classes, you can reduce the compilation time
for your code.
Use the preprocessor definition #define LM_name
for every class you use.
The names of these definitions are given in the table
of classes.
#define LM_ALL means that you use all available classes.
The Different Classes
We will need some definitions first:
- Scalar: an elementary data type like double or int.
- Array: a data structure with many scalar elements; array elements can be accessed by indexing.
- Vector: a one-dimensional array (array of rank 1), implemented by class lightN.
- Matrix: a two-dimensional array (array of rank 2), implemented by class lightNN.
- Tensor (tensor of rank 3): a three-dimensional array, e.g. lightNNN.
- Quansor (tensor of rank 4): a four-dimensional array, e.g. lightNNNN.
There exist 10 different classes for vectors, matrices, tensors of
rank 3 and tensors of rank 4.
There are 4 universal classes and 6 specially optimized classes.
The specially optimized classes can be used in particular application areas:
for instance, 3x3 matrices are used in geometry and 4x4 matrices in computer graphics.
If you are not sure which classes to use, use the universal classes.
CLASS TABLE
Class template name | LM_ name (for #define) | Use | Allocated size by default | Name when instantiated for type double, template notation (not available) | Names, non-template notation (long form / short form) | Explanations
light3 | LM_3 | special | 3 | light3<double> | light3double, light3int / double3, int3 | Vector of 3 elements
light4 | LM_4 | special | 4 | light4<double> | light4double, light4int / double4, int4 | Vector of 4 elements
lightN | LM_N | universal | 10 | lightN<double> | lightNdouble, lightNint / doubleN, intN | Vector of arbitrary size
light33 | LM_33 | special | 3 x 3 | light33<double> | light33double, light33int / double33, int33 | Matrix of 3 by 3 elements
light44 | LM_44 | special | 4 x 4 | light44<double> | light44double, light44int / double44, int44 | Matrix of 4 by 4 elements
lightN3 | LM_N3 | special | 10 x 3 | lightN3<double> | lightN3double, lightN3int / doubleN3, intN3 | Matrix of N by 3 elements, i.e. storage for N vectors of 3 elements
lightNN | LM_NN | universal | 10 x 10 | lightNN<double> | lightNNdouble, lightNNint / doubleNN, intNN | Matrix of arbitrary size
lightN33 | LM_N33 | special | 10 x 3 x 3 | lightN33<double> | lightN33double, lightN33int / doubleN33, intN33 | Tensor N by 3 by 3, i.e. storage for N matrices of 3 by 3 elements
lightNNN | LM_NNN | universal | 5 x 5 x 5 | lightNNN<double> | lightNNNdouble, lightNNNint / doubleNNN, intNNN | Tensor of arbitrary size
lightNNNN | LM_NNNN | universal | 4 x 4 x 4 x 4 | lightNNNN<double> | lightNNNNdouble, lightNNNNint / doubleNNNN, intNNNN | Quansor of arbitrary size
Note on higher ranks: There are no classes for tensors
of rank higher than 4, though support for them could be added in
the future. Having different classes for vectors, matrices etc. is
quite natural, and that is what most other packages have also chosen. Other
packages sometimes have one class for tensors of high ranks, but that does
not exist in LightMat. Having different classes for the different ranks
also avoids run-time choices of which algorithm to use for each rank.
The name of the class you will use depends on whether you use
template notation or non-template notation.
Since some compilers do not fully support templates, we suggest
using the classes in non-template notation.
Template notation (not available)
Note: Our experiments with various compilers (SparcCompiler, GNU C++ and
Microsoft Visual C++) show that template implementations vary in their
syntactical limitations.
Furthermore, the diagnostics produced by the compilers were not sufficient
for us to get our tests to compile. We spent several days trying to
get through compilation, without success. Therefore the implementation
of our library with templates is delayed until better compilers
are available. You can check whether your compiler produces meaningful
messages by setting the LIGHTMAT_TEMPLATE preprocessor definition. See the
preprocessor definition table.
In the template notation, classes are implemented as templates, for instance
class light4<T>.
The same code can therefore be used both for objects with elements
of type double and for objects with elements of type int.
The user specifies a variable as
light4<double> foo
This means that foo is a vector with 4 elements of
type double.
Other element types can be used; however, some arithmetical
operations on them will not be defined.
Non-template notation
Unfortunately, some compilers do not fully support templates. We
suggest using the classes in non-template notation.
Classes for elements of type int and double
are implemented.
The user specifies a variable as
light4double foo
This means that foo is a vector with 4 elements of
type double.
There are typedef definitions that allow another, shorter notation
for class names, for instance double4 instead of light4double.
All available names are given in the table.
From here on, when we mention, for instance, class light3,
we usually mean both light3int and light3double.
Specially optimized classes for certain sizes
Extra fast classes have been made for some specific sizes of vectors and
matrices: vectors of length 3 and 4, and matrices of size 3x3
and 4x4. They have a static memory area for the elements,
which means that the compiler can generate code that needs fewer
references through pointers. The calculation code is written specifically
for those sizes, and calculation loops are completely unrolled. This
makes these classes extra fast.
These classes can nevertheless interface with the other
existing classes in LightMat.
There are also classes for matrices and tensors with partially
fixed sizes (lightN3 and lightN33). The
usefulness of these classes is described in
Indexing. One is for matrices with three columns and the other
is for tensors of rank 3 in which the size of the two last indices is set
to three. Internally the matrix class consists of a number of vectors of
length 3 (light3) and the tensor class consists of a number
of 3x3 matrices (light33).
Since the sizes of objects of these classes are partially or totally
fixed, they cannot change their sizes dynamically as freely as the universal
LightMat classes. Only lightN3 and lightN33
can change the first dimension during calculations.
Constructors and access to arrays
All LightMat classes have some basic member functions like constructors,
destructors, assignment
operators and a few more.
Normal constructors
There exist a number of constructors, with different arguments, with
which a new object can be created. The reasons for having a number of different
constructors are efficiency and ease of use.
- No arguments. These default constructors create an object with the default size (given in the table).
  lightNNdouble a; doubleNN b;
- An object of the same class as the only argument. This is a normal copy constructor, i.e. the new object is a copy of the argument. The two objects will not share the same data area.
  lightNNdouble b = a; doubleNN c = d;
- Integer arguments. The integers specify the size of the created object, e.g. 3 and 5 should be provided as arguments to the constructor in order to construct a 3x5 matrix.
  lightNNdouble a(3,5); doubleNN c(3,5);
- An object of another class as the only argument. This creates an object with the same size and element values as the argument. Both data types must have the same rank, so a vector cannot, for example, be created from a matrix. The values are copied from the object given as an argument, since all objects have their own data areas.
  light33double a;
  lightNNdouble c = a;
  Such constructors are also called converters. There are converters for
  - conversion of arrays with integer elements to arrays with double elements (light3int to light3double)
  - conversion of arrays with special size to universal arrays (light3 and light4 to lightN; light33, lightN3 and light44 to lightNN; lightN33 to lightNNN)
- Integer arguments and a pointer to a memory area. The integers specify the size of the created object and the elements are initialized from the values in the memory area. The values in the memory area should be stored in row major order (C++ style).
  double arr[] = { 1.3, 2.6, 3, 4, 5.7, 6 };
  lightNN<double> a(2,3,arr);
- Integer arguments and a value. The integers specify the size of the created object and all elements in the object are initialized to the supplied value. Only one value can be given.
  lightNN<double> a(3,4,17.7);
- Arguments with the values for all elements. This constructor exists only for the classes which have a fixed number of elements (light3, light4, light33, light44). The values for the elements should be given in row major order.
  light33<double> a(1.1,2,3.5,4,5,6.9,7,8.3,9);
Element initialization
If values for the elements are not supplied in the constructor, the behaviour
depends on the preprocessor definition LIGHTMAT_INIT_ZERO
(see the preprocessor definition table). If
it is defined, the elements are initialized to zero.
Otherwise, the elements are not initialized, which saves some time.
It can be a good idea not to define LIGHTMAT_INIT_ZERO
if the user knows that objects are never used without explicit initialization.
Functions within the classes that do calculations and need a local object
use a special internal protected constructor that never initializes the
elements.
Memory allocation
The constructors automatically allocate dynamic memory in the heap if the
static memory area (area allocated by default in the stack) isn't sufficiently
large.
Destructors
The destructors free any memory which may have been allocated.
Assignment
The assignment operators copy the values of the elements of one object
to another.
The objects can be of different sizes, but the rank must be the same,
or the right-hand operand can be a scalar.
The size of the left-hand operand is changed to the size of
the right-hand operand if needed. More memory is also allocated if needed.
Note: Memory is not deallocated even if the object's need
for memory decreases. The reason is that deallocation, and the allocation
of a smaller memory area that sometimes follows, takes time. There may also,
later in the calculations, again be a need for a larger memory area,
in which case another reallocation would be needed.
lightNN<double> a(2,3), b(3,2);
a = b; // a becomes 3x2
a = 5.7; // all elements of a become 5.7
Assignment from a scalar sets all the elements in the object to that
value.
Read the section about indexing
if you want to assign a value to an element of an array.
Set() and Get()
All classes have member functions named Get and Set. The argument to those
functions is a pointer to a memory area. Set sets
the elements in the object from the values stored in the memory
area. Get does the opposite. The elements are stored in
row major order (C++ style) in the memory area.
lightNN<double> a(5,5,17.7), b(5,5);
double arr[25];
a.Get(arr); // a --> arr
b.Set(arr); // arr --> b
No bounds check is performed.
dimension()
The size of an object can be found with the function dimension().
It takes an integer argument which should specify the dimension. The argument
should, for example, be 1 for the number of rows in a matrix, and 2 for
the number of columns.
Equality and inequality
The usual operators for equality and inequality exist for all classes.
Two objects are considered equal if they have the same size and if the
values of all corresponding elements are equal. Only arrays of the
same rank can be compared.
if(a == b && c != d) { /* something */ }
reshape()
A function named reshape is used to promote an object to another object
with higher dimensions. E.g. a vector can be promoted to a matrix and all
the columns in the matrix will then be given the values of the vector.
lightN<double> v(3);
lightNN<double> a;
a.reshape(3,5,v); // a becomes 3x5
make_lightN()
It is not a method of a class, but a friend function. You can use it to
construct an array from its components. Given several arrays of rank
n, it constructs an array of rank n+1. The number of components should be specified
as the first argument. The dimensions and element types of all
arrays must be the same.
No more than 10 parameters can be given in the argument list.
lightNN a;
a=make_lightN(2, make_lightN( 3, 11,12,13),
make_lightN( 3, 14,15,16));
This code creates a 2x3 array of the integers from 11 to 16. The specification
is always given in row-major order.
Note: This function is very time and space consuming. Use
Set() instead if you are interested in higher performance.
Indexing
The indexes in LightMat are 1-based, as in Fortran, not 0-based as
in C.
For matrices the first coordinate is usually named "row" and the second
"column".
Bounds checking in indexing operations can be turned on or off by preprocessor
definitions.
There is indexing for individual elements, indexing for array
extraction, and special access functions.
Indexing for individual elements
Individual elements can be indexed with the parentheses operator. The indexed
element can be either an l-value or an r-value. The number of indexes
must be equal to the rank of the array.
lightNN<double> a(5,3);
a(1,1) = a(2,3);
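The indexing rules above can be illustrated without the library. The following is only a sketch in plain C++: the class OneBasedMatrix and the macro SKETCH_NO_LIMITS_CHECKING are invented for this example and are not part of LightMat. It assumes column major storage and shows how a 1-based parentheses operator that returns a reference can serve as both an l-value and an r-value, with a bounds check that can be compiled out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in class for illustration only; this is not the
// LightMat implementation and SKETCH_NO_LIMITS_CHECKING is an invented macro.
class OneBasedMatrix {
public:
    OneBasedMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), elems_(rows * cols, 0.0) {}

    // Element access with 1-based (row, column) indexes.
    // Returns a reference, so the result can be an l-value or an r-value.
    double& operator()(std::size_t row, std::size_t col) {
#ifndef SKETCH_NO_LIMITS_CHECKING
        // Abort if an index is out of bounds (checking can be compiled out).
        assert(row >= 1 && row <= rows_ && col >= 1 && col <= cols_);
#endif
        // Column major storage: consecutive elements of a column are adjacent.
        return elems_[(col - 1) * rows_ + (row - 1)];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> elems_;
};
```

With this sketch, OneBasedMatrix a(5,3); a(1,1) = a(2,3); behaves like the LightMat example above, and an access such as a(6,1) trips the assertion unless checking is disabled.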
Indexing for array extraction
Arrays can be extracted from arrays of higher dimensions.
The indexing is limited to supplying values for a number of leading indices;
the last indices are not supplied. E.g. it is possible
to index a whole row in a matrix but not a column. The result can only
be used as an r-value, and the indexed elements are copied in the process.
The elements need to be copied since the concept of different views of
the same data does not exist in this design. Note that this is a relatively
expensive operation.
If the result is used as an l-value, the assignment result is simply
discarded.
The exception to this is the classes which have partially fixed
sizes. E.g. the class for Nx3 matrices (lightN3) contains
a number of vectors of size 3, and a reference to one of those can be returned
by the operator. There is no need to copy data and the returned reference
can also be used as an l-value. The same applies to the class
lightN33.
lightN<double> v;
lightNN<double> a(5,5);
lightNNN<double> c(7,8,9);
v = a(4);
v = c(4,5);
a = c(3);
a(1) = v; // The result is discarded, and matrix a does not change.
a.SetRow(1,v); // This is the correct way. See access functions.
light3<double> w;
lightN3<double> m(5);
m(3) = w + m(2); // This class is an exception, m changes.
Access functions
Several access functions have been designed for specific universal classes.
- lightN<T> SubVector(int,int) - returns an interval of the vector
- SetSubVector(int,int,lightN<T>&) - sets an interval of the vector
Example:
lightN<int> s(5,30); // Length is 5, all elements initialized to 30
lightN<int> p(2,40); // Length is 2, elements are initialized to 40
s.SetSubVector(3,4,p); // s becomes {30,30,40,40,30}
p=s.SubVector(2,4); // p changes length and becomes {30,40,40}
- class lightNN<T>
  - SetCol(int n, lightN<T>& x) - puts x into column n
  - SetRow(int n, lightN<T>& x) - puts x into row n
  - SetCols(int n1,int n2, lightNN<T> &x) - puts matrix x into columns n1...n2
  - SetRows(int n1,int n2, lightNN<T> &x) - puts matrix x into rows n1...n2
  - SetSubMatrix(r1,r2,c1,c2,lightNN<T> &x) - puts matrix x into rows from r1 to r2, columns from c1 to c2
  - SetCols(int n1,int n2, T x) - puts scalar x into columns n1...n2
  - SetRows(int n1,int n2, T x) - puts scalar x into rows n1...n2
  - SetSubMatrix(r1,r2,c1,c2,T x) - puts scalar x into rows from r1 to r2, columns from c1 to c2
  - lightN<T> Col(n) - get column n
  - lightN<T> Row(n) - get row n
  - lightNN<T> SubMatrix(r1,r2,c1,c2) - get submatrix, rows from r1 to r2, columns from c1 to c2
  - lightNN<T> Cols(c1,c2) - get submatrix, columns c1 to c2
  - lightNN<T> Rows(r1,r2) - get submatrix, rows r1 to r2
- class lightNNN<T>
  - SetMatrix1(n, lightNN<T> & x) - sets matrix x to "level" n of the tensor, i.e. elements (n,*,*)
  - SetVector12(n, m, lightN<T> & x) - sets vector x to elements (n,m,*) of the tensor
Many other access functions will be added in later versions.
SetShape()
This function changes the shape of an array. (Available for universal
classes only.)
If necessary, new memory is allocated. The elements are not initialized.
lightNN<double> a;
lightNN<double> b(5,6);
a.SetShape(2,4);
b.SetShape(7,3);
b(7,3)=1; // valid: b is now 7x3
data()
This function returns the address of the internal representation of the array.
(Available for universal classes only.)
Internally the arrays are stored in column major order, which is the order
typical for Fortran.
Calculations in lightmat
A collection of operators and functions is defined for every class in the
LightMat library. The operators and function names are overloaded and perform
relevant operations on various arguments. All the operators and functions
are available as friend functions. Therefore the notation is similar
for scalar arguments and for arrays of different ranks.
Operators available for all classes are:
Unary operators:
- -x, unary minus
- +x, unary plus
Elementwise binary operators.
One of the arguments can be a scalar. If both are arrays, they must be of the
same size.
- x+y, elementwise addition.
- x-y, elementwise subtraction.
- pow(x,y), elementwise power.
- Mod(x,y), elementwise modulo. The definition is extended to real numbers. If x and y are integers of different sign, (x%y+y) will be the result.
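The modulo rule can be made concrete with a small stand-alone sketch. The function floored_mod below is hypothetical and is not the LightMat Mod; it assumes the intent is a modulo whose non-zero result takes the sign of y, and it leaves a zero remainder unchanged (an assumption, since the manual does not spell out that case). For operands of different sign and a non-zero remainder it yields x%y+y, as described above.

```cpp
#include <cmath>

// Hypothetical helpers for illustration; not taken from the LightMat sources.

// Integer modulo whose non-zero result has the sign of y.
int floored_mod(int x, int y) {
    int r = x % y;  // since C++11 the result of % has the sign of x
    if (r != 0 && ((r < 0) != (y < 0))) {
        r += y;     // operands differ in sign: this is x%y + y
    }
    return r;
}

// The same definition extended to real numbers, built on std::fmod.
double floored_mod(double x, double y) {
    double r = std::fmod(x, y);  // result has the sign of x
    if (r != 0.0 && ((r < 0.0) != (y < 0.0))) {
        r += y;
    }
    return r;
}
```

For example, floored_mod(-7, 3) gives 2 and floored_mod(7, -3) gives -2, while plain -7 % 3 in C++ gives -1.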
Inner product and multiplication with scalar.
x*y
- If one of the arguments is a scalar, all elements of the other argument are multiplied by the scalar. The result has the same rank as the array.
- If both arguments are arrays, the inner product is computed (for vectors also called the dot product). If the arguments have rank m and n, the result will be of rank m+n-2. The last dimension of the first argument must be equal to the first dimension of the second argument. This is checked at runtime.
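The rank rule m+n-2 and the runtime dimension check can be sketched with standard containers. The code below is an illustration only: the typedefs Vec and Mat and the overloaded function inner are invented here, use 0-based std::vector storage, and report a mismatch by throwing, which is an assumption and not necessarily how LightMat reports it.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical types and functions for illustration; not the LightMat API.
typedef std::vector<double> Vec;  // rank 1
typedef std::vector<Vec> Mat;     // rank 2, Mat[row][column], rectangular

// rank 1 * rank 1 -> rank 0 (dot product)
double inner(const Vec& x, const Vec& y) {
    if (x.size() != y.size()) throw std::invalid_argument("dimension mismatch");
    double sum = 0.0;
    for (std::size_t k = 0; k < x.size(); ++k) sum += x[k] * y[k];
    return sum;
}

// rank 2 * rank 1 -> rank 1 (matrix-vector product)
Vec inner(const Mat& a, const Vec& y) {
    Vec result;
    result.reserve(a.size());
    // Each row's length (the last dimension of a) is checked against y.
    for (std::size_t i = 0; i < a.size(); ++i) result.push_back(inner(a[i], y));
    return result;
}

// rank 2 * rank 2 -> rank 2 (matrix-matrix product)
Mat inner(const Mat& a, const Mat& b) {
    const std::size_t inner_dim = b.size();
    const std::size_t cols = b.empty() ? 0 : b[0].size();
    Mat result(a.size(), Vec(cols, 0.0));
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Last dimension of a must equal first dimension of b.
        if (a[i].size() != inner_dim) throw std::invalid_argument("dimension mismatch");
        for (std::size_t k = 0; k < inner_dim; ++k)
            for (std::size_t j = 0; j < cols; ++j)
                result[i][j] += a[i][k] * b[k][j];
    }
    return result;
}
```

The three overloads correspond to the rows "lN, lN, s", "lNN, lN, lN" and "lNN, lNN, lNN" of the table of variants.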
All the possible variants are given in the table:
Arg1 | Arg2 | Res
s | s | s
l3 | s | l3
s | l3 | l3
l3 | l3 | s
l33 | l3 | l3
l3 | l33 | l3
l3 | lNN | lN
l4 | s | l4
s | l4 | l4
l4 | l4 | s
l4 | l44 | l4
l44 | l4 | l4
l4 | lNN | lN
s | lN | lN
lN | s | lN
lN | lN | s
lNN | lN | lN
lNN | l3 | lN
lNN | l4 | lN
lN3 | l3 | lN
lN3 | lN | lN
lN3 | lN3 | lN3
lN3 | s | lN3
s | lN3 | lN3
lN3 | l33 | lN3
l44 | lN3 | lN3
lNN | lN3 | lN3
lNN | s | lNN
s | lNN | lNN
lNN | lNN | lNN
lNN | l33 | lNN
lNN | l44 | lNN
lN3 | lNN | lNN
l33 | lNN | lNN
l44 | lNN | lNN
lN | lNNN | lNN
lNNN | lN | lNN
l3 | lNNN | lNN
lNNN | l3 | lNN
l4 | lNNN | lNN
lNNN | l4 | lNN
l33 | s | l33
s | l33 | l33
l33 | l33 | l33
l44 | s | l44
s | l44 | l44
l44 | l44 | l44
lNNN | s | lNNN
s | lNNN | lNNN
lNN | lNNN | lNNN
lNNN | lNN | lNNN
lNNNN | lN | lNNN
lN | lNNNN | lNNN
lNNN | l33 | lNNN
l33 | lNNN | lNNN
lNNNN | l3 | lNNN
l3 | lNNNN | lNNN
lNNN | l44 | lNNN
l44 | lNNN | lNNN
lNNNN | l4 | lNNN
l4 | lNNNN | lNNN
s | lNNNN | lNNNN
lNNNN | s | lNNNN
lNNNN | lNN | lNNNN
lNN | lNNNN | lNNNN
lNNNN | l33 | lNNNN
l33 | lNNNN | lNNNN
lNNNN | l44 | lNNNN
l44 | lNNNN | lNNNN
lNNN | lNNN | lNNNN
Operation with assignment.
If y is an array, it must have the same dimensions as x. The result
has the same dimensions as array x.
- x+=y, add array or scalar y
- x-=y, subtract array or scalar y
- x*=y, multiply by scalar y
- x/=y, divide by scalar y
Division.
The result has the same dimensions as the array.
- x/y, divide array x by scalar y
- x/y, divide scalar x by array y
Element-wise multiplication and division.
Both x and y are arrays. The result has the same dimensions.
- ElemProduct(x,y) - element-wise multiplication
- ElemQuotient(x,y) - element-wise division
Other operations with arrays
- Cross(x,y) - defined for light3 vectors only. Returns light3.
- OuterProduct(x,y) - defined for light3, light4, lightN. Returns lightNN.
- Apply(arr,T f(T)) - applies function f to every element of the array with element type T.
- Apply(arr1,arr2,T f(T,T)) - applies function f element-wise to the elements of arrays arr1 and arr2, which must have the same dimensions.
Mathematical functions, applied element-wise:
- sign (returns integer -1, 0 or 1; defined for integer and double)
- abs (returns the same type as its argument)
- LightMax returns the maximal element of the array
- LightMin returns the minimal element of the array
- floor, ceil, rint, IntegerPart (return integer)
- FractionalPart, sqrt, exp, log, sin, cos, tan, asin, acos, atan, sinh, cosh, tanh (return double)
- asinh, acosh, atanh (return double; not defined in the Win32 implementation)
Note: (Version 0.76) FractionalPart, IntegerPart,
Mod, LightMax and LightMin are implemented for universal types only.
Preprocessor definitions
In the file lightmat.h (or lightmat_id.h)
there are a number of preprocessor definitions that can be changed. Those
definitions are:
Preprocessor definition | Default value | Meaning if defined | Meaning if undefined
LIGHTMAT_LIMITS_CHECKING | defined | LightMat does limits checking during calculations, i.e. LightMat will abort computations and dump core if an index is out of bounds. Computation speed is reduced by these checks. | An operation with a bad index can cause unpredictable results.
LIGHTMAT_INIT_ZERO | undefined | LightMat will initialize all array elements to zero when constructing a new object. Computation speed is reduced. | Array elements may initially contain any value. Care should be taken if non-initialized values are used in some other interfaces, like MathLink.
LIGHTMAT_DONT_USE_BLAS | defined | LightMat uses its own function for double array multiplication. | LightMat calls BLAS library routines. The library should be linked with the application. (N/A)
LIGHTMAT_OUTPUT_FUNCS | defined | The ToStr functions (conversion to Tools.h++ class RWCString) and the operator<< functions are defined for all arrays. | (N/A)
LIGHTMAT_TEMPLATE | undefined | LightMat classes can be used as templates. This option is not available due to difficulties with various compilers. (N/A) | LightMat classes are used as traditional C++ classes.
If preprocessor definitions change, all the files should be recompiled.
Implementation notes
Memory usage in classes
One memory area for each object
Packages like math.h++ and M++ need to allocate memory dynamically for
objects. The ability to have different views in different objects of the
same data area means that the objects need to maintain reference counts
of how many objects use one memory area. This leads to extra overhead
when maintaining the reference counts and when allocating/deallocating
memory. The LightMat classes try to avoid this by having one memory area
for each object (no need for reference counts) and by letting each object
have a static area on the stack for the elements. But there is also a
performance disadvantage in not being able to share the same data area
between several objects: it makes assignment slower. See Temporary
Variables for a tip on how to avoid unnecessary assignments.
Static and dynamic memory
Since the amount of memory that is statically allocated needs to be compiled
into the code, objects with few elements won't use all of
the memory on the stack, and objects with many elements will have to
allocate memory dynamically in any case. But if the size of the allocated
memory on the stack is chosen carefully, an increase in speed can
be expected without excessive use of memory. The amount of memory allocated
by default is given in the table above.
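The combination of a static area with a heap fallback can be sketched as follows. This is a simplified illustration, not the LightMat implementation: the class SmallVector, its member names and the static size of 10 are chosen for the example (10 matches the default size listed for lightN). Unlike LightMat without LIGHTMAT_INIT_ZERO, the sketch zero-fills its storage to keep the example well defined, and it does not preserve old contents when it grows.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical class for illustration only; not taken from the LightMat sources.
class SmallVector {
public:
    static constexpr std::size_t kStaticSize = 10;

    explicit SmallVector(std::size_t n = kStaticSize)
        : static_area_(), size_(0), capacity_(kStaticSize), elems_(static_area_) {
        resize(n);
    }
    SmallVector(const SmallVector& other)
        : static_area_(), size_(0), capacity_(kStaticSize), elems_(static_area_) {
        *this = other;
    }
    ~SmallVector() { release(); }

    // Assignment resizes the left-hand side to the size of the right-hand side.
    SmallVector& operator=(const SmallVector& other) {
        if (this != &other) {
            resize(other.size_);
            std::copy(other.elems_, other.elems_ + other.size_, elems_);
        }
        return *this;
    }

    // Grow onto the heap only when the current area is too small.
    // Memory is never given back when the size decreases.
    void resize(std::size_t n) {
        if (n > capacity_) {
            release();
            elems_ = new double[n]();
            capacity_ = n;
        }
        size_ = n;
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    bool on_heap() const { return elems_ != static_area_; }
    double& operator[](std::size_t i) { return elems_[i]; }  // 0-based here

private:
    void release() {
        if (elems_ != static_area_) {
            delete[] elems_;
            elems_ = static_area_;
            capacity_ = kStaticSize;
        }
    }

    double static_area_[kStaticSize];  // lives inside the object itself
    std::size_t size_;
    std::size_t capacity_;
    double* elems_;                    // points at static_area_ or at the heap
};
```

A vector of 4 elements stays in the static area; assigning a 25-element vector to it moves it to the heap, and a later assignment of a 3-element vector keeps the 25-element heap area.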
Storage for arrays and order of elements
The usual way to store matrices in C++ is as an array of arrays.
That means that the elements are stored in row major order, i.e. all elements
in the first row are stored first, then the second row and so on. Fortran
stores matrices in column major order, i.e. all elements in the first column
are stored first etc. Tensors are stored in a similar way. A lot
of code for manipulating matrices has been written in Fortran, for example
BLAS. LightMat therefore also stores the elements in column major
order internally, since that makes it easier to interface to routines
like BLAS within the classes.
[Figure: a 3 x 2 matrix, shown in row major order and in column major order]
The interface for the classes expects the elements to be stored
in row major order, though. That makes it easier to interface
to other C++ code.
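The relation between the two element orders can be written down as offset formulas. The helpers below are a sketch with invented names, not LightMat functions; they use 1-based (row, column) indexes as in the rest of this manual and show the kind of conversion that is needed when row major input is stored in column major order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; not taken from the LightMat sources.

// Offset of 1-based element (i, j) when rows are stored one after another.
std::size_t row_major_offset(std::size_t i, std::size_t j, std::size_t cols) {
    return (i - 1) * cols + (j - 1);
}

// Offset of 1-based element (i, j) when columns are stored one after another.
std::size_t column_major_offset(std::size_t i, std::size_t j, std::size_t rows) {
    return (j - 1) * rows + (i - 1);
}

// Copy a rows x cols matrix from row major order to column major order.
std::vector<double> row_to_column_major(const std::vector<double>& in,
                                        std::size_t rows, std::size_t cols) {
    std::vector<double> out(in.size());
    for (std::size_t i = 1; i <= rows; ++i) {
        for (std::size_t j = 1; j <= cols; ++j) {
            out[column_major_offset(i, j, rows)] = in[row_major_offset(i, j, cols)];
        }
    }
    return out;
}
```

For a 3 x 2 matrix with rows (1,2), (3,4), (5,6), the row major sequence 1,2,3,4,5,6 becomes the column major sequence 1,3,5,2,4,6.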
Dynamic resizing of objects
When the result of some calculation is assigned to an existing object,
the object changes its size to the size of the result of
the calculation. See Assignment for more
information. This can sometimes lead to allocation of more memory during
calculations.
Dynamic resizing is allowed because it is convenient to be able
to create a variable that will contain the result of a calculation without
knowing the size of the resulting matrix beforehand. The functionality
also does not slow down other operations that can be done on the object.
The basic internal routines for calculation
Most calculations in LightMat are built upon some basic routines for addition,
dot product and similar calculations. They are all template functions,
so they can be used for different types of elements. These basic routines
operate on arrays of elements. Matrices and tensors must be stored in
column major order for correct results.
Some classes do not use these basic routines. The classes which
are specialized for particular sizes have their own optimized routines which
do the calculations as quickly as possible. That means that loops are avoided
in those classes; the needed statements are instead written out in a (sometimes) long
series. The functions do not get very big anyway, since the number of elements
in those classes is rather small. Loops can't be avoided in the normal classes,
since the number of elements isn't known at compile time.
There are different routines for element-wise assignment, addition,
subtraction, multiplication and division of two arrays, or
of an array and a scalar. There are also variants of those routines
which put the result either in some other array or in one of the
supplied arrays. Putting the result in the same array that one of the operands
is taken from reduces the code needed to index the arrays. All
these routines have unrolled loops in order to increase the speed a little
more.
There are also some routines for calculating dot products and
inner products. Two routines can calculate dot products. They differ in
that one of them allows the step between elements in one of the arrays
to be greater than one. That feature is needed by some other routines,
for example the routine that calculates matrix-vector products. Both routines
for dot product calculation have unrolled loops.
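Why a matrix-vector product needs a stride follows from the column major storage: the elements of one matrix row are not adjacent in memory. The sketch below illustrates this with invented routines (dot_strided and matrix_times_vector are not the LightMat routines, and the unrolling factor of four is an arbitrary choice for the example).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical routines for illustration; names and signatures are not
// taken from the LightMat sources.

// Dot product of x[0..n-1] with y[0], y[stride], y[2*stride], ...
// The main loop is unrolled by four; a second loop handles the remainder.
template <class T>
T dot_strided(const T* x, const T* y, std::size_t n, std::size_t stride) {
    T sum = T();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        sum += x[i] * y[i * stride]
             + x[i + 1] * y[(i + 1) * stride]
             + x[i + 2] * y[(i + 2) * stride]
             + x[i + 3] * y[(i + 3) * stride];
    }
    for (; i < n; ++i) {
        sum += x[i] * y[i * stride];
    }
    return sum;
}

// Matrix-vector product for a rows x cols matrix in column major order.
// Row r starts at offset r and its elements are `rows` apart, which is
// why the dot product needs a stride.
template <class T>
std::vector<T> matrix_times_vector(const std::vector<T>& a, std::size_t rows,
                                   std::size_t cols, const std::vector<T>& v) {
    std::vector<T> result(rows);
    for (std::size_t r = 0; r < rows; ++r) {
        result[r] = dot_strided(v.data(), a.data() + r, cols, rows);
    }
    return result;
}
```

For the 2 x 3 matrix with rows (1,2,3) and (4,5,6), stored column major as 1,4,2,5,3,6, multiplying by the vector (1,2,3) gives (14,32).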
The routines for calculating inner products use the routines that
calculate dot products. All the routines needed for calculating inner products
between any combination of vectors, matrices and tensors of rank 3 or
4 exist. There are, however, no routines that produce a tensor of
rank higher than 4 (since such a result could not be represented).
All routines use the routines for dot product directly or via another
routine for calculating inner products.
Function |
Comment |
light_assign |
Assign the elements in an array the values or elements from another
array, or assign all elements in an array one value. |
light_plus |
Assign the elements in an array the element-wise addition of
elements in two other arrays, or
assign the elements in an array the sum of the value of one argument
and the elements of another array. |
light_minus |
Assign the elements in an array the element-wise subtraction of elements
in two other arrays, or assign the elements in an array the difference
of the value of one argument and the elements of another array. |
light_mult |
Assign the elements in an array the element-wise multiplication
of elements in two other arrays, or assign the elements in an array
the product of the value of one argument and the elements of
another array. |
light_divide |
Assign the elements in an array the element-wise division of
elements in two other arrays, or assign the elements in an array the quote
of the value of one argument and the elements of another array. |
light_plus_same |
Add the elements in an array to the elements in another array and put
the result in the second array, or add the value of one argument to the
elements of an array and put the result in the same array. |
light_minus_same |
Subtract the value of the elements in an array from the elements in
another array and put the result in the second array. |
light_mult_same |
Multiply the elements in an array with the elements in another array
and put the result in the second array, or multiply the value of one argument
with the elements of an array and put the result in the same array. |
light_dot |
Calculate the dot product of two vectors. |
light_gemv |
Calculate the inner product of a matrix and a vector. |
light_gevm |
Calculate the inner product of a vector and a matrix. |
light_gemm |
Calculate the inner product of two matrices. |
light_ge3v |
Calculate the inner product of a tensor of rank 3 and a vector. |
light_gev3 |
Calculate the inner product of a vector and a tensor of rank 3. |
light_ge3m |
Calculate the inner product of a tensor of rank 3 and a matrix. |
light_gem3 |
Calculate the inner product of a matrix and a tensor of rank 3. |
light_ge33 |
Calculate the inner product of two tensors of rank 3. |
light_ge4v |
Calculate the inner product of a tensor of rank 4 and a vector. |
light_gev4 |
Calculate the inner product of a vector and a tensor of rank 4. |
light_ge4m |
Calculate the inner product of a tensor of rank 4 and a matrix. |
light_gem4 |
Calculate the inner product of a matrix and a tensor of rank 4. |
All basic routines are listed in the table above. Note that some
routines are overloaded and can hence have more than one use, although
the uses are similar in nature.
All classes use these routines whenever possible. That makes it easy
to change something in those routines, e.g. the degree of unrolling,
in order to tune LightMat for speed on a user's own computer,
compiler and calculations.
Using LightMat with BLAS
There also exist specializations of the template routines that calculate
dot product and inner products involving vectors and matrices with elements
of type double. Those routines call Fortran BLAS routines which are available
for many kinds of computers. The BLAS routines are usually very fast since
they are highly optimized for the respective computers.
The normal LightMat calculation routines can usually be inlined,
though, whereas calling a BLAS routine incurs an extra function call. Using
BLAS is therefore probably only a good idea if large vectors, matrices or
tensors are used in the calculations. Your mileage may vary depending on the
computer and compiler that are used. See also preprocessor
definitions.
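The mechanism can be sketched as a generic template routine plus a specialization for double that forwards to a BLAS-style routine. Everything named here is hypothetical: my_dot is not a LightMat routine, and fake_blas_ddot is a stand-in defined locally so that the sketch compiles on its own, where a real build would declare and link the Fortran routine instead.

```cpp
#include <cstddef>

// Stand-in for a Fortran BLAS routine such as ddot. It is defined
// here only to keep the sketch self-contained.
static int blas_calls = 0;
double fake_blas_ddot(int n, const double* x, int incx,
                      const double* y, int incy)
{
    ++blas_calls;
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += x[i * incx] * y[i * incy];
    return sum;
}

// Generic template routine (hypothetical name), used for all
// element types that have no specialization.
template <class T>
T my_dot(std::size_t n, const T* x, const T* y)
{
    T sum = T();
    for (std::size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

// Specialization for double: forwards to the BLAS-like routine.
template <>
double my_dot<double>(std::size_t n, const double* x, const double* y)
{
    return fake_blas_ddot(static_cast<int>(n), x, 1, y, 1);
}
```

Callers write my_dot(n, x, y) in both cases. The compiler selects the specialization when the element type is double, so the extra function call is only paid for that type.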
Optimizing calculations and constructors
The routines that only use the special classes with predefined sizes
create, when possible, the results of calculations with the constructors
that initialize all elements to their values.
Routines for other classes usually do nothing more than call special
constructors which do the actual work. E.g. the operator+ functions usually
only contain a return statement whose argument is a call to a constructor.
The reason is that the C++ compilers that have been tested produce
faster code if a function returns an object that is constructed in the
return statement. The alternative would be to create an uninitialized
object local to the function, set the elements to their correct
values and then return the object. The object must then be copied (since
it's a local variable) when it is returned, and that takes time. Having a
constructor call as the argument to the return statement can be used to avoid
that copying. I.e.
classA oper(classA a, classA b)
{
    // Calculation is done in constructor
    return classA(a, b, oper_enum);
}
is faster than
classA oper(classA a, classA b)
{
    classA res;
    // Do calculation here
    return res;
}
The constructors that do the calculations call the basic
routines when possible. Constructors exist for addition,
subtraction, multiplication, division, dot product,
inner product, transpose, raising to a power and absolute value. There
also exist constructors that take a function as an argument and apply it
to one or two objects. Those constructors are used by a lot of trigonometric
functions.
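A minimal sketch of such calculating constructors is given below. The class Vec3, the enumeration OpKind and the helper functions are hypothetical and only mimic the pattern described above. They are not the LightMat classes.

```cpp
#include <cmath>

enum OpKind { op_plus, op_minus };

// Hypothetical fixed-size vector class.
class Vec3 {
public:
    double e[3];

    Vec3(double a, double b, double c) { e[0] = a; e[1] = b; e[2] = c; }

    // Calculating constructor: builds the result of `a op b` directly.
    Vec3(const Vec3& a, const Vec3& b, OpKind op)
    {
        for (int i = 0; i < 3; ++i)
            e[i] = (op == op_plus) ? a.e[i] + b.e[i] : a.e[i] - b.e[i];
    }

    // Constructor that applies a function to every element of `a`.
    Vec3(const Vec3& a, double (*f)(double))
    {
        for (int i = 0; i < 3; ++i)
            e[i] = f(a.e[i]);
    }
};

// The operators contain nothing but a return statement whose
// argument is a constructor call.
inline Vec3 operator+(const Vec3& a, const Vec3& b) { return Vec3(a, b, op_plus); }
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return Vec3(a, b, op_minus); }

// Element-wise square root built on the function-applying constructor.
inline double root(double v) { return std::sqrt(v); }
inline Vec3 elementwise_sqrt(const Vec3& a) { return Vec3(a, root); }
```

All element loops live in the constructors, so the operator and function bodies stay one line long and a change to a loop has to be made in one place only.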
A few calculation functions create a local object which is later
returned. This is needed by a few functions that do conversions between
different element types, e.g. from int to double. It is also used by the
indexing functions which do not return a single element but instead,
for example, a vector or matrix. The reason for these exceptions is that
some special copying is needed for which no constructors exist.
Temporary Variables
When a calculation is done and the result is assigned to a variable, a
temporary variable is usually created by the compiler. The result of the
calculation is put into the temporary variable and the variable is then
given as an argument to the assignment operator.
The creation of a temporary variable can be avoided if assignment
isn't used when saving the result of the calculation. The result can instead
be put directly into a new variable that is constructed at the same time.
lightNN<double> A,B,C;
A=B+C;                  // a temporary variable is needed
lightNN<double> D=B+C;  // no temporary variable is needed
This will avoid the construction and destruction of an extra variable
and it will also avoid the call to the assignment operator. A constructor
for the new variable needs to be called though, but the result variable
has to be constructed somewhere anyway.
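The difference can be made visible with a small class that counts calls. The class Arr below is hypothetical and stands in for a LightMat array type. Only the assignment counter is relied on here, since the number of constructor calls can depend on the compiler.

```cpp
// Hypothetical array stand-in that counts constructions and assignments.
struct Arr {
    static int constructed;
    static int assigned;
    double v;

    explicit Arr(double x = 0.0) : v(x) { ++constructed; }
    Arr(const Arr& o) : v(o.v) { ++constructed; }
    Arr& operator=(const Arr& o) { v = o.v; ++assigned; return *this; }
};
int Arr::constructed = 0;
int Arr::assigned = 0;

// The result is constructed in the return statement.
inline Arr operator+(const Arr& a, const Arr& b) { return Arr(a.v + b.v); }
```

After "Arr A; A = B + C;" the counter Arr::assigned has increased by one, because the result of B + C is handed to the assignment operator. After "Arr D = B + C;" it is unchanged, because D is constructed from the result and no assignment takes place.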
Unrolling
The loops that do most calculations and copying of data in LightMat are
in the source file light_basic.icc. Some compilers may produce faster code
if the loops are written in some special way. The loops in light_basic.icc
can easily be changed by any programmer if that is the case.
Note on assignments
A lot of these functions use another function named light_assign
that does the actual copying of elements. The loop in the function light_assign
is unrolled in order to make it faster. The degree of unrolling can also
easily be changed since the code for all assignments is in only one place.
Also see ``The basic routines for calculation'' for
more information on light_assign.
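An unrolled copy loop of the kind described here might look like the sketch below. The name unrolled_assign, its signature and the unrolling degree of four are assumptions. The actual light_assign in light_basic.icc may be written differently.

```cpp
#include <cstddef>

// Hypothetical stand-in for an unrolled element copy.
// Changing the degree of unrolling means editing this one function.
template <class T>
void unrolled_assign(std::size_t n, T* dst, const T* src)
{
    std::size_t i = 0;
    // Main body: four elements per iteration.
    for (; i + 4 <= n; i += 4) {
        dst[i]     = src[i];
        dst[i + 1] = src[i + 1];
        dst[i + 2] = src[i + 2];
        dst[i + 3] = src[i + 3];
    }
    // Remainder: at most three leftover elements.
    for (; i < n; ++i)
        dst[i] = src[i];
}
```

Because every assignment goes through the same function, a different degree of unrolling can be tried on a given compiler without touching the classes.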
Making Extensions to LightMat
Some tips for adding new functionality to LightMat are provided below. As
a general rule: look at how the existing functions and classes work and mimic
them where appropriate.
Functions
There are some things that can be useful to remember when adding a new function
to a class and when changing an existing function.
-
The pointer variable elem points at the memory that is used. If it's equal
to sarea then the static memory area is being used. In classes with a fixed
size, elem is the static area.
-
The function init() can be used in constructors in order to initialize
the size of the object.
-
If you need to write a function that must change the size of an already
constructed object, have a look at operator=(). There are examples
of size changes in several of the assignment operators.
-
Use the functions described in ``The basic routines for
calculation'' whenever possible in calculations, since it is then
easier to change unrolling etc. in one place.
-
The limiterror() macro should be used to check the sizes of function
arguments where needed. E.g.
limiterror(s.size == 3);
When writing a specialized version of a function for a specific type, e.g.
double or float, write it as a template specialization. E.g. matrix
multiplication for type double using BLAS is written that way in light_double.icc.
New Classes
When making a new LightMat class, e.g. a new size, then it's easiest to
copy one of the existing ones and change it where needed.
It may also be necessary to add new functions, or friends, to
existing classes in order to integrate the new class with old classes.
Remember to write conversion functions from specialized sizes of classes
to the more general classes when appropriate. E.g. it should be possible
to convert an object of a new light22 class to an object of type lightNN.
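One way to provide such a conversion is a converting constructor in the general class. The sketch below uses two hypothetical classes, Fixed22 and GeneralNN, which stand in for a fixed-size class and a general class. They do not reproduce the interfaces of light22 or lightNN.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-size 2x2 matrix.
struct Fixed22 {
    double e[4];
};

// Hypothetical general matrix with a size chosen at run time.
class GeneralNN {
public:
    GeneralNN(std::size_t r, std::size_t c)
        : rows_(r), cols_(c), data_(r * c, 0.0) {}

    // Converting constructor: lets a Fixed22 be used wherever a
    // GeneralNN is expected.
    GeneralNN(const Fixed22& m)
        : rows_(2), cols_(2), data_(m.e, m.e + 4) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    std::size_t size() const { return data_.size(); }
    double operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};

// A function written only for the general class.
inline double sum(const GeneralNN& m)
{
    double s = 0.0;
    for (std::size_t i = 0; i < m.size(); ++i)
        s += m[i];
    return s;
}
```

With the converting constructor in place, a function such as sum() that is written once for the general class also accepts objects of the fixed-size class.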
Last Modified: 11:49am MET, February 15, 1998