A number of derivative classes are available. At least one is needed to support computation with derivative objects. The term 'derivative class' is merely an abbreviation for the underlying functionality. The derivative class has to be seen as a package containing support functions, constructor functions, interface wrappers and the Matlab class needed to run a differentiated program.
Which derivative class to use depends in part on how you produce the differentiated code, and whether you want to run the differentiated function in scalar or vector mode.
Scalar mode means that only a single derivative direction is computed. This implies that any derivative variable g_x has the same dimensions as the variable x it is associated with. This case is preferable because it allows doubles to be used as derivative objects. For example, if you have a scalar function f,
function z = f(a, b, c)
then you could run the reverse mode function a_f by simply passing a literal 1 as the adjoint input parameter:
gradient = a_f(a, b, c, 1)
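As a hand-written illustration of why plain doubles suffice in scalar mode (a sketch, not code generated by ADiMat), the following differentiates the statement y = x.^2 + 3*x in a single direction. The statement and the seed position are assumptions chosen for illustration; only the g_ prefix follows the naming used above.

```matlab
% Hand-written sketch of scalar mode, not ADiMat output.
x   = [1 2; 3 4];          % program variable
g_x = zeros(size(x));      % derivative variable, same dimensions as x
g_x(2, 1) = 1;             % the single derivative direction: d/dx(2,1)

y   = x.^2 + 3*x;          % original statement
g_y = (2*x + 3) .* g_x;    % differentiated statement, using doubles only
```

Because only one direction is carried along, g_y has the same size as y and no container class is needed.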
Vector mode means that several different derivative inputs are concatenated. Since Matlab objects may already be vectors, matrices or tensors, the derivative objects have one more dimension than the corresponding program variables. We use derivative classes to hide the additional dimension and to perform the correct derivative computations in the overloaded operators of that class. A different approach, implemented by the command admproc -f and the derivative class vector_directderivs, calls special runtime functions to the same end.
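The extra dimension can be pictured with plain arrays. The sketch below is hand-written and does not use any ADiMat class; placing the direction index first is an assumption made for illustration. It propagates three directions through y = x.^2 at once.

```matlab
% Plain-array sketch of vector mode (no ADiMat classes involved).
x   = [1 2 3];
ndd = 3;                              % number of derivative directions
g_x = zeros([ndd size(x)]);           % one more dimension than x: 3 x 1 x 3
for d = 1:ndd
  g_x(d, 1, d) = 1;                   % direction d seeds element d of x
end

% Derivative of y = x.^2 is 2*x .* g_x, applied to every direction.
fac = reshape(2*x, [1 size(x)]);      % 1 x 1 x 3, broadcast over directions
g_y = bsxfun(@times, fac, g_x);       % 3 x 1 x 3

% Rearranged so that J(i, d) = dy(i)/dx(d).
J = reshape(g_y, ndd, numel(x)).';
```

With this seeding J is the full Jacobian of y with respect to x, here a diagonal matrix.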
In the following list you find the possible combinations of derivative code and derivative classes.
All installed derivative classes can be found in the directory '${ADIMAT_HOME}/share/adimat/'. The desired class can be selected at runtime using the function adimat_derivclass. With driver functions such as admDiffFor, the option derivClassName can be used to select the derivative class. The following derivative classes are available:
In this class, derivatives are stored internally in one big tensor array of size [NDD size(x)], where x is the corresponding original variable. With this class many operations are very fast, because no loop over NDD is required. On the other hand, some other operations like cat, horzcat, vertcat, mtimes and mldivide are slower than with opt_derivclass. Also, some indexing operations may not be correctly supported by this class. If you run into such a case, please provide us with an example so we can try to fix the issue. This derivative class is used by default.
Like arrderivclass, but the internal array has a different layout. It is always two-dimensional, of size [prod(size(x)), NDD].
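The two array layouts can be compared with plain arrays. This is a hand-written sketch that does not touch ADiMat internals; the element ordering within the two-dimensional layout is assumed to follow Matlab's column-major linear indexing.

```matlab
% Sketch: the same derivative data in the two layouts described above.
x   = magic(3);
ndd = 2;

A = zeros([ndd size(x)]);     % tensor layout [NDD size(x)]: 2 x 3 x 3
A(1, 2, 3) = 1;               % direction 1, element x(2,3)
A(2, 1, 1) = 5;               % direction 2, element x(1,1)

% Two-dimensional layout [prod(size(x)), NDD]: 9 x 2.
B = reshape(A, ndd, []).';

k = sub2ind(size(x), 2, 3);   % linear index of x(2,3), here 8
```

In this sketch the entry A(d, i, j) ends up at B(sub2ind(size(x), i, j), d).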
This derivative class is written using the new classdef construct in Matlab, so it will not work in Octave 3.6. Otherwise it is identical to arrderivclass.
Sometimes also called opt3_derivclass, because it is the third version of a derivative class based on cell arrays. This derivative class comprises the whole set of operators needed for computation with first order and second order derivative objects (gradients and Hessians). This derivative class is well maintained and mostly stable. It is suitable for programs whose derivatives are known to be full, that is, whose derivative objects have fewer than 70% zeros. Although this class supports sparse derivative objects, the derivatives are not converted back to sparse data structures after operations that return full matrices, such as mtimes.
Also called opt3_sp_derivclass. Functionally identical to opt3_derivclass. It additionally converts directional derivatives with more than 70% zero entries to sparse data structures, conserving memory and computational resources. Note that it may be slower than opt3_derivclass if many non-zero entries are present in the directional derivatives.
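The 70% rule can be sketched in a few lines. This is only an illustration of the idea with plain Matlab arrays; how the class actually measures and applies the threshold is not shown in this document, and the variable names are hypothetical.

```matlab
% Sketch of a sparsity threshold check (hypothetical names, not ADiMat code).
threshold = 0.7;                         % fraction of zeros that triggers conversion

d = zeros(10, 10);                       % one directional derivative
d(1:5, 1) = 1;                           % only 5 of 100 entries are non-zero

zero_fraction = 1 - nnz(d) / numel(d);   % share of zero entries
if zero_fraction > threshold
  d = sparse(d);                         % store only the non-zero entries
end
```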
Uses a matrix for storing directional derivatives instead of the cell array that opt3_derivclass uses. The complete set of operators is available, but currently only for first order derivatives (i.e. no Hessian computations are possible). Because one level of indirection is missing (no access to a cell array), this class is faster. It is sped up further for certain operations because the operation is not applied to every directional derivative successively, but to all directional derivatives at once.
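The difference between applying an operation per direction and applying it once to all directions can be sketched for the statement y = A*x. The code is hand-written with plain cells and matrices; storing one direction per column is an assumption made for this example.

```matlab
% Sketch: per-direction loop versus one operation on all directions.
A    = [1 2; 3 4; 5 6];
ndd  = 2;
dirs = {[1; 0], [0; 1]};     % cell storage: one entry per direction of x

% Cell-based storage: the product is applied to each direction in turn.
g_y_cell = cell(1, ndd);
for d = 1:ndd
  g_y_cell{d} = A * dirs{d};
end

% Matrix storage: all directions side by side, one product for all of them.
G   = [dirs{:}];             % 2 x 2, one column per direction
G_y = A * G;                 % 3 x 2, column d is the derivative in direction d
```

Both variants give the same numbers; the matrix variant replaces ndd small products by a single larger one.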
This is not really a class but only a collection of runtime functions. Most importantly it has a version of g_zeros that returns native doubles. Note that it does not have the ls_* runtime functions, so adimat should be run with option --noloopsaving to produce the code.
This is not really a class but only a collection of runtime functions. The derivative object of an m x n double object is a d x n x m double object. Currently this derivative class is only for use with code produced by admproc -f, and that code can only be run with this derivative class.
The creation of derivative objects for all derivative classes is done using constructor functions. These functions create seedings that are often used. See the section on constructor functions for more information on all available functions.
A derivative object should be regarded as a container storing directional derivatives. Derivative objects are associated with Matlab objects, but do not store references to them. The association is by name only.
A derivative object stores a number of objects in it that have the same shape
as the associated Matlab-object. E.g., the derivative object g_t
associated to a 3×3-matrix t
stores a number of 3×3-matrices.
The number of 3×3-matrices stored in the derivative object is defined by
the number of directional derivatives of interest in the program.
Derivative objects may be one- or two-dimensional. One-dimensional derivative objects are called gradients or Jacobians depending on the context, while two-dimensional derivative objects are called Hessians.
The data within a derivative object is accessed using the standard Matlab cell array assignment and indexing operators. This is independent of the implemented storage model. The names of the actual Matlab classes vary. For example, the Matlab class of the opt_derivclass is named adderiv, that of the opt_sp_derivclass adderivsp, and that of the mat_derivclass madderiv. These names occur in the list displayed by Matlab's 'whos' command if derivative objects are present in the current workspace. Conventionally only one kind should occur. Intermixing them is not supported and may need manual conversion if desired.
Derivative objects are created using constructor functions.
'createZeroGradients()' is one of them. The function is able to
initialise several derivative objects at once and may be called several times.
If the function is called several times, the number of directional derivatives
has to be the same in each call. Additional constructor functions exist; see
the section Constructor functions.
[g_v1, g_v2, ..., g_vn]= createZeroGradients(ndd, v1, v2, ..., vn);
or
g_v1= createZeroGradients(ndd, v1);
g_v2= createZeroGradients(ndd, v2);
...
g_vn= createZeroGradients(ndd, vn);
Initialise one or more derivative objects.
This function initialises one or more derivative objects. The
number of directional derivatives created per derivative object is given by
the parameter 'ndd'. When using the vectorised call (the upper one), the
user has to ensure that the order of the variables 'vi' matches the order of
the corresponding derivative objects. There is no way to ensure this
automatically or to check for a proper order.
The derivative object 'g_vi' of the variable 'vi' stores 'ndd'
copies of the variable 'vi', but with all entries set to zero. That
is why this function is named createZeroGradients().
All derivative objects have to store the same number of directional
derivatives. It is therefore advisable to use the vectorised call, which
ensures that all derivative objects have the same number. It may become possible
to change the number of directional derivatives in a future version of ADiMat,
but up to now this is not supported. Changing the number of
directional derivatives during one run of the differentiated program is done
at your own risk and may yield wrong derivatives.
The lighthouse example, which uses scalars only:
n= 10; % (m)
g= 0.375* pi; % (rad)
o= 0.0001* pi; % (rad)
t= 2; % (s)
[g_n, g_g, g_o, g_t]= createZeroGradients(4, n, g, o, t);
The derivative objects are all initialised to zero now. The contents of
'g_n' are:
>> g_n
adderiv: number of directional derivatives: 4
0
0
0
0
Each line containing a zero shows one directional derivative. The example
above is too simple to see the effect, therefore a more complex one is
introduced here. Suppose a row vector 'v' containing five floating-point
numbers and a scalar 's' are the independent variables with respect to which
a function is differentiated. The derivatives of interest are those with
respect to the first three entries of the vector and the scalar. That is, four
directional derivatives are needed. The call to the constructor function is
given by:
>> v= [1, 2, 3, 4, 5]; % Same as 1:5
>> s= 42;
>> [g_v, g_s]= createZeroGradients(4, v, s);
>> g_v
adderiv: number of directional derivatives: 4
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
>>
The output for the derivative object 'g_v' is shown above. There are four
row vectors, each containing five zeros. This is because the original vector
has 5 entries and four directional derivatives are of interest. The output of
'g_s' is identical to the output of 'g_n' shown in the lighthouse
example above.
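What the constructor produces can be modelled with plain cell arrays. The sketch below is a stand-in for illustration only: it does not create adderiv objects, and the names zero_dirs, model_g_v and model_g_s are hypothetical.

```matlab
% Plain-cell model of createZeroGradients(4, v, s); not the ADiMat function.
ndd  = 4;
v    = [1 2 3 4 5];
s    = 42;
vars = {v, s};

zero_dirs = cell(size(vars));
for k = 1:numel(vars)
  % ndd zero copies, each with the shape of the associated variable
  zero_dirs{k} = repmat({zeros(size(vars{k}))}, 1, ndd);
end

model_g_v = zero_dirs{1};    % four 1 x 5 zero vectors
model_g_s = zero_dirs{2};    % four scalar zeros
```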
Each derivative object is an object of a Matlab class provided by ADiMat.
The class overloads several operators and (re-)implements some
functions. The cell array indexing operator '{n}' accesses
single, multiple or all directional derivatives in a derivative object.
Additionally a 'get()' method is implemented, which basically does the
same job. The advantage of the 'get()' method is that access to all
directional derivatives is implemented efficiently. The drawback of the
'get()' method is that it implements a restricted set of addressing modes only.
For example, it is not possible to select the third directional derivative of
an object and get its (2,3)-th element in one statement.
g_v{n}= ...;
or
t= g_v{n};
Write or read data of the n-th directional derivative
of an object 'g_v'.
The cell array indexing operator is used either in assigning mode, to assign
data to a directional derivative, or in referencing mode, to read it. If the
expression 'g_v{n}' appears on the left-hand side of an assignment, the
operator is in assigning mode. If the expression occurs on the right-hand side
of an assignment or in no assignment at all, it is treated as being in
referencing mode.
Indexing operators may be concatenated. Suppose that v is a higher
dimensional object, a vector for example. The expression
'g_v{i}(j)' accesses the j-th element of the i-th
directional derivative of the object 'g_v'. This is possible in assigning
and in referencing mode.
The cell array indexing operators are often used to do the seeding or to look
at one directional derivative. The example presented with the
createZeroGradients() function is repeated here to show one possible
seeding to get the desired derivatives. Remember that the derivatives of
interest are those with respect to the first three entries of the vector v and
the scalar s. Create the derivative objects first:
>> v= [1, 2, 3, 4, 5]; % Same as 1:5
>> s= 42;
>> [g_v, g_s]= createZeroGradients(4, v, s);
The seeding is done by inserting ones at the desired positions of the
derivative objects:
>> g_v{1}(1)= 1;
>> g_v{2}(2)= 1;
>> g_v{3}(3)= 1;
>> g_s{4}= 1;
>> g_v
adderiv: number of directional derivatives: 4
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 0 0
>> g_s
adderiv: number of directional derivatives: 4
0
0
0
1
>>
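To see what this seeding yields, the following hand-written sketch propagates the four directions through a made-up function f(v, s) = s*sum(v.^2). Plain cell arrays stand in for the derivative objects, and both the function and its hand-derived derivative are assumptions for illustration, not ADiMat output.

```matlab
% Plain cells standing in for the seeded derivative objects g_v and g_s.
v = [1 2 3 4 5];
s = 42;
ndd = 4;
g_v = repmat({zeros(size(v))}, 1, ndd);
g_s = repmat({0}, 1, ndd);
g_v{1}(1) = 1;
g_v{2}(2) = 1;
g_v{3}(3) = 1;
g_s{4}    = 1;

% Hypothetical function f(v, s) = s*sum(v.^2), differentiated by hand:
% g_f = g_s*sum(v.^2) + s*sum(2*v.*g_v), evaluated once per direction.
g_f = zeros(1, ndd);
for d = 1:ndd
  g_f(d) = g_s{d} * sum(v.^2) + s * sum(2 * v .* g_v{d});
end
```

Entries 1 to 3 of g_f are the partial derivatives with respect to v(1), v(2) and v(3), and entry 4 is the partial derivative with respect to s.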
r_i= get(g_v, index);
or
r_all= get(g_v, 'direct');
or
opt= get(g_v, optionname);
Get some or all directional derivatives of a derivative object or get values of options.
The get() method gets a single directional
derivative, multiple directional derivatives, or all of them. Depending on the
underlying derivative class, using get() to extract all directional derivatives
from a derivative object is more efficient, because the get() method does not
truncate the directional derivatives when extracting them, but returns them in a
matrix.
To extract all directional derivatives from a derivative object g_v, it
is strongly advised to use r_all= get(g_v, 'direct'). This special
form merely copies the internal matrix of the directional derivatives to the
result matrix r_all. The directional derivatives are concatenated
horizontally if the derivative object is one-dimensional, i.e. first
order derivatives are computed, and are stored matrix-like if second order
derivatives are computed.
Finally, the get() method enables reading of internally stored
options. To find out which options are supported by the derivative class, look
at the help text of get. Make sure that you prefix the name get with the
correct name of the Matlab class of the derivative class you are using. That
is, if you use the opt_derivclass, then the command help adderiv/get shows the
correct help text. More information on the options system is available in the
section The options system of the Matlab-class.
The settings used in the example for the operator '{n}' are
reused here, i.e. the derivative objects g_v and g_s are
assumed to exist.
>> get(g_v, 1)
ans =
1 0 0 0 0
>> get(g_v, 'direct')
ans =
Columns 1 through 13
1 0 0 0 0 0 1 0 0 0 0 0 1
Columns 14 through 20
0 0 0 0 0 0 0
>>
These two examples show the fetch of the first directional derivative of
g_v and the fetch of all directional derivatives of g_v.
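The horizontally concatenated result can be reproduced, and split up again, with plain arrays. This sketch models the seeded g_v from above with a cell array; it does not call the get() method, and the slicing formula is an assumption that holds for row-vector variables as in this example.

```matlab
% Plain-cell model of the seeded g_v (four directions of a 1 x 5 vector).
dirs = {[1 0 0 0 0], [0 1 0 0 0], [0 0 1 0 0], [0 0 0 0 0]};

% Horizontal concatenation, as in the output of get(g_v, 'direct') above.
r_all = [dirs{:}];                    % 1 x 20

% Recover the k-th directional derivative from the concatenated row.
n   = numel(dirs{1});                 % entries per direction
k   = 2;
r_k = r_all((k-1)*n + (1:n));
```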
create...()
ADiMat provides some functions to easily create one or more derivative objects. These so-called constructor functions create derivative objects with all elements set to zero, with the diagonal of all objects set to one, or seeded to compute the full Jacobian.
[g_v1, g_v2, ..., g_vn]= createZeroGradients(ndd, v1, v2, ..., vn);
or
g_v1= createZeroGradients(ndd, v1);
g_v2= createZeroGradients(ndd, v2);
...
g_vn= createZeroGradients(ndd, vn);
Initialise one or more derivative objects with zeros.
This function initialises one or more derivative objects. The
number of directional derivatives created per derivative object is given by
the parameter 'ndd'. When using the vectorised call (the upper one), the
user has to ensure that the order of the variables 'vi' matches the order of
the corresponding derivative objects. There is no way to ensure this
automatically or to check for a proper order.
The derivative object 'g_vi' of the variable 'vi' stores 'ndd'
copies of the variable 'vi', but with all entries set to zero. That
is why this function is named createZeroGradients().
All derivative objects have to store the same number of directional
derivatives. It is therefore advisable to use the vectorised call, which
ensures that all derivative objects have the same number. It may become possible
to change the number of directional derivatives in a future version of ADiMat,
but up to now this is not supported. Changing the number of
directional derivatives during one run of the differentiated program is done
at your own risk and may yield wrong derivatives.
See the createZeroGradients() example.
[g_v1, g_v2, ..., g_vn]= createFullGradients(v1, v2, ..., vn);
Create full Jacobian for all vi.
Creates derivative objects for all vi. The number of
directional derivatives stored in each g_vi is the sum of
the products of the sizes of all vi. Or, to spell it in pseudo-Matlab:
ndd=sum(prod(size(vi))) for i=1:n. The seeding is done in such a way that the
derivatives are computed with respect to each input element.
The function is restricted to arrays as inputs, i.e. structures and
cell arrays are rejected. This function can be called only once in a program,
or again after resetting the ADoptions (see
clearADoptions).
>> t=magic(3);
>> g_t=createFullGradients(t)
adderiv: number of directional derivatives: 9
1 0 0
0 0 0
0 0 0
0 1 0
0 0 0
0 0 0
0 0 1
0 0 0
0 0 0
... and so on ...
0 0 0
0 0 0
0 1 0
0 0 0
0 0 0
0 0 1
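The count of directional derivatives and the idea of a full seeding can be sketched for two inputs with plain arrays. This is a hand-written model, not the constructor itself: the second input [1 2] is made up, and the order in which elements are assigned to directions (column-major here) is an assumption that may differ from the order in the display above.

```matlab
% Sketch: ndd and a full seeding for two hypothetical inputs.
vars  = {magic(3), [1 2]};
sizes = cellfun(@numel, vars);        % elements per input: [9 2]
ndd   = sum(sizes);                   % sum(prod(size(vi))) = 11

S       = eye(ndd);                   % one unit direction per input element
offsets = [0 cumsum(sizes)];
seeds   = cell(size(vars));
for k = 1:numel(vars)
  rows     = offsets(k) + (1:sizes(k));
  seeds{k} = cell(1, ndd);
  for d = 1:ndd
    % slice of direction d that belongs to input k, in the shape of input k
    seeds{k}{d} = reshape(S(rows, d), size(vars{k}));
  end
end
```

Each direction contains exactly one 1 across all inputs, so every input element is differentiated with respect to exactly once.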
[g_v1, g_v2, ..., g_vn]= createEyeGradients(v1, v2, ..., vn);
Create derivatives with the diagonal elements set to one.
Creates derivative objects for all vi. The number of
directional derivatives is the sum of the minima of the sizes of all vi, or
in pseudo-Matlab: ndd=sum(min(size(vi))) for i=1:n.
This function can be applied only once per program, or again after resetting the
ADoptions (see
clearADoptions).
>> t=magic(3);
>> g_t=createEyeGradients(t)
adderiv: number of directional derivatives: 3
1 0 0
0 0 0
0 0 0
0 0 0
0 1 0
0 0 0
0 0 0
0 0 0
0 0 1
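For a single input, the seeding shown above can be modelled with plain arrays as follows. This is a hand-written sketch rather than the constructor itself, and the cell array seed is a hypothetical stand-in for the derivative object.

```matlab
% Plain-cell model of createEyeGradients(t) for one input.
t   = magic(3);
ndd = min(size(t));            % 3 directional derivatives

seed = cell(1, ndd);
for d = 1:ndd
  seed{d} = zeros(size(t));
  seed{d}(d, d) = 1;           % direction d seeds the d-th diagonal element
end

% Adding up all directions gives the identity pattern.
total = seed{1} + seed{2} + seed{3};
```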
[h_v1, h_v2, ..., h_vn]= createHessians([ndd ndd], v1, v2, ..., vn);
or
[h_v1, h_v2, ..., h_vn]= createHessians([], v1, v2, ..., vn);
Create Hessians.
Creates two-dimensional derivative objects for all vi.
The number of directional derivatives stored in each h_vi is
specified by [ndd ndd] or, if the empty matrix is supplied, taken from the
options system (see
getOption(...)). Each h_vi object is ensured to be two-dimensional.
Each h_vi contains sparse objects with all elements set to zero.
>> t=magic(3);
>> h_t= createHessians([], t)
adderiv: total number of directional derivatives: 3x3
(1,:)
(2,:)
(3,:)
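A two-dimensional derivative object can be pictured as an ndd x ndd grid of zero-filled sparse matrices, each shaped like the original variable. The sketch below models this with a plain cell array; it is not the adderiv class, and the name h_model is hypothetical.

```matlab
% Plain-cell model of a 3x3 Hessian object for a 3 x 3 variable.
t   = magic(3);
ndd = 3;

h_model = cell(ndd, ndd);
for i = 1:ndd
  for j = 1:ndd
    % all-zero sparse matrix with the shape of t
    h_model{i, j} = sparse(size(t, 1), size(t, 2));
  end
end
```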
g_v= g_zeros(size(v));
h_v= h_zeros(size(v));
Create one/two-dimensional derivative objects with all elements set to zero.
These functions create zero objects. They are for internal
use only. Essentially, the functions give the derivative of Matlab's
matrix constructor functions zeros(), ones(), eye(), and so on. Calls to these
functions are inserted wherever no derivative can be computed. This may
be the case for assignments of constant arrays (or scalars).
Direct use of these functions is not recommended, because the functions may
change without further notice!
The options system implements a way to invisibly store global information
needed by the process of computing derivatives. Traditionally, a flag used by
Matlab's toolboxes is stored in the global workspace, where it is lost
after a call to clear all. The information maintained by the options
system survives, because it is stored as persistent data in a private
member function. (If the previous sentence does not mean much to you, do not
worry: it is aimed at developers.)
val=get(g_t,optionname);
Get the value of the option specified by optionname,
or reset all settable options.
Get the value of a specific option. Use help
adderiv/get to get a list of available options. The object supplied
as g_t needs to be an object of the current derivative class. A
multi-purpose object is available by supplying g_dummy for
g_t. Some options are local to an object, some are global. Whether an
option is local or global is documented in the help text of the get() method.
A special option is 'ClearAll'. This option resets all
settable options in the options system to their default values. This option may
be needed if a program needs to change the number of directional
derivatives that are stored in a derivative object by default. Although
computation with derivative objects storing different numbers of directional
derivatives is not supported by the derivative class, in some cases the number
of directional derivatives needs to be reset, for instance if another program
with a different number of directional derivatives is to be executed.
>> get(g_dummy, 'NumberOfDirectionalDerivatives')
ans =
3
>> ver=get(g_dummy, 'Version')
ver =
0.5000
>>
set(...)
Exists, but hands off: for internal use only.
This function is intentionally undocumented. It is for internal use only. Tampering with it will cause unexpected behaviour.
Matlab implements two datatypes that get special treatment by the derivative class. The first is the cell array and the second is the structure.
Cell arrays are able to store objects of different types, i.e. a cell array may
store a string, an array, and a scalar. A cell array is organized like a
standard Matlab array, i.e. it is indexable.
In conjunction with the derivative class: A cell array can never be stored
within a derivative object, but a derivative object can be stored within a
cell array. In fact, there is no need to modify code containing cell arrays when
it is to be differentiated by ADiMat. The source transformation component
ensures correct treatment of the cell array and the derivative objects. The
only issue to take care of is the access to the data. The derivative object is in
the cell array, i.e. the first index accesses the derivative object, the second
the data within it.
Example:
Let a, b, and c be active variables and
ca={a,b,c} be the cell array combining them into a vector. The
derivative expression for ca is: g_ca={g_a,g_b,g_c}.
The expression g_ca{1} will access the first derivative object
in the vector, namely g_a. To access the second component of the first item of
g_ca, the expression g_ca{1}{2} has to be used.
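The two-level indexing can be tried out with nested plain cell arrays. In this sketch the inner cells merely stand in for derivative objects with two directional derivatives each; the values are made up for illustration.

```matlab
% Nested cells standing in for derivative objects inside a cell array.
g_a  = {[1 0], [0 1]};        % two directional derivatives of a
g_b  = {0, 0};
g_c  = {0, 0};
g_ca = {g_a, g_b, g_c};       % mirrors g_ca={g_a,g_b,g_c}

first_obj  = g_ca{1};         % first index: the derivative object g_a
second_dir = g_ca{1}{2};      % second index: the data within it
```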
Structures enable the storage of distinct data in a hierarchical way. The
source transformation component of ADiMat ensures that the base object, i.e.
the variable storing the structure, is a derivative object. This has to be
taken into account when active structures are created and their derivative
has to be created. First a dummy derivative object has to be created and then
the fields have to be inserted. In this way a structure is stored within a
derivative object. The other way around is not allowed: a derivative object
must not be stored within a structure. This is enforced because the activity
analysis of the source transformation component takes only the variable
containing the structure into account; the fields are of no interest.
Derivative objects of structures may be created using the constructor functions
createZeroGradients() for gradients and
Jacobians and
createHessians() for Hessians.
The constructor functions createFullGradients() and
createEyeGradients() cannot be used to create a derivative object for a
structure.
Example:
str.field1=[1, 2]; str.field2=42;
creates a simple structure. The
constructor function g_str= createZeroGradients(3, str); creates a
suitable derivative object:
>> g_str
adderiv: number of directional derivatives: 3
field1: [0 0]
field2: 0
field1: [0 0]
field2: 0
field1: [0 0]
field2: 0
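The displayed content can be modelled with plain Matlab data: three copies of the structure with every field replaced by zeros of the same shape. This sketch uses a cell array named model_g_str as a hypothetical stand-in and does not create an adderiv object.

```matlab
% Plain-cell model of the zero derivative of a structure.
str.field1 = [1, 2];
str.field2 = 42;
ndd = 3;

% Replace every field by zeros of the same size, keeping the field names.
zero_str = structfun(@(f) zeros(size(f)), str, 'UniformOutput', false);

% One zeroed copy of the structure per directional derivative.
model_g_str = repmat({zero_str}, 1, ndd);
```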