The ADiMat Handbook: The derivative classes

7. The derivative classes

A number of derivative classes are available. At least one is needed to support computation with derivative objects. The term 'derivative class' is merely an abbreviation for the underlying functionality. The derivative class has to be seen as a package containing support functions, constructor functions, interface wrappers and the Matlab class needed to run a differentiated program.

Which derivative class to use depends in part on how you produce the differentiated code, and whether you want to run the differentiated function in scalar or vector mode.

Scalar mode means that only a single derivative direction is computed. This implies that any derivative variable g_x has the same dimensions as the variable x it is associated with. This case is preferable because it allows plain doubles to be used as derivative objects. For example, if you have a scalar function f,

function z = f(a, b, c)

then you could run the reverse mode function a_f by simply sticking a literal 1 into the adjoint input parameter:

gradient = a_f(a, b, c, 1)
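To make the scalar-mode convention concrete, here is a minimal sketch in plain Matlab. It uses a hand-written derivative, not ADiMat-generated code; the names f and g_f and the test function sum(x.^2) are assumptions chosen for illustration. The only point is that g_x is an ordinary double array of the same size as x.

```matlab
% Hand-written stand-in for a differentiated function (NOT ADiMat output).
% f(x) = sum(x.^2); its derivative in direction g_x is sum(2 .* x .* g_x).
f   = @(x) sum(x(:).^2);
g_f = @(g_x, x) sum(2 .* x(:) .* g_x(:));

x   = [1 2; 3 4];
g_x = zeros(size(x));   % scalar mode: a plain double with the size of x
g_x(2, 1) = 1;          % single direction: unit perturbation of x(2,1)

z   = f(x);             % function value
g_z = g_f(g_x, x);      % derivative of f with respect to x(2,1)
```

Because only one direction is carried along, no extra dimension and no class wrapper is needed.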

Vector mode means that several different derivative inputs are concatenated. Since in Matlab the objects may already be vectors, matrices or tensors, the derivative objects have one more dimension than the corresponding program variables. We use derivative classes to hide the additional dimension and to do the correct derivative computations in the overloaded operators of that class. In a different approach, implemented by the command admproc -f and the derivative class vector_directderivs, we call special runtime functions to the same end.
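The extra dimension can be sketched with plain arrays. The following is an illustration only, not the code of any derivative class: it assumes the directions are stacked along a leading dimension and propagates all of them through y = x.^2 in one operation.

```matlab
x   = [1 2 3; 4 5 6];             % 2x3 program variable
ndd = 2;                          % number of derivative directions

% Assumed layout for this sketch: one leading dimension for the directions.
g_x = zeros([ndd, size(x)]);      % ndd x 2 x 3
g_x(1, 1, 1) = 1;                 % direction 1 perturbs x(1,1)
g_x(2, 2, 3) = 1;                 % direction 2 perturbs x(2,3)

% For y = x.^2 the derivative in each direction is 2*x .* g_x.
factor = reshape(2 .* x, [1, size(x)]);   % 1 x 2 x 3
g_y    = bsxfun(@times, factor, g_x);     % all directions at once
```

A derivative class hides exactly this kind of bookkeeping behind overloaded operators, so that the differentiated code can keep writing expressions in terms of g_x and g_y.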

In the following list you find the possible combinations of derivative code and derivative classes.

Scalar mode
- Code produced with admTransform, mode 'F': use scalar_directderivs
- Code produced with admTransform, mode 'r': use scalar_directderivs, and adimat_adjoint('default') or any derivative class and adimat_adjoint('scalar')
Vector mode
- Code produced with admTransform, mode 'F': use arrderivclass, arrderivclassvxdd, opt_derivclass, opt_sp_derivclass or mat_derivclass
- Code produced with admTransform, mode 'r': use arrderivclass, arrderivclassvxdd, opt_derivclass, opt_sp_derivclass or mat_derivclass and adimat_adjoint('default')
- Code produced with admTransform, mode 'f': use vector_directderivs

All installed derivative classes can be found in the directory '${ADIMAT_HOME}/share/adimat/'. The desired class can be selected at runtime using the function adimat_derivclass. When using driver functions such as admDiffFor, the option derivClassName selects the derivative class. The following derivative classes are available:

arrderivclass: In this class, derivatives are stored internally in one big tensor array of size [NDD size(x)], where x is the corresponding original variable. With this class many operations are very fast, because no loop over NDD is required. On the other hand, some other operations like cat, horzcat, vertcat, mtimes and mldivide are slower than with opt_derivclass. Also, some indexing operations may not be correctly supported by this class. In this case, please provide us with an example so we can try to fix the issue. This derivative class is used by default.
arrderivclassvxdd: Like arrderivclass, but the internal array has a different layout. It is always two-dimensional, of size [prod(size(x)), NDD].
foderivclass: This derivative class is written using the new classdef construct in Matlab, so it will not work in Octave 3.6. Otherwise it is identical to arrderivclass.
opt_derivclass: Sometimes also called opt3_derivclass, because it is the third version of a derivative class based on cell arrays. This derivative class comprises the whole set of operators needed for computation with first order and second order derivative objects (gradients and Hessians). This derivative class is well maintained and mostly stable. It is suitable for programs whose derivatives are known to be full, that is, the derivative objects have less than 70% zeros. Although this class supports sparse derivative objects, the derivatives are not converted back to sparse data structures after operations that return full matrices, like mtimes.
opt_sp_derivclass: Also called opt3_sp_derivclass. Functionally identical to the opt3_derivclass. It additionally converts directional derivatives with more than 70% zero entries to sparse data structures, conserving memory and computational resources. Note that it may be slower than the opt3_derivclass if many non-zero entries are present in the directional derivatives.
mat_derivclass: Uses a matrix for storing directional derivatives instead of the cell array the opt3_derivclass uses. The complete set of operators is available, but currently only for first order derivatives (i.e. no Hessian computations are possible). Because one level of indirection is missing (no access to a cell array), this class is faster. It is sped up further for certain operations because the operation is not applied to every directional derivative successively, but to all directional derivatives at once.
scalar_directderivs: This is not really a class but only a collection of runtime functions. Most importantly it has a version of g_zeros that returns native doubles. Note that it does not have the ls_* runtime functions, so adimat should be run with option --noloopsaving to produce the code.
vector_directderivs: This is not really a class but only a collection of runtime functions. The derivative object of an m x n double object is a d x n x m double object. Currently this derivative class is only for use with code produced by admproc -f, and that code can only be run with this derivative class.
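The two array layouts named above, and the 70% zero threshold, can be sketched with plain arrays. This is an illustration under stated assumptions, not the internal code of any class: the variable names are made up, and the exact element ordering used inside arrderivclassvxdd is assumed to be Matlab's column-major order.

```matlab
x   = magic(3);
ndd = 2;

% Layout described for arrderivclass: size [NDD size(x)].
g_a = zeros([ndd, size(x)]);
g_a(1, 1, 1) = 1;                 % direction 1 perturbs x(1,1)
g_a(2, 3, 2) = 1;                 % direction 2 perturbs x(3,2)

% Layout described for arrderivclassvxdd: size [prod(size(x)), NDD],
% i.e. one column per direction (column-major ordering assumed).
g_b = reshape(g_a, ndd, numel(x)).';

% Fraction of zero entries, to compare against the 70% rule of thumb.
zero_fraction = 1 - nnz(g_b) / numel(g_b);
mostly_zero   = zero_fraction > 0.7;
```

With only two non-zero entries out of 18, this example falls on the sparse side of the threshold, which is the situation opt_sp_derivclass is described as targeting.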

The creation of derivative objects for all derivative classes is done using constructor functions. These functions create seedings that are often used. See section Constructor functions for more information on all available functions.

7.1 Structure of derivative objects

A derivative object should be regarded as a container storing directional derivatives. Derivative objects are associated to Matlab-objects, but do not store references to them. The association is by name only.

A derivative object stores a number of objects in it that have the same shape as the associated Matlab-object. E.g., the derivative object g_t associated to a 3×3-matrix t stores a number of 3×3-matrices. The number of 3×3-matrices stored in the derivative object is defined by the number of directional derivatives of interest in the program.

Derivative objects may be one- or two-dimensional. One-dimensional derivative objects are called gradients or Jacobians depending on the context, while two-dimensional derivative objects are called Hessians.

The data within a derivative object is accessed using the standard Matlab cellarray assignment and indexing operators. This is independent of the implemented storage model. The names of the actual Matlab classes vary. For example, the name of the Matlab-class of the opt_derivclass is adderiv, that of the opt_sp_derivclass is adderivsp, and that of the mat_derivclass is madderiv. These names occur in the list displayed by Matlab's 'whos' command if derivative objects are present in the current workspace. Conventionally only one kind should occur. Intermixing them is not supported and may need manual conversion if desired.

7.2 Creation of derivative objects

Derivative objects are created using constructor functions. 'createZeroGradients()' is one of them. The function is able to initialise several derivative objects at once and may be called several times. If the function is called several times, the number of directional derivatives has to be the same in each call. Additional constructor functions exist, see section Constructor functions.

Function:


[g_v1, g_v2, ..., g_vn]= createZeroGradients(ndd, v1, v2, ..., vn);


g_v1= createZeroGradients(ndd, v1);
g_v2= createZeroGradients(ndd, v2);
         ...
g_vn= createZeroGradients(ndd, vn);

Short description:

Initialise one or more derivative objects.

Description:

This function initialises one or more derivative objects. The number of directional derivatives created per derivative object is given by the parameter 'ndd'. When using the vectorised call (the upper one), the user has to make sure that the order of the variables 'vi' matches the order of the corresponding derivative objects. There is no way to ensure this automatically or to check for a proper order. The derivative object 'g_vi' of the variable 'vi' stores 'ndd' copies of the variable 'vi', but with all entries set to zero. That is why this function is named createZeroGradients(). All derivative objects have to store the same number of directional derivatives. It is therefore advised to use the vectorised call, which ensures that all derivative objects have the same number. It may become possible to change the number of directional derivatives in a future version of ADiMat, but up to now this is not supported. Messing around with the number of directional derivatives during one run of the differentiated program is done at your own risk, so do not complain about wrong derivatives.

Examples:

The lighthouse example, which uses scalars only:


n= 10; % (m)
g= 0.375* pi; % (radians)
o= 0.0001* pi; % (radians)
t= 2; % (s)
[g_n, g_g, g_o, g_t]= createZeroGradients(4, n, g, o, t);

The derivative objects are all initialised to zero now. The contents of 'g_n' is:


>> g_n
adderiv: number of directional derivatives: 4
     0
     0
     0
     0

Each line containing a zero shows one directional derivative. The example above is too simple to see the effect, therefore a more complex one is introduced here. Suppose a row-vector 'v' containing five floating-point numbers and a scalar 's' are the independent variables with respect to which a function is differentiated. The derivatives of interest are those with respect to the first three entries of the vector and the scalar. That is, four directional derivatives are needed. The call to the constructor function is given by:


>> v= [1, 2, 3, 4, 5]; % Same as 1:5
>> s= 42;
>> [g_v, g_s]= createZeroGradients(4, v, s);
>> g_v
adderiv: number of directional derivatives: 4
     0 0 0 0 0
     0 0 0 0 0
     0 0 0 0 0
     0 0 0 0 0
>>

The output of the gradient object 'g_v' is shown above. There are four row-vectors, each containing five zeros. This is because the original vector has five entries and four directional derivatives are of interest. The output of 'g_s' is identical to the output of 'g_n' shown in the lighthouse example above.

7.3 Accessing the derivative data

Each derivative object is an object of a Matlab-class provided by ADiMat. The class overloads several operators and (re-)implements some functions. The cellarray-indexing-operator '{n}' accesses single, multiple or all directional derivatives in a derivative object. Additionally a 'get()' method is implemented, which basically does the same job. The advantage of the 'get()' method is that access to all directional derivatives is implemented efficiently. The drawback of the get() method is that it implements only a restricted set of addressing modes. I.e., it is not possible to select the third derivative of an object and get its (2,3)-th element in one statement.

Operator:


g_v{n}= ...;


t= g_v{n};

Short description:

Write or read data of the n-th directional derivative of an object 'g_v'.

Description:

The cellarray-indexing-operator is used either in assigning mode, to assign data to a directional derivative, or in referencing mode, to read it. If the expression 'g_v{n}' appears on the left-hand side of an assignment, the cellarray-indexing-operator is in assigning mode. If the expression occurs on the right-hand side of an assignment or in no assignment at all, it is treated as being in referencing mode. Indexing operators may be concatenated. Suppose that v is a higher-dimensional object, a vector for example. The expression 'g_v{i}(j)' accesses the j-th element of the i-th directional derivative of the object 'g_v'. This is possible in assigning and in referencing mode.

Examples:

The cellarray-indexing-operators are often used to do the seeding or to look at one directional derivative. The example presented with the createZeroGradients()-function is repeated here to show one possible seeding to get the desired derivatives. Remember that the derivatives of interest are those with respect to the first three entries of the vector v and the scalar s. Create the derivative objects first:


>> v= [1, 2, 3, 4, 5]; % Same as 1:5
>> s= 42;
>> [g_v, g_s]= createZeroGradients(4, v, s);

The seeding is done by inserting ones at the desired positions of the derivative objects:


>> g_v{1}(1)= 1;
>> g_v{2}(2)= 1;
>> g_v{3}(3)= 1;
>> g_s{4}= 1;
>> g_v
adderiv: number of directional derivatives: 4
     1 0 0 0 0
     0 1 0 0 0
     0 0 1 0 0
     0 0 0 0 0
>> g_s
adderiv: number of directional derivatives: 4
     0
     0
     0
     1
>>
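What this seeding computes can be sketched without ADiMat. In the following illustration plain cell arrays stand in for the derivative objects (they accept the same '{n}(j)' indexing), and the test function z = s*sum(v.^2) with its hand-written derivative is an assumption made for this example only.

```matlab
v = 1:5;  s = 42;  ndd = 4;

% Plain cell arrays as stand-ins for derivative objects (not adderiv).
g_v = repmat({zeros(size(v))}, 1, ndd);
g_s = repmat({zeros(size(s))}, 1, ndd);

% Same seeding as above: directions 1..3 pick v(1..3), direction 4 picks s.
g_v{1}(1) = 1;
g_v{2}(2) = 1;
g_v{3}(3) = 1;
g_s{4}    = 1;

% Hand-written derivative of z = s * sum(v.^2), one direction at a time:
% g_z = g_s * sum(v.^2) + s * sum(2 .* v .* g_v)
g_z = zeros(1, ndd);
for d = 1:ndd
    g_z(d) = g_s{d} * sum(v.^2) + s * sum(2 .* v .* g_v{d});
end
```

Each entry of g_z is the partial derivative of z with respect to the input selected by the corresponding seed direction: dz/dv(1), dz/dv(2), dz/dv(3) and dz/ds.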

Function:


r_i= get(g_v, index);


r_all= get(g_v, 'direct');


opt= get(g_v, optionname);

Short description:

Get some or all directional derivatives of a derivative object or get values of options.

Description:

The get()-method gets a single directional derivative, several directional derivatives, or all of them. Depending on the underlying derivative class, using get() to extract all directional derivatives from a derivative object is more efficient, because the get()-method does not truncate the directional derivatives when extracting them, but returns them in a matrix.

To extract all directional derivatives from a derivative object g_v it is strongly advised to use r_all= get(g_v, 'direct'). This special form merely copies the internal matrix of the directional derivatives to the result matrix r_all. The directional derivatives are concatenated horizontally if the derivative object is one-dimensional, i.e. first order derivatives are computed, and are stored matrix-like if second order derivatives are computed. Finally, the get()-method enables reading of internally stored options. To find out which options are supported by the derivative class, look at the help text of get. Make sure that you precede the name get with the correct name of the Matlab-class you are using. That is, if you use the opt_derivclass then the command help adderiv/get shows the correct help text. More information on the options system is available in the section The options system of the Matlab-class.

Examples:

The example settings used in the example of the operator '{n}' are reused here. I.e., the derivative objects g_v and g_s are assumed to exist.


>> get(g_v, 1)
ans =
    1 0 0 0 0
>> get(g_v, 'direct')
ans =
  Columns 1 through 13
     1 0 0 0 0 0 1 0 0 0 0 0 1
  Columns 14 through 20
     0 0 0 0 0 0 0
>>

These two examples show the fetch of the first directional derivative of g_v and the fetch of all directional derivatives of g_v.
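The horizontally concatenated result can be turned into a matrix with one row per direction. The sketch below rebuilds the same 1x20 row vector from plain cell arrays (a stand-in for the derivative object, not a call to ADiMat's get()) and then reshapes it; the names r_all and J are chosen for illustration.

```matlab
ndd = 4;
g_v = repmat({zeros(1, 5)}, 1, ndd);   % stand-in for the seeded g_v
g_v{1}(1) = 1;
g_v{2}(2) = 1;
g_v{3}(3) = 1;

% Horizontal concatenation of all directions, as in the output above.
r_all = [g_v{:}];                      % 1 x 20

% One row per directional derivative, one column per entry of v.
J = reshape(r_all, 5, ndd).';          % 4 x 5
```

Here rows 1 to 3 of J carry the unit seeds for v(1), v(2) and v(3), and row 4 is zero because the fourth direction was used to seed s.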

7.4 Constructor functions `create...()`

ADiMat provides some functions to easily create one or more derivative objects. These so-called constructor functions create derivative objects with all elements set to zero, with the diagonal of all objects set to one, or seeded for the full Jacobian.

Function:


[g_v1, g_v2, ..., g_vn]= createZeroGradients(ndd, v1, v2, ..., vn);


g_v1= createZeroGradients(ndd, v1);
g_v2= createZeroGradients(ndd, v2);
         ...
g_vn= createZeroGradients(ndd, vn);

Short description:

Initialise one or more derivative objects with zeros.

Description:

This function initialises one or more derivative objects. The number of directional derivatives created per derivative object is given by the parameter 'ndd'. When using the vectorised call (the upper one), the user has to make sure that the order of the variables 'vi' matches the order of the corresponding derivative objects. There is no way to ensure this automatically or to check for a proper order. The derivative object 'g_vi' of the variable 'vi' stores 'ndd' copies of the variable 'vi', but with all entries set to zero. That is why this function is named createZeroGradients(). All derivative objects have to store the same number of directional derivatives. It is therefore advised to use the vectorised call, which ensures that all derivative objects have the same number. It may become possible to change the number of directional derivatives in a future version of ADiMat, but up to now this is not supported. Messing around with the number of directional derivatives during one run of the differentiated program is done at your own risk, so do not complain about wrong derivatives.

Examples:

See the createZeroGradients() examples in section 7.2, Creation of derivative objects.

Function:


[g_v1, g_v2, ..., g_vn]= createFullGradients(v1, v2, ..., vn);

Short description:

Create full Jacobian for all vi.

Description:

Creates derivative objects for all vi. The number of directional derivatives stored in each g_vi is the sum of the products of the sizes of all vi, or in pseudo-Matlab: ndd=sum(prod(size(vi))) for i=1:n. The seeding is done such that the derivatives are computed with respect to each input element. The function is restricted to arrays as inputs, i.e. structures and cellarrays are rejected. This function can be called only once in a program, or again after resetting the ADoptions (see clearADoptions).

Example:


>> t=magic(3);
>> g_t=createFullGradients(t)
adderiv: number of directional derivatives: 9
     1 0 0
     0 0 0
     0 0 0
     0 1 0
     0 0 0
     0 0 0
     0 0 1
     0 0 0
     0 0 0
  ... and so on ...
     0 0 0
     0 0 0
     0 1 0
     0 0 0
     0 0 0
     0 0 1
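The count of directional derivatives for several inputs can be worked out in plain Matlab. This sketch does not call createFullGradients(); the three example inputs are made up, and the assignment of contiguous direction ranges to the inputs in argument order is an assumption made for illustration.

```matlab
vars  = {magic(3), 1:5, 42};      % three example inputs v1, v2, v3
sizes = cellfun(@numel, vars);    % numel(v) equals prod(size(v))
ndd   = sum(sizes);               % 9 + 5 + 1 directions in total

% Assumption: the combined seed is the identity, and each input owns a
% contiguous range of direction indices in argument order.
seed     = eye(ndd);
last     = cumsum(sizes);
first    = last - sizes + 1;
block_v2 = seed(first(2):last(2), :);   % rows belonging to v2
```

Every derivative object then stores all ndd directions, but only the directions in an input's own range are non-zero for that input.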

Function:


[g_v1, g_v2, ..., g_vn]= createEyeGradients(v1, v2, ..., vn);

Short description:

Create derivatives with the diagonal elements set to one.

Description:

Creates derivative objects for all vi. The number of directional derivatives is the sum of the minima of the sizes of all vi, or in pseudo-Matlab: ndd=sum(min(size(vi))) for all i=1:n. This function can be applied only once per program, or again after resetting the ADoptions (see clearADoptions).

Example:


>> t=magic(3);
>> g_t=createEyeGradients(t)
adderiv: number of directional derivatives: 3
     1 0 0
     0 0 0
     0 0 0
     0 0 0
     0 1 0
     0 0 0
     0 0 0
     0 0 0
     0 0 1
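The seeding shown in this output can be reproduced with plain cell arrays. The following is a stand-in for a single input only, not a call to createEyeGradients(): direction k carries a single one at position (k,k).

```matlab
t   = magic(3);
ndd = min(size(t));               % 3 directions for a 3x3 matrix

g_t = cell(1, ndd);               % plain cell array, not an adderiv object
for k = 1:ndd
    g_t{k} = zeros(size(t));
    g_t{k}(k, k) = 1;             % k-th direction seeds the k-th diagonal entry
end

all_seeds = g_t{1} + g_t{2} + g_t{3};   % sums up to the identity matrix
```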

Function:


[h_v1, h_v2, ..., h_vn]= createHessians([ndd ndd], v1, v2, ..., vn);


[h_v1, h_v2, ..., h_vn]= createHessians([], v1, v2, ..., vn);

Short description:

Create Hessians.

Description:

Creates two-dimensional derivative objects for all vi. The number of directional derivatives stored in each h_vi is specified by [ndd ndd] or, if the empty matrix is supplied, taken from the options system (see getOption(...)). Each h_vi is ensured to be two-dimensional. Each h_vi contains sparse objects with all elements set to zero.

Example:


>> t=magic(3);
>> h_t= createHessians([], t)
adderiv: total number of directional derivatives: 3x3
  (1,:)
  (2,:)
  (3,:)

Functions:


g_v= g_zeros(size(v));
h_v= h_zeros(size(v));

Short description:

Create one/two-dimensional derivative objects with all elements set to zero.

Description:

These functions create zero-objects. They are for internal use only. Essentially, the functions give the derivative of Matlab's matrix constructor functions zeros(), ones(), eye(), and so on. Calls to these functions are inserted wherever no derivative can be computed. This may be the case for assignments of constant arrays (scalars). Direct use of these functions is not recommended, because the functions may change without further notice!

7.5 The options system of the Matlab-class

The options system implements a way to invisibly store global information needed by the process of computing derivatives. Traditionally, a flag used by Matlab's toolboxes is stored in the global workspace, where it is lost after a call to clear all. The information maintained by the options system survives, because it is stored as persistent data in a private member-function. (If you did not understand the previous sentence, do not worry about it; it is developer language.)

Function:


val=get(g_t,optionname);

Short description:

Get the value of the option specified by optionname, or reset all settable options.

Description:

Get the value of a specific option. Use help adderiv/get to get a list of available options. The object supplied as g_t needs to be an object of the current derivative class. A multi-purpose object is available by supplying g_dummy for g_t. Some options are local to an object, some are global. Whether an option is local or global is documented in the help text of the get()-method. A special option is 'ClearAll'. This option resets all settable options in the options system to their default values. This may be needed if a program has to change the number of directional derivatives that are stored in a derivative object by default. Although computations on derivative objects storing different numbers of directional derivatives are not supported by the derivative class, in some cases the number of directional derivatives needs to be reset, for instance if another program with a different number of directional derivatives is to be executed.

Example:


>> get(g_dummy, 'NumberOfDirectionalDerivatives')
ans =
      3
>> ver=get(g_dummy, 'Version')
ver =
    0.5000
>>

Function:


set(...)

Short description:

Exists, but hands off. Internal use, only.

Description:

This function is intentionally undocumented. It is for internal use only. Messing around with it will cause unexpected behaviour.

7.6 Matlab's cellarrays/structures and the derivative class

Matlab implements two datatypes that get special treatment by the derivative class. The first one is the cellarray and the second one is the structure.

cellarray: { }

Cellarrays are able to store objects of different types. I.e. a cellarray may store a string, an array, and a scalar. The cellarray is organized like a standard Matlab-array, i.e. it is indexable. In conjunction with the derivative class: A cellarray can never be stored within a derivative object, but a derivative object can be stored within a cellarray. In fact, there is no need to modify codes containing cellarrays when they are to be differentiated by ADiMat. The source transformation component ensures correct treatment of the cellarray and the derivative objects. The only issue to take care of is the access to the data. The derivative object is in the cellarray, i.e. the first index accesses the derivative object, the second the data within it. Example: Let a, b, and c be active variables and ca={a,b,c} be the cellarray combining them into a vector. The derivative expression for ca is: g_ca={g_a,g_b,g_c}. The expression g_ca{1} accesses the first derivative object in the vector, namely g_a. To access the second directional derivative of the first item of g_ca, the expression g_ca{1}{2} has to be used.
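The two-level indexing can be tried out with nested plain cell arrays. This is only a stand-in: in a real run g_a, g_b and g_c would be derivative objects, and the values used here are made up.

```matlab
% Stand-ins for derivative objects holding two directional derivatives each.
g_a = {[1 0], [0 1]};
g_b = {0, 0};
g_c = {0, 0};

g_ca = {g_a, g_b, g_c};        % derivative of ca = {a, b, c}

first_item   = g_ca{1};        % first index: selects g_a
second_dir_a = g_ca{1}{2};     % second index: selects its 2nd direction
```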

structures: struct.field

Structures enable the storage of distinct data in a hierarchical way. The source transformation component of ADiMat ensures that the base object, i.e. the variable storing the structure, is a derivative object. This has to be taken into account when creating structures that are active and whose derivative has to be created. At first a dummy derivative object has to be created and then the fields have to be inserted. In this way a structure is stored within a derivative object. The other way around is not allowed: a derivative object is not to be stored within a structure. This is enforced because the activity analysis of the source transformation component only takes the variable containing the structure into account; the fields are of no interest. Derivative objects of structures may be created using the constructor functions createZeroGradients() for gradients and Jacobians and createHessians() for Hessians. The constructor functions createFullGradients() and createEyeGradients() cannot be used to create a derivative object for a structure. Example: str.field1=[1, 2]; str.field2=42; creates a simple structure. The constructor call g_str= createZeroGradients(3, str); creates a suitable derivative object:


>> g_str
adderiv: number of directional derivatives: 3
    field1: [0 0]
    field2: 0
    field1: [0 0]
    field2: 0
    field1: [0 0]
    field2: 0
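The shape of this result can be mimicked in plain Matlab. The sketch below does not use createZeroGradients(); it builds a cell array of zeroed copies of the structure as a stand-in for the derivative object, and the names zero_str and g_str_plain are chosen for illustration.

```matlab
str.field1 = [1, 2];
str.field2 = 42;
ndd = 3;

% Zero every field while keeping its size and the field names.
zero_str = structfun(@(f) zeros(size(f)), str, 'UniformOutput', false);

% One zeroed structure per directional derivative (plain cell array).
g_str_plain = repmat({zero_str}, 1, ndd);
```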
