Contents
Hybrid mode differentiation example
In this example we will differentiate a scalar function in a hybrid way, using both FM and RM AD. Since this is not supported directly by ADiMat, we have to split the function into two parts manually and apply ADiMat to each, and then combine the results to get the complete derivative.
The function is scalar, based on the Dirichlet energy function, and given in f.m:
type f
function E = f(v, ne) n = length(v); E = 0; for i = 1:n for j = 1:4 E = E + norm(v(i)-v(ne(i,j))).^2; end end
First we create the input arguments. v is a vector and ne is an array describing neighbour relations. Each v(i) has four neighbours:
n = 500; v = rand(n, 1); ne = zeros(n, 4); for i=1:4 ne(:,i) = circshift((1:n)', -i.^2); end
Next, we choose our FM differentiation function. The reason is the following: The function f performs only scalar computations, and in this setting admDiffVFor excells. So in this example we use admDiffFor, as it is more comparable to what admDiffRev does. admDiffVFor is so much faster that we would have to significantly increase the problem size n to see a difference between compression and no compression. But the reader is invited to try this demo with admDiffVFor as well.
ForDiff = @admDiffFor;
% ForDiff = @admDiffVFor;
Running the function f
We run the function first:
tic
E = f(v, ne);
tF = toc;
fprintf(1, 'f took %g s\n', tF);
f took 0.002572 s
Differentiation of the function f
Now we plainly differentiate the function in both FM and RM
adopts = admOptions('i', 1); adopts.functionResults = {E}; [Jf, zf, tf] = ForDiff(@f, 1, v, ne, adopts); fprintf(1, '%s took %g s (Factor: %g)\n', func2str(ForDiff), tf.evaluation, tf.evaluation ./ tF); [Jr, zr, tr] = admDiffRev(@f, 1, v, ne, adopts); fprintf(1, 'admDiffRev took %g s (Factor: %g)\n', tr.evaluation, tr.evaluation ./ tF); assert(norm(Jr - Jf) ./ norm(Jr) < 1e-14);
admDiffFor took 131.234 s (Factor: 51024) admDiffRev took 8.15145 s (Factor: 3169.31)
Splitting the function into two
Now we split the function f into two functions g and h such that h(g(.)) == f(.).
type g type h
function t = g(v, ne) n = length(v); t = zeros(length(v), 4); for i = 1:n for j = 1:4 t(i,j) = norm(v(i)-v(ne(i,j))).^2; end end function E = h(t) E = sum(t(:));
The idea here is that g is going to have a sparse Jacobian, so we can apply sparsity exploitation (compression) techniques to its computation. For that we need the sparsity pattern, which we first compute from the full Jacobian of g. This may be dangerous in the real world, but this is just an example here.
adopts1 = admOptions('i', 1); [Jg] = admDiffVFor(@g, 1, v, ne, adopts1); adopts1.JPattern = Jg ~= 0; spy(adopts1.JPattern) title('The non-zero pattern of the Jacobian of g');
Now we compute the derivative of g in FM with compression. We also use strip-mining (scalar mode) since the number of directional derivatives is so small that we can avoid the overhead of the derivative classes:
adopts1.derivClassName = 'scalar_directderivs'; [Jg, t, tf] = ForDiff(@g, @cpr, v, ne, adopts1); fprintf(1, '%s of g took %g s\n', func2str(ForDiff), tf.evaluation);
Colored the pattern with 3 colors Compressed seed matrix: 500x3 admDiffFor of g took 5.11895 s
And we compute the derivative of h in RM:
adopts2 = admOptions();
adopts2.functionResults = {E};
[Jh, Er, tr] = admDiffRev(@h, 1, t, adopts2);
fprintf(1, 'Rev of h took %g s\n', tr.evaluation);
Rev of h took 0.00647753 s
For the final result we have to multiply both derivatives together as mandated by the chain rule:
tic grad2 = Jh * Jg; tM = toc;
So let's see what the whole endeavour gives us in terms of performance:
tt = tf.evaluation + tr.evaluation + tM;
fprintf(1, 'Hybrid differentiation took %g s (Factor: %g)\n', tt, tt ./ tF);
Hybrid differentiation took 5.12548 s (Factor: 1992.8)
And also check that the results are correct:
assert((E - Er) ./ E < 1e-14); assert(norm(grad2 - Jr) ./ norm(grad2) < 1e-14);
As a result we can state that in this example computing three directional derivatives in FM in three individual runs (scalar mode) is faster than a single run of scalar RM.
Getting some real speed
Just for the sake of completeness, let's also try if we can really get this example to speed. The way forward is to use vector operations in order to cut down on the number of statements that have to be executed for each number that we compute. For example, instead of looping over the neighbours, we can compute the distance to all of them at once for each v(i). This change results in the function f2:
type f2
function E = f2(v, ne) n = length(v); E = 0; for i = 1:n E = E + norm(v(i)-v(ne(i,:))).^2; end
Differentiating this function should perform better already, since the function itself is more Matlab-friendly:
tic E = f2(v, ne); tF = toc; fprintf(1, 'f3 took %g s\n', tF); [Jf2, zf2, tf] = ForDiff(@f2, 1, v, ne, adopts); fprintf(1, '%s took %g s (Factor: %g)\n', func2str(ForDiff), tf.evaluation, tf.evaluation ./ tF); [Jr2, zr2, tr] = admDiffRev(@f2, 1, v, ne, adopts); fprintf(1, 'admDiffRev took %g s (Factor: %g)\n', tr.evaluation, tr.evaluation ./ tF); assert(norm(Jr2 - Jf) ./ norm(Jr) < 1e-14); assert(norm(Jf2 - Jf) ./ norm(Jr) < 1e-14);
f3 took 0.002117 s admDiffFor took 35.3272 s (Factor: 16687.4) admDiffRev took 2.08399 s (Factor: 984.405)
The same idea can also be done the other way around, computing the distance of all v(i) to their first neighbour. In this way we are left with a for loop with just 4 iterations. This leads to function f3:
type f3
function E = f3(v, ne) E = 0; for i=1:4 E = E + norm(v - v(ne(:,i))).^2; end
Differentiating f3 should now be really fast. Note that in this last example admDiffRev will be faster than admDiffVFor even for small n.
tic E = f3(v, ne); tF = toc; fprintf(1, 'f3 took %g s\n', tF); [Jf3, zf3, tf] = ForDiff(@f3, 1, v, ne, adopts); fprintf(1, '%s took %g s (Factor: %g)\n', func2str(ForDiff), tf.evaluation, tf.evaluation ./ tF); [Jr3, zr3, tr] = admDiffRev(@f3, 1, v, ne, adopts); fprintf(1, 'admDiffRev took %g s (Factor: %g)\n', tr.evaluation, tr.evaluation ./ tF); assert(norm(Jr3 - Jf) ./ norm(Jr) < 1e-14); assert(norm(Jf3 - Jf) ./ norm(Jr) < 1e-14);
f3 took 0.001082 s admDiffFor took 0.449334 s (Factor: 415.281) admDiffRev took 0.0224199 s (Factor: 20.7208)
Published with ADiMat version
adimat_version(2)
ans = ADiMat 0.5.9-3713:3723M