User guide
Definition of the objective function and gradient functions
A function must be defined that receives the current point as a vector:
julia> f(x) = x[1]^2 + (x[2]-2)^2
The gradient function must receive the vector of variables and a vector that will be modified to contain the gradient at the current point:
julia> function g!(g,x)
           g[1] = 2*x[1]
           g[2] = 2*(x[2]-2)
       end
By Julia convention, the ! is appended to the name of the gradient function to indicate that it modifies its argument g, although this does not affect its behavior in any way.
It is also possible to provide a single function that computes both the function value and the gradient, as explained in the Options menu.
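Such a combined function typically fills the gradient vector in place and returns the function value. The following is only a sketch of the pattern, for the example above; the exact way of passing it to the solver is described in the Options section:

```julia
# Sketch of a combined value-and-gradient function: fills g in place
# and returns the function value. (Pattern only; see the Options
# section for how to pass this function to the solver.)
function fg!(g, x)
    g[1] = 2 * x[1]
    g[2] = 2 * (x[2] - 2)
    return x[1]^2 + (x[2] - 2)^2
end

g = zeros(2)
fval = fg!(g, [1.0, 0.0])  # fval == 5.0, g == [2.0, -4.0]
```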
Calling the solver, without bounds
The solver function is spgbox!, which mutates the input vector x (on output, it contains the best solution found). Alternatively, use spgbox to keep the input vector unchanged (a copy of the vector is created internally). The minimal calling syntax of the solver is:
julia> using SPGBox
julia> x = rand(2);
julia> R = spgbox(f,g!,x)
The results will be returned in the data structure R, of type SPGBoxResult, and will be displayed as:
julia> R = spgbox(f,g!,x)
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 0.0
Sample of best point = Vector{Float64}[ 0.0, 2.0]
Projected gradient norm = 0.0
Number of iterations = 3
Number of function evaluations = 3
Calling the solver, with box bounds
Box bounds can be provided by the optional keywords lower
and upper
for lower and upper bounds, respectively. These are vectors that will delimit the bounds for each variable. For example, assuming the same function and gradient functions defined in the example above, a lower bound will be set for the second variable:
julia> using SPGBox
julia> x = rand(2);
julia> R = spgbox(f,g!,x,lower=[-Inf,5])
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 9.0
Sample of best point = Vector{Float64}[ 0.0, 5.0]
Projected gradient norm = 0.0
Number of iterations = 3
Number of function evaluations = 3
Upper bounds can be set similarly with, for example, upper=[+Inf,-5].
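For problems with many variables, the bound vectors can be built programmatically; the snippet below is plain Julia, independent of the solver:

```julia
n = 100
# Start with all variables bounded below by zero and unbounded above
lower = fill(0.0, n)
upper = fill(+Inf, n)
# Then adjust the bounds of individual variables as needed
lower[2] = 5.0
```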
Lower and upper bounds can also be set simultaneously:
julia> R = spgbox!(f,g!,x,lower=[-Inf,5],upper=[-2,+Inf])
Result data structure and possible outcomes
The minimization can result in a successful convergence, or it can stop for exceeding the maximum number of iterations or function evaluations. The outcome is reported in the second line of the printed output and stored in the result structure, which contains the following fields:
struct SPGBoxResult
    x :: Vector{Float64}
    f :: Float64
    gnorm :: Float64
    nit :: Int64
    nfeval :: Int64
    ierr :: Int64
    return_from_callback :: Bool
end
The data can be accessed as usual, using, for example:
julia> R.f
12.0
The data structure contains:
Variable name | Meaning |
---|---|
x | Best point found (the solution, if ierr=0). |
f | Best function value found. |
gnorm | Norm of the gradient projected on the constraints. |
nit | Number of iterations performed. |
nfeval | Number of function evaluations. |
ierr | Exit status. |
return_from_callback | Boolean indicating whether the return was triggered by the callback function. |
The possible outcomes of ierr are:
Outcome | Meaning |
---|---|
ierr=0 | Success: convergence achieved. |
ierr=1 | Maximum number of iterations achieved. |
ierr=2 | Maximum number of function evaluations achieved. |
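In a script, it is natural to branch on this exit status after the call. In the sketch below, a NamedTuple stands in for the SPGBoxResult returned by the solver (an assumption made only so the snippet runs without the package):

```julia
# Stand-in for the SPGBoxResult returned by the solver; in practice
# this would be R = spgbox(f, g!, x)
R = (x = [0.0, 5.0], f = 9.0, ierr = 0)

# Map the exit status to a human-readable message
status = if R.ierr == 0
    "converged"
elseif R.ierr == 1
    "maximum number of iterations reached"
else
    "maximum number of function evaluations reached"
end
```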
The convergence criteria can be adjusted using optional keywords, as described in the Options section.
Optional callback function
It is possible to pass an optional callback function to the spgbox and spgbox! functions. The callback function must:

- Receive as argument a SPGBoxResult data structure.
- Return true or false, to indicate whether the algorithm must return immediately or continue.
For example, here we stop the optimization when the sum of the two variables becomes smaller than 5.0.
julia> using SPGBox
julia> f(x) = x[1]^4 + (x[2] - 1)^4
f (generic function with 1 method)
julia> function g!(g, x)
           g[1] = 4 * x[1]^3
           g[2] = 4 * (x[2] - 1)^3
       end
g! (generic function with 1 method)
julia> x = [10.0, 18.0];
julia> my_callback(R::SPGBoxResult) = R.x[1] + R.x[2] < 5.0
my_callback (generic function with 1 method)
julia> R = spgbox!(f, g!, x; callback = my_callback)
SPGBOX RESULT:
Returned from callback function.
Final objective function value = 11.341529752085066
Sample of best point = Vector{Float64}[ 1.5429284794371168, 2.543387528602126]
Projected gradient norm = 14.705674573954493
Number of iterations = 10
Number of function evaluations = 10
Input data types
The implementation of SPGBox is generic for inputs of any abstract matrix or vector type. Propagation of units and of numbers with arbitrary precision is supported. As a minimal example illustrating these features, let us minimize a function where the input variables are given as a matrix of BigFloats, with units:
julia> function f(x)
           f = zero(typeof(x[begin]^2))
           for i in eachindex(x)
               f += (x[i] - i*oneunit(eltype(x)))^2
           end
           return f
       end
f (generic function with 1 method)
julia> function g!(g,x)
           g .= zero(eltype(x))
           for i in eachindex(x)
               g[i] = 2*(x[i] - i*oneunit(eltype(x)))
           end
           return g
       end
g! (generic function with 1 method)
julia> using Unitful

julia> x = rand(BigFloat,2,2)u"nm"
2×2 Matrix{Quantity{BigFloat, 𝐋, Unitful.FreeUnits{(nm,), 𝐋, nothing}}}:
 0.128083 nm  0.817173 nm
 0.139545 nm  0.391047 nm
julia> spgbox(f,g!,x)
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 0.0 nm^2
Sample of best point = Matrix{Quantity{BigFloat, 𝐋, Unitful.FreeUnits{(nm,), 𝐋, nothing}}}[ 1.0 nm, 2.0 nm, 3.0 nm, 4.0 nm]
Projected gradient norm = 0.0 nm
Number of iterations = 2
Number of function evaluations = 3
Data-dependent function evaluation
If the function requires additional parameters, two strategies are possible while preserving performance: 1) Declare the parameters as constants and define an extra method, or 2) Pass the function as an anonymous closure.
Constant parameters and new function and gradient methods
The solver requires a function with a single argument, x, and a gradient function with two arguments, g and x. If the function and gradient evaluations require more parameters, use, for example:
julia> f(x,a,b,c) = a*x[1]^2 + (x[2]-b)^2 + c
julia> const a = 5. ; const b = 2. ; const c = 3. ;
julia> f(x) = f(x,a,b,c)
To preserve performance, it is fundamental to declare the parameters (in this case a, b, and c) as constants, using const, to guarantee their type stability. This allows the function specializations and compiler optimizations that make Julia fast.
The gradient function will be defined accordingly:
julia> function g!(g,x,a,b)
g[1] = 2*a*x[1]
g[2] = 2*(x[2]-b)
end
julia> g!(g,x) = g!(g,x,a,b)
The method that receives only the current point x, and the gradient method that receives only g and x, are the ones actually invoked by the solver.
Using anonymous closures
An anonymous closure is a function with a special syntax of the form x -> f(x), which should be read as "given x, return f(x)". These anonymous functions can be passed directly as arguments to the solver, while still giving access to external parameters. Considering the same function and gradient defined above, one uses anonymous functions directly as arguments in the solver call:
julia> R = spgbox!(x -> f(x,a,b,c), (g,x) -> g!(g,x,a,b), x)
where the first argument, x -> f(x,a,b,c), indicates that the objective function is an anonymous function that, given x, returns f(x,a,b,c). The gradient is computed by an anonymous function that, given (g,x), returns g!(g,x,a,b). This syntax also preserves performance and does not require the parameters to be declared as constants.
Using automatic differentiation (AD)
Julia provides various packages for automatic differentiation, which can be used in combination with SPGBox
. Here, we illustrate the use of ReverseDiff. The only two points that must be taken into consideration are: 1) The AD function must modify an existing gradient vector and 2) use anonymous closures to provide the gradient calculation function to the solver.
Here is a simple example, in which we use ReverseDiff.gradient! to compute the gradient of a function that is the sum of the squares of the variables:
julia> using SPGBox, ReverseDiff
julia> function f(x)
           f = 0.
           for i in eachindex(x)
               f += x[i]^2
           end
           f
       end
julia> x = rand(2);
julia> spgbox!(f, (g,x) -> ReverseDiff.gradient!(g,f,x), x, lower=[-Inf,2.])
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 4.0
Sample of best point = Vector{Float64}[ 0.0, 2.0]
Projected gradient norm = 0.0
Number of iterations = 3
Number of function evaluations = 3