User guide
Definition of the objective function and gradient functions
A function must be defined that receives the current point as a vector:
julia> f(x) = x[1]^2 + (x[2]-2)^2
The gradient function must receive as arguments the vector of variables and a vector that will be modified to contain the gradient at the current point:
julia> function g!(g,x)
g[1] = 2*x[1]
g[2] = 2*(x[2]-2)
end
By Julia convention, to indicate that the gradient function modifies the vector `g`, we add the `!` to its name, although this does not affect its behavior in any way.
It is also possible to provide a single function that computes both the function value and the gradient, as explained in the Options section and sketched below.
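For the example above, such a combined function could look like the following minimal sketch (the `fg!(g,x)` form, which updates the gradient vector and returns the function value, is documented in the reference below):
julia> function fg!(g,x)
           # update the gradient in place
           g[1] = 2*x[1]
           g[2] = 2*(x[2]-2)
           # and return the function value
           return x[1]^2 + (x[2]-2)^2
       end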
Calling the solver, without bounds
The solver function is `spgbox!`, which mutates the input vector `x` (on output, it contains the best solution found). Alternatively, use `spgbox` to call the solver without modifying the input vector (although a copy of this vector will be created).
The solver has a minimal calling syntax of:
julia> using SPGBox
julia> x = rand(2);
julia> R = spgbox(f,g!,x)
The results are returned in the data structure `R` of type `SPGBoxResult`, and are displayed as:
julia> R = spgbox(f,g!,x)
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 0.0
Sample of best point = Vector{Float64}[ 0.0, 2.0]
Projected gradient norm = 0.0
Number of iterations = 3
Number of function evaluations = 3
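The difference between the two variants can be seen by inspecting `x` after the call; a minimal sketch, reusing the `f` and `g!` defined above (outputs omitted):
julia> x = rand(2);
julia> spgbox!(f,g!,x);
julia> x   # x now holds the best point found, here approximately [0.0, 2.0]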
Reference documentation:
SPGBox.spgbox — Function

spgbox(f, g!, x::AbstractVecOrMat; lower=..., upper=..., options...)

See `spgbox!` for additional help.
Minimizes function `f` starting from initial point `x`, given the function to compute the gradient, `g!`. `f` must be of the form `f(x)`, and `g!` of the form `g!(g,x)`, where `g` is the gradient vector to be modified. Optional lower and upper box bounds can be provided using the keyword arguments `lower` and `upper`.
spgbox(fg!, x::AbstractVecOrMat; lower=..., upper=..., options...)

Given a single function `fg!(g,x)` that updates a gradient vector `g` and returns the function value, minimizes the function.
These functions return a structure of type `SPGBoxResult`, containing the best solution found in `x` and the final objective function value in `f`. These functions do not mutate the `x` vector; instead, a (deep)copy of it is created (see `spgbox!` for the in-place alternative).
SPGBox.spgbox! — Function

spgbox!(f, g!, x::AbstractVecOrMat; lower=..., upper=..., options...)
Minimizes function `f` starting from initial point `x`, given the function to compute the gradient, `g!`. `f` must be of the form `f(x)`, and `g!` of the form `g!(g,x)`, where `g` is the gradient vector to be modified. It modifies the `x` vector, which will contain the best solution found (see `spgbox` for a non-mutating alternative).
Optional lower and upper box bounds can be provided using the optional arguments `lower` and `upper`, which can be given as the fourth and fifth positional arguments or as keyword parameters.
Returns a structure of type `SPGBoxResult`, containing the best solution found in `x` and the final objective function value in `f`.
Alternatively, a single function that computes both the function value and the gradient can be provided, using:

spgbox!(fg!, x; lower=..., upper=..., options...)

The `fg!` function must be of the form `fg!(g,x)`, where `x` is the current point and `g` is the array that stores the gradient; it must return the function value.
Examples
julia> f(x) = x[1]^2 + x[2]^2
julia> function g!(g,x)
g[1] = 2*x[1]
g[2] = 2*x[2]
end
Without bounds
julia> x = [1.0, 2.0]
julia> spgbox!(f,g!,x)
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 0.0
Sample of best point = Vector{Float64}[ 0.0, 0.0]
Projected gradient norm = 0.0
Number of iterations = 3
Number of function evaluations = 3
With bounds
julia> x = [3.0, 4.0]
julia> spgbox!(f,g!,x,lower=[2.,-Inf])
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 4.0
Sample of best point = Vector{Float64}[ 2.0, 0.0]
Projected gradient norm = 0.0
Number of iterations = 1
Number of function evaluations = 1
With a single function to compute the function and the gradient
julia> function fg!(g,x)
g[1] = 2*x[1]
g[2] = 2*x[2]
fx = x[1]^2 + x[2]^2
return fx
end
fg! (generic function with 1 method)
julia> x = [1.0, 2.0];
julia> spgbox(fg!,x)
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 0.0
Sample of best point = Vector{Float64}[ 0.0, 0.0]
Projected gradient norm = 0.0
Number of iterations = 3
Number of function evaluations = 3
SPGBox.SPGBoxResult — Type

SPGBoxResult

Data structure that contains the results of the `spgbox` optimization method.
`x`: Array containing final point (solution or best point found).
`f`: Final objective function value.
`gnorm`: Norm of the projected gradient on the bounds.
`nit`: Number of iterations performed.
`nfeval`: Number of function evaluations.
`ierr`: Status of execution:
0. Converged successfully to solution.
1. Maximum number of iterations achieved.
2. Maximum number of function evaluations achieved.
`return_from_callback`: Boolean indicating if the algorithm returned from the callback function.
Calling the solver, with box bounds
Box bounds can be provided with the optional keywords `lower` and `upper`, for lower and upper bounds, respectively. These are vectors that delimit the bounds for each variable. For example, assuming the same objective and gradient functions defined above, a lower bound will be set for the second variable:
julia> using SPGBox
julia> x = rand(2);
julia> R = spgbox(f,g!,x,lower=[-Inf,5])
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 9.0
Sample of best point = Vector{Float64}[ 0.0, 5.0]
Projected gradient norm = 0.0
Number of iterations = 3
Number of function evaluations = 3
Upper bounds can be set similarly with `upper=[+Inf,-5]`, for example.
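That is, with the same `f` and `g!` (a minimal sketch; output omitted):
julia> R = spgbox(f,g!,x,upper=[+Inf,-5])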
Note that the bounds can also be provided as non-keyword (positional) parameters, with:
julia> R = spgbox!(f,g!,x,[-Inf,5],[-2,+Inf])
Result data structure and possible outcomes
The minimization can result in successful convergence, or in exceeding the maximum number of iterations or function evaluations. The outcome is explicit in the printed output (second line), and stored in the result structure, which contains the following data:
struct SPGBoxResult
x :: Vector{Float64}
f :: Float64
gnorm :: Float64
nit :: Int64
nfeval :: Int64
ierr :: Int64
return_from_callback :: Bool
end
The data can be accessed as usual; for example:
julia> R.f
12.0
The data structure contains:

Variable name | Meaning |
---|---|
`x` | Best point found (solution if `ierr=0`). |
`f` | Best function value found. |
`gnorm` | Norm of the gradient projected on the constraints. |
`nit` | Number of iterations performed. |
`nfeval` | Number of function evaluations. |
`ierr` | Exit status. |
`return_from_callback` | Boolean indicating whether the return was triggered by the callback function. |
The possible outcomes of `ierr` are:

Outcome | Meaning |
---|---|
`ierr=0` | Success: convergence achieved. |
`ierr=1` | Maximum number of iterations achieved. |
`ierr=2` | Maximum number of function evaluations achieved. |
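A typical pattern is to branch on `ierr` after the call; the following is a minimal sketch using the `R` obtained above (the printed messages are illustrative, not part of the package):
julia> if R.ierr == 0
           println("Converged in $(R.nit) iterations.")
       else
           @warn "SPGBox did not converge; ierr = $(R.ierr)"
       end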
The convergence criteria can be adjusted using optional keywords, as described in the Options section.
Optional callback function
It is possible to pass an optional `callback` function parameter to the `spgbox` and `spgbox!` functions. The `callback` function must:

- Receive as argument a `SPGBoxResult` data structure.
- Return `true` or `false`, to define whether the algorithm must return immediately or continue.

For example, here we stop the optimization when the sum of the two variables becomes smaller than 5.0.
julia> using SPGBox
julia> f(x) = x[1]^4 + (x[2] - 1)^4
f (generic function with 1 method)
julia> function g!(g, x)
g[1] = 4 * x[1]^3
g[2] = 4 * (x[2] - 1)^3
end
g! (generic function with 1 method)
julia> x = [10.0, 18.0];
julia> my_callback(R::SPGBoxResult) = R.x[1] + R.x[2] < 5.0
my_callback (generic function with 1 method)
julia> R = spgbox!(f, g!, x; callback = my_callback)
SPGBOX RESULT:
Returned from callback function.
Final objective function value = 11.341529752085066
Sample of best point = Vector{Float64}[ 1.5429284794371168, 2.543387528602126]
Projected gradient norm = 14.705674573954493
Number of iterations = 10
Number of function evaluations = 10
Input data types
The implementation of `SPGBox` is generic for input types of any abstract vector or matrix. Propagation of units and of numbers with arbitrary precision is supported. As a minimal example illustrating these features, let us minimize a function where the input variables are given as a matrix of `BigFloat`s, with units:
julia> function f(x)
f = zero(typeof(x[begin]^2))
for i in eachindex(x)
f += (x[i] - i*oneunit(eltype(x)))^2
end
return f
end
f (generic function with 1 method)
julia> function g!(g,x)
g .= zero(eltype(x))
for i in eachindex(x)
g[i] = 2*(x[i] - i*oneunit(eltype(x)))
end
return g
end
g! (generic function with 1 method)
julia> using Unitful

julia> x = rand(BigFloat,2,2)u"nm"
2×2 Matrix{Quantity{BigFloat, 𝐋, Unitful.FreeUnits{(nm,), 𝐋, nothing}}}:
0.128083 nm 0.817173 nm
0.139545 nm 0.391047 nm
julia> spgbox(f,g!,x)
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 0.0 nm^2
Sample of best point = Matrix{Quantity{BigFloat, 𝐋, Unitful.FreeUnits{(nm,), 𝐋, nothing}}}[ 1.0 nm, 2.0 nm, 3.0 nm, 4.0 nm]
Projected gradient norm = 0.0 nm
Number of iterations = 2
Number of function evaluations = 3
Data-dependent function evaluation
If the function requires additional parameters, two strategies are possible while preserving performance: 1) Declare the parameters as constants and define an extra method, or 2) Pass the function as an anonymous closure.
Constant parameters and new function and gradient methods
The solver requires a function with a single argument, `x`, and a gradient function with two arguments, `g` and `x`. If the function and gradient evaluations require more parameters, use, for example:
julia> f(x,a,b,c) = a*x[1]^2 + (x[2]-b)^2 + c
julia> const a = 5. ; const b = 2. ; const c = 3. ;
julia> f(x) = f(x,a,b,c)
To preserve performance, it is fundamental to declare the parameters (in this case `a`, `b`, and `c`) as constants, using `const`, to guarantee their type stability. This allows the function specializations and compiler optimizations that make Julia fast.
The gradient function will be defined accordingly:
julia> function g!(g,x,a,b)
g[1] = 2*a*x[1]
g[2] = 2*(x[2]-b)
end
julia> g!(g,x) = g!(g,x,a,b)
The function method that receives only the current point `x`, and the gradient method that receives only the gradient vector `g` and the point `x`, are the ones actually invoked by the solver.
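With these single-argument methods in place, the solver call takes its usual minimal form; a sketch (output omitted):
julia> x = rand(2);
julia> R = spgbox!(f,g!,x)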
Using anonymous closures
An anonymous closure is a function with a special syntax of the form

x -> f(x)

which should be read as "given `x`, return `f(x)`". These anonymous functions can be provided directly as arguments to the solver, while providing an interface for the use of external parameters. Considering the same objective and gradient functions above, one uses anonymous functions directly as arguments in the solver call:
julia> R = spgbox!(x -> f(x,a,b,c), (g,x) -> g!(g,x,a,b), x)
where the first argument, `x -> f(x,a,b,c)`, indicates that the objective function is an anonymous function that, given `x`, returns `f(x,a,b,c)`. The gradient is evaluated by an anonymous function that, given `(g,x)`, calls `g!(g,x,a,b)`. This syntax also preserves performance and does not require the parameters to be declared as constants.
Using automatic differentiation (AD)
Julia provides various packages for automatic differentiation, which can be used in combination with `SPGBox`. Here, we illustrate the use of ReverseDiff. Only two points must be taken into consideration: 1) the AD function must modify an existing gradient vector, and 2) anonymous closures must be used to provide the gradient-computing function to the solver.
Here is a simple example, in which we use `ReverseDiff.gradient!` to compute the derivative of a function which is the sum of the squares of the variables:
julia> using SPGBox, ReverseDiff
julia> function f(x)
f = 0.
for i in eachindex(x)
f += x[i]^2
end
f
end
julia> x = rand(2);
julia> spgbox!(f, (g,x) -> ReverseDiff.gradient!(g,f,x), x, lower=[-Inf,2.])
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 4.0
Sample of best point = Vector{Float64}[ 0.0, 2.0]
Projected gradient norm = 0.0
Number of iterations = 3
Number of function evaluations = 3
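The same closure pattern should work with other AD packages. For instance, a sketch of the equivalent call using ForwardDiff (an assumption for illustration, not part of the example above; output omitted):
julia> using ForwardDiff   # assumed installed; any AD package with an in-place gradient works similarly
julia> x = rand(2);
julia> spgbox!(f, (g,x) -> ForwardDiff.gradient!(g,f,x), x, lower=[-Inf,2.])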