window_grad {CNN} |
R Documentation |
Description

This is AdaGrad, but with a moving-window weighted average, so the gradient is not accumulated over the entire history of the run. It is also referred to as "Idea #1" in Zeiler's paper on AdaDelta.
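The windowed accumulation described above can be sketched as an exponentially weighted moving average of squared gradients (the scheme Zeiler calls "Idea #1"). The sketch below is illustrative only; the names `lr`, `eps`, `state`, and `window_grad_step` are assumptions for the example, not part of this function's signature:

```python
import math

def window_grad_step(param, grad, state, lr=0.01, ro=0.95, eps=1e-8):
    """One parameter update using a moving-window (exponentially
    weighted) average of squared gradients instead of AdaGrad's
    full-history sum. 'state' holds the running average E[g^2]."""
    # decay the old accumulator and mix in the current squared gradient
    state = ro * state + (1.0 - ro) * grad * grad
    # scale the step by the root of the windowed average, as in AdaGrad
    param = param - lr * grad / (math.sqrt(state) + eps)
    return param, state

# toy usage: minimize f(x) = x^2 (gradient 2x) for a few steps
x, acc = 5.0, 0.0
for _ in range(100):
    x, acc = window_grad_step(x, 2.0 * x, acc, lr=0.1)
```

Because old squared gradients decay geometrically at rate `ro`, the step size does not shrink monotonically over the whole run the way plain AdaGrad's does.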
Usage
window_grad(batch.size,
l2.decay = 0.001,
ro = 0.95);
Arguments
batch.size
the size of each training mini-batch [as integer]
l2.decay
the L2 regularization (weight decay) factor [as double]
ro
the decay rate of the moving-window weighted average of the gradient [as double]
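As a rough guide, a geometric window with decay rate ro averages over about 1/(1 - ro) recent steps, so the default ro = 0.95 corresponds to an effective window of roughly 20 updates. The helper below is an illustrative sketch, not part of the package:

```python
def effective_window(ro):
    # the geometric weights ro^k sum to 1/(1 - ro); treat that sum as
    # the effective number of recent gradients in the moving average
    return 1.0 / (1.0 - ro)

print(effective_window(0.95))  # about 20 recent updates
```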
Details
Authors
MLkit
Value
This function returns a data object of type TrainerAlgorithm, a CLR value class.
Examples
[Package CNN version 1.0.0.0 Index]