1. Test Modules
  2. Training Characteristics
    1. Input Learning
      1. Gradient Descent
      2. Conjugate Gradient Descent
      3. Limited-Memory BFGS
    2. Model Learning
      1. Gradient Descent
      2. Conjugate Gradient Descent
      3. Limited-Memory BFGS
    3. Composite Learning
      1. Gradient Descent
      2. Conjugate Gradient Descent
      3. Limited-Memory BFGS
    4. Results
  3. Results

Subreport: Logs for com.simiacryptus.ref.lang.ReferenceCountingBase

Test Modules

Using Seed 8275909129638068224

Training Characteristics

Input Learning

In this test, we use a network to learn this target input, given its pre-evaluated output:

TrainingTester.java:332 executed in 0.00 seconds (0.000 gc):

    return RefArrays.stream(RefUtil.addRef(input_target)).flatMap(RefArrays::stream).map(x -> {
      try {
        return x.prettyPrint();
      } finally {
        x.freeRef();
      }
    }).reduce((a, b) -> a + "\n" + b).orElse("");

Returns

    [
    	[ [ -1.688, 1.912, -0.712 ], [ -0.804, 0.028, 1.356 ] ],
    	[ [ 1.048, 1.556, 1.108 ], [ -0.852, 1.512, -1.616 ] ]
    ]
    [
    	[ [ -0.804, -0.852, 1.556 ], [ 1.048, -1.688, 1.912 ] ],
    	[ [ 1.356, -0.712, 1.512 ], [ -1.616, 0.028, 1.108 ] ]
    ]
    [
    	[ [ -0.712, -1.688, -0.852 ], [ -0.804, 1.356, 1.108 ] ],
    	[ [ 1.912, 1.048, 1.512 ], [ 1.556, -1.616, 0.028 ] ]
    ]
    [
    	[ [ -0.852, -1.688, 1.912 ], [ 1.108, -0.712, 1.048 ] ],
    	[ [ 0.028, -1.616, 1.556 ], [ -0.804, 1.512, 1.356 ] ]
    ]
    [
    	[ [ 1.108, 1.356, -0.852 ], [ 1.912, -1.616, -1.688 ] ],
    	[ [ 0.028, -0.712, 1.048 ], [ 1.556, -0.804, 1.512 ] ]
    ]

Gradient Descent

First, we train using a basic gradient descent method with weak line search conditions.

TrainingTester.java:480 executed in 0.47 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new ArmijoWolfeSearch());
      iterativeTrainer.setOrientation(new GradientDescent());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }

Logging
Reset training subject: 2418012237002
BACKPROP_AGG_SIZE = 3
THREADS = 64
SINGLE_THREADED = false
Initialized CoreSettings = {
"backpropAggregationSize" : 3,
"jvmThreads" : 64,
"singleThreaded" : false
}
Reset training subject: 2418045251432
Constructing line search parameters: GD
th(0)=47.64045858760439;dx=-3.524147200000001E23
New Minimum: 47.64045858760439 > 0.022982962639212467
Armijo: th(2.154434690031884)=0.022982962639212467; dx=-3.52414720000406E11 evalInputDelta=47.61747562496518
Armijo: th(1.077217345015942)=0.1162509684595133; dx=-3.5241472000043414E11 evalInputDelta=47.52420761914488
Armijo: th(0.3590724483386473)=0.23981688203629242; dx=-3.5241472000048206E11 evalInputDelta=47.4006417055681
Armijo: th(0.08976811208466183)=0.3022504262604834; dx=-3.524147200005114E11 evalInputDelta=47.33820816134391
Armijo: th(0.017953622416932366)=0.32924635975796906; dx=-3.5241472000052344E11 evalInputDelta=47.311212227846426
Armijo: th(0.002992270402822061)=0.33499579331507184; dx=-3.524147200005261E11 evalInputDelta=47.30546279428932
Armijo: th(4.2746720040315154E-4)=0.33598539025936647; dx=-3.5241472000052655E11 evalInputDelta=47.304473197345025
Armijo: th(5.343340005039394E-5)=0.3361298044911673; dx=-3.524147200005266E11 evalInputDelta=47.30432878311323
Armijo: th(5.9370444500437714E-6)=0.33614814459359227; dx=-3.524147200005266E11 evalInputDelta=47.3043104430108
Armijo: th(5.937044450043771E-7)=0.33615020788032357; dx=-3.524147200005266E11 evalInputDelta=47.30430837972407
Armijo: th(5.397313136403428E-8)=0.33615041629340825; dx=-3.524147200005266E11 evalInputDelta=47.304308171310986
Armijo: th(4.4977609470028565E-9)=0.3361504353979436; dx=-3.524147200005266E11 evalInputDelta=47.304308152206445
Armijo: th(3.4598161130791205E-10)=0.33615043700112146; dx=-3.524147200005266E11 evalInputDelta=47.30430815060327
Armijo: th(2.4712972236279432E-11)=5.04296096150769; dx=-1.0081280000343383E22 evalInputDelta=42.597497626096704
Armijo: th(1.6475314824186289E-12)=39.78134445004888; dx=-2.590233600006333E23 evalInputDelta=7.859114137555515
Armijo: th(1.029707176511643E-13)=47.64045858760436; dx=-3.524147200000001E23 evalInputDelta=3.552713678800501E-14
Armijo: th(6.057101038303783E-15)=47.64045858760439; dx=-3.524147200000001E23 evalInputDelta=0.0
MIN ALPHA (3.3650561323909904E-16): th(2.154434690031884)=0.022982962639212467
Fitness changed from 47.64045858760439 to 0.022982962639212467
Iteration 1 complete. Error: 0.022982962639212467 Total: 0.2357; Orientation: 0.0035; Line Search: 0.1902
th(0)=0.022982962639212467;dx=-0.6997165753131486
New Minimum: 0.022982962639212467 > 0.022982962639212397
WOLFE (weak): th(2.154434690031884E-15)=0.022982962639212397; dx=-0.6997165753131487 evalInputDelta=6.938893903907228E-17
New Minimum: 0.022982962639212397 > 0.022982962639212328
WOLFE (weak): th(4.308869380063768E-15)=0.022982962639212328; dx=-0.6997165753131486 evalInputDelta=1.3877787807814457E-16
New Minimum: 0.022982962639212328 > 0.022982962639212047
WOLFE (weak): th(1.2926608140191303E-14)=0.022982962639212047; dx=-0.6997165753131486 evalInputDelta=4.198030811863873E-16
New Minimum: 0.022982962639212047 > 0.022982962639210736
WOLFE (weak): th(5.1706432560765214E-14)=0.022982962639210736; dx=-0.6997165753131485 evalInputDelta=1.7312540290248535E-15
New Minimum: 0.022982962639210736 > 0.02298296263920373
WOLFE (weak): th(2.5853216280382605E-13)=0.02298296263920373; dx=-0.6997165753131477 evalInputDelta=8.7360674250192E-15
New Minimum: 0.02298296263920373 > 0.022982962639160043
WOLFE (weak): th(1.5511929768229563E-12)=0.022982962639160043; dx=-0.6997165753131428 evalInputDelta=5.242334344401911E-14
New Minimum: 0.022982962639160043 > 0.022982962638845545
WOLFE (weak): th(1.0858350837760695E-11)=0.022982962638845545; dx=-0.6997165753131077 evalInputDelta=3.6692177074471033E-13
New Minimum: 0.022982962638845545 > 0.022982962636277113
WOLFE (weak): th(8.686680670208556E-11)=0.022982962636277113; dx=-0.6997165753128208 evalInputDelta=2.935353349275971E-12
New Minimum: 0.022982962636277113 > 0.022982962612794266
WOLFE (weak): th(7.8180126031877E-10)=0.022982962612794266; dx=-0.6997165753101984 evalInputDelta=2.641820096016545E-11
New Minimum: 0.022982962612794266 > 0.022982962375030447
WOLFE (weak): th(7.818012603187701E-9)=0.022982962375030447; dx=-0.6997165752836451 evalInputDelta=2.6418202000999536E-10
New Minimum: 0.022982962375030447 > 0.022982959733210222
WOLFE (weak): th(8.599813863506471E-8)=0.022982959733210222; dx=-0.6997165749886092 evalInputDelta=2.9060022443960776E-9
New Minimum: 0.022982959733210222 > 0.022982927767187362
WOLFE (weak): th(1.0319776636207765E-6)=0.022982927767187362; dx=-0.6997165714186744 evalInputDelta=3.487202510435439E-8
New Minimum: 0.022982927767187362 > 0.022982509303199738
WOLFE (weak): th(1.3415709627070094E-5)=0.022982509303199738; dx=-0.6997165246850533 evalInputDelta=4.533360127290109E-7
New Minimum: 0.022982509303199738 > 0.02297661599684138
WOLFE (weak): th(1.878199347789813E-4)=0.02297661599684138; dx=-0.6997158665336467 evalInputDelta=6.346642371087702E-6
New Minimum: 0.02297661599684138 > 0.022887776978713734
WOLFE (weak): th(0.0028172990216847197)=0.022887776978713734; dx=-0.6997059467479128 evalInputDelta=9.51856604987332E-5
New Minimum: 0.022887776978713734 > 0.021463573671755953
WOLFE (weak): th(0.045076784346955515)=0.021463573671755953; dx=-0.6995473184039109 evalInputDelta=0.0015193889674565142
New Minimum: 0.021463573671755953 > 0.00635755554124789
WOLFE (weak): th(0.7663053338982437)=0.00635755554124789; dx=-0.6979052979536065 evalInputDelta=0.016625407097964576
New Minimum: 0.00635755554124789 > 0.0029222714883684947
WOLFE (weak): th(13.793496010168386)=0.0029222714883684947; dx=-0.6976645591792171 evalInputDelta=0.02006069115084397
New Minimum: 0.0029222714883684947 > 0.0
WOLFE (weak): th(262.07642419319933)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
WOLFE (weak): th(5241.528483863986)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
Armijo: th(110072.09816114372)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
Armijo: th(57656.813322503855)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
WOLFE (weak): th(31449.17090318392)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
Armijo: th(44552.99211284389)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
Armijo: th(38001.0815080139)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
Armijo: th(34725.12620559891)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
Armijo: th(33087.14855439142)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
WOLFE (weak): th(32268.15972878767)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
WOLFE (weak): th(32677.65414158954)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
Armijo: th(32882.40134799048)=0.0; dx=-0.6975728268648491 evalInputDelta=0.022982962639212467
mu ~= nu (32677.65414158954): th(262.07642419319933)=0.0
Fitness changed from 0.022982962639212467 to 0.0
Iteration 2 complete. Error: 0.0 Total: 0.1482; Orientation: 0.0011; Line Search: 0.1436
th(0)=0.0;dx=-0.6958329600000002
Armijo: th(70622.42891358322)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(35311.21445679161)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(11770.404818930538)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(2942.6012047326344)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(588.5202409465269)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(98.08670682442114)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(14.012386689203021)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(1.7515483361503776)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(0.1946164817944864)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(0.01946164817944864)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(0.0017692407435862402)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(1.4743672863218668E-4)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(1.1341286817860514E-5)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(8.100919155614653E-7)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(5.4006127704097684E-8)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(3.3753829815061053E-9)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(1.9855194008859441E-10)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(1.1030663338255245E-11)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(5.805612283292234E-13)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(2.902806141646117E-14)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Armijo: th(1.3822886388791035E-15)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
MIN ALPHA (6.283130176723198E-17): th(0.0)=0.0
Fitness changed from 0.0 to 0.0
Static Iteration Total: 0.0773; Orientation: 0.0007; Line Search: 0.0750
Iteration 3 failed. Error: 0.0
Previous Error: 0.0 -> 0.0
Optimization terminated 3
Final threshold in iteration 3: 0.0 (> 0.0) after 0.462s (< 30.000s)

Returns

    0.0

Training Converged
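
The Armijo and WOLFE (weak) labels in the log above are consistent with a bracketing search on the step size: a step that fails the sufficient-decrease (Armijo) test is treated as too long and is reduced, while a step that fails the weak curvature (Wolfe) test is treated as too short and is increased. A minimal sketch of that idea follows, assuming this reading of the labels. The class `ArmijoWolfeSketch`, the constants `c1` and `c2`, and the doubling and bisection rules are illustrative assumptions, not the library's `ArmijoWolfeSearch` implementation.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of an Armijo / weak Wolfe step search; not the library's ArmijoWolfeSearch. */
final class ArmijoWolfeSketch {

  /**
   * Searches along a 1-D slice phi(alpha) of the objective for a step that satisfies
   * both the Armijo condition and the weak Wolfe curvature condition.
   *
   * @param phi     objective value as a function of the step size
   * @param dphi    directional derivative as a function of the step size
   * @param alpha   initial trial step (must be positive)
   * @param c1      Armijo constant, e.g. 1e-4 (assumed value)
   * @param c2      Wolfe constant, e.g. 0.9 (assumed value)
   * @param maxIter maximum number of trial steps
   */
  static double search(DoubleUnaryOperator phi, DoubleUnaryOperator dphi,
                       double alpha, double c1, double c2, int maxIter) {
    double phi0 = phi.applyAsDouble(0.0);
    double dphi0 = dphi.applyAsDouble(0.0); // negative for a descent direction
    double lo = 0.0;                        // largest step known to be too short
    double hi = Double.POSITIVE_INFINITY;   // smallest step known to be too long
    for (int i = 0; i < maxIter; i++) {
      if (phi.applyAsDouble(alpha) > phi0 + c1 * alpha * dphi0) {
        hi = alpha; // Armijo failed: not enough decrease, so shrink the step
      } else if (dphi.applyAsDouble(alpha) < c2 * dphi0) {
        lo = alpha; // weak Wolfe failed: slope is still steep, so grow the step
      } else {
        return alpha; // both conditions hold
      }
      // Double while no upper bound is known, otherwise bisect the bracket.
      alpha = Double.isInfinite(hi) ? 2.0 * lo : 0.5 * (lo + hi);
    }
    return alpha; // best effort after maxIter trials
  }

  public static void main(String[] args) {
    // Toy slice: phi(alpha) = (alpha - 1)^2, minimized at alpha = 1.
    double step = search(a -> (a - 1) * (a - 1), a -> 2 * (a - 1), 10.0, 1e-4, 0.9, 50);
    System.out.println("accepted step = " + step);
  }
}
```

Starting from a step that is too long, the sketch only ever shrinks it, which mirrors the run of Armijo lines in iteration 1. Starting from a tiny step, it only ever grows it, which mirrors the run of WOLFE (weak) lines in iteration 2.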

Conjugate Gradient Descent

Next, we use a conjugate gradient descent method, which converges the fastest for purely linear functions.

TrainingTester.java:452 executed in 0.15 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new QuadraticSearch());
      iterativeTrainer.setOrientation(new GradientDescent());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }

Logging
Reset training subject: 2418479703762
Reset training subject: 2418481476473
Constructing line search parameters: GD
F(0.0) = LineSearchPoint{point=PointSample{avg=47.64045858760439}, derivative=-3.524147200000001E23}
New Minimum: 47.64045858760439 > 0.33615043709610537
F(1.0E-10) = LineSearchPoint{point=PointSample{avg=0.33615043709610537}, derivative=-3.524147200005266E11}, evalInputDelta = -47.304308150508284
New Minimum: 0.33615043709610537 > 0.33615043686442
F(7.000000000000001E-10) = LineSearchPoint{point=PointSample{avg=0.33615043686442}, derivative=-3.524147200005266E11}, evalInputDelta = -47.30430815073997
New Minimum: 0.33615043686442 > 0.33615043524262217
F(4.900000000000001E-9) = LineSearchPoint{point=PointSample{avg=0.33615043524262217}, derivative=-3.524147200005266E11}, evalInputDelta = -47.30430815236177
New Minimum: 0.33615043524262217 > 0.3361504238900372
F(3.430000000000001E-8) = LineSearchPoint{point=PointSample{avg=0.3361504238900372}, derivative=-3.524147200005266E11}, evalInputDelta = -47.30430816371435
New Minimum: 0.3361504238900372 > 0.33615034442194736
F(2.4010000000000004E-7) = LineSearchPoint{point=PointSample{avg=0.33615034442194736}, derivative=-3.524147200005266E11}, evalInputDelta = -47.30430824318245
New Minimum: 0.33615034442194736 > 0.3361497881455302
F(1.6807000000000003E-6) = LineSearchPoint{point=PointSample{avg=0.3361497881455302}, derivative=-3.524147200005266E11}, evalInputDelta = -47.30430879945886
New Minimum: 0.3361497881455302 > 0.33614589422098684
F(1.1764900000000001E-5) = LineSearchPoint{point=PointSample{avg=0.33614589422098684}, derivative=-3.524147200005266E11}, evalInputDelta = -47.30431269338341
New Minimum: 0.33614589422098684 > 0.3361186372576243
F(8.235430000000001E-5) = LineSearchPoint{point=PointSample{avg=0.3361186372576243}, derivative=-3.524147200005266E11}, evalInputDelta = -47.304339950346765
New Minimum: 0.3361186372576243 > 0.33592786342202857
F(5.764801000000001E-4) = LineSearchPoint{point=PointSample{avg=0.33592786342202857}, derivative=-3.524147200005265E11}, evalInputDelta = -47.304530724182364
New Minimum: 0.33592786342202857 > 0.3345936651137063
F(0.004035360700000001) = LineSearchPoint{point=PointSample{avg=0.3345936651137063}, derivative=-3.524147200005259E11}, evalInputDelta = -47.305864922490684
New Minimum: 0.3345936651137063 > 0.3253133272268165
F(0.028247524900000005) = LineSearchPoint{point=PointSample{avg=0.3253133272268165}, derivative=-3.5241472000052167E11}, evalInputDelta = -47.31514526037758
New Minimum: 0.3253133272268165 > 0.27372880559296486
F(0.19773267430000002) = LineSearchPoint{point=PointSample{avg=0.27372880559296486}, derivative=-3.524147200004982E11}, evalInputDelta = -47.366729782011426
New Minimum: 0.27372880559296486 > 0.08648003136160469
F(1.3841287201) = LineSearchPoint{point=PointSample{avg=0.08648003136160469}, derivative=-3.524147200004241E11}, evalInputDelta = -47.55397855624279
New Minimum: 0.08648003136160469 > 0.002287648957225791
F(9.688901040700001) = LineSearchPoint{point=PointSample{avg=0.002287648957225791}, derivative=-3.524147200004016E11}, evalInputDelta = -47.638170938647164
New Minimum: 0.002287648957225791 > 0.0
F(67.8223072849) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004014E11}, evalInputDelta = -47.64045858760439
F(474.7561509943) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004015E11}, evalInputDelta = -47.64045858760439
F(3323.2930569601003) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004014E11}, evalInputDelta = -47.64045858760439
F(23263.0513987207) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004015E11}, evalInputDelta = -47.64045858760439
F(162841.3597910449) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004015E11}, evalInputDelta = -47.64045858760439
F(1139889.5185373144) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004015E11}, evalInputDelta = -47.64045858760439
F(7979226.6297612) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004015E11}, evalInputDelta = -47.64045858760439
F(5.58545864083284E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004014E11}, evalInputDelta = -47.64045858760439
F(3.909821048582988E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004014E11}, evalInputDelta = -47.64045858760439
F(2.7368747340080914E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004015E11}, evalInputDelta = -47.64045858760439
F(1.915812313805664E10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004015E11}, evalInputDelta = -47.64045858760439
0.0 <= 47.64045858760439
F(1.0E10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.524147200004015E11}, evalInputDelta = -47.64045858760439
Right bracket at 1.0E10
Converged to right
Fitness changed from 47.64045858760439 to 0.0
Iteration 1 complete. Error: 0.0 Total: 0.0722; Orientation: 0.0010; Line Search: 0.0662
F(0.0) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}
F(1.0E10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
0.0 <= 0.0
F(5.0E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 5.0E9
F(2.5E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 2.5E9
F(1.25E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 1.25E9
F(6.25E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 6.25E8
F(3.125E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 3.125E8
F(1.5625E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 1.5625E8
F(7.8125E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 7.8125E7
F(3.90625E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 3.90625E7
F(1.953125E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 1.953125E7
F(9765625.0) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 9765625.0
F(4882812.5) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Right bracket at 4882812.5
F(2441406.25) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-0.6958329600000002}, evalInputDelta = 0.0
Loops = 12
Fitness changed from 0.0 to 0.0
Static Iteration Total: 0.0801; Orientation: 0.0006; Line Search: 0.0670
Iteration 2 failed. Error: 0.0
Previous Error: 0.0 -> 0.0
Optimization terminated 2
Final threshold in iteration 2: 0.0 (> 0.0) after 0.153s (< 30.000s)

Returns

    0.0

Training Converged
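
One way to read the claim that conjugate gradient converges fastest on linear problems is the textbook result that, in exact arithmetic, linear conjugate gradient solves an n-dimensional symmetric positive-definite system in at most n iterations. The sketch below illustrates that result on a 2x2 system. It is a standalone example under that assumption: the class `ConjugateGradientSketch` and its helpers are hypothetical names, and the code is independent of the `IterativeTrainer` configuration shown above.

```java
/** Hypothetical sketch of textbook linear conjugate gradient; not a library class. */
final class ConjugateGradientSketch {

  static double dot(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
    return sum;
  }

  static double[] multiply(double[][] m, double[] v) {
    double[] out = new double[m.length];
    for (int i = 0; i < m.length; i++) out[i] = dot(m[i], v);
    return out;
  }

  /**
   * Solves A x = b for symmetric positive-definite A, updating x in place.
   * Equivalent to minimizing 0.5 * x'Ax - b'x. Returns the number of iterations used.
   */
  static int solve(double[][] a, double[] b, double[] x, double tol, int maxIter) {
    int n = b.length;
    double[] ax = multiply(a, x);
    double[] r = new double[n];
    for (int i = 0; i < n; i++) r[i] = b[i] - ax[i]; // initial residual
    double[] p = r.clone();                           // first direction is the residual
    double rr = dot(r, r);
    int iterations = 0;
    while (Math.sqrt(rr) > tol && iterations < maxIter) {
      double[] ap = multiply(a, p);
      double alpha = rr / dot(p, ap);                 // exact line minimizer along p
      for (int i = 0; i < n; i++) {
        x[i] += alpha * p[i];
        r[i] -= alpha * ap[i];
      }
      double rrNew = dot(r, r);
      double beta = rrNew / rr;                       // keeps the new direction A-conjugate
      for (int i = 0; i < n; i++) p[i] = r[i] + beta * p[i];
      rr = rrNew;
      iterations++;
    }
    return iterations;
  }

  public static void main(String[] args) {
    double[][] a = {{4, 1}, {1, 3}};
    double[] b = {1, 2};
    double[] x = {2, 1};
    int used = solve(a, b, x, 1e-10, 100);
    System.out.println(used + " iterations, x = " + java.util.Arrays.toString(x));
  }
}
```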

Limited-Memory BFGS

Next, we apply the same optimization using L-BFGS, which is nearly ideal for purely second-order or quadratic functions.
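
As background for the claim above: L-BFGS builds its search direction from a short history of recent steps and gradient changes instead of an explicit Hessian, which is why it suits objectives that are well approximated by a quadratic. The standard way to apply that history is the two-loop recursion, sketched below. This is the textbook form, not the library's `LBFGS` class. The class `LbfgsTwoLoopSketch`, the method name, and the representation of the history as lists of step vectors `s` and gradient-difference vectors `y` are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the textbook L-BFGS two-loop recursion; not the library's LBFGS class. */
final class LbfgsTwoLoopSketch {

  static double dot(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
    return sum;
  }

  /**
   * Returns H * g, where H is the implicit inverse-Hessian estimate built from the
   * history pairs (s_i, y_i), oldest first. The search direction is the negation of the result.
   * Each pair is assumed to satisfy dot(s_i, y_i) > 0.
   */
  static double[] applyInverseHessian(List<double[]> s, List<double[]> y, double[] g) {
    int k = s.size();
    double[] q = g.clone();
    double[] alpha = new double[k];
    double[] rho = new double[k];
    // First loop: newest pair to oldest.
    for (int i = k - 1; i >= 0; i--) {
      rho[i] = 1.0 / dot(y.get(i), s.get(i));
      alpha[i] = rho[i] * dot(s.get(i), q);
      for (int j = 0; j < q.length; j++) q[j] -= alpha[i] * y.get(i)[j];
    }
    // Scale by an initial inverse-Hessian guess taken from the newest pair.
    if (k > 0) {
      double[] sLast = s.get(k - 1);
      double[] yLast = y.get(k - 1);
      double gamma = dot(sLast, yLast) / dot(yLast, yLast);
      for (int j = 0; j < q.length; j++) q[j] *= gamma;
    }
    // Second loop: oldest pair to newest.
    for (int i = 0; i < k; i++) {
      double beta = rho[i] * dot(y.get(i), q);
      for (int j = 0; j < q.length; j++) q[j] += (alpha[i] - beta) * s.get(i)[j];
    }
    return q;
  }

  public static void main(String[] args) {
    List<double[]> s = Arrays.asList(new double[]{1, 0}, new double[]{0.5, 1});
    List<double[]> y = Arrays.asList(new double[]{2, 0.5}, new double[]{1, 3});
    double[] hg = applyInverseHessian(s, y, new double[]{1, 3});
    System.out.println("H * g = " + Arrays.toString(hg));
  }
}
```

A useful sanity check on any such implementation is the secant condition: applying the estimate to the newest gradient difference should return the newest step. With an empty history the recursion returns the gradient unchanged, which reduces to plain gradient descent.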

TrainingTester.java:509 executed in 2.37 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new ArmijoWolfeSearch());
      iterativeTrainer.setOrientation(new LBFGS());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setIterationsPerSample(100);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }

Logging
Reset training subject: 2418637657342
Reset training subject: 2418638659902
Adding measurement 9183476 to history. Total: 0
LBFGS Accumulation History: 1 points
Constructing line search parameters: GD
Non-optimal measurement 47.64045858760439 < 47.64045858760439. Total: 1
th(0)=47.64045858760439;dx=-3.524147200000001E23
Adding measurement 5b2751be to history. Total: 1
New Minimum: 47.64045858760439 > 0.022982962639212467
Armijo: th(2.154434690031884)=0.022982962639212467; dx=-3.52414720000406E11 evalInputDelta=47.61747562496518
Non-optimal measurement 0.1162509684595133 < 0.022982962639212467. Total: 2
Armijo: th(1.077217345015942)=0.1162509684595133; dx=-3.5241472000043414E11 evalInputDelta=47.52420761914488
Non-optimal measurement 0.23981688203629242 < 0.022982962639212467. Total: 2
Armijo: th(0.3590724483386473)=0.23981688203629242; dx=-3.5241472000048206E11 evalInputDelta=47.4006417055681
Non-optimal measurement 0.3022504262604834 < 0.022982962639212467. Total: 2
Armijo: th(0.08976811208466183)=0.3022504262604834; dx=-3.524147200005114E11 evalInputDelta=47.33820816134391
Non-optimal measurement 0.32924635975796906 < 0.022982962639212467. Total: 2
Armijo: th(0.017953622416932366)=0.32924635975796906; dx=-3.5241472000052344E11 evalInputDelta=47.311212227846426
Non-optimal measurement 0.33499579331507184 < 0.022982962639212467. Total: 2
Armijo: th(0.002992270402822061)=0.33499579331507184; dx=-3.524147200005261E11 evalInputDelta=47.30546279428932
Non-optimal measurement 0.33598539025936647 < 0.022982962639212467. Total: 2
Armijo: th(4.2746720040315154E-4)=0.33598539025936647; dx=-3.5241472000052655E11 evalInputDelta=47.304473197345025
Non-optimal measurement 0.3361298044911673 < 0.022982962639212467. Total: 2
Armijo: th(5.343340005039394E-5)=0.3361298044911673; dx=-3.524147200005266E11 evalInputDelta=47.30432878311323
Non-optimal measurement 0.33614814459359227 < 0.022982962639212467. Total: 2
Armijo: th(5.9370444500437714E-6)=0.33614814459359227; dx=-3.524147200005266E11 evalInputDelta=47.3043104430108
Non-optimal measurement 0.33615020788032357 < 0.022982962639212467. Total: 2
Armijo: th(5.937044450043771E-7)=0.33615020788032357; dx=-3.524147200005266E11 evalInputDelta=47.30430837972407
Non-optimal measurement 0.33615041629340825 < 0.022982962639212467. Total: 2
Armijo: th(5.397313136403428E-8)=0.33615041629340825; dx=-3.524147200005266E11 evalInputDelta=47.304308171310986
Non-optimal measurement 0.3361504353979436 < 0.022982962639212467. Total: 2
Armijo: th(4.4977609470028565E-9)=0.3361504353979436; dx=-3.524147200005266E11 evalInputDelta=47.304308152206445
Non-optimal measurement 0.33615043700112146 < 0.022982962639212467. Total: 2
Armijo: th(3.4598161130791205E-10)=0.33615043700112146; dx=-3.524147200005266E11 evalInputDelta=47.30430815060327
Non-optimal measurement 5.04296096150769 < 0.022982962639212467. Total: 2
Armijo: th(2.4712972236279432E-11)=5.04296096150769; dx=-1.0081280000343383E22 evalInputDelta=42.597497626096704
Non-optimal measurement 39.78134445004888 < 0.022982962639212467. Total: 2
Armijo: th(1.6475314824186289E-12)=39.78134445004888; dx=-2.5902336000063332E23 evalInputDelta=7.859114137555515
Non-optimal measurement 47.64045858760436 < 0.022982962639212467. Total: 2
Armijo: th(1.029707176511643E-13)=47.64045858760436; dx=-3.524147200000001E23 evalInputDelta=3.552713678800501E-14
Non-optimal measurement 47.64045858760439 < 0.022982962639212467. Total: 2
Armijo: th(6.057101038303783E-15)=47.64045858760439; dx=-3.524147200000001E23 evalInputDelta=0.0
Non-optimal measurement 0.022982962639212467 < 0.022982962639212467. Total: 2
MIN ALPHA (3.3650561323909904E-16): th(2.154434690031884)=0.022982962639212467
Fitness changed from 47.64045858760439 to 0.022982962639212467
Iteration 1 complete. Error: 0.022982962639212467 Total: 0.0493; Orientation: 0.0028; Line Search: 0.0436
Non-optimal measurement 0.022982962639212467 < 0.022982962639212467. Total: 2
LBFGS Accumulation History: 2 points
Non-optimal measurement 0.022982962639212467 < 0.022982962639212467. Total: 2
th(0)=0.022982962639212467;dx=-0.6997165753131487
Adding measurement 3ff4ef81 to history. Total: 2
New Minimum: 0.022982962639212467 > 0.022982962639212397
WOLFE (weak): th(2.154434690031884E-15)=0.022982962639212397; dx=-0.6997165753131487 evalInputDelta=6.938893903907228E-17
Adding measurement 456ef468 to history. Total: 3
New Minimum: 0.022982962639212397 > 0.022982962639212328
WOLFE (weak): th(4.308869380063768E-15)=0.022982962639212328; dx=-0.6997165753131487 evalInputDelta=1.3877787807814457E-16
Adding measurement 46c2d982 to history. Total: 4
New Minimum: 0.022982962639212328 > 0.022982962639212047
WOLFE (weak): th(1.2926608140191303E-14)=0.022982962639212047; dx=-0.6997165753131486 evalInputDelta=4.198030811863873E-16
Adding measurement 7a415d9a to history. Total: 5
New Minimum: 0.022982962639212047 > 0.022982962639210736
WOLFE (weak): th(5.1706432560765214E-14)=0.022982962639210736; dx=-0.6997165753131485 evalInputDelta=1.7312540290248535E-15
Adding measurement 708e5317 to history. Total: 6
New Minimum: 0.022982962639210736 > 0.02298296263920373
WOLFE (weak): th(2.5853216280382605E-13)=0.02298296263920373; dx=-0.6997165753131477 evalInputDelta=8.7360674250192E-15
Adding measurement 4c39716c to history. Total: 7
New Minimum: 0.02298296263920373 > 0.022982962639160043
WOLFE (weak): th(1.5511929768229563E-12)=0.022982962639160043; dx=-0.6997165753131428 evalInputDelta=5.242334344401911E-14
Adding measurement 5b47a4ab to history. Total: 8
New Minimum: 0.022982962639160043 > 0.022982962638845545
WOLFE (weak): th(1.0858350837760695E-11)=0.022982962638845545; dx=-0.6997165753131076 evalInputDelta=3.6692177074471033E-13
Adding measurement 79b60f60 to history. Total: 9
New Minimum: 0.022982962638845545 > 0.022982962636277113
WOLFE (weak): th(8.686680670208556E-11)=0.022982962636277113; dx=-0.6997165753128208 evalInputDelta=2.935353349275971E-12
Adding measurement 19e5113c to history. Total: 10
New Minimum: 0.022982962636277113 > 0.022982962612794266
WOLFE (weak): th(7.8180126031877E-10)=0.022982962612794266; dx=-0.6997165753101983 evalInputDelta=2.641820096016545E-11
Adding measurement 73ad7685 to history. Total: 11
New Minimum: 0.022982962612794266 > 0.022982962375030447
WOLFE (weak): th(7.818012603187701E-9)=0.022982962375030447; dx=-0.6997165752836451 evalInputDelta=2.6418202000999536E-10
Adding measurement 442596f3 to history. Total: 12
New Minimum: 0.022982962375030447 > 0.022982959733210222
WOLFE (weak): th(8.599813863506471E-8)=0.022982959733210222; dx=-0.6997165749886092 evalInputDelta=2.9060022443960776E-9
Adding measurement 4831ea5e to history. Total: 13
New Minimum: 0.022982959733210222 > 0.022982927767187362
WOLFE (weak): th(1.0319776636207765E-6)=0.022982927767187362; dx=-0.6997165714186744 evalInputDelta=3.487202510435439E-8
Adding measurement 2689cdac to history. Total: 14
New Minimum: 0.022982927767187362 > 0.022982509303199738
WOLFE (weak): th(1.3415709627070094E-5)=0.022982509303199738; dx=-0.6997165246850534 evalInputDelta=4.533360127290109E-7
Adding measurement 2f6d5f09 to history. Total: 15
New Minimum: 0.022982509303199738 > 0.02297661599684138
WOLFE (weak): th(1.878199347789813E-4)=0.02297661599684138; dx=-0.6997158665336467 evalInputDelta=6.346642371087702E-6
Adding measurement 3a5fb0c7 to history. Total: 16
New Minimum: 0.02297661599684138 > 0.022887776978713734
WOLFE (weak): th(0.0028172990216847197)=0.022887776978713734; dx=-0.6997059467479128 evalInputDelta=9.51856604987332E-5
Adding measurement 2d45e365 to history. Total: 17
New Minimum: 0.022887776978713734 > 0.021463573671755953
WOLFE (weak): th(0.045076784346955515)=0.021463573671755953; dx=-0.6995473184039109 evalInputDelta=0.0015193889674565142
Adding measurement 48bc5d96 to history. Total: 18
New Minimum: 0.021463573671755953 > 0.00635755554124789
WOLFE (weak): th(0.7663053338982437)=0.00635755554124789; dx=-0.6979052979536065 evalInputDelta=0.016625407097964576
Adding measurement e569833 to history. Total: 19
New Minimum: 0.00635755554124789 > 0.0029222714883684947
WOLFE (weak): th(13.793496010168386)=0.0029222714883684947; dx=-0.6976645591792171 evalInputDelta=0.020060691150

...skipping 9869 bytes...

86-4fec-b30c-47d51d24937e = 1.000/1.000e+00, d080cf61-cdd1-45f2-b967-7b3233ef261f = 1.000/1.000e+00]
Orientation rejected. Popping history element from 0.0, 0.0029222714883684947, 0.00635755554124789, 0.021463573671755953, 0.022887776978713734, 0.02297661599684138, 0.022982509303199738, 0.022982927767187362, 0.022982959733210222, 0.022982962375030447, 0.022982962612794266
Rejected: LBFGS Orientation magnitude: 2.376e+02, gradient 8.342e-01, dot -0.969; [86681499-09b9-406e-adad-c555398a4768 = 1.000/1.000e+00, 1635672f-c1e8-417b-9f3a-0090cd2b2f4e = 1.000/1.000e+00, 7c1af3ac-4e86-4fec-b30c-47d51d24937e = 1.000/1.000e+00, d080cf61-cdd1-45f2-b967-7b3233ef261f = 1.000/1.000e+00, 38e6b7e8-25d3-4d65-bf53-bfa3f32be45a = 1.000/1.000e+00]
Orientation rejected. Popping history element from 0.0, 0.0029222714883684947, 0.00635755554124789, 0.021463573671755953, 0.022887776978713734, 0.02297661599684138, 0.022982509303199738, 0.022982927767187362, 0.022982959733210222, 0.022982962375030447
Rejected: LBFGS Orientation magnitude: 2.376e+02, gradient 8.342e-01, dot -0.969; [7c1af3ac-4e86-4fec-b30c-47d51d24937e = 1.000/1.000e+00, 86681499-09b9-406e-adad-c555398a4768 = 1.000/1.000e+00, d080cf61-cdd1-45f2-b967-7b3233ef261f = 1.000/1.000e+00, 1635672f-c1e8-417b-9f3a-0090cd2b2f4e = 1.000/1.000e+00, 38e6b7e8-25d3-4d65-bf53-bfa3f32be45a = 1.000/1.000e+00]
Orientation rejected. Popping history element from 0.0, 0.0029222714883684947, 0.00635755554124789, 0.021463573671755953, 0.022887776978713734, 0.02297661599684138, 0.022982509303199738, 0.022982927767187362, 0.022982959733210222
Rejected: LBFGS Orientation magnitude: 2.376e+02, gradient 8.342e-01, dot -0.969; [38e6b7e8-25d3-4d65-bf53-bfa3f32be45a = 1.000/1.000e+00, 1635672f-c1e8-417b-9f3a-0090cd2b2f4e = 1.000/1.000e+00, 7c1af3ac-4e86-4fec-b30c-47d51d24937e = 1.000/1.000e+00, d080cf61-cdd1-45f2-b967-7b3233ef261f = 1.000/1.000e+00, 86681499-09b9-406e-adad-c555398a4768 = 1.000/1.000e+00]
Orientation rejected. Popping history element from 0.0, 0.0029222714883684947, 0.00635755554124789, 0.021463573671755953, 0.022887776978713734, 0.02297661599684138, 0.022982509303199738, 0.022982927767187362
Rejected: LBFGS Orientation magnitude: 3.267e+02, gradient 8.342e-01, dot -0.955; [38e6b7e8-25d3-4d65-bf53-bfa3f32be45a = 1.000/1.000e+00, d080cf61-cdd1-45f2-b967-7b3233ef261f = 1.000/1.000e+00, 86681499-09b9-406e-adad-c555398a4768 = 1.000/1.000e+00, 1635672f-c1e8-417b-9f3a-0090cd2b2f4e = 1.000/1.000e+00, 7c1af3ac-4e86-4fec-b30c-47d51d24937e = 1.000/1.000e+00]
Orientation rejected. Popping history element from 0.0, 0.0029222714883684947, 0.00635755554124789, 0.021463573671755953, 0.022887776978713734, 0.02297661599684138, 0.022982509303199738
Rejected: LBFGS Orientation magnitude: 3.035e+02, gradient 8.342e-01, dot -0.995; [7c1af3ac-4e86-4fec-b30c-47d51d24937e = 1.000/1.000e+00, 38e6b7e8-25d3-4d65-bf53-bfa3f32be45a = 1.000/1.000e+00, d080cf61-cdd1-45f2-b967-7b3233ef261f = 1.000/1.000e+00, 1635672f-c1e8-417b-9f3a-0090cd2b2f4e = 1.000/1.000e+00, 86681499-09b9-406e-adad-c555398a4768 = 1.000/1.000e+00]
Orientation rejected. Popping history element from 0.0, 0.0029222714883684947, 0.00635755554124789, 0.021463573671755953, 0.022887776978713734, 0.02297661599684138
Rejected: LBFGS Orientation magnitude: 3.059e+02, gradient 8.342e-01, dot -0.995; [d080cf61-cdd1-45f2-b967-7b3233ef261f = 1.000/1.000e+00, 1635672f-c1e8-417b-9f3a-0090cd2b2f4e = 1.000/1.000e+00, 7c1af3ac-4e86-4fec-b30c-47d51d24937e = 1.000/1.000e+00, 38e6b7e8-25d3-4d65-bf53-bfa3f32be45a = 1.000/1.000e+00, 86681499-09b9-406e-adad-c555398a4768 = 1.000/1.000e+00]
Orientation rejected. Popping history element from 0.0, 0.0029222714883684947, 0.00635755554124789, 0.021463573671755953, 0.022887776978713734
Rejected: LBFGS Orientation magnitude: 5.051e+02, gradient 8.342e-01, dot -0.996; [1635672f-c1e8-417b-9f3a-0090cd2b2f4e = 1.000/1.000e+00, 7c1af3ac-4e86-4fec-b30c-47d51d24937e = 1.000/1.000e+00, d080cf61-cdd1-45f2-b967-7b3233ef261f = 1.000/1.000e+00, 38e6b7e8-25d3-4d65-bf53-bfa3f32be45a = 1.000/1.000e+00, 86681499-09b9-406e-adad-c555398a4768 = 1.000/1.000e+00]
Orientation rejected. Popping history element from 0.0, 0.0029222714883684947, 0.00635755554124789, 0.021463573671755953
LBFGS Accumulation History: 3 points
Removed measurement 78da7033 to history. Total: 20
Removed measurement e569833 to history. Total: 19
Removed measurement 48bc5d96 to history. Total: 18
Removed measurement 2d45e365 to history. Total: 17
Removed measurement 3a5fb0c7 to history. Total: 16
Removed measurement 2f6d5f09 to history. Total: 15
Removed measurement 2689cdac to history. Total: 14
Removed measurement 4831ea5e to history. Total: 13
Removed measurement 442596f3 to history. Total: 12
Removed measurement 73ad7685 to history. Total: 11
Removed measurement 19e5113c to history. Total: 10
Removed measurement 79b60f60 to history. Total: 9
Removed measurement 5b47a4ab to history. Total: 8
Removed measurement 4c39716c to history. Total: 7
Removed measurement 708e5317 to history. Total: 6
Removed measurement 7a415d9a to history. Total: 5
Removed measurement 46c2d982 to history. Total: 4
Removed measurement 456ef468 to history. Total: 3
Adding measurement 7575a84e to history. Total: 3
th(0)=0.0;dx=-0.6958329600000002
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(70622.42891358322)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(35311.21445679161)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(11770.404818930538)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(2942.6012047326344)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(588.5202409465269)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(98.08670682442114)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(14.012386689203021)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(1.7515483361503776)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(0.1946164817944864)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(0.01946164817944864)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(0.0017692407435862402)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(1.4743672863218668E-4)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(1.1341286817860514E-5)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(8.100919155614653E-7)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(5.4006127704097684E-8)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(3.3753829815061053E-9)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(1.9855194008859441E-10)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(1.1030663338255245E-11)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(5.805612283292234E-13)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(2.902806141646117E-14)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
Armijo: th(1.3822886388791035E-15)=0.0; dx=-0.6958329600000002 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 4
MIN ALPHA (6.283130176723198E-17): th(0.0)=0.0
Fitness changed from 0.0 to 0.0
Static Iteration Total: 2.2329; Orientation: 2.1730; Line Search: 0.0583
Iteration 3 failed. Error: 0.0
Previous Error: 0.0 -> 0.0
Optimization terminated 3
Final threshold in iteration 3: 0.0 (> 0.0) after 2.367s (< 30.000s)

Returns

    0.0

Training Converged

TrainingTester.java:432 executed in 0.11 seconds (0.000 gc):

    return TestUtil.compare(title + " vs Iteration", runs);
Logging
Plotting range=[1.0, -2.638593988923217], [2.0, -0.6385939889232168]; valueStats=DoubleSummaryStatistics{count=2, sum=0.045966, min=0.022983, average=0.022983, max=0.022983}
Plotting 2 points for GD
Only 1 points for CjGD
Plotting 2 points for LBFGS

Returns

[Plot: convergence vs. iteration for GD, CjGD, and LBFGS; image not included]

TrainingTester.java:435 executed in 0.01 seconds (0.000 gc):

    return TestUtil.compareTime(title + " vs Time", runs);
Logging
Plotting range=[0.0, -2.638593988923217], [0.148, -0.6385939889232168]; valueStats=DoubleSummaryStatistics{count=2, sum=0.045966, min=0.022983, average=0.022983, max=0.022983}
Plotting 2 points for GD
Only 1 points for CjGD
Plotting 2 points for LBFGS

Returns

[Plot: convergence vs. time for GD, CjGD, and LBFGS; image not included]

Model Learning

In this test, we attempt to train a network to emulate a randomized network, given an example input/output pair. The target state is:

TrainingTester.java:370 executed in 0.00 seconds (0.000 gc):

    RefList<double[]> temp_18_0042 = network_target.state();
    assert temp_18_0042 != null;
    String temp_18_0041 = temp_18_0042.stream().map(RefArrays::toString).reduce((a, b) -> a + "\n" + b).orElse("");
    temp_18_0042.freeRef();
    return temp_18_0041;

Returns

    [-0.384, -1.028, -1.72]

Gradient Descent

First, we train using a basic gradient descent method with weak line search conditions.

TrainingTester.java:480 executed in 0.03 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new ArmijoWolfeSearch());
      iterativeTrainer.setOrientation(new GradientDescent());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }
Logging
Reset training subject: 2421357145684
Reset training subject: 2421358785758
Constructing line search parameters: GD
th(0)=42.128889865140266;dx=-2.2668313600047933E24
New Minimum: 42.128889865140266 > 0.0
Armijo: th(2.154434690031884)=0.0; dx=-2.726942720003644E12 evalInputDelta=42.128889865140266
Armijo: th(1.077217345015942)=0.0017409562538554559; dx=-2.726942720003644E12 evalInputDelta=42.12714890888641
Armijo: th(0.3590724483386473)=0.006892413810504286; dx=-2.7269427200036445E12 evalInputDelta=42.12199745132976
Armijo: th(0.08976811208466183)=0.01701975336754217; dx=-2.726942720003653E12 evalInputDelta=42.111870111772724
Armijo: th(0.017953622416932366)=0.02133515271884993; dx=-2.726942720003657E12 evalInputDelta=42.10755471242142
Armijo: th(0.002992270402822061)=0.022974350725975968; dx=-2.7269427200036587E12 evalInputDelta=42.10591551441429
Armijo: th(4.2746720040315154E-4)=0.02325774434686041; dx=-2.7269427200036587E12 evalInputDelta=42.10563212079341
Armijo: th(5.343340005039394E-5)=0.023299131719747325; dx=-2.7269427200036587E12 evalInputDelta=42.10559073342052
Armijo: th(5.9370444500437714E-6)=0.023304388338532742; dx=-2.7269427200036587E12 evalInputDelta=42.105585476801735
Armijo: th(5.937044450043771E-7)=0.023304979723372095; dx=-2.7269427200036587E12 evalInputDelta=42.105584885416896
Armijo: th(5.397313136403428E-8)=0.02330503945938569; dx=-2.7269427200036587E12 evalInputDelta=42.10558482568088
Armijo: th(4.4977609470028565E-9)=0.023305044935188507; dx=-2.7269427200036587E12 evalInputDelta=42.10558482020508
Armijo: th(3.4598161130791205E-10)=0.02330504539469645; dx=-2.7269427200036587E12 evalInputDelta=42.10558481974557
Armijo: th(2.4712972236279432E-11)=0.023305045430253617; dx=-2.7269427200036587E12 evalInputDelta=42.10558481971001
Armijo: th(1.6475314824186289E-12)=0.023305045432806433; dx=-2.7269427200036587E12 evalInputDelta=42.10558481970746
Armijo: th(1.029707176511643E-13)=41.94799798981645; dx=-2.2668313600037228E24 evalInputDelta=0.180891875323816
Armijo: th(6.057101038303783E-15)=42.1147087289793; dx=-2.2668313600046843E24 evalInputDelta=0.014181136160964058
MIN ALPHA (3.3650561323909904E-16): th(2.154434690031884)=0.0
Fitness changed from 42.128889865140266 to 0.0
Iteration 1 complete. Error: 0.0 Total: 0.0261; Orientation: 0.0003; Line Search: 0.0198
th(0)=0.0;dx=-4.025943680000001
Armijo: th(2.154434690031884E-15)=0.0; dx=-4.025943680000001 evalInputDelta=0.0
Armijo: th(1.077217345015942E-15)=0.0; dx=-4.025943680000001 evalInputDelta=0.0
MIN ALPHA (3.5907244833864734E-16): th(0.0)=0.0
Fitness changed from 0.0 to 0.0
Static Iteration Total: 0.0051; Orientation: 0.0002; Line Search: 0.0039
Iteration 2 failed. Error: 0.0
Previous Error: 0.0 -> 0.0
Optimization terminated 2
Final threshold in iteration 2: 0.0 (> 0.0) after 0.032s (< 30.000s)

Returns

    0.0

Training Converged
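
The `Armijo:` lines in the log above record trial step sizes along the search direction. As a minimal sketch of what a weak (Armijo-Wolfe) line search checks, the following bisection-style search accepts a step once it gives sufficient decrease and the slope has flattened enough. It is a textbook illustration, not the library's `ArmijoWolfeSearch`; the class name `WeakWolfeSketch`, the constants `c1` and `c2`, and the test function are assumptions made for this example.

```java
// Illustrative sketch only; not the ArmijoWolfeSearch implementation used in the report.
class WeakWolfeSketch {
  /**
   * Searches for a step size alpha satisfying the weak Wolfe conditions:
   *   Armijo (sufficient decrease): phi(alpha) <= phi(0) + c1 * alpha * phi'(0)
   *   curvature:                    phi'(alpha) >= c2 * phi'(0)
   * phi is the objective restricted to the search line; dphi is its derivative.
   */
  static double search(java.util.function.DoubleUnaryOperator phi,
                       java.util.function.DoubleUnaryOperator dphi,
                       double c1, double c2, int maxSteps) {
    double f0 = phi.applyAsDouble(0);
    double d0 = dphi.applyAsDouble(0); // expected to be negative (descent direction)
    double lo = 0;
    double hi = Double.POSITIVE_INFINITY;
    double alpha = 1;
    for (int i = 0; i < maxSteps; i++) {
      if (phi.applyAsDouble(alpha) > f0 + c1 * alpha * d0) {
        // Armijo fails: the step is too long, so shrink toward lo.
        hi = alpha;
        alpha = 0.5 * (lo + hi);
      } else if (dphi.applyAsDouble(alpha) < c2 * d0) {
        // Curvature fails: the slope is still steep, so the step is too short.
        lo = alpha;
        alpha = Double.isInfinite(hi) ? 2 * alpha : 0.5 * (lo + hi);
      } else {
        return alpha; // both conditions hold
      }
    }
    return alpha; // step budget exhausted; return the last trial
  }

  public static void main(String[] args) {
    // Hypothetical 1-D objective phi(a) = (a - 2)^2 with derivative 2 * (a - 2).
    double alpha = search(a -> (a - 2) * (a - 2), a -> 2 * (a - 2), 1e-4, 0.9, 50);
    System.out.println("accepted step: " + alpha);
  }
}
```

When the objective is already at its minimum, as in iteration 2 of the log, no trial step can improve it and a search of this kind ends once its step budget runs out.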

Conjugate Gradient Descent

Next, we use a conjugate gradient descent method, which converges fastest for purely linear functions.

TrainingTester.java:452 executed in 0.05 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new QuadraticSearch());
      iterativeTrainer.setOrientation(new GradientDescent());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }
Logging
Reset training subject: 2421392309046
Reset training subject: 2421393628540
Constructing line search parameters: GD
F(0.0) = LineSearchPoint{point=PointSample{avg=42.128889865140266}, derivative=-2.2668313600047933E24}
New Minimum: 42.128889865140266 > 0.02330504542192105
F(1.0E-10) = LineSearchPoint{point=PointSample{avg=0.02330504542192105}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.105584819718345
New Minimum: 0.02330504542192105 > 0.023305045355514635
F(7.000000000000001E-10) = LineSearchPoint{point=PointSample{avg=0.023305045355514635}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.10558481978475
New Minimum: 0.023305045355514635 > 0.023305044890669757
F(4.900000000000001E-9) = LineSearchPoint{point=PointSample{avg=0.023305044890669757}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.1055848202496
New Minimum: 0.023305044890669757 > 0.02330504163675568
F(3.430000000000001E-8) = LineSearchPoint{point=PointSample{avg=0.02330504163675568}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.10558482350351
New Minimum: 0.02330504163675568 > 0.02330501885935974
F(2.4010000000000004E-7) = LineSearchPoint{point=PointSample{avg=0.02330501885935974}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.105584846280905
New Minimum: 0.02330501885935974 > 0.023304859417716022
F(1.6807000000000003E-6) = LineSearchPoint{point=PointSample{avg=0.023304859417716022}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.10558500572255
New Minimum: 0.023304859417716022 > 0.023303743332477665
F(1.1764900000000001E-5) = LineSearchPoint{point=PointSample{avg=0.023303743332477665}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.10558612180779
New Minimum: 0.023303743332477665 > 0.023295931042903933
F(8.235430000000001E-5) = LineSearchPoint{point=PointSample{avg=0.023295931042903933}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.105593934097364
New Minimum: 0.023295931042903933 > 0.02324126005744106
F(5.764801000000001E-4) = LineSearchPoint{point=PointSample{avg=0.02324126005744106}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.10564860508283
New Minimum: 0.02324126005744106 > 0.022859298112692036
F(0.004035360700000001) = LineSearchPoint{point=PointSample{avg=0.022859298112692036}, derivative=-2.7269427200036587E12}, evalInputDelta = -42.106030567027574
New Minimum: 0.022859298112692036 > 0.020220878935683723
F(0.028247524900000005) = LineSearchPoint{point=PointSample{avg=0.020220878935683723}, derivative=-2.726942720003656E12}, evalInputDelta = -42.108668986204584
New Minimum: 0.020220878935683723 > 0.012171736855254247
F(0.19773267430000002) = LineSearchPoint{point=PointSample{avg=0.012171736855254247}, derivative=-2.7269427200036484E12}, evalInputDelta = -42.116718128285015
New Minimum: 0.012171736855254247 > 0.0
F(1.3841287201) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(9.688901040700001) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(67.8223072849) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(474.7561509943) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(3323.2930569601003) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(23263.0513987207) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(162841.3597910449) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(1139889.5185373144) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(7979226.6297612) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(5.58545864083284E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(3.909821048582988E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(2.7368747340080914E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
F(1.915812313805664E10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
0.0 <= 42.128889865140266
F(1.0E10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.726942720003644E12}, evalInputDelta = -42.128889865140266
Right bracket at 1.0E10
Converged to right
Fitness changed from 42.128889865140266 to 0.0
Iteration 1 complete. Error: 0.0 Total: 0.0285; Orientation: 0.0002; Line Search: 0.0253
F(0.0) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}
F(1.0E10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
0.0 <= 0.0
F(5.0E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 5.0E9
F(2.5E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 2.5E9
F(1.25E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 1.25E9
F(6.25E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 6.25E8
F(3.125E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 3.125E8
F(1.5625E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 1.5625E8
F(7.8125E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 7.8125E7
F(3.90625E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 3.90625E7
F(1.953125E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 1.953125E7
F(9765625.0) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 9765625.0
F(4882812.5) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Right bracket at 4882812.5
F(2441406.25) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-4.025943680000001}, evalInputDelta = 0.0
Loops = 12
Fitness changed from 0.0 to 0.0
Static Iteration Total: 0.0223; Orientation: 0.0002; Line Search: 0.0214
Iteration 2 failed. Error: 0.0
Previous Error: 0.0 -> 0.0
Optimization terminated 2
Final threshold in iteration 2: 0.0 (> 0.0) after 0.050s (< 30.000s)

Returns

    0.0

Training Converged
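
For reference, the sketch below shows the textbook linear conjugate gradient iteration on a small symmetric positive-definite system, where it reaches the exact solution in at most as many steps as there are unknowns. It is not the code path logged above, which pairs the `GradientDescent` orientation with `QuadraticSearch`; the class name `ConjugateGradientSketch` and the 2x2 system are assumptions chosen for illustration.

```java
// Illustrative sketch only; textbook linear conjugate gradient, not the library's trainer.
class ConjugateGradientSketch {
  /** Solves A x = b for symmetric positive-definite A, starting from x = 0. */
  static double[] solve(double[][] a, double[] b, int maxIter, double tol) {
    int n = b.length;
    double[] x = new double[n];
    double[] r = b.clone(); // residual b - A x, with x = 0
    double[] p = r.clone(); // first search direction is the residual
    double rs = dot(r, r);
    for (int k = 0; k < maxIter && Math.sqrt(rs) > tol; k++) {
      double[] ap = multiply(a, p);
      double alpha = rs / dot(p, ap); // exact minimizer along p
      for (int i = 0; i < n; i++) {
        x[i] += alpha * p[i];
        r[i] -= alpha * ap[i];
      }
      double rsNew = dot(r, r);
      double beta = rsNew / rs; // keeps the new direction A-conjugate to the old one
      for (int i = 0; i < n; i++) {
        p[i] = r[i] + beta * p[i];
      }
      rs = rsNew;
    }
    return x;
  }

  static double dot(double[] u, double[] v) {
    double sum = 0;
    for (int i = 0; i < u.length; i++) sum += u[i] * v[i];
    return sum;
  }

  static double[] multiply(double[][] a, double[] v) {
    double[] out = new double[a.length];
    for (int i = 0; i < a.length; i++) out[i] = dot(a[i], v);
    return out;
  }

  public static void main(String[] args) {
    // Hypothetical 2x2 system; two iterations suffice in exact arithmetic.
    double[] x = solve(new double[][] {{4, 1}, {1, 3}}, new double[] {1, 2}, 2, 1e-12);
    System.out.println(java.util.Arrays.toString(x));
  }
}
```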

Limited-Memory BFGS

Next, we apply the same optimization using L-BFGS, which is nearly ideal for purely second-order or quadratic functions.

TrainingTester.java:509 executed in 0.03 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new ArmijoWolfeSearch());
      iterativeTrainer.setOrientation(new LBFGS());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setIterationsPerSample(100);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }
Logging
Reset training subject: 2421447641636
Reset training subject: 2421448886562
Adding measurement 5cd13a0c to history. Total: 0
LBFGS Accumulation History: 1 points
Constructing line search parameters: GD
Non-optimal measurement 42.128889865140266 < 42.128889865140266. Total: 1
th(0)=42.128889865140266;dx=-2.2668313600047933E24
Adding measurement 3a72ebd9 to history. Total: 1
New Minimum: 42.128889865140266 > 0.0
Armijo: th(2.154434690031884)=0.0; dx=-2.726942720003644E12 evalInputDelta=42.128889865140266
Non-optimal measurement 0.0017409562538554559 < 0.0. Total: 2
Armijo: th(1.077217345015942)=0.0017409562538554559; dx=-2.726942720003644E12 evalInputDelta=42.12714890888641
Non-optimal measurement 0.006892413810504286 < 0.0. Total: 2
Armijo: th(0.3590724483386473)=0.006892413810504286; dx=-2.7269427200036445E12 evalInputDelta=42.12199745132976
Non-optimal measurement 0.01701975336754217 < 0.0. Total: 2
Armijo: th(0.08976811208466183)=0.01701975336754217; dx=-2.726942720003653E12 evalInputDelta=42.111870111772724
Non-optimal measurement 0.02133515271884993 < 0.0. Total: 2
Armijo: th(0.017953622416932366)=0.02133515271884993; dx=-2.726942720003657E12 evalInputDelta=42.10755471242142
Non-optimal measurement 0.022974350725975968 < 0.0. Total: 2
Armijo: th(0.002992270402822061)=0.022974350725975968; dx=-2.7269427200036587E12 evalInputDelta=42.10591551441429
Non-optimal measurement 0.02325774434686041 < 0.0. Total: 2
Armijo: th(4.2746720040315154E-4)=0.02325774434686041; dx=-2.7269427200036587E12 evalInputDelta=42.10563212079341
Non-optimal measurement 0.023299131719747325 < 0.0. Total: 2
Armijo: th(5.343340005039394E-5)=0.023299131719747325; dx=-2.7269427200036587E12 evalInputDelta=42.10559073342052
Non-optimal measurement 0.023304388338532742 < 0.0. Total: 2
Armijo: th(5.9370444500437714E-6)=0.023304388338532742; dx=-2.7269427200036587E12 evalInputDelta=42.105585476801735
Non-optimal measurement 0.023304979723372095 < 0.0. Total: 2
Armijo: th(5.937044450043771E-7)=0.023304979723372095; dx=-2.7269427200036587E12 evalInputDelta=42.105584885416896
Non-optimal measurement 0.02330503945938569 < 0.0. Total: 2
Armijo: th(5.397313136403428E-8)=0.02330503945938569; dx=-2.7269427200036587E12 evalInputDelta=42.10558482568088
Non-optimal measurement 0.023305044935188507 < 0.0. Total: 2
Armijo: th(4.4977609470028565E-9)=0.023305044935188507; dx=-2.7269427200036587E12 evalInputDelta=42.10558482020508
Non-optimal measurement 0.02330504539469645 < 0.0. Total: 2
Armijo: th(3.4598161130791205E-10)=0.02330504539469645; dx=-2.7269427200036587E12 evalInputDelta=42.10558481974557
Non-optimal measurement 0.023305045430253617 < 0.0. Total: 2
Armijo: th(2.4712972236279432E-11)=0.023305045430253617; dx=-2.7269427200036587E12 evalInputDelta=42.10558481971001
Non-optimal measurement 0.023305045432806433 < 0.0. Total: 2
Armijo: th(1.6475314824186289E-12)=0.023305045432806433; dx=-2.7269427200036587E12 evalInputDelta=42.10558481970746
Non-optimal measurement 41.94799798981645 < 0.0. Total: 2
Armijo: th(1.029707176511643E-13)=41.94799798981645; dx=-2.2668313600037228E24 evalInputDelta=0.180891875323816
Non-optimal measurement 42.1147087289793 < 0.0. Total: 2
Armijo: th(6.057101038303783E-15)=42.1147087289793; dx=-2.2668313600046843E24 evalInputDelta=0.014181136160964058
Non-optimal measurement 0.0 < 0.0. Total: 2
MIN ALPHA (3.3650561323909904E-16): th(2.154434690031884)=0.0
Fitness changed from 42.128889865140266 to 0.0
Iteration 1 complete. Error: 0.0 Total: 0.0233; Orientation: 0.0006; Line Search: 0.0199
Non-optimal measurement 0.0 < 0.0. Total: 2
LBFGS Accumulation History: 2 points
Non-optimal measurement 0.0 < 0.0. Total: 2
th(0)=0.0;dx=-4.025943680000001
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(2.154434690031884E-15)=0.0; dx=-4.025943680000001 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(1.077217345015942E-15)=0.0; dx=-4.025943680000001 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 2
MIN ALPHA (3.5907244833864734E-16): th(0.0)=0.0
Fitness changed from 0.0 to 0.0
Static Iteration Total: 0.0057; Orientation: 0.0004; Line Search: 0.0044
Iteration 2 failed. Error: 0.0
Previous Error: 0.0 -> 0.0
Optimization terminated 2
Final threshold in iteration 2: 0.0 (> 0.0) after 0.029s (< 30.000s)

Returns

    0.0

Training Converged
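
The `LBFGS Accumulation History` lines above refer to the stored steps that L-BFGS uses to approximate curvature. A minimal sketch of the standard two-loop recursion follows, assuming the history is kept as pairs of parameter differences `s` and gradient differences `y`. It is a textbook version, not the library's `LBFGS` class; the class and method names are hypothetical.

```java
// Illustrative sketch only; textbook L-BFGS two-loop recursion, not the library's LBFGS class.
class LbfgsTwoLoopSketch {
  /**
   * Applies the L-BFGS inverse-Hessian approximation H to a gradient.
   * s[k] = x_{k+1} - x_k and y[k] = g_{k+1} - g_k, ordered oldest first.
   * Returns H * gradient; the search direction is the negation of the result.
   */
  static double[] applyInverseHessian(double[][] s, double[][] y, double[] gradient) {
    int m = s.length;
    double[] q = gradient.clone();
    double[] alpha = new double[m];
    double[] rho = new double[m];
    // First loop: newest pair to oldest.
    for (int k = m - 1; k >= 0; k--) {
      rho[k] = 1.0 / dot(y[k], s[k]);
      alpha[k] = rho[k] * dot(s[k], q);
      addScaled(q, -alpha[k], y[k]);
    }
    // Initial scaling gamma = (s.y)/(y.y) from the newest pair.
    if (m > 0) {
      double gamma = dot(s[m - 1], y[m - 1]) / dot(y[m - 1], y[m - 1]);
      for (int i = 0; i < q.length; i++) q[i] *= gamma;
    }
    // Second loop: oldest pair to newest.
    for (int k = 0; k < m; k++) {
      double beta = rho[k] * dot(y[k], q);
      addScaled(q, alpha[k] - beta, s[k]);
    }
    return q;
  }

  static double dot(double[] u, double[] v) {
    double sum = 0;
    for (int i = 0; i < u.length; i++) sum += u[i] * v[i];
    return sum;
  }

  /** target += scale * v, in place. */
  static void addScaled(double[] target, double scale, double[] v) {
    for (int i = 0; i < target.length; i++) target[i] += scale * v[i];
  }

  public static void main(String[] args) {
    // Hypothetical history taken from a quadratic with Hessian diag(2, 8), so y = A s.
    double[][] s = {{1, 0}, {0.5, 0.25}};
    double[][] y = {{2, 0}, {1, 2}};
    double[] hg = applyInverseHessian(s, y, new double[] {1, 2});
    System.out.println(java.util.Arrays.toString(hg));
  }
}
```

A useful sanity check on any such implementation is the secant condition: applying the approximation to the newest `y` should return the newest `s`.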

TrainingTester.java:432 executed in 0.01 seconds (0.000 gc):

    return TestUtil.compare(title + " vs Iteration", runs);
Logging
Plotting range=[0.0, 0.0], [2.0, 1.0]; valueStats=DoubleSummaryStatistics{count=0, sum=0.000000, min=Infinity, average=0.000000, max=-Infinity}
Only 0 points for GD
Only 0 points for CjGD
Only 0 points for LBFGS

Returns

Result

TrainingTester.java:435 executed in 0.00 seconds (0.000 gc):

    return TestUtil.compareTime(title + " vs Time", runs);
Logging
No Data
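
The empty statistics in the log above (`count=0`, `min=Infinity`, `max=-Infinity`) are what `java.util.DoubleSummaryStatistics` reports when it has received no values. One plausible reading, which is an assumption and not something the log confirms, is that zero-valued errors are dropped before plotting on a logarithmic scale, leaving nothing to plot when every run ends at exactly 0.0. The class name `EmptyPlotStatsSketch` and the filter are hypothetical.

```java
// Illustrative sketch only; the positive-value filter is an assumed explanation.
class EmptyPlotStatsSketch {
  /** Summarizes only strictly positive values, as a log-scale axis would require. */
  static java.util.DoubleSummaryStatistics positiveStats(double... values) {
    return java.util.stream.DoubleStream.of(values)
        .filter(v -> v > 0)
        .summaryStatistics();
  }

  public static void main(String[] args) {
    // Every recorded error is 0.0, so nothing survives the filter.
    System.out.println(positiveStats(0.0, 0.0));
  }
}
```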

Composite Learning

In this test, we attempt to train a network to emulate a randomized network, given an example input/output pair. The target state is:

TrainingTester.java:279 executed in 0.00 seconds (0.000 gc):

    RefList<double[]> temp_18_0037 = temp_18_0035.state();
    assert temp_18_0037 != null;
    String temp_18_0036 = temp_18_0037.stream().map(RefArrays::toString).reduce((a, b) -> a + "\n" + b).orElse("");
    temp_18_0037.freeRef();
    return temp_18_0036;

Returns

    [-0.384, -1.72, -1.028]

We simultaneously regress this target input:

TrainingTester.java:287 executed in 0.00 seconds (0.000 gc):

    return RefArrays.stream(RefUtil.addRef(input_target)).flatMap(x -> {
      RefStream<Tensor> temp_18_0006 = RefArrays.stream(RefUtil.addRef(x));
      if (null != x)
        RefUtil.freeRef(x);
      return temp_18_0006;
    }).map(x -> {
      String temp_18_0007 = x.prettyPrint();
      x.freeRef();
      return temp_18_0007;
    }).reduce((a, b) -> a + "\n" + b).orElse("");

Returns

    [
    	[ [ 1.048, -0.712, -1.688 ], [ 1.512, 1.108, -0.804 ] ],
    	[ [ 1.556, 0.028, 1.356 ], [ -1.616, 1.912, -0.852 ] ]
    ]
    [
    	[ [ -0.852, 1.048, 1.512 ], [ 1.108, -0.804, 1.356 ] ],
    	[ [ -1.616, -1.688, 1.912 ], [ -0.712, 1.556, 0.028 ] ]
    ]
    [
    	[ [ -0.804, -0.852, -1.688 ], [ -0.712, 1.048, 1.512 ] ],
    	[ [ -1.616, 1.912, 0.028 ], [ 1.356, 1.108, 1.556 ] ]
    ]
    [
    	[ [ 1.108, 1.356, -1.688 ], [ -0.852, 1.048, 1.912 ] ],
    	[ [ -1.616, -0.712, -0.804 ], [ 1.512, 1.556, 0.028 ] ]
    ]
    [
    	[ [ -1.616, 1.912, 0.028 ], [ -0.804, 1.048, 1.356 ] ],
    	[ [ -0.712, 1.512, 1.556 ], [ -1.688, -0.852, 1.108 ] ]
    ]

Which produces the following output:

TrainingTester.java:308 executed in 0.00 seconds (0.000 gc):

    return RefStream.of(RefUtil.addRef(output_target)).map(x -> {
      String temp_18_0008 = x.prettyPrint();
      x.freeRef();
      return temp_18_0008;
    }).reduce((a, b) -> a + "\n" + b).orElse("");

Returns

    [
    	[ [ 0.664, -2.432, -2.716 ], [ 1.1280000000000001, -0.6119999999999999, -1.832 ] ],
    	[ [ 1.1720000000000002, -1.692, 0.32800000000000007 ], [ -2.0, 0.19199999999999995, -1.88 ] ]
    ]
    [
    	[ [ -1.236, -0.6719999999999999, 0.484 ], [ 0.7240000000000001, -2.524, 0.32800000000000007 ] ],
    	[ [ -2.0, -3.408, 0.8839999999999999 ], [ -1.096, -0.16399999999999992, -1.0 ] ]
    ]
    [
    	[ [ -1.1880000000000002, -2.572, -2.716 ], [ -1.096, -0.6719999999999999, 0.484 ] ],
    	[ [ -2.0, 0.19199999999999995, -1.0 ], [ 0.9720000000000001, -0.6119999999999999, 0.528 ] ]
    ]
    [
    	[ [ 0.7240000000000001, -0.3639999999999999, -2.716 ], [ -1.236, -0.6719999999999999, 0.8839999999999999 ] ],
    	[ [ -2.0, -2.432, -1.832 ], [ 1.1280000000000001, -0.16399999999999992, -1.0 ] ]
    ]
    [
    	[ [ -2.0, 0.19199999999999995, -1.0 ], [ -1.1880000000000002, -0.6719999999999999, 0.32800000000000007 ] ],
    	[ [ -1.096, -0.20799999999999996, 0.528 ], [ -2.072, -2.572, 0.08000000000000007 ] ]
    ]
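
Comparing the listings above suggests that each output pixel equals the corresponding input pixel plus the target state, applied as one bias value per channel. This is an observation drawn from the printed values, not something the report states. The sketch below checks it for the first two pixels of the first tensor; the class name `BandBiasCheck` and its helper are hypothetical.

```java
// Illustrative sketch only; checks the printed values against a per-channel bias.
class BandBiasCheck {
  /** Adds one bias value per channel to a single pixel. */
  static double[] addBias(double[] pixel, double[] bias) {
    double[] out = new double[pixel.length];
    for (int c = 0; c < pixel.length; c++) {
      out[c] = pixel[c] + bias[c];
    }
    return out;
  }

  public static void main(String[] args) {
    double[] bias = {-0.384, -1.72, -1.028};   // target state from the report
    double[] first = {1.048, -0.712, -1.688};  // first input pixel
    double[] second = {1.512, 1.108, -0.804};  // second input pixel
    System.out.println(java.util.Arrays.toString(addBias(first, bias)));
    System.out.println(java.util.Arrays.toString(addBias(second, bias)));
  }
}
```

The trailing digits in values such as `1.1280000000000001` in the listing are consistent with ordinary double-precision rounding in such an addition.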

Gradient Descent

First, we train using a basic gradient descent method with weak line search conditions.

TrainingTester.java:480 executed in 0.05 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new ArmijoWolfeSearch());
      iterativeTrainer.setOrientation(new GradientDescent());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }
Logging
Reset training subject: 2421549255688
Reset training subject: 2421550763665
Constructing line search parameters: GD
th(0)=51.25945839960847;dx=-1.992293120002386E24
New Minimum: 51.25945839960847 > 0.0
Armijo: th(2.154434690031884)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Armijo: th(1.077217345015942)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Armijo: th(0.3590724483386473)=0.0; dx=-2.3792243200021724E12 evalInputDelta=51.25945839960847
Armijo: th(0.08976811208466183)=0.0; dx=-2.3792243200021724E12 evalInputDelta=51.25945839960847
Armijo: th(0.017953622416932366)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Armijo: th(0.002992270402822061)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Armijo: th(4.2746720040315154E-4)=0.0; dx=-2.3792243200021724E12 evalInputDelta=51.25945839960847
Armijo: th(5.343340005039394E-5)=0.0; dx=-2.3792243200021724E12 evalInputDelta=51.25945839960847
Armijo: th(5.9370444500437714E-6)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Armijo: th(5.937044450043771E-7)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Armijo: th(5.397313136403428E-8)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Armijo: th(4.4977609470028565E-9)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Armijo: th(3.4598161130791205E-10)=0.0; dx=-2.3792243200021724E12 evalInputDelta=51.25945839960847
Armijo: th(2.4712972236279432E-11)=0.029826007595965275; dx=-2.3861507793630703E12 evalInputDelta=51.229632392012505
Armijo: th(1.6475314824186289E-12)=18.39207925235264; dx=-3.8786944000297005E23 evalInputDelta=32.86737914725583
Armijo: th(1.029707176511643E-13)=51.15354135915187; dx=-1.9922931200020968E24 evalInputDelta=0.10591704045659611
Armijo: th(6.057101038303783E-15)=51.25231089644202; dx=-1.9922931200023612E24 evalInputDelta=0.0071475031664505195
MIN ALPHA (3.3650561323909904E-16): th(2.154434690031884)=0.0
Fitness changed from 51.25945839960847 to 0.0
Iteration 1 complete. Error: 0.0 Total: 0.0414; Orientation: 0.0008; Line Search: 0.0365
th(0)=0.0;dx=-3.027991040000001
Armijo: th(2.154434690031884E-15)=0.0; dx=-3.0279910400000007 evalInputDelta=0.0
Armijo: th(1.077217345015942E-15)=0.0; dx=-3.0279910400000007 evalInputDelta=0.0
MIN ALPHA (3.5907244833864734E-16): th(0.0)=0.0
Fitness changed from 0.0 to 0.0
Static Iteration Total: 0.0069; Orientation: 0.0004; Line Search: 0.0056
Iteration 2 failed. Error: 0.0
Previous Error: 0.0 -> 0.0
Optimization terminated 2
Final threshold in iteration 2: 0.0 (> 0.0) after 0.049s (< 30.000s)

Returns

    0.0

Training Converged

Conjugate Gradient Descent

Next, we use a conjugate gradient descent method, which converges fastest for purely linear functions.

TrainingTester.java:452 executed in 0.09 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new QuadraticSearch());
      iterativeTrainer.setOrientation(new GradientDescent());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }
Logging
Reset training subject: 2421600764338
Reset training subject: 2421601615051
Constructing line search parameters: GD
F(0.0) = LineSearchPoint{point=PointSample{avg=51.25945839960847}, derivative=-1.9922931200023859E24}
New Minimum: 51.25945839960847 > 0.0
F(1.0E-10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(7.000000000000001E-10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(4.900000000000001E-9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.3792243200021724E12}, evalInputDelta = -51.25945839960847
F(3.430000000000001E-8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(2.4010000000000004E-7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.3792243200021724E12}, evalInputDelta = -51.25945839960847
F(1.6807000000000003E-6) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.3792243200021724E12}, evalInputDelta = -51.25945839960847
F(1.1764900000000001E-5) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(8.235430000000001E-5) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.3792243200021724E12}, evalInputDelta = -51.25945839960847
F(5.764801000000001E-4) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(0.004035360700000001) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(0.028247524900000005) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.3792243200021724E12}, evalInputDelta = -51.25945839960847
F(0.19773267430000002) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(1.3841287201) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.3792243200021724E12}, evalInputDelta = -51.25945839960847
F(9.688901040700001) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(67.8223072849) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(474.7561509943) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(3323.2930569601003) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.3792243200021724E12}, evalInputDelta = -51.25945839960847
F(23263.0513987207) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.3792243200021724E12}, evalInputDelta = -51.25945839960847
F(162841.3597910449) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(1139889.5185373144) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(7979226.6297612) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(5.58545864083284E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(3.909821048582988E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(2.7368747340080914E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
F(1.915812313805664E10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.379224320002172E12}, evalInputDelta = -51.25945839960847
0.0 <= 51.25945839960847
F(1.0E10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-2.3792243200021724E12}, evalInputDelta = -51.25945839960847
Right bracket at 1.0E10
Converged to right
Fitness changed from 51.25945839960847 to 0.0
Iteration 1 complete. Error: 0.0 Total: 0.0446; Orientation: 0.0004; Line Search: 0.0420
F(0.0) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}
F(1.0E10) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
0.0 <= 0.0
F(5.0E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Right bracket at 5.0E9
F(2.5E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Right bracket at 2.5E9
F(1.25E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.027991040000001}, evalInputDelta = 0.0
Right bracket at 1.25E9
F(2.5E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Right bracket at 2.5E9
F(1.25E9) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Right bracket at 1.25E9
F(6.25E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Right bracket at 6.25E8
F(3.125E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Right bracket at 3.125E8
F(1.5625E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Right bracket at 1.5625E8
F(7.8125E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.027991040000001}, evalInputDelta = 0.0
Right bracket at 7.8125E7
F(1.5625E8) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Right bracket at 1.5625E8
F(7.8125E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Right bracket at 7.8125E7
F(3.90625E7) = LineSearchPoint{point=PointSample{avg=0.0}, derivative=-3.0279910400000007}, evalInputDelta = 0.0
Loops = 12
Fitness changed from 0.0 to 0.0
Static Iteration Total: 0.0442; Orientation: 0.0004; Line Search: 0.0428
Iteration 2 failed. Error: 0.0
Previous Error: 0.0 -> 0.0
Optimization terminated 2
Final threshold in iteration 2: 0.0 (> 0.0) after 0.089s (< 30.000s)

Returns

    0.0

Training Converged
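
The line search logs above report, for each trial step size, the objective value and its derivative along the search direction. As a rough illustration of how such a pair can be tested, the sketch below checks the textbook Armijo (sufficient decrease) and weak Wolfe (curvature) conditions. The class name `WolfeCheck`, the method names, and the constants `C1` and `C2` are assumptions made for this example; they are common textbook defaults and are not taken from `ArmijoWolfeSearch`, whose bracketing logic may differ.

```java
public class WolfeCheck {
  // Textbook defaults, chosen for illustration only (not read from the library).
  static final double C1 = 1e-4;
  static final double C2 = 0.9;

  /** Armijo (sufficient decrease): phi(alpha) <= phi(0) + C1 * alpha * phi'(0). */
  static boolean armijo(double f0, double d0, double alpha, double fAlpha) {
    return fAlpha <= f0 + C1 * alpha * d0;
  }

  /** Weak Wolfe (curvature): phi'(alpha) >= C2 * phi'(0), where phi'(0) < 0. */
  static boolean curvature(double d0, double dAlpha) {
    return dAlpha >= C2 * d0;
  }

  /** Classifies a trial step as "too long", "too short", or "accept". */
  static String classify(double f0, double d0, double alpha, double fAlpha, double dAlpha) {
    if (!armijo(f0, d0, alpha, fAlpha)) return "too long";  // not enough decrease: shrink the step
    if (!curvature(d0, dAlpha)) return "too short";         // slope still steep: grow the step
    return "accept";
  }

  public static void main(String[] args) {
    // Toy objective along the line: phi(alpha) = (alpha - 1)^2, so phi(0) = 1 and phi'(0) = -2.
    for (double alpha : new double[]{0.01, 1.0, 2.5}) {
      double f = (alpha - 1) * (alpha - 1);
      double d = 2 * (alpha - 1);
      System.out.println(alpha + " -> " + classify(1.0, -2.0, alpha, f, d));
    }
  }
}
```

A bracketing search would typically shrink the step after "too long" and enlarge it after "too short" until a step is accepted or the bracket collapses.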

Limited-Memory BFGS

Next, we run the same optimization with L-BFGS, a quasi-Newton method that is nearly ideal for purely second-order (quadratic) functions.

TrainingTester.java:509 executed in 0.04 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new ArmijoWolfeSearch());
      iterativeTrainer.setOrientation(new LBFGS());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setIterationsPerSample(100);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }
Logging
Reset training subject: 2421692966636
Reset training subject: 2421694084605
Adding measurement 38a79e20 to history. Total: 0
LBFGS Accumulation History: 1 points
Constructing line search parameters: GD
Non-optimal measurement 51.25945839960847 < 51.25945839960847. Total: 1
th(0)=51.25945839960847;dx=-1.992293120002386E24
Adding measurement 6b4822a9 to history. Total: 1
New Minimum: 51.25945839960847 > 0.0
Armijo: th(2.154434690031884)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(1.077217345015942)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(0.3590724483386473)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(0.08976811208466183)=0.0; dx=-2.3792243200021714E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(0.017953622416932366)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(0.002992270402822061)=0.0; dx=-2.3792243200021714E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(4.2746720040315154E-4)=0.0; dx=-2.3792243200021714E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(5.343340005039394E-5)=0.0; dx=-2.3792243200021714E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(5.9370444500437714E-6)=0.0; dx=-2.3792243200021714E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(5.937044450043771E-7)=0.0; dx=-2.3792243200021714E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(5.397313136403428E-8)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(4.4977609470028565E-9)=0.0; dx=-2.3792243200021714E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(3.4598161130791205E-10)=0.0; dx=-2.379224320002172E12 evalInputDelta=51.25945839960847
Non-optimal measurement 0.029826007595965275 < 0.0. Total: 2
Armijo: th(2.4712972236279432E-11)=0.029826007595965275; dx=-2.38615077936307E12 evalInputDelta=51.229632392012505
Non-optimal measurement 18.39207925235264 < 0.0. Total: 2
Armijo: th(1.6475314824186289E-12)=18.39207925235264; dx=-3.8786944000297005E23 evalInputDelta=32.86737914725583
Non-optimal measurement 51.15354135915187 < 0.0. Total: 2
Armijo: th(1.029707176511643E-13)=51.15354135915187; dx=-1.9922931200020965E24 evalInputDelta=0.10591704045659611
Non-optimal measurement 51.25231089644202 < 0.0. Total: 2
Armijo: th(6.057101038303783E-15)=51.25231089644202; dx=-1.9922931200023612E24 evalInputDelta=0.0071475031664505195
Non-optimal measurement 0.0 < 0.0. Total: 2
MIN ALPHA (3.3650561323909904E-16): th(2.154434690031884)=0.0
Fitness changed from 51.25945839960847 to 0.0
Iteration 1 complete. Error: 0.0 Total: 0.0344; Orientation: 0.0012; Line Search: 0.0302
Non-optimal measurement 0.0 < 0.0. Total: 2
LBFGS Accumulation History: 2 points
Non-optimal measurement 0.0 < 0.0. Total: 2
th(0)=0.0;dx=-3.027991040000001
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(2.154434690031884E-15)=0.0; dx=-3.027991040000001 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 2
Armijo: th(1.077217345015942E-15)=0.0; dx=-3.0279910400000007 evalInputDelta=0.0
Non-optimal measurement 0.0 < 0.0. Total: 2
MIN ALPHA (3.5907244833864734E-16): th(0.0)=0.0
Fitness changed from 0.0 to 0.0
Static Iteration Total: 0.0074; Orientation: 0.0006; Line Search: 0.0059
Iteration 2 failed. Error: 0.0
Previous Error: 0.0 -> 0.0
Optimization terminated 2
Final threshold in iteration 2: 0.0 (> 0.0) after 0.042s (< 30.000s)

Returns

    0.0

Training Converged
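
To make the L-BFGS orientation used above more concrete, here is a minimal sketch of the standard two-loop recursion, which turns a gradient and a short history of steps and gradient changes into a quasi-Newton search direction. The class `LbfgsSketch` and its methods are hypothetical names for this example; this is the textbook procedure, not the `LBFGS` class from the library, and it omits the history management and validity checks a real implementation would need.

```java
import java.util.ArrayList;
import java.util.List;

public class LbfgsSketch {
  static double dot(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
    return sum;
  }

  /**
   * Two-loop recursion: returns -H * grad, where H approximates the inverse Hessian.
   * s.get(i) is a position change and y.get(i) the matching gradient change, oldest first.
   * Assumes every pair satisfies y.s > 0 (the curvature condition).
   */
  static double[] direction(double[] grad, List<double[]> s, List<double[]> y) {
    int k = s.size();
    double[] q = grad.clone();
    double[] alpha = new double[k];
    double[] rho = new double[k];
    // First loop: newest pair to oldest.
    for (int i = k - 1; i >= 0; i--) {
      rho[i] = 1.0 / dot(y.get(i), s.get(i));
      alpha[i] = rho[i] * dot(s.get(i), q);
      for (int j = 0; j < q.length; j++) q[j] -= alpha[i] * y.get(i)[j];
    }
    // Scale by gamma = (s.y)/(y.y) of the newest pair; identity if there is no history.
    double gamma = k == 0 ? 1.0
        : dot(s.get(k - 1), y.get(k - 1)) / dot(y.get(k - 1), y.get(k - 1));
    for (int j = 0; j < q.length; j++) q[j] *= gamma;
    // Second loop: oldest pair to newest.
    for (int i = 0; i < k; i++) {
      double beta = rho[i] * dot(y.get(i), q);
      for (int j = 0; j < q.length; j++) q[j] += s.get(i)[j] * (alpha[i] - beta);
    }
    // Negate to obtain a descent direction.
    for (int j = 0; j < q.length; j++) q[j] = -q[j];
    return q;
  }

  public static void main(String[] args) {
    // 1-D quadratic f(x) = x^2 with gradient 2x; one recorded step from x = 3 to x = 2.
    List<double[]> s = new ArrayList<>();
    List<double[]> y = new ArrayList<>();
    s.add(new double[]{-1.0});  // 2 - 3
    y.add(new double[]{-2.0});  // gradient 4 - gradient 6
    double[] d = direction(new double[]{4.0}, s, y);  // gradient at x = 2
    System.out.println("direction = " + d[0]);
  }
}
```

On this quadratic, a single stored pair already captures the exact curvature, so a unit step along the returned direction moves from x = 2 to the minimizer at x = 0. With an empty history the sketch falls back to plain steepest descent.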

TrainingTester.java:432 executed in 0.01 seconds (0.000 gc):

    return TestUtil.compare(title + " vs Iteration", runs);
Logging
Plotting range=[0.0, 0.0], [2.0, 1.0]; valueStats=DoubleSummaryStatistics{count=0, sum=0.000000, min=Infinity, average=0.000000, max=-Infinity}
Only 0 points for GD
Only 0 points for CjGD
Only 0 points for LBFGS

Returns

Result

TrainingTester.java:435 executed in 0.00 seconds (0.000 gc):

    return TestUtil.compareTime(title + " vs Time", runs);
Logging
No Data

Results

TrainingTester.java:255 executed in 0.00 seconds (0.000 gc):

    return grid(inputLearning, modelLearning, completeLearning);

Returns

Result

TrainingTester.java:258 executed in 0.00 seconds (0.000 gc):

    return new ComponentResult(null == inputLearning ? null : inputLearning.value,
        null == modelLearning ? null : modelLearning.value, null == completeLearning ? null : completeLearning.value);

Returns

    {"input":{ "LBFGS": { "type": "Converged", "value": 0.0 }, "CjGD": { "type": "Converged", "value": 0.0 }, "GD": { "type": "Converged", "value": 0.0 } }, "model":{ "LBFGS": { "type": "Converged", "value": 0.0 }, "CjGD": { "type": "Converged", "value": 0.0 }, "GD": { "type": "Converged", "value": 0.0 } }, "complete":{ "LBFGS": { "type": "Converged", "value": 0.0 }, "CjGD": { "type": "Converged", "value": 0.0 }, "GD": { "type": "Converged", "value": 0.0 } }}

LayerTests.java:425 executed in 0.00 seconds (0.000 gc):

    throwException(exceptions.addRef());

Results

details: {"input":{ "LBFGS": { "type": "Converged", "value": 0.0 }, "CjGD": { "type": "Converged", "value": 0.0 }, "GD": { "type": "Converged", "value": 0.0 } }, "model":{ "LBFGS": { "type": "Converged", "value": 0.0 }, "CjGD": { "type": "Converged", "value": 0.0 }, "GD": { "type": "Converged", "value": 0.0 } }, "complete":{ "LBFGS": { "type": "Converged", "value": 0.0 }, "CjGD": { "type": "Converged", "value": 0.0 }, "GD": { "type": "Converged", "value": 0.0 } }}
result: OK
  {
    "result": "OK",
    "performance": {
      "execution_time": "4.290",
      "gc_time": "0.303"
    },
    "created_on": 1586737005812,
    "file_name": "trainingTest",
    "report": {
      "simpleName": "Basic",
      "canonicalName": "com.simiacryptus.mindseye.layers.java.ImgBandBiasLayerTest.Basic",
      "link": "https://github.com/SimiaCryptus/mindseye-java/tree/93db34cedee48c0202777a2b25deddf1dfaf5731/src/test/java/com/simiacryptus/mindseye/layers/java/ImgBandBiasLayerTest.java",
      "javaDoc": ""
    },
    "training_analysis": {
      "input": {
        "LBFGS": {
          "type": "Converged",
          "value": 0.0
        },
        "CjGD": {
          "type": "Converged",
          "value": 0.0
        },
        "GD": {
          "type": "Converged",
          "value": 0.0
        }
      },
      "model": {
        "LBFGS": {
          "type": "Converged",
          "value": 0.0
        },
        "CjGD": {
          "type": "Converged",
          "value": 0.0
        },
        "GD": {
          "type": "Converged",
          "value": 0.0
        }
      },
      "complete": {
        "LBFGS": {
          "type": "Converged",
          "value": 0.0
        },
        "CjGD": {
          "type": "Converged",
          "value": 0.0
        },
        "GD": {
          "type": "Converged",
          "value": 0.0
        }
      }
    },
    "archive": "s3://code.simiacrypt.us/tests/com/simiacryptus/mindseye/layers/java/ImgBandBiasLayer/Basic/trainingTest/202004131645",
    "id": "fd821e43-0572-4fe6-a893-e5189779d1f8",
    "report_type": "Components",
    "display_name": "Comparative Training",
    "target": {
      "simpleName": "ImgBandBiasLayer",
      "canonicalName": "com.simiacryptus.mindseye.layers.java.ImgBandBiasLayer",
      "link": "https://github.com/SimiaCryptus/mindseye-java/tree/93db34cedee48c0202777a2b25deddf1dfaf5731/src/main/java/com/simiacryptus/mindseye/layers/java/ImgBandBiasLayer.java",
      "javaDoc": ""
    }
  }