1. Test Modules
  2. Training Characteristics
    1. Input Learning
      1. Gradient Descent
      2. Conjugate Gradient Descent
      3. Limited-Memory BFGS
    2. Results
  3. Results

Subreport: Logs for com.simiacryptus.ref.lang.ReferenceCountingBase

Test Modules

Using Seed 1710867965846480896

Training Characteristics

Input Learning

In this test, we use a network to learn this target input, given its pre-evaluated output:

TrainingTester.java:332 executed in 0.00 seconds (0.000 gc):

    return RefArrays.stream(RefUtil.addRef(input_target)).flatMap(RefArrays::stream).map(x -> {
      try {
        return x.prettyPrint();
      } finally {
        x.freeRef();
      }
    }).reduce((a, b) -> a + "\n" + b).orElse("");

Returns

    [ 0.5435525508209025, -0.7709240538133144, -0.09338712503948288 ]
    [ -0.5366310009669699 ]
    [ -0.7018064525823429, -0.660450338649003, -0.055836324828218986 ]
    [ -0.4061420910640319 ]
    [ 5.820331589430072E-4, 0.7070450737768597, 0.9476513167141603 ]
    [ -0.5244943960604862 ]
    [ -0.567512086846498, 0.6144728226415976, -0.9991191782684057 ]
    [ -0.5465493913671364 ]
    [ -0.9246361825979077, -0.004788297111236073, -0.7249979969166396 ]
    [ -0.41974955594430163 ]
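
The pairs above are the pre-evaluated targets that the optimizer tries to reproduce by adjusting the inputs alone. As a minimal sketch of that input-learning idea (not the TrainingTester code), the snippet below runs gradient descent on a single input against a fixed second factor of a hypothetical element-wise product; the learning rate, iteration count, and all numeric values are illustrative assumptions.

    // Minimal sketch (illustrative, not TrainingTester code): adjust the input x by
    // gradient descent so that an element-wise product with a fixed second factor
    // matches a pre-evaluated target output. All numeric values are assumptions.
    public class InputLearningSketch {
      public static void main(String[] args) {
        double[] fixed = { -0.54, 0.61, -0.99 };   // second input, held constant
        double[] target = { 0.29, -0.47, 0.06 };   // pre-evaluated output for the true input
        double[] x = { 0.0, 0.0, 0.0 };            // input being learned
        double lr = 0.1;
        for (int iter = 0; iter < 1000; iter++) {
          for (int i = 0; i < x.length; i++) {
            double residual = x[i] * fixed[i] - target[i];  // prediction minus target
            x[i] -= lr * 2 * residual * fixed[i];           // gradient of the squared error
          }
        }
        System.out.println(java.util.Arrays.toString(x));   // approaches target / fixed
      }
    }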

Gradient Descent

First, we train using the basic gradient descent method with weak (Armijo-Wolfe) line search conditions.
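
A "weak" line search of this kind accepts any step that achieves sufficient decrease along the search direction rather than locating an exact minimum. A minimal backtracking sketch of that acceptance test is shown below; the constant c1, the shrink factor, and the toy objective are illustrative assumptions, not the internals of ArmijoWolfeSearch.

    import java.util.function.DoubleUnaryOperator;

    // Minimal sketch of a backtracking Armijo (sufficient-decrease) test. phi(t) is the
    // objective restricted to the search direction, phi(t) = f(x + t * d). The constant
    // c1, the shrink factor, and the toy objective are illustrative assumptions.
    public class ArmijoSketch {
      static double armijoStep(DoubleUnaryOperator phi, double phi0, double dPhi0, double t0) {
        double c1 = 1e-4;                 // sufficient-decrease constant
        double t = t0;
        while (phi.applyAsDouble(t) > phi0 + c1 * t * dPhi0) {
          t *= 0.5;                       // backtrack until the Armijo condition holds
          if (t < 1e-16) break;           // minimum step guard (cf. MIN ALPHA in the log below)
        }
        return t;
      }

      public static void main(String[] args) {
        double x = 3.0, d = -6.0;                               // f(x) = x^2, direction d = -f'(x)
        DoubleUnaryOperator phi = t -> Math.pow(x + t * d, 2);
        double phi0 = x * x, dPhi0 = 2 * x * d;                 // phi(0) and phi'(0)
        System.out.println("accepted step: " + armijoStep(phi, phi0, dPhi0, 1.0));
      }
    }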

TrainingTester.java:480 executed in 0.21 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new ArmijoWolfeSearch());
      iterativeTrainer.setOrientation(new GradientDescent());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }
Logging
Reset training subject: 4010381399526
BACKPROP_AGG_SIZE = 3
THREADS = 64
SINGLE_THREADED = false
Initialized CoreSettings = {
"backpropAggregationSize" : 3,
"jvmThreads" : 64,
"singleThreaded" : false
}
Reset training subject: 4010415515250
Constructing line search parameters: GD
th(0)=6.5685013957808795;dx=-4.936990177918584E21
Armijo: th(2.154434690031884)=11.16155713368608; dx=5.1992905282285415E32 evalInputDelta=-4.5930557379052
Armijo: th(1.077217345015942)=11.22311235600066; dx=2.5996452641019327E32 evalInputDelta=-4.654610960219781
Armijo: th(0.3590724483386473)=11.28059068329377; dx=8.665484213508602E31 evalInputDelta=-4.712089287512891
Armijo: th(0.08976811208466183)=11.309298392448284; dx=2.16637105319208E31 evalInputDelta=-4.740796996667404
Armijo: th(0.017953622416932366)=11.318113168070019; dx=4.3327421044100745E30 evalInputDelta=-4.749611772289139
Armijo: th(0.002992270402822061)=11.320029254811555; dx=7.221236820120066E29 evalInputDelta=-4.751527859030675
Armijo: th(4.2746720040315154E-4)=11.320360702932696; dx=1.0316052388662346E29 evalInputDelta=-4.7518593071518165
Armijo: th(5.343340005039394E-5)=11.320409113061732; dx=1.289506332667178E28 evalInputDelta=-4.751907717280853
Armijo: th(5.9370444500437714E-6)=11.320415261730142; dx=1.4327826206461678E27 evalInputDelta=-4.751913865949263
Armijo: th(5.937044450043771E-7)=11.32041595347439; dx=1.4327604121828624E26 evalInputDelta=-4.75191455769351
Armijo: th(5.397313136403428E-8)=11.32041602334776; dx=1.3022851377086108E25 evalInputDelta=-4.751914627566881
Armijo: th(4.4977609470028565E-9)=11.043473475008433; dx=1.0829078757548234E24 evalInputDelta=-4.474972079227554
Armijo: th(3.4598161130791205E-10)=11.04347347554592; dx=8.112439946838643E22 evalInputDelta=-4.47497207976504
New Minimum: 6.5685013957808795 > 5.992982281455314
Armijo: th(2.4712972236279432E-11)=5.992982281455314; dx=2.469383144203904E21 evalInputDelta=0.5755191143255658
New Minimum: 5.992982281455314 > 4.601755366049961
Armijo: th(1.6475314824186289E-12)=4.601755366049961; dx=-4.253781981156338E21 evalInputDelta=1.9667460297309187
Armijo: th(1.029707176511643E-13)=6.570731379721853; dx=-4.912140286694593E21 evalInputDelta=-0.0022299839409738453
Armijo: th(6.057101038303783E-15)=6.5686318261991685; dx=-4.935528419611291E21 evalInputDelta=-1.3043041828897373E-4
MIN ALPHA (3.3650561323909904E-16): th(1.6475314824186289E-12)=4.601755366049961
Fitness changed from 6.5685013957808795 to 4.601755366049961
Iteration 1 complete. Error: 4.601755366049961 Total: 0.1770; Orientation: 0.0055; Line Search: 0.1193
th(0)=4.601755366049961;dx=-3.908446413433135E21
Armijo: th(2.154434690031884E-15)=4.601809753341172; dx=-3.9080520025860263E21 evalInputDelta=-5.4387291211099864E-5
Armijo: th(1.077217345015942E-15)=4.601782557667228; dx=-3.9082492080095803E21 evalInputDelta=-2.7191617267163792E-5
MIN ALPHA (3.5907244833864734E-16): th(0.0)=4.601755366049961
Fitness changed from 4.601755366049961 to 4.601755366049961
Static Iteration Total: 0.0202; Orientation: 0.0012; Line Search: 0.0153
Iteration 2 failed. Error: 4.601755366049961
Previous Error: 0.0 -> 4.601755366049961
Optimization terminated 2
Final threshold in iteration 2: 4.601755366049961 (> 0.0) after 0.198s (< 30.000s)

Returns

    4.601755366049961

This training run resulted in the following configuration:

TrainingTester.java:610 executed in 0.00 seconds (0.000 gc):

    RefList<double[]> state = network.state();
    assert state != null;
    String description = state.stream().map(RefArrays::toString).reduce((a, b) -> a + "\n" + b)
        .orElse("");
    state.freeRef();
    return description;

Returns

    

And regressed input:

TrainingTester.java:622 executed in 0.00 seconds (0.000 gc):

    return RefArrays.stream(RefUtil.addRef(data)).flatMap(x -> {
      return RefArrays.stream(x);
    }).limit(1).map(x -> {
      String temp_18_0015 = x.prettyPrint();
      x.freeRef();
      return temp_18_0015;
    }).reduce((a, b) -> a + "\n" + b).orElse("");

Returns

    [ -0.29019357302772786, -0.12574522839753535, -0.6454004850583094 ]

To produce the following output:

TrainingTester.java:633 executed in 0.00 seconds (0.000 gc):

    Result[] array = ConstantResult.batchResultArray(pop(RefUtil.addRef(data)));
    @Nullable
    Result eval = layer.eval(array);
    assert eval != null;
    TensorList tensorList = Result.getData(eval);
    String temp_18_0016 = tensorList.stream().limit(1).map(x -> {
      String temp_18_0017 = x.prettyPrint();
      x.freeRef();
      return temp_18_0017;
    }).reduce((a, b) -> a + "\n" + b).orElse("");
    tensorList.freeRef();
    return temp_18_0016;

Returns

    [ 0.004772495976636522, 0.0020679940990668906, 0.010614195159882926 ]

Conjugate Gradient Descent

Next, we use a conjugate gradient descent method, which converges fastest on quadratic objectives (equivalently, linear systems).
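
For intuition about that convergence claim, the sketch below runs conjugate gradient on a small hand-picked symmetric positive-definite quadratic; the 2x2 system is an illustrative assumption, not data from this test.

    // Minimal conjugate gradient sketch on a 2x2 symmetric positive-definite quadratic,
    // i.e. minimizing f(x) = 1/2 x^T A x - b^T x (equivalently solving A x = b). For an
    // n-dimensional quadratic, CG reaches the exact minimizer in at most n iterations.
    public class ConjugateGradientSketch {
      public static void main(String[] args) {
        double[][] A = { { 4, 1 }, { 1, 3 } };
        double[] b = { 1, 2 };
        double[] x = { 0, 0 };
        double[] r = { b[0], b[1] };      // residual r = b - A x (x starts at 0)
        double[] p = { r[0], r[1] };      // first search direction is the residual
        for (int k = 0; k < 2; k++) {     // at most n = 2 steps
          double[] Ap = { A[0][0] * p[0] + A[0][1] * p[1], A[1][0] * p[0] + A[1][1] * p[1] };
          double rr = r[0] * r[0] + r[1] * r[1];
          double alpha = rr / (p[0] * Ap[0] + p[1] * Ap[1]);   // exact step along p
          x[0] += alpha * p[0]; x[1] += alpha * p[1];
          r[0] -= alpha * Ap[0]; r[1] -= alpha * Ap[1];
          double beta = (r[0] * r[0] + r[1] * r[1]) / rr;      // conjugacy correction
          p[0] = r[0] + beta * p[0]; p[1] = r[1] + beta * p[1];
        }
        System.out.println(java.util.Arrays.toString(x));      // ~[0.0909, 0.6364]
      }
    }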

TrainingTester.java:452 executed in 0.03 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new QuadraticSearch());
      iterativeTrainer.setOrientation(new GradientDescent());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }
Logging
Reset training subject: 4010594378006
Reset training subject: 4010598450530
Constructing line search parameters: GD
F(0.0) = LineSearchPoint{point=PointSample{avg=6.5685013957808795}, derivative=-4.936990177918584E21}
F(1.0E-10) = LineSearchPoint{point=PointSample{avg=8.999785429977447}, derivative=2.168210596421549E22}, evalInputDelta = 2.431284034196567
8.999785429977447 <= 6.5685013957808795
Converged to right
Fitness changed from 6.5685013957808795 to 6.5685013957808795
Static Iteration Total: 0.0321; Orientation: 0.0012; Line Search: 0.0168
Iteration 1 failed. Error: 6.5685013957808795
Previous Error: 0.0 -> 6.5685013957808795
Optimization terminated 1
Final threshold in iteration 1: 6.5685013957808795 (> 0.0) after 0.032s (< 30.000s)

Returns

    6.5685013957808795

This training run resulted in the following configuration:

TrainingTester.java:610 executed in 0.00 seconds (0.000 gc):

    RefList<double[]> state = network.state();
    assert state != null;
    String description = state.stream().map(RefArrays::toString).reduce((a, b) -> a + "\n" + b)
        .orElse("");
    state.freeRef();
    return description;

Returns

    

And regressed input:

TrainingTester.java:622 executed in 0.00 seconds (0.000 gc):

    return RefArrays.stream(RefUtil.addRef(data)).flatMap(x -> {
      return RefArrays.stream(x);
    }).limit(1).map(x -> {
      String temp_18_0015 = x.prettyPrint();
      x.freeRef();
      return temp_18_0015;
    }).reduce((a, b) -> a + "\n" + b).orElse("");

Returns

    [ -0.29019357302772786, -0.12732256060281388, -0.645591557731622 ]

To produce the following output:

TrainingTester.java:633 executed in 0.00 seconds (0.000 gc):

    Result[] array = ConstantResult.batchResultArray(pop(RefUtil.addRef(data)));
    @Nullable
    Result eval = layer.eval(array);
    assert eval != null;
    TensorList tensorList = Result.getData(eval);
    String temp_18_0016 = tensorList.stream().limit(1).map(x -> {
      String temp_18_0017 = x.prettyPrint();
      x.freeRef();
      return temp_18_0017;
    }).reduce((a, b) -> a + "\n" + b).orElse("");
    tensorList.freeRef();
    return temp_18_0016;

Returns

    [ -0.003357839147679532, -0.0014732534353339313, -0.007470160635692368 ]

Limited-Memory BFGS

Finally, we apply the same optimization using L-BFGS, which is nearly ideal for smooth, approximately quadratic objectives.
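
A minimal sketch of the two-loop recursion at the heart of L-BFGS is shown below. It illustrates the general algorithm, not the LBFGS class used in this test, and the history contents and toy gradient are illustrative assumptions.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Minimal sketch of the L-BFGS two-loop recursion: a short history of pairs
    // (s = x_{k+1} - x_k, y = g_{k+1} - g_k) implicitly represents an inverse-Hessian
    // estimate, which is applied to the current gradient to get a search direction.
    public class LbfgsSketch {
      static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
      }

      // Returns d ~= -H_k g, where H_k is the implicit inverse-Hessian approximation.
      static double[] direction(double[] g, Deque<double[][]> history) {
        double[][][] pairs = history.toArray(new double[0][][]);   // oldest first
        double[] q = g.clone();
        double[] alpha = new double[pairs.length];
        for (int i = pairs.length - 1; i >= 0; i--) {               // first loop: newest to oldest
          double[] s = pairs[i][0], y = pairs[i][1];
          alpha[i] = dot(s, q) / dot(y, s);
          for (int j = 0; j < q.length; j++) q[j] -= alpha[i] * y[j];
        }
        if (pairs.length > 0) {                                     // initial scaling H0 = gamma * I
          double[] s = pairs[pairs.length - 1][0], y = pairs[pairs.length - 1][1];
          double gamma = dot(s, y) / dot(y, y);
          for (int j = 0; j < q.length; j++) q[j] *= gamma;
        }
        for (int i = 0; i < pairs.length; i++) {                    // second loop: oldest to newest
          double[] s = pairs[i][0], y = pairs[i][1];
          double beta = dot(y, q) / dot(y, s);
          for (int j = 0; j < q.length; j++) q[j] += (alpha[i] - beta) * s[j];
        }
        for (int j = 0; j < q.length; j++) q[j] = -q[j];            // negate to get a descent direction
        return q;
      }

      public static void main(String[] args) {
        Deque<double[][]> history = new ArrayDeque<>();
        history.addLast(new double[][] { { 0.5, -0.2 }, { 1.0, 0.3 } });  // one (s, y) pair
        double[] g = { 0.8, -0.1 };                                        // current gradient
        System.out.println(java.util.Arrays.toString(direction(g, history)));
      }
    }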

TrainingTester.java:509 executed in 0.10 seconds (0.000 gc):

    IterativeTrainer iterativeTrainer = new IterativeTrainer(trainable.addRef());
    try {
      iterativeTrainer.setLineSearchFactory(label -> new ArmijoWolfeSearch());
      iterativeTrainer.setOrientation(new LBFGS());
      iterativeTrainer.setMonitor(TrainingTester.getMonitor(history));
      iterativeTrainer.setTimeout(30, TimeUnit.SECONDS);
      iterativeTrainer.setIterationsPerSample(100);
      iterativeTrainer.setMaxIterations(250);
      iterativeTrainer.setTerminateThreshold(0);
      return iterativeTrainer.run();
    } finally {
      iterativeTrainer.freeRef();
    }
Logging
Reset training subject: 4010635951626
Reset training subject: 4010638510902
Adding measurement 1ffe5c68 to history. Total: 0
LBFGS Accumulation History: 1 points
Constructing line search parameters: GD
Non-optimal measurement 6.5685013957808795 < 6.5685013957808795. Total: 1
th(0)=6.5685013957808795;dx=-4.936990177918584E21
Non-optimal measurement 11.16155713368608 < 6.5685013957808795. Total: 1
Armijo: th(2.154434690031884)=11.16155713368608; dx=5.1992905282285415E32 evalInputDelta=-4.5930557379052
Non-optimal measurement 11.22311235600066 < 6.5685013957808795. Total: 1
Armijo: th(1.077217345015942)=11.22311235600066; dx=2.5996452641019327E32 evalInputDelta=-4.654610960219781
Non-optimal measurement 11.28059068329377 < 6.5685013957808795. Total: 1
Armijo: th(0.3590724483386473)=11.28059068329377; dx=8.665484213508602E31 evalInputDelta=-4.712089287512891
Non-optimal measurement 11.309298392448284 < 6.5685013957808795. Total: 1
Armijo: th(0.08976811208466183)=11.309298392448284; dx=2.16637105319208E31 evalInputDelta=-4.740796996667404
Non-optimal measurement 11.318113168070019 < 6.5685013957808795. Total: 1
Armijo: th(0.017953622416932366)=11.318113168070019; dx=4.3327421044100745E30 evalInputDelta=-4.749611772289139
Non-optimal measurement 11.320029254811555 < 6.5685013957808795. Total: 1
Armijo: th(0.002992270402822061)=11.320029254811555; dx=7.221236820120066E29 evalInputDelta=-4.751527859030675
Non-optimal measurement 11.320360702932696 < 6.5685013957808795. Total: 1
Armijo: th(4.2746720040315154E-4)=11.320360702932696; dx=1.0316052388662346E29 evalInputDelta=-4.7518593071518165
Non-optimal measurement 11.320409113061732 < 6.5685013957808795. Total: 1
Armijo: th(5.343340005039394E-5)=11.320409113061732; dx=1.289506332667178E28 evalInputDelta=-4.751907717280853
Non-optimal measurement 11.320415261730142 < 6.5685013957808795. Total: 1
Armijo: th(5.9370444500437714E-6)=11.320415261730142; dx=1.4327826206461678E27 evalInputDelta=-4.751913865949263
Non-optimal measurement 11.32041595347439 < 6.5685013957808795. Total: 1
Armijo: th(5.937044450043771E-7)=11.32041595347439; dx=1.4327604121828624E26 evalInputDelta=-4.75191455769351
Non-optimal measurement 11.32041602334776 < 6.5685013957808795. Total: 1
Armijo: th(5.397313136403428E-8)=11.32041602334776; dx=1.3022851377086108E25 evalInputDelta=-4.751914627566881
Non-optimal measurement 11.043473475008433 < 6.5685013957808795. Total: 1
Armijo: th(4.4977609470028565E-9)=11.043473475008433; dx=1.0829078757548234E24 evalInputDelta=-4.474972079227554
Non-optimal measurement 11.04347347554592 < 6.5685013957808795. Total: 1
Armijo: th(3.4598161130791205E-10)=11.04347347554592; dx=8.112439946838643E22 evalInputDelta=-4.47497207976504
Adding measurement 56d015c1 to history. Total: 1
New Minimum: 6.5685013957808795 > 5.992982281455314
Armijo: th(2.4712972236279432E-11)=5.992982281455314; dx=2.469383144203904E21 evalInputDelta=0.5755191143255658
Adding measurement 291aaba7 to history. Total: 2
New Minimum: 5.992982281455314 > 4.601755366049961
Armijo: th(1.6475314824186289E-12)=4.601755366049961; dx=-4.2537819811563375E21 evalInputDelta=1.9667460297309187
Non-optimal measurement 6.570731379721853 < 4.601755366049961. Total: 3
Armijo: th(1.029707176511643E-13)=6.570731379721853; dx=-4.912140286694593E21 evalInputDelta=-0.0022299839409738453
Non-optimal measurement 6.5686318261991685 < 4.601755366049961. Total: 3
Armijo: th(6.057101038303783E-15)=6.5686318261991685; dx=-4.935528419611291E21 evalInputDelta=-1.3043041828897373E-4
Non-optimal measurement 4.601755366049961 < 4.601755366049961. Total: 3
MIN ALPHA (3.3650561323909904E-16): th(1.6475314824186289E-12)=4.601755366049961
Fitness changed from 6.5685013957808795 to 4.601755366049961
Iteration 1 complete. Error: 4.601755366049961 Total: 0.0870; Orientation: 0.0040; Line Search: 0.0759
Non-optimal measurement 4.601755366049961 < 4.601755366049961. Total: 3
LBFGS Accumulation History: 3 points
Non-optimal measurement 4.601755366049961 < 4.601755366049961. Total: 3
th(0)=4.601755366049961;dx=-3.908446413433135E21
Non-optimal measurement 4.601809753341172 < 4.601755366049961. Total: 3
Armijo: th(2.154434690031884E-15)=4.601809753341172; dx=-3.9080520025860263E21 evalInputDelta=-5.4387291211099864E-5
Non-optimal measurement 4.601782557667228 < 4.601755366049961. Total: 3
Armijo: th(1.077217345015942E-15)=4.601782557667228; dx=-3.9082492080095803E21 evalInputDelta=-2.7191617267163792E-5
Non-optimal measurement 4.601755366049961 < 4.601755366049961. Total: 3
MIN ALPHA (3.5907244833864734E-16): th(0.0)=4.601755366049961
Fitness changed from 4.601755366049961 to 4.601755366049961
Static Iteration Total: 0.0118; Orientation: 0.0010; Line Search: 0.0088
Iteration 2 failed. Error: 4.601755366049961
Previous Error: 0.0 -> 4.601755366049961
Optimization terminated 2
Final threshold in iteration 2: 4.601755366049961 (> 0.0) after 0.099s (< 30.000s)

Returns

    4.601755366049961

This training run resulted in the following configuration:

TrainingTester.java:610 executed in 0.00 seconds (0.000 gc):

    RefList<double[]> state = network.state();
    assert state != null;
    String description = state.stream().map(RefArrays::toString).reduce((a, b) -> a + "\n" + b)
        .orElse("");
    state.freeRef();
    return description;

Returns

    

And regressed input:

TrainingTester.java:622 executed in 0.00 seconds (0.000 gc):

    return RefArrays.stream(RefUtil.addRef(data)).flatMap(x -> {
      return RefArrays.stream(x);
    }).limit(1).map(x -> {
      String temp_18_0015 = x.prettyPrint();
      x.freeRef();
      return temp_18_0015;
    }).reduce((a, b) -> a + "\n" + b).orElse("");

Returns

    [ -0.29019357302772786, -0.12574522839753535, -0.6454004850583094 ]

To produce the following output:

TrainingTester.java:633 executed in 0.00 seconds (0.000 gc):

    Result[] array = ConstantResult.batchResultArray(pop(RefUtil.addRef(data)));
    @Nullable
    Result eval = layer.eval(array);
    assert eval != null;
    TensorList tensorList = Result.getData(eval);
    String temp_18_0016 = tensorList.stream().limit(1).map(x -> {
      String temp_18_0017 = x.prettyPrint();
      x.freeRef();
      return temp_18_0017;
    }).reduce((a, b) -> a + "\n" + b).orElse("");
    tensorList.freeRef();
    return temp_18_0016;

Returns

    [ 0.004772495976636522, 0.0020679940990668906, 0.010614195159882926 ]

TrainingTester.java:432 executed in 0.09 seconds (0.000 gc):

    return TestUtil.compare(title + " vs Iteration", runs);
Logging
Plotting range=[0.0, -0.3370764725857881], [2.0, 1.662923527414212]; valueStats=DoubleSummaryStatistics{count=2, sum=9.203511, min=4.601755, average=4.601755, max=4.601755}
Only 1 points for GD
Only 1 points for LBFGS

Returns

Result

TrainingTester.java:435 executed in 0.01 seconds (0.000 gc):

    return TestUtil.compareTime(title + " vs Time", runs);
Logging
Plotting range=[-1.0, -0.3370764725857881], [1.0, 1.662923527414212]; valueStats=DoubleSummaryStatistics{count=2, sum=9.203511, min=4.601755, average=4.601755, max=4.601755}
Only 1 points for GD
Only 0 points for LBFGS

Returns

Result

Results

TrainingTester.java:255 executed in 0.00 seconds (0.000 gc):

    return grid(inputLearning, modelLearning, completeLearning);

Returns

Result

TrainingTester.java:258 executed in 0.00 seconds (0.000 gc):

    return new ComponentResult(null == inputLearning ? null : inputLearning.value,
        null == modelLearning ? null : modelLearning.value, null == completeLearning ? null : completeLearning.value);

Returns

    {"input":{ "LBFGS": { "type": "NonConverged", "value": 4.601755366049961 }, "CjGD": { "type": "NonConverged", "value": NaN }, "GD": { "type": "NonConverged", "value": 4.601755366049961 } }, "model":null, "complete":null}

LayerTests.java:425 executed in 0.00 seconds (0.000 gc):

    throwException(exceptions.addRef());

Results

details: {"input":{ "LBFGS": { "type": "NonConverged", "value": 4.601755366049961 }, "CjGD": { "type": "NonConverged", "value": NaN }, "GD": { "type": "NonConverged", "value": 4.601755366049961 } }, "model":null, "complete":null}
result: OK
  {
    "result": "OK",
    "performance": {
      "execution_time": "0.867",
      "gc_time": "0.174"
    },
    "created_on": 1586738598189,
    "file_name": "trainingTest",
    "report": {
      "simpleName": "N1Test",
      "canonicalName": "com.simiacryptus.mindseye.layers.java.ProductInputsLayerTest.N1Test",
      "link": "https://github.com/SimiaCryptus/mindseye-java/tree/93db34cedee48c0202777a2b25deddf1dfaf5731/src/test/java/com/simiacryptus/mindseye/layers/java/ProductInputsLayerTest.java",
      "javaDoc": ""
    },
    "training_analysis": {
      "input": {
        "LBFGS": {
          "type": "NonConverged",
          "value": 4.601755366049961
        },
        "CjGD": {
          "type": "NonConverged",
          "value": "NaN"
        },
        "GD": {
          "type": "NonConverged",
          "value": 4.601755366049961
        }
      }
    },
    "archive": "s3://code.simiacrypt.us/tests/com/simiacryptus/mindseye/layers/java/ProductInputsLayer/N1Test/trainingTest/202004134318",
    "id": "1b4778cf-303b-4382-8b74-8963caf0a074",
    "report_type": "Components",
    "display_name": "Comparative Training",
    "target": {
      "simpleName": "ProductInputsLayer",
      "canonicalName": "com.simiacryptus.mindseye.layers.java.ProductInputsLayer",
      "link": "https://github.com/SimiaCryptus/mindseye-java/tree/93db34cedee48c0202777a2b25deddf1dfaf5731/src/main/java/com/simiacryptus/mindseye/layers/java/ProductInputsLayer.java",
      "javaDoc": ""
    }
  }