Author: Smerity and others
Date created: 2015/08/17
Last modified: 2024/02/13
Description: A model that learns to add strings of numbers, e.g. "535+61" -> "596".
In this example, we train a model to learn to add two numbers, provided as strings.
Example: "535+61" -> "596"
The input may optionally be reversed, which was shown to improve performance on many tasks in Learning to Execute and Sequence to Sequence Learning with Neural Networks.
In theory, reversing the input sequence introduces shorter-term dependencies between source and target for this problem.
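To make the reversal concrete, here is a minimal sketch of the pad-then-reverse step on its own. The helper name `pad_and_reverse` is hypothetical and not part of the example; it only assumes the same convention used later on, where the query is right-padded with spaces to a fixed length before being reversed.

```python
def pad_and_reverse(a: int, b: int, maxlen: int) -> str:
    """Format 'a+b', right-pad with spaces to maxlen, then reverse the string."""
    query = f"{a}+{b}".ljust(maxlen)
    return query[::-1]


# With DIGITS = 3 the maximum query length is 3 + 1 + 3 = 7.
padded = pad_and_reverse(12, 345, maxlen=7)
print(repr(padded))  # ' 543+21'
```

Note that the padding ends up at the front of the reversed string, so the digits always sit at the end of the input sequence.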
Results:
For two digits (reversed):
Three digits (reversed):
Four digits (reversed):
Five digits (reversed):
import keras
from keras import layers
import numpy as np
# Parameters for the model and dataset.
TRAINING_SIZE = 50000
DIGITS = 3
REVERSE = True
# Maximum length of input is 'int + int' (e.g., '345+678'). Maximum length of
# int is DIGITS.
MAXLEN = DIGITS + 1 + DIGITS
class CharacterTable:
    """Given a set of characters:
    + Encode them to a one-hot integer representation
    + Decode the one-hot or integer representation to their character output
    + Decode a vector of probabilities to their character output
    """

    def __init__(self, chars):
        """Initialize character table.

        # Arguments
            chars: Characters that can appear in the input.
        """
        self.chars = sorted(set(chars))
        self.char_indices = dict((c, i) for i, c in enumerate(self.chars))
        self.indices_char = dict((i, c) for i, c in enumerate(self.chars))

    def encode(self, C, num_rows):
        """One-hot encode given string C.

        # Arguments
            C: string, to be encoded.
            num_rows: Number of rows in the returned one-hot encoding. This is
                used to keep the # of rows for each data the same.
        """
        x = np.zeros((num_rows, len(self.chars)))
        for i, c in enumerate(C):
            x[i, self.char_indices[c]] = 1
        return x

    def decode(self, x, calc_argmax=True):
        """Decode the given vector or 2D array to their character output.

        # Arguments
            x: A vector or a 2D array of probabilities or one-hot representations;
                or a vector of character indices (used with `calc_argmax=False`).
            calc_argmax: Whether to find the character index with maximum
                probability, defaults to `True`.
        """
        if calc_argmax:
            x = x.argmax(axis=-1)
        return "".join(self.indices_char[x] for x in x)
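As a rough illustration of what the encode and decode steps do, the sketch below performs the same one-hot round trip with plain functions. The names `one_hot`, `decode_one_hot`, `char_to_index`, and `index_to_char` are hypothetical stand-ins chosen for this sketch, not part of the example's API.

```python
import numpy as np

# Same alphabet as the example: digits, plus sign, and space for padding.
chars = sorted(set("0123456789+ "))
char_to_index = {c: i for i, c in enumerate(chars)}
index_to_char = {i: c for i, c in enumerate(chars)}


def one_hot(text, num_rows):
    """Return a (num_rows, len(chars)) array with one 1 per encoded character."""
    x = np.zeros((num_rows, len(chars)))
    for i, c in enumerate(text):
        x[i, char_to_index[c]] = 1
    return x


def decode_one_hot(x):
    """Map each row back to the character with the largest value."""
    return "".join(index_to_char[i] for i in x.argmax(axis=-1))


encoded = one_hot("12+345", num_rows=7)
print(encoded.shape)                   # (7, 12)
print(repr(decode_one_hot(encoded)))   # '12+345 '
```

The last row is all zeros because the string has only six characters. Its argmax is index 0, and since the space character sorts before `+` and the digits, that row happens to decode to a space.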
# All the numbers, plus sign and space for padding.
chars = "0123456789+ "
ctable = CharacterTable(chars)
questions = []
expected = []
seen = set()
print("Generating data...")
while len(questions) < TRAINING_SIZE:
    f = lambda: int(
        "".join(
            np.random.choice(list("0123456789"))
            for i in range(np.random.randint(1, DIGITS + 1))
        )
    )
    a, b = f(), f()
    # Skip any addition questions we've already seen.
    # Also skip any such that x+y == y+x (hence the sorting).
    key = tuple(sorted((a, b)))
    if key in seen:
        continue
    seen.add(key)
    # Pad the data with spaces such that it is always MAXLEN.
    q = "{}+{}".format(a, b)
    query = q + " " * (MAXLEN - len(q))
    ans = str(a + b)
    # Answers can be of maximum size DIGITS + 1.
    ans += " " * (DIGITS + 1 - len(ans))
    if REVERSE:
        # Reverse the query, e.g., '12+345 ' becomes ' 543+21'. (Note the
        # space used for padding.)
        query = query[::-1]
    questions.append(query)
    expected.append(ans)
print("Total questions:", len(questions))
Generating data...
Total questions: 50000
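Because each question is deduplicated on its sorted operand pair, the pool of possible questions is finite. The sketch below counts that pool, assuming (as the generator above implies) that each operand can be any integer from 0 to 10^DIGITS - 1. The variable names are illustrative only.

```python
DIGITS = 3
TRAINING_SIZE = 50000

# Each operand is built from 1 to DIGITS random digits, so it lies in 0..999.
distinct_values = 10**DIGITS

# Unordered pairs (a, b) with a <= b, including the case a == b.
unordered_pairs = distinct_values * (distinct_values + 1) // 2
fraction = TRAINING_SIZE / unordered_pairs

print(unordered_pairs)      # 500500
print(f"{fraction:.1%}")    # 10.0%
```

Under these assumptions the 50,000 generated questions cover roughly a tenth of all distinct unordered sums, so the model has to generalize to most pairs rather than memorize them.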
print("Vectorization...")
x = np.zeros((len(questions), MAXLEN, len(chars)), dtype=bool)
y = np.zeros((len(questions), DIGITS + 1, len(chars)), dtype=bool)
for i, sentence in enumerate(questions):
    x[i] = ctable.encode(sentence, MAXLEN)
for i, sentence in enumerate(expected):
    y[i] = ctable.encode(sentence, DIGITS + 1)
# Shuffle (x, y) in unison as the later parts of x will almost all be larger
# digits.
indices = np.arange(len(y))
np.random.shuffle(indices)
x = x[indices]
y = y[indices]
# Explicitly set apart 10% for validation data that we never train over.
split_at = len(x) - len(x) // 10
(x_train, x_val) = x[:split_at], x[split_at:]
(y_train, y_val) = y[:split_at], y[split_at:]
print("Training Data:")
print(x_train.shape)
print(y_train.shape)
print("Validation Data:")
print(x_val.shape)
print(y_val.shape)
Vectorization...
Training Data:
(45000, 7, 12)
(45000, 4, 12)
Validation Data:
(5000, 7, 12)
(5000, 4, 12)
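The shuffle-and-split step can be checked on toy data. The following sketch uses small stand-in arrays (not the real dataset) to show that indexing both arrays with one shuffled index vector keeps every input aligned with its target, and that the same split formula reserves one tenth of the rows for validation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: each target is the sum of its input row, so alignment
# between x and y can be verified after shuffling.
x = np.arange(20).reshape(10, 2)
y = x.sum(axis=1)

# Shuffle one index vector and apply it to both arrays.
indices = np.arange(len(y))
rng.shuffle(indices)
x, y = x[indices], y[indices]
assert np.array_equal(x.sum(axis=1), y)  # pairs are still aligned

# Hold out the last 10% of rows for validation.
split_at = len(x) - len(x) // 10
x_train, x_val = x[:split_at], x[split_at:]
y_train, y_val = y[:split_at], y[split_at:]
print(x_train.shape, x_val.shape)  # (9, 2) (1, 2)
```

With 50,000 questions the same formula gives `split_at = 50000 - 5000 = 45000`, which matches the shapes printed above.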
print("Build model...")
num_layers = 1  # Try to add more LSTM layers!
model = keras.Sequential()
# "Encode" the input sequence using an LSTM, producing an output of size 128.
# Note: If your input sequences have a variable length, use
# layers.Input(shape=(None, num_features)) instead.
model.add(layers.Input((MAXLEN, len(chars))))
model.add(layers.LSTM(128))
# As the decoder RNN's input, repeat the encoder's last output once for each
# output time step. Repeat 'DIGITS + 1' times as that's the maximum
# length of output, e.g., when DIGITS=3, max output is 999+999=1998.
model.add(layers.RepeatVector(DIGITS + 1))
# The decoder RNN could be multiple layers stacked or a single layer.
for _ in range(num_layers):
    # By setting return_sequences to True, return not only the last output but
    # all the outputs so far in the form of (num_samples, timesteps,
    # output_dim). This is necessary because the Dense layer below must
    # produce one prediction per time step.
    model.add(layers.LSTM(128, return_sequences=True))
# Apply a dense layer to every temporal slice of the input. For each step
# of the output sequence, decide which character should be chosen.
model.add(layers.Dense(len(chars), activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()
Build model...
Model: "sequential"
+------------------------------+----------------+---------+
| Layer (type)                 | Output Shape   | Param # |
+------------------------------+----------------+---------+
| lstm (LSTM)                  | (None, 128)    |  72,192 |
| repeat_vector (RepeatVector) | (None, 4, 128) |       0 |
| lstm_1 (LSTM)                | (None, 4, 128) | 131,584 |
| dense (Dense)                | (None, 4, 12)  |   1,548 |
+------------------------------+----------------+---------+
Total params: 205,324 (802.05 KB)
Trainable params: 205,324 (802.05 KB)
Non-trainable params: 0 (0.00 B)
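The parameter counts in the summary can be reproduced by hand. The sketch below uses the standard LSTM count (four gates, each with an input kernel, a recurrent kernel, and a bias) and the usual dense-layer count. The helper names `lstm_params` and `dense_params` are hypothetical, and the size estimate assumes 4-byte float32 weights.

```python
def lstm_params(input_dim, units):
    """4 gates, each with an input kernel, a recurrent kernel, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Weight matrix plus one bias per output unit."""
    return input_dim * units + units


encoder = lstm_params(input_dim=12, units=128)    # 12 one-hot character classes
decoder = lstm_params(input_dim=128, units=128)   # fed by the repeated encoding
head = dense_params(input_dim=128, units=12)
total = encoder + decoder + head

print(encoder, decoder, head, total)   # 72192 131584 1548 205324
print(round(total * 4 / 1024, 2))      # 802.05
```

`RepeatVector` only copies its input, which is why it contributes no parameters.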
# Training parameters.
epochs = 30
batch_size = 32
# Formatting characters for results display.
green_color = "\033[92m"
red_color = "\033[91m"
end_char = "\033[0m"
# Train the model one epoch at a time and show predictions against the
# validation dataset after each epoch.
# Note: range(1, epochs) runs epochs - 1 iterations (29 when epochs = 30).
for epoch in range(1, epochs):
    print()
    print("Iteration", epoch)
    model.fit(
        x_train,
        y_train,
        batch_size=batch_size,
        epochs=1,
        validation_data=(x_val, y_val),
    )
    # Select 10 samples from the validation set at random so we can visualize
    # errors.
    for i in range(10):
        ind = np.random.randint(0, len(x_val))
        rowx, rowy = x_val[np.array([ind])], y_val[np.array([ind])]
        preds = np.argmax(model.predict(rowx, verbose=0), axis=-1)
        q = ctable.decode(rowx[0])
        correct = ctable.decode(rowy[0])
        guess = ctable.decode(preds[0], calc_argmax=False)
        print("Q", q[::-1] if REVERSE else q, end=" ")
        print("T", correct, end=" ")
        if correct == guess:
            print(f"{green_color}☑ {guess}{end_char}")
        else:
            print(f"{red_color}☒ {guess}{end_char}")
Iteration 1
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 10s 6ms/step - accuracy: 0.3258 - loss: 1.8801 - val_accuracy: 0.4268 - val_loss: 1.5506
Q 499+58 T 557 ☒ 511
Q 51+638 T 689 ☒ 662
Q 87+12 T 99 ☒ 11
Q 259+55 T 314 ☒ 561
Q 704+87 T 791 ☒ 811
Q 988+67 T 1055 ☒ 101
Q 94+116 T 210 ☒ 111
Q 724+4 T 728 ☒ 777
Q 8+673 T 681 ☒ 772
Q 8+991 T 999 ☒ 900
Iteration 2
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.4688 - loss: 1.4235 - val_accuracy: 0.5846 - val_loss: 1.1293
Q 379+6 T 385 ☒ 387
Q 15+504 T 519 ☒ 525
Q 552+299 T 851 ☒ 727
Q 664+0 T 664 ☒ 667
Q 500+257 T 757 ☒ 797
Q 50+818 T 868 ☒ 861
Q 310+691 T 1001 ☒ 900
Q 378+548 T 926 ☒ 827
Q 46+59 T 105 ☒ 122
Q 49+817 T 866 ☒ 871
Iteration 3
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.6053 - loss: 1.0648 - val_accuracy: 0.6665 - val_loss: 0.9070
Q 1+266 T 267 ☒ 260
Q 73+257 T 330 ☒ 324
Q 421+628 T 1049 ☒ 1022
Q 85+590 T 675 ☒ 660
Q 66+34 T 100 ☒ 90
Q 256+639 T 895 ☒ 890
Q 6+677 T 683 ☑ 683
Q 162+637 T 799 ☒ 792
Q 5+324 T 329 ☒ 337
Q 848+34 T 882 ☒ 889
Iteration 4
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.6781 - loss: 0.8751 - val_accuracy: 0.7037 - val_loss: 0.8092
Q 677+1 T 678 ☒ 676
Q 1+531 T 532 ☒ 535
Q 699+60 T 759 ☒ 756
Q 475+139 T 614 ☒ 616
Q 327+592 T 919 ☒ 915
Q 48+912 T 960 ☒ 956
Q 520+78 T 598 ☒ 505
Q 318+8 T 326 ☒ 327
Q 914+53 T 967 ☒ 966
Q 734+0 T 734 ☒ 733
Iteration 5
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.7142 - loss: 0.7807 - val_accuracy: 0.7164 - val_loss: 0.7622
Q 150+337 T 487 ☒ 489
Q 72+934 T 1006 ☒ 1005
Q 171+62 T 233 ☒ 231
Q 108+21 T 129 ☒ 135
Q 755+896 T 1651 ☒ 1754
Q 117+1 T 118 ☒ 119
Q 148+95 T 243 ☒ 241
Q 719+956 T 1675 ☒ 1684
Q 656+43 T 699 ☒ 695
Q 368+8 T 376 ☒ 372
Iteration 6
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.7377 - loss: 0.7157 - val_accuracy: 0.7541 - val_loss: 0.6684
Q 945+364 T 1309 ☒ 1305
Q 762+96 T 858 ☒ 855
Q 5+650 T 655 ☑ 655
Q 52+680 T 732 ☒ 735
Q 77+724 T 801 ☒ 800
Q 46+739 T 785 ☑ 785
Q 843+43 T 886 ☒ 885
Q 158+3 T 161 ☒ 160
Q 426+711 T 1137 ☒ 1138
Q 157+41 T 198 ☒ 190
Iteration 7
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.7642 - loss: 0.6462 - val_accuracy: 0.7955 - val_loss: 0.5433
Q 822+27 T 849 ☑ 849
Q 82+495 T 577 ☒ 563
Q 9+366 T 375 ☒ 373
Q 9+598 T 607 ☒ 696
Q 186+41 T 227 ☒ 226
Q 920+920 T 1840 ☒ 1846
Q 445+345 T 790 ☒ 797
Q 783+588 T 1371 ☒ 1360
Q 36+473 T 509 ☒ 502
Q 354+61 T 415 ☒ 416
Iteration 8
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.8326 - loss: 0.4626 - val_accuracy: 0.9069 - val_loss: 0.2744
Q 458+154 T 612 ☑ 612
Q 309+19 T 328 ☑ 328
Q 808+97 T 905 ☑ 905
Q 28+736 T 764 ☑ 764
Q 28+79 T 107 ☑ 107
Q 44+84 T 128 ☒ 129
Q 744+13 T 757 ☑ 757
Q 24+996 T 1020 ☒ 1011
Q 8+193 T 201 ☒ 101
Q 483+9 T 492 ☒ 491
Iteration 9
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.9365 - loss: 0.2275 - val_accuracy: 0.9657 - val_loss: 0.1393
Q 330+61 T 391 ☑ 391
Q 207+82 T 289 ☒ 299
Q 23+234 T 257 ☑ 257
Q 690+567 T 1257 ☑ 1257
Q 293+97 T 390 ☒ 380
Q 312+868 T 1180 ☑ 1180
Q 956+40 T 996 ☑ 996
Q 97+105 T 202 ☒ 203
Q 365+44 T 409 ☑ 409
Q 76+639 T 715 ☑ 715
Iteration 10
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 7s 5ms/step - accuracy: 0.9717 - loss: 0.1223 - val_accuracy: 0.9744 - val_loss: 0.0965
Q 123+143 T 266 ☑ 266
Q 599+1 T 600 ☑ 600
Q 729+237 T 966 ☑ 966
Q 51+120 T 171 ☑ 171
Q 97+672 T 769 ☑ 769
Q 840+5 T 845 ☑ 845
Q 86+494 T 580 ☒ 570
Q 278+51 T 329 ☑ 329
Q 8+832 T 840 ☑ 840
Q 383+9 T 392 ☑ 392
Iteration 11
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 7s 5ms/step - accuracy: 0.9842 - loss: 0.0729 - val_accuracy: 0.9808 - val_loss: 0.0690
Q 181+923 T 1104 ☑ 1104
Q 747+24 T 771 ☑ 771
Q 6+65 T 71 ☑ 71
Q 75+994 T 1069 ☑ 1069
Q 712+587 T 1299 ☑ 1299
Q 977+10 T 987 ☑ 987
Q 742+24 T 766 ☑ 766
Q 215+44 T 259 ☑ 259
Q 817+683 T 1500 ☑ 1500
Q 102+48 T 150 ☒ 140
Iteration 12
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.9820 - loss: 0.0695 - val_accuracy: 0.9823 - val_loss: 0.0596
Q 819+885 T 1704 ☒ 1604
Q 34+20 T 54 ☑ 54
Q 9+996 T 1005 ☑ 1005
Q 915+811 T 1726 ☑ 1726
Q 166+640 T 806 ☑ 806
Q 229+82 T 311 ☑ 311
Q 1+418 T 419 ☑ 419
Q 552+28 T 580 ☑ 580
Q 279+733 T 1012 ☑ 1012
Q 756+734 T 1490 ☑ 1490
Iteration 13
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.9836 - loss: 0.0587 - val_accuracy: 0.9941 - val_loss: 0.0296
Q 793+0 T 793 ☑ 793
Q 79+48 T 127 ☑ 127
Q 484+92 T 576 ☑ 576
Q 39+655 T 694 ☑ 694
Q 64+708 T 772 ☑ 772
Q 568+341 T 909 ☑ 909
Q 9+918 T 927 ☑ 927
Q 48+912 T 960 ☑ 960
Q 31+289 T 320 ☑ 320
Q 378+548 T 926 ☑ 926
Iteration 14
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9915 - loss: 0.0353 - val_accuracy: 0.9901 - val_loss: 0.0358
Q 318+8 T 326 ☒ 325
Q 886+63 T 949 ☒ 959
Q 77+8 T 85 ☑ 85
Q 418+40 T 458 ☑ 458
Q 30+32 T 62 ☑ 62
Q 541+93 T 634 ☑ 634
Q 6+7 T 13 ☒ 14
Q 670+74 T 744 ☑ 744
Q 97+57 T 154 ☑ 154
Q 60+13 T 73 ☑ 73
Iteration 15
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.9911 - loss: 0.0335 - val_accuracy: 0.9934 - val_loss: 0.0262
Q 24+533 T 557 ☑ 557
Q 324+44 T 368 ☑ 368
Q 63+505 T 568 ☑ 568
Q 670+74 T 744 ☑ 744
Q 58+359 T 417 ☑ 417
Q 16+428 T 444 ☑ 444
Q 17+99 T 116 ☑ 116
Q 779+903 T 1682 ☑ 1682
Q 40+576 T 616 ☑ 616
Q 947+773 T 1720 ☑ 1720
Iteration 16
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9968 - loss: 0.0175 - val_accuracy: 0.9901 - val_loss: 0.0360
Q 315+155 T 470 ☑ 470
Q 594+950 T 1544 ☑ 1544
Q 372+37 T 409 ☑ 409
Q 537+47 T 584 ☑ 584
Q 8+263 T 271 ☑ 271
Q 81+500 T 581 ☑ 581
Q 75+270 T 345 ☑ 345
Q 0+796 T 796 ☑ 796
Q 655+965 T 1620 ☑ 1620
Q 384+1 T 385 ☑ 385
Iteration 17
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9972 - loss: 0.0148 - val_accuracy: 0.9924 - val_loss: 0.0278
Q 168+83 T 251 ☑ 251
Q 951+53 T 1004 ☑ 1004
Q 400+37 T 437 ☑ 437
Q 996+473 T 1469 ☒ 1569
Q 996+847 T 1843 ☑ 1843
Q 842+550 T 1392 ☑ 1392
Q 479+72 T 551 ☑ 551
Q 753+782 T 1535 ☑ 1535
Q 99+188 T 287 ☑ 287
Q 2+974 T 976 ☑ 976
Iteration 18
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 7s 5ms/step - accuracy: 0.9929 - loss: 0.0258 - val_accuracy: 0.9973 - val_loss: 0.0135
Q 380+62 T 442 ☑ 442
Q 774+305 T 1079 ☑ 1079
Q 248+272 T 520 ☑ 520
Q 479+736 T 1215 ☑ 1215
Q 859+743 T 1602 ☑ 1602
Q 667+20 T 687 ☑ 687
Q 932+56 T 988 ☑ 988
Q 740+31 T 771 ☑ 771
Q 588+88 T 676 ☑ 676
Q 109+57 T 166 ☑ 166
Iteration 19
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9977 - loss: 0.0116 - val_accuracy: 0.9571 - val_loss: 0.1416
Q 635+89 T 724 ☑ 724
Q 50+818 T 868 ☑ 868
Q 37+622 T 659 ☑ 659
Q 913+49 T 962 ☑ 962
Q 641+962 T 1603 ☒ 1503
Q 11+626 T 637 ☑ 637
Q 20+405 T 425 ☑ 425
Q 667+208 T 875 ☑ 875
Q 89+794 T 883 ☑ 883
Q 234+55 T 289 ☑ 289
Iteration 20
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9947 - loss: 0.0194 - val_accuracy: 0.9967 - val_loss: 0.0136
Q 5+777 T 782 ☑ 782
Q 1+266 T 267 ☑ 267
Q 579+1 T 580 ☑ 580
Q 665+6 T 671 ☑ 671
Q 210+546 T 756 ☑ 756
Q 660+86 T 746 ☑ 746
Q 75+349 T 424 ☑ 424
Q 984+36 T 1020 ☑ 1020
Q 4+367 T 371 ☑ 371
Q 249+213 T 462 ☑ 462
Iteration 21
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 7s 5ms/step - accuracy: 0.9987 - loss: 0.0081 - val_accuracy: 0.9840 - val_loss: 0.0481
Q 228+95 T 323 ☑ 323
Q 72+18 T 90 ☑ 90
Q 34+687 T 721 ☑ 721
Q 932+0 T 932 ☑ 932
Q 933+54 T 987 ☑ 987
Q 735+455 T 1190 ☑ 1190
Q 790+70 T 860 ☑ 860
Q 416+36 T 452 ☒ 462
Q 194+110 T 304 ☑ 304
Q 349+70 T 419 ☑ 419
Iteration 22
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 40s 28ms/step - accuracy: 0.9902 - loss: 0.0326 - val_accuracy: 0.9947 - val_loss: 0.0190
Q 95+237 T 332 ☑ 332
Q 5+188 T 193 ☑ 193
Q 19+931 T 950 ☑ 950
Q 38+499 T 537 ☑ 537
Q 25+21 T 46 ☑ 46
Q 55+85 T 140 ☑ 140
Q 555+7 T 562 ☑ 562
Q 83+873 T 956 ☑ 956
Q 95+527 T 622 ☑ 622
Q 556+558 T 1114 ☑ 1114
Iteration 23
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.9835 - loss: 0.0572 - val_accuracy: 0.9962 - val_loss: 0.0141
Q 48+413 T 461 ☑ 461
Q 71+431 T 502 ☑ 502
Q 892+534 T 1426 ☑ 1426
Q 934+201 T 1135 ☑ 1135
Q 898+967 T 1865 ☒ 1855
Q 958+0 T 958 ☑ 958
Q 23+179 T 202 ☑ 202
Q 138+60 T 198 ☑ 198
Q 718+5 T 723 ☑ 723
Q 816+514 T 1330 ☑ 1330
Iteration 24
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 20s 14ms/step - accuracy: 0.9932 - loss: 0.0255 - val_accuracy: 0.9932 - val_loss: 0.0243
Q 4+583 T 587 ☑ 587
Q 49+466 T 515 ☑ 515
Q 920+26 T 946 ☑ 946
Q 624+813 T 1437 ☑ 1437
Q 87+315 T 402 ☑ 402
Q 368+73 T 441 ☑ 441
Q 86+833 T 919 ☑ 919
Q 528+423 T 951 ☑ 951
Q 0+705 T 705 ☑ 705
Q 581+928 T 1509 ☑ 1509
Iteration 25
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.9908 - loss: 0.0303 - val_accuracy: 0.9944 - val_loss: 0.0169
Q 107+34 T 141 ☑ 141
Q 998+90 T 1088 ☑ 1088
Q 71+520 T 591 ☑ 591
Q 91+996 T 1087 ☑ 1087
Q 94+69 T 163 ☑ 163
Q 108+21 T 129 ☑ 129
Q 785+60 T 845 ☑ 845
Q 71+628 T 699 ☑ 699
Q 294+9 T 303 ☑ 303
Q 399+34 T 433 ☑ 433
Iteration 26
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9965 - loss: 0.0139 - val_accuracy: 0.9979 - val_loss: 0.0094
Q 19+133 T 152 ☑ 152
Q 841+3 T 844 ☑ 844
Q 698+6 T 704 ☑ 704
Q 942+28 T 970 ☑ 970
Q 81+735 T 816 ☑ 816
Q 325+14 T 339 ☑ 339
Q 790+64 T 854 ☑ 854
Q 4+839 T 843 ☑ 843
Q 505+96 T 601 ☑ 601
Q 917+42 T 959 ☑ 959
Iteration 27
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 72s 51ms/step - accuracy: 0.9952 - loss: 0.0173 - val_accuracy: 0.9992 - val_loss: 0.0036
Q 71+628 T 699 ☑ 699
Q 791+9 T 800 ☑ 800
Q 19+148 T 167 ☑ 167
Q 7+602 T 609 ☑ 609
Q 6+566 T 572 ☑ 572
Q 437+340 T 777 ☑ 777
Q 614+533 T 1147 ☑ 1147
Q 948+332 T 1280 ☑ 1280
Q 56+619 T 675 ☑ 675
Q 86+251 T 337 ☑ 337
Iteration 28
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 8s 6ms/step - accuracy: 0.9964 - loss: 0.0124 - val_accuracy: 0.9990 - val_loss: 0.0047
Q 2+572 T 574 ☑ 574
Q 437+96 T 533 ☑ 533
Q 15+224 T 239 ☑ 239
Q 16+655 T 671 ☑ 671
Q 714+5 T 719 ☑ 719
Q 645+417 T 1062 ☑ 1062
Q 25+919 T 944 ☑ 944
Q 89+329 T 418 ☑ 418
Q 22+513 T 535 ☑ 535
Q 497+983 T 1480 ☑ 1480
Iteration 29
1407/1407 ━━━━━━━━━━━━━━━━━━━━ 7s 5ms/step - accuracy: 0.9970 - loss: 0.0106 - val_accuracy: 0.9990 - val_loss: 0.0048
Q 2+962 T 964 ☑ 964
Q 6+76 T 82 ☑ 82
Q 986+20 T 1006 ☑ 1006
Q 727+49 T 776 ☑ 776
Q 948+332 T 1280 ☑ 1280
Q 921+463 T 1384 ☑ 1384
Q 77+556 T 633 ☑ 633
Q 133+849 T 982 ☑ 982
Q 301+478 T 779 ☑ 779
Q 3+243 T 246 ☑ 246
You'll get to 99+% validation accuracy after ~30 epochs.
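One caveat when reading these numbers: with one-hot targets of shape (samples, DIGITS + 1, len(chars)), the reported accuracy is, as far as we can tell, computed per output character rather than per whole answer. The sketch below contrasts the two on toy data; `per_char_accuracy` and `exact_match_accuracy` are hypothetical helpers, and the arrays stand in for `y_val` and the output of `model.predict(x_val)`.

```python
import numpy as np


def per_char_accuracy(y_true, y_pred):
    """Fraction of individual output positions predicted correctly."""
    return float(np.mean(y_true.argmax(axis=-1) == y_pred.argmax(axis=-1)))


def exact_match_accuracy(y_true, y_pred):
    """Fraction of samples where every output position is correct."""
    matches = y_true.argmax(axis=-1) == y_pred.argmax(axis=-1)
    return float(np.mean(np.all(matches, axis=-1)))


# Toy data: 2 samples, 4 output steps, 12 character classes.
true_idx = np.array([[3, 4, 5, 0], [2, 2, 9, 0]])
pred_idx = np.array([[3, 4, 5, 0], [2, 3, 9, 0]])  # one wrong character
y_true = np.eye(12)[true_idx]
y_pred = np.eye(12)[pred_idx]

print(per_char_accuracy(y_true, y_pred))     # 0.875
print(exact_match_accuracy(y_true, y_pred))  # 0.5
```

A single wrong digit costs one position in eight for the per-character score but invalidates the whole answer for the exact-match score, so the share of fully correct sums can be somewhat lower than the per-character figure.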