Following Filip Malczak's and Seanny123's suggestions and comments, I implemented a neural network in TensorFlow to check what happens when we try to teach it to predict (and interpolate) the square function, y = x^2.
Training on continuous interval
I trained the network on the interval [-7,7] (using 300 evenly spaced points to approximate a continuous interval), and then tested it on the interval [-30,30]. The activation functions are ReLU, and the network has 3 hidden layers of 50 units each. I trained for 500 epochs. The result is depicted in the figure below.
So basically, inside (and close to) the interval [-7,7] the fit is nearly perfect, and outside it the output continues more or less linearly. It is nice to see that, at least initially, the slope of the network's output tries to "match" the slope of x^2. If we increase the test interval, the two graphs diverge quite a lot, as one can see in the figure below:
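The linear continuation is what one would expect from a ReLU network, since it computes a piecewise-linear function with finitely many kinks. Here is a minimal NumPy sketch of that idea. It is not the trained network: the weights are hand-picked (my assumption, for illustration only) so that the hinges sit at the integers and the model reproduces the piecewise-linear interpolant of x^2 on [-7,7]. The names `relu`, `knots` and `hinge_net` are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Hand-picked (not trained) hinge locations: one kink at every integer in -6..6.
knots = np.arange(-6, 7)

def hinge_net(x):
    """One-hidden-layer ReLU model that interpolates x^2 at the integers in [-7, 7]."""
    x = np.asarray(x, dtype=float)
    # (x + 7) written with two ReLU units, so the model uses ReLU units only
    linear_part = relu(x + 7) - relu(-(x + 7))
    # each hinge raises the slope by 2, matching the slope change of x^2 per unit step
    hinges = relu(x[..., None] - knots).sum(axis=-1)
    return 49.0 - 13.0 * linear_part + 2.0 * hinges

xs = np.array([-7.0, 0.0, 3.0, 7.0, 10.0, 20.0, 30.0])
for x, y in zip(xs, hinge_net(xs)):
    print("x = {:6.1f}   model = {:7.1f}   x^2 = {:7.1f}".format(x, y, x ** 2))
```

Inside [-7,7] the model agrees with x^2 at every integer. Past the last kink the slope stays fixed at 13, so at x=30 the model gives 348 while x^2 is 900. This is the same kind of divergence as in the figures, although a trained network will place its kinks differently.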
Training on even numbers
Finally, if I instead train the network on the set of all even integers in the interval [-100,100], and apply it to the set of all integers (even and odd) in this interval, I get:
To produce the image above, I increased the number of epochs to 2500 to get better accuracy. The rest of the parameters stayed unchanged. So it seems that interpolating "inside" the training interval works quite well (except perhaps for the area around 0, where the fit is a bit worse).
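For this experiment only the data preparation and the epoch count differ from the code for the first figure. A minimal sketch of that data preparation follows, assuming both endpoints -100 and 100 are included. The name `held_out_x` is hypothetical and is only there to show which points the network never saw.

```python
import numpy as np

# training set: even integers in [-100, 100] (endpoints assumed to be included)
train_x = np.arange(-100, 101, 2, dtype=np.float32).reshape(-1, 1)
train_y = train_x ** 2

# test set: every integer in [-100, 100], even and odd
test_x = np.arange(-100, 101, dtype=np.float32).reshape(-1, 1)
test_y = test_x ** 2

# the odd integers are the points the network has to interpolate
held_out_x = test_x[test_x[:, 0] % 2 != 0]

epochs = 2500  # increased from 500, as described above

print(len(train_x), len(test_x), len(held_out_x))
```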
Here is the code that I used for the first figure:
# Written against the TensorFlow 1.x graph API (tf.placeholder, tf.Session)
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.python.framework.ops import reset_default_graph

# preparing training data: 300 evenly spaced points in [-7, 7]
train_x = np.linspace(-7, 7, 300).reshape(-1, 1)
train_y = train_x ** 2

# setting network features: 3 hidden layers of 50 units and 1 output unit
dimensions = [50, 50, 50, 1]
epochs = 500
batch_size = 5

reset_default_graph()
X = tf.placeholder(tf.float32, shape=[None, 1])
Y = tf.placeholder(tf.float32, shape=[None, 1])

weights = []
biases = []
n_inputs = 1

# initializing variables
for i, n_outputs in enumerate(dimensions):
    with tf.variable_scope("layer_{}".format(i)):
        w = tf.get_variable(name="W", shape=[n_inputs, n_outputs],
                            initializer=tf.random_normal_initializer(mean=0.0, stddev=0.02, seed=42))
        # in TF 1.x the shape goes to get_variable; zeros_initializer() takes no shape argument
        b = tf.get_variable(name="b", shape=[n_outputs], initializer=tf.zeros_initializer())
        weights.append(w)
        biases.append(b)
        n_inputs = n_outputs

def forward_pass(X, weights, biases):
    h = X
    for i in range(len(weights)):
        h = tf.add(tf.matmul(h, weights[i]), biases[i])
        # ReLU is applied after every layer, including the output layer;
        # this is harmless here because the target x^2 is non-negative
        h = tf.nn.relu(h)
    return h

output_layer = forward_pass(X, weights, biases)

# squared error, summed over the examples in the batch
cost = tf.reduce_mean(tf.squared_difference(output_layer, Y), 1)
cost = tf.reduce_sum(cost)
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # train the network
    for i in range(epochs):
        # reshuffle the training points at the start of every epoch
        idx = np.arange(len(train_x))
        np.random.shuffle(idx)
        for j in range(len(train_x) // batch_size):
            cur_idx = idx[batch_size * j:batch_size * (j + 1)]
            sess.run(optimizer, feed_dict={X: train_x[cur_idx], Y: train_y[cur_idx]})
        #current_cost=sess.run(cost,feed_dict={X:train_x,Y:train_y})
        #print(current_cost)
    # apply the network on the test data
    test_x = np.linspace(-30, 30, 300)
    network_output = sess.run(output_layer, feed_dict={X: test_x.reshape(-1, 1)})

plt.plot(test_x, test_x ** 2, color='r', label='y=x^2')
plt.plot(test_x, network_output, color='b', label='network output')
plt.legend(loc='center')
plt.show()