Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
387 views
in Technique[技术] by (71.8m points)

python - Problems implementing an XOR gate with Neural Nets in Tensorflow

I want to make a trivial neural network, it should just implement the XOR gate. I am using the TensorFlow library, in python. For an XOR gate, the only data I train with, is the complete truth table, that should be enough right? Over optimization is what I will expect to happen very quickly. Problem with the code is that the weights and biases do not update. Somehow it still gives me 100% accuracy with zero for the biases and weights.

x = tf.placeholder("float", [None, 2])
W = tf.Variable(tf.zeros([2,2]))
b = tf.Variable(tf.zeros([2]))

y = tf.nn.softmax(tf.matmul(x,W) + b)

y_ = tf.placeholder("float", [None,1])


print "Done init"

cross_entropy = -tf.reduce_sum(y_*tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.75).minimize(cross_entropy)

print "Done loading vars"

init = tf.initialize_all_variables()
print "Done: Initializing variables"

sess = tf.Session()
sess.run(init)
print "Done: Session started"

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1], [0], [0], [0]])


acc=0.0
while acc<0.85:
  for i in range(500):
      sess.run(train_step, feed_dict={x: xTrain, y_: yTrain})


  print b.eval(sess)
  print W.eval(sess)


  print "Done training"


  correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

  accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

  print "Result:"
  acc= sess.run(accuracy, feed_dict={x: xTrain, y_: yTrain})
  print acc

B0 = b.eval(sess)[0]
B1 = b.eval(sess)[1]
W00 = W.eval(sess)[0][0]
W01 = W.eval(sess)[0][1]
W10 = W.eval(sess)[1][0]
W11 = W.eval(sess)[1][1]

for A,B in product([0,1],[0,1]):
  top = W00*A + W01*A + B0
  bottom = W10*B + W11*B + B1
  print "A:",A," B:",B
  # print "Top",top," Bottom: ", bottom
  print "Sum:",top+bottom

I am following the tutorial from http://tensorflow.org/tutorials/mnist/beginners/index.md#softmax_regressions and in the final for-loop I am printing the results form the matrix(as described in the link).

Can anybody point out my error and what I should do to fix it?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

There are a few issues with your program.

The first issue is that the function you're learning isn't XOR - it's NOR. The lines:

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1], [0], [0], [0]])

...should be:

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[0], [1], [1], [0]])

The next big issue is that the network you've designed isn't capable of learning XOR. You'll need to use a non-linear function (such as tf.nn.relu() and define at least one more layer to learn the XOR function. For example:

x = tf.placeholder("float", [None, 2])
W_hidden = tf.Variable(...)
b_hidden = tf.Variable(...)
hidden = tf.nn.relu(tf.matmul(x, W_hidden) + b_hidden)

W_logits = tf.Variable(...)
b_logits = tf.Variable(...)
logits = tf.matmul(hidden, W_logits) + b_logits

A further issue is that initializing the weights to zero will prevent your network from training. Typically, you should initialize your weights randomly, and your biases to zero. Here's one popular way to do it:

HIDDEN_NODES = 2

W_hidden = tf.Variable(tf.truncated_normal([2, HIDDEN_NODES], stddev=1./math.sqrt(2)))
b_hidden = tf.Variable(tf.zeros([HIDDEN_NODES]))

W_logits = tf.Variable(tf.truncated_normal([HIDDEN_NODES, 2], stddev=1./math.sqrt(HIDDEN_NODES)))
b_logits = tf.Variable(tf.zeros([2]))

Putting it all together, and using TensorFlow routines for cross-entropy (with a one-hot encoding of yTrain for convenience), here's a program that learns XOR:

import math
import tensorflow as tf
import numpy as np

HIDDEN_NODES = 10

x = tf.placeholder(tf.float32, [None, 2])
W_hidden = tf.Variable(tf.truncated_normal([2, HIDDEN_NODES], stddev=1./math.sqrt(2)))
b_hidden = tf.Variable(tf.zeros([HIDDEN_NODES]))
hidden = tf.nn.relu(tf.matmul(x, W_hidden) + b_hidden)

W_logits = tf.Variable(tf.truncated_normal([HIDDEN_NODES, 2], stddev=1./math.sqrt(HIDDEN_NODES)))
b_logits = tf.Variable(tf.zeros([2]))
logits = tf.matmul(hidden, W_logits) + b_logits

y = tf.nn.softmax(logits)

y_input = tf.placeholder(tf.float32, [None, 2])

cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, y_input)
loss = tf.reduce_mean(cross_entropy)

train_op = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

init_op = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init_op)

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])

for i in xrange(500):
  _, loss_val = sess.run([train_op, loss], feed_dict={x: xTrain, y_input: yTrain})

  if i % 10 == 0:
    print "Step:", i, "Current loss:", loss_val
    for x_input in [[0, 0], [0, 1], [1, 0], [1, 1]]:
      print x_input, sess.run(y, feed_dict={x: [x_input]})

Note that this is probably not the most efficient neural network for computing XOR, so suggestions for tweaking the parameters are welcome!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...