I wrote an implementation of a simple neural network that learns XOR: http://datasciencetoolbox.blogspot.com/2018/01/backpropagati... It is based on the article https://www.cs.cmu.edu/~dst/pubs/byte-hiddenlayer-1989.pdf. I am hoping it will be an easy way to introduce neural networks to beginners. How can it be improved?