Implementing, Tuning, and Curating Data for Perceptrons and Feedforward Networks

Overview

The purpose of this assignment is to learn the underpinnings of feedforward networks. You will implement a perceptron and a feedforward network (FFnet) with backpropagation. You will experiment with key hyperparameters to understand how they interact and how they influence the performance of the networks. You will additionally learn how to curate data so that it can be used to successfully train neural networks.

Assignment

  1. Review the materials from days 4, 5, 7, and 8.
  2. Throughout this assignment, complete the lab manual.
  3. Engineered Perceptron [5 pts] As a warm-up exercise, install the EngineeredPerceptron.java file and the Testing.java file. EngineeredPerceptron.java is supposed to implement the Boolean AND function. Run it; it should fail to implement AND. Your job is to identify the lowest positive weight values (to one decimal place) and the lowest threshold value (also to one decimal place) such that the network implements AND. Write down the three values in the lab manual. A sketch of this kind of thresholded unit appears after this list.
  4. Learning Perceptron [20 pts] Continuing with perceptrons, install Perceptron.java. Above, you engineered a perceptron to recognize AND; you will now experiment with learning AND. Implement a sigmoid activation function as well as a function that calculates the derivative of the sigmoid activation function (see the sketch after this list). Then, complete the perceptron learning algorithm and a testing procedure. Explain in the lab manual what the criteria are for the network to successfully produce a desired output. The testing procedure may simply print to the console. Test your network implementation using the initial weights and learning rate that we used in the AND Perceptron worksheet. The weights after running one epoch should be:
    Weights, left to right are:
    0.911496 0.911496
    
    Now, change the perceptron to have small random initial weights and experiment with different learning rates and numbers of epochs. Attempt to learn Boolean AND. Hint: you should not be able to learn AND. In your lab manual, fill in the table for this question. Please experiment with the following learning rates: 0.3, 0.5, 0.7, and 0.9, and vary the number of training epochs. Answer the questions in the lab manual for this item.
  5. FFnet [40 pts] Implement a feed-forward network and test it. Name the file "FFnet.java"; feel free to copy and paste from the "Perceptron.java" file. Your network only needs a single hidden layer (a sketch of the forward pass and backpropagation updates for this architecture appears after this list). Test your network implementation in the "Testing.java" file, using the XOR FFnet worksheet. The weights after running one epoch should be:
    Input layer Weights, left to right:
    0.9824008133397731 -1.0174696871329179 -1.0175482180492386 0.9824436414782634
    Output layer Weights, left to right:
    0.937738882338471 0.938059808506576
    
    Now, change the FFnet to have small random initial weights and experiment with different learning rates and numbers of epochs. Attempt to learn Boolean XOR. Hint: you should not be able to learn XOR. In your lab manual, explain why this network cannot learn XOR.
  6. Data Curation Light [10 pts] Modify the "Testing.java" file so that it reads in all of the training and testing data for MNIST. Here is the MNIST training data; an explanation of it can be found here. Here are some hints for curating the data. Here is a set of procedures that visualizes the data after you read it in properly. Here is the PixelGrid.java file that goes with it. Paste the visualization of the first three images from the training set into the lab manual. A sketch of a reader for the raw MNIST files appears after this list.
  7. Training an FFnet on MNIST [25 pts] Modify your "Testing.java" file to set up and train an FFnet on MNIST. Experiment with the following hyperparameters and ranges:
  8. Write a procedure that determines the number of errors (see the sketch after this list). Since you are using a sigmoid activation function, you need to be slightly creative in determining when your network produced the correct answer. Please maintain two counts: the number of items you get wrong in the 10-neuron output layer (notice that you may get all 10 items wrong), and a second count that only tallies whether you get the desired output wrong. Run these statistics for both the training set and the testing set.
  9. How many epochs did it take for the network to converge?
  10. Answer the questions in the lab manual.
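
Code Sketches

The sketches below are illustrative starting points, not required solutions; adapt them to the provided files and worksheets. For item 3, this first sketch shows the general shape of a hard-threshold unit over 0/1 inputs. The class name, weights, and threshold are placeholders and are not the minimal one-decimal values you are asked to find.

    // Minimal sketch of a hard-threshold unit for Boolean AND.
    // The weights and threshold below are illustrative placeholders,
    // NOT the minimal one-decimal values the assignment asks you to find.
    public class EngineeredPerceptronSketch {
        static double w1 = 1.0, w2 = 1.0;  // placeholder weights
        static double threshold = 1.5;     // placeholder threshold

        // Fire (output 1) if the weighted sum reaches the threshold, else output 0.
        static int fire(int x1, int x2) {
            double sum = w1 * x1 + w2 * x2;
            return sum >= threshold ? 1 : 0;
        }

        public static void main(String[] args) {
            int[][] inputs = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
            for (int[] in : inputs) {
                System.out.printf("AND(%d, %d) -> %d%n", in[0], in[1], fire(in[0], in[1]));
            }
        }
    }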
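
For item 4, here is a minimal sketch of a sigmoid, its derivative, and one epoch of the delta-rule update for a two-input perceptron with no bias. The class name, initial weights, and learning rate are assumptions; substitute the values from the AND Perceptron worksheet to reproduce the weights listed in that item.

    // Sketch of the sigmoid, its derivative, and one epoch of the delta-rule
    // update for a two-input perceptron with no bias. The initial weights and
    // learning rate are assumptions; use the AND Perceptron worksheet's values.
    public class PerceptronSketch {
        static double[] weights = { 1.0, 1.0 };  // assumed initial weights
        static final double LEARNING_RATE = 0.5; // assumed learning rate

        static double sigmoid(double x) {
            return 1.0 / (1.0 + Math.exp(-x));
        }

        // Derivative written in terms of the sigmoid's output y = sigmoid(x).
        static double sigmoidDerivative(double y) {
            return y * (1.0 - y);
        }

        // One pass over the four AND training pairs.
        static void trainOneEpoch(double[][] inputs, double[] targets) {
            for (int i = 0; i < inputs.length; i++) {
                double net = weights[0] * inputs[i][0] + weights[1] * inputs[i][1];
                double out = sigmoid(net);
                double delta = (targets[i] - out) * sigmoidDerivative(out);
                for (int j = 0; j < weights.length; j++) {
                    weights[j] += LEARNING_RATE * delta * inputs[i][j];
                }
            }
        }

        public static void main(String[] args) {
            double[][] inputs = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
            double[] targets = { 0, 0, 0, 1 };
            trainOneEpoch(inputs, targets);
            System.out.printf("Weights, left to right are:%n%f %f%n", weights[0], weights[1]);
        }
    }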
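
For item 5, this sketch outlines the forward pass and one backpropagation update for a 2-2-1 sigmoid network. The absence of bias terms is an assumption inferred from the four input-layer and two output-layer weights listed in that item; confirm the actual architecture against the XOR FFnet worksheet.

    // Sketch of a 2-2-1 feed-forward network with sigmoid units and one
    // backpropagation update per training pattern. The absence of bias terms
    // is an assumption inferred from the weight counts above; confirm against
    // the XOR FFnet worksheet.
    public class FFnetSketch {
        double[][] hiddenWeights = new double[2][2]; // [hidden unit][input]
        double[] outputWeights = new double[2];      // [hidden unit]
        double learningRate = 0.5;                   // assumed; use the worksheet's value

        static double sigmoid(double x) {
            return 1.0 / (1.0 + Math.exp(-x));
        }

        // Forward pass: fills hiddenOut with the hidden activations, returns the output.
        double forward(double[] x, double[] hiddenOut) {
            for (int h = 0; h < 2; h++) {
                hiddenOut[h] = sigmoid(hiddenWeights[h][0] * x[0] + hiddenWeights[h][1] * x[1]);
            }
            return sigmoid(outputWeights[0] * hiddenOut[0] + outputWeights[1] * hiddenOut[1]);
        }

        // One backpropagation step for a single input/target pair.
        void train(double[] x, double target) {
            double[] hiddenOut = new double[2];
            double out = forward(x, hiddenOut);

            // Error term for the output unit.
            double deltaOut = (target - out) * out * (1.0 - out);

            // Error terms for the hidden units, computed before any weight changes.
            double[] deltaHidden = new double[2];
            for (int h = 0; h < 2; h++) {
                deltaHidden[h] = deltaOut * outputWeights[h] * hiddenOut[h] * (1.0 - hiddenOut[h]);
            }

            // Apply the weight updates.
            for (int h = 0; h < 2; h++) {
                outputWeights[h] += learningRate * deltaOut * hiddenOut[h];
                for (int i = 0; i < 2; i++) {
                    hiddenWeights[h][i] += learningRate * deltaHidden[h] * x[i];
                }
            }
        }
    }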
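
For item 6, this is a sketch of a reader for MNIST in its original IDX format (big-endian integers for the magic number and counts, followed by raw unsigned bytes). If the linked training data is distributed in a different format, such as CSV, the parsing will need to change.

    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;

    // Sketch of a reader for MNIST in its original IDX format: a big-endian
    // magic number and counts, followed by raw unsigned bytes. If the linked
    // data is distributed in another format (e.g. CSV), adapt the parsing.
    public class MnistReaderSketch {

        // Returns images as [imageIndex][row * cols + col], pixels scaled to 0..1.
        static double[][] readImages(String path) throws IOException {
            try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
                in.readInt();              // magic number (2051 for image files)
                int count = in.readInt();
                int rows = in.readInt();
                int cols = in.readInt();
                double[][] images = new double[count][rows * cols];
                for (int i = 0; i < count; i++) {
                    for (int p = 0; p < rows * cols; p++) {
                        images[i][p] = in.readUnsignedByte() / 255.0;
                    }
                }
                return images;
            }
        }

        // Returns labels as integers 0..9.
        static int[] readLabels(String path) throws IOException {
            try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
                in.readInt();              // magic number (2049 for label files)
                int count = in.readInt();
                int[] labels = new int[count];
                for (int i = 0; i < count; i++) {
                    labels[i] = in.readUnsignedByte();
                }
                return labels;
            }
        }
    }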
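
For item 8, here is one way to maintain the two error counts, assuming a 10-neuron output layer with one-hot targets. The 0.5 cutoff for the per-neuron count and the argmax rule for the per-pattern count are assumptions, not the only reasonable criteria; state whatever criterion you actually use in the lab manual.

    // Sketch of the two error counts, assuming a 10-neuron output layer with
    // one-hot targets. The 0.5 cutoff and the argmax rule below are one
    // reasonable choice, not the only one; document the criterion you use.
    public class ErrorCountsSketch {

        static int neuronErrors = 0;   // output neurons on the wrong side of the cutoff
        static int patternErrors = 0;  // patterns where the desired digit is not the strongest output

        static void tally(double[] outputs, int desiredDigit) {
            // First count: every output neuron on the wrong side of 0.5.
            for (int d = 0; d < outputs.length; d++) {
                boolean shouldBeOn = (d == desiredDigit);
                boolean isOn = outputs[d] >= 0.5;
                if (shouldBeOn != isOn) {
                    neuronErrors++;
                }
            }
            // Second count: the pattern is wrong if the desired digit is not
            // the neuron with the largest output.
            int best = 0;
            for (int d = 1; d < outputs.length; d++) {
                if (outputs[d] > outputs[best]) {
                    best = d;
                }
            }
            if (best != desiredDigit) {
                patternErrors++;
            }
        }
    }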

Submission

Please submit a zipped copy of the following items to the appropriate drop-box on Moodle.
  1. The lab manual.
  2. Your Perceptron.java, FFnet.java and Testing.java files.