Everyone that has an ear in the tech world has heard of machine learning. It’s known as a highly intellectual and mathematical field of study that is only practiced by the most scholarly programmers. The general opinion is that you need to know calculus to be able to create anything resembling machine learning. On the contrary, this article will guide you through creating a perceptron in Python without any advanced mathematical theory, and in less than 60 lines of code.
What is a Perceptron?
A perceptron uses the basic ideas of machine learning and neural networks. The idea is that you feed a program a bunch of inputs, and it learns how to process those inputs into an output. It does that by assigning each input a weight. Each input is multiplied by that weight, and summed together. Lastly, we need to turn that sum into a value: 1 or -1. When training a perceptron, we evaluate the output that our program generates, and adjust our weights based on what the inputs were supposed to put out. In effect, perceptrons can help us classify data. Don’t worry if that explanation confused you, you’ll begin to understand as we start coding.
Coding a Perceptron Class
We should start off by creating a perceptron class. In the initializer function, we want to initialize our weights, each one starting off as a random number between -1 and 1. To generate random numbers, we will use random.random()
, which returns a number between 0 and 1.
import random
class Perceptron:
def __init__(self, learn_speed, num_weights):
self.speed = learn_speed
self.weights = []
for x in range(0, num_weights):
self.weights.append(random.random()*2-1)
The first parameter, learn_speed
, is used to control how fast our perceptron will learn. The lower the value, the longer it will take to learn, but the less one value will change each overall weight. If this parameter is too high, our program will change its weights so quickly that they are inaccurate. On the other hand, if learn_speed
is too low, it will take forever to train the perceptron accurately. A good value for this parameter is about 0.01-0.05.
The second parameter, num_weights
, controls how many weights the perceptron will have. Our perceptron will also have the same number of inputs as it does weights, because each input has its own weight.
Next, we need to create a function in our class to take in inputs, and turn them into an output. We do this by multiplying each input by its corresponding weight, summing all those together, and then checking if the sum is greater than 0. In your perceptron class, add this code after the __init__
function:
def feed_forward(self, inputs):
sum = 0
# multiply inputs by weights and sum them
for x in range(0, len(self.weights)):
sum += self.weights[x] * inputs[x]
# return the 'activated' sum
return self.activate(sum)
def activate(self, num):
# turn a sum over 0 into 1, and below 0 into -1
if num > 0:
return 1
return -1
The code above is the base for our perceptron. If you can understand this code very well, you will have a fantastic grasp on the fundamentals of machine learning. Let’s dissect this code piece by piece.
The first function, feed_forward
, is used to turn inputs into outputs. The term feed forward is commonly used in neural networks to describe this process of turning inputs into outputs. This method weights each input based on each corresponding weights. It sums them up, and then uses the activate function to return either 1 or -1.
The activate
function is used to turn a number into 1 or -1. This is implemented because when we use a perceptron, we want to classify data. We classify it into two groups, one of which is represented by 1, and the other is represented by -1.
You might be wondering, “What’s the use of this if the weights are random?” That’s why we have to train the perceptron before we use it. In our train function, we want to make a guess based on the inputs provided, and then see how our guess compared to the output we wanted. The train function for the perceptron class is shown below.
def train(self, inputs, desired_output):
guess = self.feed_forward(inputs)
error = desired_output - guess
for x in range(0, len(self.weights)):
self.weights[x] += error*inputs[x]*self.speed
Most of the first few lines should make sense. Our function takes in inputs, and the output that should happen when we run the inputs through our program. We make a guess on the inputs using our feed_forward
function, and then calculate our error based on what we should have outputted. Notice that if we predicted correctly, error will equal to 0, and the last line of our function will not change our weights at all.
The last two lines of this function are the juicy part — they put the learning in machine learning. We loop through each weight and adjust it by how much error we had. Notice that we are using the self.speed
variable here, which determines how fast out perceptron learns. By running this train function on a bunch of inputs and their outputs, we can eventually teach our perceptron to get the correct output.
Training the Perceptron
Our perceptron has no use if we don’t actually train it. We will do this by coding a quick Trainer class. In this example, we will train our perceptron to tell us whether a point is above a line or below a line. Our line, in this case, is represented by the equation y = 0.5x + 10. Once you know how to train a perceptron to recognize a line, you can represent x and y as different attributes, and above or below the line as results of those attributes.
For example, if you had a dataset on the GPAs and ACT scores of Harvard applicants, and whether they got accepted or not, you could train a perceptron to find a line on a graph where x=GPA score and y=ACT score. Above the line would be students that got accepted, and below the line would be students that got rejected. You could then use this perceptron to predict whether or not a student will get accepted into Harvard based on their GPA and ACT scores.
In this example, we’ll stick with recognizing a line. To do this, we will create a Trainer class that trains a perceptron with points, and whether or not they are above the line. Below is the code for our Trainer class:
class Trainer:
def __init__(self):
self.perceptron = Perceptron(0.01, 3)
def f(self, x):
return 0.5*x + 10 # line: f(x) = 0.5x + 10
def train(self):
for x in range(0, 1000000):
x_coord = random.random()*500-250
y_coord = random.random()*500-250
line_y = self.f(x_coord)
if y_coord > line_y: # above the line
answer = 1
self.perceptron.train([x_coord, y_coord,1], answer)
else: # below the line
answer = -1
self.perceptron.train([x_coord, y_coord,1], answer)
return self.perceptron # return our trained perceptron
As you can see, the initializer for the Trainer class creates a perceptron with three inputs and a learning speed of 0.01. The first two inputs are x and y, but what is the last input? This is another core concept of neural networks and machine learning. That last input will always set to 1. The weight that corresponds to it will determine how it affects our line. For example, if you look back at our equation: y = 0.5x + 10, we need some way of representing the y-intercept, 10. We do this by creating a third input that increases or decreases based on the weight that the perceptron trains it to have. Think of it as a threshold that helps the perceptron understand that the line is adjusted 10 units upward.
In our f
function, we take in an x coordinate and return a y coordinate. This is used to find points on the line based on their x coordinate, which will come in handy in the next function.
This train
function for the Trainer class is where all the magic happens, and we actually get to train our perceptron. We start off by looping 1 million times. Remember how we had a learning speed for our perceptron? The more times that we train our perceptron (in this case, 1 million times), the more accurate it will become, even with a low learning speed.
In each iteration of the loop, we create a point, determine if it is above or below the line, and then feed those inputs into the perceptron’s train
method. First, x and y coordinates are randomly generated between -250 and 250. Next, we find where the y coordinate would be on the line for that x value to see if our point is above the line. For example, if we picked a point at (1, 3), then we should get the y coordinate of the line for the x value of 3. We do this with our f
function. If our random y coordinate is higher than the corresponding y coordinate on the line, we know that our random coordinate is above the line.
That’s what we do in the if...else
statement. If our point is above the line, we set the expected output, stored in answer
to be 1. If our point is below the line, our expected output is -1. We then train our perceptron based on the x coordinate, the y coordinate, and our expected output. After the whole loop is done, we return our newly trained perceptron object.
Running the Program
To run the program, we create a trainer object, and call its .train()
method.
trainer = Trainer()
p = trainer.train()
Now for the moment of glory; we run the program. Let’s pick two points, (-7, 9) and (3, 1). The first point is above the line, so it should return 1, and the second is below the line, so it should return -1. Let’s see how we would run our perceptron:
print "(-7, 9): " + p.feed_forward([-7,9,1])
print "(3, 1): " + p.feed_forward([3,1,1])
And if we run it:
(-7, 9): 1
(3, 1): -1
Success! Our program has detected whether each point is above or below the line. You can try more points to test for yourself that the program is running correctly.
###Wrapping up
This article is meant as a starting point to help you understand some of the fundamentals of machine learning; specifically, with neural networks. If you would like to delve further into this subject, check out some of these links: