Hi all, I’d like to do a series on neural networks (or machine learning, or AI), starting from the very basics and not using any frameworks. This is inspired by Andrej Karpathy’s intro video here, so perhaps consider watching that (if you can find a few hours spare!). I seem to be writing about maths a lot lately, which gave me an idea: everyone understands spreadsheets (Excel / Numbers / Google Sheets), so I’m going to use them to (hopefully!) make the maths clearer.
I want to make ‘what is a neuron’ concrete in some way, to give you a ‘scaffold’ to build your learning on. I believe that helps.
So: say you want to define a mathematical formula for ‘how much is a rectangular block of land worth’. It has two inputs: width and length. It might look like this:
Land price($) = width(m) * length(m) * 200 + 100000
You could call this a function: value(s) in, value out. A machine learning neuron is just one of these: it takes some input(s), does some maths with them, and outputs a value. And a massive grid of these neurons, all connected together, can achieve surprisingly complex results.
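For the developers, here’s a rough sketch of that land price formula as a Rust function. The name `land_price` and the 20 m x 30 m example block are my own picks for illustration, nothing more:

```rust
/// Land price in dollars for a block of land, using the formula above:
/// width * length * 200 + 100000.
fn land_price(width_m: f64, length_m: f64) -> f64 {
    width_m * length_m * 200.0 + 100_000.0
}

fn main() {
    // A 20 m x 30 m block: 20 * 30 * 200 + 100000 = 220000
    println!("Land price: ${:.0}", land_price(20.0, 30.0));
}
```

Values in, value out: that’s all there is to it.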
Here’s how the maths behind a single neuron works. There’s not much to it:
Net input = Input 1 * Weight 1 + Input 2 * Weight 2 + Bias
Output = tanh(Net input)
Please click here to see the above in spreadsheet form. I tried embedding a nice JS spreadsheet but it didn’t work on mobile, hence the Google Sheets link. In case that doesn’t work, it looks like this:
Row | A | B | C | D | E | F | G | H | I |
---|---|---|---|---|---|---|---|---|---|
1 | Input 1 | Input 2 | Weight 1 | Weight 2 | Bias | Net | Output | Target | Loss |
2 | 0.9 | 0.8 | 0.7 | 0.6 | 0.5 | =A2 * C2 + B2 * D2 + E2 | =tanh(F2) | 0.8 | =(G2 - H2)^2 |
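If you’d like to check the spreadsheet row without opening a spreadsheet, here’s a small sketch that works through the same numbers. The function name `net` and the variable names `f2`, `g2` and `i2` are just my labels, matching the cells they stand in for:

```rust
/// Net input for a two-input neuron: each input times its weight, plus the bias.
fn net(input1: f32, input2: f32, weight1: f32, weight2: f32, bias: f32) -> f32 {
    input1 * weight1 + input2 * weight2 + bias
}

fn main() {
    // Values taken from row 2 of the spreadsheet.
    let target = 0.8_f32;
    let f2 = net(0.9, 0.8, 0.7, 0.6, 0.5); // 0.63 + 0.48 + 0.5 = 1.61
    let g2 = f2.tanh(); // roughly 0.923
    let i2 = (g2 - target).powi(2); // roughly 0.015

    println!("Net (F2):    {:.3}", f2);
    println!("Output (G2): {:.3}", g2);
    println!("Loss (I2):   {:.3}", i2);
}
```

So with these particular weights the neuron overshoots its target of 0.8 a little, and the loss is small but not zero.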
You may be wondering what ‘tanh’ is. It’s the hyperbolic tangent, which neatly squashes the net input and spits out a value between -1 and 1. This is called the ‘activation function’; there are other options (e.g. the logistic function) that can be used instead.
Initial values for the weights and bias are random numbers in the range -1..1. They are tweaked in the learning process, which I’ll explain in an upcoming article. The collection of weights and biases is also called the parameters.
Loss measures how ‘good’ a neural network is at calculating the desired target. It is never negative, and the closer to zero the better. In this simple example, it is calculated as the square of the output-vs-target delta:
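To get a feel for the ‘squashing’, here’s a quick sketch that prints tanh next to the logistic function for a handful of inputs. The `logistic` helper is my own; Rust’s standard library provides `tanh` on floats but, as far as I know, no built-in logistic function:

```rust
/// Logistic (sigmoid) function: squashes any input into the range 0..1.
fn logistic(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

fn main() {
    // The sample inputs are arbitrary, chosen only to show both tails.
    for x in [-5.0_f32, -1.0, 0.0, 1.0, 5.0] {
        println!(
            "x = {:>4.1}  tanh = {:>6.3}  logistic = {:>5.3}",
            x,
            x.tanh(),
            logistic(x)
        );
    }
}
```

Big negative inputs end up near -1 for tanh (or near 0 for the logistic function), big positive inputs end up near 1, and an input of 0 gives 0 for tanh and 0.5 for the logistic function.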
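Here’s one way the random starting values could be produced without pulling in any crates. This is only a sketch: the `Rng` struct is a tiny xorshift generator I’ve swapped in for demo purposes (Rust’s standard library has no random number generator), and the seed is an arbitrary non-zero number. For anything serious you would likely reach for a proper crate instead.

```rust
/// Tiny xorshift32 pseudo-random generator. A hypothetical helper for demos
/// only; the seed must be non-zero or it will only ever produce zeros.
struct Rng(u32);

impl Rng {
    fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }

    /// Random value in the range -1..1, for use as a weight or bias.
    fn next_weight(&mut self) -> f32 {
        // Scale the integer to 0..1, then stretch it to -1..1.
        let unit = self.next_u32() as f64 / u32::MAX as f64;
        (unit * 2.0 - 1.0) as f32
    }
}

fn main() {
    let mut rng = Rng(0x1234_5678); // arbitrary seed
    let weight1 = rng.next_weight();
    let weight2 = rng.next_weight();
    let bias = rng.next_weight();
    println!("weight1: {:.3}, weight2: {:.3}, bias: {:.3}", weight1, weight2, bias);
}
```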
loss = (output - target)^2
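To see how the loss behaves, here’s a small sketch that tries a few made-up outputs against a target of 0.8. The standalone `loss` function and the sample outputs are mine, picked purely for illustration:

```rust
/// Squared difference between what the neuron produced and what we wanted.
fn loss(output: f32, target: f32) -> f32 {
    (output - target).powi(2)
}

fn main() {
    let target = 0.8;
    // Made-up outputs: far below, a bit below, spot on, and a bit above the target.
    for output in [0.2_f32, 0.6, 0.8, 1.0] {
        println!("output = {:.1}  loss = {:.2}", output, loss(output, target));
    }
}
```

The losses come out at 0.36, 0.04, 0.00 and 0.04: missing by 0.2 costs the same whether you land above or below the target, and a bigger miss is punished much more heavily than a small one.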
Because I enjoy fooling around with Rust, here’s a little demo. Perhaps this will solidify the concepts from a developer’s perspective:
// A single neuron with two inputs, two weights, and a bias.
struct Neuron {
    input1: f32,
    input2: f32,
    weight1: f32,
    weight2: f32,
    bias: f32,
}

impl Neuron {
    // Net input: each input times its weight, plus the bias.
    fn net(&self) -> f32 {
        self.input1 * self.weight1 +
        self.input2 * self.weight2 +
        self.bias
    }

    // Output: the net input squashed into -1..1 by the activation function.
    fn output(&self) -> f32 {
        self.net().tanh()
    }

    // Loss: the square of the output-vs-target delta.
    fn loss(&self, target: f32) -> f32 {
        let delta = self.output() - target;
        delta * delta
    }
}

fn main() {
    let neuron = Neuron {
        input1: 0.1,
        input2: 0.2,
        weight1: 0.3,
        weight2: 0.4,
        bias: 0.5,
    };
    println!("Net: {:.3}", neuron.net());
    println!("Output: {:.3}", neuron.output());
    println!("Loss: {:.3}", neuron.loss(0.5));
}
Thanks for reading, hope you found this helpful, at least a tiny bit, God bless!
Photo by Josh Riemer on Unsplash
Thanks for reading! And if you want to get in touch, I'd love to hear from you: chris.hulbert at gmail.
(Comp Sci, Hons - UTS)
Software Developer (Freelancer / Contractor) in Australia.
I have worked at places such as Google, Cochlear, Assembly Payments, News Corp, Fox Sports, NineMSN, FetchTV, Coles, Woolworths, Trust Bank, and Westpac, among others. If you're looking for help developing an iOS app, drop me a line!
github.com/chrishulbert