{"id":12542,"date":"2025-04-10T18:12:29","date_gmt":"2025-04-10T18:12:29","guid":{"rendered":"https:\/\/cheesecakelabs.com\/blog\/"},"modified":"2025-05-29T19:05:56","modified_gmt":"2025-05-29T19:05:56","slug":"building-neural-networks-from-scratch","status":"publish","type":"post","link":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/","title":{"rendered":"Building Neural Networks from Scratch"},"content":{"rendered":"\n<p>Neural networks are powerful machine learning algorithms that have transformed countless industries. They can power everything from fraud detection and demand forecasting to personalized recommendations and autonomous systems and are a great way to incorporate smarter decision-making into your applications.&nbsp;<\/p>\n\n\n\n<p>In this guide, we&#8217;ll dive deep into the fundamentals of neural networks, from the first representations of artificial neurons to implementing your own linear regression and classification models.&nbsp;<\/p>\n\n\n\n<p>Get ready to unlock the full potential of neural networks and embark on an exciting journey in artificial intelligence.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>1. Neural networks as a machine learning algorithm<\/strong><\/h2>\n\n\n\n<p>Neural networks are a type of machine learning algorithm inspired by the structure and function of the human brain. They are made up of interconnected nodes, or \u201cneurons,\u201d that can learn to perform specific tasks by analyzing large amounts of data.<\/p>\n\n\n\n<p>Neural networks have many applications, including image recognition, natural language processing, speech recognition, and predictive analytics. They excel at identifying patterns and making complex decisions, making them a valuable tool in many industries.<\/p>\n\n\n\n<p>Neural networks are highly flexible and can adapt to a variety of problems. 
They can learn from data and improve their performance over time, making them powerful tools for tackling complex, real-world challenges.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2. Task, model, performance measurement, and experience<\/strong><\/h2>\n\n\n\n<p>To build an effective neural network, you need to consider several key components that define its development and success. These include clearly identifying the task the network will perform, selecting an appropriate model, establishing performance measurement criteria, and leveraging experience through training data.&nbsp;<\/p>\n\n\n\n<p>Each element plays a crucial role in shaping how well the neural network learns and generalizes to new data. Here\u2019s what each element does and why it\u2019s important:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Task &#8211; <\/strong>The first step in building a neural network is to define the task you want it to perform. This could be anything from image classification to natural language processing. Defining the task helps you design the appropriate network architecture and choose the correct training data.<\/li>\n\n\n\n<li><strong>Model &#8211; <\/strong>Model selection is an important aspect of building a neural network. Different types of neural network models (like feedforward neural networks, convolutional neural networks, and recurrent neural networks) are suited for different tasks. Choosing the right model improves the network&#8217;s performance and ability to solve the desired problem.<\/li>\n\n\n\n<li><strong>Performance measurement<\/strong> &#8211; Once you&#8217;ve defined the task, you need to determine how to measure your neural network\u2019s performance. Metrics can include accuracy, precision, recall, or F1-score, depending on the specific problem you&#8217;re trying to solve.<\/li>\n\n\n\n<li><strong>Experience<\/strong> &#8211; The final step is to provide the neural network with training data, allowing it to learn and improve its performance over time. This experience phase is crucial for building a high-performing model that can generalize well to new, unseen data.<\/li>\n<\/ul>\n\n\n\n<p>In this post, we\u2019re looking at the first artificial neuron model, so we\u2019ll define the task as a linear regression problem.&nbsp;<\/p>\n\n\n\n<p>In a linear regression problem, we use the artificial neuron to represent a function. Instead of specifying a function like y = mx + b ourselves, we present the data to the artificial neuron and let it discover the appropriate values for the coefficients <em>m<\/em> and <em>b<\/em> that represent the problem. We will return to this in Section 4.<\/p>\n\n\n\n<p>In the next section, we will describe artificial neuron modeling. Don\u2019t be scared of the math \u2014 we won\u2019t go too deep into it, promise!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>3. The first representation of an artificial neuron (Perceptron)<\/strong><\/h2>\n\n\n\n<p>The foundation of modern neural networks can be traced back to the earliest mathematical representation of an artificial neuron, known as the <a href=\"https:\/\/www.geeksforgeeks.org\/what-is-perceptron-the-simplest-artificial-neural-network\/\" target=\"_blank\" rel=\"noreferrer noopener\">Perceptron<\/a>. Building on the artificial neuron model proposed by Warren McCulloch and Walter Pitts in 1943, and later formalized by Frank Rosenblatt in 1958, this simple yet powerful model mimics how biological neurons process information. 
By taking weighted inputs, applying a bias, and passing the result through an activation function, the perceptron is the fundamental building block for more advanced neural network architectures.&nbsp;<\/p>\n\n\n\n<p>This section explores the key components of the perceptron and how they contribute to its ability to make decisions.&nbsp;<\/p>\n\n\n\n<p>Equation 1 is the mathematical model of the artificial neuron:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img decoding=\"async\" width=\"1013\" height=\"135\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/formula1.png\" alt=\"\" class=\"wp-image-12558\" style=\"width:328px;height:auto\" srcset=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/formula1.png 1013w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/formula1-600x80.png 600w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/formula1-768x102.png 768w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/formula1-760x101.png 760w\" sizes=\"(max-width: 1013px) 100vw, 1013px\" \/><\/figure>\n\n\n\n<p>Where:&nbsp;<\/p>\n\n\n\n<p><em>y<\/em> is the output,<\/p>\n\n\n\n<p>phi is the activation function,<\/p>\n\n\n\n<p><em>X<\/em> is the vector containing all the inputs,<\/p>\n\n\n\n<p><em>W<\/em> is the vector containing all the weights,<\/p>\n\n\n\n<p><em>b<\/em> is the bias<\/p>\n\n\n\n<p>From linear algebra, we can describe the vector product as:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img decoding=\"async\" width=\"1192\" height=\"322\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Captura-de-Tela-2025-04-10-as-14.17.57.png\" alt=\"\" class=\"wp-image-12630\" style=\"width:539px;height:auto\" srcset=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Captura-de-Tela-2025-04-10-as-14.17.57.png 
1192w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Captura-de-Tela-2025-04-10-as-14.17.57-600x162.png 600w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Captura-de-Tela-2025-04-10-as-14.17.57-768x207.png 768w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Captura-de-Tela-2025-04-10-as-14.17.57-760x205.png 760w\" sizes=\"(max-width: 1192px) 100vw, 1192px\" \/><\/figure>\n\n\n\n<p>Where:&nbsp;<\/p>\n\n\n\n<p><em>n<\/em> is the total number of inputs and weights.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Inputs &#8211; <\/strong>The first representation of an artificial neuron, proposed by McCulloch and Pitts, takes a set of inputs (<em>x<\/em><em><sub>1<\/sub><\/em>, <em>x<\/em><em><sub>2<\/sub><\/em>, &#8230;, <em>x<\/em><em><sub>n<\/sub><\/em>) and assigns a weight (<em>w<\/em><em><sub>1<\/sub><\/em>, <em>w<\/em><em><sub>2<\/sub><\/em>, &#8230;, <em>w<\/em><em><sub>n<\/sub><\/em>) to each input.<\/li>\n\n\n\n<li><strong>Weighted sum &#8211; <\/strong>The neuron then computes the weighted sum of the inputs, which is the sum of the products of each input and its corresponding weight, as shown by Equation 2.<\/li>\n\n\n\n<li><strong>Bias &#8211; <\/strong>The bias, represented by the parameter <em>b<\/em>, is added to the weighted sum before the activation function is applied. This allows the neuron to shift its activation threshold, enabling more complex decision boundaries.<\/li>\n\n\n\n<li><strong>Activation function &#8211; <\/strong>Finally, the neuron applies an activation function, such as a step function or a sigmoid function, to the weighted sum to determine the neuron&#8217;s output. 
We will talk about activation functions later.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img decoding=\"async\" width=\"512\" height=\"284\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/neuron.png\" alt=\"\" class=\"wp-image-12543\" style=\"width:712px;height:auto\"\/><figcaption class=\"wp-element-caption\">The biological neuron that inspired McCulloch and Pitts to develop the model behind the Perceptron.<\/figcaption><\/figure>\n<\/div>\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img decoding=\"async\" width=\"512\" height=\"263\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/equation-with-artificial-neuron.png\" alt=\"\" class=\"wp-image-12545\" style=\"width:710px;height:auto\"\/><figcaption class=\"wp-element-caption\">Visual representation of Equation 1, which makes it easy to compare with a real neuron.&nbsp;&nbsp;<\/figcaption><\/figure>\n<\/div>\n\n\n<p>Next, we will describe how to use this model to solve a linear regression problem. We\u2019ll also show you how to measure the model error and train it to reduce that error.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>4. The regression problem<\/strong><\/h2>\n\n\n\n<p>As described in Section 2, the first task we will assign to our artificial neuron is a regression problem. 
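Before simplifying anything, the full perceptron described in Section 3 can be sketched in a few lines of Python. This is a minimal illustration of Equations 1 and 2, assuming a step activation function and hand-picked (not learned) weights:

```python
def step(z: float) -> int:
    # Step activation: output 1 if the pre-activation crosses the threshold
    return 1 if z >= 0 else 0

def perceptron(inputs: list[float], weights: list[float], bias: float) -> int:
    # Equation 2: weighted sum of the inputs (the dot product X . W)
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # Equation 1: apply the activation to the weighted sum plus the bias
    return step(weighted_sum + bias)

# Illustrative values: two inputs, two weights, and a bias
output = perceptron([1.0, 0.5], [0.4, -0.2], bias=-0.1)
```

With these example values the weighted sum is 0.3, the bias shifts it to 0.2, and the step function fires 1.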
To simplify the representation, we can remove the activation function and keep only one input.&nbsp;<br><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"512\" height=\"294\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/simplification.png\" alt=\"\" class=\"wp-image-12547\"\/><figcaption class=\"wp-element-caption\">Simplified representation of an artificial neuron<\/figcaption><\/figure>\n<\/div>\n\n\n<p>Before we dive deeper into linear regression, let\u2019s answer a few questions:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>What is linear regression?<\/strong><\/h3>\n\n\n\n<p>Linear regression is a fundamental machine learning algorithm that can be used to model the relationship between a dependent variable and one or more independent variables. The goal is to find the best-fitting straight line that minimizes the distance between the data points and the line.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>What are the applications of linear regression?<\/strong><\/h3>\n\n\n\n<p>Linear regression has many applications, including predicting sales, forecasting stock prices, and analyzing the relationship between various factors in social and economic studies. It&#8217;s a powerful tool for understanding and quantifying the relationships between variables.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>How to implement a linear regression?<\/strong><\/h3>\n\n\n\n<p>To implement linear regression, we need to define the model equation, which takes the form <em>y = mx + b<\/em>, where <em>y<\/em> is the dependent variable, <em>x<\/em> is the independent variable, <em>m<\/em> is the slope, and <em>b<\/em> is the y-intercept. 
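As a point of reference, with a single input variable there is also a closed-form least-squares solution for <em>m</em> and <em>b</em> (the approach covered next finds them iteratively instead). A sketch, using hypothetical data points:

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    # Ordinary least squares for y = m*x + b with one independent variable
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope m: covariance of x and y divided by the variance of x
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    # Intercept b: forces the line through the mean point of the data
    b = mean_y - m * mean_x
    return m, b

# Points lying exactly on y = 2x + 1 recover m = 2 and b = 1
m, b = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
```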
We then use optimization techniques to find the values of <em>m<\/em> and <em>b<\/em> that best fit the data.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>What are the limitations of linear regression?<\/strong><\/h3>\n\n\n\n<p>While linear regression is a practical algorithm, it has limitations. It assumes a linear relationship between the variables, which may not always be the case. It can also be sensitive to outliers and may not perform well when dealing with complex, non-linear relationships.<\/p>\n\n\n\n<p><a href=\"https:\/\/cheesecakelabs.com\/blog\/ai-regression-applications\/\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Check out this post to learn more about using regression in AI development.<\/em><\/a><\/p>\n\n\n\n<p>So, we\u2019ve defined the task and model we will work on, but we need two more steps to make everything fit. Let\u2019s take a look at how we measure the model results and how we train it.&nbsp;<\/p>\n\n\n\n<p>There are different methods and algorithms to measure and train artificial neurons. They get more complex as the problems we tackle get more challenging. But, to understand how we connect everything we will use the Mean Square Error (MSE) and the backpropagation algorithms to solve our problem.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>4.1 Measuring the error<\/strong><\/h2>\n\n\n\n<p>To measure the error of our model, we use a loss or cost function that quantifies the difference between the predicted values and the actual values of the dependent variable. The goal is to minimize this function by adjusting the slope and y-intercept values. 
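For instance, a squared-error loss over a small dataset can be computed like this (a sketch; the predicted and actual values are made up):

```python
def mean_squared_error(predicted: list[float], actual: list[float]) -> float:
    # Average of the squared differences between predictions and references
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical predictions vs. reference values
loss = mean_squared_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
```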
The most commonly used loss function in linear regression is the<a href=\"https:\/\/en.wikipedia.org\/wiki\/Mean_squared_error\" target=\"_blank\" rel=\"noreferrer noopener\"> Mean Squared Error (MSE)<\/a>, which calculates the average squared difference between the predicted and actual values, as described in Equation 3.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img decoding=\"async\" width=\"393\" height=\"143\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Captura-de-Tela-2025-04-10-as-14.20.10.png\" alt=\"\" class=\"wp-image-12632\" style=\"width:343px;height:auto\"\/><\/figure>\n\n\n\n<p>Where: Y is the predicted value, and \u0176 is the reference value from the training data set. <strong><br>Ok, now what do we do with the value of the MSE?<\/strong><\/p>\n\n\n\n<p>We use it to adjust the model\u2019s weight and bias, which makes it predict values closer to the reference value from the data set. For that, we use the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Backpropagation\" target=\"_blank\" rel=\"noreferrer noopener\">backpropagation algorithm<\/a> and a bunch of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hyperparameter_(machine_learning)\" target=\"_blank\" rel=\"noreferrer noopener\">hyperparameters<\/a> to help us train the model.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>4.2 Training the model<\/strong><\/h2>\n\n\n\n<p>Artificial neurons are trained by adjusting the weights and biases associated with them through a process called backpropagation. During training, the input data is fed through the neural network, and the output is compared to the desired output. The difference between the two is used to calculate the loss, and then the weights and biases are adjusted to minimize the loss using optimization algorithms like gradient descent. 
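For a single weight under a squared-error loss, one such adjustment can be sketched as follows (the input, target, and learning rate values are illustrative):

```python
def gradient_descent_step(w: float, x: float, target: float, lr: float) -> float:
    # Forward pass for a bias-free, one-input neuron
    predicted = w * x
    # Chain rule: d/dw of (w*x - target)**2 is 2*(predicted - target)*x
    gradient = 2 * (predicted - target) * x
    # Step against the gradient, scaled by the learning rate
    return w - lr * gradient

w = 0.0
for _ in range(100):
    w = gradient_descent_step(w, x=2.0, target=4.0, lr=0.05)
# w approaches 2.0, the weight that maps x = 2.0 to the target 4.0
```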
This process is repeated iteratively until the network reaches satisfactory accuracy.<\/p>\n\n\n\n<p>The<a href=\"https:\/\/en.wikipedia.org\/wiki\/Gradient_descent\" target=\"_blank\" rel=\"noreferrer noopener\"> Gradient Descent<\/a> algorithm in machine learning is used to minimize the cost function and find the optimal set of weights by following the steepest descent in the negative direction of the gradient.&nbsp;<\/p>\n\n\n\n<p>In each iteration, the parameters are updated by subtracting the gradient multiplied by a learning rate, which is a hyperparameter that determines the step size. The process is repeated until the convergence criteria are met.&nbsp;<\/p>\n\n\n\n<p>The gradient descent equation is shown in Equation 4, where \ud835\udefb<em>F(y)<\/em> is the gradient of the cost function and <em>w<\/em> is the model weight; the gradient is scaled by the learning rate at each update.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img decoding=\"async\" width=\"350\" height=\"85\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Captura-de-Tela-2025-04-10-as-14.24.16.png\" alt=\"\" class=\"wp-image-12634\" style=\"width:334px;height:auto\"\/><\/figure>\n\n\n\n<p>Figure 4 presents the process of converging the weight to a minimum value using gradient descent. It is important to note a few points:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Learning rate<\/strong>: The learning rate hyperparameter has no exact value. It must be chosen empirically by testing different values and observing which gives the best results. However, a hint is to use very small values, such as 0.0001 or 0.00001, as high values can make the algorithm overshoot the local minimum, skipping the optimal weight value and causing the loss to diverge.&nbsp;<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Epochs<\/strong>: The number of epochs is the number of times we want the backpropagation algorithm to run through our model to adjust the weights and bias. 
This is another hyperparameter that must be chosen by the model developer, and again the hint is to test: if training is short and the number of epochs is too low, the model can\u2019t learn very well and won\u2019t generalize correctly (a phenomenon known as underfitting). Meanwhile, if the number of epochs is too high, the model becomes overly specialized on the training data and loses the ability to generalize (a phenomenon known as overfitting).<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img decoding=\"async\" width=\"512\" height=\"319\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/Gradient-Descendent.png\" alt=\"\" class=\"wp-image-12551\" style=\"width:512px;height:auto\"\/><figcaption class=\"wp-element-caption\"><em>Graphical representation of the Gradient Descent algorithm applied to a model to find the weight value that minimizes the cost function.<\/em><\/figcaption><\/figure>\n<\/div>\n\n\n<p>Today, there are algorithms that help choose the best hyperparameter values for Perceptrons. Here, however, we present an example developed from scratch, without the help of any framework or library. Working through this process is essential to understanding how bigger neural networks are trained: the algorithms get more sophisticated to reduce training time, but the process is the same.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>4.3 Show me the code<\/strong><\/h2>\n\n\n\n<p>Okay, after all this theory and math, let\u2019s dive into the code to understand how to develop a simplified artificial neuron to solve a linear regression task. 
All the code presented here is available in <a href=\"https:\/\/github.com\/paulormnas\/neuralNetFromScratch\/tree\/main\" target=\"_blank\" rel=\"noreferrer noopener\">this repository<\/a>.<br><br>This Python code implements a simple neural network with a single neuron, capable of learning a linear relationship between inputs and outputs. It includes methods for forward propagation, loss computation, and training using gradient descent. Let\u2019s break it down step by step.<\/p>\n\n\n<pre class=\"wp-block-code alignwide\" style=\"padding-top:0;padding-right:var(--wp--preset--spacing--80);padding-bottom:0;padding-left:var(--wp--preset--spacing--80)\" aria-describedby=\"shcb-language-1\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> random\n\n<span class=\"hljs-class\"><span class=\"hljs-keyword\">class<\/span> <span class=\"hljs-title\">NeuralNetwork<\/span>:<\/span>\n\t<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">__init__<\/span><span class=\"hljs-params\">(self)<\/span>:<\/span>\n    \t  <span class=\"hljs-comment\"># Random initialize weight<\/span>\n    \t  self.weight = random.random()\n    \t  self.bias = random.random()\n\n\t<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">forward<\/span><span class=\"hljs-params\">(self, _input: float | int)<\/span>:<\/span>\n    \t  <span class=\"hljs-comment\"># Calculate the weighted sum<\/span>\n    \t  output = self.weight * _input + self.bias\n    \t  <span class=\"hljs-keyword\">return<\/span> output\n\n\t<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">predict<\/span><span class=\"hljs-params\">(self, X: &#91;float])<\/span> -&gt; &#91;float]:<\/span>\n    \t  Y_predicted = &#91;]\n    \t  <span class=\"hljs-keyword\">for<\/span> x <span 
class=\"hljs-keyword\">in<\/span> X:\n          prediction = self.forward(x)\n          Y_predicted.append(prediction)\n    \t  <span class=\"hljs-keyword\">return<\/span> Y_predicted\n\n<span class=\"hljs-meta\">\t@staticmethod<\/span>\n\t<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">compute_loss<\/span><span class=\"hljs-params\">(predicted_output: float, target: float)<\/span> -&gt; float:<\/span>\n    \t  <span class=\"hljs-comment\"># In this example we used Mean Square Error (MSE)<\/span>\n    \t  <span class=\"hljs-keyword\">return<\/span> (predicted_output - target) ** <span class=\"hljs-number\">2<\/span>\n\n\t<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">compute_total_loss<\/span><span class=\"hljs-params\">(self, targets: &#91;float], predicted_outputs: &#91;float])<\/span> -&gt; float:<\/span>\n    \t  <span class=\"hljs-comment\"># In this example we used Mean Square Error (MSE)<\/span>\n    \t  total_loss = <span class=\"hljs-number\">0<\/span>\n    \t  <span class=\"hljs-keyword\">for<\/span> target, predicted_output <span class=\"hljs-keyword\">in<\/span> zip(targets, predicted_outputs):\n          total_loss += (predicted_output - target) ** <span class=\"hljs-number\">2<\/span>\n\n    \t  <span class=\"hljs-keyword\">return<\/span> total_loss\n\n\t<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">compute_loss_derivative<\/span><span class=\"hljs-params\">(self, predicted_output: float, target: float)<\/span> -&gt; float:<\/span>\n    \t  <span class=\"hljs-keyword\">return<\/span> (predicted_output - target) * <span class=\"hljs-number\">2<\/span>\n\n\t<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">train<\/span><span class=\"hljs-params\">(self, training_sample: &#91;float], learning_rate=<span class=\"hljs-number\">0.01<\/span>, epochs=<span 
class=\"hljs-number\">1000<\/span>)<\/span>:<\/span>\n    \t  <span class=\"hljs-keyword\">for<\/span> epoch <span class=\"hljs-keyword\">in<\/span> range(<span class=\"hljs-number\">1<\/span>, epochs + <span class=\"hljs-number\">1<\/span>):\n          total_loss = <span class=\"hljs-number\">0<\/span>\n\n          <span class=\"hljs-keyword\">for<\/span> _input, target <span class=\"hljs-keyword\">in<\/span> training_sample:\n            <span class=\"hljs-comment\"># Forward Pass<\/span>\n            predicted_output = self.forward(_input)\n            <span class=\"hljs-comment\"># print(_input, target, predicted_output)<\/span>\n\n            <span class=\"hljs-comment\"># Calculate Loss<\/span>\n            loss = self.compute_loss(predicted_output, target)\n            total_loss += loss\n\n            <span class=\"hljs-comment\"># Backward Pass (Calculate gradients)<\/span>\n            loss_derivative_value = self.compute_loss_derivative(predicted_output, target)\n            gradient = loss_derivative_value * _input\n            bias_gradient = loss_derivative_value\n\n            <span class=\"hljs-comment\"># Update Weights and Bias<\/span>\n            self.weight -= learning_rate * gradient\n            self.bias -= learning_rate * bias_gradient\n\n          <span class=\"hljs-comment\"># Print loss every 100 epochs<\/span>\n          <span class=\"hljs-keyword\">if<\/span> epoch % <span class=\"hljs-number\">100<\/span> == <span class=\"hljs-number\">0<\/span>:\n            print(<span class=\"hljs-string\">f\"Epoch: <span class=\"hljs-subst\">{epoch}<\/span>, Loss: <span class=\"hljs-subst\">{total_loss:<span class=\"hljs-number\">.4<\/span>f}<\/span>, Weight: <span class=\"hljs-subst\">{self.weight:<span class=\"hljs-number\">.4<\/span>f}<\/span>, Bias: <span class=\"hljs-subst\">{self.bias:<span class=\"hljs-number\">.4<\/span>f}<\/span>\"<\/span>)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-1\"><span 
class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>The <code>NeuralNetwork<\/code> class starts with a constructor (<code>__init__<\/code>), which initializes a weight and a bias. These values are randomly assigned using <code>random.random()<\/code>, ensuring the model starts with non-zero parameters.<\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-2\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> random\n\n<span class=\"hljs-class\"><span class=\"hljs-keyword\">class<\/span> <span class=\"hljs-title\">NeuralNetwork<\/span>:<\/span>\n\t<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">__init__<\/span><span class=\"hljs-params\">(self)<\/span>:<\/span>\n    \t <span class=\"hljs-comment\"># Randomly initialize weight and bias<\/span>\n    \t self.weight = random.random()\n    \t self.bias = random.random()<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-2\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>The <code>forward<\/code> function performs a forward pass through the neuron. 
It calculates the weighted sum of the input and adds the bias.<br><\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-3\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">forward<\/span><span class=\"hljs-params\">(self, _input: float | int)<\/span>:<\/span>\n\t<span class=\"hljs-comment\"># Calculate the weighted sum<\/span>\n\toutput = self.weight * _input + self.bias\n\t<span class=\"hljs-keyword\">return<\/span> output<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-3\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>The <code>predict<\/code> method takes a list of inputs and returns the corresponding outputs. 
It simply applies the <code>forward<\/code> function to each input value.<\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-4\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">predict<\/span><span class=\"hljs-params\">(self, X: &#91;float])<\/span> -&gt; &#91;float]:<\/span>\n\tY_predicted = &#91;]\n\t <span class=\"hljs-keyword\">for<\/span> x <span class=\"hljs-keyword\">in<\/span> X:\n    \t  prediction = self.forward(x)\n    \t  Y_predicted.append(prediction)\n\t<span class=\"hljs-keyword\">return<\/span> Y_predicted<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-4\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>The <code>compute_loss<\/code> function calculates the squared error (the per-sample term of the MSE) for a single prediction. 
The squared difference ensures that errors are always positive and penalizes larger deviations more heavily.<br><\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-5\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-meta\">@staticmethod<\/span>\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">compute_loss<\/span><span class=\"hljs-params\">(predicted_output: float, target: float)<\/span> -&gt; float:<\/span>\n\t<span class=\"hljs-keyword\">return<\/span> (predicted_output - target) ** <span class=\"hljs-number\">2<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-5\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>The <code>compute_total_loss<\/code> function computes the total loss over a dataset by summing individual squared errors. 
This helps track how well the model performs over multiple data points.<\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-6\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">compute_total_loss<\/span><span class=\"hljs-params\">(self, targets: &#91;float], predicted_outputs: &#91;float])<\/span> -&gt; float:<\/span>\n\ttotal_loss = <span class=\"hljs-number\">0<\/span>\n\t<span class=\"hljs-keyword\">for<\/span> target, predicted_output <span class=\"hljs-keyword\">in<\/span> zip(targets, predicted_outputs):\n    \ttotal_loss += (predicted_output - target) ** <span class=\"hljs-number\">2<\/span>\n\t<span class=\"hljs-keyword\">return<\/span> total_loss<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-6\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>The <code>compute_loss_derivative<\/code> function calculates the derivative of the loss function with respect to the predicted output. 
Since we&#8217;re using MSE, the derivative is:&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img decoding=\"async\" width=\"503\" height=\"96\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Captura-de-Tela-2025-04-10-as-14.28.35.png\" alt=\"\" class=\"wp-image-12636\" style=\"width:457px;height:auto\"\/><\/figure>\n\n\n\n<p>This derivative is essential for gradient descent to update model parameters.<\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-7\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">compute_loss_derivative<\/span><span class=\"hljs-params\">(self, predicted_output: float, target: float)<\/span> -&gt; float:<\/span>\n\t<span class=\"hljs-keyword\">return<\/span> (predicted_output - target) * <span class=\"hljs-number\">2<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-7\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>Lastly, the <code>train<\/code> function is used to train the artificial neuron using gradient descent:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Forward Pass<\/strong>: Computes predictions for each input.<\/li>\n\n\n\n<li><strong>Loss Calculation<\/strong>: Evaluates how far predictions are from actual values.<\/li>\n\n\n\n<li><strong>Backward Pass<\/strong>: Uses <strong>loss derivative<\/strong> to compute gradients.<\/li>\n\n\n\n<li><strong>Parameter Update<\/strong>: Adjusts <code>weight<\/code> and <code>bias<\/code> using the <strong>learning rate<\/strong>.<\/li>\n<\/ol>\n\n\n\n<p>The loss is printed every 
100 epochs to track training progress.<\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-8\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">train<\/span><span class=\"hljs-params\">(self, training_sample: &#91;(float, float)], learning_rate=<span class=\"hljs-number\">0.01<\/span>, epochs=<span class=\"hljs-number\">1000<\/span>)<\/span>:<\/span>\n\t<span class=\"hljs-keyword\">for<\/span> epoch <span class=\"hljs-keyword\">in<\/span> range(<span class=\"hljs-number\">1<\/span>, epochs + <span class=\"hljs-number\">1<\/span>):\n    \ttotal_loss = <span class=\"hljs-number\">0<\/span>\n\n    \t<span class=\"hljs-keyword\">for<\/span> _input, target <span class=\"hljs-keyword\">in<\/span> training_sample:\n        \t<span class=\"hljs-comment\"># Forward Pass<\/span>\n        \tpredicted_output = self.forward(_input)\n\n        \t<span class=\"hljs-comment\"># Calculate Loss<\/span>\n        \tloss = self.compute_loss(predicted_output, target)\n        \ttotal_loss += loss\n\n        \t<span class=\"hljs-comment\"># Backward Pass (Calculate gradients)<\/span>\n        \tloss_derivative_value = self.compute_loss_derivative(predicted_output, target)\n        \tgradient = loss_derivative_value * _input\n        \tbias_gradient = loss_derivative_value\n\n        \t<span class=\"hljs-comment\"># Update Weights and Bias<\/span>\n        \tself.weight -= learning_rate * gradient\n        \tself.bias -= learning_rate * bias_gradient\n\n    \t<span class=\"hljs-comment\"># Print loss every 100 epochs<\/span>\n    \t<span class=\"hljs-keyword\">if<\/span> epoch % <span class=\"hljs-number\">100<\/span> == <span class=\"hljs-number\">0<\/span>:\n        \tprint(<span class=\"hljs-string\">f\"Epoch: <span class=\"hljs-subst\">{epoch}<\/span>, Loss: <span class=\"hljs-subst\">{total_loss:<span 
class=\"hljs-number\">.4<\/span>f}<\/span>, Weight: <span class=\"hljs-subst\">{self.weight:<span class=\"hljs-number\">.4<\/span>f}<\/span>, Bias: <span class=\"hljs-subst\">{self.bias:<span class=\"hljs-number\">.4<\/span>f}<\/span>\"<\/span>)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-8\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>Let\u2019s look at an example to clarify how we use this class. The code below was extracted from the same repository, and the complete version can be found in the linear_regression.ipynb Jupyter Notebook.<\/p>\n\n\n\n<p>To illustrate the linear regression process, the notebook generates synthetic data:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Input Values (X)<\/strong>: A sequence of evenly spaced values.<\/li>\n\n\n\n<li><strong>Output Values (Y)<\/strong>: Generated using a linear relationship with <code>X<\/code> in the form <code>Y = mX + b + noise<\/code>, where <code>m<\/code> is the slope, <code>b<\/code> is the intercept, and <code>noise<\/code> adds variability to simulate real-world data.<\/li>\n<\/ul>\n\n\n\n<p>This synthetic data serves as a controlled environment to demonstrate the mechanics of linear regression. We added random noise to spread the values apart. 
Otherwise, we would have just an ascending straight line with a slope equal to 5.<\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-9\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> random\n\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">regression_function<\/span><span class=\"hljs-params\">(samples=<span class=\"hljs-number\">100<\/span>)<\/span> -&gt; (&#91;int], &#91;float]):<\/span>\n    X = &#91;]\n    Y = &#91;]\n    <span class=\"hljs-keyword\">for<\/span> x <span class=\"hljs-keyword\">in<\/span> range(samples):\n        <span class=\"hljs-comment\"># y = 5 * x + noise<\/span>\n        y = <span class=\"hljs-number\">5<\/span> * x + random.uniform(<span class=\"hljs-number\">-100<\/span>, <span class=\"hljs-number\">100<\/span>)\n        X.append(x)\n        Y.append(y)\n\n    <span class=\"hljs-keyword\">return<\/span> X, Y<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-9\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"512\" height=\"383\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/03\/synthetic-data-.png\" alt=\"\" class=\"wp-image-12554\"\/><figcaption class=\"wp-element-caption\">A plot of the generated synthetic data that the artificial neuron will learn from.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>Next, we instantiate an artificial neuron and feed it some data to check whether it can predict the output. Figure 6 shows the result, with the red line representing the artificial neuron\u2019s predictions. 
<\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-10\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">from<\/span> perceptron.linear_regression <span class=\"hljs-keyword\">import<\/span> NeuralNetwork\n\nnn = NeuralNetwork()\nY_predicted = nn.predict(X)\nprint_prediction_function(X, Y, Y_predicted)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-10\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"552\" height=\"413\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/image14.png\" alt=\"\" class=\"wp-image-12640\"\/><\/figure>\n<\/div>\n\n\n<p>As you can see, the red line doesn&#8217;t reflect the reality of the training data, which means we need to train the artificial neuron. The next step is to call the <code>train<\/code> method with <code>epochs = 100000<\/code> and <code>learning_rate = 0.00001<\/code>. This applies the backpropagation algorithm 100000 times and updates the weight and bias in very small steps, which helps the model converge to values that represent the data well. The final values obtained after training were:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Epoch: 100000,&nbsp;<\/li>\n\n\n\n<li>Loss: 367628.9078,&nbsp;<\/li>\n\n\n\n<li>Weight: 5.4246,&nbsp;<\/li>\n\n\n\n<li>Bias: -18.1625<\/li>\n<\/ul>\n\n\n\n<p>It is important to note that the Loss value is accumulated over the training process. 
So, in the first epochs, we have a huge error, but as the neuron updates its weight and bias, the error stabilizes.<\/p>\n\n\n<pre class=\"wp-block-code alignwide\" aria-describedby=\"shcb-language-11\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\">training_sample = &#91;*zip(X, Y)]\nlearning_rate = <span class=\"hljs-number\">0.00001<\/span>\nepochs = <span class=\"hljs-number\">100000<\/span>\nnn.train(training_sample, learning_rate, epochs)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-11\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>Figure 7 presents the final result: the predictions of the trained artificial neuron. <\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img decoding=\"async\" width=\"552\" height=\"413\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/image16.png\" alt=\"\" class=\"wp-image-12642\" style=\"width:533px;height:auto\"\/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\"><strong>Wrapping up<\/strong><\/h2>\n\n\n\n<p>Now, we can use this trained artificial neuron to predict new values. For example, we could extend this approach to a house price prediction task, where we have dozens or hundreds of input variables and need to use them to estimate a house&#8217;s final value based on its attributes.&nbsp;<\/p>\n\n\n\n<p>It is important to note that we didn\u2019t cover key aspects of the training process, such as allocating 80% of the data for training, using the remaining 20% for validation, or cleaning and standardizing the data. These techniques help during model development and can be tricky to apply well. 
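The whole pipeline described above can be sketched end to end in plain Python. This is a minimal, self-contained version for readers who want to run it directly; the class name <code>ArtificialNeuron<\/code> and the initialization values are our assumptions, not the exact code from the repository:

```python
import random

random.seed(42)  # for reproducible results in this sketch

# Minimal sketch of the single-neuron linear regression described above.
class ArtificialNeuron:
    def __init__(self):
        self.weight = random.uniform(-1, 1)
        self.bias = 0.0

    def forward(self, x: float) -> float:
        # Linear model: y = w * x + b
        return self.weight * x + self.bias

    def train(self, samples, learning_rate=0.00001, epochs=1000):
        for _ in range(epochs):
            for x, target in samples:
                predicted = self.forward(x)
                # MSE derivative: dL/dpred = 2 * (pred - target);
                # chain rule gives the weight and bias gradients.
                d_loss = 2 * (predicted - target)
                self.weight -= learning_rate * d_loss * x
                self.bias -= learning_rate * d_loss

# Synthetic data, as in the article: y = 5x + noise
data = [(x, 5 * x + random.uniform(-100, 100)) for x in range(100)]

neuron = ArtificialNeuron()
neuron.train(data, learning_rate=0.00001, epochs=1000)
print(f"Learned weight: {neuron.weight:.4f}, bias: {neuron.bias:.4f}")
```

With these settings the learned weight lands near the true slope of 5, while the bias converges much more slowly because its gradient is not scaled by the input.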
But you can count on the <a href=\"https:\/\/cheesecakelabs.com\/\">Cheesecake Labs<\/a> team to help you with neural network model development. We have specialized engineers ready to dive into the data and create models that solve the problems your business is facing.<br><br>In a future blog post, we will take a look at the classification problem. Until then, check out some of our other posts and reports that explain our approach to machine learning and <a href=\"https:\/\/cheesecakelabs.com\/services\/ai-development\">AI app development<\/a> at Cheesecake Labs:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/cheesecakelabs.com\/blog\/ai-regression-applications\/\">How to Use AI Regression to Build Better Applications<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/cheesecakelabs.com\/blog\/ai-classification\/\" target=\"_blank\" rel=\"noreferrer noopener\">How to Use AI Classification to Build More Efficient Apps<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/cheesecakelabs.com\/blog\/artificial-intelligence-beyond-genai-state-of-tech-research\/\" target=\"_blank\" rel=\"noreferrer noopener\">Artificial Intelligence Beyond GenAI: State of Tech Research in 2024<\/a><\/li>\n<\/ul>\n\n\n\n<p>If you have an idea for a project that would benefit from a neural network, <a href=\"https:\/\/cheesecakelabs.com\/contact\/\">send us a message<\/a>, and let\u2019s chat! 
We\u2019d love to help you bring your ideas to life.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/cheesecakelabs.com\/contact\/\"><img decoding=\"async\" width=\"1200\" height=\"544\" src=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/05\/cheesecake-labs-1200x544.png\" alt=\"schedule a call with cheesecake labs experts\" class=\"wp-image-12795\" srcset=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/05\/cheesecake-labs-1200x544.png 1200w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/05\/cheesecake-labs-600x272.png 600w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/05\/cheesecake-labs-768x348.png 768w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/05\/cheesecake-labs-1536x697.png 1536w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/05\/cheesecake-labs-760x345.png 760w, https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/05\/cheesecake-labs.png 1924w\" sizes=\"(max-width: 1200px) 100vw, 1200px\" \/><\/a><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>Neural networks are powerful machine learning algorithms that have transformed countless industries. 
They can power everything from fraud detection and demand forecasting to personalized recommendations and autonomous systems and are a great way to incorporate smarter decision-making into your applications.&nbsp; In this guide, we&#8217;ll dive deep into the fundamentals of neural networks, from the first [&hellip;]<\/p>\n","protected":false},"author":89,"featured_media":12644,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1288,432],"tags":[305,54,1199],"class_list":["post-12542","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence","category-engineering","tag-tag-development","tag-tag-mobile-app-development","tag-software-development"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.1.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Building Neural Networks from Scratch<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Building Neural Networks from Scratch\" \/>\n<meta property=\"og:description\" content=\"Neural networks are powerful machine learning algorithms that have transformed countless industries. 
They can power everything from fraud detection and demand forecasting to personalized recommendations and autonomous systems and are a great way to incorporate smarter decision-making into your applications.&nbsp; In this guide, we&#8217;ll dive deep into the fundamentals of neural networks, from the first [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/\" \/>\n<meta property=\"og:site_name\" content=\"Cheesecake Labs\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cheesecakelabs\" \/>\n<meta property=\"article:published_time\" content=\"2025-04-10T18:12:29+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-05-29T19:05:56+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1921\" \/>\n\t<meta property=\"og:image:height\" content=\"861\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Cheesecake Labs\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@cheesecakelabs\" \/>\n<meta name=\"twitter:site\" content=\"@cheesecakelabs\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"15 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/\"},\"author\":{\"name\":\"Paulo Nascimento\"},\"headline\":\"Building Neural Networks from Scratch\",\"datePublished\":\"2025-04-10T18:12:29+00:00\",\"dateModified\":\"2025-05-29T19:05:56+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/\"},\"wordCount\":2758,\"image\":{\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg\",\"keywords\":[\"development\",\"mobile app development\",\"software development\"],\"articleSection\":[\"Artificial Intelligence\",\"Engineering\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/\",\"url\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/\",\"name\":\"Building Neural Networks from Scratch\",\"isPartOf\":{\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg\",\"datePublished\":\"2025-04-10T18:12:29+00:00\",\"dateModified\":\"2025-05-29T19:05:56+00:00\",\"author\":{\"@type\":\"person\",\"name\":\"Paulo 
Nascimento\"},\"breadcrumb\":{\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#primaryimage\",\"url\":\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg\",\"contentUrl\":\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg\",\"width\":1921,\"height\":861},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/cheesecakelabs.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Building Neural Networks from Scratch\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/#website\",\"url\":\"https:\/\/cheesecakelabs.com\/blog\/\",\"name\":\"Cheesecake Labs\",\"description\":\"Nearshore outsourcing company for Web and Mobile design and engineering services, and staff augmentation for startups and enterprises..\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/cheesecakelabs.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"name\":\"Paulo 
Nascimento\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/cheesecakelabs.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Paulo-Roberto.png\",\"contentUrl\":\"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Paulo-Roberto.png\",\"caption\":\"Paulo Nascimento\"},\"url\":\"https:\/\/cheesecakelabs.com\/blog\/autor\/paulo-nascimento\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Building Neural Networks from Scratch","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/","og_locale":"en_US","og_type":"article","og_title":"Building Neural Networks from Scratch","og_description":"Neural networks are powerful machine learning algorithms that have transformed countless industries. 
They can power everything from fraud detection and demand forecasting to personalized recommendations and autonomous systems and are a great way to incorporate smarter decision-making into your applications.&nbsp; In this guide, we&#8217;ll dive deep into the fundamentals of neural networks, from the first [&hellip;]","og_url":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/","og_site_name":"Cheesecake Labs","article_publisher":"https:\/\/www.facebook.com\/cheesecakelabs","article_published_time":"2025-04-10T18:12:29+00:00","article_modified_time":"2025-05-29T19:05:56+00:00","og_image":[{"width":1921,"height":861,"url":"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg","type":"image\/jpeg"}],"author":"Cheesecake Labs","twitter_card":"summary_large_image","twitter_creator":"@cheesecakelabs","twitter_site":"@cheesecakelabs","twitter_misc":{"Written by":null,"Est. reading time":"15 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#article","isPartOf":{"@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/"},"author":{"name":"Paulo Nascimento"},"headline":"Building Neural Networks from Scratch","datePublished":"2025-04-10T18:12:29+00:00","dateModified":"2025-05-29T19:05:56+00:00","mainEntityOfPage":{"@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/"},"wordCount":2758,"image":{"@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#primaryimage"},"thumbnailUrl":"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg","keywords":["development","mobile app development","software development"],"articleSection":["Artificial 
Intelligence","Engineering"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/","url":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/","name":"Building Neural Networks from Scratch","isPartOf":{"@id":"https:\/\/cheesecakelabs.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#primaryimage"},"image":{"@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#primaryimage"},"thumbnailUrl":"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg","datePublished":"2025-04-10T18:12:29+00:00","dateModified":"2025-05-29T19:05:56+00:00","author":{"@type":"person","name":"Paulo Nascimento"},"breadcrumb":{"@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#primaryimage","url":"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg","contentUrl":"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/cover-9.jpg","width":1921,"height":861},{"@type":"BreadcrumbList","@id":"https:\/\/cheesecakelabs.com\/blog\/building-neural-networks-from-scratch\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/cheesecakelabs.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Building Neural Networks from Scratch"}]},{"@type":"WebSite","@id":"https:\/\/cheesecakelabs.com\/blog\/#website","url":"https:\/\/cheesecakelabs.com\/blog\/","name":"Cheesecake Labs","description":"Nearshore outsourcing company for Web and Mobile design and 
engineering services, and staff augmentation for startups and enterprises..","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/cheesecakelabs.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","name":"Paulo Nascimento","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/cheesecakelabs.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Paulo-Roberto.png","contentUrl":"https:\/\/ckl-website-static.s3.amazonaws.com\/wp-content\/uploads\/2025\/04\/Paulo-Roberto.png","caption":"Paulo Nascimento"},"url":"https:\/\/cheesecakelabs.com\/blog\/autor\/paulo-nascimento\/"}]}},"_links":{"self":[{"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/posts\/12542","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/users\/89"}],"replies":[{"embeddable":true,"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/comments?post=12542"}],"version-history":[{"count":5,"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/posts\/12542\/revisions"}],"predecessor-version":[{"id":12808,"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/posts\/12542\/revisions\/12808"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/media\/12644"}],"wp:attachment":[{"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/media?parent=12542"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/categories?post=12542"},{"taxonomy":"post_tag","embeddable":t
rue,"href":"https:\/\/cheesecakelabs.com\/blog\/wp-json\/wp\/v2\/tags?post=12542"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}