Learn how to create your own Neural Network for Deep Learning in a few steps
Cover: Example of a neuronal unit in a neural network, illustrating its main components. In the example, these are the inputs, the weights, the bias, the summation of inputs and bias, the activation function, and the output. The activation function chosen for this example is a ReLU (Rectified Linear Unit). Image by BrunelloN, August 12, 2021. Source: Wikimedia Commons.
Maurício Pinheiro
Deep neural networks (DNN) have been at the forefront of the artificial intelligence (AI) revolution, transforming the way we approach a wide range of applications, from computer vision and natural language processing to game playing and robotics. With their ability to learn from data and make complex decisions, deep neural networks have unlocked new levels of accuracy and efficiency in many areas of AI technology. From self-driving cars to personalized recommendations and speech recognition, the impact of deep neural networks has been far-reaching and profound, ushering in a new era of intelligent machines that can analyze and understand complex patterns in data with human-like accuracy.
Creating your own neural network (NN) for deep learning is a powerful way to gain knowledge and understanding. The process of building a neural network allows you to actively engage with the concepts and techniques involved, leading to a deeper understanding of the topic. By following our steps and doing it yourself, you will be able to gain hands-on experience and truly learn the ins and outs of neural networks. We will tell you how to train it in a future post.
Step 1) Know what you are doing
1.1 What is a neural network?
A neural network is a highly advanced machine learning algorithm that has been designed to mimic the structure and function of the human brain. It is a complex network of interconnected nodes or neurons that work together to perform complex calculations on input data. These neurons are organized into layers, with each layer responsible for processing a specific aspect of the input data.
The basic idea behind a neural network is to model complex relationships between the input data (converted into real numbers) and the desired output (also real numbers) by adjusting the weights and biases of the network over time, to minimize the difference between the predicted output and the actual output. This process is called training, and it involves feeding large amounts of data into the network and adjusting the weights and biases using a technique called backpropagation.

Neural networks are highly flexible and can be used to solve a wide variety of problems. They are capable of learning from different types of data, such as images, text, and time-series data, and can produce highly accurate predictions and classifications. They are used in a wide range of applications, such as computer vision, natural language processing, speech recognition, robotics, and even music composition.
Neural networks can be thought of as a highly complex non-linear function of many variables, with changeable parameters. They take in input data and produce a numeric output, which can be converted into a label, phrase, decision, or any other form of output, depending on the nature of the problem being solved. The output of a neural network is determined by a combination of the input data and the weights and biases of the network, which are adjusted over time during the training process to produce more accurate results.

1.2 How does it work?
At the core of a neural network lies the artificial neuron or perceptron, which serves as the basic building block for the entire network. The perceptron receives input from one or more sources and performs a series of computations to produce an output.
To be more specific, the perceptron takes the weighted sum z = w1y1 + w2y2 + … + wnyn of its inputs yi (i = 1, …, n, where n is the number of inputs), adds a bias term b to the result, and then passes z through a non-linear activation function f(z), such as the logistic (sigmoid) function or the ReLU (rectified linear unit). The weights and the bias of the perceptron are the parameters that the network learns during training, and they determine the importance of each input in producing the output.
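As a concrete illustration of this computation, here is a minimal sketch of a single perceptron evaluation (the input values, weights, and bias below are arbitrary numbers chosen for the example, not values from any trained network):

```python
import numpy as np

# A toy perceptron with two inputs (all values chosen for illustration)
y = np.array([0.5, -0.2])   # inputs y1, y2
w = np.array([0.8, 0.4])    # weights w1, w2
b = 0.1                     # bias

z = np.dot(w, y) + b        # weighted sum: w1*y1 + w2*y2 + b
f = 1 / (1 + np.exp(-z))    # sigmoid activation

print(z)  # 0.8*0.5 + 0.4*(-0.2) + 0.1 = 0.42
print(f)  # sigmoid(0.42) ≈ 0.6035
```

The two lines computing z and f are the entire "forward pass" of one neuron; everything else in a neural network is repetition and organization of this step.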

Building a network of neurons involves connecting multiple neurons in a layered structure, with each layer of neurons receiving input from the previous layer and passing its output to the next layer. The input layer of the network receives the raw input data, such as an image or text, and passes it through the network’s hidden layers, which perform increasingly complex computations on the data. The output layer produces the final output of the network, such as a classification label or a predicted value (a numeric value that can then be converted into the desired form of output).
The process of training a neural network involves adjusting the weights and biases of the perceptrons to minimize the difference between the predicted output and the actual output. This is typically done using an optimization algorithm that iteratively adjusts the parameters of the network to minimize a loss function.
Neural networks can have multiple hidden layers, and the depth of the network refers to the number of hidden layers it has. Deep neural networks, which are neural networks with many hidden layers, have been shown to be highly effective at solving complex problems in a variety of fields, including computer vision, natural language processing, and speech recognition.
1.3 Know the basic concepts: input, output, weights, biases, and activation function.
Input: The input to a neural network is the data that the network is trying to learn from. It is converted into real numbers before being fed to the NN. In supervised learning, the input is labeled data, while in unsupervised learning, the input is unlabeled data.

Input data can be:
- Structured data refers to data that can be easily organized, stored and processed in relational databases. These data have a pre-defined schema or structure that allows them to be stored in tables with rows and columns. Examples of structured data include numbers, text, date and time, and true/false values. For instance, a customer’s name, address, phone number, and purchase history can be stored in a structured format in a database.
- Unstructured data, on the other hand, do not have a fixed structure or format, making them difficult to organize and store in traditional databases. These data types include free-form text, images, audio, and video. Social media posts, emails, and customer reviews are some examples of unstructured data that require natural language processing techniques to extract useful information from them.
- Time series data is a type of data that is collected over time and is typically indexed by a time stamp or date. Examples of time series data include stock prices, weather data, and website traffic statistics. In a time series dataset, each record represents a measurement or observation taken at a specific point in time. Analyzing time series data involves identifying patterns and trends in the data and making predictions about future values.
- Graph-formatted data refers to data that are organized in a graph structure, where nodes represent entities, and edges represent relationships between them. Examples of graph-formatted data include social networks, knowledge graphs, and chemical compounds. Analyzing graph-formatted data involves identifying patterns and relationships between nodes and edges and making predictions about the properties of the entities and their relationships.
Weight: The weight associated with each connection between neurons in a neural network plays a critical role in determining the overall behavior of the network. In a neural network, each connection represents a specific relationship between two neurons, and the weight associated with that connection determines the strength of that relationship. The weight can be thought of as a scaling factor that determines how much influence the input from one neuron has on the output of the next neuron in the network.

The weights in a neural network are typically adjusted during training, in order to optimize the performance of the network. During training, the network is presented with a set of inputs and corresponding target outputs, and the weights are adjusted in such a way as to minimize the difference between the predicted outputs of the network and the target outputs. This process is often referred to as “learning” or “optimization” and is typically done using a technique called backpropagation.
The weights in a neural network can take on both positive and negative values, with positive weights representing an excitatory influence and negative weights representing an inhibitory influence. The magnitude of the weight determines the strength of the influence, with larger weights having a stronger influence on the output of the neuron.
Overall, the weights in a neural network are a fundamental aspect of the network’s architecture and play a critical role in determining the network’s behavior. Effective management of the weights is essential for achieving good performance in a neural network.
Bias:
In neural networks, bias is a constant term added to the weighted sum of inputs before applying the activation function. The bias allows the network to shift the activation function to the left or right, which can be helpful in achieving the desired output.

In simpler terms, the bias acts as a sort of baseline for the neuron’s output. Without the bias, the activation function would pass through the origin and the output would be restricted to the region where the input values are greater than or equal to zero. By adding a bias term, the activation function is shifted to the left or right, allowing for a wider range of possible outputs.
The bias can be considered as a learnable parameter in the neural network, meaning it can be adjusted during training to achieve better results. For example, if the network is not accurately predicting the desired output, the bias can be adjusted to shift the activation function and improve the predictions.
Overall, the bias is an important component in the design and optimization of neural networks, allowing for greater flexibility and accuracy in predicting desired outcomes.
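The shifting effect of the bias is easy to see numerically. In this minimal sketch, the bias values +2 and −2 are arbitrary illustrative choices:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = 0.0  # with no bias, the sigmoid of input 0 is exactly 0.5
print(sigmoid(x))          # 0.5
print(sigmoid(x + 2.0))    # a bias of +2 shifts the curve: ≈ 0.881
print(sigmoid(x - 2.0))    # a bias of -2 shifts it the other way: ≈ 0.119
```

The same input produces very different activations depending on the bias, which is exactly the "baseline" role described above.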
Weighted sum:
In a neural network, the weighted sum is a fundamental operation that takes place in each artificial neuron or perceptron. It involves multiplying the inputs to the neuron by their respective weights and summing the results along with a bias term. The weighted sum is then passed through an activation function to produce the output of the neuron.

The weights and bias are learned during the training phase of the neural network, and they play a crucial role in determining the output of the network. The weights can be thought of as the “importance” or “relevance” of each input to the output of the neuron, while the bias determines the “threshold” or “bias” of the neuron’s response.
For example, consider a neural network that is trained to classify images of cats and dogs. Each pixel in the input image can be thought of as an input to the network, and each neuron in the network represents a feature or aspect of the image that is relevant to the classification task. The weights of each neuron determine how much it contributes to the overall classification decision, while the bias term determines the overall threshold for classifying an image as a cat or a dog.
The weighted sum operation is a key building block of a neural network, and it allows the network to learn complex patterns and relationships in the input data. By adjusting the weights and biases of the neurons during training, the network can adapt to different types of input data and improve its performance on the task at hand.
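The weighted sum is simply a dot product plus a bias. The sketch below (with arbitrary illustrative values) shows that an explicit loop and NumPy's vectorized `dot` give the same result — which is why linear algebra libraries are the natural tool for neural networks:

```python
import numpy as np

inputs  = np.array([0.2, -1.0, 0.5])   # illustrative inputs
weights = np.array([0.7, 0.1, -0.3])   # illustrative weights
bias = 0.05

# Explicit loop version of the weighted sum
z_loop = bias
for x, w in zip(inputs, weights):
    z_loop += w * x

# Equivalent vectorized form
z_dot = np.dot(weights, inputs) + bias

print(z_loop, z_dot)  # both equal 0.14 - 0.10 - 0.15 + 0.05 = -0.06
```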
Activation function: The activation function plays a crucial role in determining the output of a neuron in a neural network. It is a mathematical function that takes in the weighted sum of the inputs to the neuron and applies a non-linear transformation to it, producing the output of the neuron.

The choice of activation function can greatly impact the performance of a neural network. A good activation function should be non-linear, as linear functions cannot capture complex patterns in the data. It should also be differentiable, as backpropagation, the algorithm used to train neural networks, requires the computation of derivatives.
There are many different types of activation functions that can be used in neural networks. Some of the most common ones include the sigmoid function, the ReLU function, the hyperbolic tangent function, and the softmax function. Each of these functions has its own strengths and weaknesses, and the choice of activation function depends on the specific problem being solved and the characteristics of the data.

As an example, the following Python code plots the logistic (sigmoid) function:
import matplotlib.pyplot as plt
import numpy as np

def logistic_function(x):
    return 1 / (1 + np.exp(-x))

x = np.linspace(-10, 10, 100)  # generate 100 points between -10 and 10
y = logistic_function(x)       # calculate the y values

plt.plot(x, y)
plt.title('Logistic Function')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.show()
The sigmoid function is a popular choice of activation function in neural networks. It is a non-linear function that maps any real-valued number to a value between 0 and 1, which makes it particularly useful for binary classification problems where the output must be either 0 or 1.
One of the main advantages of the sigmoid function is that it is differentiable, which means that it can be used with gradient-based optimization algorithms such as backpropagation to train the network. The derivative of the sigmoid function is also easy to calculate, which makes it computationally efficient.
Another advantage of the sigmoid function is that it is monotonically increasing, which means that the output of the function increases as the input increases. This property is important for learning, as it allows the network to learn and make adjustments to its weights and biases in a consistent and predictable way.
However, the sigmoid function does have some limitations. One of the main limitations is that it can saturate, which means that for very large or very small values of the input, the derivative of the function becomes very small. This can lead to slow learning or even vanishing gradients, where the gradients become so small that the network is unable to make meaningful updates to its weights and biases.
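The saturation effect is easy to verify numerically. This short sketch uses the standard identity for the sigmoid derivative, σ′(z) = σ(z)(1 − σ(z)):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1 - s)  # standard identity for the sigmoid's derivative

for z in [0.0, 2.0, 10.0]:
    print(z, sigmoid_derivative(z))
# The derivative peaks at 0.25 for z = 0, but at z = 10 it has
# collapsed to roughly 4.5e-5 -- the "vanishing gradient" regime.
```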
Despite these limitations, the sigmoid function remains a popular choice of activation function in many neural networks, particularly for binary classification problems where the output must be a value between 0 and 1.
Here are some other examples of activation functions that are commonly used in neural networks:
- Hyperbolic tangent (tanh): The tanh function is similar to the sigmoid function, but its output ranges from -1 to 1 instead of 0 to 1. It is also symmetric around the origin, which can be useful in some applications.
- Rectified Linear Unit (ReLU): The ReLU function is defined as f(x) = max(0, x), which means that it outputs 0 for all negative inputs and the input value for all positive inputs. ReLU is currently one of the most popular activation functions because of its simplicity and effectiveness in deep learning models.
- Leaky ReLU: The Leaky ReLU function is a modified version of the ReLU function that avoids the “dying ReLU” problem by allowing a small negative output for negative inputs. The leaky ReLU function is defined as f(x) = max(ax, x), where a is a small positive number.
- Exponential Linear Unit (ELU): The ELU function is similar to the ReLU function, but it outputs negative values for negative inputs, which can help improve the performance of deep learning models. The ELU function is defined as f(x) = x if x > 0, and f(x) = alpha*(exp(x) - 1) if x <= 0, where alpha is a small positive constant.
- Softmax: The softmax function is often used as the activation function for the output layer of a neural network that is used for classification tasks. The softmax function maps the input values to a probability distribution over the output classes, ensuring that the sum of the probabilities is equal to 1.
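The activation functions listed above can each be written in a few lines of NumPy. This is a minimal sketch following the definitions in the text; the parameter values a = 0.01 and alpha = 1.0 are common illustrative defaults, not prescribed values:

```python
import numpy as np

def tanh(x):
    return np.tanh(x)  # output in (-1, 1), symmetric around the origin

def relu(x):
    return np.maximum(0.0, x)  # f(x) = max(0, x)

def leaky_relu(x, a=0.01):
    return np.maximum(a * x, x)  # small slope a for negative inputs

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()         # probabilities that sum to 1

print(relu(-2.0), relu(3.0))                     # 0.0 3.0
print(leaky_relu(-2.0))                          # -0.02
print(softmax(np.array([1.0, 2.0, 3.0])).sum())  # 1.0
```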
Output: The output of a neural network is the result produced by the network after processing the input.

Here are some examples of output data (after converting the numeric result of f(z)):
AI systems have a wide range of applications, from predicting future events to performing tasks and making decisions based on data. Some examples of AI applications include:
- Predictions: AI systems can analyze large datasets and use machine learning algorithms to make predictions about future events or outcomes. For instance, an AI system can predict the likelihood of a customer churning based on their usage patterns and behavior.
- Recommendations: AI systems can provide personalized recommendations to users based on their past behavior, preferences, and other factors. For example, an e-commerce platform can recommend products to a user based on their purchase history and browsing behavior.
- Classifications: AI systems can classify data into different categories based on certain features or characteristics. For instance, a machine learning algorithm can classify emails as spam or not spam based on the content of the email and the user’s past behavior.
- Decisions: AI systems can make decisions based on data and pre-defined rules. For example, an AI system can be used to decide whether to approve or reject a credit card application based on the applicant’s credit history and other factors.
- Actions: AI systems can perform tasks or take actions based on data and pre-defined rules. For example, an AI system can be used to control a manufacturing plant and optimize production based on real-time sensor data and other inputs.
1.4 The tools: linear algebra and Python
Linear algebra and programming languages are fundamental tools for working with neural networks. Linear algebra provides a way to represent and manipulate the weights, biases, and input/output data of the network. In particular, matrix multiplication is a key operation in neural networks, as it allows for efficient computation of the weighted sum and activation function. Python is a widely-used programming language for implementing neural networks due to its simplicity, flexibility, and rich ecosystem of libraries.
Python provides a wide range of libraries and frameworks that facilitate the implementation of neural networks. For example, NumPy is a Python library for scientific computing that provides support for array operations, linear algebra, and other numerical calculations. It is particularly useful for working with the matrices and vectors that are commonly used in neural networks. TensorFlow is another popular Python library for building and training neural networks, which provides a high-level interface for defining and executing computational graphs.
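To make the matrix-multiplication point concrete, here is a minimal sketch of one layer evaluated with a single matrix–vector product. The layer sizes and the use of NumPy's `default_rng` are illustrative choices, not taken from the article's later code:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator so the run is repeatable

n_in, n_out = 3, 4
W = rng.uniform(-1, 1, size=(n_out, n_in))  # one row of weights per neuron
b = rng.uniform(-1, 1, size=n_out)          # one bias per neuron
y_in = rng.uniform(-1, 1, size=n_in)

# One matrix-vector product evaluates the weighted sums of all 4 neurons
z = W @ y_in + b
y_out = 1 / (1 + np.exp(-z))  # sigmoid, applied element-wise

print(y_out.shape)  # (4,)
```

One `@` (matrix multiplication) replaces what would otherwise be a double loop over neurons and inputs, which is why linear algebra is listed here as a fundamental tool.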
To create a simple neural network using Python, we will proceed in a step-by-step manner. We will begin by defining the input and output layers of the network, followed by the weights and biases of the hidden layer. We will then define the activation function and compute the weighted sum of the inputs. Finally, we will apply the activation function to the weighted sum to obtain the output of the network. By following these steps, we will be able to build a simple neural network using Python and linear algebra.
Here’s a step-by-step procedure to install Python on a Windows machine:
- Go to the official Python website at https://www.python.org/downloads/
- Click on the “Download Python” button that corresponds to the version of Python you want to install.
- Choose the appropriate installer for your operating system (Windows in this case).
- Once the installer is downloaded, double-click on it to begin the installation process.
- Check the “Add Python to PATH” checkbox during installation. This will ensure that the Python interpreter can be accessed from any directory in the command prompt or PowerShell.
- Follow the prompts in the installer until the installation is complete.
- To verify that Python is installed correctly, open the command prompt or PowerShell and type “python” (without quotes) and press Enter. You should see the Python interpreter start up and display the version number.
That’s it! You now have Python installed on your Windows machine and can start writing and running Python code.
Step 2) One single neuron: the non-linear perceptron
We will begin by implementing a non-linear perceptron, which is an artificial neuron with two inputs and one output. This has been previously discussed in our paper on a binary classifier neuron. However, in this case, we will be using a sigmoidal activation function to implement the non-linear neuron.

We will be using code generated by ChatGPT for this task. I will not post the prompts, since it took several iterations and bug corrections to reach the optimal code. Additionally, I made some edits to the code based on external references.
The code (explained by ChatGPT)
from numpy import * #import numpy library
#initialization
y1_inp=random.uniform(low=-1,high=+1,size=1) #random first input
y2_inp=random.uniform(low=-1,high=+1,size=1) #random second input
w1=random.uniform(low=-1,high=+1) #random weight of the first input
w2=random.uniform(low=-1,high=+1) #random weight of the second input
b=random.uniform(low=-1,high=+1,size=1) #random bias
z=w1*y1_inp+w2*y2_inp+b #weighted sum
fz=1/(1+exp(-z)) #output = activation function
print("number of inputs =",2)
print("number of outputs =",1)
print("number of layers =",2,"(one input layer + one output layer)")
print("random-generated weights w1=",w1,"w2=",w2)
print("random-generated bias b=",b)
print("weighted sum z=",z)
print("activation_function = output =",fz)
This code initializes a non-linear perceptron, which is an artificial neuron with two inputs and one output.
- The numpy library is imported in the first line of the code.
- The next few lines of code initialize the inputs, weights, and bias with random values. The random.uniform function from numpy is used to generate random values between -1 and +1 for the inputs and bias, as well as for the weights of the two inputs.
- The weighted sum z is then calculated by multiplying the inputs by their corresponding weights and adding the bias.
- Next, the activation function is applied to z to produce the output of the perceptron. In this case, the sigmoid activation function is used, which is defined as f(z) = 1 / (1 + exp(-z)).
- The code then prints out some information about the perceptron, including the number of inputs and outputs, the number of layers (which is 2 for this simple perceptron), the randomly generated weights and bias, the weighted sum z, and the output of the perceptron after applying the activation function.
This is our first step. Our artificial neuron is working! The results of a single run in the Python IDE are:
number of inputs = 2
number of outputs = 1
number of layers = 2 (one input layer + one output layer)
random-generated weights w1= -0.1516818220375491 w2= 0.4826939463493012
random-generated bias b= [-0.41045979]
weighted sum z= [-0.7173281]
activation_function = output = [0.32798162]
Once we have a single neuron working, it's time to build the neural network.
Step 3) Connecting Neurons: Building a two-layer input/output Neural Network
Now we are ready to connect neurons and build a two-layer neural network. The first layer of the network consists of two neurons with one input each, and the output layer consists of a single neuron that produces the output.

The network receives two variables as input, one for each neuron in the first layer.
from numpy import *
import matplotlib.pyplot as plt

def apply_layer(y_in, w, b):
    z = dot(w, y_in) + b  # weighted sum plus bias
    return 1/(1 + exp(-z))  # sigmoid activation

N0 = 2  # number of inputs
N1 = 1  # number of output neurons

w1 = random.uniform(low=-10, high=+10, size=(N1, N0))
b1 = random.uniform(low=-1, high=+1, size=N1)

def apply_net(y_in):
    global w1, b1
    y1 = apply_layer(y_in, w1, b1)
    return y1

M = 50  # resolution of the input grid
y_out = zeros((M, M))
for j1 in range(M):
    for j2 in range(M):
        value0 = float(j1)/M - 0.5
        value1 = float(j2)/M - 0.5
        y_out[j1, j2] = apply_net([value0, value1])[0]

plt.imshow(y_out, origin='lower', extent=(-0.5, 0.5, -0.5, 0.5))
plt.colorbar()
plt.show()
This code defines a simple neural network with two single-input neurons and one output neuron, and no hidden layer. The network takes in two input variables and applies a weight matrix and bias vector to calculate the output value using the sigmoid activation function.
The weight matrix and bias vector are initialized with random values, and the function apply_layer is used to apply them to the input variables and compute the output value using the sigmoid activation function.
The apply_net function calls apply_layer to calculate the output value for a given input, and returns the output as a scalar value.
Finally, a 2D color map is generated by computing the output of the neural network for a grid of input values, and plotting the resulting values using the imshow function from Matplotlib. The x and y axis of the plot represent the input variables, and the color of each pixel in the plot represents the output value of the neural network (the activation function value) for the corresponding input values. The color map is shown using the plt.show() function.

The code implements a neural network with two layers. The first layer has two neurons with one input variable each, while the second layer has a single neuron. The weights and biases for the network are randomly generated within a given range. Instead of printing the values of the biases, weights, and activation functions, the code displays the output activation value of the second layer as a colormap, which varies as a function of the two input variables.
Step 4) Scaling up
Here, we can see an example of scaling up a neural network by reusing the pre-defined functions. This code builds a neural network with seven layers: an input layer with two values, five hidden layers with 50 neurons each, and an output layer with a single neuron that takes 50 inputs.
This increase in the number of layers and neurons allows the neural network to capture more complex patterns in the data and make more accurate predictions. However, it also requires more computational resources and data to train the network effectively.
Overall, the ability to scale up a neural network is a powerful feature that allows us to tackle more complex problems and achieve better results.
The code is:
from numpy import *
import matplotlib.pyplot as plt

def apply_layer(y_in, w, b):
    z = dot(w, y_in) + b  # weighted sum plus bias
    return 1/(1 + exp(-z))  # sigmoid activation

N0 = 2   # number of inputs
N1 = 50  # neurons in hidden layer 1
N2 = 50  # neurons in hidden layer 2
N3 = 50  # neurons in hidden layer 3
N4 = 50  # neurons in hidden layer 4
N5 = 50  # neurons in hidden layer 5
N6 = 1   # output neuron

w1 = random.uniform(low=-10, high=+10, size=(N1, N0))
b1 = random.uniform(low=-1, high=+1, size=N1)
w2 = random.uniform(low=-10, high=+10, size=(N2, N1))
b2 = random.uniform(low=-1, high=+1, size=N2)
w3 = random.uniform(low=-10, high=+10, size=(N3, N2))
b3 = random.uniform(low=-1, high=+1, size=N3)
w4 = random.uniform(low=-10, high=+10, size=(N4, N3))
b4 = random.uniform(low=-1, high=+1, size=N4)
w5 = random.uniform(low=-10, high=+10, size=(N5, N4))
b5 = random.uniform(low=-1, high=+1, size=N5)
w6 = random.uniform(low=-10, high=+10, size=(N6, N5))
b6 = random.uniform(low=-1, high=+1, size=N6)

def apply_net(y_in):
    global w1, b1, w2, b2, w3, b3, w4, b4, w5, b5, w6, b6
    y1 = apply_layer(y_in, w1, b1)
    y2 = apply_layer(y1, w2, b2)
    y3 = apply_layer(y2, w3, b3)
    y4 = apply_layer(y3, w4, b4)
    y5 = apply_layer(y4, w5, b5)
    y6 = apply_layer(y5, w6, b6)
    return y6

M = 50  # resolution of the input grid
y_out = zeros((M, M))
for j1 in range(M):
    for j2 in range(M):
        value0 = float(j1)/M - 0.5
        value1 = float(j2)/M - 0.5
        y_out[j1, j2] = apply_net([value0, value1])[0]

plt.imshow(y_out, origin='lower', extent=(-0.5, 0.5, -0.5, 0.5))
plt.colorbar()
plt.show()
Since we have random parameters, the NN delivers a different result for each run. One example of output is:

By modifying the code (you can improve it a lot since I am a python newbie), we can now easily scale up the neural network to accommodate any number of inputs and layers with different numbers of neurons. However, before we can effectively use the neural network for our desired task, we need to train it to learn the appropriate weights and biases for the given inputs and desired outputs. This process typically involves using a large set of training data to iteratively adjust the weights and biases of the network in order to minimize the error between the predicted outputs and the actual outputs.
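One natural modification is to store the layers in lists and apply them in a loop, so the network can have any number of layers without repeating code. This is a sketch only; the helper names build_net and apply_net below are illustrative choices, and the weight ranges mirror those used above:

```python
import numpy as np

def apply_layer(y_in, w, b):
    z = np.dot(w, y_in) + b
    return 1 / (1 + np.exp(-z))  # sigmoid activation

def build_net(layer_sizes):
    """Random weights and biases for consecutive layer sizes, e.g. [2, 50, 50, 1]."""
    weights, biases = [], []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights.append(np.random.uniform(-10, 10, size=(n_out, n_in)))
        biases.append(np.random.uniform(-1, 1, size=n_out))
    return weights, biases

def apply_net(y_in, weights, biases):
    y = y_in
    for w, b in zip(weights, biases):  # one pass per layer, in order
        y = apply_layer(y, w, b)
    return y

# Same architecture as the seven-layer example above
weights, biases = build_net([2, 50, 50, 50, 50, 50, 1])
out = apply_net(np.array([0.1, -0.3]), weights, biases)
print(out.shape)  # (1,)
```

Changing the architecture is now a one-line edit to the `layer_sizes` list rather than adding new `wN`/`bN` variables by hand.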
Training a neural network can be a complex and time-consuming process, and requires careful consideration of factors such as the choice of optimization algorithm, the size of the training dataset, and the architecture of the network itself. However, with the right approach and tools, neural networks can be a powerful tool for a wide range of applications, including image recognition, natural language processing, and even game-playing algorithms. We will write about training a NN like those above in the future.
#AI #ArtificialIntelligence #DIY #DeepNeuralNetworks #AIRevolution #NeuralNetworkTutorial #HandsOnExperience #AItechnology #MachineLearning #DeepLearning #TrainYourOwnNeuralNetwork #DataScience #PatternRecognition #Python #Tutorial
Reference:
https://www.fau.tv/course/id/778

Copyright 2026 AI-Talks.org
