Deep learning has become one of the most transformative technologies in the modern world, enabling advancements in computer vision, natural language processing, speech recognition, healthcare analytics, financial modeling, and countless other fields. At the core of deep learning lies the concept of neural networks: mathematical models inspired by the structure and function of the human brain. While neural networks can become extremely complex, every beginner must start with the basics, and one of the simplest ways to build a neural network in Python is through the Sequential Model in Keras.
Keras, a high-level deep-learning API running on top of TensorFlow, is known for its simplicity and user-friendly syntax. It allows developers to build robust neural networks with only a few lines of code. The Sequential Model is the simplest of the Keras model-building APIs and serves as the easiest gateway for anyone learning deep learning for the first time.
This article will explore the simple Keras Sequential Model shown below and expand it into a full deep dive so that you can understand exactly what the code does, how it works, why it is structured this way, when it is used, and how it relates to broader deep-learning concepts.
Here is the simple model we will explore:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential([
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])
Even though this model consists of only a few lines, it contains many important components and ideas essential to understanding neural networks. Let’s break everything down step by step.
1. Understanding the Sequential Model in Keras
The Sequential Model is one of the simplest ways to build a neural network. It allows you to create layers sequentially, meaning one layer follows another in a straight line. There are no branches, no complex connections, and no parallel pathways. You simply stack layers in the order you want them to execute.
This makes the Sequential Model ideal for:
- Beginners learning neural networks
- Simple classification or regression tasks
- Building quick prototypes
- Standard feed-forward neural networks
- Straightforward CNN or RNN models with a single pathway
The strength of the Sequential Model is its simplicity. When the problem does not require multiple inputs, multiple outputs, skip connections, residual paths, or complex architectures, Sequential is often the best choice.
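For readers who prefer adding one layer at a time, the same linear stack can be built with the add() method. Here is a minimal sketch, assuming TensorFlow 2.x is installed:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Start with an empty container, then stack layers in execution order.
model = Sequential()
model.add(Dense(32, activation='relu'))    # hidden layer
model.add(Dense(1, activation='sigmoid'))  # output layer

Both styles produce exactly the same model; the list form is more compact, while add() makes it easier to build the stack step by step.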
2. Breakdown of the Code: Importing the Required Components
The first two lines of the code import the two classes from Keras that this network is built from.
2.1 Importing Sequential
from tensorflow.keras.models import Sequential
This line imports the Sequential class, which acts as the container for the neural network. Think of it as an empty shell that you can fill with layers. Once the layers are added, the Sequential object defines the architecture of your network.
The Sequential class is responsible for:
- Holding the layers
- Maintaining their order
- Managing forward and backward propagation
- Compiling the model
- Handling training and evaluation
The Sequential model is a natural fit for simple neural networks because it assumes a strictly linear stacking of layers.
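Because the container simply stores layers in order, you can inspect that order directly. Continuing the sketch above:

# The layers are held in a plain list, indexed in execution order.
print(len(model.layers))                      # 2
print([layer.name for layer in model.layers])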
2.2 Importing Dense Layers
from tensorflow.keras.layers import Dense
The Dense layer is one of the most fundamental building blocks of neural networks. It represents a fully connected layer, meaning every neuron in the Dense layer is connected to every neuron in the previous layer.
Dense layers are commonly used in:
- Classification problems
- Regression problems
- Fully connected final layers of CNN and RNN models
- Multi-layer perceptrons (MLPs)
- Tabular data predictions
The Dense layer is what gives neural networks their computational power and flexibility by allowing them to learn complex patterns through weighted connections.
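To see what "fully connected" means concretely, the sketch below builds a standalone Dense layer on dummy data and inspects its weights; the sizes (8 inputs, 4 neurons) are arbitrary illustrations:

import tensorflow as tf
from tensorflow.keras.layers import Dense

layer = Dense(4, activation='relu')
x = tf.ones((1, 8))                  # one sample with 8 features
y = layer(x)                         # calling the layer on data creates its weights
kernel, bias = layer.get_weights()
print(kernel.shape, bias.shape)      # (8, 4) and (4,): every input feeds every neuron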
3. Creating the Sequential Model
This line initializes the model:
model = Sequential([
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])
The Sequential constructor receives a list of layers, and each layer is added in the order it appears. Let’s now understand each layer in detail.
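One optional refinement, sketched here with a hypothetical input size of 10 features: declaring an Input layer lets Keras build the network immediately, so model.summary() can report shapes and parameter counts.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

model = Sequential([
    Input(shape=(10,)),               # hypothetical: 10 input features
    Dense(32, activation='relu'),     # 10*32 weights + 32 biases = 352 parameters
    Dense(1, activation='sigmoid')    # 32*1 weights + 1 bias = 33 parameters
])
model.summary()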
4. First Dense Layer: The Hidden Layer with 32 Neurons
Dense(32, activation='relu')
This line creates a fully connected (Dense) layer with 32 neurons and the ReLU activation function. This is typically a hidden layer, meaning it sits between the input and output layers and is responsible for learning patterns in the data.
4.1 What Are Neurons?
In neural networks, a neuron is a computational unit that takes inputs, applies weights, adds a bias, and passes the result through an activation function. Neurons help the network learn:
- correlations
- relationships
- nonlinear patterns
- complex decision boundaries
The number of neurons determines the learning capacity of the layer. More neurons mean more representational power, but also more computational cost and a higher risk of overfitting.
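As a concrete illustration, here is a single neuron implemented in plain NumPy; the inputs, weights, and bias are made-up values:

import numpy as np

x = np.array([0.5, -1.2, 3.0])   # hypothetical inputs
w = np.array([0.4, 0.1, -0.6])   # hypothetical weights
b = 0.2                          # bias
z = np.dot(w, x) + b             # weighted sum plus bias: z = -1.52
output = max(0.0, z)             # ReLU activation: output = 0.0
print(z, output)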
4.2 Why 32 Neurons?
Choosing 32 neurons is a common starting point. It is neither too small nor too large and provides enough capacity for simple tasks. In practice, the number of neurons should be selected based on experimentation, dataset size, and complexity.
4.3 Understanding the ReLU Activation Function
The ReLU (Rectified Linear Unit) activation function is the most widely used activation function in deep learning. It is defined as:
f(x) = max(0, x)
This means:
- If x > 0, keep it.
- If x <= 0, output 0.
ReLU helps mitigate the vanishing gradient problem and allows networks to converge faster during training. It introduces non-linearity, which is necessary because without activation functions a stack of layers collapses into a single linear transformation and cannot learn complex patterns.
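In code, ReLU is a one-liner. A minimal NumPy sketch:

import numpy as np

def relu(x):
    return np.maximum(0, x)      # element-wise max(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]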
5. Second Dense Layer: The Output Layer with Sigmoid Activation
Dense(1, activation='sigmoid')
This layer is the output layer of the model. It contains only one neuron, which means the model is designed to output a single value.
5.1 Why Only One Neuron?
A single output neuron is typically used for binary classification, where the model predicts either:
- 0 or 1
- True or False
- Yes or No
- Positive or Negative
The output neuron generates a number between 0 and 1, representing a probability.
5.2 Why Use Sigmoid Activation?
The sigmoid activation function is defined as:
σ(x) = 1 / (1 + e^-x)
It outputs values between 0 and 1, making it ideal for:
- binary classification
- probability estimation
- logistic regression
If the output is near:
- 0 → the model predicts class 0
- 1 → the model predicts class 1
The sigmoid function suits binary classification because it maps any real-valued input smoothly onto the interval between 0 and 1, so its output can be read directly as a probability.
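A minimal NumPy sketch shows how sigmoid outputs read as probabilities:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

print(sigmoid(0.0))    # 0.5   -> maximally uncertain
print(sigmoid(4.0))    # ~0.98 -> confident class 1
print(sigmoid(-4.0))   # ~0.02 -> confident class 0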
6. How This Model Works Internally
Even though the model looks very simple, a lot happens behind the scenes.
6.1 Forward Propagation
- Input data enters the first Dense layer.
- The layer multiplies inputs by weights and adds biases.
- ReLU activation is applied.
- The transformed data flows to the next layer.
- The final Dense layer produces an output.
- Sigmoid activation converts it into a probability.
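The entire forward pass of this two-layer model fits in a few lines of NumPy. The random weights below are stand-ins for values that training would learn, and the input size of 10 is an arbitrary assumption:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 10))                        # one sample, 10 features

W1 = rng.normal(size=(10, 32)); b1 = np.zeros(32)   # hidden-layer parameters
W2 = rng.normal(size=(32, 1));  b2 = np.zeros(1)    # output-layer parameters

h = np.maximum(0, x @ W1 + b1)                      # Dense(32) + ReLU
p = 1 / (1 + np.exp(-(h @ W2 + b2)))                # Dense(1) + sigmoid
print(p)                                            # a probability between 0 and 1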
6.2 Backpropagation
During training:
- The model calculates error using a loss function.
- It propagates the error backward through layers.
- Weights and biases are updated to reduce error.
The model gradually “learns” by adjusting its internal parameters.
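In Keras, all of this happens inside compile() and fit(). The sketch below trains the model on random stand-in data (100 samples with 10 features, invented purely so the code runs); for real binary classification, binary cross-entropy loss and the Adam optimizer are the usual starting choices:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',               # gradient-based weight updates
              loss='binary_crossentropy',     # standard loss for binary tasks
              metrics=['accuracy'])

X = np.random.rand(100, 10)                   # hypothetical features
y = np.random.randint(0, 2, size=(100,))      # hypothetical 0/1 labels

model.fit(X, y, epochs=5, batch_size=16)      # backpropagation runs inside fit()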
7. What Kind of Problems Can This Model Solve?
This model is ideal for binary classification tasks such as:
- Spam vs non-spam email
- Positive vs negative sentiment
- Fraudulent vs legitimate transaction
- Healthy vs diseased patient
- Churn vs non-churn user
It can also be adapted for:
- Simple regression tasks
- Feature extraction
- Prototype testing
- Educational purposes
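Whatever the task, the usage pattern is the same. Continuing the training sketch from section 6.2, predictions come out as probabilities that you threshold at 0.5 to get class labels:

probs = model.predict(X[:5])             # values between 0 and 1
labels = (probs > 0.5).astype(int)       # threshold at 0.5 to pick a class
print(probs.ravel(), labels.ravel())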
8. Strengths of This Simple Keras Sequential Model
Despite being small, this model has many advantages:
8.1 Extremely Easy to Build
Only a few lines of code are required.
8.2 Beginner-Friendly
Great for students and beginners learning deep learning basics.
8.3 Highly Efficient for Small Tasks
Excellent for datasets that are not too complex.
8.4 Interpretable Architecture
No complex graph, no branching, no confusion.
8.5 Strong Foundation for Scaling Up
Once they have mastered this model, learners can move on to more complex architectures.
9. Limitations of This Basic Model
Because it is simple, it has limitations:
9.1 Not Suitable for Multi-Class Classification
A single sigmoid output can only separate two classes; multi-class problems call for a softmax output layer with one neuron per class, as in the sketch below.
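For contrast, a multi-class version would swap the final layer for a softmax with one neuron per class; the five classes here are an arbitrary illustration:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

multiclass_model = Sequential([
    Dense(32, activation='relu'),
    Dense(5, activation='softmax')   # one neuron per class; outputs sum to 1
])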
9.2 Cannot Handle Complex Patterns
Only two layers might not be enough.
9.3 Not Ideal for Images or Text
These require CNNs or RNNs.
9.4 Limited Depth
Deep architectures need more layers.
9.5 No Flexibility for Advanced Designs
Cannot build architectures like:
- ResNet
- U-Net
- Siamese networks
- Transformers
10. Expanding the Model for Better Performance
To make the model better, you can add:
10.1 More Hidden Layers
Increase depth for more learning capacity.
10.2 Dropout Layers
To reduce overfitting.
10.3 Batch Normalization
To stabilize training.
10.4 Different Activation Functions
Experiment with tanh, leaky ReLU, or swish.
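Putting these ideas together, one possible expansion is sketched below. The layer sizes and the 0.3 dropout rate are arbitrary starting points, not tuned recommendations:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization

model = Sequential([
    Dense(64, activation='relu'),
    BatchNormalization(),    # normalizes activations to stabilize training
    Dropout(0.3),            # randomly zeroes 30% of units to curb overfitting
    Dense(32, activation='relu'),
    Dropout(0.3),
    Dense(1, activation='sigmoid')
])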
11. Why Keras Makes Deep Learning Easy
Keras is designed to be:
- Human-friendly
- Readable
- Minimalistic
- Modular
These qualities are exactly what the Sequential Model demonstrates: a complete, working neural network expressed in just a few readable lines of Python.