This article accompanies Jeremy Howard's fast.ai Practical Deep Learning for Coders 2022 · Lesson 3, specifically the section from 1:04:30 to 1:20:47. It is part of the curriculum materials produced by Dr Neal for one-to-one AI Learners sessions.
▶ Watch Lesson 3 on YouTube — jump to 1:04:30
A Word Before You Start: This Is Chapter 4's Prelude
Chapter 4 of Deep Learning for Coders with fastai and PyTorch is where the hairy stuff really begins. It covers MNIST from scratch — building a complete image classifier using nothing but raw PyTorch tensors, gradient descent, and the same mathematical machinery that runs every serious deep learning model in the world. It is not light reading.
The exercise in this post is Jeremy Howard's recommended prelude to that chapter. If you can build a working neural network in a spreadsheet — and actually understand every cell — you will find Chapter 4 considerably less intimidating. The algebra is identical. The only difference is that the spreadsheet makes every intermediate value visible and touchable, whereas Python compresses it into a few lines of tensor arithmetic. If you have not yet worked through the full Lesson 3 material — gradient descent, ReLU, and how neural networks approximate any function — read that companion post first before attempting the steps below.
1. Excel (Windows/Mac) — follow the steps below exactly as Jeremy does in the video
2. OpenOffice / LibreOffice Calc (free, open source, any OS) — identical logic, equivalent commands
3. Python / NumPy (ambitious) — if you want to translate every step to code before tackling Chapter 4
What OpenOffice Is (and Where to Get It)
OpenOffice and its more actively maintained fork LibreOffice are free, open-source office productivity suites that include a full-featured spreadsheet application called Calc. Calc reads and writes .xlsx files, supports all the functions used in this exercise (SUMPRODUCT, MAX, MMULT, TRANSPOSE, and a built-in Solver), and runs on Windows, macOS, and Linux.
→ LibreOffice (recommended — most actively maintained)
→ Apache OpenOffice
Both are completely free, contain no ads, and require no account or licence key. LibreOffice is the version most developers recommend today. For the purposes of this exercise, either works identically — all the commands below apply to both.
The Big Picture: What You Are Building
You are building a neural network that predicts Titanic survival. The architecture has two stages:
Stage 1 — Linear Regression Model: Multiply each passenger feature by a learnable coefficient, sum the results, and produce a single prediction number. This is a single neuron.
Stage 2 — Neural Network: Take two of those linear neurons, apply a ReLU (replace negatives with zero) to each, and add them together. This is a one-hidden-layer neural network — the same fundamental structure as modern deep learning, just tiny.
Both stages are trained by gradient descent: a numerical optimisation that adjusts the coefficients to minimise the error between your predictions and the known outcomes. In the spreadsheet you will run this using the built-in Solver tool. In Python you would use PyTorch's autograd. The mathematics is the same.
Part 1: Setting Up Your Data
Step 1 — Download the Titanic dataset
The fast.ai course uses the Titanic dataset from Kaggle. You can download train.csv from kaggle.com/c/titanic/data (free account required) or use Jeremy's course spreadsheet directly from the course repo:
▶ Download titanic-backprop.xlsx from GitHub
If using your own CSV, open it in Excel or LibreOffice Calc via File → Open.
Step 2 — Prepare the feature columns
Jeremy uses a small, cleaned subset. Set up your spreadsheet so that Row 1 is a header row and data begins in Row 2. You need the following columns (add or compute them as needed):
| Column | Label | Notes |
|---|---|---|
| A | survived |
0 = died, 1 = survived (this is your target — what you're predicting) |
| B | pclass |
Ticket class: 1, 2, or 3 |
| C | sex |
Convert to numeric: male = 0, female = 1 |
| D | age |
Numeric. Fill missing values with the column median |
| E | sibsp |
Number of siblings/spouses aboard |
| F | parch |
Number of parents/children aboard |
| G | fare |
Ticket fare (numeric) |
| H | embarked |
Convert to numeric: S = 0, C = 1, Q = 2 |
Converting Sex to numeric in Excel:
OpenOffice/LibreOffice Calc — identical formula:
Copy this formula down the entire column. Do the same for Embarked using nested IFs or SWITCH.
Step 3 — Create a parameters row
Above your data (say, in Row 1 if your header is in Row 2, or use a clearly labelled section at the top of the sheet), create one cell per feature (7 cells for 7 features). These are your model's learnable coefficients — what gradient descent will adjust.
Initialise them with small random values centred around zero:
Excel:
OpenOffice/LibreOffice Calc:
Name your parameter cells. In Excel: select them, then type a name in the Name Box (top-left, where the cell address shows). Call this range params. This makes your formulas readable.
Part 2: Building the Linear Regression Model
Step 4 — Compute predictions for each passenger
For each data row, you compute a prediction by taking the dot product of the parameters and that row's features. This is the heart of a linear model.
Excel — in the Prediction column (e.g. column I), enter for row 2:
Where $B$1:$H$1 is your parameters row (use absolute references so it doesn't move when you copy the formula down).
OpenOffice/LibreOffice Calc — identical:
Copy this formula down for every passenger row. Column I now contains your raw model output for each passenger.
Step 5 — Compute the loss
You need a single number that measures how wrong your predictions are. Jeremy uses Mean Absolute Error (MAE): the average absolute difference between each prediction and the true survival value.
Excel — in a dedicated cell (e.g. K1):
Important: This is an array formula. In Excel, enter it with Ctrl+Shift+Enter (not just Enter), and you will see curly braces { } appear around it automatically. In Excel 365 / Excel 2019+, you can press Enter normally — array formulas are automatic.
OpenOffice/LibreOffice Calc:
Also entered with Ctrl+Shift+Enter in LibreOffice to confirm as an array formula.
Label this cell clearly — e.g. put Loss (MAE) in the cell to the left. This is the number the Solver will minimise.
Step 6 — Run Solver to train the model
This is gradient descent — the same optimisation that trains GPT, image classifiers, and every other neural network. The Solver finds parameter values that minimise your loss.
In Microsoft Excel:
- Go to Data tab → Solver (if you don't see Solver: File → Options → Add-ins → Manage Excel Add-ins → tick Solver Add-in → OK)
- Set Objective: click the cell containing your MAE loss (e.g.
$K$1) - To: select Min
- By Changing Variable Cells: select your parameters range (e.g.
$B$1:$H$1) - Under Select a Solving Method: choose GRG Nonlinear
- Click Solve
- When Solver finishes, select Keep Solver Solution and click OK
In LibreOffice Calc:
- Go to Tools → Solver
- Target Cell: your MAE loss cell
- Optimize result to: select Minimum
- By changing cells: your parameters range
- Click OK to run
Watch the loss cell decrease as Solver iterates. When it converges, your parameters have been trained. The model is now making better predictions than it did with random weights.
Part 3: Building the Neural Network
A single linear model has a fundamental limitation — it can only draw a straight line (or flat hyperplane) through the data. Adding a non-linearity (ReLU) and a second linear unit gives the network the ability to learn curved, complex boundaries.
Step 7 — Add a second set of parameters
Copy your parameters row to a new row directly below it (e.g. Row 2 if your first parameters were in Row 1). Again initialise with =RAND()-0.5, then paste as values. You now have two sets of coefficients.
Label them: params_1 and params_2.
Step 8 — Compute two predictions and apply ReLU
For each passenger, compute two raw predictions — one from each parameter row — then apply a Rectified Linear Unit (ReLU): replace any negative value with zero. This is the non-linearity that gives neural networks their power.
Excel — Raw prediction from params_1 (column J):
Excel — ReLU of first prediction (column K):
Excel — Raw prediction from params_2 (column L):
Excel — ReLU of second prediction (column M):
Excel — Final neural net output = sum of the two ReLU outputs (column N):
OpenOffice/LibreOffice Calc — all identical:
Copy all five formulas down for every passenger row.
Step 9 — Update the loss cell
Change your MAE formula to reference the neural net output column (N) instead of the old single-model column (I):
Excel:
OpenOffice/LibreOffice Calc:
(Adjust row numbers to match your data range.)
Step 10 — Train the neural network with Solver
Run Solver again, this time with both parameter rows as the variable cells.
Excel:
1. Data → Solver
2. Set Objective: loss cell → Min
3. By Changing Variable Cells: select both parameter rows together, e.g. $B$1:$H$2
4. Method: GRG Nonlinear → Solve
LibreOffice Calc: 1. Tools → Solver 2. Target cell: loss → Minimum 3. By changing cells: both parameter rows
The loss should now be lower than before — the two-neuron network is more expressive than the single linear model.
Part 4: Matrix Multiplication (Bonus)
Jeremy shows that once you have more than a handful of features or neurons, writing out SUMPRODUCT formulas for each row individually becomes tedious. Matrix multiplication (MMULT) computes all the dot products simultaneously.
Step 11 — Compute all predictions at once with MMULT
Excel:
This produces a matrix where each row is a passenger and each column is one neuron's raw output — the equivalent of all your SUMPRODUCT formulas at once.
OpenOffice/LibreOffice Calc:
In both applications, enter this as an array formula (Ctrl+Shift+Enter) and select a destination range the right size (number of passengers × number of neurons) before entering.
@ operator or torch.matmul() does — and it is the reason GPUs are so effective at deep learning. They can perform enormous matrix multiplications (thousands of rows, thousands of columns) in parallel.
What You've Just Built
| Component | Spreadsheet version | PyTorch equivalent |
|---|---|---|
| Parameters | Cells in a row | torch.tensor(..., requires_grad=True) |
| Prediction | SUMPRODUCT(params, features) |
params @ features |
| Loss | AVERAGE(ABS(preds - actuals)) |
(preds - actuals).abs().mean() |
| ReLU | MAX(x, 0) |
F.relu(x) |
| Training | Excel/LibreOffice Solver | loss.backward(); params -= params.grad * lr |
| All predictions at once | MMULT(features, TRANSPOSE(params)) |
features @ params.T |
Every concept in Chapter 4 of the book — stochastic gradient descent, the forward pass, the activation function, the loss — maps directly onto what you have done in this spreadsheet. The book just compresses it into clean Python, vectorised across thousands of images simultaneously.
Doing It in Python (For the Ambitious)
If you want to go further before Chapter 4, translate every step above into NumPy:
import numpy as np
import pandas as pd
# Load and encode data
df = pd.read_csv('titanic/train.csv')
df['sex'] = (df['Sex'] == 'female').astype(float)
df['embarked'] = df['Embarked'].map({'S': 0, 'C': 1, 'Q': 2}).fillna(0)
df['age'] = df['Age'].fillna(df['Age'].median())
features = df[['Pclass', 'sex', 'age', 'SibSp', 'Parch', 'Fare', 'embarked']].values
targets = df['Survived'].values.reshape(-1, 1)
# Initialise random parameters (equivalent to RAND()-0.5)
params = np.random.rand(7, 2) - 0.5 # 7 features, 2 neurons
# Forward pass: MMULT equivalent
raw = features @ params # shape: (passengers, 2)
relus = np.maximum(raw, 0) # ReLU: MAX(x, 0)
preds = relus.sum(axis=1, keepdims=True) # sum the two neurons
# Loss: MAE
loss = np.abs(preds - targets).mean()
print(f"Initial loss: {loss:.4f}")
From here you would implement gradient descent manually — a fine exercise that will make Chapter 4's PyTorch autograd feel like a revelation rather than a mystery.
Why This Exercise Is Worth Your Time
The fast.ai philosophy is top-down: you use the tools before you understand the internals. But at the Chapter 4 inflection point, the book inverts — it asks you to understand the internals before you can properly use the tools. The spreadsheet exercise bridges that inversion. It makes the internals concrete and tactile at the cost of only a couple of hours.
Jeremy has said this exercise is one of his strongest recommendations to students who find Chapter 4 difficult. Having built a neural network with your own hands — even a tiny one, even in a spreadsheet — permanently changes the way you read the Python code that follows.
Do it in Excel. Do it in LibreOffice. Or do it in Python. But do it before you open Chapter 4. And if you haven't yet worked through the Lesson 3 foundations, start with the companion guide →
Why Study with Dr. Neal Aggarwal?
Forty years of teaching information technology and artificial intelligence across academic, corporate, and individual-mentorship contexts confer a particular kind of understanding that no amount of self-study can replicate: the ability to recognise where a given student is stuck, why they're stuck there, and what angle of re-entry will unstick them.
The fast.ai curriculum is excellent. But it was designed for a specific learner archetype, and most learners are not that archetype. Working through the course with a guide who has led hundreds of learners through this material means those mismatches get resolved in real time, not after three weeks of stalling on a concept that could have been reframed in five minutes.
Students who study with Dr. Neal complete the fast.ai curriculum in half the median time, with substantially deeper practical understanding — and they leave with a project, not just a certificate.
→ Contact Dr. Neal Aggarwal for 1-to-1 sessions, group workshops, and curriculum design