AI for Youth Academy Future Scholars Research Initiative

Year 1 · Week 07

Chapter 7: Finding the Best Line, One Step at a Time

Last week we saw that the best-fit line for our lemonade data is cups = 0.87 × temperature − 43. But how did we find those numbers? This week we learn a powerful trick: start with a random guess and improve it step by step. This is the same idea behind how real AI systems learn.

Part 1: Our Data in Python

Remember the lemonade stand from last week? Let's put that data into Python as two lists. Open a new notebook in Google Colab and type:

# Temperature in °F
x = [60, 65, 70, 75, 80, 85, 90, 95]

# Cups sold
y = [10, 14, 20, 22, 28, 34, 38, 40]

Each number in x matches the number at the same position in y. When the temperature was 60, we sold 10 cups. When it was 65, we sold 14 cups, and so on.
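A quick way to double-check that the two lists line up is to walk through them in pairs with Python's built-in `zip`, which pairs items by position:

```python
x = [60, 65, 70, 75, 80, 85, 90, 95]
y = [10, 14, 20, 22, 28, 34, 38, 40]

# zip pairs each temperature with the cups sold at the same position
for temp, cups in zip(x, y):
    print(temp, "degrees ->", cups, "cups")
```

The first line printed should be `60 degrees -> 10 cups`, matching our table.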

Part 2: Plot the Data

Before we do any math, let's see our data. We will use a library called matplotlib to draw a scatter plot. Add this to your notebook:

import matplotlib.pyplot as plt

plt.scatter(x, y, color="blue")
plt.xlabel("Temperature (°F)")
plt.ylabel("Cups sold")
plt.title("Lemonade Stand Data")
plt.show()

You should see eight blue dots going upward from left to right. The pattern looks like it could be a straight line—and that's exactly what we want to find.

Part 3: Start with a Random Guess

We want to find a line that fits the data. A line has the equation:

Line equation:

y = k × x + b

Here k is the slope (how steep the line is) and b is the y-intercept (where the line crosses the y-axis). Last week we said the best answer is about k = 0.87 and b = -43. But pretend we don't know that yet.
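To get a feel for what the equation does, here is a quick check of last week's line at a single temperature (plain Python, nothing new needed):

```python
# Last week's line: cups = 0.87 * temperature - 43
k = 0.87
b = -43

# Predicted cups on an 80 °F day
print(round(k * 80 + b, 1))  # 26.6
```

That's close to the 28 cups we actually sold at 80 °F, which is what a good fit should give us.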

Let's just guess. Pick any numbers:

# Our random starting guess
k = 0.5
b = -5

Is this a good guess? Probably not! But that's okay—we will fix it step by step.
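One way to see just how rough this guess is: compare what the guessed line predicts with what actually happened. This reuses the `x` and `y` lists from Part 1:

```python
x = [60, 65, 70, 75, 80, 85, 90, 95]
y = [10, 14, 20, 22, 28, 34, 38, 40]

# Our random starting guess
k = 0.5
b = -5

# Compare the guess's predictions with the real sales
for temp, actual in zip(x, y):
    predicted = k * temp + b
    print(temp, "°F: predicted", predicted, "cups, actually sold", actual)
```

At 60 °F the guess predicts 25 cups, but we only sold 10 — so there is plenty of room to improve.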

Part 7: Walking Downhill—The Learning Rate

Instead of trying every possible k and b, we can be smart about it. The idea is simple:

  1. Start at our current guess (k and b).
  2. Try nudging k a tiny bit up and a tiny bit down. See which direction makes the error smaller.
  3. Move k in the direction that reduces the error.
  4. Do the same thing for b.
  5. Repeat!

This is like being blindfolded on a hill and feeling the ground with your feet to figure out which way is downhill. Let's write this in Python:

# Start with our random guess
k = 0.5
b = -5

# The learning rate controls how big each step is.
# It must be tiny here because our temperatures are big numbers;
# a larger rate (like 0.0001) overshoots and the error blows up.
learning_rate = 0.00001

# A tiny nudge to test the slope
nudge = 0.01
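The loop in the next part calls a `total_error` function that measures how badly a line with slope `k` and intercept `b` misses the data. If it isn't already in your notebook from an earlier part, here is a minimal version using the sum of squared differences — a common choice, though your class version may differ slightly:

```python
def total_error(k, b, x, y):
    """Add up the squared difference between predicted and actual cups."""
    total = 0
    for temp, actual in zip(x, y):
        predicted = k * temp + b
        total += (predicted - actual) ** 2
    return total

x = [60, 65, 70, 75, 80, 85, 90, 95]
y = [10, 14, 20, 22, 28, 34, 38, 40]

# The error of our random guess
print(total_error(0.5, -5, x, y))  # 689.0
```

Squaring the differences makes every miss count as positive, and punishes big misses much more than small ones.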

Part 8: Putting It All Together

One step isn't enough. We need to repeat this many times. Let's use a for loop to do 1000 steps:

# Start fresh
k = 0.5
b = -5
learning_rate = 0.00001
nudge = 0.01

# Run 1000 steps
for step in range(1000):
    # Calculate slopes
    error_now = total_error(k, b, x, y)
    slope_k = (total_error(k + nudge, b, x, y) - error_now) / nudge
    slope_b = (total_error(k, b + nudge, x, y) - error_now) / nudge

    # Update k and b
    k = k - learning_rate * slope_k
    b = b - learning_rate * slope_b

    # Print every 200 steps so we can watch
    if step % 200 == 0:
        print("Step", step, " k =", round(k, 3),
              " b =", round(b, 3),
              " error =", round(error_now, 1))

Watch the output—you'll see the error shrink quickly at first and then more slowly as k and b settle in. With only 1000 steps they won't land exactly on the best values yet; gradient descent often needs many more steps, but the line keeps improving the whole time.
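To see how well the loop did, you can draw its line on top of the scatter plot from Part 2. This sketch uses last week's values for k and b as stand-ins — replace them with whatever your loop printed at the end:

```python
import matplotlib.pyplot as plt

x = [60, 65, 70, 75, 80, 85, 90, 95]
y = [10, 14, 20, 22, 28, 34, 38, 40]

# Stand-in values — use the k and b your loop ended with
k = 0.87
b = -43

plt.scatter(x, y, color="blue")
plt.plot(x, [k * temp + b for temp in x], color="red")
plt.xlabel("Temperature (°F)")
plt.ylabel("Cups sold")
plt.title("Lemonade Data with Fitted Line")
plt.show()
```

If the red line passes close to most of the blue dots, gradient descent did its job.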

Looking Ahead

Today you learned one of the most important ideas in all of AI: gradient descent—start with a guess, measure the error, and improve step by step. This is exactly how ChatGPT, image recognizers, and self-driving cars learn. The only difference is they have millions of parameters instead of just two (k and b), and they use much more data.

In the coming weeks, we will explore what happens when the relationship is not a straight line, how to deal with more than one input, and how this same idea powers neural networks.