I’ve always believed the best way to learn something is to build it.

So I decided to teach my eldest daughter, Khawlah, Python. Not with textbooks or tutorials, but by building something real together: a personalized large language model.

It’s not a toy. We’re building a fully functional private LLM system with FastAPI, Supabase for storage, vector search, and an embedding pipeline. She’s typing alongside me, watching what happens when you pipe data through a tokenizer, when a query hits a vector store, or when the memory index shifts.

But somewhere around week three, we hit a proper wall. And honestly, it was a wall I didn’t expect.

The Problem We Faced

Our model was losing its sharpness over time.

Not in the obvious sense: no wild hallucinations, no errors. Something more subtle. It started giving bland responses. Technically correct, but it felt like it had forgotten why it cared.

It wasn’t just a coding bug. It was architectural.

The system was loading entire memory logs into the context window on each query — every previous interaction, every correction, every lesson. It was working fine at first, but the longer we used it, the more noise it introduced.
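To make that concrete, here's a minimal sketch of the naive loading pattern we started with (simplified, with placeholder field names, not our actual code):

def build_context(query, memories):
    # Naive approach: every stored interaction, oldest first,
    # jammed into the prompt on every single query.
    history = "\n".join(m['text'] for m in memories)
    return f"{history}\n\nUser: {query}"

Every query paid for every memory, relevant or not, and the signal-to-noise ratio only got worse.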

The model was forgetting how to prioritize what mattered.

That’s when I sat down and explained it to Khawlah using a metaphor:

“Imagine if every time you opened your schoolbag, everything from the past month fell out — even that banana peel you forgot to throw away. It’s all there, but you can’t find your math book.”

She laughed, but she got it immediately. And we decided to fix it.

The Technical Insight: Context Weighting for Memory Recall

We introduced context-weighted memory embeddings.

Instead of treating all past interactions equally, we assigned weights to each memory vector at insertion time, based on:

  • Interaction Type: Was it a correction? A question? A passive fact?

  • Recency: When did it happen?

  • Repetition: Has this question been asked before?

  • Confidence: How certain was the LLM in its last answer?

We used cosine similarity thresholds combined with a decay function that reduced the weight of stale, one-off context fragments over time.

Here’s the rough pseudocode we played with (Khawlah helped structure this):

from datetime import datetime

def assign_weight(memory_entry):
    """Score a memory entry by interaction type and recency."""
    base_score = 1.0

    # Corrections deserve extra pull; passive observations fade fastest.
    if memory_entry['type'] == 'correction':
        base_score *= 2.0
    elif memory_entry['type'] == 'passive_observation':
        base_score *= 0.5

    # Hyperbolic decay: a day-old memory keeps half its weight,
    # a nine-day-old memory a tenth.
    days_old = (datetime.now() - memory_entry['timestamp']).days
    recency_penalty = 1 / (1 + days_old)

    return base_score * recency_penalty
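The rough version above only scores type and recency. A fuller sketch that folds in the repetition and confidence signals from our list (the repeat_count and confidence fields are illustrative names, not our exact schema):

def assign_weight_full(memory_entry):
    weight = assign_weight(memory_entry)  # type + recency, as above

    # Repetition: questions that keep coming back earn a modest boost,
    # capped so one hot topic can't drown out everything else.
    weight *= min(1 + 0.25 * memory_entry.get('repeat_count', 0), 2.0)

    # Confidence: if the model was unsure last time, keep that memory
    # prominent so the correction loop can revisit it.
    weight *= 2 - memory_entry.get('confidence', 1.0)  # confidence in [0, 1]

    return weight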

These weights were stored alongside the embedding in Supabase, and during vector retrieval, we filtered or boosted vectors based on both semantic similarity and their assigned weights.
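In spirit, the retrieval step worked like this (an in-memory sketch; in the real pipeline the similarity search ran inside Supabase, and the 0.75 threshold is just an illustrative cutoff):

import numpy as np

SIM_THRESHOLD = 0.75  # illustrative cutoff, tuned by trial and error

def cosine_sim(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec, entries, top_k=5):
    # Each entry: {'embedding': ..., 'weight': ..., 'text': ...}
    scored = []
    for entry in entries:
        sim = cosine_sim(query_vec, entry['embedding'])
        if sim < SIM_THRESHOLD:
            continue  # filter: too far off-topic to consider at all
        scored.append((sim * entry['weight'], entry))  # boost by stored weight
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]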

Suddenly the model started acting like a real tutor again: one that remembered what mattered, ignored the fluff, and knew when to reinforce versus when to move forward.

The Unexpected Bonus: Real Learning for Both of Us

Watching Khawlah learn this alongside me changed everything.

She wasn’t just learning Python syntax; she was seeing how data structure design impacts cognition, how memory management affects intelligence, and how algorithms influence behavior. These are concepts most kids (and frankly, most devs) only encounter much later.

But here she was, debugging a vector priority bug with me, asking questions like:

“Shouldn’t we make corrections last longer in memory, even if they’re old?”

Yes, we should.
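One way to implement her idea is to give each interaction type its own decay half-life, so corrections age more slowly than small talk. A minimal sketch (the numbers are illustrative, not tuned values from our system):

# Per-type half-lives in days: corrections linger, chit-chat fades fast.
HALF_LIFE_DAYS = {
    'correction': 30,
    'question': 7,
    'passive_observation': 2,
}

def recency_penalty(memory_entry, days_old):
    half_life = HALF_LIFE_DAYS.get(memory_entry['type'], 7)
    return 0.5 ** (days_old / half_life)  # exponential decay, per type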

Final Reflection

This project started as a way for her to learn Python in context.

But it turned into a father-daughter engineering lab where we both discovered something important:

Memory is not about recall. It’s about relevance.

If you’re building LLM tools for your kids, or with them, don’t just dump everything into memory. Teach the system what to forget. Or better yet, teach it how to forget.

And let your kids teach you something along the way too.

Building something similar? I’m happy to share our stack, prompt design strategy, or memory indexing structure.

You can also reach out if you’re trying to teach your kids Python through real-world LLM projects; I’ve got notes.
