Building a Thinking Model from a Non-Thinking Model Using Chain-of-Thought (CoT) Prompting

Consider a support chatbot, like the one Zomato uses: it only answers within the scope it was designed for.

If you ask it to write code, it won’t.
Why? Because it was built for a narrow task and follows patterns rather than reasoning.

Most language models work like this:

  • Input → Predict most likely text → Output.
    No planning. No deep thought. Just autocomplete on steroids.

But as developers, we can programmatically push the model to reason step by step.
That’s where Chain-of-Thought (CoT) comes in: not as a “prompt trick,” but as part of your system design.

The Problem: LLMs Don’t Think By Default

LLMs are trained to complete text, not to break problems into logical substeps.
Given an expression like

20 + 32 × 67 + 93 - 267 ÷ 45

a model might jump straight to an answer.
If they make a small mistake early, the final answer is wrong — and you won’t even know why.
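
To make this concrete, the expression above can be evaluated one sub-step at a time. The snippet below is a minimal illustration in plain JavaScript; the variable names are ours, not from the original post. Each intermediate value can be checked on its own, which is the property a step-by-step protocol is meant to give you.

```javascript
// Evaluate 20 + 32 × 67 + 93 - 267 ÷ 45 in explicit sub-steps.
// Multiplication and division bind tighter than addition and subtraction.
const product = 32 * 67;        // 2144
const quotient = 267 / 45;      // about 5.9333
const sum = 20 + product + 93;  // 2257
const result = sum - quotient;  // about 2251.0667

console.log({ product, quotient, sum, result });
```

If the first step had produced 2154 instead of 2144, every later value would be off by ten, and only the intermediate values would show where the error entered.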

The Solution: Force Thinking with START → THINK → EVALUATE → OUTPUT

Instead of just asking for the solution to a complex problem, we define a protocol that the AI must follow:

  1. START — Understand the problem.

  2. THINK — Break it into smaller steps.

  3. EVALUATE — Wait for a human or another AI to check the step.

  4. OUTPUT — Only after all checks, give the final answer.

By forcing the AI to follow this sequence — one step at a time — we can catch mistakes before the final output.
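
A protocol like this is easier to enforce if every message is validated before it is acted on. The sketch below assumes each message is a JSON object with a `step` and a `content` field; `parseStep` and `STEPS` are hypothetical names introduced here for illustration, not part of any library.

```javascript
// Hypothetical helper: validate one protocol message before acting on it.
const STEPS = ['START', 'THINK', 'EVALUATE', 'OUTPUT'];

function parseStep(raw) {
  let message;
  try {
    message = JSON.parse(raw);
  } catch {
    return { ok: false, error: 'not valid JSON' };
  }
  if (message === null || typeof message !== 'object') {
    return { ok: false, error: 'not a JSON object' };
  }
  if (!STEPS.includes(message.step)) {
    return { ok: false, error: `unknown step: ${message.step}` };
  }
  if (typeof message.content !== 'string') {
    return { ok: false, error: 'content must be a string' };
  }
  return { ok: true, step: message.step, content: message.content };
}

// A well-formed step passes; an unknown step name is rejected.
console.log(parseStep('{"step":"THINK","content":"Split the problem into parts"}'));
console.log(parseStep('{"step":"GUESS","content":"42"}'));
```

Returning a result object instead of throwing lets the calling loop decide whether to retry, ask the model to correct itself, or stop.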

The Code

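import 'dotenv/config'; // loads OPENAI_API_KEY from a local .env file
import { OpenAI } from 'openai';

const client = new OpenAI();

// Upper bound on round trips so an off-protocol reply cannot loop forever.
const MAX_TURNS = 20;

async function main() {
  const SYSTEM_PROMPT = `
    You are an AI assistant who works on START, THINK, EVALUATE, OUTPUT format.
    Always break down problems, evaluate correctness, and only give the final output after all thinking steps are done.

    Output JSON format:
    { "step": "START | THINK | EVALUATE | OUTPUT", "content": "string" }
  `;

  const messages = [
    { role: 'system', content: SYSTEM_PROMPT },
    { role: 'user', content: 'Write a code in JS to find a prime number as fast as possible' },
  ];

  for (let turn = 0; turn < MAX_TURNS; turn++) {
    const response = await client.chat.completions.create({
      model: 'gpt-4.1-mini',
      messages,
      // JSON mode: ask for a single valid JSON object per reply.
      response_format: { type: 'json_object' },
    });

    const rawContent = response.choices[0].message.content;

    let parsed;
    try {
      parsed = JSON.parse(rawContent);
    } catch (err) {
      // Stop cleanly instead of crashing on a malformed reply.
      console.error('Reply was not valid JSON:', rawContent);
      break;
    }

    // Keep the assistant's step in the history so the next call continues from it.
    messages.push({ role: 'assistant', content: JSON.stringify(parsed) });

    if (parsed.step === 'START') {
      console.log(`🔥`, parsed.content);
      continue;
    }

    if (parsed.step === 'THINK') {
      console.log(`\t🧠`, parsed.content);

      // Placeholder evaluator: it always approves. Swap in a real check
      // (a human or a second model) to actually catch mistakes.
      messages.push({
        role: 'developer',
        content: JSON.stringify({
          step: 'EVALUATE',
          content: 'Nice, you are going on the correct path',
        }),
      });

      continue;
    }

    if (parsed.step === 'OUTPUT') {
      console.log(`🤖`, parsed.content);
      break;
    }

    // Any other step (e.g. an EVALUATE emitted by the model itself): log it and move on.
    console.log(`\t🔎`, parsed.content);
  }

  console.log('Done...');
}

main().catch(console.error);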

source: https://github.com/piyushgarg-dev/genai-js-1.0

Why This Works

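  • No blind guessing — The AI must show its reasoning.

  • Error catching — Mistakes can be caught in EVALUATE before they reach the user, provided the evaluator actually checks each step (the hard-coded approval in the code above never rejects anything).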

  • Composable — You can swap the evaluator with another AI (LLM-as-a-judge) or a human reviewer.

  • Transparent — Every decision step is visible for debugging.
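
The composability point can be sketched by passing both the model call and the evaluator into the loop as functions. Everything below is an illustrative assumption: `runProtocol`, `scriptedModel`, and `checkArithmetic` are hypothetical names, and the "model" simply replays a fixed list of steps so the example runs without an API key.

```javascript
// Hypothetical driver: the model call and the evaluator are both injected,
// so either can be replaced without touching the loop.
async function runProtocol({ generate, evaluate, maxTurns = 10 }) {
  const transcript = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const message = await generate(transcript);
    transcript.push(message);
    if (message.step === 'OUTPUT') {
      return { answer: message.content, transcript };
    }
    if (message.step === 'THINK') {
      // The evaluator could be a human prompt, a second LLM call,
      // or a programmatic check like the one below.
      const feedback = await evaluate(message.content);
      transcript.push({ step: 'EVALUATE', content: feedback });
    }
  }
  throw new Error('No OUTPUT step within maxTurns');
}

// Stand-in for a real model: replays a fixed script and ignores feedback.
function scriptedModel(steps) {
  let i = 0;
  return async () => steps[i++];
}

// Stand-in evaluator: verifies a claimed product instead of always approving.
function checkArithmetic(thought) {
  const match = thought.match(/(\d+) \* (\d+) = (\d+)/);
  if (!match) return 'No checkable claim found';
  const [, a, b, claimed] = match.map(Number);
  return a * b === claimed ? 'Correct' : `Incorrect: ${a} * ${b} is ${a * b}`;
}

const demo = runProtocol({
  generate: scriptedModel([
    { step: 'START', content: 'Compute 32 * 67' },
    { step: 'THINK', content: '32 * 67 = 2154' },
    { step: 'THINK', content: '32 * 67 = 2144' },
    { step: 'OUTPUT', content: '2144' },
  ]),
  evaluate: checkArithmetic,
});

demo.then(({ transcript }) => console.log(transcript));
```

With a real model, `generate` would send the transcript to the API and return the parsed step, and `evaluate` could call a second model as a judge; the loop itself would not need to change.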

Beyond Math

This technique isn’t just for calculations.
You can apply it to:

  • Debugging code

  • Medical diagnosis

  • Legal reasoning

  • Complex business workflows

Any time reasoning matters more than speed, this approach turns your LLM into a thinking partner instead of a pattern-matcher.
