Building a Thinking Model from a Non-Thinking Model Using Chain-of-Thought (CoT) Prompting

Think about a customer-support chatbot, like the one Zomato uses: it only answers within the scope it was designed for.

If you ask it to write code, it won’t.
Why? Because it was built and constrained for one job. It follows patterns instead of reasoning about new problems.

Most language models work like this:

Input → Predict most likely text → Output.
No planning. No deep thought. Just autocomplete on steroids.

But as developers, we can actually programmatically force the model to reason step-by-step.
That’s where Chain-of-Thought (CoT) comes in — not as a “prompt trick,” but as part of your system design.

The Problem: LLMs Don’t Think By Default

LLMs are trained to complete text, not to break problems into logical substeps.
When given:

20 + 32 × 67 + 93 - 267 ÷ 45

the model might jump straight to an answer.
If it makes a small mistake early, the final answer is wrong, and you won’t know where it went wrong.
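To make the failure mode concrete, here is a small sketch that works through the expression above one step at a time, the way we want the model to. The variable names are only for illustration; the point is that every intermediate value can be checked on its own.

```javascript
// Operator precedence: multiplication and division first,
// then addition and subtraction from left to right.
const product = 32 * 67;   // 2144
const quotient = 267 / 45; // roughly 5.9333

// Each intermediate value is kept so a wrong step is easy to spot.
const afterFirstAdd = 20 + product;         // 2164
const afterSecondAdd = afterFirstAdd + 93;  // 2257
const result = afterSecondAdd - quotient;   // roughly 2251.0667

console.log({ product, quotient, afterFirstAdd, afterSecondAdd });
console.log('Result:', result.toFixed(4));
```

If the model had computed 32 × 67 incorrectly, the error would show up in the first line instead of being hidden inside a single final number.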

The Solution: Force Thinking with START → THINK → EVALUATE → OUTPUT

Instead of just asking for the solution to a complex problem, we define a protocol that the AI must follow:

START — Understand the problem.
THINK — Break it into smaller steps.
EVALUATE — Wait for a human or another AI to check the step.
OUTPUT — Only after all checks, give the final answer.

By forcing the AI to follow this sequence — one step at a time — we can catch mistakes before the final output.
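One way to picture the protocol is as a tiny state machine. The sketch below is an illustration, not part of the original code: the `TRANSITIONS` table and `isValidTrace` helper are hypothetical names, and it assumes that after an evaluation the model may either think again or finish.

```javascript
// Allowed next steps for each step of the protocol.
// Assumption: EVALUATE can lead back to THINK or forward to OUTPUT.
const TRANSITIONS = new Map([
  ['START', ['THINK']],
  ['THINK', ['EVALUATE']],
  ['EVALUATE', ['THINK', 'OUTPUT']],
  ['OUTPUT', []],
]);

// Returns true only if the sequence starts at START, ends at OUTPUT,
// and every step follows an allowed transition.
function isValidTrace(steps) {
  if (steps.length === 0 || steps[0] !== 'START') return false;
  for (let i = 1; i < steps.length; i++) {
    const allowed = TRANSITIONS.get(steps[i - 1]) ?? [];
    if (!allowed.includes(steps[i])) return false;
  }
  return steps[steps.length - 1] === 'OUTPUT';
}

console.log(isValidTrace(['START', 'THINK', 'EVALUATE', 'THINK', 'EVALUATE', 'OUTPUT'])); // true
console.log(isValidTrace(['START', 'OUTPUT'])); // false: skips THINK and EVALUATE
```

A check like this could run over the conversation history to confirm the model never jumped to OUTPUT without an evaluated THINK step.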

The Code

import 'dotenv/config';
import { OpenAI } from 'openai';

const client = new OpenAI();

async function main() {
  // The system prompt defines the protocol the model must follow.
  const SYSTEM_PROMPT = `
    You are an AI assistant who works in START, THINK, EVALUATE, OUTPUT format.
    Always break down problems, evaluate correctness, and only give the final output after all thinking steps are done.
    Emit exactly one step per response, then wait for the next message before continuing.

    Output JSON format:
    { "step": "START | THINK | EVALUATE | OUTPUT", "content": "string" }
  `;

  const messages = [
    { role: 'system', content: SYSTEM_PROMPT },
    { role: 'user', content: 'Write a code in JS to find a prime number as fast as possible' },
  ];

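  // Safety cap (an added assumption, not in the original code) so a model
  // that never emits OUTPUT cannot keep the loop running forever.
  const MAX_TURNS = 20;

  for (let turn = 0; turn < MAX_TURNS; turn++) {
    const response = await client.chat.completions.create({
      model: 'gpt-4.1-mini',
      messages,
      // JSON mode: ask the API to return a single valid JSON object.
      response_format: { type: 'json_object' },
    });

    const rawContent = response.choices[0].message.content;
    const parsed = JSON.parse(rawContent);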

    messages.push({ role: 'assistant', content: JSON.stringify(parsed) });

    if (parsed.step === 'START') {
      console.log(`🔥`, parsed.content);
      continue;
    }

    if (parsed.step === 'THINK') {
      console.log(`\t🧠`, parsed.content);

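      // Placeholder evaluator: it approves every step without checking it.
      // Swap in a real check (another LLM call or a human review) to catch mistakes.
      messages.push({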
        role: 'developer',
        content: JSON.stringify({
          step: 'EVALUATE',
          content: 'Nice, you are going on the correct path',
        }),
      });

      continue;
    }

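    if (parsed.step === 'OUTPUT') {
      console.log(`🤖`, parsed.content);
      break;
    }

    // Any other step (for example, the model emitting EVALUATE itself) is logged
    // so it stays visible; the loop then asks the model for its next step.
    console.warn('Unhandled step:', parsed.step);
  }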

  console.log('Done...');
}

main();

source: https://github.com/piyushgarg-dev/genai-js-1.0
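The loop above calls JSON.parse directly on the model’s reply, so one malformed response would crash the run. Here is a minimal sketch of a defensive parser you could put in front of it. `parseStep` and `VALID_STEPS` are hypothetical names, not part of the linked repository, and the shape it checks is the `{ step, content }` format from the system prompt.

```javascript
const VALID_STEPS = ['START', 'THINK', 'EVALUATE', 'OUTPUT'];

// Hypothetical helper: validates one model reply against the
// { "step": ..., "content": ... } format from the system prompt.
// Returns { ok: true, step, content } or { ok: false, error }.
function parseStep(raw) {
  let message;
  try {
    message = JSON.parse(raw);
  } catch {
    return { ok: false, error: 'reply is not valid JSON' };
  }
  if (message === null || typeof message !== 'object' || Array.isArray(message)) {
    return { ok: false, error: 'reply is not a JSON object' };
  }
  if (!VALID_STEPS.includes(message.step)) {
    return { ok: false, error: `unknown step: ${String(message.step)}` };
  }
  if (typeof message.content !== 'string') {
    return { ok: false, error: 'content must be a string' };
  }
  return { ok: true, step: message.step, content: message.content };
}

console.log(parseStep('{"step":"THINK","content":"Check divisors up to sqrt(n)."}'));
console.log(parseStep('Sure! Here is the answer...'));
```

When `ok` is false, the loop could push the error back to the model as a correction message instead of throwing.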

Why This Works

No blind guessing: the AI must show its reasoning.
Error catching: mistakes can be caught in EVALUATE before they reach the user, as long as the evaluator actually checks each step (the sample code above hard-codes an approval).
Composable: you can swap the evaluator with another AI (LLM-as-a-judge) or a human reviewer.
Transparent: every decision step is visible for debugging.
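To show what swapping the evaluator might look like, here is a rough sketch. `evaluateStep` and `stubJudge` are hypothetical helpers, not part of the original code. The judge is passed in as a function, so it could be a second model call, a rule-based check, or a prompt to a human. The only assumption is that it resolves to `{ ok, reason }`.

```javascript
// Hypothetical helper: turns a judge's verdict into the EVALUATE message
// that gets pushed back into the conversation.
// `judge` is any async function returning { ok: boolean, reason: string }.
async function evaluateStep(thought, judge) {
  const verdict = await judge(thought);
  const content = verdict.ok
    ? `Approved: ${verdict.reason}`
    : `Rejected: ${verdict.reason} Revise this step before continuing.`;
  return {
    role: 'developer',
    content: JSON.stringify({ step: 'EVALUATE', content }),
  };
}

// Stand-in judge for illustration only: it rejects empty thoughts.
// A real judge would call a second model or hand the step to a human.
async function stubJudge(thought) {
  return thought.trim().length > 0
    ? { ok: true, reason: 'The step is non-empty.' }
    : { ok: false, reason: 'The step is empty.' };
}

evaluateStep('Check divisibility only up to sqrt(n).', stubJudge)
  .then((message) => console.log(message.content));
```

In the main loop, the hard-coded approval inside the THINK branch could then become `messages.push(await evaluateStep(parsed.content, yourJudge))`, where `yourJudge` is whatever evaluator you choose.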

Beyond Math

This technique isn’t just for calculations.
You can apply it to:

Debugging code
Medical diagnosis
Legal reasoning
Complex business workflows

Any time reasoning matters more than speed, this approach turns your LLM into a thinking partner instead of a pattern-matcher.
