Building a Thinking Model from a Non-Thinking Model Using Chain-of-Thought (CoT) Prompting

Think about a task-specific AI chatbot, like the one Zomato uses: it only answers within the scope it was designed for.
If you ask it to write code, it won’t.
Why? Because it was built for one narrow job, and it follows patterns rather than reasoning.
Most language models work like this:
- Input → Predict most likely text → Output.
No planning. No deep thought. Just autocomplete on steroids.
But as developers, we can actually programmatically force the model to reason step-by-step.
That’s where Chain-of-Thought (CoT) comes in — not as a “prompt trick,” but as part of your system design.
The Problem: LLMs Don’t Think By Default
LLMs are trained to complete text, not to break problems into logical substeps.
When given:
20 + 32 × 67 + 93 - 267 ÷ 45
They might jump straight to an answer.
If they make a small mistake early, the final answer is wrong — and you won’t even know why.
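To make that failure mode concrete, here is a minimal sketch of the same expression worked one operation at a time, following standard operator precedence. The function name `solveStepByStep` and the step labels are illustrative and not part of the original code; the point is only that each intermediate value becomes something you can inspect.

```javascript
// Illustrative sketch: evaluate 20 + 32 × 67 + 93 - 267 ÷ 45 one operation
// at a time, recording every intermediate value.
function solveStepByStep() {
  const steps = [];
  const record = (label, value) => {
    steps.push({ label, value });
    return value;
  };

  // Multiplication and division bind tighter than addition and subtraction.
  const product = record('32 × 67', 32 * 67);
  const quotient = record('267 ÷ 45', 267 / 45);

  // Then add and subtract from left to right.
  const sum1 = record('20 + (32 × 67)', 20 + product);
  const sum2 = record('previous + 93', sum1 + 93);
  const result = record('previous - (267 ÷ 45)', sum2 - quotient);

  return { steps, result };
}

const { steps, result } = solveStepByStep();
for (const { label, value } of steps) {
  console.log(`${label} = ${value}`);
}
console.log('Final answer ≈', result.toFixed(4));
```

If the model had slipped on `32 × 67`, the first recorded step would show it immediately, instead of the error hiding inside a single final number.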
The Solution: Force Thinking with START → THINK → EVALUATE → OUTPUT
Instead of just asking for the solution to a complex problem, we define a protocol that the AI must follow:
START — Understand the problem.
THINK — Break it into smaller steps.
EVALUATE — Wait for a human or another AI to check the step.
OUTPUT — Only after all checks, give the final answer.
By forcing the AI to follow this sequence — one step at a time — we can catch mistakes before the final output.
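One way to make this sequence enforceable in code is to validate every reply before acting on it. The sketch below is an assumption about how that could look: `parseStep` and `ALLOWED_NEXT` are hypothetical names, and the transition table is just one reasonable reading of the four steps, not something the original code defines.

```javascript
// Hypothetical transition table: which step may follow which.
const ALLOWED_NEXT = new Map([
  ['NONE', ['START']],
  ['START', ['THINK']],
  ['THINK', ['EVALUATE']],
  ['EVALUATE', ['THINK', 'OUTPUT']],
  ['OUTPUT', []],
]);

// Parse one raw reply and check it against the protocol.
// Throws if the JSON is malformed, the shape is wrong, or the step is out of order.
function parseStep(raw, previousStep = 'NONE') {
  let message;
  try {
    message = JSON.parse(raw);
  } catch {
    throw new Error(`Reply is not valid JSON: ${raw}`);
  }
  if (message === null || typeof message !== 'object') {
    throw new Error('Reply must be a JSON object');
  }
  const { step, content } = message;
  if (typeof content !== 'string') {
    throw new Error('"content" must be a string');
  }
  const allowed = ALLOWED_NEXT.get(previousStep) ?? [];
  if (!allowed.includes(step)) {
    throw new Error(`Step "${step}" may not follow "${previousStep}"`);
  }
  return { step, content };
}

// Usage: walk a short, valid exchange.
let previous = 'NONE';
for (const raw of [
  '{"step":"START","content":"Find primes quickly"}',
  '{"step":"THINK","content":"Check divisors up to sqrt(n)"}',
]) {
  const message = parseStep(raw, previous);
  console.log(message.step, '->', message.content);
  previous = message.step;
}
```

With a check like this in place, a reply that skips straight from START to OUTPUT is rejected instead of being shown to the user.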
The Code
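// Assumes an ES module project ("type": "module" in package.json)
// and OPENAI_API_KEY available in a .env file.
import 'dotenv/config';
import { OpenAI } from 'openai';

const client = new OpenAI();
const MAX_TURNS = 20; // safety cap so a confused model cannot loop forever

async function main() {
  const SYSTEM_PROMPT = `
    You are an AI assistant who works on START, THINK, EVALUATE, OUTPUT format.
    Always break down problems, evaluate correctness, and only give the final output after all thinking steps are done.
    Reply with exactly one step per message, and after each THINK step wait for an EVALUATE message before continuing.
    Output JSON format:
    { "step": "START | THINK | EVALUATE | OUTPUT", "content": "string" }
  `;

  const messages = [
    { role: 'system', content: SYSTEM_PROMPT },
    { role: 'user', content: 'Write code in JS to find a prime number as fast as possible' },
  ];

  for (let turn = 0; turn < MAX_TURNS; turn++) {
    const response = await client.chat.completions.create({
      model: 'gpt-4.1-mini',
      messages,
      // Ask for a JSON object so JSON.parse below is less likely to fail.
      response_format: { type: 'json_object' },
    });

    const rawContent = response.choices[0].message.content;
    let parsed;
    try {
      parsed = JSON.parse(rawContent);
    } catch {
      console.error('Model returned invalid JSON:', rawContent);
      break;
    }

    // Keep the model's own step in the history so it can build on it.
    messages.push({ role: 'assistant', content: JSON.stringify(parsed) });

    if (parsed.step === 'START') {
      console.log(`🔥`, parsed.content);
      continue;
    }

    if (parsed.step === 'THINK') {
      console.log(`\t🧠`, parsed.content);
      // Placeholder evaluator: it always approves the step.
      // Swap in a real check (a human or a second model) to catch mistakes.
      messages.push({
        role: 'developer',
        content: JSON.stringify({
          step: 'EVALUATE',
          content: 'Nice, you are going on the correct path',
        }),
      });
      continue;
    }

    if (parsed.step === 'OUTPUT') {
      console.log(`🤖`, parsed.content);
      break;
    }
  }

  console.log('Done...');
}

main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});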
source: https://github.com/piyushgarg-dev/genai-js-1.0
Why This Works
No blind guessing: the AI must show its reasoning.
Error catching: mistakes can be caught in EVALUATE before they reach the user, provided the evaluator actually checks each step rather than approving it automatically.
Composable: you can swap the evaluator with another AI (LLM-as-a-judge) or a human reviewer.
Transparent: every decision step is visible for debugging.
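The composability point can be sketched by passing the model call and the evaluator in as functions. Everything named here (`runProtocol`, `generate`, `evaluate`, `maxSteps`, `scriptedModel`) is hypothetical. In a real setup `generate` would wrap a chat completion call and `evaluate` could be a second model or a human review queue; the stubs below only exist so the sketch runs without an API key.

```javascript
// Hypothetical driver: the model call and the evaluator are injected,
// so either can be swapped without touching the loop.
async function runProtocol({ generate, evaluate, maxSteps = 10 }) {
  const transcript = [];
  for (let i = 0; i < maxSteps; i++) {
    const message = await generate(transcript);
    transcript.push(message);

    if (message.step === 'OUTPUT') {
      return { answer: message.content, transcript };
    }
    if (message.step === 'THINK') {
      // The evaluator sees the step and returns feedback text.
      const feedback = await evaluate(message, transcript);
      transcript.push({ step: 'EVALUATE', content: feedback });
    }
  }
  throw new Error(`No OUTPUT after ${maxSteps} steps`);
}

// Stub model: replays a fixed script instead of calling an API.
function scriptedModel(script) {
  let index = 0;
  return async () => script[index++];
}

// Stub evaluator: a rule-based check standing in for an LLM judge or a human.
const ruleBasedEvaluator = async (message) =>
  message.content.trim().length > 0
    ? 'Looks reasonable, continue.'
    : 'This step is empty, please redo it.';

runProtocol({
  generate: scriptedModel([
    { step: 'START', content: 'User wants a fast primality test.' },
    { step: 'THINK', content: 'Only test divisors up to the square root of n.' },
    { step: 'OUTPUT', content: 'Use trial division up to sqrt(n).' },
  ]),
  evaluate: ruleBasedEvaluator,
}).then(({ answer, transcript }) => {
  console.log(answer);
  console.log(`${transcript.length} messages recorded`);
});
```

Replacing `ruleBasedEvaluator` with a function that calls a second model turns this into an LLM-as-a-judge setup without changing `runProtocol` itself.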
Beyond Math
This technique isn’t just for calculations.
You can apply it to:
Debugging code
Medical diagnosis
Legal reasoning
Complex business workflows
Any time reasoning matters more than speed, this approach turns your LLM into a thinking partner instead of a pattern-matcher.






