<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Explaining The Tech]]></title><description><![CDATA[Explaining The Tech]]></description><link>https://blog.veerrajpoot.com</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1755093781097/902d270f-2176-4a28-9694-b4bc3da09ac2.png</url><title>Explaining The Tech</title><link>https://blog.veerrajpoot.com</link></image><generator>RSS for Node</generator><lastBuildDate>Mon, 18 May 2026 00:29:18 GMT</lastBuildDate><atom:link href="https://blog.veerrajpoot.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Getting Started with GIT | Basics and Essential Commands]]></title><description><![CDATA[Hey there! In this post we're going to discuss everyone’s favourite version control system, Git. Whether you’re a complete beginner, an expert software developer, or just a geek, if you write code, Git is a lifesaver & time-saver...]]></description><link>https://blog.veerrajpoot.com/getting-started-with-git-basics-and-essential-commands</link><guid isPermaLink="true">https://blog.veerrajpoot.com/getting-started-with-git-basics-and-essential-commands</guid><category><![CDATA[ChaiCode]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Tue, 27 Jan 2026 08:12:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1769501482492/042dbc61-c8fe-4bac-9095-3c1ff1f3202e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! In this post we're going to discuss everyone’s favourite version control system, Git. Whether you’re a complete beginner, an expert software developer, or just a geek, if you write code, Git is a lifesaver &amp; time-saver for you as a developer.</p>
<h2 id="heading-what-is-git">What is Git?</h2>
<p>Git is a distributed version control system (DVCS) specifically designed to manage and track changes in a project's codebase over time. It lets multiple developers work on the same project while each keeps a complete history of changes, providing a record of who changed what and when. This system is like a "time machine" for your code: it enables developers to revert to previous versions when needed, compare different points in the project's history, and collaborate efficiently without overwriting each other's work. Git's distributed nature means that every developer has a full copy of the entire project history on their local machine, which protects the project against data loss and allows seamless collaboration across diverse teams.</p>
<h2 id="heading-why-git-is-used">Why Is Git Used?</h2>
<p>Git solves many real problems developers face while building software in a team (or even solo).</p>
<ol>
<li><p><strong>Track Changes:</strong> See what changes are made in which files, when, and by whom.</p>
</li>
<li><p><strong>Undo Mistakes:</strong> Safely undo any mistaken changes made to any file.</p>
</li>
<li><p><strong>Work in Parallel:</strong> Multiple developers can create branches to work on different features simultaneously without waiting for each other.</p>
</li>
<li><p><strong>Merge Code:</strong> Combine code from different developers.</p>
</li>
<li><p><strong>Detect Conflicts:</strong> While merging, Git helps track conflicts in the code and resolve them.</p>
</li>
<li><p><strong>Maintain a Remote Repository:</strong> Keep a remote repository for the codebase on a local or cloud server for backup and deployments.</p>
</li>
</ol>
<h2 id="heading-git-basics-and-core-terminologies">Git Basics and Core Terminologies</h2>
<ul>
<li><p><strong>Repository</strong></p>
<p>  A repository (or repo) is a virtual storage space for managing and storing digital assets like a codebase, data, or project files. At its core, a Git repository is the hidden <code>.git</code> directory located in the root of your project folder. Read <a class="post-section-overview" href="#">this blog</a> to learn more about the <code>.git</code> folder.</p>
</li>
<li><p><strong>Commit</strong></p>
<p>  A commit in Git is a snapshot of your project at a specific moment. It records the staged changes together with metadata such as the author, the date, and the commit message, and it becomes a permanent entry in the project's history.</p>
</li>
<li><p><strong>Staged changes</strong></p>
<p>  These are modifications in your project's files that have been marked in their current version to be included in the next commit. This is done using the <code>git add</code> command.</p>
</li>
<li><p><strong>Branch</strong></p>
<p>  A branch is like a separate workspace created from a specific commit in a repository where you can make changes. It can be merged with another branch as a commit or kept as a separate branch.</p>
</li>
<li><p><strong>HEAD</strong></p>
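<p>  HEAD is Git's pointer to what you currently have checked out: normally the current branch, and through it that branch's latest commit. When you make a commit, the new commit is added on top of the one HEAD points to, and the current branch moves forward with it. When you check out another branch, HEAD moves to that branch.</p>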
</li>
<li><p><strong>Checkout</strong></p>
<p>  Checkout in Git is a command used to move HEAD to a different branch or commit and update your working files accordingly.</p>
</li>
</ul>
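<p>To see how these terms fit together, here is a minimal toy model in Python. It is only a sketch for building intuition: the names <code>ToyRepo</code> and <code>Commit</code> are made up for this example, and real Git stores its objects very differently inside the <code>.git</code> directory.</p>

```python
from dataclasses import dataclass, field
from typing import Optional
import hashlib


@dataclass
class Commit:
    message: str
    snapshot: dict           # file name -> content at the time of the commit
    parent: Optional[str]    # id of the previous commit, None for the first one
    id: str = field(init=False)

    def __post_init__(self):
        raw = repr((self.message, sorted(self.snapshot.items()), self.parent))
        self.id = hashlib.sha1(raw.encode()).hexdigest()[:7]


class ToyRepo:
    """Hypothetical, simplified model of a repository (not real Git internals)."""

    def __init__(self):
        self.commits = {}                # commit id -> Commit
        self.branches = {"main": None}   # branch name -> latest commit id
        self.head = "main"               # HEAD names the current branch
        self.staged = {}                 # staged changes, like `git add`

    def add(self, name, content):
        self.staged[name] = content

    def commit(self, message):
        parent = self.branches[self.head]
        snapshot = dict(self.commits[parent].snapshot) if parent else {}
        snapshot.update(self.staged)
        new = Commit(message, snapshot, parent)
        self.commits[new.id] = new
        self.branches[self.head] = new.id   # the current branch moves forward
        self.staged = {}
        return new.id

    def branch(self, name):
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        self.head = name                    # HEAD now follows another branch


repo = ToyRepo()
repo.add("app.py", "print('v1')")
first = repo.commit("initial commit")
repo.branch("feature")
repo.checkout("feature")
repo.add("app.py", "print('v2')")
second = repo.commit("update app")
```

<p>After this runs, <code>main</code> still points at the first commit while <code>feature</code> points at the second one, which is exactly why work on a branch does not disturb anyone else.</p>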
<h2 id="heading-common-git-commands">Common Git Commands</h2>
<ul>
<li><p>Initialize a new Git repository:<br />  <code>git init</code></p>
</li>
<li><p>Copy an existing remote repository to your machine:<br />  <code>git clone &lt;url&gt;</code></p>
</li>
<li><p>See the current state of files in the repository:<br />  <code>git status</code></p>
</li>
<li><p>Stage file changes for the next commit:<br />  <code>git add &lt;filename&gt;</code> → This stages a specific file for the next commit.<br />  <code>git add .</code> → This stages all file changes in the current directory and its subdirectories for the next commit.</p>
</li>
<li><p>Save/commit staged changes to the repository history:<br />  <code>git commit -m "message here"</code></p>
</li>
<li><p>Display the commit history:<br />  <code>git log</code></p>
</li>
<li><p>Merge another branch into the current branch:</p>
<p>  <code>git merge &lt;branch&gt;</code></p>
</li>
<li><p>Show differences between file versions:<br />  <code>git diff</code> → This shows the changes we’ve made to files that haven't yet been added to the staging area with <code>git add</code>.<br />  <code>git diff &lt;branch1&gt; &lt;branch2&gt;</code> → This shows how the tip of branch2 differs from the tip of branch1.</p>
<p>  <code>git diff &lt;commit-id1&gt; &lt;commit-id2&gt;</code> → We can use commit hashes to see the differences between any two points in the project's history.</p>
</li>
<li><p>List all local branches in the repository:<br />  <code>git branch</code></p>
</li>
<li><p>Create a new branch:</p>
<p>  <code>git branch &lt;branch-name&gt;</code></p>
</li>
<li><p>Switch to another branch or commit:</p>
<p>  <code>git checkout &lt;branch/commit hash&gt;</code></p>
</li>
<li><p>Create and switch to a new branch:</p>
<p>  <code>git checkout -b &lt;branch&gt;</code></p>
</li>
<li><p>Add a remote repository to your Git repo:</p>
<p>  <code>git remote add &lt;remote-name&gt; &lt;remote-url&gt;</code></p>
<p>  Replace <code>&lt;remote-name&gt;</code> with a name for the remote (e.g., origin) and <code>&lt;remote-url&gt;</code> with the URL of the remote location (e.g., a repository on GitHub or GitLab).</p>
</li>
<li><p>Upload local commits to the remote repository:</p>
<p>  <code>git push</code></p>
</li>
<li><p>Fetch and merge changes from the remote repository:</p>
<p>  <code>git pull</code></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Making RAG Smarter: Improving Accuracy]]></title><description><![CDATA[In my previous blog on Retrieval-Augmented Generation (RAG), I broke down what RAG is, why it matters, and how it supercharges LLMs with external knowledge. Then, in my follow-up post, I shared the common failure points in RAG systems and how to fix t...]]></description><link>https://blog.veerrajpoot.com/making-rag-smarter-improving-accuracy</link><guid isPermaLink="true">https://blog.veerrajpoot.com/making-rag-smarter-improving-accuracy</guid><category><![CDATA[ChaiCode]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[Retrieval-Augmented Generation]]></category><category><![CDATA[generative ai]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Fri, 22 Aug 2025 15:23:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/fd4xmQUMJPg/upload/94967ad0d40b020de356260928d82a3c.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In my <a target="_blank" href="https://blog.veerrajpoot.com/retrieval-augmented-generation-rag">previous blog on Retrieval-Augmented Generation (RAG)</a>, I broke down what RAG is, why it matters, and how it supercharges LLMs with external knowledge.<br />Then, in my <a target="_blank" href="https://blog.veerrajpoot.com/common-failure-cases-in-rag-systems-and-how-to-fix-them-fast">follow-up post</a>, I shared the <strong>common failure points</strong> in RAG systems and how to fix them quickly.</p>
<p>I recently started digging deeper into <strong>RAG (Retrieval-Augmented Generation)</strong> and realized that while the <strong>basic RAG architecture</strong> is powerful, it’s also far from perfect. So, in this article, let me explain:</p>
<ul>
<li><p>How <strong>basic RAG</strong> works</p>
</li>
<li><p>Why <strong>RAG struggles</strong> sometimes</p>
</li>
<li><p>Different <strong>optimization techniques</strong> to improve accuracy</p>
</li>
<li><p>When <strong>not</strong> to overengineer things</p>
</li>
</ul>
<h2 id="heading-how-basic-rag-works"><strong>How Basic RAG Works</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755695334249/ccca3b93-7f8e-4d67-8d14-440a5a17a818.png?auto=compress,format&amp;format=webp" alt /></p>
<p>At its core, a RAG system does something simple:</p>
<ol>
<li><p><strong>Take user input</strong> → a query or question.</p>
</li>
<li><p><strong>Convert it into vector embeddings</strong> → numerical representations of meaning.</p>
</li>
<li><p><strong>Search the vector database</strong> → e.g., <strong>Qdrant</strong>, <strong>Pinecone</strong>, or <strong>FAISS</strong>.</p>
</li>
<li><p><strong>Retrieve relevant chunks</strong> of information.</p>
</li>
<li><p><strong>Send the retrieved chunks + user query</strong> to an LLM.</p>
</li>
<li><p><strong>LLM generates an answer</strong> using both its knowledge + provided context.</p>
</li>
</ol>
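<p>The six steps above can be sketched in a few lines of Python. This is a toy version under loud assumptions: <code>embed</code> just counts words instead of calling a real embedding model, a plain list stands in for the vector database, and <code>build_prompt</code> only builds the text you would send to an LLM.</p>

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Steps 2-4: embed the query, compare it with every chunk, keep the best."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context_chunks):
    """Step 5: combine the retrieved chunks with the user query."""
    context = "\n".join("- " + c for c in context_chunks)
    return "Answer using only this context:\n" + context + "\n\nQuestion: " + query


chunks = [
    "Qdrant is a vector database for similarity search.",
    "Ubuntu Server can be flashed to an SD card with Raspberry Pi Imager.",
    "Chunk overlap keeps context that spans two chunks.",
]
query = "Which vector database can I use for similarity search?"
top = retrieve(query, chunks, top_k=1)
prompt = build_prompt(query, top)  # step 6 would send this string to the LLM
```

<p>In a real pipeline you would swap <code>embed</code> for an embedding model and the list for Qdrant, Pinecone, or FAISS, but the shape of the flow stays the same.</p>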
<p>Sounds neat, right? But here’s the problem…</p>
<h2 id="heading-the-garbage-in-garbage-out-gigo-problem"><strong>The Garbage In, Garbage Out (GIGO) Problem</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755874470384/15076115-bc09-4665-b29a-1506714676b2.png" alt class="image--center mx-auto" /></p>
<p>RAG is <strong>only as good</strong> as the input you give it.<br />If the <strong>user’s query is vague, incomplete, or inconsistent</strong>, the <strong>retrieved context may not match well</strong>, leading to poor answers.</p>
<p>For example:</p>
<ul>
<li><p>Your vector DB has chunks about <strong>“machine learning model deployment”</strong></p>
</li>
<li><p>The user asks: “How to put my AI online?”</p>
</li>
<li><p>The retriever might miss relevant chunks because the <strong>wording doesn’t match</strong>, even though the intent is related.</p>
</li>
</ul>
<p>So, we need <strong>smarter techniques</strong> to <strong>bridge this gap</strong> and make RAG more accurate.</p>
<h2 id="heading-ways-to-make-rag-smarter"><strong>Ways to Make RAG Smarter</strong></h2>
<h3 id="heading-1-query-rewriting-simplest-fix"><strong>1. Query Rewriting (Simplest Fix)</strong></h3>
<p><strong>Idea:</strong><br />Before hitting the vector DB, <strong>rewrite the user’s query</strong> to make it <strong>clearer, more structured, and closer to the wording of your knowledge base</strong>.</p>
<p><strong>Flow:</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755875851942/d4e68fc4-2202-44de-afd0-e29c417f8fa4.png" alt class="image--center mx-auto" /></p>
<p><strong>How it helps:</strong></p>
<ul>
<li><p>Better embeddings → better chunk retrieval</p>
</li>
<li><p>More consistent matches with your knowledge base</p>
</li>
</ul>
<p><strong>When to use it:</strong></p>
<ul>
<li><p>Works great for small optimizations</p>
</li>
<li><p>Minimal performance impact</p>
</li>
</ul>
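<p>Here is a rough sketch of the idea using the “How to put my AI online?” example from earlier. Everything here is an assumption for illustration: <code>rewrite_query</code> uses a hard-coded lookup where a real pipeline would call an LLM, and <code>overlap</code> counts shared words as a crude stand-in for vector similarity.</p>

```python
import re

# Hypothetical stand-in for an LLM prompt such as "rewrite this query clearly".
REWRITES = {
    "how to put my ai online?": "How do I deploy a machine learning model to production?",
}


def rewrite_query(query):
    """Return the rewritten query, or the original if no rewrite is known."""
    return REWRITES.get(query.strip().lower(), query)


def words(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap(query, chunk):
    """Toy relevance score: how many words the query shares with the chunk."""
    return len(words(query).intersection(words(chunk)))


chunk = "Guide: machine learning model deployment on cloud servers"
raw = "How to put my AI online?"
before = overlap(raw, chunk)                 # raw query shares no words with the chunk
after = overlap(rewrite_query(raw), chunk)   # rewritten query shares several
```

<p>The rewritten query shares vocabulary with the stored chunk while the raw one does not, which is the whole point of rewriting before retrieval.</p>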
<h3 id="heading-2-multi-query-retrieval-more-accurate-slightly-slower"><strong>2. Multi-Query Retrieval (More Accurate, Slightly Slower)</strong></h3>
<p><strong>Idea:</strong><br />Instead of <strong>one improved query</strong>, generate <strong>multiple related queries</strong> that approach the user’s intent from <strong>different angles</strong>, then merge what they retrieve.</p>
<p><strong>Flow:</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755876320781/d7c9983c-7517-4566-bd41-4ad251b0ec27.png" alt class="image--center mx-auto" /></p>
<p><strong>Why it works:</strong></p>
<ul>
<li><p>Covers <strong>semantic variations</strong> the original query might miss</p>
</li>
<li><p>Retrieves <strong>more complete and accurate context</strong></p>
</li>
<li><p>Improves recall, because a relevant chunk only has to match one of the query variants</p>
</li>
</ul>
<p><strong>Trade-off:</strong></p>
<ul>
<li><p>Increases retrieval time slightly</p>
</li>
<li><p>Best for <strong>complex or ambiguous queries</strong></p>
</li>
</ul>
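<p>A minimal sketch of multi-query retrieval could look like this. The query variants are written by hand here (in practice an LLM would generate them), and <code>search</code> is a toy word-overlap retriever standing in for a real vector search.</p>

```python
import re


def words(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def search(query, chunks, top_k=1):
    """Toy retriever: rank chunks by shared words (stand-in for a vector search)."""
    ranked = sorted(chunks, key=lambda c: len(words(query).intersection(words(c))), reverse=True)
    return ranked[:top_k]


def multi_query_retrieve(queries, chunks, top_k=1):
    """Run every query variant and merge the results without duplicates."""
    seen, merged = set(), []
    for q in queries:
        for chunk in search(q, chunks, top_k):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged


chunks = [
    "Deploy a machine learning model behind a REST API.",
    "Host your model on a cloud server with Docker.",
    "History of neural networks since 1950.",
]
# Assumption: in a real pipeline an LLM would produce these variants.
variants = [
    "How do I deploy a machine learning model?",
    "How can I host a model on a cloud server?",
]
context = multi_query_retrieve(variants, chunks)
```

<p>Each variant pulls in a different chunk, so the merged context covers both the “deploy” and the “host” angle of the same question.</p>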
<h3 id="heading-3-hyde-approach-hypothetical-document-embeddings"><strong>3. HyDE Approach (Hypothetical Document Embeddings)</strong></h3>
<p>This one’s clever. Instead of <strong>directly searching</strong> the vector DB with the user’s query, we:</p>
<ol>
<li><p><strong>Generate a “hypothetical answer”</strong> using an LLM.</p>
</li>
<li><p>Convert this generated answer <strong>into vector embeddings</strong>.</p>
</li>
<li><p>Use those embeddings to <strong>search the vector DB</strong>.</p>
</li>
<li><p>Retrieve <strong>highly relevant chunks</strong>.</p>
</li>
<li><p>Finally, send the <strong>best chunks + user query</strong> to the LLM for final output.</p>
</li>
</ol>
<p><strong>Flow:</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755876981267/81582eed-ecd1-4fbb-826e-a2250e63514a.png" alt class="image--center mx-auto" /></p>
<p><strong>Why it works:</strong></p>
<ul>
<li><p>The LLM “imagines” the right answer first</p>
</li>
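<li><p>The imagined answer is written in the same style and vocabulary as your stored documents, so its embedding tends to land <strong>closer to the relevant chunks</strong> than the raw query’s embedding would</p>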
</li>
<li><p>Especially useful when user queries are <strong>vague or incomplete</strong></p>
</li>
</ul>
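<p>The steps above can be sketched like this. Treat it as a toy under stated assumptions: <code>fake_llm_answer</code> returns a canned draft where a real system would call an LLM, and <code>embed</code> counts words instead of using an embedding model.</p>

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fake_llm_answer(query):
    """Hypothetical stand-in for an LLM drafting a plausible answer."""
    return ("To put an AI model online, package the machine learning model "
            "and handle deployment to a server that exposes an API.")


def hyde_search(query, chunks, top_k=1):
    hypothetical_doc = fake_llm_answer(query)   # 1. imagine an answer
    doc_vec = embed(hypothetical_doc)           # 2. embed the imagined answer
    ranked = sorted(chunks, key=lambda c: cosine(doc_vec, embed(c)), reverse=True)  # 3. search
    return ranked[:top_k]                       # 4. best-matching chunks


chunks = [
    "Machine learning model deployment: serving a model from a server API.",
    "Online shopping tips for buying my first laptop.",
]
query = "How to put my AI online?"
plain = sorted(chunks, key=lambda c: cosine(embed(query), embed(c)), reverse=True)[0]
hyde = hyde_search(query, chunks)[0]
```

<p>With these toy vectors, the raw query is pulled toward the shopping chunk (it shares “online” and “my”), while the hypothetical answer lands on the deployment chunk.</p>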
<h2 id="heading-bonus-combine-multi-query-hyde-ultra-accuracy"><strong>Bonus: Combine Multi-Query + HyDE = Ultra Accuracy</strong></h2>
<p>For <strong>critical tasks</strong> where accuracy matters more than speed, you can <strong>combine techniques 2 and 3</strong>:</p>
<ul>
<li><p>Use HyDE to generate a better search base</p>
</li>
<li><p>Then perform multi-query retrieval</p>
</li>
<li><p>Finally, pick the <strong>chunks that show up most often</strong> across the result sets for the final answer</p>
</li>
</ul>
<p>This gives you <strong>noticeably better retrieval accuracy</strong>, but it’s slower, so use it wisely.</p>
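<p>The “pick the most frequent chunks” step is simple voting. Here is a small sketch, assuming each search returns a list of chunk ids (the ids below are invented for the example):</p>

```python
from collections import Counter


def vote(result_lists, top_n=2):
    """Keep the chunk ids that appear in the most result lists."""
    counts = Counter()
    for results in result_lists:
        # Count each chunk at most once per result list.
        counts.update(list(dict.fromkeys(results)))
    return [chunk_id for chunk_id, _ in counts.most_common(top_n)]


# Hypothetical chunk ids returned by three separate searches.
results = [
    ["c1", "c4", "c7"],
    ["c4", "c1", "c9"],
    ["c4", "c2", "c1"],
]
best = vote(results)
```

<p>Chunks <code>c1</code> and <code>c4</code> appear in all three result lists, so they win the vote, while one-off hits like <code>c9</code> are dropped.</p>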
<h2 id="heading-final-thoughts"><strong>Final Thoughts</strong></h2>
<p>The key takeaway here is:</p>
<blockquote>
<p><strong>RAG isn’t broken — it just needs help understanding what you really mean.</strong></p>
</blockquote>
<ul>
<li><p>Use <strong>query rewriting</strong> for quick wins</p>
</li>
<li><p>Use <strong>multi-query retrieval</strong> when precision matters</p>
</li>
<li><p>Use <strong>HyDE</strong> for vague queries or weak context</p>
</li>
<li><p>Combine techniques <strong>only when necessary</strong></p>
</li>
</ul>
<p>And most importantly:</p>
<blockquote>
<p><strong>Don’t overengineer your RAG pipeline to kill a cockroach</strong><br />Keep it simple unless your use case <strong>truly demands ultra accuracy</strong>.</p>
</blockquote>
]]></content:encoded></item><item><title><![CDATA[Common Failure Cases in RAG Systems And How to Fix Them Fast]]></title><description><![CDATA[Have you ever used ChatGPT, Gemini, or any other GenAI model and thought,“Wait… that answer doesn’t look right.”?
Maybe it made up a fake reference…Maybe it skipped something important…Or maybe it confidently told you something completely wrong.
Well...]]></description><link>https://blog.veerrajpoot.com/common-failure-cases-in-rag-systems-and-how-to-fix-them-fast</link><guid isPermaLink="true">https://blog.veerrajpoot.com/common-failure-cases-in-rag-systems-and-how-to-fix-them-fast</guid><category><![CDATA[ChaiCode]]></category><category><![CDATA[Retrieval-Augmented Generation]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Artificial Intelligence]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Wed, 20 Aug 2025 13:47:38 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755697608908/0e232292-65cb-4a36-beda-26be569a1761.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever used <strong>ChatGPT</strong>, <strong>Gemini</strong>, or any other GenAI model and thought,<br />“Wait… that answer doesn’t look right.”?</p>
<p>Maybe it made up a fake reference…<br />Maybe it skipped something important…<br />Or maybe it confidently told you something completely wrong.</p>
<p>Well, if you’re working with <strong>Retrieval-Augmented Generation (RAG)</strong> systems, these problems are even more common. RAG sounds powerful — combine an <strong>LLM</strong> with an <strong>external knowledge base</strong> — but in reality, most RAG pipelines <strong>break in subtle ways</strong>.</p>
<p>Don’t worry, though. In this article, I’ll explain:</p>
<ul>
<li><p>Why RAG systems fail</p>
</li>
<li><p>The <strong>5 most common failure cases</strong></p>
</li>
<li><p>How to <strong>fix them quickly</strong></p>
</li>
<li><p>Best practices to make your RAG pipelines more <strong>accurate and reliable</strong></p>
</li>
</ul>
<p>Let’s dive in.</p>
<h2 id="heading-poor-recall-missing-the-right-content"><strong>Poor Recall → Missing the Right Content</strong></h2>
<p>Imagine you ask your RAG-powered chatbot:<br /><em>"What are the eligibility criteria for the new AWS Activate program?"</em></p>
<p>And it replies:<br /><em>"Sorry, I couldn’t find anything relevant."</em></p>
<p>That’s <strong>poor recall</strong> — your retriever didn’t fetch the right context.</p>
<h3 id="heading-why-it-happens"><strong>Why it happens</strong></h3>
<ul>
<li><p>Your knowledge base isn’t updated.</p>
</li>
<li><p>Indexing missed some documents.</p>
</li>
<li><p>Query expansion is weak.</p>
</li>
</ul>
<h3 id="heading-quick-fixes"><strong>Quick Fixes</strong></h3>
<ul>
<li><p><strong>Enrich &amp; update your knowledge base</strong> → Keep your database fresh.</p>
</li>
<li><p><strong>Human-in-the-loop reviews</strong> → Get experts to validate coverage gaps.</p>
</li>
<li><p><strong>Query expansion</strong> → Add synonyms and related terms for better hits.</p>
</li>
</ul>
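<p>Query expansion can start out very simple. The sketch below uses a tiny hand-made synonym table (<code>SYNONYMS</code> is an assumption for this example; real systems often use a thesaurus or an LLM) and appends related terms to the query before it is sent to the retriever.</p>

```python
# Hypothetical hand-made synonym table.
SYNONYMS = {
    "criteria": ["requirements", "conditions"],
    "eligibility": ["qualification"],
}


def expand_query(query):
    """Append known synonyms so the retriever can match more wordings."""
    terms = query.lower().replace("?", "").split()
    expanded = list(terms)
    for term in terms:
        for synonym in SYNONYMS.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return " ".join(expanded)


expanded = expand_query("AWS Activate eligibility criteria?")
```

<p>A document that says “requirements” instead of “criteria” now has a chance of being matched.</p>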
<h2 id="heading-bad-chunking-broken-context"><strong>Bad Chunking → Broken Context</strong></h2>
<p>Chunking is how you split your documents before indexing.<br />Do it wrong, and your RAG system either:</p>
<ul>
<li><p>Misses important context, OR</p>
</li>
<li><p>Fetches too much irrelevant data, confusing the model.</p>
</li>
</ul>
<h3 id="heading-why-it-happens-1"><strong>Why it happens</strong></h3>
<ul>
<li><p>Splitting blindly by token count.</p>
</li>
<li><p>Ignoring semantic boundaries like paragraphs or sections.</p>
</li>
</ul>
<h3 id="heading-quick-fixes-1"><strong>Quick Fixes</strong></h3>
<ul>
<li><p><strong>Semantic chunking</strong> → Break at logical boundaries.</p>
</li>
<li><p><strong>Dynamic chunk sizing</strong> → Adjust based on document structure.</p>
</li>
<li><p><strong>Hybrid retrieval</strong> → Use both <strong>dense embeddings</strong> (concept-based) + <strong>sparse retrieval</strong> (keyword-based).</p>
</li>
</ul>
<blockquote>
<p><em>Tip:</em> Don’t just feed RAG random pieces of text. Make sure your chunks <strong>carry meaning</strong>.</p>
</blockquote>
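<p>As a rough illustration of semantic chunking, the sketch below splits on blank lines (paragraph boundaries) and packs whole paragraphs into a chunk until a word budget is reached. The function name and the word-based budget are assumptions; real pipelines usually count tokens and may also split on headings.</p>

```python
def semantic_chunks(text, max_words=40):
    """Split on paragraphs and pack whole paragraphs into chunks."""
    chunks, current = [], []
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = current + [paragraph]
        if current and len(" ".join(candidate).split()) > max_words:
            # Adding this paragraph would exceed the budget: close the chunk.
            chunks.append("\n\n".join(current))
            current = [paragraph]
        else:
            current = candidate
    if current:
        chunks.append("\n\n".join(current))
    return chunks


doc = (
    "Install the CLI first. It needs Python.\n\n"
    "Then create a project folder.\n\n"
    "Finally, run the deploy command and wait."
)
pieces = semantic_chunks(doc, max_words=12)
```

<p>No paragraph is ever cut in half, so each chunk still reads as a complete thought.</p>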
<h2 id="heading-query-drift-the-model-loses-the-plot"><strong>Query Drift → The Model Loses the Plot</strong></h2>
<p>Sometimes your retriever rewrites queries to improve results…<br />But in doing so, it <strong>changes the meaning</strong> of your question.</p>
<p>For example:<br /><strong>User query:</strong> “Show me the top 5 fastest-growing AI startups in India.”<br /><strong>Retriever reformulation:</strong> “AI startups India revenue report.”</p>
<p>Suddenly, you’re getting financial reports instead of growth data.</p>
<h3 id="heading-quick-fixes-2"><strong>Quick Fixes</strong></h3>
<ul>
<li><p><strong>Controlled query rewriting</strong> → Expand queries but keep intent intact.</p>
</li>
<li><p><strong>Context adherence checks</strong> → Track how much reformulated queries deviate.</p>
</li>
<li><p><strong>Prompt engineering</strong> → Use clearer, tighter instructions for the retriever.</p>
</li>
</ul>
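<p>A context adherence check can be as simple as measuring how many of the original query’s important words survive the rewrite. The sketch below is one possible approach, not a standard metric: the stopword list and the function name are assumptions, and production systems would more likely compare embeddings.</p>

```python
import re

# Small, hand-picked stopword list (an assumption for this example).
STOPWORDS = {"show", "me", "the", "in", "a", "an", "of", "for"}


def content_words(text):
    tokens = re.findall(r"[a-z0-9-]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def adherence(original, rewritten):
    """Share of the original query's content words kept by the rewrite."""
    wanted = content_words(original)
    if not wanted:
        return 1.0
    return len(wanted.intersection(content_words(rewritten))) / len(wanted)


original = "Show me the top 5 fastest-growing AI startups in India."
drifted = "AI startups India revenue report."
faithful = "top 5 fastest-growing AI startups India"

drift_score = adherence(original, drifted)
good_score = adherence(original, faithful)
```

<p>The drifted rewrite keeps only half of the content words (it lost “top”, “5”, and “fastest-growing”), so a pipeline could reject it and fall back to the original query.</p>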
<h2 id="heading-outdated-indexes-stale-knowledge"><strong>Outdated Indexes → Stale Knowledge</strong></h2>
<p>RAG systems fail badly on <strong>recent events</strong>.<br />Ask one about <strong>OpenAI’s latest model release</strong>, and it might give you data from <strong>2022</strong>.</p>
<h3 id="heading-why-it-happens-2"><strong>Why it happens</strong></h3>
<ul>
<li><p>Indexes aren’t updated frequently.</p>
</li>
<li><p>No metadata on document freshness.</p>
</li>
</ul>
<h3 id="heading-quick-fixes-3"><strong>Quick Fixes</strong></h3>
<ul>
<li><p><strong>Automate index updates</strong> → Schedule frequent rebuilds.</p>
</li>
<li><p><strong>Add versioning &amp; timestamps</strong> → Track when data was last updated.</p>
</li>
<li><p><strong>Automated fact-checking</strong> → Flag outdated or inconsistent answers.</p>
</li>
</ul>
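<p>Once every document carries a timestamp, filtering out stale ones is straightforward. A minimal sketch, assuming each document is a dict with an <code>updated</code> date (the field name and sample documents are made up):</p>

```python
from datetime import date, timedelta


def fresh_only(documents, today, max_age_days=365):
    """Keep only documents updated within the last max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [d for d in documents if d["updated"] >= cutoff]


documents = [
    {"text": "2022 model overview", "updated": date(2022, 3, 1)},
    {"text": "2025 model release notes", "updated": date(2025, 8, 1)},
]
fresh = fresh_only(documents, today=date(2025, 8, 20))
```

<p>You could apply this filter before retrieval, or use the timestamp to rank newer chunks higher instead of dropping old ones completely.</p>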
<h2 id="heading-hallucinations-the-llm-makes-stuff-up"><strong>Hallucinations → The LLM Makes Stuff Up</strong></h2>
<p>Even with RAG, models sometimes <strong>invent facts</strong> that don’t exist anywhere.<br />Why? Weak or irrelevant context.</p>
<p>Example:<br /><em>"Who founded SpaceX?"</em><br />RAG retrieves <strong>nothing useful</strong> → LLM hallucinates:<br /><em>"It was founded by Steve Jobs in 2010."</em></p>
<h3 id="heading-quick-fixes-4"><strong>Quick Fixes</strong></h3>
<ul>
<li><p><strong>Better retrieval + reranking</strong> → Ensure high-quality, relevant chunks.</p>
</li>
<li><p><strong>Structured output formats</strong> → Force models to stick to facts.</p>
</li>
<li><p><strong>Continuous context optimization</strong> → Improve query expansion + filtering.</p>
</li>
</ul>
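<p>One cheap guard against this kind of hallucination is to refuse to generate when retrieval comes back weak. The sketch below assumes the retriever returns <code>(similarity, text)</code> pairs; the threshold value and function name are illustrative choices, not fixed rules.</p>

```python
def answer_or_abstain(scored_chunks, min_score=0.5):
    """Return usable context, or None when nothing is relevant enough.

    scored_chunks: list of (similarity, text) pairs from the retriever.
    """
    strong = [text for score, text in scored_chunks if score >= min_score]
    if not strong:
        return None  # caller should reply "I don't know" instead of calling the LLM
    return strong


weak = answer_or_abstain([(0.2, "Unrelated text about laptops.")])
good = answer_or_abstain([
    (0.9, "SpaceX was founded by Elon Musk in 2002."),
    (0.1, "Unrelated text about laptops."),
])
```

<p>When the function returns <code>None</code>, answering “I couldn’t find that” is far better than letting the model invent a founder.</p>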
<h2 id="heading-quick-summary"><strong>Quick Summary</strong></h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Failure Case</strong></td><td><strong>Quick Fixes</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Poor Recall</td><td>Update DB, query expansion, expert review</td></tr>
<tr>
<td>Bad Chunking</td><td>Semantic chunking, dynamic sizing, hybrid retrieval</td></tr>
<tr>
<td>Query Drift</td><td>Controlled rewriting, context checks, better prompts</td></tr>
<tr>
<td>Outdated Indexes</td><td>Auto-updates, versioning, fact-checking</td></tr>
<tr>
<td>Hallucinations</td><td>Better retrieval and reranking, structured outputs, context optimization</td></tr>
</tbody>
</table>
</div><h2 id="heading-final-thoughts"><strong>Final Thoughts</strong></h2>
<p>RAG is <strong>powerful</strong> — but fragile.<br />Most failures happen <strong>before generation</strong> — at the retrieval and chunking stages.</p>
<p>If you:</p>
<ul>
<li><p>Keep your indexes fresh</p>
</li>
<li><p>Use smart chunking</p>
</li>
<li><p>Control query rewriting</p>
</li>
<li><p>Tune retrieval + reranking</p>
</li>
</ul>
<p>…your RAG system becomes <strong>far more reliable</strong> and <strong>much harder to break</strong>.</p>
<p>In short:</p>
<blockquote>
<p><strong>Good RAG ≠ Good LLM.</strong><br /><strong>Good RAG = Good Retrieval + Good Generation + Good Context.</strong></p>
</blockquote>
]]></content:encoded></item><item><title><![CDATA[Retrieval-Augmented Generation (RAG)]]></title><description><![CDATA[Have you ever asked ChatGPT something like:

“Who won the IPL 2024 finals?”

…and it confidently gave you the wrong answer?
That happens because most AI models, including GPT, don’t actually know everything. They’re trained on huge amounts of data, b...]]></description><link>https://blog.veerrajpoot.com/retrieval-augmented-generation-rag</link><guid isPermaLink="true">https://blog.veerrajpoot.com/retrieval-augmented-generation-rag</guid><category><![CDATA[ChaiCode]]></category><category><![CDATA[Retrieval-Augmented Generation]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[#ai-tools]]></category><category><![CDATA[generative ai]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Wed, 20 Aug 2025 13:37:10 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755696974586/c6bfee68-60ff-4f2c-9b75-8066b992c815.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever asked <strong>ChatGPT</strong> something like:</p>
<blockquote>
<p>“Who won the IPL 2024 finals?”</p>
</blockquote>
<p>…and it confidently gave you the <strong>wrong answer</strong>?</p>
<p>That happens because most AI models, including GPT, <strong>don’t actually know everything</strong>. They’re trained on <strong>huge amounts of data</strong>, but their knowledge is <strong>frozen</strong> at the time of training. If you ask about <strong>recent events</strong> or <strong>company-specific data</strong>, they might <strong>hallucinate</strong> — meaning they <strong>make things up</strong>.</p>
<p>Now imagine this instead:</p>
<ul>
<li><p>You have your <strong>own knowledge base</strong> (a large source of information)</p>
</li>
<li><p>AI first <strong>searches</strong> in your database</p>
</li>
<li><p>Then it <strong>understands</strong> the context</p>
</li>
<li><p>Finally, it <strong>generates</strong> a smart, relevant answer</p>
</li>
</ul>
<p>That’s exactly what <strong>Retrieval-Augmented Generation (RAG)</strong> does.<br />It <strong>bridges the gap</strong> between an AI model’s <strong>training data</strong> and your <strong>real-world, up-to-date information</strong>.</p>
<h2 id="heading-why-do-we-need-rag"><strong>Why Do We Need RAG?</strong></h2>
<p>Think of a <strong>library</strong>.</p>
<ul>
<li><p>GPT is like a librarian who has read <strong>millions of books</strong>.</p>
</li>
<li><p>But the librarian <strong>can’t remember everything perfectly</strong>.</p>
</li>
<li><p>Sometimes, you want <strong>fresh information</strong> or <strong>specific documents</strong> that aren’t in their memory.</p>
</li>
</ul>
<p><strong>RAG</strong> acts like giving the librarian a <strong>catalog system</strong>:</p>
<ul>
<li><p>First, they <strong>search the right shelf</strong> (retrieval)</p>
</li>
<li><p>Then, they <strong>summarize and explain</strong> (generation)</p>
</li>
</ul>
<p>This makes AI:<br />More <strong>accurate</strong><br />More <strong>reliable</strong><br />More <strong>context-aware</strong><br />Perfect for <strong>real-time knowledge</strong></p>
<h2 id="heading-how-rag-works-retriever-generator"><strong>How RAG Works (Retriever + Generator)</strong></h2>
<p>Let’s break it into two main components:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755695334249/ccca3b93-7f8e-4d67-8d14-440a5a17a818.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-step-1-retriever"><strong>Step 1 — Retriever 🔍</strong></h3>
<ul>
<li><p>Think of it like <strong>Google Search</strong> for your knowledge base.</p>
</li>
<li><p>It <strong>finds the most relevant documents</strong> based on your query from the Data Source.</p>
</li>
<li><p>Uses <a target="_blank" href="https://blog.veerrajpoot.com/explaining-vector-embeddings-to-mom"><strong>vector embeddings</strong></a> to compare meaning, not just keywords.</p>
</li>
</ul>
<p>For example:</p>
<blockquote>
<p>You ask: “How to install Ubuntu on Raspberry Pi?”</p>
</blockquote>
<ul>
<li><p>Retriever looks into your docs/wiki</p>
</li>
<li><p>Finds the most relevant guides</p>
</li>
<li><p>Sends them to the generator</p>
</li>
</ul>
<h3 id="heading-step-2-generator"><strong>Step 2 — Generator ✍️</strong></h3>
<ul>
<li><p>This is your <strong>LLM</strong> (e.g., GPT, Claude, Gemma).</p>
</li>
<li><p>It <strong>reads the retrieved documents</strong> and uses them to <strong>create an accurate, human-like answer</strong>.</p>
</li>
</ul>
<p>Example answer:</p>
<blockquote>
<p>“To install Ubuntu on a Raspberry Pi, download the Ubuntu Server image, flash it using Raspberry Pi Imager, insert the SD card, and boot your Pi. Make sure to enable SSH if needed.”</p>
</blockquote>
<h3 id="heading-quick-example-flow"><strong>Quick Example Flow</strong></h3>
<p><strong>You ask:</strong> “Who is the CEO of OpenAI?”</p>
<ul>
<li><p><strong>Retriever:</strong> Searches your knowledge base → finds a doc saying “Sam Altman is the CEO.”</p>
</li>
<li><p><strong>Generator:</strong> Reads it → gives you a natural reply:</p>
</li>
</ul>
<blockquote>
<p>“The current CEO of OpenAI is Sam Altman.”</p>
</blockquote>
<h2 id="heading-what-is-indexing"><strong>What is Indexing?</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755696831245/48601894-e52d-4178-8e28-d084b5a5e68b.png" alt class="image--center mx-auto" /></p>
<p>Before AI can <strong>retrieve</strong> anything, we need a <strong>searchable structure</strong>. That’s where <strong>indexing</strong> comes in.</p>
<p>Think of indexing like a <strong>table of contents</strong> in a book:</p>
<ul>
<li><p>It breaks your documents into <strong>chunks</strong></p>
</li>
<li><p>Converts them into <strong>vectors</strong> (we’ll get there in a sec)</p>
</li>
<li><p>Stores them in a <strong>vector database</strong> like <strong>Pinecone, Weaviate, Milvus, or FAISS</strong></p>
</li>
<li><p>When you search, AI <strong>compares your query vector</strong> to these stored vectors and fetches the closest matches.</p>
</li>
</ul>
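<p>Here is what indexing looks like in miniature. <code>TinyVectorIndex</code> is a made-up, in-memory stand-in for a real vector database, and its <code>embed</code> method only counts words; a real setup would call an embedding model and store the vectors in something like FAISS or Pinecone.</p>

```python
import math
import re
from collections import Counter


class TinyVectorIndex:
    """In-memory stand-in for a vector database (illustration only)."""

    def __init__(self):
        self.entries = []  # list of (vector, chunk_text)

    @staticmethod
    def embed(text):
        # Toy embedding: word counts instead of a learned vector.
        return Counter(re.findall(r"[a-z0-9]+", text.lower()))

    @staticmethod
    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def add(self, chunk):
        """Indexing: convert the chunk to a vector and store both."""
        self.entries.append((self.embed(chunk), chunk))

    def search(self, query, top_k=1):
        """Retrieval: compare the query vector with every stored vector."""
        query_vec = self.embed(query)
        ranked = sorted(self.entries, key=lambda e: self.cosine(query_vec, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:top_k]]


index = TinyVectorIndex()
for chunk in ["Flash the Ubuntu image to an SD card.", "Store API keys in a secrets manager."]:
    index.add(chunk)
hits = index.search("How do I store my API keys?")
```

<p>Indexing happens once up front (<code>add</code>), while searching happens on every question (<code>search</code>), which is why the index is worth building carefully.</p>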
<h2 id="heading-why-we-perform-vectorization"><strong>Why We Perform Vectorization?</strong></h2>
<p>Normal keyword search sucks for AI. Why?</p>
<ul>
<li><p>If you search <strong>“AI laws”</strong>, a normal search engine might skip documents that say <strong>“legal regulations for artificial intelligence.”</strong></p>
</li>
<li><p>But AI needs <strong>meaning</strong>, not exact words.</p>
</li>
</ul>
<p>That’s why we use <strong>vector embeddings</strong>:</p>
<ul>
<li><p>We convert <strong>text → numerical vectors</strong> in a <strong>high-dimensional space</strong>.</p>
</li>
<li><p>Sentences with <strong>similar meaning</strong> end up <strong>closer together</strong>.</p>
</li>
<li><p>This makes retrieval <strong>semantic</strong> instead of <strong>keyword-based</strong>.</p>
</li>
</ul>
<p>Example:</p>
<ul>
<li><p>“Install Ubuntu on Pi” → Vector A</p>
</li>
<li><p>“Setup Raspberry Pi with Ubuntu” → Vector B</p>
</li>
<li><p>A &amp; B are <strong>close</strong> in vector space → retriever understands both are related</p>
</li>
</ul>
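<p>“Close in vector space” is usually measured with cosine similarity. The vectors below are made-up 3-dimensional numbers purely for illustration (real embeddings have hundreds or thousands of dimensions and come from a model):</p>

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented vectors, not output of any real embedding model.
vector_a = [0.90, 0.80, 0.10]   # "Install Ubuntu on Pi"
vector_b = [0.85, 0.75, 0.20]   # "Setup Raspberry Pi with Ubuntu"
vector_c = [0.10, 0.20, 0.90]   # an unrelated sentence

sim_ab = cosine_similarity(vector_a, vector_b)
sim_ac = cosine_similarity(vector_a, vector_c)
```

<p>A and B point in almost the same direction, so their similarity is close to 1, while A and C score much lower. The retriever simply returns the chunks with the highest scores.</p>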
<h2 id="heading-why-do-rags-exist"><strong>Why Do RAGs Exist?</strong></h2>
<p>We created RAG because <strong>LLMs alone aren’t enough</strong>:</p>
<ul>
<li><p>They <strong>forget</strong> private, domain-specific knowledge</p>
</li>
<li><p>They <strong>hallucinate</strong> when uncertain</p>
</li>
<li><p>They <strong>can’t access real-time data</strong></p>
</li>
<li><p>They <strong>don’t know your internal documents</strong></p>
</li>
</ul>
<p>RAG lets you <strong>connect AI to your data</strong> safely, without retraining the whole model.<br />That’s why <strong>companies, chatbots, SaaS platforms, and knowledge assistants</strong> rely on RAG.</p>
<h2 id="heading-why-we-perform-chunking"><strong>Why We Perform Chunking</strong></h2>
<p>Imagine dumping a <strong>500-page PDF</strong> into ChatGPT.<br />It may not even fit in the model’s context window, and even if it does, the model would <strong>struggle</strong> to find the relevant parts efficiently.</p>
<p>That’s why we <strong>split documents into smaller pieces</strong> → called <strong>chunks</strong>.</p>
<ul>
<li><p>Typical chunk size = <strong>300 to 800 tokens</strong></p>
</li>
<li><p>Each chunk is indexed separately</p>
</li>
<li><p>This makes searching <strong>faster</strong> and <strong>more accurate</strong></p>
</li>
</ul>
<h2 id="heading-why-overlapping-is-used-in-chunking"><strong>Why Overlapping is Used in Chunking</strong></h2>
<p>Sometimes, the <strong>important context</strong> lies <strong>between two chunks</strong>.</p>
<p>Example:</p>
<ul>
<li><p>Chunk 1 ends with: “The API key should be stored securely.”</p>
</li>
<li><p>Chunk 2 starts with: “Never commit secrets to GitHub.”</p>
</li>
</ul>
<p>If we don’t overlap, AI might miss the <strong>connection</strong> between them.</p>
<p>That’s why we use <strong>sliding windows</strong>:</p>
<ul>
<li><p>Each chunk <strong>shares some sentences</strong> with the previous one</p>
</li>
<li><p>Makes it far less likely that the AI <strong>loses context</strong> at a chunk boundary</p>
</li>
</ul>
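<p>A sliding window is easy to sketch. The function below is a minimal version under one big assumption: it counts words instead of tokens so it stays dependency-free. <code>chunkWithOverlap</code> and its parameters are names made up for this example, and the tiny sizes are only there to keep the output readable.</p>

```javascript
// Splits text into chunks of `chunkSize` words, where each chunk repeats
// the last `overlap` words of the previous one (a sliding window).
// Assumption: words stand in for tokens to keep the sketch simple.
function chunkWithOverlap(text, chunkSize, overlap) {
  if (overlap >= chunkSize) {
    throw new Error('overlap must be smaller than chunkSize');
  }
  const words = text.split(/\s+/).filter(Boolean);
  const step = chunkSize - overlap;
  const chunks = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(' '));
    // Stop once this window has reached the end of the text.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}

const text =
  'The API key should be stored securely. Never commit secrets to GitHub.';
const chunks = chunkWithOverlap(text, 8, 3);
console.log(chunks);
```

<p>Both chunks contain “stored securely. Never”, so the link between the two sentences survives the split. In a real pipeline you would pick sizes in the token range mentioned earlier rather than 8 words.</p>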
<h2 id="heading-final-thoughts"><strong>Final Thoughts</strong></h2>
<p><strong>Retrieval-Augmented Generation (RAG)</strong> is like giving your AI <strong>Google + Brain Power</strong>:</p>
<ul>
<li><p>Retriever → finds the <strong>right knowledge</strong></p>
</li>
<li><p>Generator → writes <strong>smart answers</strong></p>
</li>
<li><p>Indexing + Vectorization → make search <strong>semantic</strong></p>
</li>
<li><p>Chunking + Overlap → make results <strong>accurate</strong></p>
</li>
</ul>
<p>If you’re building:</p>
<ul>
<li><p>AI-powered <strong>chatbots</strong> 🤖</p>
</li>
<li><p><strong>Document assistants</strong></p>
</li>
<li><p><strong>Knowledge search systems</strong></p>
</li>
<li><p><strong>Customer support bots</strong></p>
</li>
</ul>
<p>…you’ll <strong>almost certainly</strong> benefit from RAG.</p>
<h2 id="heading-quick-summary"><strong>Quick Summary</strong></h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Concept</strong></td><td><strong>Why It Matters</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>RAG</strong></td><td>Combines retrieval + generation for accurate answers</td></tr>
<tr>
<td><strong>Retriever</strong></td><td>Finds the most relevant documents</td></tr>
<tr>
<td><strong>Generator</strong></td><td>Uses docs + LLM to create responses</td></tr>
<tr>
<td><strong>Indexing</strong></td><td>Stores documents in a searchable vector format</td></tr>
<tr>
<td><strong>Vectorization</strong></td><td>Finds meaning, not just keywords</td></tr>
<tr>
<td><strong>Chunking</strong></td><td>Splits large docs for faster, better search</td></tr>
<tr>
<td><strong>Overlap</strong></td><td>Preserves context between chunks</td></tr>
</tbody>
</table>
</div>]]></content:encoded></item><item><title><![CDATA[Agentic AI: How AI Becomes a Doer, Not Just a Thinker]]></title><description><![CDATA[When we think about AI chatbots, most of us picture something like Zomato’s assistant – it can tell you about restaurants, help with orders, and maybe suggest food. But if you ask it to solve a math equation or write a Python script, it won’t. Why? B...]]></description><link>https://blog.veerrajpoot.com/agentic-ai-how-ai-becomes-a-doer-not-just-a-thinker</link><guid isPermaLink="true">https://blog.veerrajpoot.com/agentic-ai-how-ai-becomes-a-doer-not-just-a-thinker</guid><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[AI]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[ChaiCode]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Mon, 18 Aug 2025 15:10:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755529984185/87098722-1464-4337-8758-14f5bdc17476.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When we think about AI chatbots, most of us picture something like Zomato’s assistant – it can tell you about restaurants, help with orders, and maybe suggest food. But if you ask it to solve a math equation or write a Python script, it won’t. Why? Because it’s designed for one job.</p>
<p>Now imagine we could give AI a “toolbox” – like a set of apps or functions – and let it pick the right one depending on the task. That’s where <strong>Agentic AI</strong> comes in.</p>
<h2 id="heading-what-are-ai-agents">What are AI Agents?</h2>
<p>Think of an <strong>AI agent</strong> as not just a chatbot, but like a person who can think, plan, and use tools to get things done.</p>
<ul>
<li><p>A <strong>normal LLM</strong> just predicts text based on what you give it.</p>
</li>
<li><p>An <strong>agentic AI model</strong> takes it further: it reasons step by step, decides what to do, uses tools, and then gives you the final answer.</p>
</li>
</ul>
<p>It’s like the difference between a student who memorizes formulas vs. one who knows how to apply formulas, use a calculator, and solve real problems.</p>
<h2 id="heading-how-agents-work">How Agents Work</h2>
<p>Here’s the flow:</p>
<ol>
<li><p><strong>You ask a question</strong> → “What’s the weather in Delhi tomorrow?”</p>
</li>
<li><p><strong>AI checks its toolbox</strong> → “I don’t know live weather, but I see a weather API tool available.”</p>
</li>
<li><p><strong>AI decides the step</strong> → “Use the weather API with location=Delhi.”</p>
</li>
<li><p><strong>Tool runs and returns data</strong> → “Sunny, 34°C.”</p>
</li>
<li><p><strong>AI explains back to you</strong> → “It’ll be sunny in Delhi tomorrow with a high of 34°C.”</p>
</li>
</ol>
<p>So the AI doesn’t magically know the weather. It just knows how to pick the right tool and use it.</p>
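<p>The five steps above can be sketched in a few lines. This is a toy, not a real agent: <code>decideNextStep</code> is a hypothetical rule-based stand-in for the LLM’s decision, and the <code>weather</code> tool returns hard-coded data instead of calling an API.</p>

```javascript
// Stub tool: a real agent would call a weather API here.
const tools = {
  weather: (city) => ({ city, forecast: 'Sunny', highC: 34 }),
};

// Hypothetical stand-in for the LLM's decision step.
// A real agent would ask the model which tool to call, if any.
function decideNextStep(question) {
  const match = question.match(/weather in (\w+)/i);
  if (match) {
    return { type: 'tool', name: 'weather', input: match[1] };
  }
  return { type: 'answer', text: 'I can answer that directly.' };
}

function runAgent(question) {
  // Steps 2 and 3: check the toolbox and decide what to do.
  const step = decideNextStep(question);
  if (step.type === 'answer') return step.text;

  // Step 4: the tool runs and returns data.
  const result = tools[step.name](step.input);

  // Step 5: turn the raw tool result into a reply for the user.
  return `It'll be ${result.forecast.toLowerCase()} in ${result.city} with a high of ${result.highC}°C.`;
}

const reply = runAgent("What's the weather in Delhi tomorrow?");
console.log(reply);
```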
<h2 id="heading-the-role-of-tools">The Role of Tools</h2>
<p>Tools are <strong>functions or APIs</strong> we expose to the AI.</p>
<p>Example:</p>
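<pre><code class="lang-plaintext">// Each tool is a plain function the agent is allowed to call.
// getWeather and queryDB are placeholders for your own implementations.
const tools = {
  // Demo only: never eval() untrusted input in production.
  calculator: (expression) =&gt; eval(expression),
  weather: (city) =&gt; getWeather(city),
  dbSearch: (query) =&gt; queryDB(query),
};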
</code></pre>
<p>Now when AI sees “27 × (32 + 67) – 93 ÷ 45” it knows:</p>
<ul>
<li><p>Use the <strong>calculator</strong> tool.</p>
</li>
<li><p>Parse the expression.</p>
</li>
<li><p>Return the answer.</p>
</li>
</ul>
<p>If you ask about sales data, it can call <strong>dbSearch</strong>. If you ask about the weather, it calls <strong>weather</strong>.</p>
<p>The AI itself doesn’t do the math or fetch live info – it <strong>delegates the task</strong>.</p>
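<p>Since a calculator tool receives text written by the model, passing it straight to <code>eval</code> is risky. Below is one possible alternative: a small hand-written evaluator that only understands numbers, <code>+ - * /</code>, and parentheses. <code>calculate</code> is a name chosen for this sketch, not part of any library.</p>

```javascript
// A small arithmetic evaluator for a calculator tool, so the agent never
// has to pass model-written text to eval(). Supports + - * / and parentheses.
function calculate(expression) {
  // Accept the typographic operators used in prose as well.
  const src = expression
    .replace(/×/g, '*')
    .replace(/÷/g, '/')
    .replace(/[–−]/g, '-');
  if (/[^\d\s.+\-*\/()]/.test(src)) {
    throw new Error('Unsupported character in expression');
  }
  const tokens = src.match(/\d+(?:\.\d+)?|[-+*\/()]/g) || [];
  let pos = 0;

  // factor := number | "(" expression ")" | "-" factor
  function parseFactor() {
    const token = tokens[pos++];
    if (token === '(') {
      const value = parseExpression();
      if (tokens[pos++] !== ')') throw new Error('Expected ")"');
      return value;
    }
    if (token === '-') return -parseFactor();
    const value = Number(token);
    if (Number.isNaN(value)) throw new Error(`Unexpected token: ${token}`);
    return value;
  }

  // term := factor (("*" | "/") factor)*
  function parseTerm() {
    let value = parseFactor();
    while (tokens[pos] === '*' || tokens[pos] === '/') {
      const op = tokens[pos++];
      const right = parseFactor();
      value = op === '*' ? value * right : value / right;
    }
    return value;
  }

  // expression := term (("+" | "-") term)*
  function parseExpression() {
    let value = parseTerm();
    while (tokens[pos] === '+' || tokens[pos] === '-') {
      const op = tokens[pos++];
      const right = parseTerm();
      value = op === '+' ? value + right : value - right;
    }
    return value;
  }

  const result = parseExpression();
  if (pos !== tokens.length) throw new Error('Unexpected trailing input');
  return result;
}

const answer = calculate('27 × (32 + 67) – 93 ÷ 45');
console.log(answer);
```

<p>Anything that is not plain arithmetic, such as a function call, is rejected instead of being executed.</p>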
<h2 id="heading-why-agentic-ai-is-powerful">Why Agentic AI is Powerful</h2>
<ul>
<li><p><strong>Flexibility</strong> → The same model can do many tasks if given the right tools.</p>
</li>
<li><p><strong>Scalability</strong> → Add/remove tools without retraining the model.</p>
</li>
<li><p><strong>Reliability</strong> → Tools return exact results; the AI just interprets them.</p>
</li>
<li><p><strong>Human-like reasoning</strong> → The AI acts like an assistant that knows when to Google, when to calculate, and when to just answer directly.</p>
</li>
</ul>
<h2 id="heading-real-world-examples">Real-World Examples</h2>
<ul>
<li><p><strong>ChatGPT with Browsing</strong> → When you ask about current events, it calls a search tool.</p>
</li>
<li><p><strong>LangChain Agents</strong> → Define multiple tools (search, calculator, database) and let the model pick.</p>
</li>
<li><p><strong>Copilot for Devs</strong> → Calls code search, compiler, or documentation functions.</p>
</li>
</ul>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>Agentic AI is not just about “chatting.” It’s about <strong>thinking + acting + using tools</strong>.<br />Just like we don’t solve everything with memory, AI shouldn’t either. We check Google, we use calculators, we read docs. Agents do the same – they just need to know what tools are in their kit.</p>
<p>So, the future of AI is not just bigger models – it’s <strong>smarter agents with the right tools</strong>.</p>
<blockquote>
<p>Next time you use an AI, think: <em>Is this just a chatbot, or is it an agent using tools behind the scenes?</em></p>
</blockquote>
]]></content:encoded></item><item><title><![CDATA[Building a Thinking Model from a Non-Thinking Model Using Chain-of-Thought (COT) Prompting]]></title><description><![CDATA[When we think about an AI chatbot — like the one Zomato uses — it only gives solutions based on what it’s designed for.  
If you ask it to write code, it won’t.Why? Because it’s not trained for that — it’s following patterns, not reasoning.
Most lang...]]></description><link>https://blog.veerrajpoot.com/building-a-thinking-model-from-a-non-thinking-model-using-chain-of-thought-cot-prompting</link><guid isPermaLink="true">https://blog.veerrajpoot.com/building-a-thinking-model-from-a-non-thinking-model-using-chain-of-thought-cot-prompting</guid><category><![CDATA[ChaiCode]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Programming Blogs]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Fri, 15 Aug 2025 15:25:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/LO7JT0Cb9Z8/upload/e932223f2f7a9a21c72b8ad244c7db31.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When we think about an AI chatbot — like the one Zomato uses — it only gives solutions based on what it’s designed for.  </p>
<p>If you ask it to <em>write code</em>, it won’t.<br />Why? Because it’s not trained for that — it’s following patterns, not reasoning.</p>
<p>Most language models work like this:</p>
<ul>
<li>Input → Predict most likely text → Output.<br />  No planning. No deep thought. Just autocomplete on steroids.</li>
</ul>
<p>But as developers, we can actually <strong>programmatically force</strong> the model to reason step-by-step.<br />That’s where <strong>Chain-of-Thought (CoT)</strong> comes in — not as a “prompt trick,” but as <strong>part of your system design</strong>.</p>
<h2 id="heading-the-problem-llms-dont-think-by-default"><strong>The Problem: LLMs Don’t Think By Default</strong></h2>
<p>LLMs are trained to complete text, not to break problems into logical substeps.<br />When given:</p>
<pre><code class="lang-plaintext">20 + 32 × 67 + 93 - 267 ÷ 45
</code></pre>
<p>They might jump straight to an answer.<br />If they make a small mistake early, the final answer is wrong — and you won’t even know why.</p>
<h2 id="heading-the-solution-force-thinking-with-start-think-evaluate-output"><strong>The Solution: Force Thinking with START → THINK → EVALUATE → OUTPUT</strong></h2>
<p>Instead of just asking for the solution to a complex problem,<br />we define a <strong>protocol</strong> that the AI must follow:</p>
<ol>
<li><p><strong>START</strong> — Understand the problem.</p>
</li>
<li><p><strong>THINK</strong> — Break it into smaller steps.</p>
</li>
<li><p><strong>EVALUATE</strong> — Wait for a human or another AI to check the step.</p>
</li>
<li><p><strong>OUTPUT</strong> — Only after all checks, give the final answer.</p>
</li>
</ol>
<p>By forcing the AI to follow this sequence — one step at a time — we can catch mistakes before the final output.</p>
<h2 id="heading-the-code"><strong>The Code</strong></h2>
<pre><code class="lang-plaintext">import 'dotenv/config';
import { OpenAI } from 'openai';

const client = new OpenAI();

async function main() {
  const SYSTEM_PROMPT = `
    You are an AI assistant who works on START, THINK, EVALUATE, OUTPUT format.
    Always break down problems, evaluate correctness, and only give the final output after all thinking steps are done.

    Output JSON format:
    { "step": "START | THINK | EVALUATE | OUTPUT", "content": "string" }
  `;

  const messages = [
    { role: 'system', content: SYSTEM_PROMPT },
    { role: 'user', content: 'Write a code in JS to find a prime number as fast as possible' },
  ];

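  // Keep calling the model until it emits the final OUTPUT step.
  while (true) {
    const response = await client.chat.completions.create({
      model: 'gpt-4.1-mini',
      messages,
    });

    // The system prompt asks for one JSON object per reply;
    // JSON.parse throws if the model strays from that format.
    const rawContent = response.choices[0].message.content;
    const parsed = JSON.parse(rawContent);

    // Feed the step back so the next call sees the full reasoning history.
    messages.push({ role: 'assistant', content: JSON.stringify(parsed) });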

    if (parsed.step === 'START') {
      console.log(`🔥`, parsed.content);
      continue;
    }

    if (parsed.step === 'THINK') {
      console.log(`\t🧠`, parsed.content);

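      // Placeholder evaluator: it always approves the step.
      // Swap in a human review or a second model call to actually check it.
      messages.push({
        role: 'developer',
        content: JSON.stringify({
          step: 'EVALUATE',
          content: 'Nice, you are going on the correct path',
        }),
      });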

      continue;
    }

    if (parsed.step === 'OUTPUT') {
      console.log(`🤖`, parsed.content);
      break;
    }
  }

  console.log('Done...');
}

main();
</code></pre>
<p>source: <a target="_blank" href="https://github.com/piyushgarg-dev/genai-js-1.0">https://github.com/piyushgarg-dev/genai-js-1.0</a></p>
<h2 id="heading-why-this-works"><strong>Why This Works</strong></h2>
<ul>
<li><p><strong>No blind guessing</strong> — AI must show its reasoning.</p>
</li>
<li><p><strong>Error catching</strong> — Mistakes can be caught in <code>EVALUATE</code> before they reach the user, once a real evaluator replaces the hard-coded approval in the code above.</p>
</li>
<li><p><strong>Composable</strong> — You can swap the evaluator with another AI (LLM-as-a-judge) or a human reviewer.</p>
</li>
<li><p><strong>Transparent</strong> — Every decision step is visible for debugging.</p>
</li>
</ul>
<h2 id="heading-beyond-math"><strong>Beyond Math</strong></h2>
<p>This technique isn’t just for calculations.<br />You can apply it to:</p>
<ul>
<li><p><strong>Debugging code</strong></p>
</li>
<li><p><strong>Medical diagnosis</strong></p>
</li>
<li><p><strong>Legal reasoning</strong></p>
</li>
<li><p><strong>Complex business workflows</strong></p>
</li>
</ul>
<p>Any time <strong>reasoning matters more than speed</strong>, this approach turns your LLM into a <strong>thinking partner</strong> instead of a pattern-matcher.</p>
]]></content:encoded></item><item><title><![CDATA[Importance of System Prompts & Types of Prompting in AI]]></title><description><![CDATA[When we think about an AI chatbot, like the one Zomato uses, it’s designed to do one thing well: help you with food ordering, restaurant info, or delivery updates.
If you ask Zomato’s chatbot to write a Python script, it’s not going to start coding f...]]></description><link>https://blog.veerrajpoot.com/importance-of-system-prompts-and-types-of-prompting-in-ai</link><guid isPermaLink="true">https://blog.veerrajpoot.com/importance-of-system-prompts-and-types-of-prompting-in-ai</guid><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[AWS]]></category><category><![CDATA[chatgpt]]></category><category><![CDATA[General Programming]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[ChaiCode]]></category><category><![CDATA[Chaiaurcode]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Fri, 15 Aug 2025 15:10:12 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/nGoCBxiaRO0/upload/c3ba395cb1630d4eb8deabba4637ce7f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When we think about an AI chatbot, like the one Zomato uses, it’s designed to do one thing well: help you with food ordering, restaurant info, or delivery updates.</p>
<p>If you ask Zomato’s chatbot to <em>write a Python script</em>, it’s not going to start coding for you. Why?<br />Because it’s <strong>only working within the scope it has been assigned</strong>.</p>
<p>That “scope” is set through something called a <strong>system prompt</strong> - the AI’s hidden set of instructions that define its purpose, tone, and boundaries.</p>
<h2 id="heading-what-is-a-system-prompt"><strong>What is a System Prompt?</strong></h2>
<p>A <strong>system prompt</strong> is like the <em>AI’s job description</em>. It tells the AI:</p>
<ul>
<li><p>Who it should be (you can define a name, tone, personality, style)</p>
</li>
<li><p>What it should and shouldn’t do</p>
</li>
<li><p>How it should answer (structure of output)</p>
</li>
</ul>
<p>If AI is a chef, the system prompt is the recipe card you hand them before they start cooking. Everything after that follows those instructions.</p>
<p>Example:</p>
<ul>
<li><p>Without a system prompt: <em>AI gives a neutral, general answer.</em></p>
</li>
<li><p>With a system prompt: <em>AI answers exactly as instructed, e.g., “Explain in the style of a service manager.”</em></p>
</li>
</ul>
<h2 id="heading-why-system-prompts-matter"><strong>Why System Prompts Matter</strong></h2>
<ol>
<li><p><strong>Scope Control</strong> – Keeps the AI focused on its purpose (like Zomato bot sticking to food queries).</p>
</li>
<li><p><strong>Consistency</strong> – Ensures the same tone and style across responses.</p>
</li>
<li><p><strong>Role Setting</strong> – Makes AI behave like a teacher, coder, marketer, or even a poet.</p>
</li>
<li><p><strong>User Experience</strong> – Gives a unique personality to the interaction.</p>
</li>
</ol>
<p>Without a well-designed system prompt, the AI can feel generic or confused.</p>
<h2 id="heading-types-of-prompting"><strong>Types of Prompting</strong></h2>
<p>Now that you know what a system prompt is, let’s look at the <strong>types of prompting</strong> you can use when interacting with AI models like GPT.</p>
<h3 id="heading-1-zero-shot-prompting"><strong>1. Zero-Shot Prompting</strong></h3>
<p>The model is given a direct question or task without any prior example.</p>
<ul>
<li><strong>Example:</strong><br />  <em>"Translate 'I am learning AI' into French."</em></li>
</ul>
<p><strong>When to use:</strong> When the task is simple and widely understood by the model (common cases).</p>
<h3 id="heading-2-few-shot-prompting"><strong>2. Few-Shot Prompting</strong></h3>
<p>You give <strong>a few</strong> examples before asking the main question. (A handful, often 2 to 5, is usually enough; add more only if the output still drifts from the format you want.)</p>
<ul>
<li><p><strong>Example:</strong></p>
<p>  English: Hello → French: Bonjour<br />  English: Thank you → French: Merci<br />  English: How are yo....(more examples)</p>
</li>
</ul>
<p><strong>When to use:</strong> To get a specific style, tone, or format.</p>
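<p>In code, few-shot prompting usually means placing the example pairs in the conversation before the real question. Here is a small sketch that only builds the message list and makes no API call. <code>buildFewShotMessages</code> is a made-up helper name, and the <code>{ role, content }</code> shape assumes an OpenAI-style chat format.</p>

```javascript
// Builds a chat-style messages array from example pairs plus the real query.
// Assumption: the { role, content } shape of OpenAI-style chat APIs.
function buildFewShotMessages(instruction, examples, query) {
  const messages = [{ role: 'system', content: instruction }];
  for (const example of examples) {
    // Each example is shown as a past user turn and the ideal assistant reply.
    messages.push({ role: 'user', content: example.input });
    messages.push({ role: 'assistant', content: example.output });
  }
  // The real question goes last, so the model continues the pattern.
  messages.push({ role: 'user', content: query });
  return messages;
}

const messages = buildFewShotMessages(
  'Translate English to French. Reply with the translation only.',
  [
    { input: 'Hello', output: 'Bonjour' },
    { input: 'Thank you', output: 'Merci' },
  ],
  'I am learning AI'
);
console.log(messages);
```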
<h3 id="heading-3-chain-of-thought-prompting"><strong>3. Chain-of-Thought Prompting</strong></h3>
<p>The model is encouraged to break a problem down into small sub-problems and work through them one by one, reasoning about each step before giving the final output.</p>
<ul>
<li><strong>Example:</strong><br />  <em>"Explain your reasoning before solving: 27 × 14."</em></li>
</ul>
<p><strong>When to use:</strong> For reasoning-heavy tasks like math, logic, or planning.</p>
<h3 id="heading-4-self-consistency-prompting"><strong>4. Self-Consistency Prompting</strong></h3>
<p>Think of this like asking multiple friends the same question and then going with the answer most of them agree on.</p>
<p>In AI’s case, instead of generating just one chain of thought, it generates <strong>multiple reasoning paths</strong> and then picks the answer that comes up the most.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755270308762/c5588bab-74c4-41d0-97da-d1f0d0c2bcca.png" alt /></p>
<p><strong>AI Process:</strong></p>
<ul>
<li><p>Reasoning Path 1 → Answer: 405</p>
</li>
<li><p>Reasoning Path 2 → Answer: 405</p>
</li>
<li><p>Reasoning Path 3 → Answer: 402</p>
</li>
</ul>
<p><strong>Final Answer:</strong> 405 (picked because it appeared most often).</p>
<p><strong>When to use:</strong></p>
<ul>
<li><p>High-stakes reasoning tasks (math, planning, legal analysis).</p>
</li>
<li><p>When you need <strong>more reliability</strong> and less chance of a random mistake.</p>
</li>
</ul>
<p><strong>How it works:</strong><br />It’s like cross-checking your homework before submission, either on your own or with a friend: different “thoughts” compete, and the most consistent one wins.</p>
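<p>The “pick the most common answer” step is just a majority vote. A minimal sketch, assuming the final answer of each reasoning path has already been extracted as a string (<code>majorityVote</code> is a name chosen for this example):</p>

```javascript
// Picks the answer that appears most often across several reasoning paths.
// Assumption: each path's final answer is already extracted as a string.
function majorityVote(answers) {
  const counts = new Map();
  for (const answer of answers) {
    counts.set(answer, (counts.get(answer) || 0) + 1);
  }
  let best = null;
  let bestCount = 0;
  // On a tie, the answer that was seen first wins.
  for (const [answer, count] of counts) {
    if (count > bestCount) {
      best = answer;
      bestCount = count;
    }
  }
  return { answer: best, votes: bestCount, total: answers.length };
}

const result = majorityVote(['405', '405', '402']);
console.log(result);
```

<p>In a real setup you would call the model several times with a non-zero temperature to get the different paths, then feed their answers into a function like this one.</p>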
<h3 id="heading-5-persona-prompting"><strong>5. Persona Prompting</strong></h3>
<p>This is when you tell the AI to <strong>pretend to be a specific person, profession, or character</strong> so it answers from that perspective.</p>
<p>It’s like asking your friend, <em>“Imagine you’re a chef, how would you make Maggi?”</em> — their answer will change based on the role they take.</p>
<p><strong>Example:</strong><br />Instruction: You are an experienced financial advisor. Explain the basics of budgeting to a college student.</p>
<p><strong>AI Output:</strong><br /><em>"Alright, first thing you need to do is track your expenses... Think of your income as a pizza and your budget as how you slice it."</em></p>
<p><strong>When to use:</strong></p>
<ul>
<li><p>Customer support (AI acts like a polite support rep).</p>
</li>
<li><p>Education (AI acts like a history teacher or coding tutor).</p>
</li>
<li><p>Creative writing (AI acts like Shakespeare or a movie director).</p>
</li>
</ul>
<p><strong>How it works:</strong><br />It sets the <strong>context and tone</strong> before the AI even sees your question, so responses feel more natural and aligned with that persona.</p>
<blockquote>
<p>Just like Zomato’s chatbot won’t write code for you, AI systems will only perform as well as the <strong>instructions</strong> they’re given.<br />The system prompt is the hidden boss that defines those instructions.</p>
<p>Pair it with the right prompting technique (zero-shot, few-shot, chain-of-thought, self-consistency, or persona-based) and you can make AI work exactly the way you want.</p>
<p>The next time you talk to an AI, remember: the magic starts <em>before</em> you type.</p>
</blockquote>
]]></content:encoded></item><item><title><![CDATA[Explaining GPT To Babies]]></title><description><![CDATA[We Indians generally love parrots.Even if you don’t have one in your house, you probably know someone who does.
Now imagine you have a special parrot.Not just any parrot - this one doesn’t only repeat words you say.This parrot has read millions of bo...]]></description><link>https://blog.veerrajpoot.com/explaining-gpt-to-babies</link><guid isPermaLink="true">https://blog.veerrajpoot.com/explaining-gpt-to-babies</guid><category><![CDATA[explainingtechtobabies]]></category><category><![CDATA[Babies]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[ChaiCode]]></category><category><![CDATA[chatgpt]]></category><category><![CDATA[GPT 3]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Wed, 13 Aug 2025 13:48:35 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755092889270/e1147551-a0cc-4aba-85e0-5c91e0f6517b.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We Indians generally love parrots.<br />Even if you don’t have one in your house, you probably know someone who does.</p>
<p>Now imagine you have a <em>special parrot</em>.<br />Not just any parrot - this one doesn’t only repeat words you say.<br />This parrot has read <strong>millions of books</strong>, <strong>heard endless stories</strong>, and <strong>seen countless conversations</strong>.<br />So when you ask it something, instead of just repeating, it <em>thinks for a moment</em> and gives you a brand-new, meaningful answer.</p>
<h2 id="heading-why-is-gpt-called-gpt"><strong>Why is GPT called GPT?</strong></h2>
<p>GPT stands for <strong>Generative Pretrained Transformer</strong>:</p>
<ul>
<li><p><strong>Generative</strong> → It can <em>generate</em> new sentences, stories, or answers.</p>
</li>
<li><p><strong>Pretrained</strong> → Before you even talk to it, it has already <em>learned from a huge amount of text</em> from books, articles, and the internet.</p>
</li>
<li><p><strong>Transformer</strong> → A special type of computer model that understands patterns in text and figures out what should come next.</p>
</li>
</ul>
<h2 id="heading-how-it-works-parrot-version"><strong>How it Works (Parrot Version)</strong></h2>
<p>Think of GPT like that intelligent parrot:</p>
<ol>
<li><p><strong>It listens to your question</strong> → “What’s the capital of India?”</p>
</li>
<li><p><strong>Remembers all the reading it has done</strong> → "Oh! I’ve read this many times in books and articles."</p>
</li>
<li><p><strong>Speaks back in its own words</strong> → “The capital of India is New Delhi.”</p>
</li>
</ol>
<h2 id="heading-why-it-feels-magical"><strong>Why It Feels Magical</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755092498488/a7a11d86-d8da-4ba9-9b9d-1f4e608da277.webp" alt /></p>
<p>The magic is that GPT doesn’t just repeat facts.<br />You can ask it to tell a bedtime story, solve a riddle, write a poem, or explain maths - and it will do it <em>instantly</em> like a Disney movie parrot who never forgets anything it learned.</p>
<blockquote>
<p>So, GPT is like that clever parrot in our neighborhood who not only repeats what it hears but also learns so much that it can talk about new things you never taught it directly. The only difference? GPT doesn’t need food, water, or a cage - it just needs data and some good training. Next time you chat with GPT, think of it as a super-parrot that’s read the whole world’s books, newspapers, and websites… and is now ready to chat with you about anything from “why the sky is blue” to “how to make a paper rocket.”</p>
</blockquote>
]]></content:encoded></item><item><title><![CDATA[Understand Tokenization As A Fresher]]></title><description><![CDATA[What is Tokenization?
When a computer works with text, it can’t directly understand sentences the way we understand.It needs to break the text into smaller pieces so it can process them step-by-step.
Those smaller pieces are called tokens.Basically, ...]]></description><link>https://blog.veerrajpoot.com/understand-tokenization-as-a-fresher</link><guid isPermaLink="true">https://blog.veerrajpoot.com/understand-tokenization-as-a-fresher</guid><category><![CDATA[Tokenization]]></category><category><![CDATA[token]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[ChaiCode]]></category><category><![CDATA[chatgpt]]></category><category><![CDATA[gemini]]></category><category><![CDATA[generative ai]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Wed, 13 Aug 2025 13:27:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/tikhtH3QRSQ/upload/9b30acfd977ada357744569ecbb13369.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-what-is-tokenization"><strong>What is Tokenization?</strong></h2>
<p>When a computer works with text, it can’t directly understand sentences the way we do.<br />It needs to break the text into smaller pieces so it can process them step by step.</p>
<p>Those smaller pieces are called tokens.<br />Basically, tokenization is the process of splitting text into tokens.</p>
<h2 id="heading-what-do-i-mean"><strong>What Do I Mean?</strong></h2>
<p><strong>Example in Plain English</strong><br />Think of a sentence:</p>
<pre><code class="lang-plaintext">I love samosas
</code></pre>
<p>When we tokenize it, we can break it up in different ways.</p>
<p>Let’s say we split it into <strong>word-level tokens</strong>:</p>
<pre><code class="lang-plaintext">["I", "love", "samosas"]
</code></pre>
<p>Now, <strong>Character-level tokens:</strong></p>
<pre><code class="lang-plaintext">["I", " ", "l", "o", "v", "e", " ", "s", "a", "m", "o", "s", "a", "s"]
</code></pre>
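<p>Both splits are easy to reproduce in JavaScript. This is only a sketch of the two simple strategies shown above, not how any particular tokenizer is implemented:</p>

```javascript
const sentence = 'I love samosas';

// Word-level: split on whitespace.
const wordTokens = sentence.split(/\s+/);

// Character-level: every character, spaces included.
const charTokens = Array.from(sentence);

console.log(wordTokens);
console.log(charTokens);
```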
<p>Generally, in Machine Learning &amp; AI, the tokenizer converts each token into a unique number (a token ID) from its vocabulary, because computers work better with numbers than with raw text. The mapping is exact, so a misspelled or differently cased word will usually produce different IDs.<br />For example:</p>
<pre><code class="lang-plaintext">[
  46530,
  4,
  55530,
  4,
  82663
]
</code></pre>
<blockquote>
<p>Note: I used my recently made tool <strong>Tea Tokenizer</strong> here.<br />Link: <a target="_blank" href="https://teatokenizer.monc.space/">https://teatokenizer.monc.space</a></p>
</blockquote>
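<p>To see how text turns into numbers, here is a toy encoder. The vocabulary and its IDs are invented for this example and have nothing to do with the IDs Tea Tokenizer produces. Real tokenizers ship with much larger vocabularies.</p>

```javascript
// Toy vocabulary invented for this example; the IDs are arbitrary.
const vocab = new Map([
  ['I', 1],
  [' ', 2],
  ['love', 3],
  ['samosas', 4],
]);
const UNKNOWN_ID = 0;

// Split into words and the spaces between them, then look each piece up.
function encode(text) {
  const pieces = text.split(/( )/).filter((piece) => piece !== '');
  return pieces.map((piece) => (vocab.has(piece) ? vocab.get(piece) : UNKNOWN_ID));
}

const ids = encode('I love samosas');
const lowercaseIds = encode('i love samosas');
console.log(ids);
console.log(lowercaseIds);
```

<p>Notice that the space gets its own ID and repeats, just like in the output above, and that a lowercase “i” is not in this toy vocabulary, so it falls back to the unknown ID.</p>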
<h2 id="heading-why-is-it-important"><strong>Why is it Important?</strong></h2>
<p>Tokenization is like splitting a long message into smaller parts so the computer can read it one step at a time.</p>
<ul>
<li><p>Without tokenization, the computer sees the entire sentence as one giant block of text and can’t figure out where words or parts of words start and end.</p>
</li>
<li><p>With tokenization, the text becomes small chunks (tokens) that the computer can store, search, and process efficiently.</p>
</li>
</ul>
<p>In short, tokenization is cutting big text into small, meaningful chunks so a computer can handle it.</p>
]]></content:encoded></item><item><title><![CDATA[Explaining Vector Embeddings To Mom]]></title><description><![CDATA[When I first learned about vector embeddings, I thought it was fascinating. But whenever I discuss it up with others, they gave me that "oh no, another scary tech thing" look - as if it was rocket science and they’d need a PhD to understand it.
So, I...]]></description><link>https://blog.veerrajpoot.com/explaining-vector-embeddings-to-mom</link><guid isPermaLink="true">https://blog.veerrajpoot.com/explaining-vector-embeddings-to-mom</guid><category><![CDATA[generative ai]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[vector embeddings]]></category><dc:creator><![CDATA[Rahul Singh (Veer)]]></dc:creator><pubDate>Wed, 13 Aug 2025 12:59:09 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/zzl7_zgdTm4/upload/70ce335e36e795c658367e93f171bffc.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When I first learned about <strong>vector embeddings</strong>, I thought it was fascinating. But whenever I discuss it up with others, they gave me that <em>"oh no, another scary tech thing"</em> look - as if it was rocket science and they’d need a PhD to understand it.</p>
<p>So, I decided to take on a challenge: <strong>Could I explain vector embeddings to my mom without using any technical jargon?</strong></p>
<p>At first, I did some searching and asked an LLM for help, and both pointed me toward the usual mangoes-and-fruits example. But recently, my sister was getting ready for school and yelling, <em>“Where’s my uniform?”</em>  </p>
<p>My mom replied:</p>
<blockquote>
<p>“Kitni baar boli hoon ki uniform upar wale shelf me hai!”<br />(<em>How many times have I told you, the uniform is on the top shelf!</em>)</p>
</blockquote>
<p>That’s when I realized I could explain the concept using this everyday moment (much like we express ourselves using viral Instagram memes).</p>
<h2 id="heading-explanation-the-wardrobe-example"><strong>Explanation: The Wardrobe Example</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755089086246/1ae3d6c4-418d-41e1-82eb-b157a02f60f4.jpeg" alt class="image--center mx-auto" /></p>
<p>In our Indian home, clothes are never just thrown in. Mom has a system: my sibling and I share a wardrobe in my sibling’s room, each of us behind a separate door. Mom’s wardrobe is in her room.</p>
<p>Each wardrobe has a left door and a right door. Behind each door there are three shelves: the top shelf for uniforms and professional outfits, the middle shelf for T-shirts and other topwear, and the bottom shelf for pants, jeans, and other bottomwear. Innerwear hangs on hooks inside the doors.</p>
<p>It’s neat, predictable, and easy to find things.</p>
<h2 id="heading-mapping-turning-clothes-into-coordinates"><strong>Mapping: Turning Clothes into Coordinates</strong></h2>
<p>Let’s say I want my blue T-shirt, and I call out to my mother in the kitchen: “Mom, where’s my blue T-shirt?” After the usual dialogue, “Saare kaam main hi karoon… +200 more lines” (<em>Do I have to do all the work myself…</em>), she will tell me the exact place of my T-shirt without even going to the room.</p>
<p>Now, for you (tech people), we can describe it as:<br /><code>[Room: My room, Door: Left, Shelf: Middle, Type: T-shirt]</code></p>
<p>If we turn that into numbers:</p>
<ul>
<li><p>My room -&gt;   0</p>
</li>
<li><p>Left door -&gt; 0</p>
</li>
<li><p>Middle shelf -&gt; 1</p>
</li>
</ul>
<p>Now the position of my T-shirt is: <code>[0, 0, 1, T-shirt]</code></p>
<p>Similarly:</p>
<ul>
<li><p>My sibling’s T-shirt: <code>[0, 1, 1, T-shirt]</code></p>
</li>
<li><p>Mom’s T-shirt: <code>[1, 0, 1, T-shirt]</code></p>
</li>
</ul>
<p>These numbers are like coordinates on a map, telling us exactly where something lives in our <em>“wardrobe space.”</em></p>
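<p>If you like seeing things as code, here is a minimal Python sketch of the same mapping. The lookup tables are my own illustration: the values for my room, the left door, and the middle shelf come from the list above, the right door and Mom’s room are inferred from the example coordinates, and the top and bottom shelf numbers are assumptions.</p>

```python
# Hypothetical lookup tables for the wardrobe example.
# "top" and "bottom" shelf numbers are assumed for illustration.
ROOMS = {"my room": 0, "mom's room": 1}
DOORS = {"left": 0, "right": 1}
SHELVES = {"top": 0, "middle": 1, "bottom": 2}


def to_coordinates(room, door, shelf):
    """Turn a spoken location into a list of numbers."""
    return [ROOMS[room], DOORS[door], SHELVES[shelf]]


my_tshirt = to_coordinates("my room", "left", "middle")
sibling_tshirt = to_coordinates("my room", "right", "middle")
mom_tshirt = to_coordinates("mom's room", "left", "middle")

print(my_tshirt, sibling_tshirt, mom_tshirt)
```

<p>The three lists match the coordinates above, just without the “T-shirt” label at the end.</p>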
<h2 id="heading-wardrobe-to-vectors"><strong>Wardrobe To Vectors</strong></h2>
<p>Now Imagine This…</p>
<p>Instead of clothes, what if we were arranging words, sentences, images, or sounds: basically, any data or information?</p>
<p>In AI, vector embeddings store words, sentences, images, or sounds in a multi-dimensional space (different rooms or wardrobes) where:</p>
<p>Similar meanings are stored close together (like my T-shirt and my sibling’s T-shirt, which share the same shelf coordinate and differ only in which door they sit behind).</p>
<p>Different meanings are far apart (like my T-shirt and a cooking pan).</p>
<p><strong>Example:</strong> All my books, notebooks, and stationery are kept close to each other in my room, but my bike key hangs in the living room.</p>
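<p>To put a number on “close” and “far apart”, one common choice is the straight-line (Euclidean) distance between two coordinate lists. This is only a sketch: the cooking pan’s coordinates are made up for illustration, and real embeddings are learned by a model and usually have hundreds of dimensions rather than three.</p>

```python
import math


def distance(a, b):
    """Straight-line (Euclidean) distance between two coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


my_tshirt = [0, 0, 1]
sibling_tshirt = [0, 1, 1]
cooking_pan = [3, 0, 0]  # hypothetical spot, far away in the kitchen

print(distance(my_tshirt, sibling_tshirt))  # small: similar items
print(distance(my_tshirt, cooking_pan))     # larger: unrelated items
```

<p>The smaller the distance, the more similar the two items are treated as being.</p>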
<h2 id="heading-why-vector-embeddings-matter">Why Vector Embeddings Matter</h2>
<p>By storing meanings as coordinates, AI can:</p>
<ul>
<li><p>Find similar things (search “T-shirt” and get all T-shirts)</p>
</li>
<li><p>Group related items (keep all uniforms together)</p>
</li>
<li><p>Understand relationships (knowing my and my sibling’s T-shirts are similar kinds of items)</p>
</li>
</ul>
<p>This is why embeddings are not limited to today’s AI tools: they were already being used in search engines, chatbots, recommendation systems &amp; more.</p>
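<p>Here is a rough sketch of what “find similar things” can look like in code: rank every item by its distance from the one you asked about and keep the closest few. The item names, the coordinates, and the <code>nearest</code> helper are all hypothetical, and real systems use specialised vector indexes instead of sorting everything.</p>

```python
import math

# Hypothetical items with hand-picked coordinates [room, door, shelf].
WARDROBE = {
    "my T-shirt": [0, 0, 1],
    "sibling's T-shirt": [0, 1, 1],
    "mom's T-shirt": [1, 0, 1],
    "my jeans": [0, 0, 2],
    "cooking pan": [3, 0, 0],
}


def distance(a, b):
    """Straight-line (Euclidean) distance between two coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, items, k=3):
    """Return the names of the k items closest to the query item."""
    target = items[query]
    others = [name for name in items if name != query]
    others.sort(key=lambda name: distance(target, items[name]))
    return others[:k]


print(nearest("my T-shirt", WARDROBE))
```

<p>With these made-up numbers, the three closest items are all clothes, and the cooking pan never makes the list.</p>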
<h2 id="heading-how-you-can-explain-it-too">How You Can Explain It Too</h2>
<ol>
<li><p>Pick a familiar system - wardrobes, library bookshelves, a kitchen spice rack. I recommend picking a topic that came up recently in your house, something your mother (or whoever you are explaining it to) has just discussed or encountered. Also, break things into sections (dimensions).</p>
</li>
<li><p>Although I didn’t explain this part to my mother, you can show how each item’s location can be described as numbers, as explained in the section “Mapping: Turning Clothes into Coordinates.”</p>
</li>
<li><p>Connect the example to how AI stores meanings &amp; highlight how “closeness” in this space means similarity.</p>
</li>
</ol>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Just like Mom knows exactly where my jeans are without opening every shelf, AI knows where “mango” is and which other words are sitting right next to it.  </p>
<p><strong>Message from my Mom</strong></p>
<blockquote>
<p>“Thanks for reading this article, and I know you’re surely gonna forget tomorrow where your favourite jeans are, but don’t forget to like, and share your thoughts or anything I have missed. Follow me to get more articles like this.”</p>
</blockquote>
]]></content:encoded></item></channel></rss>