Wednesday, May 6, 2026

Vibe coding & Shipping to Production

Vibe Coding Is Not the Problem. Shipping to Production Is. We Fixed That.

Last week marked a milestone for me. I gave a lecture to 10,000 AI enthusiasts, hoping they learned something. It sounds like an odd thing to count, but that was a target I set for myself at the end of last year. I finally made it.

Ok, enough boasting. Here's what I actually want to talk about.

I think we have passed the phase of "this is just hype, the bubble will burst." It won't. AI coding tools now write thousands of lines instead of tens, and testing will absorb the time we've cut from development.

It's not delivering 100% accuracy yet, but it's improving every day -- and the last year is proof.

Yes, we are talking about vibe coding. Spec-driven coding. AI-assisted development. Whatever you want to call it.

Now the debate has shifted. It's no longer "does it work?" -- it's "is it ready for production?"

And the honest answer? Not yet. Not by itself.

What AI gives you today is a quick PoC, not a production-ready solution. But here's the thing -- that's the same problem we've always had, even with human developers writing code from just an idea. No requirements, no architecture standards, no design specs, no test cases. The output is fast, but it's fragile.

𝗧𝗵𝗶𝘀 𝗶𝘀 𝗲𝘅𝗮𝗰𝘁𝗹𝘆 𝘁𝗵𝗲 𝗽𝗿𝗼𝗯𝗹𝗲𝗺 𝘄𝗲 𝗮𝗿𝗲 𝘀𝗼𝗹𝘃𝗶𝗻𝗴. Spec-driven development gave us hope, but it didn't solve the problem completely. Something was still missing.

We faced this challenge internally. So we built a platform that integrates vibe coding with a full SDLC lifecycle -- Jira, Confluence, Git, DevOps tools -- with all the controls: stage gates, mandatory standards, approval workflows.

Here's how it works:

A development team creates a project on the platform. They configure the SDLC steps, approval gates, coding standards, and integrations. That configuration is then pushed directly into the AI coding tools -- Cursor, Claude Code, Codex, Antigravity -- through the Model Context Protocol (MCP).

𝗧𝗵𝗲 𝗔𝗜 𝗶𝘀 𝗻𝗼𝘄 𝗴𝗼𝘃𝗲𝗿𝗻𝗲𝗱.

It cannot skip requirements. It cannot write code before the design is approved. It cannot deploy without passing tests. Every phase is enforced at the tool level, not just by asking the AI to behave.

When a user says "build me a School Management System," the AI doesn't jump to code. It starts with requirements. It asks what you need. It structures your answers into formal requirements. It waits for approval. Only then does it move to design. Then development. Then testing. Each transition requires human sign-off.
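To make the gating idea concrete, here is a minimal sketch of how phase enforcement like this could work. The phase names, class, and API below are illustrative assumptions, not the platform's actual implementation:

```python
from enum import Enum

class Phase(Enum):
    REQUIREMENTS = 1
    DESIGN = 2
    DEVELOPMENT = 3
    TESTING = 4
    DEPLOYMENT = 5

class GovernedProject:
    """Tracks SDLC phases; every transition requires explicit sign-off."""
    def __init__(self):
        self.phase = Phase.REQUIREMENTS
        self.approved = set()

    def approve(self, phase: Phase) -> None:
        """Record a human approval for the given phase."""
        self.approved.add(phase)

    def advance(self) -> None:
        """Move to the next phase -- only if the current one is approved."""
        if self.phase not in self.approved:
            raise PermissionError(f"{self.phase.name} not approved yet")
        self.phase = Phase(self.phase.value + 1)

project = GovernedProject()
try:
    project.advance()                    # the AI tries to skip ahead
except PermissionError as err:
    print(err)                           # REQUIREMENTS not approved yet

project.approve(Phase.REQUIREMENTS)      # human sign-off recorded
project.advance()
print(project.phase.name)                # DESIGN
```

The point of the sketch: the gate lives in the workflow engine, not in the prompt, so the AI cannot talk its way past it.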

𝗔𝗹𝗹 𝘁𝗵𝗲 𝗳𝗹𝗲𝘅𝗶𝗯𝗶𝗹𝗶𝘁𝘆 𝗮𝗻𝗱 𝘀𝗽𝗲𝗲𝗱 𝗼𝗳 𝘃𝗶𝗯𝗲 𝗰𝗼𝗱𝗶𝗻𝗴 𝗯𝘂𝘁 𝗰𝗼𝗻𝘁𝗿𝗼𝗹𝗹𝗲𝗱 𝗯𝘆 𝗦𝗗𝗟𝗖 𝗽𝗿𝗼𝗰𝗲𝘀𝘀𝗲𝘀 𝘁𝗵𝗮𝘁 𝗲𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 𝗼𝗿𝗴𝗮𝗻𝗶𝘇𝗮𝘁𝗶𝗼𝗻𝘀 𝗵𝗮𝘃𝗲 𝗿𝗲𝗳𝗶𝗻𝗲𝗱 𝗼𝘃𝗲𝗿 𝗱𝗲𝗰𝗮𝗱𝗲𝘀.

𝗧𝗵𝗲 𝗿𝗲𝘀𝘂𝗹𝘁: 𝗽𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻-𝗿𝗲𝗮𝗱𝘆 𝘀𝗼𝗹𝘂𝘁𝗶𝗼𝗻𝘀, 𝗻𝗼𝘁 𝘁𝗵𝗿𝗼𝘄𝗮𝘄𝗮𝘆 𝗽𝗿𝗼𝘁𝗼𝘁𝘆𝗽𝗲𝘀.

The irony? The platform doesn't slow anyone down. It eliminates the rework cycle. A governed project that takes 5 days is faster than an ungoverned one that takes 2 days to code and 3 weeks to fix, document, review, and re-deploy.

The fastest path to production is the one that doesn't require rewriting everything after the first compliance review.

Now, with development agents packed inside vibe-coding tools -- augmenting the BA, the developer, the tester, the DevOps engineer -- this enables something bigger than any single tool.

𝗗𝗲𝘃𝗲𝗹𝗼𝗽𝗺𝗲𝗻𝘁 𝗮𝘀 𝗮 𝗦𝗲𝗿𝘃𝗶𝗰𝗲: 𝗗𝗮𝗮𝗦.

A governed, AI-powered development capability that any team can plug into and start shipping production-ready software from day one.

Diabetics: Best Daal to consume

Every Indian household has daal. But no one told diabetics which one to actually eat. Daals have always been a core part of the Indian diet, and they should remain so. They offer protein, fiber, minerals, and of course, taste. But not all daals behave the same when it comes to blood sugar.

I’ve broken it down from #6 to #1 so you can make smarter choices without confusion. Here’s the simple sequence:

#6 → Toor (Arhar) daal
Good protein (~22g), but lower fiber → higher glycaemic response. Especially risky when paired with rice.

#5 → Urad daal
High protein + fiber. On paper, excellent. But often eaten with rice (idli/dosa), which can spike sugar.

#4 → Masoor daal
Rich in protein and iron. Moderate impact on glucose -- good, but not the best.

#3 → Moong daal
Easy to digest, gut-friendly, low GI (30–40). A strong option.

#2 → Chana daal
High satiety, low GI/GL, keeps you full longer. One of the best for diabetes.

#1 → Whole daals (with skin)
Highest fiber (15–20g), slowest glucose absorption, richest in minerals → least sugar spike.

Simple rule to follow → The more natural the daal, the better it is for your sugar control. But daals are ~60% carbohydrates, so quantity matters just as much as quality. I often see people making this mistake: “Healthy khana hai” → eating lobiya curry + daal together → sugar spikes.

Instead, fix your plate to:
→ 50% vegetables + salad
→ 25% dal
→ 25% grains

Eat daal as a part of the plate, not the entire meal. Done right, daal acts like medicine. Done wrong, it can quietly raise your sugars.

#DiabetesDiet #Nutrition #DiabetesManagement #IndianDiet #BloodSugarControl

How Delivery/Project Leaders Learned: Training & Tuning

Somewhere along the way, “training” and “tuning” became the new “blockchain” and “microservices” -- widely used, poorly understood, and wielded with dangerous overconfidence.

Suddenly, every second delivery leader is saying things like: “We should train our own model.” “Let’s tune it for better accuracy.” “We trained the model for automation.” And so on.

And I sit there thinking… Train what? Tune what? You have not even tuned your sprint backlog properly.

The problem is not the words -- it’s the blind confidence. Look, I am not against people learning new things. In fact, I encourage it.

But there is a difference between:

  • Understanding something vs
  • Dropping buzzwords in review meetings, exec presentations, or client calls like seasoning on biryani -- except here, it is all masala and no rice.

Right now, we have a wave of “AI-aware” leaders who believe:

  • Uploading data = Training a model
  • Changing a prompt = Tuning a model

If that were true, we would all be AI researchers by now.

So Let’s Fix This in Plain English: Forget jargon. Let me explain this the way it actually works.

What is Training a Model?

Training is where the real heavy lifting happens. Think of it like teaching a child from scratch:

  • You show examples
  • You correct mistakes
  • You repeat… a lot

A model during training:

  • Looks at a huge amount of data
  • Learns patterns from that data
  • Adjusts its internal parameters (millions or billions of them)

It is not just “feeding data.” It is mathematical optimization at scale.

Simple analogy: Training a model is like teaching someone an entire language from zero -- grammar, vocabulary, context, everything.

In reality, training involves:

  • Massive datasets
  • GPUs burning money like Diwali crackers
  • Algorithms adjusting weights through backpropagation

And yes… it is expensive, slow, and complex.
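To make “adjusting weights” concrete, here is a toy version of the training loop -- a single weight, a handful of data points, plain gradient descent. Real training does exactly this kind of optimization, just for billions of parameters over massive datasets:

```python
# Toy "training": learn y = 2x by gradient descent on one weight.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # (input x, target y)
w = 0.0                                        # the model's only parameter
lr = 0.05                                      # learning rate

for epoch in range(200):
    for x, y in data:
        pred = w * x                  # forward pass: make a prediction
        grad = 2 * (pred - y) * x     # derivative of squared error w.r.t. w
        w -= lr * grad                # "backprop" step: adjust the weight

print(round(w, 3))                    # converges toward 2.0
```

That `w -= lr * grad` line, repeated trillions of times on GPUs, is what the electricity bill pays for.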

So next time someone says:

“Let’s just train our own model.” “We trained the model to…”

You might want to ask:

“With what data, what infrastructure, and whose budget?”

 

What is Tuning a Model?

Now comes tuning, the part most people think they are doing. Tuning is not building from scratch. It is refining something that already exists. There are different levels of tuning:

1. Fine-Tuning

You take a pre-trained model and:

  • Train it further on specific data
  • Make it better at a particular task

Example: A general AI model → fine-tuned for legal contracts

2. Prompt Tuning (what most people actually do)

This is:

  • Changing how you ask questions
  • Structuring inputs better

Let’s be honest, this is what 80% of teams call “AI tuning.” And there is nothing wrong with it. Just don’t call it “model tuning” in a strategy meeting.

3. Parameter Tuning

Adjusting things like:

  • Learning rate
  • Batch size
  • Model behavior settings

This is closer to real ML work.

 

Here is what is actually happening in most delivery teams: they are mostly calling existing models through APIs and wrapping workflows around them.


And again, nothing is wrong with using APIs. Just don’t oversell it like you have reinvented AI. This confusion is not just funny -- it is dangerous.

Because:

  • Clients / Business get unrealistic expectations
  • Teams get vague directions
  • Budgets get allocated blindly

And eventually… Someone has to explain why “training the model” did not magically solve the problem.

 

When Should You Actually Train a Model?

Almost never (for most enterprises). You should consider training only if:

  • You have massive proprietary data
  • Off-the-shelf models don’t work
  • You can afford infrastructure and expertise

Otherwise? Use existing models. Adapt them. Build solutions around them.

When Should You Tune?

All the time -- but intelligently.

  • Use prompt engineering for quick wins
  • Fine-tune when domain specificity matters
  • Optimize based on real feedback, not assumptions

The Real Skill Is Not Training, It’s Thinking

Here’s the uncomfortable truth: AI success is not about:

  • Training models
  • Tuning parameters

It is about:

  • Defining the right problem
  • Using the right approach
  • Knowing what not to build

In conclusion, the next time someone confidently says, “Let’s train and tune our own model” or “We have trained the model to…”

Pause. Smile. And gently ask: “Are we building intelligence… or just building slides?”

Because in today’s world, AI does not fail because of technology. It fails because of vocabulary-driven Delivery Leadership.

AI Usage: Token cost

“Wait… What Exactly Am I Paying For?”

Let me be honest: when I tell someone, “You are billed per token,” they nod like they understood… and then immediately Google it later.

If you are building with AI models today, whether it is chatbots, copilots, agents, or full-blown platforms, you cannot afford to misunderstand tokens. This is not just a technical concept. It directly hits your cloud bill, product pricing, and scalability decisions.

So let me walk you through this in plain English.

First Things First: What is a Token? Think of a token as a chunk of text. Not exactly a word. Not exactly a character. Somewhere in between.

  • “Hello” → 1 token
  • “Artificial Intelligence” → ~2–3 tokens
  • A long paragraph → dozens or hundreds of tokens

Rough rule of thumb: 1 token ≈ ¾ of a word (in English)

So if you send a 100-word input, you are roughly dealing with 130–150 tokens.
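That rule of thumb is easy to encode as a rough estimator. This is an approximation only -- real tokenizers (BPE and friends) vary by model, so treat the function below as a back-of-envelope tool:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: 1 token ~ 3/4 of an English word."""
    words = len(text.split())
    return round(words / 0.75)       # i.e. ~1.33 tokens per word

prompt = " ".join(["word"] * 100)    # a 100-word input
print(estimate_tokens(prompt))       # ~133 tokens
```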

So Where Does Billing Come In? Here is the key idea: You are billed for every token processed by the model. And that includes:

  1. Input tokens (what you send)
  2. Output tokens (what the model generates)

Let’s say:

  • Your prompt = 200 tokens
  • Model response = 300 tokens

Total billed tokens = 500 tokens

That is it. No magic. No hidden tricks.

Not all tokens cost the same. Here is where it gets interesting -- and where many teams get it wrong. Different models have different pricing. And more importantly: input tokens and output tokens are often priced differently.
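Putting the two ideas together -- input and output tokens billed at different rates -- a per-request cost check might look like this. The prices below are made-up placeholders; check your provider's pricing page for real numbers:

```python
# Hypothetical prices -- placeholders, not any real provider's rates.
PRICE_PER_1K_INPUT = 0.003    # $ per 1,000 input tokens (assumption)
PRICE_PER_1K_OUTPUT = 0.015   # $ per 1,000 output tokens (assumption)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request: input and output billed separately."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT + \
           (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# The example above: 200-token prompt, 300-token response
print(f"${request_cost(200, 300):.4f}")   # $0.0051
```

Notice how the 300 output tokens dominate the bill even though they are fewer than half the total: output is the expensive direction.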

Why?

Because generating text (output) is computationally more expensive than reading input.

Think of It Like This

  • Input tokens → “Reading effort”
  • Output tokens → “Thinking + Writing effort”

And thinking is always more expensive...right?

Model Choice = Cost Strategy

Let me say something that might sound obvious, but is often ignored:

Choosing the wrong model can blow up your costs faster than bad code.

You don’t need the most powerful model for everything.

Typical pattern:

  • Lightweight models → cheap, fast → good for simple, high-volume tasks
  • Heavy models → expensive, powerful → good for complex reasoning


I have seen teams use a high-end model for:

  • simple FAQ bots
  • basic text rewriting

That is like using a supercomputer to calculate 2 + 2. 

Here’s something many people miss. Every time you send a request in a chat-based system, the entire conversation history (or part of it) is sent again.

Which means:

  • First message → cheap
  • Fifth message → more expensive
  • Tenth message → significantly more expensive

Because tokens are accumulating.

Example

Conversation:

  1. User: “Explain AI” → 50 tokens
  2. Assistant: response → 200 tokens
  3. User: follow-up → 30 tokens

Now the step-3 request might include: 50 + 200 + 30 = 280 tokens (input!)
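The arithmetic above, in code -- assuming the worst case where every request re-sends the full history (real systems may truncate or summarize, as discussed next):

```python
# Worst case: each new request re-sends the whole conversation as input.
history = [50, 200, 30]          # "Explain AI", assistant reply, follow-up
request_input = sum(history)     # input tokens billed for the next call
print(request_input)             # 280

# And it keeps compounding as the chat grows:
for extra in [220, 40, 260]:     # more replies and follow-ups
    history.append(extra)
    print(sum(history))          # 500, 540, 800 -- input grows every turn
```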

Do you see the problem?

Token explosion is real. If you don’t manage context:

  • Costs grow silently
  • Performance slows down
  • Latency increases

This is why context management is a core design problem, not a minor detail.

Below are some smart cost-optimization techniques I can think of. Let me share what actually works in real systems.

1. Trim the Context

Don’t send everything every time.

  • Keep only relevant messages
  • Use summaries instead of full history
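A minimal sketch of context trimming: keep only the most recent messages that fit a token budget. The word-count estimator here is a stand-in for a real tokenizer, and the function itself is illustrative, not a library API:

```python
def trim_context(messages, budget_tokens, estimate=lambda m: len(m.split())):
    """Keep the most recent messages that fit within the token budget."""
    kept, total = [], 0
    for msg in reversed(messages):           # newest first
        cost = estimate(msg)
        if total + cost > budget_tokens:
            break                             # older messages get dropped
        kept.append(msg)
        total += cost
    return list(reversed(kept))              # restore chronological order

history = ["old question " * 50, "old answer " * 50, "recent question"]
print(trim_context(history, budget_tokens=20))   # only the recent message fits
```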

2. Use Summarization Loops

Instead of: “Keep entire conversation”

Do: “Summarize conversation so far → send summary”

3. Route to the Right Model

Not every request needs the same intelligence.

  • Simple → small model
  • Complex → powerful model

This alone can reduce costs by 50–80%
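A routing heuristic can be as simple as a few lines. The model names and the word-count threshold below are invented for illustration; real routers often use a cheap classifier instead:

```python
def pick_model(prompt: str) -> str:
    """Route short, simple prompts to a cheap model (hypothetical names)."""
    simple = len(prompt.split()) < 30 and "analyze" not in prompt.lower()
    return "small-fast-model" if simple else "large-capable-model"

print(pick_model("What are your opening hours?"))       # small-fast-model
print(pick_model("Analyze this 40-page contract ..."))  # large-capable-model
```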

4. Control Output Length

Don’t let the model ramble.

Use prompts like:

  • “Answer in 3 sentences”
  • “Give concise output”

Fewer output tokens = lower cost.

5. Cache Responses

If users ask similar questions: Don’t recompute. Reuse.
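A minimal response cache: normalize the question, hash it, and only call the model on a miss. This is a sketch -- production systems usually add expiry and semantic (not just exact) matching:

```python
import hashlib

cache = {}

def cached_answer(question: str, generate) -> str:
    """Return a cached answer if we have seen this question before."""
    key = hashlib.sha256(question.strip().lower().encode()).hexdigest()
    if key not in cache:
        cache[key] = generate(question)   # tokens are only paid on a miss
    return cache[key]

calls = []                                # track how often the "model" runs
fake_model = lambda q: calls.append(q) or f"answer to: {q}"

cached_answer("What is a token?", fake_model)
cached_answer("what is a token?  ", fake_model)   # normalized -> cache hit
print(len(calls))                                 # 1 -- only one model call
```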


Tokens vs Traditional Billing: Let’s compare this to what we were used to.

[Table: tokens vs traditional billing]

Now look at the big mindset shift -- this is the part most organizations struggle with. In AI systems, your prompt design is your cost architecture.

Not just: infrastructure, scaling, deployment

But: how you ask, how much you send, how much you generate

To put together a simple mental model, whenever you design a feature, ask:

  1. How many tokens am I sending?
  2. How many tokens will I receive?
  3. How often will this run?
  4. Which model am I using?

Multiply those together -- that’s your real cost.
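Those four questions multiply into a quick back-of-envelope estimate. Every number below is an invented placeholder -- swap in your own traffic, token counts, and provider pricing:

```python
# All numbers are hypothetical placeholders for illustration.
input_tokens, output_tokens = 400, 600    # per request
runs_per_day = 10_000                     # how often the feature runs
price_in, price_out = 0.003, 0.015        # $ per 1K tokens (assumption)

daily = runs_per_day * (input_tokens / 1000 * price_in
                        + output_tokens / 1000 * price_out)
print(f"${daily:,.2f}/day, ${daily * 30:,.2f}/month")
```

Run the same arithmetic with a cheaper routed model or a trimmed context, and the gap is usually the whole business case.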

In conclusion, let me leave you with this: “AI is not expensive because models are costly. It becomes expensive when we use intelligence where it is not needed.” If you understand tokens, you don’t just control cost.

You control performance, scalability, and architecture decisions.

And honestly, your CFO will like you a lot more. “In the world of AI, you are not just writing prompts, you are writing your bill.”

Diabetics and Mango Consumption

Can diabetics eat mangoes without fear? I hear this every summer. Many people with diabetes either avoid mangoes completely or eat them with fear, assuming every bite will spike their sugar.

Let’s clear this confusion.

Mango is not the villain it’s made out to be. In fact, it is a nutrient-rich fruit, loaded with vitamins A, C, and E that support skin health and immunity. It also contains antioxidants that help reduce oxidative stress and inflammation, both of which play a role in long-term metabolic health.

Now, the important part → its glycemic index. Mango falls in the moderate GI range (around 51–60). This means:

→ Overeating can raise blood sugar

→ But eating it mindfully can work just fine


So what does “mindful” look like?

Start simple. Try one small mango, or half of a large one. Observe how your body responds.

Even better, don’t eat it in isolation. Pair it with:

→ A handful of nuts

→ Or include it in a vegetable salad

This combination slows down sugar absorption, reduces spikes, and helps you actually benefit from the nutrients. The goal is not restriction. The goal is understanding + balance.

Eat smart, not scared. Enjoy the mango season without guilt, and without compromising your health.

Hyderabad, Telangana, India
People call me aggressive, people think I am intimidating, people say that I am a hard nut to crack. But I guess people, young or old, do like hard nuts -- isn't it? :-)