
What Actually Matters After Using AI for Production Productivity
Over the past year I have continued to find new peaks in productivity as a DataTools Pro. In this article, I break down the biggest unlocks I have experienced watching leaders build hyper-growth businesses on the backs of well-designed AI experiences. The common thread where I find the greatest productivity and fastest adoption is a well-designed AI co-pilot: work that requires human accountability typically requires a human in the loop. Here are the tools I use every day that help me move 2-5x faster than in 2023.
- Snowflake dev environments (Cortex Code)
- Repo-driven IDE workflows (Cursor)
- Micro-Apps and prototyping (Lovable)
- Product and Web Analytics (PostHog AI)
- GPT / Claude chat interfaces
- Video editing (Descript)
Breaking Down Features for Peak AI Co-Pilot Productivity
After dozens of experiments across tools, I have applied the lessons learned to DataTools Pro, where we manage strategy, business semantics, and metrics. Here is the framework that actually matters as I evaluate my own startups and decide what to adopt.
1. Multi-Turn Conversation
What it is
The ability to maintain context across iterative back-and-forth reasoning inside a session. It simulates short-term cognitive continuity.
Without multi-turn, every request is stateless. With it, the AI remembers prior questions, assumptions, and constraints.
Why it matters
Real engineering work is iterative. You ask a broad question, narrow scope, introduce tradeoffs, refine logic. Multi-turn prevents constant context resets.
Example in action
- GPT / Claude: You brainstorm architecture, refine it over 10–15 exchanges.
- Cortex Code: You explore warehouse credit usage, then drill down into specific roles without re-briefing the account context.
- Cursor: You modify a function, then adjust related files in follow-ups.
- Lovable: You scaffold an app, then iteratively adjust schema and UI.
- PostHog AI: You analyze funnel drop-offs, then pivot into retention metrics.
Multi-turn is table stakes now. But it only gives session continuity. It does not create long-term intelligence.
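Session continuity is usually the client's job, not the model's: each turn resends the accumulated message history. A minimal sketch, where `call_model` is a stand-in for any chat-completion API (GPT, Claude, etc.):

```python
# Minimal sketch of multi-turn continuity: the client carries session
# state by resending the full message history on every request.

def call_model(messages):
    # Placeholder: a real implementation would call a chat API here.
    return f"(reply informed by {len(messages)} prior messages)"

class Session:
    def __init__(self, system_prompt):
        self.messages = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text):
        self.messages.append({"role": "user", "content": user_text})
        reply = call_model(self.messages)  # model sees the whole history
        self.messages.append({"role": "assistant", "content": reply})
        return reply

s = Session("You are a data engineering copilot.")
s.ask("Sketch a warehouse cost model.")
s.ask("Now narrow it to just the ETL role.")  # no need to re-brief context
```

Without the `Session` wrapper, every `call_model` request would be stateless, which is exactly the context reset multi-turn tools avoid.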
2. Context-Aware Reasoning
What it is
The model reasons against your environment, grounded in what you are specifically working on instead of abstract patterns drawn solely from the interaction itself.
- Repository / code awareness
- Metadata awareness
- Change and usage logs
- Visual awareness (screen grabs and computer vision)
- App state (what you are doing in the present moment, or have done recently)
Why it matters
This is the difference between “plausible” and “correct.”
Examples
- Cortex Code: You ask, “Which warehouses consumed the most credits?” It generates SQL grounded in your actual Snowflake metadata.
- Cursor: It refactors across your actual repo instead of hallucinating file names.
- Lovable: It understands the state of the generated app and adjusts components coherently.
- PostHog AI: It queries real event data to answer product questions.
- GPT / Claude (standalone): Context awareness is limited to what you paste in manually.
Grounded context dramatically increases reliability and reduces hallucination.
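Grounding usually means injecting real environment metadata into the prompt before generation. A hedged sketch of the pattern; the table and column names are illustrative, not Snowflake's actual metadata schema:

```python
# Sketch of grounding: inject actual schema metadata into the prompt so
# the model generates SQL against real tables instead of invented ones.

schema = {
    "warehouse_metering": ["warehouse_name", "credits_used", "start_time"],
}

def grounded_prompt(question, schema):
    # Render the known schema as plain text the model can reason against.
    ddl = "\n".join(
        f"table {table} ({', '.join(cols)})" for table, cols in schema.items()
    )
    return f"Available schema:\n{ddl}\n\nQuestion: {question}\nAnswer with SQL."

prompt = grounded_prompt("Which warehouses consumed the most credits?", schema)
```

The same idea generalizes to repo file trees (Cursor), app state (Lovable), and event definitions (PostHog): the tool assembles the ground truth, so you don't paste it in manually.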
3. Self-Reflection & Iterative Reasoning
What it is
The system critiques or refines its own output instead of stopping at first completion. This is effectively a quality control layer.
Why it matters
Speed without reflection creates brittle systems. Reflection increases decision quality.
Where we’ve seen this
- PostHog AI: Agent loops evaluate output and adjust before finalizing analysis.
- Cursor (partial): When prompted explicitly, it can compare approaches and refactor more carefully.
- GPT / Claude: Capable, but requires manual prompting (“critique this”).
- Cortex Code: Typically direct generation, not built-in critique loops.
- Lovable: Focused on generation speed over architectural reflection.
Reflection is not default behavior in most tools. It has to be engineered or prompted.
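When it does have to be engineered, reflection is a simple control loop: draft, critique, revise, stop when the critic is satisfied. A minimal sketch where `draft`, `critique`, and `revise` are stubs standing in for real model calls:

```python
# Sketch of an engineered reflection loop: generate a draft, critique it,
# revise, and repeat until the critic finds no remaining issues.

def draft(task):
    return f"draft answer for: {task}"

def critique(answer):
    # A real critic would be a second model call ("critique this").
    return ["missing edge cases"] if "revised" not in answer else []

def revise(answer, issues):
    return f"revised ({', '.join(issues)} addressed): {answer}"

def reflect(task, max_rounds=3):
    answer = draft(task)
    for _ in range(max_rounds):
        issues = critique(answer)
        if not issues:  # stop once the critic is satisfied
            break
        answer = revise(answer, issues)
    return answer

result = reflect("summarize credit usage by role")
```

The `max_rounds` cap matters in practice: without it, a critic that never approves can loop (and bill) indefinitely.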
4. Agent Workflows & Task Loops
What it is
The ability to break an objective down into steps and execute them one at a time, with intermediate evaluation, mirrors how most people solve problems. Agents that summarize their plan before executing create a much better experience, in my opinion.
Why it matters
This shifts AI from “answering questions” to “completing tasks”, and one day to completing goals.
Strong examples
- Cursor: Multi-file planning and stepwise refactors.
- Lovable: Full-stack app scaffolding from high-level instructions.
- PostHog AI: Analytics agents running multi-step investigations.
- Cortex Code: Less agentic, more query-focused.
- GPT / Claude: Capable but requires manual orchestration.
Copilots begin to feel like collaborators instead of search engines when they demonstrate understanding. Breaking a problem down into its smallest parts and recommending next steps is where you truly feel like you have a “co-pilot.”
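The plan-summarize-execute-evaluate loop described above can be sketched in a few lines; the step names here are illustrative placeholders, and `execute` stands in for real tool calls:

```python
# Sketch of an agent task loop: plan, summarize the plan before executing,
# then run each step with an intermediate check.

def plan(goal):
    # A real agent would ask a model to decompose the goal.
    parts = ["inspect schema", "draft query", "validate results"]
    return [f"step {i}: {part}" for i, part in enumerate(parts, 1)]

def execute(step):
    return f"done: {step}"

def run_agent(goal):
    steps = plan(goal)
    print("Plan:", "; ".join(steps))  # summarize work before execution
    results = []
    for step in steps:
        outcome = execute(step)
        if not outcome.startswith("done"):  # intermediate evaluation
            raise RuntimeError(f"step failed: {step}")
        results.append(outcome)
    return results

results = run_agent("report warehouse credit usage")
```

Surfacing the plan before execution is the design choice that makes these loops feel collaborative: the human can veto a bad decomposition before any work happens.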
Exciting Innovations I’m Looking for in an AI Copilot
After running these systems in real workflows, here are three capabilities I am looking for that will make co-pilots even more useful.
Controlled and Secured Autonomy with Safe Reversion
As AI edits files, runs queries, or executes workflows, autonomy increases, and so does risk. What happens when AI accesses data it shouldn’t, or makes a change you didn’t want? How do you recover? That is the “trust layer” that needs to be engineered at every level of your technology stack.
A mature system must provide:
- Suggest-only mode
- Controlled edits
- Test execution
- Refactor execution
- Deterministic rollback
Trust is built through reversibility.
Cursor approaches this through diff visibility. Most others still lack robust autonomy controls.
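One way to sketch suggest-only mode with deterministic rollback: stage every edit as a suggestion, snapshot before committing, and restore the snapshot to revert. The class and file contents below are hypothetical, not any tool's actual API:

```python
# Sketch of reversible autonomy: edits are staged as suggestions and
# either committed (with a snapshot) or deterministically rolled back.
import copy

class SuggestOnlyEditor:
    def __init__(self, files):
        self.files = files      # path -> contents
        self.pending = None

    def suggest(self, path, new_text):
        self.pending = (path, new_text)
        return f"diff for {path}"  # the human reviews this before commit

    def commit(self):
        snapshot = copy.deepcopy(self.files)  # snapshot enables rollback
        path, new_text = self.pending
        self.files[path] = new_text
        self.pending = None
        return snapshot

    def rollback(self, snapshot):
        self.files = snapshot   # deterministic revert to the prior state

editor = SuggestOnlyEditor({"app.py": "v1"})
editor.suggest("app.py", "v2")
snap = editor.commit()
editor.rollback(snap)           # files restored exactly as before
```

The key property is that rollback is deterministic: it restores a saved state rather than asking the AI to "undo" its own work.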
Persistent Structured Memory
Long-term cognitive continuity. For now, I am collecting a mountain of “know-how” in the form of MD files and knowledge bases across multiple domain-specific tools. ChatGPT is still my favorite for recalling fragments of work and reasoning.
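Until tools ship this natively, persistent structured memory can be approximated as notes serialized to disk so they survive across sessions. A minimal sketch; the file location and note shape are assumptions for illustration:

```python
# Sketch of persistent structured memory: notes are stored as JSON on
# disk so a later session can reload them.
import json
import os
import tempfile

class MemoryStore:
    def __init__(self, path):
        self.path = path

    def load(self):
        if os.path.exists(self.path):
            with open(self.path) as f:
                return json.load(f)
        return []

    def remember(self, topic, note):
        notes = self.load()
        notes.append({"topic": topic, "note": note})
        with open(self.path, "w") as f:
            json.dump(notes, f)

path = os.path.join(tempfile.gettempdir(), "copilot_memory.json")
if os.path.exists(path):
    os.remove(path)  # start fresh for the demo
store = MemoryStore(path)
store.remember("metrics", "ARR is defined in the semantic layer")
recalled = store.load()  # a new session would recall the same notes
```

This is essentially what a pile of MD files does today, just structured enough that a copilot could query it instead of re-reading everything.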
A fun experiment: open ChatGPT and ask,
“What is it like to work with me? What are my top 3 strengths and what are my top 3 weaknesses?”
What We’ve Learned from Lab Experiments
Embedding AI copilots into production workflows shifts the evaluation criteria. AI feels magical until you know what the output should be. That is why I look to best-of-breed co-pilot experiences as the guiding light for what I should be working toward.
Multi-turn was the first wave. Agent workflows were the second. The next frontier is institutional intelligence, where AI not only reasons in the moment but compounds over time. That is why our investment in DataTools Pro from day 1 has been cultivating business semantics from existing systems of record (Salesforce) and systems of understanding (Snowflake, Tableau).