Engineering partner proposal · Tallwave × BetterBrain
For Hill's Pet Nutrition / Colgate-Palmolive · prepared by BetterBrain · Jun 2026
01 — Approach
Spec Refinery already works. Product managers across Hill's and Col-Pal domains have used it to turn a guided conversation into a full, readiness-scored spec across 17+ sections. The job for v1.0 is not to reinvent that — it's to put enterprise foundations underneath it without losing the product, or the speed, that made it land.
We treat this as a disciplined re-platforming, not a rewrite. The validated product surface stays alive throughout. The POC scaffolding is replaced with foundations a Colgate-Palmolive security review will pass. And the whole build is sequenced around that review and GIT's readiness — planned for from day one, never bolted on at the end.
The 17-section discovery, Definition-of-Ready scoring, the stakeholder outreach loop and Knowledge Mode are the validated product — they carry forward intact. The fast PM feedback loop is a requirement, not a nicety: hardening happens underneath the workflow, never on top of it.
Swap POC scaffolding — the password gate, GitHub-as-database, Resend — for enterprise foundations: Okta identity, managed persistence, an audit layer, GCP-aligned hosting. PMs keep the surface they already know; the change is structural, not cosmetic.
Identity, audit and data-access controls are built first, not last. The Col-Pal security review becomes a confirmation of how the system already works — not a late scramble. GIT readiness is treated as a dependency we sequence around, not a formality.
Why the speed survives
We're not inventing this on your build. The identity-scoped retrieval, audit trails and optimized retrieval systems v1.0 needs are patterns BetterBrain has shipped many times over — including SOC 2 Type II–compliant, permission-scoped RAG for dozens of clients. Enterprise hardening is well-trodden ground for us, which is exactly why it doesn't cost you the iteration speed that made the prototype work.
The architecture
Here is how the target v1.0 is put together. Connected sources feed a Knowledge Layer of many components. A model-portable orchestrator and a set of tools draw on those components — many-to-many. The orchestrator chains tools into workflows; workflows compose into the surfaces PMs and SMEs actually use. Colour shows what is validated in the POC and carried forward versus what is net-new for the enterprise — but the product logic that already works stays exactly that.
Flow, bottom to top: Sources → Knowledge Layer → Orchestrator & Tools → Workflows → Surfaces
The re-platforming map
Every piece of POC scaffolding has a defined enterprise target. The product logic that PMs validated is left untouched. Nothing here is a guess about scope — it is the migration the brief describes, made concrete.
Identity
Persistence
Hosting
AI stack
Outbound
Audit
Knowledge Mode
Knowledge sources
The validated product — unchanged
This is the asset. We harden everything around it, not it.
How the hard parts work
Model portability
Spec generation and interview turns call an interface, not a vendor SDK. Claude (Sonnet for generation, Haiku for interview turns) stays the default today. If Hill's confirms the Google direction, Vertex AI / Gemini becomes a configuration-and-evaluation exercise behind that same boundary.
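A minimal sketch of that boundary, under stated assumptions: the names `TextModel`, `EchoModel`, `ROUTING` and `PROVIDERS` are hypothetical and not taken from the Spec Refinery codebase, and the stand-in provider just echoes its input so the example runs offline. The point it illustrates is that call sites depend on an interface and a routing table, so a provider change is a configuration edit followed by an evaluation run.

```python
from dataclasses import dataclass
from typing import Protocol


class TextModel(Protocol):
    """Boundary the product code depends on; no vendor SDK types leak through."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Stand-in provider so the sketch runs without network access."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


# Hypothetical routing table: task -> provider key. Swapping vendors means
# editing this mapping and re-running evals, not touching call sites.
ROUTING = {"generation": "default-large", "interview": "default-small"}

PROVIDERS: dict[str, TextModel] = {
    "default-large": EchoModel("large"),
    "default-small": EchoModel("small"),
}


def model_for(task: str) -> TextModel:
    """Resolve the configured provider for a task."""
    return PROVIDERS[ROUTING[task]]


def interview_turn(question: str) -> str:
    # Call sites only know about the TextModel interface.
    return model_for("interview").complete(question)
```

In a real build each provider class would wrap its vendor SDK behind `complete`; which SDK calls that involves is deliberately left out here.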
RAG across sources
The agents draw on domain knowledge spread across Confluence, Jira, Google Docs, Snowflake and whatever else scoping surfaces. Retrieval is built to be access-aware end to end: a user only ever retrieves what their Okta role allows.
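One way to picture access-aware retrieval, as a sketch only: it assumes each indexed chunk carries the set of groups allowed to read it and that the user's groups arrive from the identity provider. The `Chunk` shape, the group names and the keyword match (standing in for real ranking) are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source: str                # e.g. "confluence", "jira"
    text: str
    allowed_groups: frozenset  # groups permitted to read this chunk (assumed metadata)


def retrieve(query: str, user_groups: set, index: list) -> list:
    """Return matching chunks the user may read.

    The permission filter runs before matching, so restricted text never
    reaches the model prompt. Keyword match stands in for real ranking.
    """
    visible = [c for c in index if c.allowed_groups & user_groups]
    terms = query.lower().split()
    return [c for c in visible if any(t in c.text.lower() for t in terms)]


index = [
    Chunk("confluence", "Pricing roadmap for vet channel", frozenset({"pm-pricing"})),
    Chunk("jira", "Roadmap epic for loyalty app", frozenset({"pm-loyalty", "pm-pricing"})),
]
hits = retrieve("roadmap", {"pm-loyalty"}, index)
```

Here a user in only the hypothetical `pm-loyalty` group sees the Jira chunk and not the Confluence one, even though both match the query.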
Evaluation
A base evaluation harness, co-owned with the Tallwave lead. It is how we keep refactor speed without regressing the product, and how we prove a model swap is safe before it ships.
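As a rough sketch of what a model-swap gate could look like: the golden cases, the heading check and the `tolerance` parameter below are hypothetical placeholders, and the actual harness and its metrics would be defined with the Tallwave lead. The idea shown is simply that a candidate model is compared against the current baseline on the same golden set before it ships.

```python
from typing import Callable

# Hypothetical golden cases: a prompt plus section headings the spec must contain.
GOLDEN = [
    {"prompt": "loyalty app", "must_include": ["Problem", "Users"]},
    {"prompt": "pricing tool", "must_include": ["Problem", "Risks"]},
]


def pass_rate(generate: Callable[[str], str], cases: list) -> float:
    """Fraction of golden cases whose output contains every required heading."""
    passed = sum(
        all(h in generate(c["prompt"]) for h in c["must_include"]) for c in cases
    )
    return passed / len(cases)


def safe_to_swap(baseline, candidate, cases, tolerance: float = 0.0) -> bool:
    """A candidate ships only if it does not regress beyond the tolerance."""
    return pass_rate(candidate, cases) >= pass_rate(baseline, cases) - tolerance


# Stand-in "models" for illustration: the candidate drops the Risks section.
baseline = lambda p: f"Problem\nUsers\nRisks\n{p}"
candidate = lambda p: f"Problem\nUsers\n{p}"
```

With these stand-ins the candidate fails the second golden case, so the gate blocks the swap.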
Speed & the PM loop
The feedback loop that made the prototype work is treated as a first-class constraint, not a casualty of hardening. Enterprise rigour goes in underneath the product, never across the PM's path.
Sequencing
Indicative shape, aligned to the brief's timeline. The security-review spine — identity, audit, data-access — goes in first, so the review confirms a system that already behaves correctly. Full sequence and estimate are set together at scoping.
Phase 0 — Joint scoping
The first paid activity. With workshop transcripts and the current codebase in hand, we fix scope, sequencing and the source mix jointly with Tallwave — no work priced against a frozen spec.
Phase 1 — Identity & audit foundation
Okta SSO and role-based access replace the password gate. The audit and access log and managed persistence go in early, so access control and traceability are load-bearing from the start of the build.
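To make "load-bearing from the start" concrete, here is a small sketch in which every access decision is logged whether it is allowed or denied. The event fields, the `"pm"` role check and the in-memory store are assumptions for illustration; a real build would take roles from Okta and write to managed persistence.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    actor: str      # user identifier from the identity provider (assumed)
    action: str     # e.g. "spec.read", "spec.export"
    resource: str
    allowed: bool
    at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


class AuditLog:
    """Append-only in-memory log; a real build would write to managed storage."""

    def __init__(self):
        self._lines = []

    def record(self, event: AuditEvent) -> None:
        self._lines.append(json.dumps(asdict(event), sort_keys=True))

    def entries(self) -> list:
        return [json.loads(line) for line in self._lines]


def read_spec(user: str, roles: set, spec_id: str, log: AuditLog) -> bool:
    """Check access and log the decision either way, allowed or denied."""
    allowed = "pm" in roles  # hypothetical role name
    log.record(AuditEvent(user, "spec.read", spec_id, allowed))
    return allowed
```

Logging denials as well as grants is a design choice in this sketch: it gives a reviewer a record of attempted access as well as successful access.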
Phase 2 — Re-platform & portability
Move to GCP-aligned hosting and the provider-abstraction layer, with the validated product surface preserved throughout — PMs keep working as the foundation changes underneath them.
Phase 3 — RAG & integrations
Permission-aware retrieval across Confluence, Jira, Google Docs and Snowflake; read/write integration with Jira, Confluence and GitHub. Coordinated with GIT, who host and bridge — we build to that bridge.
Phase 4 — Eval & clearance
The evaluation harness lands, and we support the Colgate-Palmolive security review through to clearance. GIT readiness is sequenced as a dependency, not assumed.
What this de-risks
Portability as an abstraction rather than a Gemini commitment, and a build sequenced around the review — the reasoning the architect-lead will walk you through.
Comparable enterprise AI builds taken through security review — covered in the Prior Work section of the full pack.
Under Tallwave direction, in a daily loop with Hill's PMs, at speed — the operating posture this whole approach is shaped around.
Joint scoping as the first paid step, a model that shares risk rather than padding a fixed bid — detailed in the Commercial section.
What follows in the full pack
This is the Approach. The named architect-lead and engineering team, comparable prior work taken through enterprise security review, the commercial model, and the assumptions and risks we'd need closed with Tallwave, Hill's and GIT — each follows as its own section, built on the same rails as this one.
Per-domain rollout names are confirmed at scoping.
02 — Team
A small, named pod selected for this build. Abhishek leads as architect — hands-on through delivery, co-owning evaluation with Tallwave. Dima and Darshan are the forward-deployed engineers who carry the product surface, retrieval and RAG patterns BetterBrain has already shipped. Ilona owns QA to the security-reviewed bar a Colgate-Palmolive system requires — with senior specialists brought in as each part of the project needs them.
Day-to-day delivery · engagement, build & quality
CMU computer science and computational finance; ex-YC, with a commodities-trading background. Owns the technical approach and co-owns evaluation with the Tallwave lead — a hands-on architect through delivery, and the main point of contact for roadmap and priorities.
Owns the product surface and UX — the mini-app interfaces and the fast editing workspace that keep the PM loop quick. Fluent in the React / Vite / Tailwind / shadcn stack Spec Refinery already runs on, so iteration speed survives the re-platforming.
Builds the knowledge layer, retrieval and app backends directly in your environment. Carries BetterBrain's permission-scoped RAG and knowledge-graph patterns from prior builds — paired with Abhishek on retrieval.
Integration and regression testing, evals and UX quality. Builds the evaluation harness (golden specs, Definition-of-Ready regression, model-swap regression) so product quality and model portability can be demonstrated to the bar the security review sets.
Relevant specialists · matched to the work
Ex-Goldman investment banking; runs finance-related project initiatives like cost-containment tooling. Flexes in on commercial framing and the business case.
Enterprise IT, security and compliance. Flexes in for the Colgate-Palmolive security review, IT governance and GIT coordination as the review ramps.
Retrieval & knowledge layer
Security & compliance
One engagement, many disciplines — depth wherever the build needs it.
Abhishek, Dima and Darshan are the ~3 FTE build core the brief calls for — a named architect-lead and two engineers with production LLM, RAG and integration experience. Ilona owns the quality and evaluation bar; Alex and Michael flex in as the security review, commercial and governance phases surge.
03 — Depth behind the bench
Research-grade AI depth meets operators who've shipped in production — across finance, robotics, enterprise IT and energy. We're backed by leading funds, and by individual investors and advisors from the very companies building the models and data platforms Hill's will run on.
Academia & research
Enterprise & finance
Industry & operations
Core capabilities
04 — Prior work
The brief asks for comparable enterprise AI — LLM and RAG applications taken through enterprise security review. Below is a selection of BetterBrain engagements chosen for how directly they map to Spec Refinery v1.0: permission-scoped retrieval, provenance and audit, format-faithful generation, and guided discovery — built for regulated buyers, including a global bank's AI-governance program.
Selected builds · matched to what v1.0 needs
A chatbot that indexed data separately for each of a client's 100+ customers, with per-customer access control and a dual-pane UI unifying internal and web results — Vespa semantic + lexical indexing across Slack, ClickUp and Google Drive.
Access-controlled retrieval across 100+ customers, with data isolated per customer.
→ v1.0: Okta-scoped, per-domain retrieval
Connected every data source for a global bank and traced each AI answer back to the specific document it came from — accurate and traceable rather than hallucinated, built for the bank's AI-governance program.
Audit-grade governance for a regulated buyer — the bar compliance teams actually require.
→ v1.0: Citations, provenance & the audit trail
Best-in-class enterprise search over CRM, Airtable, Slack and Google Drive — custom ranking, re-rankers, query expansion and contextual embeddings — to support VC due diligence across heterogeneous sources.
Surfaced long-forgotten information that changed go/no-go decisions.
→ v1.0: Confluence / Jira / Docs / Snowflake retrieval
A system that learns the format and tone of prior reports and drafts new ones from fresh data — holding a consistent structure across every output, with a human in the loop.
On track to save hundreds of finance-team hours a month.
→ v1.0: Spec generation in the Definition-of-Ready format
An agent that writes SQL and Python, plans, reflects and asks clarifying questions — learning from previous queries, human-in-the-loop throughout — over a structured warehouse.
80% less analyst time on ad-hoc requests; 75% faster for non-technical users.
→ v1.0: Snowflake retrieval + the Knowledge Mode loop
A structured discovery agent that interviews practitioners, surfaces the workflows worth automating, and produces an opportunity map — discovery that would take weeks of consulting done in days.
Production-shape discovery at scale — weeks of interviews compressed to days.
→ v1.0: The guided discovery interview itself
Growth-led ROI · what these builds do to a P&L
80% automation of QA cycles — ~$3.6M from freed capacity, plus ~$5M as ~24 engineers redirect to product.
3× recall lift on churn (7% → 21%) — about 56K subscribers retained a year.
Coaching patterns surfaced at scale: ~$2.4M in time freed, plus ~$1M from a 3% conversion uplift.
Cost attribution & automated quoting — margin projected to more than double (4% → 10%).
Outcomes from specific BetterBrain engagements — evidence of impact, not a projection for this build.
Where we work · selected references
Strategic partnership · F500
World's largest datacenter operator. Shared AI tooling for their Global Solutions Architects team, scoping and closing enterprise deals.
Strategic partnership · $1B-funded
Leading NVIDIA challenger in AI. Uses BetterBrain's audit capabilities across engagements.
Enterprise · global bank
Source-attribution platform tracing every AI answer to its document across all connected data.
F500 delivery · ~$20B+ revenue
Automated manual QA cycles across the engineering org, freeing skilled engineers for product velocity.
Commercial · EU distributor
Plain-language queries over live SKU, inventory and customer pricing. In production daily.
Startup / consultancy · US
Natural-language SQL over billions of rows with a self-learning loop — $600K+ revenue in six months.
Why this is the right track record
Every pattern Spec Refinery v1.0 depends on (permission-scoped retrieval, provenance and audit, retrieval across many enterprise sources, format-faithful generation, and guided discovery) is something BetterBrain has already shipped, much of it for regulated buyers and a global bank's AI-governance program, by a SOC 2 Type II–compliant team. The hard parts of this build are delivery for us, not discovery.
Drawn from BetterBrain's case studies and reference materials.
05 — Commercial model
The brief is explicit: propose a model, with joint scoping before any fixed number, and share risk rather than pad a bid. That's how we'd price this — a transparent monthly pod rate, sized as a share of the overall program, with the firm number set together at scoping.
How it works · scope → build → expand
Step 1 — Joint scoping
A short, paid scoping sprint with Tallwave — workshop transcripts and the current codebase in hand — that fixes scope, sequence and estimate jointly. No build number is locked before this.
Step 2 — Build pod
A time-boxed monthly rate for the pod actually deployed — Abhishek leading, Dima and Darshan building, Ilona on QA. Transparent and adjustable as scope firms up, not a frozen fixed bid against a spec we were told not to freeze.
Step 3 — Sized to the engagement
Pricing is calibrated as a share — about 30–40% — of the overall program economics, given Tallwave's team and the build's shape, so our incentive is the success of the whole engagement rather than a padded line item. This is the risk-share the brief asks for.
Step 4 — Expansion
As the build delivers, we help Tallwave identify and scope areas for expansion — additional product domains, new agents, deeper integrations — priced the same transparent way.
Step 5 — Longer-term partnership
Once v1.0 is delivered and owned by Hill's, there's room to discuss an ongoing partnership — scale-out across domains, support, new agents. Worth exploring later; this proposal stands on the build alone.
What you can count on
No number is locked before scope is set together — the brief's first paid activity is ours too.
You pay for the pod you can see, billed monthly — not a padded fixed bid against a frozen spec.
Priced as a share of program value, so we win when the whole engagement does — the commercial fit Colgate is looking for.
Hill's owns v1.0 outright — any cloud, any model provider, nothing held hostage.
In short
A ~$30–40k/month pod for the build phase, sized as roughly a third of the overall program and structured to share risk — with the firm number set jointly at scoping, expansion scoped as we go, and a longer-term partnership open to discuss once Hill's owns v1.0.
Final figures and billing cadence confirmed at the joint scoping session.
06 — Assumptions & risks
The brief asks what we'd need from each party and where the risks sit. Here's what we're assuming, what we'd need from Tallwave, Hill's and GIT, and the main risks — each paired with how the approach already handles it.
Assumptions we're building on
What we'd need · from each party
From Tallwave
From Hill's
From GIT
Main risks · ranked, with mitigations
The Colgate-Palmolive review governs what ships and when; requirements surfaced late mean rework.
Mitigation: Get the checklist early, build identity, audit and data-access first, and treat the review as a dependency rather than a formality.
If hosting or the integration bridge slips, deployment and integration stall.
Mitigation: Sequence around it, secure a committed timeline, and build to interfaces with stubs and staging so the critical path isn't blocked while we wait.
A mid-build mandate to move to Vertex / Gemini, or a lingering decision. Worth naming: model parity on spec-generation and interview quality isn't guaranteed, so a swap may need prompt and tuning work.
Mitigation: A provider-abstraction layer from day one plus a model-swap eval suite, so a switch is a config-and-eval exercise rather than a rebuild and the eval catches any regression before it ships.
Access, permissions and data quality across Confluence, Jira, Docs and Snowflake — plus the brief's open-ended “any other source” — can expand RAG scope.
Mitigation: Fix the source mix and priority at scoping, use permission-aware retrieval, and phase connectors.
Hardening and review can quietly throttle the PM loop the brief calls a requirement.
Mitigation: Keep a fast iteration surface, ship behind flags, and don't route PMs through enterprise process.
If Hill's PMs aren't reliably in the loop, validation and velocity slip.
Mitigation: Named PMs and a committed cadence agreed at scoping.
Unclear ownership or handoff friction across Tallwave, Hill's and GIT.
Mitigation: Joint scoping fixes scope, sets a clear RACI, and makes the GIT boundary explicit: host and bridge, not builder.
Colgate-Palmolive data handling and PII across connected sources.
Mitigation: A PII and sensitivity guard, RBAC-scoped retrieval, full audit, and no client data used to train external models.
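For a sense of the simplest form a PII guard could take, the sketch below redacts two pattern types before text is passed onward. The two regular expressions are illustrative assumptions only; a production guard would need a reviewed and much broader rule set, and the actual design would be settled at scoping.

```python
import re

# Illustrative patterns only; real coverage (names, addresses, IDs) would
# need a reviewed rule set and likely a classifier on top.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(text: str) -> tuple:
    """Replace matches with a label and report which labels fired."""
    found = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found


clean, labels = redact("Mail jane.doe@example.com or call 555-123-4567.")
```

Returning the list of labels that fired lets the caller record the redaction in the audit trail without storing the sensitive value itself.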
In short
None of these is a surprise. The two biggest — the security review and GIT readiness — are exactly why the approach builds identity and audit first and sequences around GIT. Two lighter items we'd raise live rather than belabor here: decision latency across three parties, mitigated with a single decision contact per party; and that the brief's indicative timeline stays contingent on scoping and GIT readiness.