Two companies working on agentic artificial intelligence are tapping the brakes on expectations, saying the tools still fall short for broad use. Their message cuts through months of marketing noise and sets a more cautious tone for near‑term deployments.
The companies are building systems that plan tasks, take actions, and adapt to feedback. Yet their leaders say performance and reliability need work before these tools can handle complex jobs for most users.
“Even as they build out agentic tools themselves, leaders from the two companies say the capabilities aren’t quite there yet.”
What Agentic Tools Aim To Do
Agentic tools are designed to act on a user’s goal, not just answer a single prompt. They can call software, chain steps, and adjust when things change. This approach could automate support tickets, draft marketing campaigns, or manage software builds without constant oversight.
Backers argue that such tools raise productivity and cut routine work. Early pilots show promise on narrow tasks. But the shift from demos to daily operations is proving harder than expected.
A Reality Check From Inside The Build
Leaders at the two companies describe a gap between hype and results. They report success on short, guided workflows. But longer tasks reveal weak points. Tools may lose context, pick the wrong plan, or miss edge cases that humans catch.
The executives cite three concerns. The first is reliability, especially across long chains of steps. The second is evaluation, since traditional tests do not reflect live, messy work. The third is cost control, because long runs can consume more compute than planned.
- Long tasks drift, causing errors late in the process.
- Metrics lag real needs, hiding issues until deployment.
- Costs spike when retries and tool calls pile up.
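One way to picture the cost concern is a per-run budget that every tool call, including every retry, must draw from. The sketch below is illustrative only: the names `RunBudget`, `BudgetExceeded`, and `call_with_retries`, and the limits shown, are assumptions made for this example, not anything the two companies have described.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a run would spend more than its cap."""


@dataclass
class RunBudget:
    # All limits are illustrative values, not recommendations.
    max_tool_calls: int = 20
    max_cost_usd: float = 1.00
    tool_calls: int = 0
    cost_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record one tool call, or refuse it if it would break a cap."""
        if self.tool_calls + 1 > self.max_tool_calls:
            raise BudgetExceeded("tool-call limit reached")
        if self.cost_usd + cost_usd > self.max_cost_usd:
            raise BudgetExceeded("cost limit reached")
        self.tool_calls += 1
        self.cost_usd += cost_usd


def call_with_retries(tool, budget, cost_per_call, max_attempts=3):
    """Call a flaky tool, charging the budget for every attempt."""
    last_error = None
    for _ in range(max_attempts):
        budget.charge(cost_per_call)  # retries count against the cap too
        try:
            return tool()
        except RuntimeError as err:
            last_error = err
    raise RuntimeError(f"tool failed after {max_attempts} attempts") from last_error
```

Because the budget is charged before each attempt, a run that keeps retrying stops with `BudgetExceeded` instead of quietly accumulating spend.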
Given these risks, both companies are keeping humans in the loop. They favor review gates on key steps and audit trails for every action. This slows the work, but it keeps errors from reaching customers.
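A review gate with an audit trail can be sketched in a few lines. This is a minimal illustration under stated assumptions: the step format, the `needs_review` flag, and the `approve` callback are hypothetical, and a real deployment would write the log to durable storage rather than an in-memory list.

```python
import json
import time
from typing import Callable


def run_with_review(
    steps: list[dict],
    approve: Callable[[dict], bool],
    audit_log: list[str],
) -> list[str]:
    """Run steps in order; steps flagged `needs_review` wait for approval."""
    results = []
    for step in steps:
        entry = {"time": time.time(), "step": step["name"], "status": "executed"}
        if step.get("needs_review") and not approve(step):
            entry["status"] = "rejected"
            audit_log.append(json.dumps(entry))
            break  # stop the run so nothing downstream reaches a customer
        results.append(step["action"]())
        audit_log.append(json.dumps(entry))
    return results
```

Every step leaves one log entry whether it ran or was rejected, which is the trade-off described above: the gate adds a pause, but each action can be traced afterward.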
Why The Caution Matters Now
Interest in agents surged this year as companies looked past simple chatbots. Teams want systems that can book travel, reconcile invoices, or triage code bugs on their own. Expectations rose fast, pushed by demos and early case studies.
The sober view from builders is timely. It suggests a slower rollout with clear limits. That could prevent failed launches and loss of trust. It may also steer investment into core fixes rather than splashy features.
Key Technical and Operational Hurdles
The companies point to several roadblocks that must be addressed before large‑scale use:
- Planning and memory: Agents need stable ways to track goals and facts over time.
- Tool use: Integrations must be reliable and secure across many apps.
- Safety: Guardrails are needed to stop harmful or wasteful actions.
- Measurement: Better benchmarks for multistep tasks are still lacking.
- Governance: Teams need policies for oversight, logs, and incident response.
Leaders add that user trust depends on clear handoffs. People want to know when an agent acts, why it chose a path, and how to reverse it. Simple explanations and easy undo flows help adoption.
Balanced Outlook From Practitioners
Neither company is stepping back from agents. Both are investing in smaller, safer use cases. They report gains in areas like data entry, first drafts, and routine checks. These limited wins build confidence and training data for future upgrades.
They also stress open problem solving with customers. Shared dashboards, test runs, and postmortems help teams spot failure modes. Over time, this could create clearer standards for accuracy and cost.
What To Watch Next
The near future will likely bring steadier, not flashier, progress. Expect tighter loops between planning, execution, and review. Look for new metrics that score entire workflows, not just single answers. And watch for pricing models that cap spend on long runs.
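The difference between scoring single answers and scoring entire workflows shows up in a small calculation. The run data below is invented purely for illustration, and the function names are assumptions for this sketch.

```python
def step_accuracy(runs: list[list[bool]]) -> float:
    """Fraction of individual steps that passed, across all runs."""
    steps = [ok for run in runs for ok in run]
    return sum(steps) / len(steps)


def workflow_success_rate(runs: list[list[bool]]) -> float:
    """Fraction of runs in which every step passed."""
    return sum(all(run) for run in runs) / len(runs)


# Four hypothetical runs of a four-step task; True means the step passed.
runs = [
    [True, True, True, True],
    [True, True, True, False],
    [True, False, True, True],
    [True, True, True, True],
]
print(step_accuracy(runs))          # 0.875
print(workflow_success_rate(runs))  # 0.5
```

In this made-up sample, 14 of 16 steps pass, yet only half the runs finish cleanly. A per-step metric would hide that gap.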
For now, the headline is restraint. Builders closest to the work are urging patience and proof. The tools are improving, but broad autonomy will arrive step by step, not all at once.
The immediate takeaway is clear. Pick narrow jobs. Keep humans in control. Track cost and quality from day one. As the core issues are solved, wider use will follow, with fewer surprises.