Boston / Remote
Full time
Who This Is For
Building an agent is easy. Getting it to produce the same right answer three times in a row — on real enterprise data, in a regulated environment, without a human checking every output — that’s the actual problem.
You’ve shipped AI agent systems to production on real, unclean data. Not a demo dataset. You know the accuracy cliff. You know why prompting cannot fix semantic problems. You’ve built systems that don’t rely on the model getting it right every time, and you’ve built the judgment to know when to ship anyway.
You’re not looking for a well-defined architecture to implement. You’re looking for the unsolved problem — and the mandate to build the solution that becomes the standard.
About edisyl
edisyl builds AI solutions that turn messy institutional data into decisions, workflows, and outcomes. We came out of blockchain data infrastructure — 8 years, 20+ chains, 700M+ resolved wallets — and now deploy that capability to enterprises navigating the same challenge: how to make their data work for them at scale, without armies of analysts.
We have active deployments with Fidelity and Interlochen, a proven architecture, and inbound interest from firms that need what we’ve built. The technology works. What we’re building now is the enterprise motion around it.
The Role
You own the architecture that makes our agent fleets reliable: the harness, the tooling, the orchestration patterns, the semantic layers that keep outputs grounded in organizational context. You work on Forge (our agent framework), Lattice (fleet orchestration), and Stratum (semantic intelligence) — building and extending the systems that production deployments run on.
You care obsessively about output quality — not because someone told you to, but because you’ve seen what happens when agents drift. You solve for quasi-determinism: agents that use validated tools instead of guessing at raw data, producing consistent and auditable results at scale. This is a Staff-level role — you define the architecture, set the standards, and make principled decisions without waiting for a framework to be handed to you.
What You’ll Actually Do
Design and build the architecture for AI agent workflows — planning loops, tool use, memory, retrieval, and human-in-the-loop checkpoints
Evaluate, integrate, and fine-tune foundation models and LLM APIs for specific enterprise use cases and data types
Define standards for agent reliability, observability, and failure modes in production deployments
Collaborate with Forward-Deployed Engineers to translate what’s working in client environments into reusable platform components
Build internal tooling and eval harnesses to assess agent quality, hallucination rates, and task completion
Make principled, documented architectural decisions — and stay current enough with the ecosystem to know what to adopt and what to ignore
What Success Looks Like in Year One
You’ve shipped meaningful improvements to Forge, Lattice, or Stratum that are running in production. You’ve established the eval framework the team uses to assess agent quality. The Forward-Deployed Engineers trust the platform enough to focus on client problems instead of working around infrastructure limitations. At least one architectural decision you made is something we’ll still be building on two years from now.
The measure isn’t how sophisticated the architecture is. It’s whether the agents produce the right outputs reliably enough that customers act on them without checking every result.
Compensation
Competitive base salary and meaningful early-stage equity. This is a foundational technical role and we price it that way. We’ll be transparent about the full picture in our first conversation.
Who We’re Looking For
Experience
6–10 years building production AI or data systems — not prototypes; systems that run reliably at scale under real conditions
Deep hands-on experience with multi-agent architectures: context windows, memory management, dependency graphs, and where things break in practice
Strong Python skills and familiarity with agent frameworks (LangChain, LlamaIndex, AutoGen, or equivalent), or a clear, documented opinion on why you built your own
Practical experience with RAG architectures, vector databases, and context window management in production settings
Experience deploying LLM-powered systems in enterprise contexts — data security, access controls, audit logging
The Stuff That’s Harder to Teach
LLM failure mode literacy. You know the accuracy cliff. You know why prompting cannot fix semantic problems. You build systems that don’t rely on the model getting it right every time.
Production instincts. You don’t consider something done until it’s been wrong three times and you’ve fixed it twice — and you’ve built the judgment to know when to ship anyway.
Strong opinions on agent design. You have a clear answer to why most agent architectures fail at enterprise scale — and you’ve built something that doesn’t.
Systems thinking. You design for failure modes first. Happy paths are not the interesting problem.
Bonus (Genuinely Not Required)
ML research exposure — fine-tuning, RLHF, model evaluation methodology
Production AI deployments in regulated industries — financial services, insurance, healthcare
Familiarity with blockchain data infrastructure or institutional crypto
Why This, Why Now
edisyl is at the point where the technology is proven and the enterprise problem is clear. The person who takes this role will define the architecture that production deployments run on, not inherit it. The platform is real, the customers are real, and the hard problems are still open. That’s a rare place to work and a real chance to build something that matters.
To Apply
Complete the online application and include responses to: 1) why this role fits where you are in your career right now, and why you are the right person for it; and 2) one example of an agent system or AI infrastructure decision you made in production — what the constraints were, what broke, and what you built to fix it.
No template. Just tell us the story.