SDE III – AI Software Engineer – RAG – Vector Database
- India
Job Details: Full Time
What You’ll Do
- Architect, build, and scale agentic RAG and text-to-SQL copilots supporting 50K+ daily queries, delivering 99.9% uptime, low latency, and high semantic accuracy.
- Design, operate, and continuously optimize a production-grade LLMOps platform, leveraging LangGraph, LangSmith, MLflow, Kubernetes, async inference, and leading cloud LLM providers such as AWS Bedrock, Google Vertex AI, Azure OpenAI, and Anthropic.
- Develop and own MCP server integrations, ensuring reliable, efficient, and secure runtime execution across multi-agent workflows and toolchains.
- Implement evaluation and guardrail frameworks (AI-as-a-Judge, grounding checks, safety filters, regression tests) to minimize hallucinations, control model drift, and reduce token usage and inference costs by 30%+.
- Own end-to-end system observability and performance, including latency, throughput, reliability, cost optimization, caching strategies, and retrieval quality.
- Optimize inference, retrieval, and orchestration pipelines to support high-traffic, enterprise-scale workloads.
- Partner closely with product, infrastructure, and leadership teams to define SLAs, unblock customer-critical work, and deliver robust, enterprise-ready AI capabilities.
- Leverage AI-assisted development tools (GitHub Copilot, MCP-enabled IDEs, Claude, GPT, etc.) to improve development velocity, code quality, and system reliability.
What We’re Looking For
- 5+ years of experience in software engineering or ML engineering, with hands-on ownership of production-grade LLM, RAG, or agent-based systems.
- Strong Python engineering expertise, with deep experience building RAG pipelines, agent architectures, tool-calling workflows, and text-to-SQL copilots.
- Proven experience working with MCP servers, vector databases, and retrieval-augmented system architectures.
- Strong understanding of agent development, LLM integration patterns, prompt engineering, and runtime orchestration frameworks.
- Hands-on experience with cloud-native infrastructure, including Kubernetes, async workers, queueing systems, and observability/monitoring stacks.
- Demonstrated ability to build LLM evaluation pipelines, guardrails, monitoring, experiment tracking, and regression testing for AI systems.
- Experience with multiple agent SDKs, such as:
- Anthropic SDK
- Claude Agent SDK
- Google ADK (Agent Development Kit)
- Bonus: LangChain, LlamaIndex, AutoGen, or custom agent runtimes
- Strong ownership mindset, with a track record of taking AI prototypes from concept to scalable, reliable, high-traffic production systems.