Prompt Engineer

Vectra AI’s Office of the CTO is seeking to expand the Emerging Technologies team with a Prompt Engineer who will contribute to the company’s next generation of AI-based products and services. This role will focus on ongoing research initiatives and prototypes built around large language models (LLMs).

The ideal candidate will collaborate with the team on the difficult challenges of early-stage research and development while also working independently and autonomously in their specialized area of focus. Additionally, this role must be comfortable with the ambiguity of open-ended research while maintaining a strong bias toward converging that research into a concrete, supportable specification and implementation.

Team Responsibilities:

  • Envision, develop, and test prototypes to demonstrate feasibility and potential impact on new or existing lines of business.
  • Transition prototypes from early-stage design into a long-term supportable state, in partnership with peers across organizational boundaries.
  • Research emerging or newly viable technologies to identify innovation opportunities.
  • Promote continuous learning, experimentation, and collaboration throughout the CTO organization and across organizational boundaries at Vectra AI.

Individual Responsibilities:

  • Architect, build, and maintain infrastructure to support multi-agent LLM-based AI Assistants.
  • Collaborate with security researchers, sales engineers, and UX teams to understand product use-cases and workflows, prototyping AI assistants and automation to achieve desired outcomes.
  • Develop, test, and deploy machine learning models and AI algorithms, with a focus on large language models (LLMs).
  • Optimize models for performance, scalability, and reliability.
  • Design and implement data pipelines and workflows to support AI and ML projects.
  • Educate the engineering organization on AI/LLM infrastructure and facilitate knowledge transfer to broader teams.
  • Leverage AI/ML expertise to collaborate with cross-functional teams chartered to identify and enable opportunities for research acceleration through platform capabilities.
  • Flex into adjacent technical domains as needed to maintain research velocity.

Qualifications:

  • Bachelor’s degree in Computer Science or a related field (Master’s or Ph.D. preferred).
  • Extensive experience with Python and ML frameworks such as PyTorch, TensorFlow, and JAX.
  • Expertise in working with and deploying large language models (LLMs):
      ◦ Proficiency in prompt-engineering techniques, such as chain-of-thought and self-reflection.
      ◦ Experience with Retrieval-Augmented Generation (RAG) and VectorDB technologies.
      ◦ Skills in fine-tuning models and in techniques to quantize and accelerate LLMs.
      ◦ Experience with Reinforcement Learning from Human Feedback (RLHF) is a plus.
      ◦ Familiarity with frameworks such as LangChain and MS Guidance.
      ◦ Understanding of LLM evaluation metrics (perplexity, accuracy, etc.).
  • Experience developing for and deploying to cloud platforms such as AWS and Azure.
  • Ability to understand novel ML methods from academic literature and prototype/deploy these models.
  • Real-world experience in evaluating LLM performance in production scenarios is highly desirable.
  • Experience working in an R&D organization.
  • Familiarity with machine learning and data processing frameworks.

Salary: $115,000 - $190,000 per year
