AI Scientist III – Multi-Modal LLMs

Benefits:

Paid Vacation Time, Paid Sick Time, and Paid Holidays
401k 6% match with immediate vesting
Nationwide Medical Insurance plans and coverage (Medical, Dental/Orthodontia, Vision)
Teladoc
HSA company match
3 Medical plan options including a Low Deductible PPO Medical Plan Offering
Employee Assistance Program
Engaged Employee Resource Groups
Outstanding Learning and Career Development Opportunities

Pay Range: Actual pay may vary up or down depending on job-related factors which may include knowledge, skills, experience, and location. In addition, this position may be eligible for incentive compensation.

Essential Duties And Responsibilities:

Research, develop, and productize state-of-the-art multi-modal LLMs using techniques such as transformers, contrastive learning, autoencoders, and other emerging technologies.
Work with diverse data types, including text, images, audio, and video, to develop and optimize multi-modal AI models for applications such as video captioning, visual question answering, and cross-modal retrieval.
Design and conduct experiments to train and validate multi-modal LLMs for tasks such as text-to-video generation, video-to-text generation, and multi-modal reasoning.
Collaborate with software engineers to deploy scalable, real-time multi-modal AI systems in the cloud, ensuring optimal performance and efficiency.
Create and manage efficient data preprocessing pipelines for multi-modal data, proactively identifying and integrating new data sources.
Consult with Sorenson management on new AI directions and opportunities, and recommend new products that leverage multi-modal LLMs to enable human communication.
Maintain a deep understanding of current research, technologies, and emerging trends in the field to inform and guide AI development.

Supervisory Responsibility:

This position has no direct supervisory responsibilities but does serve as a coach and mentor for other positions in the department.

Travel Requirements:

Less than 25%

Education:

Minimum: 4 Year / Bachelor’s Degree in a field related to science, engineering, software, or mathematics.

Preferred: Graduate Degree / PhD in a field related to science, engineering, software, or mathematics.

Experience:

5+ years of experience in an AI field (e.g., deep learning, ML, CV, NLP, ASR).
Publications in top conferences such as CVPR, ACL, NeurIPS, ICML, or ICLR are a plus.

Knowledge, Skills, And Abilities:

Strong foundation in natural language processing (NLP), computer vision, and multi-modal learning, with expertise in areas such as transformers, cross-attention, modality fusion, and vision-language systems.
Proven track record of solving complex problems with innovative solutions, demonstrating the ability to develop novel algorithms and adapt existing methods to new challenges.
Extensive experience with large language models (LLMs) and architectures commonly used in multi-modal learning, such as BERT, T5, GPT, ViT, VAEs, MAEs, and CLIP.
Proficiency in programming languages like Python and C++, and deep learning frameworks such as PyTorch and TensorFlow.
Familiarity with libraries and tools such as Hugging Face, LLaMA-Adapter, OpenCV, and NLTK.
Experience with model optimization techniques for efficient inference in real-world applications, such as model compression and knowledge distillation.
Familiarity with cloud computing services like AWS for training and deploying multi-modal LLMs at scale.
Strong problem-solving skills and the ability to think creatively to develop innovative multi-modal AI solutions.
Excellent communication and collaboration skills, with the ability to work effectively in a diverse team of AI experts.
Passion for continuous learning and staying at the forefront of multi-modal learning research and development.

Company Summary:

Our Mission…Harnessing the power of language, we connect diverse people and enrich the human experience.

Our Vision…To provide global language services that expand opportunities, nurture belonging, and empower the world to connect beyond words.
