Junior AI Researcher

DataGPT is the world’s first Conversational AI Analyst. Chat directly with your data and receive analyst-grade answers in seconds.

As a Junior AI Researcher specializing in Data Preparation and Support, you will be instrumental in driving the progress of our AI research initiatives. Your role involves meticulous cleaning of datasets and facilitating the design and rollout of innovative machine learning algorithms. Dive into the forefront of technology by engaging in groundbreaking algorithmic experiments and enhancing the implementation of sophisticated data analysis and GenAI techniques. Work closely with the Data Science Director and fellow team members in a fast-paced, Agile environment, contributing to the cutting edge of data exploration and model creation.

You will:

  • Dataset Optimization: Lead the charge in cleaning and preparing datasets for our GenAI models, ensuring they are diverse and of high quality to establish a strong foundation for sophisticated analysis and modeling.
  • Data Profiling Assistance: Support the enhancement of our data profiling efforts through active participation in data collection, preliminary analysis, and the detailed documentation of data structures and statistics, all under the mentorship of our experienced team members.
  • Model Development Support: Get involved in the initial stages of embedding model development by conducting literature reviews, experimenting with various embedding techniques, and preparing datasets, contributing to both the project’s success and your own professional development.
  • Multidimensional Analysis Support: Assist in the early stages of multidimensional analysis development by helping with the documentation of requirements, identification of data sources, and conducting exploratory data analysis, making foundational contributions to the project.

You are:

  • Analytical: Demonstrate exceptional analytical and problem-solving abilities, grounded in a solid understanding of data science and statistical principles, with meticulous attention to detail in data cleaning and analysis.
  • Technical: Proficient in Python, particularly for data manipulation and analysis. Basic knowledge of LLMs, especially transformers, and an understanding of LLM training and fine-tuning principles, metric spaces, and vectorization techniques. Solid SQL skills and a willingness to learn and apply new technologies or methods related to data profiling, embedding models, and exploratory data analysis.
  • Collaborative: Show a strong commitment to research, literature review, and experimentation with guidance. Take initiative in exploring innovative solutions and adapting to new challenges, effectively contributing to team dynamics and sharing knowledge.
  • Communicative: Effectively communicate findings and progress in support roles, and document work and findings in a manner that is accessible to team members of varying expertise.
  • Proactive: Exhibit a strong desire to learn and adapt, proactively seeking opportunities to enhance technical skills and understanding of data science methodologies.
  • Quick Learner: Eager to quickly absorb and utilize new information.

This position is crafted for individuals who have a background in LLM training and fine-tuning, are familiar with distributed LLM training, and have engaged with open-source software and development practices.

We provide:

  • Fully remote work.
  • Competitive base salary and stock option plan.
  • Full health insurance for both Canadian and US residents.
  • Unlimited vacation.
  • Flexible hours – create your most productive work schedule.

DataGPT is an Equal Opportunity Employer and makes all decisions without regard to age, national origin, race, ethnicity, religion, creed, gender, sexual orientation, disability status or any other characteristics protected by law.
