About the Role
 
We are building a multilingual Large Language Model (LLM) tailored for Bahasa Indonesia and regional languages. We are looking for a passionate Senior Data Scientist to help shape the future of open and inclusive AI for Indonesia and to play a pivotal role in identifying impactful AI use cases. As a Senior Data Scientist working on LLMs, you will design and build high-quality datasets, develop advanced model pretraining, fine-tuning, and alignment techniques, and collaborate closely with product and engineering teams to ship safe, reliable LLM-powered features to millions of users. This role offers the opportunity to drive innovation, solve critical business challenges, and shape the future of AI-driven solutions at GoTo Group.
 

What You Will Do

  • Perform data annotation and labeling based on provided guidelines
  • Validate language accuracy, grammar, and contextual relevance
  • Review annotated datasets to identify and correct errors
  • Ensure consistency and quality across large volumes of data
  • Collaborate with internal teams to refine annotation processes
  • Provide feedback to improve annotation guidelines and workflows

What You Will Need

  • 4+ years of experience in LLMs, deep learning, NLP, computer vision, or voice.
  • Proficient in data preprocessing, model training, evaluation, and optimisation.
  • Practical experience in applying deep learning to solve real business problems, with models successfully deployed and used in production environments.
  • Proficient with Python and deep learning frameworks such as PyTorch or TensorFlow.
  • Experience with cloud platforms such as Alibaba Cloud (Alicloud) or Tencent Cloud.
  • Strong communication skills to understand business needs and effectively convey analytical solutions.
  • Ability to write clear and concise technical documentation.
  • A Master’s or PhD in Computer Science, Data Science, AI, or a related field.
  • Proficiency in Bahasa Indonesia is an advantage.

About the Team
 
The LLM team is on a mission to build the most capable and culturally aligned multilingual LLMs for Indonesia. At GoTo Group, the team is at the forefront of developing state-of-the-art language models. We are building foundational AI models that understand and generate Bahasa Indonesia and regional languages, empowering more inclusive technology. We work on everything from continual pretraining of large-scale LLMs to alignment and safety fine-tuning, using both structured and unstructured data from diverse sources. Our projects span core model development, dataset curation, safety systems, and real-world deployment in consumer and enterprise applications. Our team brings together members with diverse technical and cultural backgrounds who contribute expertise in machine learning and local languages.