NLP Engineer

Insilico Medicine



About Insilico

Insilico Medicine is an end-to-end, artificial intelligence (AI)-driven pharma-biotechnology company with a mission to accelerate drug discovery and development by leveraging our rapidly evolving, proprietary platform across biology, chemistry, and clinical development.


For more info, visit our website


About the Role

Insilico Medicine is looking for a Machine Learning Engineer specializing in Natural Language Processing (NLP) tasks within the biomedical and materials science domains. The role focuses on areas such as text classification; information extraction from abstracts, patents, and clinical trials; multi-task learning; knowledge graph construction; and fine-tuning large language models (LLMs) for chemical and biomedical applications.



Place of work

René-Lévesque West Boulevard, Montreal, QC H3B 4W8, Canada



Reports to

NLP Team Lead (AI R&D)


Responsibilities


  • Fine-tune and optimize Large Language Models on domain-specific or custom datasets.
  • Analyze errors, identify system limitations, and propose enhancements.
  • Search and review state-of-the-art solutions and new datasets for NLP tasks.
  • Translate the latest research and academic innovation into scalable, maintainable engineering solutions.
  • Build and curate datasets using annotation tools, distant supervision, and expert annotations.
  • Collaborate closely with clients and internal stakeholders to align research-driven initiatives with business needs.

General Requirements:



I. Education

Master’s degree or PhD in Computer Science, Machine Learning, or a related field.


II. Experience and Skills


  • 5+ years of hands-on experience in NLP, Machine Learning, and Deep Learning.
  • Strong understanding of Machine Learning, Deep Learning, and AI.
  • Strong proficiency in Python programming.
  • Motivation to learn new things and apply creative solutions.
  • Hands-on experience in scaling and optimizing large language model (LLM) training and fine-tuning, including multi-GPU/multi-node setups.
  • Familiarity with frameworks like DeepSpeed, FSDP, Megatron-LM, or equivalent.
  • Ability to diagnose and resolve performance bottlenecks in distributed training.
  • Experience fine-tuning LLMs (e.g., GPT, LLaMA, Mistral) on custom or domain-specific datasets.

Desirable skills:

  • Knowledge of chemistry and biology, particularly for domain-specific NLP applications in life sciences.
  • Familiarity with Reinforcement Learning concepts and frameworks.

Personal Attributes:

  • Motivation to explore and apply creative, cutting-edge solutions.
  • Strong communication and collaboration skills.
  • Ability to work independently in a dynamic, fast-paced environment.

To apply, please visit the following URL: