Job Description
We are Nexus Horizon, a pioneer in next-generation cognitive computing. We are looking for a world-class Senior AI Engineer to spearhead the development of our flagship Large Language Model (LLM) products. If you are passionate about pushing the boundaries of artificial intelligence and want to build systems that understand and generate human-like text at scale, we want to meet you.
In this role, you will bridge the gap between theoretical research and production-grade software engineering. You will work closely with our research scientists and product managers to deploy robust, scalable AI solutions that power the future of enterprise communication.
Responsibilities
- Model Development: Design, train, and fine-tune large-scale transformer models (e.g., GPT- and LLaMA-style architectures) using PyTorch and TensorFlow.
- Infrastructure Optimization: Architect and optimize inference pipelines to ensure low-latency, high-throughput performance in cloud environments (AWS/Azure).
- Research Implementation: Translate cutting-edge academic research papers into production-ready code and prototype novel architectures.
- Collaboration: Partner with cross-functional teams (Data Science, Product, Engineering) to define AI product requirements and technical roadmaps.
- Evaluation: Establish rigorous metrics for model performance, including accuracy, hallucination rates, and safety alignment.
- Mentorship: Guide junior engineers and interns in best practices for MLOps and model deployment.
Qualifications
- Education: Master's or PhD in Computer Science, Machine Learning, or a related field; 5+ years of industry experience in AI/ML.
- Programming: Expert-level proficiency in Python and experience with deep learning frameworks (PyTorch preferred).
- Experience: Proven track record of deploying LLMs or NLP models at scale; experience with Hugging Face Transformers and Retrieval-Augmented Generation (RAG).
- Tools: Strong understanding of MLOps tools (MLflow, Kubeflow, Docker, Kubernetes) and cloud infrastructure.
- Problem Solving: Ability to troubleshoot complex distributed system issues and optimize model efficiency.
- Communication: Excellent verbal and written communication skills for technical documentation and stakeholder presentations.