Machine Learning / LLM Engineer
About Oh:
Oh is pioneering hyper-realistic, uncensored AI-driven content, building a full-spectrum ecosystem of multimodal AI products. Our platform powers lifelike digital twins and AI characters across text, voice, and images.
With a mission to become the OpenAI of the spicy content industry, we iterate fast, push boundaries, and deploy cutting-edge, real-time conversational AI experiences at scale.
The Role:
Our platform integrates a variety of multimodal GenAI models. You will own the technical roadmap and full lifecycle of our large language models, most notably our flagship Llama 3.1 70B model, alongside other open-source models.
Your responsibilities will include:
- Fine-tuning with custom and synthetic datasets
- Deploying on GPU platforms to ensure low-latency, cost-efficient, and safe real-time interactions
- Driving multimodal expansion: integrating text, voice, and image capabilities
- Embedding robust safety and compliance measures
- Keeping on top of recent developments in the field and auditing new models for a wide range of purposes (e.g. conversational AI, intent classification, AI agents, life planners)
Key Responsibilities:
🔹 LLM Fine-Tuning & Optimization
- Fine-tune and optimize models (Llama 3.1 70B, GPT-based, Mistral, etc.) using domain-specific and synthetic datasets
- Enhance accuracy, reduce hallucinations, and improve alignment with user intent
🔹 Deployment & Infrastructure Management
- Deploy scalable, memory-efficient models on GPU-based platforms (Runpod, AWS, Kubernetes clusters)
- Optimize GPU inference with Torch, CUDA, TensorRT, vLLM, and DeepSpeed
🔹 Multimodal & Cross-Model Integration
- Integrate additional open-source models to enable image prompt generation, voice synthesis, and dynamic character personalization
- Expand multimodal AI capabilities (e.g. improve LLaVA-based vision models)
🔹 Data Pipeline & Evaluation
- Design robust data pipelines for curation, cleaning, synthetic data generation, and versioning (DVC)
- Implement evaluation metrics and continuous monitoring to ensure model quality
🔹 Real-Time Performance & System Optimization
- Ensure low-latency, real-time performance using mixed-precision training, quantization, pruning, and distillation techniques
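Of the techniques listed above, distillation is perhaps the least self-explanatory. A minimal NumPy sketch of the soft-target distillation loss (a toy illustration with made-up logits, not a full training loop):

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; subtracting the max is for numerical stability.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) at temperature T, scaled by T^2 as in the
    # standard knowledge-distillation formulation.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T * T) * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()

# Toy logits for a 3-token vocabulary.
teacher = np.array([[4.0, 1.0, 0.5]])
student = np.array([[3.5, 1.2, 0.4]])
print(f"distillation loss: {distill_loss(student, teacher):.4f}")
```

The higher temperature softens the teacher's distribution so the student also learns the relative ranking of unlikely tokens, which is where much of the teacher's knowledge lives.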
🔹 Safety, Moderation & Compliance
- Embed robust safety, content moderation, and ethical AI frameworks to comply with GDPR and industry standards
- Develop custom token filters and controlled response mechanisms
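One common controlled-response mechanism is masking disallowed token IDs in the logits before sampling. A minimal sketch, assuming hypothetical token IDs and a toy 8-token vocabulary (real filters operate on a tokenizer's actual vocabulary):

```python
import numpy as np

# Hypothetical banned token IDs for illustration only.
BANNED_TOKEN_IDS = {3, 7}

def filter_logits(logits, banned=BANNED_TOKEN_IDS):
    # Setting a logit to -inf gives it zero probability under softmax,
    # so the token can never be sampled or chosen greedily.
    out = logits.copy()
    out[list(banned)] = -np.inf
    return out

logits = np.array([1.0, 2.0, 0.5, 5.0, 0.1, 0.3, 0.2, 4.0])
safe = filter_logits(logits)
print(int(np.argmax(logits)), "->", int(np.argmax(safe)))  # prints "3 -> 1"
```

Greedy decoding shifts from the banned token 3 to the best allowed token 1; the same mask composes with top-k or nucleus sampling unchanged.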
🔹 Monitoring, Diagnostics & Cost Management
- Set up and maintain monitoring tools (Prometheus, Grafana, TensorBoard, Weights & Biases, Sentry) for performance tracking and cost optimization
Technical Skills & Requirements:
🔹 Experience:
- 5+ years in machine learning engineering, NLP, or AI research with deep expertise in Transformer-based LLMs
🔹 Programming & Frameworks:
- Strong proficiency in Python and Bash scripting
- Hands-on experience with PyTorch, Hugging Face libraries (Transformers, Diffusers, PEFT, Accelerate), and the common ML toolkit (e.g. scikit-learn, pandas, NumPy)
- Familiarity with JAX/TensorFlow is a plus
🔹 LLM Specialization:
- Proven expertise in fine-tuning LLMs using techniques like LoRA, QLoRA, PEFT, RLHF, and prompt engineering
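A minimal NumPy sketch of the low-rank idea behind LoRA (toy shapes, not the Hugging Face PEFT API): the frozen base weight W stays fixed, and only two small factors A and B are trained.

```python
import numpy as np

d_in, d_out, rank = 512, 512, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, zero-initialized
alpha = 16.0                                  # LoRA scaling hyperparameter

def lora_forward(x):
    # Base path plus scaled low-rank update: (W + (alpha/rank) * B @ A) @ x
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapter starts as an exact no-op.
assert np.allclose(lora_forward(x), W @ x)

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs full fine-tune: {full_params}")
```

Here the adapter trains roughly 3% of the layer's parameters (8,192 vs 262,144), which is why LoRA makes fine-tuning a 70B model tractable on modest GPU budgets; QLoRA goes further by keeping W in 4-bit precision.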
🔹 GPU & Inference Optimization:
- Experience with common inference-speed optimization and model quantization techniques
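A hedged sketch of the core idea behind weight quantization, using per-tensor symmetric int8 in plain NumPy (production stacks use per-channel scales and fused kernels via TensorRT or bitsandbytes):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal((256, 256)).astype(np.float32)

# Map the float range symmetrically onto int8 [-127, 127].
scale = np.abs(w).max() / 127.0
w_int8 = np.round(w / scale).astype(np.int8)
w_dequant = w_int8.astype(np.float32) * scale  # reconstruction at inference time

mem_fp32 = w.nbytes
mem_int8 = w_int8.nbytes
err = np.abs(w - w_dequant).max()
print(f"fp32 {mem_fp32} B -> int8 {mem_int8} B, max abs error {err:.4f}")
```

The 4x memory reduction is what it appears to be, and the maximum reconstruction error is bounded by half the scale; the engineering work in real quantization is keeping that error from compounding across dozens of transformer layers.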
🔹 Deployment & Orchestration:
- Skilled in containerization (Docker) and orchestration (Kubernetes) for scalable ML deployments
- Experience with major MLOps frameworks (MLflow, Kubeflow) preferred
🔹 Data Handling:
- Proficient in data wrangling and preprocessing (Pandas, Dask)
- Experience managing large-scale datasets using AWS (S3, Redshift, EC2)
- Knowledge of data QC and monitoring tools (DVC, Great Expectations)
🔹 Additional Knowledge:
- Understanding of retrieval-augmented generation (RAG) techniques
- Familiarity with vector databases (FAISS, Pinecone, Weaviate)
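At its core, the retrieval step of RAG is a top-k similarity search over document embeddings. A toy NumPy sketch with random stand-in embeddings (FAISS, Pinecone, and Weaviate do the same search at scale with approximate-nearest-neighbor indexes):

```python
import numpy as np

rng = np.random.default_rng(2)
doc_embeddings = rng.standard_normal((100, 64))  # pretend corpus of 100 docs

def top_k(query, docs, k=3):
    # Cosine similarity: normalize, then a single matrix-vector product.
    docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    q_n = query / np.linalg.norm(query)
    scores = docs_n @ q_n
    idx = np.argsort(-scores)[:k]                # indices of the k best matches
    return idx, scores[idx]

# A query that is a slightly perturbed copy of document 42.
query = doc_embeddings[42] + 0.01 * rng.standard_normal(64)
idx, scores = top_k(query, doc_embeddings)
print(idx, scores)
```

The retrieved document IDs would then be used to fetch text chunks that are prepended to the LLM prompt; the brute-force search here is O(N) per query, which is exactly what ANN indexes trade a little recall to avoid.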
Preferred Qualifications:
✅ Experience integrating and optimizing multimodal models (text, voice, image, video)
✅ Background in AI-driven gaming, digital experiences, or adult content
✅ Familiarity with CI/CD pipelines (GitLab CI, Jenkins) for ML workflows
✅ Interest or experience in crypto, Web3, or NFT-based AI models
✅ Prior exposure to AI governance, safety, or ethical AI frameworks
What We Offer:
💰 Competitive Compensation:
- Attractive salary, benefits, and equity participation
🌍 Remote & Flexible:
- Remote-first work environment with flexible hours
🚀 Growth & Leadership:
- Rapid career advancement and the opportunity to shape our AI strategy
🔥 Innovative Culture:
- Join a fast-paced team at the forefront of advanced, uncensored AI applications
If you're passionate about pushing the boundaries of AI-driven experiences and have a track record in developing, deploying, and optimizing cutting-edge LLMs, we want to hear from you! 🚀
Contact Detail:
OhChat Recruiting Team