Job Description
We are seeking a Systems Engineering Intern proficient in Python, PyTorch, and FastAPI, with
expertise in threading, synchronisation, and distributed training and inference of language
models. The intern will design and implement a scalable system that lets clients run machine
learning (ML) and deep learning (DL) models with minimal inference time on a distributed
architecture.
Responsibilities
- Design and develop a scalable system that lets clients run various ML and DL models with
minimal inference time, leveraging a distributed architecture.
- Collaborate with the team to identify client requirements and design solutions.
- Implement efficient threading and synchronisation techniques to optimise inference time.
- Train ML/DL models across multiple nodes for improved performance.
Requirements
- Pursuing a Bachelor of Technology (B.Tech) degree.
- Proficiency in the Python programming language.
- Familiarity with deep learning frameworks such as PyTorch.
- Knowledge of FastAPI for building web APIs.
- Understanding of DevOps practices and experience with system design for complex systems.
- Strong understanding of threading and synchronisation concepts.
- Experience with distributed training and inference of deep learning models.
- Ability to work with Distributed Data Parallel (DDP) for model training.
About shodh.ai and the Job
We are India's only deep learning product-building lab. We have developed our own pipeline
and framework for pre-training our own LLMs, which are built from scratch. We have our
own H100 clusters and are currently building tools for various unicorns and companies, as well as
DRDO. The founding team includes ex-Microsoft Research scientists and ex-DRDO scientists.
Place: Remote (Company locations in Jaipur/Bangalore/Hyderabad)
PPO (pre-placement offer): Based on performance
DOJ (date of joining): At first availability