At DatologyAI, we are on a mission to transform how companies train their large models on proprietary data. Traditional methods rely on random data sampling, which can be inefficient and detrimental to model quality. Our groundbreaking research has demonstrated that smarter data selection significantly improves model training efficiency and quality.
Founded to turn this research into practical tools, DatologyAI helps enterprises identify the optimal data for training, leading to superior models at a lower cost. Our team comprises pioneers in deep learning data research, seasoned startup founders, and creators of enterprise ML tools.
Following our impressive $11.65M Seed round last September, we’ve recently secured a $46M Series A led by Felicis Ventures, with additional backing from Radical Ventures, Amplify Partners, Microsoft, Amazon, and notable angels including Jeff Dean, Geoff Hinton, Yann LeCun, and Elad Gil. With over $57.5M in total funding, we are scaling rapidly to revolutionize data curation across modalities.
About the Role: Dive into Cutting-Edge Research
As a Research Intern at DatologyAI, you will have the unique opportunity to delve into how interventions on training data can enhance the quality and influence the behavior of deep learning models. Here’s what you can expect from this role:
Transform Messy Literature into Practical Solutions: Navigate the vast and evolving research literature to source, vet, implement, and refine promising ideas. Your scientific skills will be crucial in translating theoretical research into practical improvements.
Engage in High-Risk, High-Reward Research: Focus on transformative problems that could revolutionize how data is ingested into ML models. Rather than making incremental changes, you’ll tackle novel projects with the potential for significant impact.
Conduct Science Driven by Real-World Needs: At DatologyAI, we prioritize research that addresses concrete customer needs and product improvements over purely academic benchmarks. Your work will directly influence our product development and customer satisfaction.
Collaborate and Innovate: Work closely with engineers, engage with customers, and contribute to shaping the product vision. Science at DatologyAI extends beyond experiments; it’s about making meaningful contributions to our product and industry.
About You: Ideal Candidate Profile
We are looking for candidates with:
Strong Coding Skills: Proficiency in programming is essential, along with experience in any of the following research areas:
Data research
Data pruning/curation
Curriculum learning
Synthetic data generation
Dataset distillation
Effects of training data on model behavior
Embedding models
Semantic search
Efficient ML
Relevant Experience or Publications: Practical experience and/or publications in training large vision (especially video), language, or multimodal models are highly desirable.
Passion for Innovation: We encourage candidates who are driven to explore and to teach us something new that could improve data curation. If you have a unique perspective or innovative idea, we’d love to hear about it.
Additional Information
Internship Availability: This role is currently closed for summer 2024, but we are accepting applications for fall and winter internships.
Location: Based in Redwood City, CA. We work in person 4 days a week and offer relocation assistance for new hires. Visa sponsorship is available for selected candidates.
Ready to Make an Impact?
If you’re excited about the opportunity to drive innovation in data curation and contribute to the future of AI, we’d love to hear from you. Apply now to join DatologyAI as a Research Intern and be part of a team that is pushing the boundaries of what’s possible in AI!