Job description
Who we are
We are a collaborative, product-minded engineering and data organization that ships value iteratively. We combine rigorous analytics with pragmatic engineering to solve real customer problems at scale. Our teams partner closely with product, design, and customer-facing functions, using modern data tooling and cloud infrastructure to deliver reliable insights and intelligent features. We value ownership, curiosity, and a bias for action—balanced with healthy code and data quality practices.
What you will do
- Learn our data domain, pipelines, and tooling through guided onboarding and hands-on tasks.
- Support data preparation: collect, clean, transform, and document datasets for analytics and modeling.
- Assist with exploratory data analysis (EDA): summarize datasets, visualize patterns, and surface data quality issues to your mentor.
- Contribute to supervised model prototyping: implement baseline models, iterate on features, and compare approaches using sound evaluation methods.
- Run structured experiments: track hypotheses, metrics, and outcomes; communicate results clearly to the team.
- Document work thoroughly: notebooks, experiment logs, code comments, READMEs, and lightweight reports for stakeholders.
- Pair with engineers and data scientists on small, well-scoped tasks that ladder into a meaningful intern project by the end of the term.
- Follow software craftsmanship practices: version control, code reviews, testing, and simple, maintainable code.
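To give a flavor of the baseline-first workflow the responsibilities above describe, here is a purely illustrative sketch: a trivial majority-class baseline evaluated on a held-out split. The data, split, and metric are made-up toys for illustration, not part of our actual stack or any real project.

```python
# Illustrative only: a majority-class baseline with a held-out test
# split and an accuracy check. All data here is made up.
from collections import Counter

def train_test_split(rows, labels, test_frac=0.25):
    """Deterministic split: hold out the last test_frac of the data."""
    n_test = max(1, int(len(rows) * test_frac))
    return (rows[:-n_test], labels[:-n_test],
            rows[-n_test:], labels[-n_test:])

def majority_baseline(train_labels):
    """Return a 'model' that always predicts the most common label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _row: majority

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy dataset: single-feature rows with binary labels.
X = [[0.1], [0.4], [0.9], [0.2], [0.8], [0.3], [0.7], [0.35]]
y = [0, 0, 1, 0, 1, 0, 1, 0]

X_tr, y_tr, X_te, y_te = train_test_split(X, y)
model = majority_baseline(y_tr)
preds = [model(row) for row in X_te]
print(f"majority-baseline accuracy: {accuracy(preds, y_te):.2f}")
# → majority-baseline accuracy: 0.50
```

Starting from a baseline like this makes every later model comparison meaningful: an improvement only counts if it clearly beats the simplest approach under the same evaluation.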
How we work
- Mentorship-first: you’ll have a primary mentor, regular check-ins, and scoped milestones to build confidence and impact.
- Pragmatic rigor: we value simple baselines, clear comparisons, and incremental improvements over overly complex solutions.
- Collaboration: pair programming, shared reviews, and open discussions to learn faster and raise the quality bar.
What success looks like
- By the end of your internship, you’ve delivered a scoped project or feature under supervision with measurable impact and clear documentation.
- You demonstrate growth in data wrangling, evaluation, and communicating findings, and you can explain trade-offs behind your modeling choices.
- Your code, notebooks, and reports are reproducible, reviewed, and integrated into our workflows.
How to apply
- Please include links to a portfolio, GitHub, or class projects that demonstrate your data skills and curiosity.
- Highlight any coursework in statistics, ML, NLP, data engineering, or cloud.
Your profile
We welcome applications from students who are excited to learn and contribute in a supportive, mentored environment.
- Currently enrolled in a Bachelor’s or Master’s program in Computer Science, Data Science, Mathematics, Statistics, Physics, or a related field, or a similar training program.
- Foundational Python skills (pandas, NumPy); comfort working in notebooks and scripts for data analysis.
- Basic understanding of machine learning concepts (train/validation/test splits, overfitting, evaluation metrics). Experience from coursework or personal projects is great.
- Familiarity or strong interest in NLP, classical ML, or simple deep learning workflows; curiosity to explore model behavior and error analysis.
- Experience using Git and code reviews; ability to break down tasks and ask clear questions when blocked.
- Clear written and verbal communication; ability to summarize findings for technical and non-technical audiences.
- Enthusiasm for learning more about data science and data engineering.
Nice to have
- Exposure to TypeScript or JavaScript for simple data tools or dashboards.
- Introductory experience with cloud platforms (AWS preferred) and services like S3, Lambda, or SageMaker/Bedrock.
- Experience with experiment tracking tools (e.g., MLflow, Weights & Biases) or evaluation frameworks.
- Other real-world experience that has helped shape your critical and analytical mind.
Skills & technologies
pandas, AWS, SageMaker, Git, Weights & Biases, Python, cloud infrastructure, statistics, NLP, communication, machine learning, MLflow, data pipelines, exploratory data analysis, code reviews, NumPy, data wrangling, Lambda, S3, model evaluation