Principal Data Scientist
What if your job description were simply “Make tomorrow better”? That’s the essence of roles within our team.
Can you bring insatiable curiosity to the table, challenge yourself, and re-imagine what can be done with data? Do you want to manage and use data to design data-driven controlled experiments? We are looking for experienced engineers who know how to solve complex big data problems, work with algorithms, analyze big data, and run controlled experiments.
We are the Experimentation Team at Microsoft, part of the Application and Services Group (ASG). New data sources and signals come in daily, and we need intelligent people who can dive into that data, make sense of it, and use it to solve large-scale problems across ASG and Microsoft.
What you will do:
- Maintain and work with our data pipeline that transfers and processes several terabytes of data using Spark, Scala, Python, Apache Kafka, Pig/Hive & Impala.
- Work directly with application teams and partners (internal clients such as Xbox, Skype for Business, and Microsoft Office 365) to understand their offerings and domains, and help them succeed with data so they can run controlled experiments (A/B testing).
- Design, build, and support pipelines for data transformation, conversion, and validation.
- Build data manipulation, processing, and data visualization tools and share these tools across the team, ASG, and Microsoft.
- Leverage your statistical and computational knowledge to build algorithms for calculating variances.
- Apply data analysis, data mining, and data engineering to present data clearly and develop experiments (A/B testing).
- Ensure high-quality data, and understand how data is generated from experimental design and how these experiments can produce actionable, trustworthy conclusions.
- Assist senior management in making key business decisions.
- Work with development teams to build tools for data logging and repeatable data tasks that will accelerate and automate data scientist duties.
Qualifications:
- 5+ years of experience working with large data sets or doing large-scale quantitative analysis.
- Bachelor’s or Master’s degree in Computer Science, Math, Physics, Engineering, Statistics, or another technical field. PhD preferred.
- Expert SQL scripting required.
- Development experience in one of the following: Scala, Java, Python, Perl, PHP, C++ or C#.
- Experience working with Hadoop, Pig/Hive, Spark, and MapReduce.
- Ability to drive projects.
- Basic understanding of statistics: hypothesis testing, p-values, confidence intervals, regression, classification, and optimization should be familiar vocabulary.
- Strong algorithmic problem-solving skills.
- Experience manipulating large data sets through statistical software (e.g., R, SAS) or other methods.
- Superior verbal, visual, and written communication skills to educate and work with cross-functional teams on controlled experiments.
- A willingness to learn, share, and improve.
- Experimentation design or A/B testing experience is preferred.