What you will do
The Data Scientist will play a pivotal role in planning, executing and delivering machine learning (ML)-based projects. The bulk of the work will be in ML modelling, problem analysis and project management, data exploration and preparation, data collection and integration, and operationalization.
The Data Scientist will be a key interface between the analytics team, the business units and various other departments (such as IT). Candidates need to be self-driven, curious and creative.
How you will do it
- Problem Analysis and Project Management
- Guide and inspire the organization on the business potential and strategy of artificial intelligence (AI)
- Identify data-driven/ML business opportunities
- Collaborate across the business to understand IT and business constraints
- Prioritize, scope, and manage data science projects and the corresponding key performance indicators (KPIs) for success
- Define and communicate governance principles
- Data Collection and Integration
- Understand new data sources and process pipelines, and catalog and document them
- Acquire access to various databases and other source systems, such as SQL or graph databases
- Create data pipelines for more efficient and repeatable data science projects
- Data Exploration and Preparation
- Apply statistical analysis and visualization techniques, such as hierarchical clustering, t-distributed stochastic neighbor embedding (t-SNE) and principal component analysis (PCA), to various data
- Generate hypotheses about the underlying mechanics of the business process
- Test hypotheses using various quantitative methods
- Display drive and curiosity to understand the business process to its core
- Network with domain experts to better understand the business mechanics that generated the data
- Machine Learning
- Apply various ML and advanced analytics techniques to perform classification or prediction tasks
- Integrate domain knowledge into the ML solution; for example, from an understanding of financial risk, customer journey, quality prediction, sales or marketing
- Test ML models using techniques such as cross-validation, A/B testing, and bias and fairness checks
- Operationalization
- Collaborate with ML operations (MLOps), data engineers, and IT to evaluate and implement ML deployment options
- Integrate model performance management tools into the current business infrastructure
- Implement champion/challenger tests (A/B tests) on production systems
- Continuously monitor execution and health of production ML models
- Establish best practices around ML production infrastructure
- Train other business and IT staff on basic data science principles and techniques
- Train peers on specialist data science topics
- Network with internal and external partners
- Upskill yourself through conferences, publications, courses, local academia and meetups
- Promote collaboration with other data science teams within the organization (if there is a decentralized data science practice) and encourage reuse of artifacts
What we look for
- Coding knowledge and experience in several languages and tools, for example R, Python/Jupyter, SAS, Java, Scala, C++, Excel and MATLAB.
- Experience with popular database programming languages, including SQL and PL/SQL for relational databases, and with nonrelational (NoSQL/Hadoop-oriented) databases such as MongoDB, Cassandra, and others.
- Experience with distributed data/computing tools such as MapReduce, Hadoop, Hive and Kafka, as well as MySQL
- Experience of working across multiple deployment environments (cloud, on-premises and hybrid), multiple operating systems, and containerization technologies such as Docker, Kubernetes, AWS Elastic Container Service, and others.
Machine Learning and Data Science Knowledge/Skills
- Experience in one or more of the following commercial/open-source data discovery/analysis platforms: RStudio, Spark, KNIME, RapidMiner, Alteryx, Dataiku, H2O, SAS Enterprise Miner (SAS EM) and/or SAS Visual Data Mining and Machine Learning, Microsoft AzureML, IBM Watson Studio or SPSS Modeler, Amazon SageMaker, Google Cloud ML, SAP Predictive Analytics.
- Expertise in solving problems such as vision, text analytics, credit scoring, failure prediction or propensity to buy is preferable.
- Knowledge and experience in statistical and data mining techniques: generalized linear models (GLM)/regression, random forest, boosting, trees, text mining, hierarchical clustering, deep learning, convolutional neural networks (CNN), recurrent neural networks (RNN), t-distributed stochastic neighbor embedding (t-SNE), graph analysis, etc.
Interpersonal Skills and Characteristics
- All candidates must be self-driven, curious and creative.
- They must demonstrate the ability to work in diverse, cross-functional teams in a dynamic business environment.
- Candidates should be confident, energetic self-starters, with strong moderation and communication skills.
- Candidates should exhibit superior presentation skills, including storytelling and other techniques to guide and inspire.
- Candidates should have a minimum of eight (8) years of relevant project experience in successfully launching, planning, and executing data science projects.
- A specialization in text analytics, image recognition, graph analysis or other specialized ML techniques such as deep learning is preferred.
- Ideally, the candidate is adept in agile methodologies and well-versed in applying DevOps/MLOps methods to the construction of ML and data science pipelines.
- Candidates should ideally exhibit project experience in applying ML and data science to business functions.
- Candidates need to demonstrate that they were instrumental in launching significant data science projects.
- Candidates should have demonstrated the ability to manage large data science projects and diverse teams.
- A bachelor’s or master’s degree in computer science, data science, operations research, statistics, applied mathematics, or a related quantitative field is preferred. Alternate experience and education in equivalent areas such as economics, engineering or physics is acceptable. Experience in more than one area is strongly preferred.
- Candidates will ideally have a specialization in ML, AI, cognitive science or data science.