San Francisco - California
Date Posted: Jan. 18, 2018
Requisition ID: MAC08062
Macy's is looking for a passionate, talented, and innovative Machine Learning Engineer with a strong machine learning background to help build industry-leading AI/ML-enabled applications for our retail/commerce platform. As Lead ML Engineer, you will be part of the team shaping the future of retail/commerce and revolutionizing how millions of customers shop online, in stores and through other channels. You will partner with technology and business teams to build new AI/ML-enabled smart services that surprise and delight our customers.
The Lead Machine Learning (ML) Engineer will work closely with data scientists, application development leaders and platform architects to take ML models from R&D to production, then scale and maintain them. You will also help mature and continuously update Macy's ML platform by adopting best-of-breed toolsets.
You will also play a key role in developing tools and abstractions that our other developers will build on top of.
Perform other duties as assigned.
• Lead and work with ML engineers and data scientists to build smart commerce services & ML applications
• Assess how a given problem can be addressed using advanced data analytics and/or various ML techniques
• Identify what kind of data in which format should be collected in order to solve the problem
• Identify which advanced data analytics or machine learning technique, such as reinforcement learning, deep learning or traditional ML, is most suitable for the problem
• Clean & structure the data and build training & validation data sets by applying advanced feature-engineering techniques
• Train models and show that they perform better than the baseline
• Implement solutions from early exploration through to production
• Set up development and test environments for advanced analytics and machine learning projects
• Consistently demonstrate regular, dependable attendance & punctuality
• BA/BS Degree required (MS or PhD preferred).
• 5+ years of industry experience in Machine Learning or related field.
• Extensive programming experience in Python and Java.
• Strong preference for programming experience in Spark framework and PySpark.
• Strong preference for hands-on experience with TensorFlow, scikit-learn, PredictionIO, Spark MLlib, MXNet, Caffe, H2O or other ML libraries.
• Basic knowledge of RESTful APIs.
• Familiarity with various machine learning techniques and libraries.
• Working knowledge of Big Data technologies (Hadoop, Cassandra, Spark, Kafka, YARN) and of real-time and batch processing is a plus.
• At least three years of experience working with machine learning algorithms and libraries. You should know how to choose a suitable model (decision tree, nearest neighbor, neural net, support vector machine, ensemble of multiple models, etc.) while working with widely available open source implementations of machine learning algorithms through libraries/packages/APIs (e.g. scikit-learn, Theano, Spark MLlib, H2O, TensorFlow, etc.). You should know how to devise a learning procedure to fit the data (linear regression, gradient descent, genetic algorithms, bagging, boosting, and other model-specific methods) and understand how hyperparameters affect learning. You also need to be aware of the relative advantages and disadvantages of different approaches, and of the numerous gotchas that can trip you up (bias and variance, overfitting and underfitting, missing data, data leakage, etc.).
• Strong knowledge of computer science fundamentals including data structures (stacks, queues, multi-dimensional arrays, trees, graphs, etc.), algorithms (searching, sorting, optimization, dynamic programming, etc.), computability and complexity (P vs. NP, NP-complete problems, big-O notation, approximation algorithms, etc.), and computer architecture (memory, cache, bandwidth, deadlocks, distributed processing, etc.).
• Good understanding of probability and statistics: the core concepts (conditional probability, Bayes' rule, likelihood, independence, etc.) and the techniques derived from them (Bayes nets, Markov decision processes, hidden Markov models, etc.), which are at the heart of many machine learning algorithms; and the statistical measures (mean, median, variance, etc.), distributions (uniform, normal, binomial, Poisson, etc.) and analysis methods (ANOVA, hypothesis testing, etc.) that are necessary for building and validating models from observed data.
• At least 6 years of professional software programming experience and a strong understanding of software engineering and system design. You need to understand how machine learning components work with other software components, how to communicate with them (using library calls, REST APIs, database queries, etc.) and how to build appropriate interfaces for your component that others will depend on. You should be able to design and implement ML components that avoid bottlenecks and let algorithms scale well with increasing volumes of data. Solid knowledge of software engineering best practices (including requirements analysis, system design, modularity, version control, testing, documentation, etc.) is necessary for productivity, collaboration, quality and maintainability.
• 3+ years of experience with Big Data technologies such as Hadoop, Spark, Kafka and Hive, and with Scala. Excellent coding ability in Java is required. 2+ years of feature engineering and data science with Python, Jupyter, scikit-learn and NumPy. Knowledge of design and architecture: service-oriented and message-driven architectures.
• Proven experience in deep learning, e.g., computer vision, natural language translation, or speech recognition, using tools such as TensorFlow and Keras
• Deep expertise in statistical/predictive models, e.g., decision making, adaptive algorithms, or diagnostic problem spaces (many use cases in operations, omni-channel fulfillment, logistics, etc.)
• Expertise in building, training and productionizing ML models in Python/Java/Scala using Spark/TensorFlow/H2O... Technically adept and hands-on (focus more on fundamentals)
• Experience working with large datasets in a ‘Big Data’ environment, preferably in retail/e-commerce or another customer-facing industry, for feature engineering, PCA, etc., using tools such as Tableau/PowerBI/... (partnering with data scientists)
• Experience deploying ML solutions in the cloud, both as PaaS and IaaS (ensuring cloud readiness and cloud transition)
• Strong attention to detail when identifying data relationships, trends, and anomalies
• Ability to generate quick, iterative solutions to a wide range of business problems
• Ability to take initiative and deliver tangible results under deadlines
• Flexibility to work across all functions/levels as part of a dynamic team