Corporate Director and Principal Partner at Be Beyond
Python Data Science Engineer - PySpark (3-6 yrs)
Core skills:
1. Solid understanding of applying Python, SQL and Linux bash scripting for data preprocessing, data wrangling, data analysis and modelling.
2. Python, pandas, NumPy, PySpark and MapReduce (MR) Python streaming.
3. Visualisation packages such as Plotly and Matplotlib.
4. Experience working with unstructured and messy data.
5. Understanding of the best practices for the above and what it takes to write production code.
6. Statistical and mathematical models such as regression, clustering, time series and classification.
7. Should have promoted at least one data project to production.
Good to have:
1. Machine learning experience in both Hadoop (Spark ML) and other platforms (scikit-learn, CNTK, TensorFlow) as well as contemporary cloud offerings (Azure ML, Cortana Intelligence, Cognitive Services, Google Cloud ML Engine).
2. Ability to execute and deploy code for data management (ETL).
Responsibilities:
1. Work on internal and external engagements through code, in the areas of data mining, machine learning, predictive analytics and integrating science into systems.
2. Work through the data engineering pipeline: data ingestion, data quality, data characterisation, metadata management and preliminary model automation.
3. Work in areas such as image processing, text processing, business rules, RPA and IoT.
4. Experience in handling clients and engagements, and the ability to manage span of control over both team and work, will determine whether the role is senior or junior.
Qualifications:
1. Bachelor's degree in a STEM programme.
2. 4-5 years of experience.
3. Experience working on data-centric Python programs using some of the libraries mentioned above.
4. Very good direct experience in analytical model development