Build data systems and pipelines for data collection. Combine raw information from different sources.
Process, cleanse, and validate the integrity of data to be used for analysis. Explore ways to enhance data quality and reliability.
Analyze large volumes of raw data, both structured and unstructured. Interpret trends and patterns.
Conduct complex data analysis, report on the results, and present data using various data visualization techniques and tools.
Prepare data for prescriptive and predictive modeling, including machine learning models.
Build algorithms and prototypes
Identify opportunities for data acquisition
Develop analytical tools and programs
Continually improve coding skills
Special requirements
Degree in Computer Science, Physics, or Petroleum Engineering; a Master’s degree is a plus
Minimum 1 year of experience as a data engineer/scientist or in a similar role
Technical expertise with data models, data mining, and segmentation techniques.
Ability to compose pipelines for data science models.
Knowledge of Machine Learning techniques, including decision tree learning, clustering, artificial neural networks, etc., and their pros and cons
Data Wrangling – proficiency in handling imperfections in data.
Programming Skills – good knowledge of statistical programming languages such as R and Python, and hands-on experience with database query languages such as SQL.
Proficiency in essential Python libraries - NumPy, pandas, scikit-learn, TensorFlow/PyTorch, seaborn, and BeautifulSoup/Scrapy for web scraping when necessary.
Statistics – Good applied statistical skills, including knowledge of statistical tests, distributions, regression, maximum likelihood estimators, etc.
Basic Math Skills (Linear Algebra) – understanding of the fundamentals of linear algebra.
Knowledge of time series data analysis and modelling.
Knowledge of regular expressions.
Knowledge of data visualization tools such as Power BI, Spotfire, Tableau, matplotlib, etc.
Great numerical and analytical skills
Excellent Communication Skills – communicating effectively with both technical and non-technical audiences.