At the Hortonworks seminar on Hadoop for Modern Data Architectures, today, there was a lot of discussion around the skills required to get maximum value from the floods of data pouring in from sensors, websites, machine logs, phone data and the rest of the BigData deluge. At the Hortonworks Data Science workshop, the experiences needed in a Data Scientist were listed as
- Research Scientist
- Platform selector
- Database schema designer
Apparently this skill set is so rare that the Hortonworks team recommend their customers recruit data science teams, rather than hunting this mythical Unicorn. From their presentation … For years, I’ve struggled to know what label to put on my CV – apparently I should say Unicorn. I started doing research into sonar, modelling the ocean and atmospheric environments, the vessels and sensors and the processing behind them, then improving and automating the signal processing and presentation of data. ‘Sonar Researcher’ wouldn’t transfer well, so I opted for ‘Algorithmist’. After I’d moved on to putting together larger scale production systems concentrating on the integration aspects, that label didn’t seem to fit so well, so I tried ‘Software Integrator’. After some hardware control and visual demonstrators, I had a spell of database schema design, then platform selection and setup and fell back on the more general ‘Software Engineer’, and this lasted through a period of speech processing and web app development. The problem is there are a lot of Software Engineers out there, with varying levels and areas of skill and expertise. There are DBAs and Web designers, architects and game builders. I tried to find a succinct term summarizing my skill set and specialism, but no label seemed to show that I love data – organizing, combining, categorizing, automating, and then getting surprising and useful information out of a mass of bits. When my kids asked, I could only say that ‘I do stuff with data’. But now I know – from now on my CV will read ‘I am a Unicorn’.