Review by mburnamfink - Hiring Data Scientists and Machine Learning Engineers

mburnamfink's review for:

Hiring Data Scientists and Machine Learning Engineers by Roy Keyes

4.0

Hiring Data Scientists is a short book on the difficulties of hiring. I've been on both sides of the table, with mixed results. I did get hired once, and I've sat through a bunch of interviews which didn't go anywhere. Hiring is hard, and this book doesn't have the answers, but it does have some good practical advice.

The most interesting parts of the book are interviews with other managers: Julie Hollek at Mozilla, Chris Albon at Wikimedia, Sean Taylor at Lyft, Angela Bassa at iRobot, and Ravi Mody at Spotify. Data science turns out to be a varied world, with many paths to success. Keyes has picked interesting people and asks good questions, which is a skill.

The more pragmatic thing to consider is that "data science" has expanded to include many diverse duties, and you won't be able to hire a candidate who's good at all of them and has relevant subject matter expertise. So you actually have to think about the specific tasks you need done, in the broad areas of explaining patterns to the business, training new machine learning models, and scaling the production operations of data-driven software. There are lots of great tables of tasks and skills, which sadly got mangled in my Kindle edition. So it goes.

The other pragmatic advice is that your hiring process is a funnel, and you have to ensure that it works properly. Good job postings are your first step, so be clear about the skills you need, and the ones you expect to train people on. Most specific technologies can be learned. Asking for more years of experience in a thing than it's been around is an obvious misstep that'll lose you candidates. The next step is sifting candidates. Reading resumes is tiring, and can expose hidden biases. Worse, junior candidates look very similar on paper, so it's hard to distinguish them. Whiteboard coding is very artificial, and the typical 'reverse a linked list' problems are not representative of actual work. Take-homes are better, but they are biased against working people with families, and require even more effort to grade. One useful tidbit from the interviews is that the only decisions that matter are "hire" and "don't hire", so a ranking scheme should be as binary as possible, rather than trying to pick a candidate from the upper tail of a bell curve.

We're still learning lessons as data science rapidly matures as a field. The idea that you can take someone with a PhD, point them at your business databases, and have them build a recurrent neural network that makes number go up is thoroughly obsolete. A mature idea of what comes next is still being developed, and this book is a worthwhile contribution to the field.