Journalist, data scientist, or both?

When I decided to enroll in the Flatiron School and learn data science, I found myself in a bit of a career identity crisis. I’ve been living and breathing journalism since college. Was I leaving that behind?

My main goal as a Flatiron student is to use the technical skills I’m learning to open up my career options to jobs that weren’t available to me before.

In truth, I don’t know if I’m leaving journalism behind or if I’ll be able to have a job where I’m both a data scientist and a journalist. The most important thing to me is that I’m able to use the aspects of both fields that interest me most.

It turns out there’s a significant overlap between the two. For both journalists and data scientists, it’s important to be able to communicate complicated or niche concepts to audiences who are less familiar with them. Journalists immerse themselves in worlds that are unfamiliar to their audiences in order to tell stories in familiar terms. Similarly, data scientists immerse themselves in data and code in order to draw insights, but they also have to break down their methods and insights for non-technical stakeholders.

I got to put this into practice in my first project for the Flatiron School. The (hypothetical) premise is that Microsoft wants to launch a movie studio and is consulting a data scientist on what kinds of movies are most successful.

Getting to this point wasn’t a walk in the park. In a matter of weeks, I went from nearly zero coding knowledge to being able to tackle this project with SQL, Python, and libraries like NumPy, Pandas, Matplotlib, and Seaborn. I also learned how to scrape websites, make API calls, and maintain a Git repository.

After immersing myself in that world, and after using those skills to draw insights for an imaginary version of Microsoft, I also had to present my methods and findings to a non-technical stakeholder (or rather, a data science instructor pretending to be one). This is where I was hoping my journalism background would be useful.

With that, below is the non-technical version of my analysis for (fake) Microsoft. This is the first of five projects I’ll do at Flatiron, so I still have a lot to learn. I may even look back at this project after I graduate and disagree with my own methods.

You can also check out the GitHub repository for my project here. It contains a Jupyter Notebook with all the Python code I used to do this analysis, as well as the slides for the non-technical presentation.




Microsoft’s Foray Into the Movie World

Flatiron School Data Science: Project 1

Author: Zaid Shoorbajee