When most people think of big data, they think of numbers, but it turns out that a lot of big data -- a lot of the output of our work and activity as humans, in fact -- is in the form of words. So what can we learn when we apply machine learning and natural language processing techniques to text?
The findings may surprise you. For example, did you know that you can predict whether a Kickstarter project will be funded based on textual elements alone ... before it's even published? Other findings are not so surprising; e.g., hopefully we all know by now that a word like "synergy" can sink a job description! But what words DO appeal in tech job descriptions when you're trying to draw the most qualified, diverse candidates? And speaking of diversity: What's up with those findings about differences in how men and women describe themselves on their resumes -- or are described by others in their performance reviews?
On this episode of the a16z Podcast, Textio co-founder and CEO Kieran Snyder (who has a PhD in linguistics and formerly led product and design in roles at Microsoft and Amazon) shares her findings, answers some of these questions, and offers other insights based on several studies they've conducted on language, technology, and document bias.