Tufts by Numbers: What data knows

Digging through data this semester, I tried to expose stories that usually stay buried under complex databases and seemingly meaningless numbers. By parsing data on topics like college tuition costs, class waiting lists and the elusive Tufts snow day, I hope I gave readers an opportunity to reach a greater understanding of their own world. But over the course of the semester, I have overlooked perhaps the most important data mine of them all: ourselves.

In our modern technology-driven environment, each command that we make on our phones and computers is tracked in databases around the globe. And, more importantly, this data doesn’t just sit in those databases. Instead, it is used to fuel the content, interfaces and advertisements that we see on our screens and to push computing capability even further.

Didn’t realize the huge role you play in data science? See how you feel after asking Google to tell you a little bit about yourself. Entering google.com/ads/preferences into your browser will return what Google believes to be your age, gender and interests. These qualifiers come from Google’s machine learning algorithm – a data science term for a system that develops as it is fed more information, such as search queries – which tracks your Google searches and your visits to sites that are part of Google’s ad network.

You can even find Google’s stash of all your past queries if you look hard enough. When you are signed into a Google account, accessing google.com/myactivity returns all of your past Google searches. Search too many cat memes at home? The public might never know, but Google sure does.

Computers and data algorithms also learn how to mimic human behavior and skills from human-provided data. In a recent project by Quartz, a computer algorithm was fed thousands of pages of human-written love stories to see if it could write its own. After being fed the prompt, “I loved him…,” the computer experienced a few hiccups, but managed to spit out the semi-coherent sentence, “I loved him for the weekend as well, and I drank apple martini ingredients like hummingbird saliva or snake testicles.” Sure, it’s not a completely articulate love story, but it’s eerily closer to love than we would expect a computer to be able to understand.

The massive amount of data on each and every one of us that is collected and used daily shows how significant understanding data really is. When companies and algorithms are using our every keystroke to learn and grow, shouldn’t we be able to easily use data for the same purposes as well? Each week in this column, I have tried to help fulfill exactly that human need – to learn, grow and develop from intricate sets of numbers.