Digital DNA: How Algorithms Get a Grip on Our True Selves
2018 saw one of the world's largest data breach scandals. Facebook was accused of selling user data, including personal messages, to Cambridge Analytica; that company, in turn, supposedly used the data to influence election campaigns all over the world. Summoned to testify by both US and European authorities, Facebook CEO Mark Zuckerberg promised a privacy reform. Two years later, he still hasn't fulfilled that promise. Meanwhile, Facebook users continue taking funny "What kind of pizza are you?" quizzes. So how does social media decode our digital DNA, and why is it about time to stop trying to learn which Disney princess you are?
If we imagine life without a smartphone, it feels almost prehistoric. Today we've become inseparable from our gadgets: we communicate, google, check the weather and traffic, book train tickets and make doctor's appointments. We work online. And we write. We write a lot. Leo Tolstoy's War and Peace runs to 2.5 million characters; Twitter users type the equivalent of the novel every 1.5 seconds. Over three million emails are sent per second. No one has ever written as tirelessly as netizens do.
With cafes and other public places shut and schools and universities closed, national quarantines are winding real social life down, pushing people to plunge into the online world to make up for lost communication and entertainment.
We used to believe our "second", online life was clandestine. You don't have to show your real face on social media, and you can engage in all kinds of activities without anyone knowing about it; just clear your browser history. For many, then, online life is a means of escaping harsh reality and trying to be someone else. Still, things are not that simple.
Uncovering our true selves
The problem is that every step we take online is recorded and remembered by the system. Every Google query and every Facebook like becomes part of our digital footprint, which remains alive for an indefinite amount of time. The digital footprints scattered all over the net can describe the person who left them with remarkable accuracy: his or her personality, desires, interests and views. We seem to have acquired a second set of genes, only digital ones.
The words and sentence structures you use in your posts reflect your emotional state and mindset. Likes can reveal your political preferences, religiousness, sexuality, drug and alcohol addictions and much more. Back in 2013, researchers from Cambridge learned to estimate IQ from a person's Facebook likes, and their estimates were accurate. "If you like thunderstorms, The Colbert Report or curly fries on Facebook, you're a genius. If you like Sephora, Harley-Davidson or the country-western band Lady Antebellum, you're not." But what does a high IQ have to do with curly fries? Although such a connection might seem pretty stupid, computers think differently.
They can detect very subtle statistical deviations that escape our perception but can be used to evaluate personality. As for curly fries, researchers tend to explain their correlation with high IQ through the theory of homophily. The idea is that smart people prefer to have smart friends, just as young people are more likely to hang out with other young people. So, once upon a time, some wise guy liked curly fries, his smart friends liked them too, and then their smart friends took over the tradition, and away we go: computers turned an affection for curly fries into a signal of high IQ. In the same way, if a gay person likes a chocolate cupcake and his or her gay friends follow the example, chocolate cupcakes will become a criterion of homosexuality.
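To make the mechanism concrete, here is a minimal sketch in Python of the kind of correlation hunting described above. All the data is synthetic and the numbers are invented; the point is only to show how a binary like can end up statistically entangled with a trait such as IQ.

```python
import numpy as np

rng = np.random.default_rng(42)

# Entirely synthetic population: 10,000 users with IQ scores.
n_users = 10_000
iq = rng.normal(loc=100, scale=15, size=n_users)

# Simulate homophily: the like spreads preferentially through
# high-IQ friend circles, so a higher IQ means a higher chance
# of carrying the "curly fries" like.
p_like = 1 / (1 + np.exp(-(iq - 100) / 10))
likes_curly_fries = rng.random(n_users) < p_like

# The subtle statistical signal an algorithm can then exploit:
correlation = np.corrcoef(likes_curly_fries, iq)[0, 1]
print(f"correlation between the like and IQ: {correlation:.2f}")
print(f"mean IQ of likers:     {iq[likes_curly_fries].mean():.1f}")
print(f"mean IQ of non-likers: {iq[~likes_curly_fries].mean():.1f}")
```

Nothing about fried potatoes causes intelligence; the correlation is an artefact of who passed the like along. But it is real enough in the data for a model to use.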
Big Brother?
Have you ever come across funny quizzes like "How well do you know your friend?" or "Which Disney princess are you?" In essence, these are short psychological tests that can be used to assess your likes and dislikes. Millions of users play these games to share a funny result, and they don't feel tricked when handing over their personal info. When you play such a game, all the algorithm has to do is compare your test answers with the posts you like and the websites you visit. As a result, it knows which political party you vote for and which cigarettes you smoke. When millions of users take such tests, machines get a perfect training ground: they analyse more and more data and become more and more accurate at building personal profiles.
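As a rough sketch of the training loop this enables, assuming (hypothetically) a user-by-page matrix of likes and a trait label extracted from quiz answers, one could fit a simple classifier on the quiz-takers and then score every user who never touched the quiz. All the data below is randomly generated stand-in material.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 5,000 users x 300 pages,
# True where the user liked the page.
likes = rng.random((5_000, 300)) < 0.05

# The first 1,000 users took a quiz; its result becomes the
# training label (say, 1 = "extravert"). Random stand-ins here.
quiz_takers = np.arange(1_000)
quiz_labels = rng.integers(0, 2, size=1_000)

# Train on the volunteers' likes...
model = LogisticRegression(max_iter=1_000)
model.fit(likes[quiz_takers], quiz_labels)

# ...then quietly score everyone who never took the quiz.
everyone_else = likes[1_000:]
scores = model.predict_proba(everyone_else)[:, 1]
print("estimated 'extravert' probability, first 5 users:",
      scores[:5].round(2))
```

The more quiz-takers there are, the better the labels get; that is why millions of shared quiz results make such a perfect training ground.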
In 2015, Cambridge Analytica released the This Is Your Digital Life app. The company confessed in the app's description: yes, we do want to study your digital footprint and build a psychological profile on it, but just for the sake of scientific research. About 270,000 volunteers agreed to share their data and were paid a dollar each for their scientific contribution. But, for that dollar, they were eager to share not only their own digital selves but also those of their friends. "Do you mind if we look at your friends' data?" the programme asked. "Go ahead, please! I want to take this funny test!"
It was revealed later that the data Cambridge Analytica collected from Facebook users who installed the app included personal messages. "A small number of people who logged into This Is Your Digital Life also shared their own news feed, timeline, posts and messages," the company said in a statement. The messages included ones sent by people who didn't have the app and had never agreed to share any personal info. Cambridge Analytica supposedly used this data to boost Donald Trump's presidential campaign.
By analysing what users wrote and what they liked, the company issued canvassing instructions: what a person should be told and how it should be said, so that he or she believes it and responds "correctly". It turned out to be not that difficult: in many cases, to get the desired response from people of different types, all campaigners had to do was change the packaging of the same idea. Besides, the algorithm selected users who could vote for Trump and informed them of the aspects of his programme that might interest them.
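Reduced to a toy example, the "repackaging" step might look like the sketch below. The segments, messages and persuadability scores are invented for illustration; nothing here reflects Cambridge Analytica's actual system.

```python
from typing import Optional

# Invented example: one idea, three wrappings for three
# personality segments. None of this is real campaign material.
MESSAGES = {
    "fearful":   "Keep your family safe: vote for order and strong borders.",
    "ambitious": "Cut the red tape: vote to give your business room to grow.",
    "communal":  "Bring jobs back home: vote to rebuild our towns together.",
}

def pick_message(segment: str, persuadability: float) -> Optional[str]:
    """Return a tailored wrapping of the same idea, or None
    if the model thinks the user isn't worth contacting."""
    if persuadability < 0.6:
        return None
    return MESSAGES.get(segment, MESSAGES["communal"])

users = [
    {"id": 1, "segment": "fearful",   "persuadability": 0.82},
    {"id": 2, "segment": "ambitious", "persuadability": 0.35},
    {"id": 3, "segment": "communal",  "persuadability": 0.71},
]

for user in users:
    message = pick_message(user["segment"], user["persuadability"])
    if message:
        print(f"user {user['id']}: {message}")
```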
Cambridge Analytica boasted it was able to "hack" 200 elections around the world with the help of Facebook, including the 2016 US presidential election and the Brexit referendum. However, there is no reliable evidence that its activities had a decisive influence on political processes. So while there is no need to hit the panic button, the mere existence of such massive data collection and analysis capabilities demands thorough reflection.
No cheating
One of the problems is that we cannot control the tiny things that reveal our true selves, from chocolate cupcakes to Lana Del Rey songs. We have no idea what exactly our mindless mouse and keyboard clicks say about us. Thus, it's virtually impossible to deceive a machine that estimates our IQ from our music preferences. Well, we could do it if social media sent us alerts like, "Of course, you can write (or like) this, but then it will be easier to determine your sexual orientation, whether you use drugs and how low your self-esteem is." Sounds spooky, right?
It looks like the war over personal data might already be lost: a radical revolt against the system would turn us into literal cavemen. We would have to throw away the fancy smartphone or, at best, replace it with an old black-and-white push-button mobile, and leave the house wearing a mask, because there are surveillance cameras everywhere! The list of precautions is endless.
Still, we can and should play hide-and-seek with the algorithms: come up with reliable encryption and introduce new laws on personal data, so that scandals like Cambridge Analytica don't blow up in our faces again.
The good brother
There is also good news. If Big Brother makes you nervous, wait a couple of years, and some thoughtful algorithm will send you a push notification: "Judging by your tweets and Facebook likes, your level of anxiety is growing; it's time to see your shrink!" After all, computers are already gradually learning to recognise signs of psychological problems in digital footprints. Scientists at Harvard and the University of Vermont have developed an algorithm that captures linguistic signs of depression and post-traumatic stress disorder in Twitter posts. In other words, the machine analyses the text a person writes and identifies words with a depressive emotional connotation.
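A crude illustration of the idea in Python (the hand-made word list and the threshold below are invented; the actual Harvard and Vermont research relied on far richer statistical features) is to score each tweet by the share of its words that carry a depressive connotation.

```python
# A crude, hand-made lexicon; real research uses far richer features.
DEPRESSIVE_WORDS = {
    "sad", "alone", "tired", "empty", "hopeless",
    "crying", "nothing", "hurt", "worthless", "numb",
}

def depressive_score(tweet: str) -> float:
    """Fraction of a tweet's words found in the depressive lexicon."""
    words = [w.strip(".,!?").lower() for w in tweet.split()]
    if not words:
        return 0.0
    hits = sum(w in DEPRESSIVE_WORDS for w in words)
    return hits / len(words)

tweets = [
    "Feeling so empty and tired of everything lately",
    "Great run this morning, the weather was perfect!",
]

for t in tweets:
    flag = "  <- possible warning sign" if depressive_score(t) > 0.2 else ""
    print(f"{depressive_score(t):.2f}  {t}{flag}")
```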
That matters because doctors often miss the early signs: almost half of severe depression cases elude early diagnosis. With computers becoming capable of recognising the symptoms in advance, lives could be saved. That may be some consolation in an era of large-scale data analysis: it makes sense to cooperate with algorithms rather than try to hide from them. After all, they may learn to warn us about other dangers as well.