With his professed concern about fake accounts on Twitter, Elon Musk appears to be grasping at legal straws in an attempt to back out of his commitment to buy the social networking company for $54.20 a share, or at least to pay less for it. But his gambit has shined a light on a real scourge of online companies and their users.
Counting the autonomous accounts that mimic real people is just as slippery as valuing companies. A 2020 study by Adrian Rauchfleisch and Jonas Kaiser looking at thousands of Twitter accounts, including hundreds of verified politicians as well as “obvious” bots, found that Botometer, the industry-standard machine-learning algorithm trained to calculate the likelihood that an account is a bot, yields imprecise scores, leading to both false negatives and false positives.
Perhaps more disturbingly, findings of an investigation published in Science in 2018 that looked at about 126,000 stories—true and false—disseminated on Twitter from 2006 to 2017 suggest humans are more likely than robots to spread false news.
Darius Kazemi, a computer programmer who has spent a decade creating and studying bots, describes the trouble in identifying bots as twofold: First, bots are measured through internal metrics typically not available to the public; second, the machine-learning algorithms built to identify bots are informed through human judgment calls, which are often wrong.
That might at least be better than what Mr. Musk proposed: He said last week his team would try to ascertain a more accurate percentage of Twitter’s bots by doing “a random sample of 100 followers of @twitter.” He invited his followers to also engage in the exercise. Mr. Musk has recently estimated that fake users make up at least 20% of all Twitter accounts. In securities filings, Twitter has long estimated that false or spam accounts represent less than 5% of its total number of active users, but has also said that the actual number “could be higher than we have estimated.”
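To see why a sample of 100 settles little, consider only the sampling error. The sketch below is an illustration, not anyone's actual method: it assumes a truly random sample of accounts and perfectly accurate judgments about which ones are fake, and it uses a standard 95% Wilson score interval. The counts of 5 and 20 flagged accounts are hypothetical, chosen to match the 5% and 20% figures cited above.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Approximate 95% Wilson score interval for a sampled proportion.

    hits: number of sampled accounts judged to be fake (hypothetical input)
    n:    sample size
    z:    normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    p_hat = hits / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Hypothetical outcomes: 5 or 20 accounts judged fake out of 100 sampled.
for hits in (5, 20):
    low, high = wilson_interval(hits, 100)
    print(f"{hits}/100 flagged: plausible true rate {low:.1%} to {high:.1%}")
```

Under those assumptions, finding 5 fakes in 100 is consistent with a true rate anywhere from about 2% to about 11%, and finding 20 is consistent with roughly 13% to 29%. Any error in deciding which accounts are fake would widen those ranges further.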
When Twitter Chief Executive Officer Parag Agrawal tried to demonstrate the challenges with both internal and external identification of bots on Twitter, Mr. Musk cut the discussion short with a poop emoji. It isn’t even clear that a bot-less platform wouldn’t suffer in some ways. Mr. Musk has likened Twitter’s bots to termites in a house, but they can also do good: A newspaper bot that automatically posts top headlines every day, for example, is a clear public service.
Humans misidentify bots for myriad reasons, according to Mr. Kazemi’s research. Active lurkers, who spend a lot of time on Twitter consuming news but who tweet infrequently, are often identified as bots, as are accounts of non-English speakers who have mistranslated a few words. Especially avid fans who post frequently and in rapid succession can appear like bots. And there will always be burner accounts owned by real people so they can anonymously engage in objectionable behavior such as harassment. As a self-proclaimed free speech absolutist, Mr. Musk says he wants more freedoms on Twitter; in this context, he seems to be fighting for fewer.
Mr. Musk has questioned whether advertisers can know what they are getting for their money without understanding how many accounts are actually human. It is a legitimate concern, but certainly not one specific to Twitter. Mr. Kazemi described Twitter as one of the most open platforms with its data, which has allowed academics to study it more widely than other platforms.
“Twitter is in bad shape,” Mr. Kazemi said. “That puts it in the top 10% of all platforms.”
Twitter historically hasn’t been afraid to look bad to get better. Back in 2019, Twitter shares dove after the company said it would need further investments to clean up its user base. It also said it was shifting from reporting monthly users to reporting monetizable daily active users, or those it believed could actually view ads.
Rather than trying to compete with the likes of Facebook and Instagram on sheer numbers, Twitter has, if anything, presented a lowball estimate of real users in an attempt to more accurately show advertisers its value. Last month, the company publicly revised several quarters of user data lower, acknowledging it had failed to link multiple separate accounts in some instances.
Meta Platforms said in its annual filing that duplicate accounts for its Facebook app may have represented about 11% of its global monthly active users in the fourth quarter and false accounts potentially represented approximately 5% of its global monthly user base. It called duplicate and false accounts “very difficult to measure” at its scale, noting “it is possible that the actual numbers…may vary significantly from our estimates, potentially beyond our estimated error margins.”
The takeaway there: No major platform’s user numbers are useful as anything other than arbitrary benchmarks for measuring that platform against itself.
Last week on Twitter, Mr. Kazemi showed Mr. Musk’s own Twitter account getting flagged as a potential bot by Botometer, which assigned it a bot likelihood of 3.5 out of a possible 5.
Mr. Kazemi said his work at online platforms trying to nail down accurate user numbers was so futile he quit the career. At least in his case, it was a legitimate excuse to walk.