We live in the age of scale. Everything has to be scalable. Everything has to accelerate. It seems that if your business, division or team isn’t achieving rapid growth, it’s not successful.
Scale, though, doesn't always create more opportunities. It induces an effect called aggregation. The more prominent a business is, the more people flock to it. The more information it gets, the better it gets. The more it improves, the larger it gets. And so on.
For businesses to compete with aggregators, they need scale. Without scale, it's hard to make enough money to sustain operations. But scale depends on two things: automation and data. Automation combines artificial intelligence systems with crowdsourced ones. Data, likewise, is a mix of automated and user-generated content.
The question is, who controls the generated content? What happens when the amount of information exceeds human oversight? Can we trust an algorithm to vet what content exists and what doesn’t?
Scale and content
Most companies are investing heavily in scalability. They're increasing their capacity, their infrastructure and their quality of service. But amid this growth frenzy, content assessment and security are being abandoned.
“Over the course of this year, we have invested significant resources to increase trading capacity on our platform and maintain availability of our service. We have increased the size of our support team by 640% and launched phone support in September. We have also invested heavily in our infrastructure and have increased the number of transactions we are processing during peak hours by over 40x.”
But as these services grow, content quality and security assurance are becoming critical. Facebook is under fire over unsupervised ad purchases and filter bubbles. YouTube is catching hell for its lack of content control, especially around children's content. Users are accusing Twitter of becoming the home of trolls, Nazis, and armies of soulless botnets. The FCC is being questioned over the truthfulness of the comments submitted to its Net Neutrality probe.
All these aggregator companies struggle with content. They feed on it, but their scale is so massive that it's impossible for them to control the flow anymore. Most problematic of all, they still don't know how to fix it.
Twitter’s Verification system, originally designed to solve the problem of impersonation, has been under fire since its inception. The lack of clear guidelines has plunged the program into suspended animation.
“A filter bubble is a state of intellectual isolation that can result from personalized searches when a website algorithm selectively guesses what information a user would like to see based on information about the user, such as location, past click-behavior and search history. As a result, users become separated from information that disagrees with their viewpoints, effectively isolating them in their own cultural or ideological bubbles.”
Facebook is trying to ease the situation. The truth, though, is that Facebook, by design, creates filter bubbles.
“A bridging weak tie in a web context is a link to a source of information that you might not normally look at, you may not agree with, and challenges your ideas. Facebook and Google algorithms do the opposite: They show things we will like and agree with, so they are basically erasing our weak, bridging ties, at least in our digital networks.”
YouTube is another example of out-of-control content. Its case is compelling because it mixes human moderation with Deep Learning assistance. Human operators train the neural network, and the Deep Learning system extends their reach to all of the platform's content. The results, while impressive, have also generated unintended consequences.
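The human-plus-machine approach can be sketched in a few lines. This is a toy illustration, not YouTube's actual pipeline: human moderators label a handful of examples, and a naive word-overlap model extends their judgment to content they never saw. All labels and phrases here are invented.

```python
from collections import Counter

# Invented human-labeled training sample: moderators tag a small batch,
# the model generalizes their verdicts to the rest of the platform.
labeled = [
    ("win cash prizes now", "bad"),
    ("free crypto giveaway", "bad"),
    ("my trip to the mountains", "ok"),
    ("cooking pasta tutorial", "ok"),
]

# Per-class word counts act as a crude learned model.
counts = {"bad": Counter(), "ok": Counter()}
for text, label in labeled:
    counts[label].update(text.split())

def classify(text):
    """Score new content by word overlap with each human-labeled class."""
    words = text.split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)

print(classify("free cash now"))  # leans toward the 'bad' examples
```

Even this toy shows the trade-off: the model multiplies the moderators' reach, but its verdict is just a score, with no human-readable reason attached.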
“The thing that sucks is YouTube doesn’t tell you why it was de-monetized,” said Sam Sheffer, a 27-year-old whose career as a YouTuber began just a few months ago. “They link you to some arbitrary set of rules, and you have no idea why you were de-monetized other than the fact that you are.”
Algorithmic moderation systems
The general pattern is always the same. Due to scale, content gets out of control. Automated content infests the networks. People cry out, and the operators harden the filters. Due to the immense volume, humans alone can't operate these filters manually. Operators then design new algorithms to aid them in filtering and controlling the content.
These machine-augmented moderation systems do censor plenty of subversive content, content that shouldn't be there in the first place. But they also have unexpected effects. The diversity of content is suppressed, and only the most conservative views are allowed. Worst of all, these systems can't explain why they did what they did. When questioned by the platform's users, operators are unable to say why the system censored their content.
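The explainability gap is easy to see in code. A minimal sketch, with hypothetical rule names and phrases: a rule-based filter can return the exact rule and phrase that triggered a removal, which is precisely what users say opaque score-based systems never give them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy rules; real platforms use far richer taxonomies.
RULES = {
    "no-spam": ("buy now", "free money"),
    "no-harassment": ("you idiot", "go away loser"),
}

@dataclass
class Decision:
    allowed: bool
    rule: Optional[str] = None      # which rule fired, shown to the user
    evidence: Optional[str] = None  # the phrase that triggered it

def moderate(text: str) -> Decision:
    """Every removal carries a human-readable reason."""
    lowered = text.lower()
    for rule, phrases in RULES.items():
        for phrase in phrases:
            if phrase in lowered:
                return Decision(False, rule, phrase)
    return Decision(True)

print(moderate("Buy now and claim your free money!"))
```

A learned classifier can flag far more content than any rule table, but unless it exposes a comparable rule-and-evidence trail, the operator is left telling users only that "the system decided."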
Ethics, diversity, and open-mindedness aren't a black-and-white equation. Your upbringing, your education, your culture and your personal experiences all matter. All these biases will creep into AI-assisted moderation systems. And we need to be vigilant about it.
Build content moderation from the start
Learning from past mistakes has always been critical. In the age of exponential scalability, it's more crucial than ever. There isn't much margin for error. A small slip, innocuous at a small scale, will sprout into a choking issue as the system grows.
There are valuable lessons that newcomers can learn from the current aggregators.
- Don't sacrifice content quality in pursuit of rapid growth. Eradicating questionable content, once it's part of the larger system, is far harder and more damaging.
- Establish a clear content policy from day one. There has to be a clear set of rules people can follow. It's impossible to be objective, but at least be transparent about the guidelines.
- Be straightforward about how the organization enforces the policy. Users should know how the system assesses whether a piece of content has infringed the platform's policy.
- Be impartial. No user, by virtue of status or name, should be able to bend the rules of the platform. The recent banning of women on Facebook is a good example of what not to do.
- Set up moderators early on. Moderators should raise problematic issues that the current policy doesn't address.
- Under no circumstances allow moderators to make decisions that aren't objectively supported by the policy.
- Update the policy on an ongoing basis. It's impossible to capture all the nuances of social conventions, so keep the guidelines alive. The policy is a growing organism, like a newborn learning the rules of engagement.
- Implement self-policing mechanisms in the platform from day one. You will need them. No matter how good your moderators are, you need a system that allows users to bring specific issues to your attention.
- Build abuse detectors. As your platform grows, rogue elements will try to abuse it. You need ways of detecting these behaviors from day one. It's easy to delay this until you've grown, but by then, the damage might be too widespread. Twitter bots and the FCC Net Neutrality probe are good cautionary tales.
- Review the output of your abuse detectors regularly. These systems are autonomous and will make mistakes. You can’t build them and forget about them.
- Make sure your automated systems act on new changes to the policy immediately. A lag between policy updates and enforcement can be problematic.
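To make the last point concrete, here is a minimal sketch, with invented class names and rules, of how an automated enforcer can stay in lockstep with the policy: it reads the live policy object on every check instead of keeping a stale copy, so an update takes effect at once, and every decision records the policy version it was made under.

```python
# Toy policy/enforcer pair; names and rules are invented for illustration.
class Policy:
    def __init__(self):
        self.version = 1
        self.banned = {"scam-link"}

    def update(self, banned):
        """Swap in a new rule set and bump the version."""
        self.banned = set(banned)
        self.version += 1

class Enforcer:
    def __init__(self, policy):
        self.policy = policy  # shared reference, never a snapshot

    def check(self, text):
        hits = [w for w in self.policy.banned if w in text]
        return {"allowed": not hits,
                "matched": hits,
                "policy_version": self.policy.version}

policy = Policy()
bot = Enforcer(policy)
policy.update({"scam-link", "fake-giveaway"})  # new rule is live at once
print(bot.check("join my fake-giveaway"))
```

Recording the version in every decision also gives moderators and users an audit trail: they can see exactly which edition of the rules a piece of content was judged against.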
There isn’t a perfect recipe for humans. We are complex systems, and it’s impossible to plan for everything. Nonetheless, most people forgo essential quality assurance for the riches of rapid growth.
The consequences of not doing so are dire. Advertisers flee and revenues go down. Content creators flee, traffic plummets and market share erodes. Revenues go down even more.
PS: As a side note, I wonder how feasible it would be to create a system, like AlphaZero, that uses reinforcement learning to devise a real-time policy that changes and adapts.