
Content Moderation and Online Platforms: An impossible problem?

Regulators and legislators look to new laws

21 July 2020

Every day globally, around 5 billion YouTube videos are watched, 500 million tweets are posted, and 95 million posts are shared on Instagram. How is this content moderated?

Content moderation, the act of monitoring and withdrawing published information, is particularly challenging in online environments where the information is partially or wholly derived from a large, diverse, and diffused user base.

Traditionally, social media platforms have been subject to less direct regulation given their roles as "passive intermediaries", in comparison to typical content creators such as broadcasters or newspapers. Under laws such as Section 230 of the Communications Decency Act 1996, in America, or the EU eCommerce Directive in Europe, social media platforms have been exempt from liability for illegal content they host.

With the proliferation of fake news and new technologies such as deepfake algorithms, there are increasing calls for new, specific regulation; in the case of Facebook, the issue has prompted a large-scale boycott by advertisers.

The coronavirus outbreak has amplified this issue; industry and government stakeholders are worried about medical disinformation spreading. This concern has accelerated government intervention in the form of emerging legislation and prompted an overhaul of the traditional content moderating toolkit.

The Big Picture

There are two fundamental tensions in this area.

The first is the tension between freedom of speech and quality of information. For example, while hate speech, defamation and content that endangers children are regulated, there is a grey area of content which, while not directly infringing laws or regulations, can nevertheless be harmful. Misinformation campaigns can damage democratic processes and, during Covid-19, can lead people to rely unsafely on bogus medical advice.

The second tension is centred on liability and responsibility for policing. Should the platform be responsible and therefore absorb the cost and resource requirements of content moderation? Would this disincentivise platforms from hosting content?

How Does Content Moderation Currently Work in Practice?

There are currently three main approaches to content moderation, and very few platforms rely on just one of them. The most common approach before the coronavirus outbreak was to use automated review algorithms as the first line of defence and rely on human reviewers (either from the user base or professionally hired) to correct the automated model's mistakes.
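This hybrid approach can be sketched in a few lines. The sketch below is purely illustrative, with invented thresholds and a stand-in scoring function rather than any platform's real classifier: an automated model scores each post, removes clear violations, and routes borderline cases to a human review queue.

```python
# Hypothetical sketch of a hybrid moderation pipeline: an automated
# model scores content first, and only borderline cases are escalated
# to human reviewers.

REMOVE_THRESHOLD = 0.9   # score above which content is auto-removed
REVIEW_THRESHOLD = 0.5   # score above which a human takes a look

def model_score(text: str) -> float:
    """Stand-in for an ML classifier returning P(policy violation)."""
    banned = {"spam-link", "hate-term"}
    hits = sum(word in banned for word in text.lower().split())
    return min(1.0, hits / 2)

def moderate(text: str) -> str:
    score = model_score(text)
    if score >= REMOVE_THRESHOLD:
        return "removed"          # clear violation: automated removal
    if score >= REVIEW_THRESHOLD:
        return "human_review"     # grey area: a human corrects the model
    return "published"            # low risk: goes live immediately

print(moderate("hello world"))            # published
print(moderate("spam-link hate-term"))    # removed
print(moderate("spam-link only"))         # human_review
```

The key design point is the two thresholds: the automated layer acts alone only at the confident extremes, and everything uncertain in between falls to human judgment.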

Manual/Human Moderation

This model relies on a large team hired by the platform to manually review the content themselves. Platforms draft a content policy that users subscribe to by virtue of using the platform and moderators remove content which doesn't comply.

The system is not fail-safe as there are often grey areas, but policies have been refined over the last few years and moderators are trained to distinguish between permissible and impermissible content. Currently, this is considered to be the most accurate method of content moderation.

The moderation process typically occurs after the content has been posted to the platform ("post-moderation"), although in some cases content is vetted beforehand ("pre-moderation"). Unsurprisingly, pre-moderation is less popular as users expect instantaneous communication. Because users drive revenue, platforms prioritise user experience and favour post-moderation, increasing the risk that users are exposed to inappropriate content.

Automated Moderation

As artificial intelligence and machine learning have developed, automated moderation is increasingly being deployed. It is cheaper than pure human moderation and can process large volumes of information far faster than humans can.

Despite significant technical advances, accuracy remains an ongoing challenge, particularly where nuanced ethical judgments are involved and algorithms struggle to make the "right" choice.

User Moderation

A different form of human moderation, the user-based method, relies on members of the online community policing themselves (e.g. Wikipedia). At its simplest, there is a reporting function where content flagged by users is passed to the internal review team.
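At its simplest, that reporting function is a counter. A minimal sketch (class name and threshold invented for illustration): a post is escalated to the internal review team once enough distinct users have flagged it.

```python
# Hypothetical user-reporting flow: content flagged by enough
# distinct users is queued for the platform's internal review team.

REPORT_THRESHOLD = 3  # distinct reporters needed to escalate

class ReportQueue:
    def __init__(self):
        self.reports = {}      # post_id -> set of reporting user ids
        self.review_queue = []

    def report(self, post_id: str, user_id: str) -> None:
        reporters = self.reports.setdefault(post_id, set())
        reporters.add(user_id)  # a set, so repeat reports don't count twice
        if len(reporters) == REPORT_THRESHOLD:
            self.review_queue.append(post_id)

q = ReportQueue()
for user in ("alice", "bob", "bob", "carol"):
    q.report("post-42", user)
print(q.review_queue)  # ['post-42']
```

Tracking reporters as a set is the important detail: it stops one motivated user from triggering escalation by reporting the same post repeatedly.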

Another form of user-based moderation is where the community appoints their own moderators. Reddit is a classic example of this type of moderation where each content channel (a "subreddit") is monitored for spam by a volunteer from within that online community. This can be effective: it requires minimal investment from the platform and leverages the benefits of a motivated subject-matter expert, with an awareness of context, who can respond quickly and accurately.

Covid-19 and the Rise of Automated Moderation

Since the start of the pandemic, online companies have begun to drastically increase their use of automated tools as social distancing rules have limited the number of human reviewers who can work together.

YouTube announced that it "will temporarily start relying more on technology to help with some of the work normally done by reviewers", with Facebook and Twitter making similar announcements.

While all of these platforms previously partially relied on AI/machine learning tools to manage content, the coronavirus pandemic has accelerated the roll-out of these tools on an unprecedented scale and required companies to develop specific Covid-19 techniques.

This has presented a number of challenges. Without the layer of human moderators, the automated tools are mistakenly capturing far more information than they should. Twitter's algorithm, for instance, was flagging tweets containing the words "oxygen" and "frequency" as requiring a Covid-19 fact-check even when the subject matter was completely unrelated.
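This failure mode is easy to reproduce with a naive keyword rule. The toy illustration below is invented, not Twitter's actual algorithm: flagging any tweet that contains both "oxygen" and "frequency" inevitably catches content on completely unrelated subjects.

```python
# Toy keyword rule (NOT Twitter's real algorithm) showing why naive
# automated flagging over-captures: any tweet containing both trigger
# words gets marked as needing a Covid-19 fact-check.

TRIGGERS = {"oxygen", "frequency"}

def needs_fact_check(tweet: str) -> bool:
    words = set(tweet.lower().replace(",", "").split())
    return TRIGGERS <= words  # True if both trigger words are present

# A conspiracy tweet is caught, as intended...
print(needs_fact_check("5g frequency depletes oxygen levels"))          # True
# ...but so is an unrelated engineering tweet: a false positive.
print(needs_fact_check("tuning the oxygen sensor sampling frequency"))  # True
```

Without a human layer to review these hits, every false positive becomes a wrongly flagged post, which is exactly the over-capture the platforms reported.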

Facebook was criticized for stopping users from sharing articles from mainstream papers such as The Atlantic or Buzzfeed as their software had flagged the posts as spam. Despite increasing technical progress, the Covid-19 pandemic has emphasised the current limitations of pure automated moderation.

Emerging Global Legal Response

Although a handful of specific content moderation laws existed before the coronavirus pandemic began, they were the exception rather than the rule. Many countries have treated the Covid-19 crisis as creating an immediate need for new legislation.

On 28 May 2020, President Trump issued an executive order demanding a uniform approach to content regulation and an overhaul of existing internet liability law. The order makes explicit reference to the use of online platforms to spread misinformation about the origins of the Covid-19 pandemic, but as yet no new policy has emerged.

Europe has also been active on this front, with new laws being introduced in several nations. Hungary's parliament passed a law that allows for the imprisonment of people "considered" to have spread false information about the coronavirus. Bosnia and Herzegovina is also introducing fines at the regional level for the publication of fake news or unproven allegations which may "cause panic and fear among citizens" on social media as well as in the print press. However, both of these examples target users rather than platforms.

France has proposed a different approach. On 13 May 2020, it adopted a new law against online hate speech which specifically imposes requirements on platforms (see our previous article on the law). The law echoes the existing Netzwerkdurchsetzungsgesetz law in Germany (the "NetzDG law") which came into force in January 2018. The NetzDG law required online platforms with over 2 million users to remove "manifestly unlawful content" (including defamation and hate speech) within 24 hours and produce a transparency report twice a year recording the removal process and requests for takedowns.

Although there is no reporting obligation under the new French law, the removal obligations are very similar: online intermediaries must remove hateful content within 24 hours and terrorist propaganda within one hour. Failure to do so could result in fines of up to EUR 1.25 million.

At the time, the European Commission voiced criticism and requested postponement of the legislation. Just over a month after the law was passed, the French constitutional court struck it down on free speech grounds (see our previous article on the court ruling). The court considered that the narrow time frames did not give intermediaries sufficient time to identify or remove material, and that the law would turn online platforms, rather than the judiciary, into arbiters of free speech.

Meanwhile in the UK, the House of Lords has been pushing for an "online harms bill", although no draft legislation has yet been proposed (see our previous article on the Online Harms White Paper).

The introduction of new laws throughout the coronavirus outbreak has sharpened the debate on the level of responsibility online intermediaries should take for content on their platforms.

Emerging Commercial Response

Platforms are increasingly facing challenges from revenue and commercial perspectives, as well as from a legal perspective. Facebook recently came under fire as over 800 major brands, including Unilever, Ford and Coca-Cola, announced that they were pulling their advertising spending from the platform over its failure to tackle hate speech. Although these companies make up a small fraction of Facebook's advertising revenue, the market responded strongly to the boycott: the company's market value dropped by $75 billion, more than the combined market capitalisation of Twitter Inc. and Snap Inc. Brands are seeking increased transparency and third-party verification from Facebook, and the company has responded by announcing that it will undergo an audit by the Media Rating Council, a media accreditation body.

How Companies Are Responding to Content Moderation Challenges

As regulators and legislators bring in new laws specifically regulating platforms and their moderation of content, and as advertisers increasingly make their spending contingent on effective moderation (see for example the Facebook boycott), platforms' need to invest in, develop and implement a range of alternative forms of moderation will only accelerate. New measures have already been introduced to mitigate the weaknesses of purely automated moderation.


Labelling Content

Firstly, there is a focus on labelling rather than removing content. For example, Twitter has created tailored labels for tweets relating to the 5G-coronavirus conspiracy theory, directing users to an article fact-checking the theory. Facebook has taken a similar approach, labelling content with fact-checking stickers and leaving users the choice of whether to view the content.
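Labelling can be thought of as a transformation that leaves content in place but attaches a warning and a fact-check link. A minimal sketch, with all field names, label text and the URL hypothetical:

```python
# Hypothetical labelling step: instead of removing a flagged post,
# attach a warning label and a link to a fact-checking article,
# and leave the user to decide whether to click through.

FACT_CHECK_URL = "https://example.org/fact-check/5g-covid"  # placeholder

def label_post(post: dict, flagged: bool) -> dict:
    if flagged:
        post["label"] = "Get the facts about COVID-19"
        post["fact_check_url"] = FACT_CHECK_URL
        post["hidden_behind_click"] = True  # user chooses whether to view
    return post

post = label_post({"text": "5G causes coronavirus"}, flagged=True)
print(post["label"])   # Get the facts about COVID-19
```

The design choice is that the original text is never deleted, which sidesteps the free-speech objection to removal while still interposing a warning between the user and the content.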

Controlling Monetisation

A second tool is controlling monetisation: stopping content on certain topics or containing misinformation from being monetised by the content creators. This approach is only relevant for certain types of platforms (such as YouTube or Spotify) where content is shared on the platform by users for the express purpose of generating ad-revenue which is then shared between the platform and the creator.

YouTube historically offers monetisation by default on all uploaded videos, selectively demonetising videos or channels that breach its community policy. During the coronavirus pandemic the platform took the opposite approach: coronavirus-related videos were not monetised by default, but creators could apply for ad-revenue certification.
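The policy flip is essentially a change of default. A hypothetical sketch (function and parameter names invented, not YouTube's API): ordinary videos are monetised unless they breach policy, while coronavirus-related videos are demonetised unless the creator is certified.

```python
# Hypothetical sketch of the default flip described above:
# ordinary videos are monetised on an opt-out basis, while
# coronavirus-related videos are demonetised on an opt-in basis.

def monetised(topic: str, policy_breach: bool, creator_certified: bool) -> bool:
    if topic == "coronavirus":
        return creator_certified   # opt-in: demonetised by default
    return not policy_breach       # opt-out: monetised by default

print(monetised("gaming", policy_breach=False, creator_certified=False))       # True
print(monetised("coronavirus", policy_breach=False, creator_certified=False))  # False
print(monetised("coronavirus", policy_breach=False, creator_certified=True))   # True
```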

This marks a notable policy change from YouTube, and it remains to be seen whether it applies this strategy more broadly across the platform and whether other platforms adopt similar approaches.

Enhanced Human Moderation

Facebook is also revisiting the concept of human moderation – in November 2018, Mark Zuckerberg announced the creation of an Oversight Board to govern its content decisions. Comprised of experts from diverse fields, who are formally appointed and administered by an independent trust, the Oversight Board will review appeals on content which users believe was wrongfully removed by Facebook's content moderators, once they have exhausted Facebook's internal appeals process.

Facebook can also refer particularly difficult content decisions to the board where they will have significant impact in terms of severity, scale and relevance to public discourse. Facebook will be required to implement the board's decision (unless it infringes local law), even if it disagrees. Facebook will also formally consider and publicly respond to content policy recommendations proposed by the board. As the Oversight Board is yet to review a case, whether its decisions will have a tangible influence on content on Facebook remains to be seen.

However, while there is scope for future expansion of its powers, the board currently only has jurisdiction over content which has been removed by Facebook. The Oversight Board would consequently have no jurisdiction over misinformation spreading about the coronavirus, if Facebook allows such content to remain on its platform.


Digital content platforms, and particularly social media platforms, occupy a unique position in enabling an increasingly global network of users, content producers and advertisers to publish, use and process unprecedented amounts of content and data. There is increasing regulatory and commercial scrutiny of the role such platforms play in managing and safeguarding their participants and content; the pandemic has only accelerated this scrutiny.

Deployment of emerging technology is increasingly seen as a means to address the platforms' challenges; however, in its current form, it is unlikely to be the whole solution.

Ioana Burtea, Trainee, TMT, contributed to the writing of this article.