Featured Article : Microsoft Launches New AI Content Safety Service

Microsoft has announced the launch of Azure AI Content Safety, a new content moderation service that uses AI to detect and filter out offensive, harmful, or inappropriate user and AI-generated text or image content.

What Kind of Harmful Content?

The content that Microsoft has developed Azure AI Content Safety to filter out includes anything that’s offensive, risky, or undesirable, e.g. “profanity, adult content, gore, violence, hate speech” and more. Azure is Microsoft’s cloud computing platform, where the new Content Safety moderation filter will be deployed (ChatGPT is available in the Azure OpenAI Service).

What’s The Problem? 

Microsoft says that the impact of harmful content on platforms goes beyond user dissatisfaction and can damage a brand’s image, erode user trust, undermine long-term financial stability, and even expose the platform to potential legal liabilities. As well as tackling harmful user-generated content, the new feature uses AI to filter out the growing volume of harmful AI-generated content, which includes inaccurate content (misinformation – perhaps generated by AI ‘hallucinations’).

A Sophisticated AI Moderation Tool 

Although Microsoft’s AI Content Safety filtering feature sounds as though it’s primarily designed to protect private users, it’s actually a moderation tool designed primarily to protect companies and their brands. It addresses the risks and challenges of moderation, as well as the rub-off associations and legal problems of having harmful content, misinformation, or disinformation published on their platforms. Users are the secondary beneficiaries: if harmful content is filtered out, they won’t see it (a win-win).

With Microsoft being a major investor in AI (i.e. in OpenAI), the service also appears to have a wider purpose: putting that investment to use and showing that AI can have a really positive purpose, countering the fear stories of AI running away with itself and wiping out humanity.

In a nutshell, Microsoft says its new Azure AI Content Safety Filtering feature ensures “accuracy, reliability, and absence of harmful or inappropriate materials in AI-generated outputs” and “protects users from misinformation and potential harm but also upholds ethical standards and builds trust in AI technologies” which Microsoft says will help “create a safer digital environment that promotes responsible use of AI and safeguards the well-being of individuals and society as a whole”. 

How Does It Work and What Can It Do? 

The types of detection and filtering possible, and the capabilities of AI Content Safety, include:

– Offering moderation of visual and text content.

– A ‘Severity’ metric which, on a scale of 0 to 7, gives an indication of the severity of specific content (safe 0-1, low 2-3, medium 4-5, and high 6-7). This enables businesses to assess the level of threat posed by certain content, make informed decisions, and take proactive measures. A severity level of 7 (the highest), for example, covers content that “endorses, glorifies, or promotes extreme forms of harmful instruction and activity towards Identity Groups”.

– The multi-category filtering of harmful content across the domains of Hate, Violence, Self-Harm, and Sexual content.

– The use of AI algorithms to scan, analyse, and moderate visual content because Microsoft says digital communication also relies heavily on visuals.

– Moderation across multiple languages.


Businesses can choose to operate and use the new filtering system either via API/SDK integration (for automated content analysis) or by using the more hands-on ‘Content Safety Studio’ dashboard-style, web-based interface.
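
As an illustration of how an automated integration might act on the severity metric described above, the following is a minimal sketch rather than Microsoft’s SDK. It maps a 0 to 7 severity score to the bands quoted in this article and applies a simple policy. The function names (severity_band, decide_action) and the block/review/allow policy are assumptions made for this example; only the band boundaries come from the article.

```python
# Severity bands as quoted in the article: safe 0-1, low 2-3, medium 4-5, high 6-7.
SEVERITY_BANDS = (
    (0, 1, "safe"),
    (2, 3, "low"),
    (4, 5, "medium"),
    (6, 7, "high"),
)


def severity_band(score):
    """Return the band name for a severity score between 0 and 7."""
    for low, high, name in SEVERITY_BANDS:
        if low <= score <= high:
            return name
    raise ValueError(f"severity must be a whole number from 0 to 7, got {score!r}")


def decide_action(scores):
    """Pick an action from per-category severity scores.

    Hypothetical policy (not from Microsoft): block if any category is
    'high', send to human review if any is 'medium', otherwise allow.
    """
    bands = {severity_band(score) for score in scores.values()}
    if "high" in bands:
        return "block"
    if "medium" in bands:
        return "review"
    return "allow"


if __name__ == "__main__":
    # Example scores for one piece of content, keyed by category.
    example = {"Hate": 0, "Violence": 4, "Self-Harm": 1}
    print(decide_action(example))  # prints "review"
```

In practice a business would tune the thresholds per category, for example blocking at a lower severity for one category than for another, which is the kind of informed decision the severity metric is meant to support.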


Amazon also offers similar content moderation through its AWS image and video analysis service, ‘Amazon Rekognition’. It uses a hierarchical taxonomy to label categories of inappropriate or offensive content and provides the “DetectModerationLabels” operation to detect such content in images.
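
To show what a hierarchical taxonomy means in practice, here is a small sketch that groups moderation labels under their parent category. It assumes a response shaped like the one DetectModerationLabels returns (a ‘ModerationLabels’ list whose items carry ‘Name’, ‘ParentName’, and ‘Confidence’). The sample response is hand-written for illustration, its label names and confidence values are not real output, and the helper name group_moderation_labels is hypothetical. No AWS call is made here.

```python
def group_moderation_labels(response, min_confidence=60.0):
    """Group label names under their immediate parent category.

    Labels below min_confidence are ignored. Top-level labels (empty
    ParentName) become keys; child labels are listed under their parent.
    """
    grouped = {}
    for label in response.get("ModerationLabels", []):
        if label["Confidence"] < min_confidence:
            continue
        parent = label.get("ParentName", "")
        if parent:
            grouped.setdefault(parent, []).append(label["Name"])
        else:
            grouped.setdefault(label["Name"], [])
    return grouped


# Hand-written sample in the assumed response shape (illustrative values only).
SAMPLE_RESPONSE = {
    "ModerationLabels": [
        {"Name": "Violence", "ParentName": "", "Confidence": 98.2},
        {"Name": "Graphic Violence Or Gore", "ParentName": "Violence", "Confidence": 97.5},
        {"Name": "Weapons", "ParentName": "Violence", "Confidence": 40.0},
    ]
}

if __name__ == "__main__":
    print(group_moderation_labels(SAMPLE_RESPONSE))
    # prints {'Violence': ['Graphic Violence Or Gore']}
```

With the AWS SDK for Python (boto3), the response would typically come from the Rekognition client’s detect_moderation_labels call, and the confidence threshold would be chosen to suit the platform’s tolerance for false positives.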

What Does This Mean For Your Business? 

As any social media platform or larger company will be able to testify, moderation of content posts is a major task and human moderators alone can’t really scale efficiently to meet the demands quickly or well enough, so companies need a more intelligent, cost-effective, reliable, and scalable solution.

The costs of not tackling offensive and inappropriate content don’t just relate to poor user experiences but can lead to expensive legal issues, loss of brand reputation, and more. Before generative AI arrived on the scene, it was bad enough trying to moderate just the human-generated content. With the addition of AI-generated content, moderation of offensive content has become far harder. It makes sense, therefore, for Microsoft to leverage the power of its own considerable AI investment to offer an intelligent system to businesses that covers both images and text, uses an ordered and understandable system of categorisation, and offers businesses the choice of an automated or more hands-on dashboard version.

AI offers a level of reliability, scalability, and affordability that wasn’t available before, thereby reducing risk and worry for businesses. The recent events of the conflict in Israel and Gaza (plus the posting of horrific images and videos, which has prompted the deletion of social media apps for children) illustrate just how bad some content posts can be, although images of self-harm, violence, hate speech, and more have long been a source of concern for all web users.

Microsoft’s AI Content Safety system therefore gives businesses a way to ensure that their own platform is free of offensive and damaging content. Furthermore, when businesses protect themselves in this way, it follows that customers and other web users and viewers are also spared the bad experiences and effects that some content can cause.

Tech News : Protect Kids from War Content

It’s been reported that some schools in the UK (as well as in Israel and the US) have advised Jewish parents to delete social media apps from their children’s phones over fears that they may see distressing hostage videos or videos of civilians being killed in the Israel-Hamas-Gaza conflict.

In Israel 

In Israel, schools and parents are reported to have been asking children to delete their social media apps over fears that they may see images and videos, made and posted online by Hamas, showing Israeli citizens being shot (e.g. at the Tribe of Nova Festival near the Gaza-Israel border), children being abducted, and captives of Hamas pleading for their lives. The fear is that children could be subjected to psychological terror and long-lasting psychological damage by witnessing the videos and images, which it’s been reported have been shared on Instagram, ‘X’ (Twitter), and TikTok, and forwarded on WhatsApp.

In the US 

In the US, it’s been reported that a New Jersey school emailed parents, asking them to tell their children to delete their social media apps, and that a New York school advised parents to monitor their children’s social media usage, and to talk to them about what action to take if/when they encounter such images or videos.

In The UK

A similar approach is being taken in the UK with Jewish schools asking parents to ask their children to delete social media apps and/or talk to their children about the kind of content they are seeing.

Social Media 

Social media’s role generally over the Israel-Gaza conflict is now under the spotlight, particularly over how it has been used to spread misinformation (false or incorrect information shared without harmful intent), disinformation (false information shared with the specific intent to deceive), and confusion, and to fan hatred. For example:

– A misleading video was shared across platforms, wrongly connecting a 2015 Guatemala event to Hamas (a video of a girl being set on fire by a mob).

– A Hamas leader recently reacted to a fake news story from an Israeli TV channel.

– False claims that Qatar had threatened to cut off gas exports.

– Allegations that Hamas “beheaded babies”, which were even published on tabloid front pages and referenced by President Joe Biden in a speech.

Several factors are allowing falsehoods to spread instantly on social media: mistrust of mainstream media, a surge in the number of falsehoods being shared, challenges in verifying and fact-checking, a lack of moderation guardrails on some platforms, intense emotions about the conflict, and third-party agendas. As a result, social media is playing a part not just in shaping opinion, but perhaps also in affecting the thinking, attitudes, and decisions of key players in the war.

Facing Criticism and Investigations 

Examples of how the social media platforms and secure apps are facing scrutiny in relation to the conflict include:

– X, Telegram, and TikTok being criticised by regulators for not doing enough to stop the deluge of misleading information being spread via their platforms.

– The EU launching an investigation into ‘X’ (Twitter) over the spread of disinformation and violent content relating to the Israel-Hamas conflict.

– The Atlantic Council’s Digital Forensic Research Lab reporting that Telegram is the primary means of communication for disseminating statements by Hamas to its supporters.

– The UK’s technology secretary (Michelle Donelan) holding a virtual meeting with bosses at Google, Meta, X, TikTok, and Snapchat and asking the platforms to clearly set out what action they were taking to remove illegal material that breaches their terms and conditions.

What Are The Social Media Platforms Doing To Help? 

Examples of the main things social media platforms are doing to tackle distressing videos and images from the conflict, misinformation, and disinformation being posted on their platforms include:

– X (Twitter) has emphasised its commitment to tackling misinformation and has implemented stricter rules about misleading information. X says it’s using a combination of technology and human review to flag and, if necessary, remove false or misleading content about the Israel-Gaza conflict, and that it’s adding warning labels to potentially distressing or graphic content and offering users the choice to view or skip such posts. Although the platform has (since Musk took ownership) very much touted its ‘free speech’ approach, it’s now reported to have implemented a stronger content moderation system to quickly detect and restrict the spread of graphic videos related to the conflict. X is also reported to be restricting the reach of videos that may not violate its policies but could be distressing to some users.

– It’s been reported that Meta has established a special operations centre (with experts, including fluent Hebrew and Arabic speakers) dedicated to the Israel-Gaza situation, focusing on detecting and removing harmful content more rapidly, and leveraging third-party fact-checkers to assess the accuracy of potentially misleading posts. Meta has also enhanced its measures to reduce the spread of graphic videos and images of the conflict and has introduced “sensitivity screens” which blur out potentially distressing content until a user chooses to view it.

– TikTok has reinforced its community guidelines that prohibit content promoting hate or misinformation and is reported to be working with experts and fact-checkers to identify and combat false narratives about the conflict.

– Although Snapchat focuses on content from trusted news outlets through its ‘Discover’ feature, it’s reported to have enhanced its moderation guidelines for user-generated content regarding the conflict, especially content that is graphic in nature. Snapchat uses both automated systems and human reviewers to monitor and, when necessary, remove such content, and it has introduced labels for stories or snaps that may contain distressing imagery.

What Does This Mean For Your Business? 

With Hamas reportedly using Telegram as their main means of communication with supporters and with anyone on any side able to upload and share videos and images on social media platforms, plus use encrypted apps like WhatsApp to share content, this conflict is a moderation nightmare for social media companies and a source of real concern for parents and schools.

Even though social media platforms are facing investigations and questions and have introduced some measures to help, as the advice from schools shows, perhaps the only sure and trusted way to protect children is to delete social media apps altogether.

This story highlights how, in conflicts such as Russia’s war on Ukraine and now the conflict in Gaza, social media channels are not just sources of information but can be used as a tool in information warfare and as a tool to deliberately terrorise and horrify people. Being vulnerable and inquisitive, and not having the capacity to cope with the many images of war, children are particularly at risk of distress and psychological damage.

It’s not surprising, therefore, that schools and parents are taking time to talk to children about what’s happening and about their feelings and questions, and to reason with them about parental monitoring of what they are looking at and the advantages of deleting their much-valued social media apps.

This story also highlights why many feel that social media platforms still have such a long way to go in protecting people (particularly their youngest users) from online threats. It perhaps also provides some vindication to governments and critics who have called for (and supported the introduction of) protective laws, e.g. the Online Safety Bill, which may force social media companies to be more socially responsible.

For the social media companies, issues that arise in conflicts are a reminder of the difficulties of maintaining a balance between free speech and preventing harm and influence from bad actors. With a ground invasion by Israel apparently imminent, the situation for those directly affected in the Middle East only looks like getting worse, as do the worries for parents and the challenges for social media companies.