On 23 March 2016, Microsoft released Tay, a chatbot built through a collaboration between the company’s research and Bing divisions, to the world. Unlike other chatbots, Tay was designed as a proof of concept to show that artificial intelligence could interact effectively with users across very human communication media – in this case, Twitter.
Unfortunately, Microsoft hadn’t quite understood the dangers of releasing a tool that learns from user interaction into the Twittersphere, a place not exactly known for constructive criticism and polite political discourse. Exploiting its “repeat after me” function – it’s still unclear whether this was built in or learned – users were able to expose the @TayandYou account to inflammatory statements about some of the internet’s most popular talking points of the time, whether that was Gamergate, feminism, or simply entries from Urban Dictionary.
Within days, this self-described “AI with zero chill” had been transformed into an accomplished purveyor of hate. The chatbot was supposed to demonstrate how AI can learn from environmental cues and communicate nuanced ideas in a language familiar to Twitter users. What we got instead was a bot that echoed some of the more hateful corners of the internet, either by parroting genuine messages or by repeating the words of users deliberately trying to derail the project.
Microsoft said at the time that it was “deeply sorry for the unintended offensive and hurtful tweets from Tay”, and that it would “look to bring Tay back only when we are confident we can better anticipate malicious intent that conflicts with our principles and values”.
Tay was meant to show just how far Microsoft’s research teams had outpaced those of its rivals in the AI arms race. Instead, it was a catastrophic failure, proving that technology capable of learning and making its own decisions based on its environment needed to have guardrails, not just in development but also in its application.
It’s clear the experiment served as a wake-up call for a company that, until that point, had little in the way of internal processes to govern how such technology was developed. Some three months after it pulled the plug on Tay, CEO Satya Nadella penned an article in Slate that tackled the issue head-on, arguing that AI and humans can exist in harmony if strict development safeguards are in place – a piece now regarded as the genesis of Microsoft’s current internal policies.
Heading into the Aether
By 2017, Microsoft had started work on turning Nadella’s ideals into something a little more tangible, deciding that the answer was to involve as many voices as possible in development. To this end, Microsoft created the Aether Committee, an internal ethics board made up of representatives from all the various departments involved in the development and delivery of AI products. Today it includes people from its engineering teams, sales teams, and research departments, as well as a representative from Microsoft’s Office of the Chief Economist.
“One way of accounting for the broad use of the technology across a range of different sectors is to bring a range of perspectives to the actual work,” says Natasha Crampton, head of Microsoft’s Office of Responsible AI. “As much as possible, you want to try and have a diverse group of people actually building the technology. That then helps you try and avoid the situation where your training data is a mismatch for the real-world conditions in which the technology is used.”
Aether, which stands for ‘AI, Ethics and Effects in Engineering and Research’, was the first step in creating greater accountability and communicating what can and can’t be done at a development level among the many teams operating across the globe. The committee was tasked with ensuring that any AI work at the company followed a set of new principles championed by Microsoft – fairness, reliability and safety, privacy and security, and inclusivity of wider society – all of which are underpinned by the ideas of transparency and accountability.
While Aether proved useful for communicating these new ideas, the committee functioned solely as a non-binding advisory body, promoting fairly broad principles without the power to create or enforce policy. Its limitations in effecting change soon became clear, and in 2018 Microsoft created the Office of Responsible AI, a separate department responsible for enforcing new development policies and maintaining oversight of all AI development, with the help of the committee.
“You can think of the Office of Responsible AI as the governance complement to the advisory function of the Aether Committee work,” explains Crampton, who has headed up the unit since July 2019. “So together, we have this company-wide approach where we use Aether as this thought leadership body to unravel some of these really thorny questions, and my office sort of defines the company-wide approach and roles.”
Crampton and her team have been tasked with coming up with a set of internal policies to help govern the development of AI at Microsoft – in other words, safeguards to prevent a repeat of Tay. A large part of that is case management, which ensures that new and existing projects are able to effectively translate academic AI papers into something fit for the real world, as well as improving communication between development, sales and public-facing teams – something that was clearly ineffective during the rollout of Tay in 2016.
One way it does this is by assessing projects for ‘sensitive use cases’ – anything that’s likely to infringe on the fundamental human rights of the end user, lead to the denial of services (such as a loan application), involve the public sector, or harm the end user, whether physically, psychologically, or materially. Through this process, the Aether Committee and the Office of Responsible AI can triage use cases as Microsoft works with customers, and pass recommendations up to the senior leadership team.
At least, that’s the idea. Most of Microsoft’s AI policies are still in a pilot phase.
“We’ve been piloting the standard for coming up to five months now and we’re starting to get some interesting learnings,” explains Crampton.
“We’re imminently about to bring on more groups as part of our phase 2 – the rollout is accompanied by research and a learning process as we go, because the whole point of the pilot is to make sure that we are refining the standard and improving it as we learn more. Based on what we see now, my expectation is that by the end of the next financial year, so in 2021, we will have it rolled out across the company.”
Owning up to mistakes
Microsoft’s work in this area is as much about righting the wrongs of the past as it is about ensuring AI is built for everyone. Tay was certainly a blemish on Microsoft’s development history, but the incident also shattered any trust the public had placed in AI-based software and served as validation for those who feel the technology is overhyped, or perhaps even dangerous.
“We’ve tried to approach this in a humble way, recognising that we don’t have all the answers yet … these are really tough challenges that we’re confronting,” says Crampton.
“The near term role of it is to really be a champion for developing new practices … to uphold our principles. I really want to be forward leaning on that and really try and do what we did for privacy or security and embed responsible AI considerations into our thinking when we’re developing technologies.”
Crampton believes her soon-to-be company-wide AI policies are a reflection of a maturing Microsoft, a company trying to be more responsible with how it develops software, and how it speaks to its customer base.
“When customers come to us and ask us to help them with solutions, we think about how we can help the customer with a product in a way that upholds our principles,” says Crampton. “And … sometimes we are conscious of how a system may be expanded in the future.
“A number of our AI technologies are building blocks, especially when we think about some of our cognitive services. By design, we make available the building block and the customer adds various components. If we use facial recognition as an example, we provide the [facial] matching technology, but the customer adds their cameras, their database, their people and their processes. Recognising that there are all of these different actors, different people contributing different things to create an overall system and interaction between components is really important, and so is the accountability between different players.”
“We really try hard to communicate the limitations of our technology,” adds Crampton. “In the past, I think Microsoft has really just focused heavily on the capability. We think it’s important to talk about the limitations as well. And so we are trying to communicate those to our customers so that they are educated about the downstream issues that could arise.”