The term “exponential technology” was made popular by Google’s director of engineering, Ray Kurzweil, in his 2005 best seller The Singularity is Near. “Exponential” refers to the speed and nature of technological change: it is fast, usually a doubling in capability within a short period of time. These digital technologies, fueled by advances in computer programming and in the processing power of hardware, have been “disruptive” in that they significantly alter the economy and society.
These technologies mostly originated in the public sector (especially in military and space research). Yet while they significantly changed the private sector, they have had less impact on public institutions, at least so far. What impact might they have on an international public institution like the United Nations?
This question is of personal interest to me because I spent almost all my working life as a UN staff member, immersed in its inner culture with its strong preference for precedent and tradition. It is not exactly an ‘exponential technology friendly’ culture. But change is coming. This article looks at a daily and extensive UN activity – intergovernmental meetings – and checks its components against possible disruption by these technologies. Given the military origins of many of these technologies, it would also be interesting to look at UN activities on the peacekeeping side of the house. But that will be the focus of a Part II to the current piece.
The UN is an organization of 193 member states, and meetings are one of its main activities: they let the members discuss matters they consider important. These discussions cover a wide range of issues from peace and security to economic development, human rights, education, health, social protection, environment, and the use of outer space, to mention just a few. The opportunity to discuss is central to sifting through positions and building consensus, which is the basis for global agreements, solutions, and standards. To a novice, a UN meeting may look like nothing more than an endless talk fest, but diplomacy needs a lot of talking and the meetings make it possible.
The extent of meetings that regularly happen in the UN as a whole is not easy to pin down because, in addition to the permanent inter-governmental bodies and their subsidiaries, there are many temporary or purpose-specific inter-governmental processes such as ad hoc working groups, task forces, and expert groups that affect the total number. These meetings originate from the following types of UN entities:
- The four permanent UN organs of General Assembly, Security Council, Human Rights Council (HRC) and the Economic and Social Council (ECOSOC), and their subsidiaries total 84 inter-governmental groups. They meet regularly for different periods of time from a few days, to a few weeks, to several months. They mostly meet in New York and Geneva.
- The UN system organizations that include various funds and programmes, specialized agencies, regional economic commissions, as well as the World Bank and IMF (full list here). Each has an intergovernmental governance body plus a host of permanent and temporary subsidiaries. Most are based in Geneva or New York and hold their meetings there. Some are based in other “UN cities” such as Nairobi, Paris, Rome, Santiago, and Vienna (see full list here).
- The intergovernmental bodies composed of signatories of international legal instruments such as conventions, protocols, and treaties. Since its creation 75 years ago, the UN member states have agreed on a large number of such multilateral agreements. In the field of environment alone there are 1300 such legal instruments. They are governed by their Conference of the Parties (the signatory countries of the agreement) and they have multiple subsidiaries. Their meetings review compliance and progress with the legal instrument.
A lot of meetings happen daily at the UN. On any given day at least a dozen meetings, maybe more, can take place at the UN Headquarters in New York. Meetings also take place in the other cities that host UN organizations. Each meeting depends on the following components:
- Simultaneous interpretation
- Documents and reports
- Conference servicing
- Verbatim and Substantive note taking
- Administrative support and security arrangements
Simultaneous interpretation
The UN rules require that all formal meetings be conducted in the six official languages. Simultaneous interpretation, that is, speech-to-speech translation in real time, makes this happen. Interpretation booths are linked to the sound system; government delegates and other participants put on the earphones available in the room, select their preferred language, and follow the meeting without missing a beat.
The UN interpreters usually speak 2 or 3 of the six official languages fluently but equally important is that they are also fluent in “UN-ese,” that is, knowledge of the phrases, concepts, conceptual nuances, and the many acronyms that fill the UN discourse.
Simultaneous interpretation is a skill that needs special training. Hearing speech in one language and reproducing it in another language without losing the nuances and context, while also paying attention to the speaker’s subsequent sentence, is taxing on the human brain. Therefore the rules limit the working hours of simultaneous interpreters to no more than 24 per week. The rules also require two interpreters per language per 3-hour session to allow for breaks, without which interpretation accuracy declines.
Simultaneous interpretation is one of the high-cost components of UN meetings: a team of 12 for each session (two interpreters for each of the six languages), and often multiple teams engaged at the same time in different meeting rooms. Interpretation cost is higher when interpreters are deployed to off-site meetings, such as UN conferences and summits. As a savings measure, the UN uses remote interpretation if the event venue has a reliable and fast internet connection.
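To make the staffing arithmetic concrete, here is a minimal sketch that uses only the figures quoted above (six languages, two interpreters per language, 3-hour sessions, a 24-hour weekly limit). The function names are hypothetical, and the weekly bound assumes an interpreter is charged the full three hours of every session covered, which may not match how the UN actually counts working time.

```python
# Illustrative staffing arithmetic. The figures come from the text;
# the names and the weekly-hours accounting are assumptions.
OFFICIAL_LANGUAGES = 6          # official UN languages
INTERPRETERS_PER_LANGUAGE = 2   # two per language per 3-hour session
SESSION_HOURS = 3               # length of one session
WEEKLY_HOUR_LIMIT = 24          # maximum interpreting hours per week


def interpreters_per_session(parallel_meetings: int = 1) -> int:
    """Interpreters needed at the same time for a number of parallel meetings."""
    return OFFICIAL_LANGUAGES * INTERPRETERS_PER_LANGUAGE * parallel_meetings


def max_sessions_per_week() -> int:
    """Upper bound on sessions one interpreter can cover in a week
    (assumes each session counts as a full 3 hours)."""
    return WEEKLY_HOUR_LIMIT // SESSION_HOURS


print(interpreters_per_session())    # one meeting room: 12
print(interpreters_per_session(4))   # four rooms in parallel: 48
print(max_sessions_per_week())       # 8
```

Even under these simplified assumptions, a handful of parallel meetings quickly requires dozens of interpreters on duty at once, which is why interpretation is among the costlier components.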
Because of the context a human brings to it, simultaneous interpretation is difficult to automate but it is especially challenging in the UN given the peculiar language known as UN-ese. Yet there are technological developments that are readying to disrupt the interpretation world.
Microsoft Translator software has been in development for more than two decades, initially limited to a handful of languages. In 2012, at a meeting in China, the former head of Microsoft Research demonstrated the software’s ability to translate and play speech back in Chinese and in his own voice. Today the software covers 60 languages, and the list includes all six UN languages. It can translate both text and speech. The output is not simultaneous, but the gap between the input and the translated output is closing fast. The speaker needs to articulate well and pause between sentences. Companies and educational institutions use the software for their online presentations, webinars, and lectures.
Another technology is Google Translate, which we know mostly as an online text-to-text translator covering over 100 languages. It can now translate speech as well. If you have a smart device and the necessary app installed, you can choose the input and output languages, speak into the device’s microphone, and get your words translated, usually in written form. In some languages the app also gives a spoken result. Native speakers find that both the text and speech translation by Google still have problems with idioms, slang, and errors of transliteration. But it works better than it did only a few years ago, and with crowdsourcing the software is learning from native speakers and improving its accuracy. In 2019 Google announced a new tool, Translatotron, for speech-to-speech translation without the intermediate step of converting speech to text first.
An important development in this area was announced in October 2018 by Baidu Research: it had achieved “the first simultaneous machine translation system with anticipation capabilities and controllable latency.” Their software does speech-to-text translation but with an unusual twist: it can anticipate content from context, similar to what simultaneous interpreters do. In fact, Baidu researchers studied human interpreters in the course of their software development process.
These technologies are highly customizable. So they can “learn” UN-ese by processing tens of thousands of UN interpretation and translation records and by adding the anticipation skill Baidu is developing. Processing that amount of material may take a human many years but mere minutes, if not seconds, for an algorithm. Given the rapid evolution of machine learning, neural networks, natural language processing, and sound recognition, one can legitimately expect fast improvements in automated translation and interpretation technology.
Does this technology pose a challenge to the simultaneous interpreters at the UN? In the near term “No,” but in the long term “Yes.” The available software is still far from the sophistication of human interpreters, and given the peculiarities of spoken language, reaching that level will need time. So while the current simultaneous interpreters at the UN need not worry, those starting a simultaneous interpreter career, say, 10 years from now may need to be prepared for an early end to what is normally a long and steady career.
In the meantime hybrid situations are likely to become the norm. For example, there may be tiers of interpretation service: human-provided for extremely important, sensitive, and complex international negotiations, and algorithm-provided for the rest. As I write this I recall UN meetings in which we could predict what the delegates would say before they spoke because they often repeated themselves. Such repetitive content would not be much of a challenge to an algorithm.
Documents and reports
The UN produces tens of thousands of pages of documentation annually. The total is higher when there is a special event such as a summit or conference. One way to see and experience the extent of documents produced is to subscribe to the daily document digest. Luckily the practice of printing documents has largely been discontinued, replaced by digitally available versions. When I was a new UN staff member in the early 1990s, attending a meeting involved carrying around hundreds of pages of documents, so e-documentation was a relief.
UN rules require that official UN documents be produced in all 6 languages. There is also demand for supplementary documents that are not official but useful, such as background papers, to be available at least in English, French and Spanish. Document translation, in New York, is handled by the Documentation Division of the Department for General Assembly and Conference Management (DGACM). The Division has dedicated translation teams for each official language. Unlike interpreters, translators do not have restrictions on their working hours. But like the interpreters, the UN translators must be very familiar with the peculiar UN expressions and the historical context of an issue, which makes them more than basic translators.
Software is more advanced for translation. Those old enough will remember the early days of automatic text translation by Google, and what an exercise in frustration it was, not only because it could convert only a limited number of languages but, worse, because it produced mostly transliterated gibberish. Today Google Translate handles more than 100 languages and is a great deal more sophisticated, though still far from perfect.
A 2013 trends report by the International Federation of Library Associations (IFLA) noted the possible problems with automated translation in diplomatically sensitive circumstances but predicted that in “…5-10 years there will be automated translation techniques which will adequately support 90% of most communication need.”
Machine translation software is advancing and learning to handle more complex and technical text, although none is advanced enough yet, and Google itself advises that “no automated translation is perfect nor is it intended to replace human translators.” With respect to the language particularities of UN documents, there is an in-house effort, a collaboration between the World Intellectual Property Organization and the UN Secretariat, to develop a machine translation prototype specific to UN documents.
Should these developments worry the UN translators? Not in the near term but probably a little more than the simultaneous interpreters because translation software combined with machine learning and use of neural networks is improving fast. There is also the return on investment matter: priming the translation software for the UN is not high priority when customizing for the private sector has larger returns on the investment.
What is more likely is that UN translators will move fast into hybrid working situations in which humans work together with the translation software. One arrangement already expected is for human translators to apply their UN-specific skills as editors of machine-translated material and provide quality control of the automated output. Similarly, using machine translation can give UN translators more time for the contextual research necessary for accurate translation of politically sensitive documents.
But in the long-term, given the always present calls for reduced cost of services in international organizations and the exponential developments in language software, most UN translators, if not all, are likely to become redundant.
Conference servicing
Servicing a meeting involves a minimum of two staff, one professional and one general service staff (the latter is the UN term for assistants, or secretaries as they were called a few decades ago). In a mid-size room 2-4 staff would be assigned, and 4-8 in the large meeting rooms and the General Assembly Hall. The professional staff person functions as the Secretary to the inter-governmental group that is in session. Their primary responsibility is to support the Chair (or the Co-chairs) of the meeting. The general service staff support the committee secretary and handle the in-room logistics, such as helping country delegates find their assigned seats or set up their name plates.
In the past both these staff categories had a more critical role. For example, the committee secretaries had to keep track of the delegates’ requests for the floor, handwritten and reflecting the order in which the requests came. They needed to be familiar with the delegates especially their rank in the hierarchy to advise the meeting Chairperson(s) accordingly. They knew the applicable parliamentary procedures to handle a procedural question or challenge. The assigned general service staff did a lot of walking to manually place the country name plates, distribute a variety of printed meeting documents to the delegates, and collect the printed copies of the statements made for records.
These functions became redundant after the 6-year, multi-billion-dollar renovation of the UN building and meeting rooms from 2008 to 2014. The physical nameplates are no longer necessary because they have been replaced by small digital boards on each desk. Requests to speak used to be made by placing your country name plate vertically in its stand, but now the request is made by pressing a button connected to a digital system that records the requests automatically, in the order in which they are made. The requests are compiled in real time and shown on a screen in front of the Chairperson. No need to wave a nameplate to get attention and no need for a hand-written speakers’ list. There is hardly any printed document distribution because documents are available electronically on a dedicated web site. With these and other improvements in the meeting room infrastructure, digital technologies have already disrupted conference servicing at the UN.
Given the significant changes in the way a UN meeting is serviced, one could assume there would be fewer conference servicing staff, but that does not seem to be the case. While I like seeing people employed instead of made redundant by technology, this should not lead to underutilizing well-educated and skilled people like the UN staff. With most of their traditional functions no longer necessary, many conference servicing staff at the podium or in the room have little to do, often checking their emails or social media feeds. A more human-centered approach would be to retrain and redeploy them to more engaging and meaningful jobs. This would also downsize the conference servicing Department to a small division, making it possible to eliminate a number of senior-level posts and save on cost. Last I checked, this Department had one Under-Secretary-General, one Assistant Secretary-General, and several Directors, with salaries totaling well over US$ 1 million a year. But ideas like this run into politics. The norm with organizational reforms at the UN has been to reduce staff at the lower levels, not the higher, even though reducing the number of chefs in the kitchen and keeping more of the cooks would be more logical.
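The electronic speakers’ list described above is, in essence, a first-come, first-served queue. The sketch below is purely illustrative: the class and method names are hypothetical, and nothing here reflects how the actual UN conference system is implemented. It assumes that a repeated button press from the same delegation does not change its place in line.

```python
from collections import deque
from typing import List, Optional


class SpeakersList:
    """Hypothetical first-come, first-served speakers' list."""

    def __init__(self) -> None:
        self._queue = deque()  # delegations in the order they asked

    def request_floor(self, delegation: str) -> None:
        """Record a button press; a repeat press keeps the original position."""
        if delegation not in self._queue:
            self._queue.append(delegation)

    def next_speaker(self) -> Optional[str]:
        """Give the floor to the earliest request, or None if nobody is waiting."""
        return self._queue.popleft() if self._queue else None

    def pending(self) -> List[str]:
        """The list as the Chairperson's screen might show it."""
        return list(self._queue)


speakers = SpeakersList()
for name in ["Kenya", "Brazil", "Kenya", "Norway"]:
    speakers.request_floor(name)

print(speakers.pending())       # ['Kenya', 'Brazil', 'Norway']
print(speakers.next_speaker())  # Kenya
```

The queue replaces exactly the part of the committee secretary's old job that was mechanical, namely recording who asked first. Judgment calls, such as advising the Chair on rank and protocol, are not captured by it.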
Verbatim and Substantive note-taking
The verbatim note-taking is done by teams for each of the six UN languages. The teams are staff of the DGACM, the Department that provides conference services. The tradition of verbatim reporting was for record-keeping purposes. That was before webcasting and video recording of meetings became possible and regular. Today, UN meetings are webcast and recordings are usually available later on demand. For how long is another matter, because I recall there was always a server capacity issue, so older information would need to be moved, archived or deleted.
The live webcast includes the audio from the interpretation booths for the official languages, but the video records are not available in those languages. As long as a language version of the video is not always available, there will need to be verbatim records for later, especially for those meetings on contentious issues where going back to the records may be necessary. Yet with all the technological development, it is hard to justify why video in the languages cannot be available. It is likely due to solvable problems such as server capacity, or lack of technology that can combine the sound from interpreters with the video of the meetings for language versions. Having all meetings recorded and available in all the languages is a matter of time, considering webcasting was rarely available, if at all, a few decades ago. Given the trajectory of rapid technological change, verbatim note-taking is likely to become redundant and the staff doing this work will need to be retrained and redeployed.
The substantive note takers come from the department or the UN body that works on the topic of the meeting. So in a meeting on sustainable development substantive note takers will come from the Department of Economic and Social Affairs because the issue is a mandate of the department. The note takers are familiar with the discussion topic, including knowledge of its history and context. Their notes are the basis for an analytical meeting report which many people seek for learning and research.
There are many paid and free software packages for speech recognition and transcription. A recent review of ten of them includes software with 99% accuracy before any training for specific voices, and another that works in 30 languages. Some are smartphone apps which, with the right cable to plug the phone into the interpretation output, might give anyone in the room a transcription of the meeting in one of the six languages.
Are these apps and other software good enough for transcribing politically complex discussions? Some argue that speech recognition software cannot replace human transcribers because it is limited by its need for extensive voice recognition training, and usually misses context, homonyms, names, and specialized language. Such limitations are important drawbacks for UN meetings, where speakers come with a large variety of voices and accents, have challenging names, and speak UN-ese. It seems current speech recognition and transcription software, even with extensive training for a variety of voices and accents, needs more time to mature and for the near term can at most be a complementary tool, not a replacement for the human transcribers.
In the meantime, the UN needs to explore the new technologies relevant to servicing of meetings. A step in this direction is a request for information released by the UN Office in Geneva in late 2019. The request asks interested companies about the capabilities of their automatic speech recognition as well as human-based transcription products. The purpose is to gather insight on “available solutions for implementing speech recognition for live captioning of meetings and for generating searchable verbatim transcripts in the six official UN languages to make them available shortly after a meeting for a number of uses.” But these steps need to keep in mind resolutions by UN member states that “…verbatim and summary records remain the only official records of the meetings of United Nations bodies” (A/74/32 paragraph 83), which somewhat limits experimentation.
As for substantive reporting, it needs expertise, as well as an educated and analytical mind, knowledge of the issues, and familiarity with the inter-governmental arena. Surely an algorithm cannot do the job. In fact, for the longest time the assumption about automation has been that it works well for repetitive-routine-manual work, but not with non-routine, non-manual, and cognitive work. The emerging technologies are fast challenging this long-standing assumption.
Jobs involving senior management, composing music, writing poetry or prose can and have been at least partially automated: the iCEO software successfully handles 80-90% of top management tasks; AI poetry writes poems that are hard to distinguish from those written by human poets (there is a test you can take on this); and EMI, which stands for Experiments in Musical Intelligence, composes original music.
In fact there are many available options for automated writing, some of which are in use by news outlets. Wordsmith, by Automated Insights, is content writing software that produced over 300 million pieces of content in 2013 alone. The Associated Press, Yahoo Sports, Allstate, and Comcast are among the companies that use the software regularly. Similar software is Quill, by Narrative Science. This software not only identifies and analyzes data on a topic but also writes narratives using the data. Topics can be corporate earnings, sports stories, or the analysis of your Twitter engagement. In an interview with Wired Magazine in 2012, one of the Narrative Science founders predicted that in 20 years “there will be no area in which Narrative Science doesn’t write stories”.
A newer entry into this field is Kensho and its eponymous software that writes financial analyses and asset performance reports, among other financial reporting. In a 2016 New York Times interview, the founder and CEO of Kensho, Daniel Nadler, predicted that “…within a decade, between a third and a half of the current employees in finance will lose their jobs to Kensho and other automation software.” Human analysis cannot match the speed of Kensho, which can conduct in a matter of minutes a big data financial analysis that would take an expensive financial analyst up to a week or longer.
The writing by software is good enough that readers find it difficult to distinguish it from writing by a human. The New York Times has developed a simple test with 8 writing samples in which you guess whether each was written by a human or a machine. You can try the test here. I took this test and guessed only half of the 8 correctly, which I thought was dismal until the software told me I did 24% better than those who took the test before me.
So if machines can write stories, analyses, music, poetry and more, can they also write UN reports? During my years at the UN, I have written my share of progress reports, analytical reports, survey reports, summary reports and other types of reports mostly on behalf of the Secretary-General because in the UN all reports are produced by the SG and the rest of the staff are the living bots who do the work.
UN reports are not in the creative writing category. In fact many are just routine reporting of what happened and who said what in an inter-governmental process. Political situation reports appear more analytical, but in effect they are summaries of news articles, which a bot could seemingly produce without difficulty and in a matter of seconds. Many of the UN reports covering economic and social issues are based on review and analysis of data, which takes a long time for humans. In fact, analysis of a social or economic survey may take months for an entire team. Many UN reports also have repetitive content and style, such as reports on the parliamentary process, recounting the election of the Bureau and Chair, adoption of the agenda, and the list of speakers on the different agenda items. Only a handful of the thousands of reports produced every year by the UN can genuinely be labelled creative and cognitively inspiring. All this makes a lot of UN reports good candidates for bot writing.
Should the young PhDs just starting a career in the UN and looking forward to many years of writing brilliant political and economic analyses be worried? Again, not immediately, but given the trends, in about 10 years they would be writing together with algorithms and in another 10 years half may be replaced by software just like it is happening today in the finance sector. So working for and retiring from the UN after 25-30 years may not be in the works for people joining today as writers and analysts.
Administrative support and security arrangements
UN meetings need administrative support functions such as support for the travel of government delegates from least developed countries (LDCs), management of the meeting schedules and rooms, provision of sound and audio-visual technology for meetings, and tracking of and reporting on these services and related budgets for oversight by the General Assembly.
In the early 2010s the UN launched Emeets – an electronic meetings management system for requesting meeting space and services. This system is available in several UN cities (New York, Geneva, Nairobi and Vienna) for use by UN staff and governments. There is also a one-stop-shop interface named Gmeets, through which “Clients …..select the services they require from a menu that enables direct access to: conference room allocation; interpretation; nameplates, podium signs and room set-up; publishing material in the Journal; audio-visual services; loaning of technological equipment; webcast services (UN Web TV); access and security related services.” The user response to these systems was mixed, but overall almost everyone was glad they no longer had to make many phone calls, send many emails, and fill out forms to get a room with a sound engineer.
A major step to digitize and automate administrative tasks was taken in 2008, when the UN changed its administrative management platform to a new Enterprise Resource Planning (ERP) system named Umoja, a Swahili word that means “unity.” It was launched in 2015 although it is still a “work in progress.”
There is a lot to say about Umoja, including its progressively increasing cost, from the initial $248 million to over half a billion by 2019; its ever-extending completion deadline, from the original 2012 to the present; and the frustration it has caused many, including the then head of the UN’s Department of Political Affairs, who called it an “unmitigated dismal experience” in a 2016 Foreign Policy article.
Umoja is a self-service system for administrative tasks ranging from travel requests to requesting a new ID. At least initially, this ambitious system created a more labor-intensive administrative process rather than an automated and lean one. In fact, I recall how during the first six months after the launch of the initial system, parts of it had to be shut down during payroll processing periods to prevent a total shutdown: the expensive system could not multi-task.
Given these and similar efforts inside the Organization to digitize and automate administrative functions, including those for better meetings management, it would be superfluous to list here the automated meeting management software that exists in the market. If the ongoing system changes reach their goals, the UN can look at reductions in support and administrative staff. Any self-service system eliminates the jobs of those who previously provided these services. With hardly any support staff left, the class system of professional versus general service staff will erode, and that would be a good thing. But self-servicing by professional staff means some of their time will be spent on making travel requests and the like, which is not a good use of their time and skills.
As for security arrangements: the UN has its own security officers because they are necessary for keeping the UN grounds, which is international territory, safe.
When you come to the UN, usually the first point of contact will be a UN security officer at the entrance. Their functions have already changed with automation. For example, they no longer need to carefully inspect the IDs of people entering the campus because the UN IDs have chips that are scanned automatically at the entrance turnstiles. The chips are programmable, so the access level of a staff member is encoded in their ID. If their ID does not have the code to get into, say, the Security Council room, they will not get through, even as UN staff. While getting in and out of the UN grounds and moving inside the campus are mostly automatically controlled, there are still a lot of security officers monitoring the entrances and the inside of the buildings, especially around the meeting rooms.
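The chip-based access check described above can be pictured as a simple membership test. This is a hedged sketch only: the `Badge` class, the `may_enter` function, and the area codes are all invented for illustration and say nothing about how the real UN badge system works.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Badge:
    """Hypothetical ID badge whose chip stores the areas its holder may enter."""
    holder: str
    access_codes: Set[str] = field(default_factory=set)


def may_enter(badge: Badge, area_code: str) -> bool:
    """A turnstile or door reader lets the holder through only if the
    area's code is programmed on the badge."""
    return area_code in badge.access_codes


# Invented area codes, for illustration only.
staff_badge = Badge("staff member", {"GROUNDS", "CONFERENCE_ROOMS"})

print(may_enter(staff_badge, "GROUNDS"))                  # True
print(may_enter(staff_badge, "SECURITY_COUNCIL_CHAMBER")) # False
```

A check like this automates only the yes-or-no decision at the door; it does nothing to replace the officers' observation of people and behaviour.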
As a young new staff member, I used to get annoyed by the heavy handed security officers. Then I had a chance to work closely with the UN Security because my assignment then involved non-governmental actors and some would start a demonstration and get “arrested” because demonstrations are not allowed on UN grounds. It was my job to get the NGO representatives out of the mess, preferably without having their access to the UN revoked. In the course of this work I interacted with a lot of UN Security officers and I learned, among other things, how many bomb threats the UN receives daily. It was a lot and I came to appreciate the security people a great deal more since.
The UN attracts not only the idealistic people who want to “make the world a better place” but also the very evil-minded and the crazy. Automating security functions in this place and eliminating the officers would make no sense, even though robotic security technology exists (like the robot soldier Atlas or the Cobalt security robot). A human security officer can detect hundreds of small emotional and cognitive details about the thousands who come through the UN doors. Such cognitive skills and reading of emotional detail are still difficult for machines, which are also not as flexible in their movements as a human, say, when running up the stairs – impossible for robotic security.
In summary, a lot of UN jobs can be made redundant by technology in the future. But before their jobs are eliminated by technology, UN staff may as well learn to work with the emerging technologies, sort of job-sharing with the bots. And the UN leadership should invest in such learning to be an example of human-centeredness. Luckily the young people coming into the Organization are increasingly those who grew up with the Internet, smartphones, and other digital gadgets and already have a mind-set amenable to working with bots. Those UN staff who joined 15 or 20 years ago and are in their late 40s or 50s now will find the adjustment more difficult. The staff unions in New York, Geneva, or Nairobi will fight these changes and fight to keep people in their jobs, but a better union strategy would be to demand training and retraining for that hybrid work situation.
Unfortunately, as happens with all disruptive changes, those harmed most and first will be those at the bottom of the hierarchical food chain, even though it would make a lot more sense to keep them and reduce the numbers at the top, given that those at the top constitute the highest labor cost in the UN. The top layer is also the least ready to master, handle or understand the emerging technologies, considering many are in their 60s. If there must be cost-saving measures, one can also look at other places such as travel, especially by the top leadership. Their travel is almost always business class, and their travel allowances are much bigger. Most are on the road half the year if not more. Why not limit actual travel and use holograms or similar ideas so they can still participate in meetings but not add to overall cost or to the carbon footprint?
In this article I focused on technologies that are likely to disrupt one of the key activities of the UN (meetings). There are other organizational implications of exponential technologies for the UN in areas such as humanitarian work and peacekeeping, which will be covered in Part II. A third part of the series will explore how and if the United Nations and its work force can remain relevant in a world that is not the one for which it was created.
May 2016 – Updated in January 2020