Artificial Intelligence in Journalism
How do we enable the benefits and manage the harms of artificial intelligence in journalism?

LAST UPDATED: December 01, 2023
Developments in AI carry new legal and ethical challenges for how news organizations use AI in production and distribution as well as how AI systems use news content to learn. For newsrooms, the use of generative AI tools offers benefits for productivity and innovation. At the same time, it risks inaccuracies, ethical issues and undermining public trust. It also opens the door to abuse of copyright in journalists’ original work. To address these challenges, legislation will need to offer clear definitions of AI categories and specific disclosures for each. It must also grapple with the repercussions of AI-generated content for (1) copyright or terms-of-service violations and (2) people’s civil liberties that, in practice, will likely be hard to identify and enforce through policy. Publishers and technology companies will also be responsible for establishing transparent, ethical guidelines for and education on these practices. Forward-thinking collaboration among policymakers, publishers, technology developers and academics is critical.
Early forms of artificial intelligence (prior to the development of generative AI) have been used for years to both create and distribute online news and information. Larger newsrooms have long leveraged automation to streamline production and routine tasks, from generating earnings reports and sports recaps to producing tags and transcriptions. While these practices have been far less common in local and smaller newsrooms, adoption there is growing. Technology companies also increasingly use AI to automate critical tasks related to news and information, such as recommending and moderating content as well as generating search results and summaries.
Until now, public debates around the rise of artificial intelligence have largely focused on its potential to disrupt manual labor and operational work such as food service or manufacturing, with the assumption that creative work would be far less affected. However, a recent wave of accessible – and far more sophisticated – “generative AI” systems such as DALL-E, Lensa AI, Stable Diffusion, ChatGPT, Poe and Bard has raised concerns about their potential for destabilizing white-collar jobs and media work, abusing copyright (both against and by newsrooms), giving the public inaccurate information and eroding trust. At the same time, these technologies also create new pathways for sustainability and innovation in news production, ranging from generating summaries or newsletters and covering local events (with mixed results) to pitching stories and moderating comments sections.
As newsrooms experiment with new uses of generative AI, some of their practices have been criticized for errors and a lack of transparency. News publishers themselves are claiming copyright and terms-of-service violations by those using news content to build and train new AI tools (and, in some cases, striking deals with tech companies or blocking web crawler access to their content), while also grappling with the potential of generative AI tools to further shift search engine traffic away from news content.
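In practice, publishers typically block crawler access through a robots.txt file. The sketch below illustrates the approach, assuming a publisher wants to opt out of common AI-training crawlers; the user-agent tokens shown (GPTBot, Google-Extended, CCBot) are documented by their operators, but compliance with robots.txt is voluntary rather than legally enforced:

```text
# robots.txt — illustrative directives a news publisher might use to
# opt out of AI-training crawlers (observance is voluntary)

User-agent: GPTBot           # OpenAI's web crawler
Disallow: /

User-agent: Google-Extended  # Google's AI-training opt-out token
Disallow: /

User-agent: CCBot            # Common Crawl, a source of many training datasets
Disallow: /

# Conventional search indexing by other crawlers remains allowed
User-agent: *
Allow: /
```

Note the trade-off this leaves unresolved: blocking training crawlers does not block search crawlers, so publishers can opt out of model training while still receiving search traffic, but only where the two functions use distinct user agents.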
These developments introduce novel legal and ethical challenges for journalists, creators, policymakers and social media platforms. This includes how publishers use AI in news production and distribution, how AI systems draw from news content and how AI policy around the world will shape both. CNTI will address each of these, with a particular focus on areas that require legislation as well as those that should be a part of ethical journalism practice and policy. Copyright challenges emerging from artificial intelligence are addressed here, though CNTI also offers a separate issue primer focusing on issues of copyright more broadly.
Source: OpenAI’s ChatGPT
What Makes It Complex
In considering legislation, it is unclear how to determine which AI news practices would fall within legal parameters, how to categorize those that do and how news practices differ from other AI uses.
What specific practices and types of content would be subject to legislative policy? How would it apply to other types of AI? What constitutes “artificial intelligence” is contested, and definitions vary depending on how broad the scope is (e.g., whether it includes or excludes classical algorithms) and whether one uses technical or “human-based” language (e.g., “machine learning” vs. “AI”). New questions have emerged around the umbrella term of “generative AI” systems, most prominently Large Language Models (LLMs), further complicating these distinctions. These questions will need to be addressed in legislation in ways that protect citizens’ and creators’ fundamental rights and safety. At the same time, policymakers must consider the future risks of each system, as these technologies will continue to evolve at a faster pace than policy can reasonably keep up with.
Among these concerns is what data is collected and how it is used by AI tools, in terms of both the input (the online data scraped to train these tools) and the output (the automated content itself). For example, Stable Diffusion was originally trained on 2.3 billion captioned images, including copyrighted works as well as images from artists, Pinterest and stock image sites. Additionally, news publishers have questioned whether their articles are being used to train AI tools without authorization, potentially violating terms-of-service agreements. In response, some technology companies have begun discussing agreements to pay publishers for use of their content to train generative AI models. This issue raises a central question about what in our digital world should qualify as a derivative piece of content, thereby tying it to copyright rules. Further, does new AI-generated content deserve its own copyright? If so, who gets the copyright: the developers who built the algorithm or the entity that published the content?
The copyright challenges of AI (addressed in more detail in our copyright issue primer) also present ethical dilemmas around profiting from the output of AI models trained on copyrighted creative work without attribution or compensation.
Establishing transparency and disclosure standards for AI practices requires a coordinated approach between legal and organizational policies.
While some areas of transparency may make sense to be addressed through legal requirements (like current advertising disclosures), others will be more appropriate for technology companies and publishers to take on themselves. This means establishing their own principles, guidelines and policies for navigating the use of AI within their organizations – ranging from appropriate application to labeling to image manipulation. But these will need to both fit alongside any legal requirements and also be similar enough across organizations for public understanding. Newsroom education will also be critical, as journalists themselves are often unsure of how, or to what extent, their organizations rely on AI. For technology companies specifically, there is ongoing debate over requirements of algorithmic transparency (addressed in more detail in our algorithmic accountability issue primer) and the degree to which legal demand for this transparency could enable bad actors to hack or otherwise take advantage of the system in harmful ways.
The use of generative AI tools to create news stories presents a series of challenges around providing fact-based information to the public; how these challenges should factor into legal or organizational policies remains an open question.
Early generative AI tools have been shown to produce content riddled with factual errors. Not only do they include false or entirely made-up information, but they are also “confidently wrong,” creating convincingly high-quality content and offering authoritative arguments for inaccuracies. Distinguishing between legitimate and illegitimate content (or even satire) will, therefore, become increasingly difficult – particularly as counter-AI tools have so far been ineffective. Further, it is easy to produce AI-generated images or content and use it to manipulate search engine optimization results. This can be exploited by spammers who churn out AI-generated “news” content or antidemocratic actors who create scalable and potentially persuasive propaganda and misinformation. For instance, while news publishers such as Semafor have generated powerful AI animations of Ukraine war eyewitness accounts in the absence of original footage, the same technology was weaponized by hackers to create convincing “deepfakes” of Ukrainian President Volodymyr Zelenskyy telling citizens to lay down their arms. While it is clear automation offers many opportunities to improve news efficiency and innovation, it also risks further commoditizing news and undermining public trust in it.
There are inherent biases in generative AI tools that content generators and policymakers need to be aware of and guard against.
Because these technologies are usually trained on massive swaths of data scraped from the internet, they tend to replicate existing social biases and inequities. For instance, Lensa AI – a photo-editing app that launched a viral AI-powered avatar feature – has been alleged to produce hypersexualized and racialized images. Experts have expressed similar concerns about DALL-E and Stable Diffusion, which employ neural networks to transform text into imagery and could be used to amplify stereotypes and produce fodder for sexual harassment or misinformation. The highly lauded AI text generator ChatGPT has been shown to generate violent, racist and sexist content (e.g., that only white men would make good scientists). Further, both the application of AI systems to sociocultural contexts they weren’t developed for and the human work in places like Kenya to make generative AI output less toxic present ethical issues. Finally, while natural language processing (NLP) is rapidly improving, AI tools’ training in dominant languages worsens longstanding access barriers for those who speak marginalized languages around the world. Developers have announced efforts to reduce some of these biases, but the longstanding, embedded nature of these biases and global use of the tools will make mitigation challenging.
OpenAI’s ChatGPT notes some of these biases and limitations for users.
State of Research
Artificial intelligence is no longer a fringe technology. Research finds a majority of companies, particularly those based in emerging economies, report AI adoption as of 2021. Experts have begun to document the increasingly critical role of AI for news publishers and technology companies, both separately and in relation to each other. And there is mounting evidence that AI technologies are routinely used both in social platforms’ algorithms and in everyday news work, though the latter is often concentrated among larger and upmarket publishers who have the resources to invest in these practices.
There are limitations to what journalists and the public understand when it comes to AI. Research shows there are gaps between the pervasiveness of AI uses in news and journalists’ understandings of and attitudes toward these practices. Further, audience-focused research on AI in journalism has found that news users often cannot discern between AI-generated and human-generated content. They also perceive there to be less media bias and higher credibility for certain types of AI-generated news, despite ample evidence that AI tools can perpetuate social biases and enable the development of disinformation.
Much of the existing research on AI in journalism has been theoretical. Even when the work is evidence-based, it is often more qualitative than quantitative, which allows us to answer some important questions, but makes a representative assessment of the situation difficult. Theoretical work has focused on the changing role of AI in journalism practice, the central role of platform companies in shaping AI and the conditions of news work, and the implications for AI dependence on journalism’s value and its ability to fulfill its democratic aims. Work in the media policy space has largely concentrated around European Union policy debates and the role of transparency around AI news practices in enhancing trust.
Future work should prioritize evidence-based research on how AI reshapes the news people get to see – both directly from publishers and indirectly through platforms. AI research focused outside of the U.S. and outside economically developed countries would offer a fuller understanding of how technological changes affect news practices globally. On the policy side, comparative analyses of use cases would aid in developing transnational best practices in news transparency and disclosure around AI.
University of Oxford (2023)
Summary: The authors of this preprint article examine 52 news organizations’ guidelines on the use of AI in the newsroom. While many of the groups implement similar policies (e.g., human supervision of automated content), variations exist at both the national and organizational level.
CNTI’s Takeaway: This type of comparative research is particularly insightful because it outlines common practices and sheds light on gaps that remain. As the use of AI technologies continues to grow in newsrooms, it is crucial to understand and promote the guidelines put in place that ensure high-quality journalism persists.
Journalism and Mass Communication Quarterly (2023)
Summary: This article examines a nationally representative sample in the U.S. and finds that greater trust in key actors like scientists and academics is associated with increased support for AI technologies, though political ideology influences these beliefs.
CNTI’s Takeaway: Public trust is important when communicating and developing new AI technologies. Understanding how the public views and interacts with these novel and powerful technologies is also beneficial to newsrooms and the companies that create these technologies.
Policy Design and Practice (2023)
Summary: This paper analyzes 31 governments’ AI strategies since 2017 and offers a typology of and themes for national policy responses to AI through legislation, standards and guidelines.
CNTI’s Takeaway: The three key themes identified in this paper around governing AI – development, control and promotion – are useful categories for assessing policy proposals and practices.
Digital Journalism (2022)
Summary: This paper provides an overview of AI’s rise in the news, considers how these changes risk shifting more control to and dependence on platform companies and proposes a research agenda for better understanding the role of platform companies in AI in the news.
CNTI’s Takeaway: This proposed research agenda highlights the need for a more systematic understanding of the actors involved in AI tools and services used in the news, as well as their intended uses, so we can assess what kind of regulatory or policy interventions would protect the news industry from overdependence on platform companies.
GRUR International (2022)
Summary: These legal scholars focus on issues surrounding copyright protections for AI-generated outputs, with a focus on “robojournalism” in Europe.
CNTI’s Takeaway: This paper argues that the extent to which journalism in Europe has relied on generative AI to produce content may not justify changes to the current copyright system. There are some useful considerations, but CNTI believes this argument may already be outdated and short-sighted from a policy perspective, considering the speed of technological innovation in this area. This speaks to the importance of ongoing research in this space.
Digital Journalism (2022)
Summary: These experts outline three key components of AI literacy and argue for the importance of closing the AI knowledge deficit within the news industry.
CNTI’s Takeaway: This offers a clear and useful overview of the various roles of AI in journalism and the forms of AI literacy that journalists should develop.
African Journalism Studies (2022)
Summary: This paper summarizes the state of AI use in newsrooms in African countries and proposes ideas for future research in this area as well as recommendations for addressing methodological challenges.
CNTI’s Takeaway: Further research should examine two key areas: 1) the actual application (or lack thereof) of AI in African newsrooms, including how AI technologies and algorithms developed in the Global North have been integrated into Global South newsrooms and how this changes the news production process and 2) the role government agencies will have in the oversight of automated journalism.
Digital Journalism (2022)
Summary: This case study evaluates the impact of China’s copyright law on artificial intelligence innovation in Chinese newsrooms.
CNTI’s Takeaway: More work is needed on copyright law and AI policymaking in journalism amid the rise of automated news, algorithmic distribution and digital content ownership.
Internet Policy Review (2020)
Summary: This work explores what “transparency” means in the context of AI, including its role in AI regulatory development, organizational policies and ethical guidelines.
CNTI’s Takeaway: “AI transparency” is a more useful concept than “algorithmic transparency” because it focuses on the system rather than on specific algorithms or components.
London School of Economics (2019)
Summary: This survey research on AI technologies in 71 news publishers across 32 countries shows that by 2019 AI was a significant – but unevenly distributed – part of the journalism process and introduces new editorial and ethical responsibilities.
CNTI’s Takeaway: This work provides a critical baseline understanding of AI use in newsrooms around the world and proposes key elements of AI strategy, ethics, editorial policy and journalist education for newsrooms and policymakers to draw from.
State of Legislation
The latest wave of AI innovation has, in most countries, far outpaced governmental oversight or regulation. Regulatory responses to emerging technologies like AI have ranged from direct regulation to soft law (e.g. guidelines) to industry self-regulation, and they vary by country. Some governments, such as Russia and China, directly or indirectly facilitate – and thus often control – the development of AI in their countries. Others attempt to facilitate innovation by involving various stakeholders. Some actively seek to regulate AI technology and protect citizens against its risks. For example, when it comes to privacy, the EU’s legislation has placed heavy emphasis on robust protections of citizens’ data from commercial and state entities, while countries like China assume the state’s right to collect and use citizens’ data.
These differences reflect a lack of agreement over what values should underpin AI legislation or ethics frameworks and make global consensus over its regulation challenging. That said, legislation in one country can have important effects elsewhere. It is important that those proposing policy and other solutions recognize global differences and consider the full range of their potential impacts without compromising democratic values of an independent press, an open internet and free expression.
Legislative policies specifically intended to regulate AI can easily be weakened by a lack of clarity around what qualifies as AI, making violations incredibly hard to identify and enforce. Given the complexity of these systems and the speed of innovation in this field, experts have called for individualized and adaptive provisions rather than one-size-fits-all responses. Recommendations for broader stakeholder involvement in building AI legislation also include engaging groups (such as marginalized or vulnerable communities) that are often most impacted by its outcomes.
Finally, as the role of news content in the training of AI systems becomes an increasingly central part of regulatory and policy debates, responses to AI developments will likely need to account for the protection of an independent, competitive news media. Currently, this applies to policy debates about modernizing copyright and fair use provisions for digital content as well as collective bargaining codes and other forms of economic support between publishers and the companies that develop and commodify these technologies.
In September 2021, Brazil’s Chamber of Deputies approved the Marco Legal da Inteligência Artificial to regulate the development of AI technologies and promote research on AI ethics and accountability. In May 2023, a bill was proposed to regulate AI use based on recommendations made by a working group created in 2022. The bill’s stated aim is to protect citizens’ fundamental rights and, like the EU’s AI Act, it introduces a risk-based regulatory model for AI systems.
Read more here.
Canada’s proposed Artificial Intelligence & Data Act (AIDA), Bill C-27, would place guardrails on AI uses and enforce penalties for noncompliance. These requirements focus on addressing AI bias, transparency, risk mitigation and record-keeping. A lack of definitional clarity in the legislation makes it unclear exactly which systems it proposes to regulate. Because the AIDA focuses on deliberately harmful AI use, there is also a lack of clarity on the “gray areas” of AI harms.
Read more here.
In 2021, China’s Ministry of Science and Technology published new ethical guidelines for the use of AI in China. These guidelines build on the 2017 “A Next Generation Artificial Intelligence Development Plan.” The guidelines focus on promoting and actively monitoring AI development by technology companies to achieve national priorities. They also mandate the use of the People’s Republic of China’s national standards for AI, as they apply to big data, cloud computing and industrial software.
The EU’s AI Act, approved by the European Parliament in June 2023, assigns AI applications to risk categories, banning uses of systems with “unacceptable” risk levels (violating fundamental rights or safety) and increasing regulation of “high-risk” systems. It also requires providers of generative AI models to disclose summaries of copyrighted data used in training. Concerns include a lack of clarity about what constitutes AI, flexibility in the regulation process and unintended legal implications for marginalized communities.
Read more here.
In recent years in the U.S., a range of legislative proposals have been introduced that would, to some extent, regulate AI. The Blueprint for an AI Bill of Rights, unveiled in October 2022, may set the stage for future legislation but lacks mechanisms for regulation. In March 2023, the U.S. Copyright Office launched an initiative to examine AI issues. In October 2023, the Biden administration released an executive order to develop safety guidelines around generative AI. Organizations have tracked AI proposals at the federal and state levels.
Read more here.
Resources & Events
Notable Articles & Statements
RSF and 16 partners unveil Paris Charter on AI and Journalism
Reporters Without Borders (November 2023)
These look like prizewinning photos. They’re AI fakes.
The Washington Post (November 2023)
How AI reduces the world to stereotypes
Rest of World (October 2023)
Standards around generative AI
Associated Press (August 2023)
The New York Times wants to go its own way on AI licensing
Nieman Lab (August 2023)
Automating democracy: Generative AI, journalism, and the future of democracy
Oxford Internet Institute (August 2023)
Outcry against AI companies grows over who controls internet’s content
Wall Street Journal (July 2023)
OpenAI will give local news millions to experiment with AI
Nieman Lab (July 2023)
Generative AI and journalism: A catalyst or a roadblock for African newsrooms?
Internews (May 2023)
Lost in translation: Large language models in non-English content analysis
Center for Democracy & Technology (May 2023)
AI will not revolutionise journalism, but it is far from a fad
Oxford Internet Institute (March 2023)
Section 230 won’t protect ChatGPT
Lawfare (February 2023)
Generative AI copyright concerns you must know in 2023
AI Multiple (January 2023)
ChatGPT can’t be credited as an author, says world’s largest academic publisher
The Verge (January 2023)
Guidelines for responsible content creation with generative AI
Contently (January 2023)
Governing artificial intelligence in the public interest
Stanford Cyber Policy Center (July 2022)
Initial white paper on the social, economic and political impact of media AI technologies
AI4Media (February 2021)
Toward an ethics of artificial intelligence
United Nations (2018)
Algorithmic bias detection and mitigation: Best practices and policies to reduce consumer harms
Brookings Institution (May 2019)
Key Institutions & Resources
AIAAIC: Independent, public interest initiative that examines AI, algorithmic and automation transparency and openness.
AI Now Institute: Policy research institute studying the social implications of artificial intelligence and policy research.
Digital Policy Alert Activity Tracker: Tracks developments in the legislatures, judiciaries and executive branches of G20 countries, EU member states and Switzerland.
Global Partnership on Artificial Intelligence (GPAI): International initiative aiming to advance the responsible development of AI.
Electronic Frontier Foundation (EFF): Non-profit organization aiming to protect digital privacy, free speech and innovation, including for AI.
Institute for the Future of Work (IFOW): Independent research institute tracking international legislation relevant to AI in the workplace.
Local News AI Initiative: Knight Foundation/Associated Press initiative advancing AI in local newsrooms.
MIT Media Lab: Interdisciplinary AI research lab.
Nesta AI Governance Database: Inventory of global governance activities related to AI (up to 2020).
OECD.AI Policy Observatory: Repository of over 800 AI policy initiatives from 69 countries, territories and the EU.
Organized Crime & Corruption Reporting Project: Investigative reporting platform for a worldwide network of independent media centers and journalists.
Partnership on Artificial Intelligence: Non-profit organization offering resources and convenings to address ethical AI issues.
Stanford University AI Index Report: Independent initiative tracking data related to artificial intelligence.
Term Tabs: A digital tool for searching and comparing definitions of (U.S./English language) technology-related terms in social media legislation.
Tortoise Global AI Index: Ranks countries based on capacity for artificial intelligence by measuring levels of investment, innovation and implementation.
Rachel Adams, Principal Researcher, Research ICT Africa
Pekka Ala-Pietilä, Chair, European Commission High-Level Expert Group on Artificial Intelligence
Norberto Andrade, Director of AI Policy and Governance, Meta
Chinmayi Arun, Executive Director, Information Society Project
Charlie Beckett, Director, JournalismAI Project
Meredith Broussard, Research Director, NYU Alliance for Public Interest Technology
Pedro Burgos, Knight Fellow, International Center for Journalists
Jack Clark, Policy Director, OpenAI
Kate Crawford, Research Professor, USC Annenberg
Renée Cummings, Assistant Professor, University of Virginia
Claes de Vreese, Research Leader, AI, Media, and Democracy Lab
Timnit Gebru, Founder and Executive Director, The Distributed AI Research Institute (DAIR)
Natali Helberger, Research Leader, AI, Media, and Democracy Lab
Aurelie Jean, Founder, In Silico Veritas
Francesco Marconi, Co-founder, AppliedXL
Surya Mattu, Lead, Digital Witness Lab
Madhumita Murgia, AI Editor, Financial Times
Felix Simon, Fellow, Tow Center for Digital Journalism
Edson Tandoc Jr., Associate Professor, Nanyang Technological University
Scott Timcke, Senior Research Associate, Research ICT Africa
Recent & Upcoming Events
Abraji International Congress of Investigative Journalism
June 29–July 2, 2023 – São Paulo, Brazil
Association for the Advancement of Artificial Intelligence 2023 Conference
February 7–14, 2023 – Washington, DC
ACM CHI Conference on Human Factors in Computing Systems
April 23–28, 2023 – Hamburg, Germany
International Conference on Learning Representations
May 1–5, 2023 – Kigali, Rwanda
2023 IPI World Congress: New Frontiers in the Age of AI
May 25–26, 2023 – Vienna, Austria
June 5–8, 2023 – San José, Costa Rica
RegHorizon AI Policy Summit 2023
November 3–4, 2023 – Zurich, Switzerland
Issue primers have been reviewed at multiple stages by more than 20 global research and industry expert partners, including CNTI advisory committee members, representing five regions. We invite you to send us research, legislation and other resources. Read more about CNTI’s issue primer and other research quality standards.