Recent revelations and documentation from the latest Big Tech whistleblower, Frances Haugen, have confirmed long-standing assumptions about how Facebook and other platforms with ad-based business models can contribute to the real-world harms associated with certain forms of user-created content. After years of debate about how to regulate the content itself, policymakers and critics of dominant internet platforms are now rightly focusing their attention on the ways that platforms sometimes amplify and spread some of the worst content on the internet. One of the ways this happens — though not the only way — is through “algorithms.”
Algorithms are at the core of every digital platform’s business model. They rank and organize content, and they allow advertisers to connect with their target audiences. They automate compliance and moderate user content at scale. In the context we use here, algorithms are the mechanisms that organize and recommend content to users of social media sites or user-generated content sites like YouTube and TikTok. For platforms built on an advertising business model, the goal of the algorithms is to maximize engagement: time spent using the service, and therefore seeing ads. They might do this by showing a user content that is similar to content the user has engaged with before, or that other, similar users have enjoyed, drawing on the personal data the platform has collected about each user. Or it might be content that users have not “enjoyed” at all, but content that makes them angry, inflamed, or provoked, which research shows still keeps them using the site (at least for a while). It is this dynamic, in which platforms distribute or amplify content based primarily on a profit motive and regardless of the risk of harm, that has put them in policymakers’ and tech critics’ sights.
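To make that dynamic concrete, here is a minimal, purely illustrative sketch (in Python) of what an engagement-maximizing ranking rule might look like. Every name in it is hypothetical; real recommendation systems use learned models over far richer behavioral data, but the basic incentive is the same: order the feed by whatever is predicted to keep the user on the site.

```python
# Hypothetical sketch of engagement-based ranking. All names (Post,
# predicted_engagement, rank_feed) are invented for illustration and are
# not any real platform's API.
from dataclasses import dataclass

@dataclass
class Post:
    post_id: str
    topic: str
    outrage_score: float  # how provocative the post tends to be (0 to 1)

def predicted_engagement(post: Post, user_topic_history: dict) -> float:
    """Estimate how much time-on-site a post will drive for this user.

    The score rewards (a) similarity to topics the user engaged with before
    and (b) provocative content, since both tend to keep users on the site,
    regardless of whether either is good for them.
    """
    similarity = user_topic_history.get(post.topic, 0.0)
    return 0.7 * similarity + 0.3 * post.outrage_score

def rank_feed(posts: list, user_topic_history: dict) -> list:
    """Order the feed purely by predicted engagement, highest first."""
    return sorted(
        posts,
        key=lambda p: predicted_engagement(p, user_topic_history),
        reverse=True,
    )

# A user who mostly engages with vaccine content sees the most provocative
# vaccine post first, whether or not it is accurate.
feed = rank_feed(
    [Post("a1", "vaccines", 0.9), Post("a2", "gardening", 0.1), Post("a3", "vaccines", 0.2)],
    user_topic_history={"vaccines": 0.8, "gardening": 0.3},
)
print([p.post_id for p in feed])  # ['a1', 'a3', 'a2']
```

Nothing in this rule asks whether the content is true, safe, or good for the user; it only asks what will hold attention. That gap is precisely what has drawn policymakers’ attention.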
Throughout 2022 Public Knowledge will explore the question of whether, or how, public policy can be used to create platform accountability for algorithms on behalf of the public interest. We will examine topics like what we know about the harms associated with algorithmic distribution of content, the most important considerations in creating policy in regard to speech, an inventory of existing policy proposals, and concepts like algorithmic transparency and justice.
Are Algorithms Evil?
You might think so, from the language in legislative proposals we have seen over the past year or so: bills have been designed to address malicious algorithms, dangerous algorithms, and harmful algorithms, to name a few.
But in fact, an algorithm is simply a series of steps. A recipe is an algorithm. A simple computer program that sorts items alphabetically or chronologically is also an algorithm. For the most part, though, policymakers aren’t concerned with simple, transparent algorithms like “most recent first.” Most of the criticism is leveled at complex algorithms derived from machine learning, which may be so intricate that no one really understands exactly how they work. For example, GPT-3, a deep-learning model for generating natural language, has 175 billion parameters. It would be virtually impossible for any user to know exactly why a specific word was chosen over another.
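The contrast is easy to see in code. Below is an illustrative “most recent first” feed, the kind of simple, transparent algorithm policymakers are generally not worried about; the post data and field names are invented for the example.

```python
# A reverse-chronological feed is an algorithm, but a transparent one:
# anyone can read these few lines and know exactly why a post appears
# where it does. (Illustrative only; the data is made up.)
from datetime import datetime

posts = [
    {"title": "Post A", "published": datetime(2021, 11, 1)},
    {"title": "Post B", "published": datetime(2021, 12, 3)},
    {"title": "Post C", "published": datetime(2021, 10, 15)},
]

# The entire "recommendation system" is one sort: newest post first.
chronological_feed = sorted(posts, key=lambda p: p["published"], reverse=True)
print([p["title"] for p in chronological_feed])  # ['Post B', 'Post A', 'Post C']

# By contrast, a model like GPT-3 has 175 billion learned parameters;
# there is no comparably short explanation of why it made any one choice.
```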
Even algorithms that evolve from complex machine learning processes are really just pieces of code, created by humans and designed to perform specific tasks. They can enhance the user experience by making it easier to find content, or by surfacing voices that would otherwise go unheard. In some contexts, algorithms have been proven to provide better outcomes than even the most highly trained humans. But the way algorithms are coded and the decisions they make, like what content to prioritize or remove, can have profound effects on users. They may be coded with the same biases as their human developers, using data sets that may reproduce past inequities, and they may manipulate, discriminate, radicalize, and misinform. The same algorithm might produce harmful results on one platform and not another, because it is working on a different set of content and a different set of personal data. And the best developers with the best data and the best intentions may still be overruled by company policies and leaders intent on shipping products optimized for the bottom line. Any policy solution that doesn’t address how complex algorithms are designed and implemented is unlikely to fully address the negative externalities of social media platforms’ business models.
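The point that the same algorithm can produce harmful results on one platform and not another is also easy to illustrate. In the hypothetical sketch below, a single ranking rule applied to two invented content pools surfaces authoritative guidance on one platform and a rumor on the other; the difference lies entirely in the data, not the code.

```python
# One ranking rule, two platforms, different outcomes.
# The data and scoring are invented for illustration only.
def rank(items, user_interest):
    # Same rule on both platforms: boost items matching the user's
    # interest, then prefer whatever drives the most engagement.
    return sorted(
        items,
        key=lambda i: (i["topic"] == user_interest, i["engagement"]),
        reverse=True,
    )

platform_a = [  # a content pool that happens to be mostly benign
    {"topic": "cooking", "engagement": 0.9, "label": "recipe video"},
    {"topic": "health", "engagement": 0.4, "label": "public health guidance"},
]
platform_b = [  # the same user interest, but a pool seeded with misinformation
    {"topic": "health", "engagement": 0.95, "label": "anti-vaccine rumor"},
    {"topic": "health", "engagement": 0.4, "label": "public health guidance"},
]

print(rank(platform_a, "health")[0]["label"])  # public health guidance
print(rank(platform_b, "health")[0]["label"])  # anti-vaccine rumor
```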
So, what are the appropriate policy solutions that would allow users to enjoy the benefits of algorithms while avoiding their harms?
There is a great deal of research into platforms, the information environment, and how business choices platforms make might contribute to vaccine hesitancy, political extremism, or just simple harassment. However, research into harms that may be caused or exacerbated by platforms does not always translate easily into legislation. Words and concepts that are easily discussed and understood in a more academic context may be difficult to translate to a legislative one, and to apply by lawyers, judges, and juries. This is not just because the people who actually implement a new law are not likely to know as much as the researchers whose work informed it, but because it is the job of legal advocates to stretch and pull at any statutory language, discovering ambiguities and loopholes on behalf of their client.
Principles for “Regulating the Algorithms”
As a result of that complexity, we approach the exploration of policy solutions for algorithms with some fundamental principles in mind. Policy solutions should 1) help create a more competitive digital marketplace; 2) make data privacy and protection the norm; 3) address harms directly; 4) ensure we are regulating business practices, not expressive content; and 5) ensure the provision of ongoing oversight.
First, the ideal digital and social media landscape that Public Knowledge is working for is more competitive. It would be better if the content moderation decisions of a few dominant social media platforms were less a matter of public concern, and if people unhappy with one platform could easily move to others. This would be enabled by interoperability requirements that ensure people can still communicate with each other across platforms (as they do between phone or email providers) and that prevent “network effects” from turning any one platform into the dominant one once more. In this world, there may still be problems to address, but any one platform making a mistake (or a deliberate decision motivated purely by profit, with no consideration of user safety) would be less consequential.
Second, we believe that companies must be required to protect user privacy. A strong privacy law would require companies to only collect data that is necessary, limit the uses of personal data, and delete personal data when it is no longer relevant. The creation of a Privacy Bureau at the Federal Trade Commission, as called for in the Build Back Better Act just approved by the House, would also be an important step towards protecting consumer data and preventing unfair and deceptive practices. Creating an internet environment where privacy, rather than rampant data collection, is the norm means that platforms will have fewer opportunities to target users. In fact, whether the primary concern is data collection and exploitation or the harms associated with algorithms, a law focused on the collection and use of data in general is likely to produce more benefit than a law focused on one specific use.
Third, we favor policy solutions that directly address the harms that digital platforms may create, rather than the means or methods by which those harms are created. Many of the proposals we have seen are intended to address certain harms, but only when the harms were created through use of some sort of algorithm. This seems arbitrarily narrow, compounds the challenges for any plaintiff, and creates problems that can be avoided. If a user, or group, or community is harmed through the platform’s choice to amplify disinformation, why should it matter whether this was done by an algorithm, some other category of software, or a person (for example, an executive who overrules the natural outcome of an algorithm)? Regulating the underlying harm, rather than a particular method by which it may have been created, would allow anyone who suffered that harm the opportunity to pursue justice.
Fourth, we shouldn’t pursue policy solutions for which the distinction between “content regulation” and “algorithm regulation” is illusory. That is, solutions that create platform liability for content that is algorithmically amplified (for example, by removing Section 230 protections) are practically equivalent to creating liability for the content itself, even when they are framed as liability for the amplification rather than the underlying content. This introduces constitutional challenges in regard to both the user’s and the platform’s speech. (It’s not our position that any attempt to regulate how content is presented to users necessarily poses First Amendment concerns. They are hard, but not impossible, to avoid. And many things that platforms do are driven by business interests and are not expressive. But it is very hard to draw the line.)
Lastly, as we’ve noted, any policy solution that doesn’t address the complexity of how algorithms are designed and implemented is unlikely to address their negative externalities — and all of these are moving targets, not conducive to one-time legislation. We believe that trying to legislate for the harm du jour making the news (or the extremely narrow range of harms on which Republicans and Democrats can agree) is not the proper way to fundamentally address the issue. Developing a broader framework for consideration of harms and empowering an agency to implement that framework through public input will likely produce much better results and be more capable of evolving with the market. It will take time, but that time will be well-spent if it benefits users in ways that are both more durable and more flexible. Public policy solutions need to be durable to effectively address the impact of algorithms on our well-being, our society, and our democracy, while being flexible enough to support the pace of innovation in technology.