Auditing Critical Dependencies Between Online Media Platforms
Abstract
This research will audit the dependencies between major online media platforms, with a focus on major search engines, and investigate the extent to which they rely on content from social media platforms. In recent years, the web has centralized around a small number of mega-platforms that attract the bulk of people’s attention and the majority of content. Social media and search engines in particular dominate people’s online time and serve as de facto “homepages” for many users. As these media platforms have grown to prominence, so too have concerns about their power to create and shape online spaces. All of the large online platforms use socio-technical algorithms to rank, filter, recommend, and moderate content, thus privileging some information at the expense of other information. Although mass media has always functioned this way, the novel concern is that the algorithms that implement these processes are opaque, making them difficult to understand, disintermediate, or contest. Algorithm auditing has emerged as a powerful approach to increase transparency and accountability around “black-box” systems, but the vast majority of existing audits fail to grapple with the dependencies between major online media platforms.
This project aims to answer several high-level questions, including: What fraction of search results link to social media? Which social media platforms and authors appear in search results? How do links to social media vary by query? Are results that link to social media personalized? Do links to social media increase content diversity, or are they a vehicle for misinformation? Simultaneous audits of search engines alongside YouTube and Twitter will investigate more specific questions about how the algorithms on the social media sites (such as “like” recommendations and trending topics or videos) influence search results. Further, because Google Search and Bing both integrate specialized search components from Twitter and YouTube, these simultaneous audits will jointly investigate how the algorithms on these pairs of platforms interact.
To conduct these algorithm audits, the project will use hybrid techniques that combine carefully controlled experiments from the vantage point of real users with simulated online identities created by the auditor. This combination makes it possible to answer questions about the impact of algorithmic curation and to reveal some of its underlying causes. One major challenge when auditing online platforms is achieving ecological validity, which here means selecting queries that are representative of those executed by real people. To address this challenge, the research will leverage a unique dataset of Google Search queries from a panel of over 350 participants. This set of queries will be expanded with additional terms curated from other sources, and queries will be diversified using autocomplete suggestions.
The results of this new algorithm audit methodology will be useful to the general public, by strengthening their media literacy, as well as to the designers of the platforms themselves.
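As a rough illustration of the first question above (what fraction of search results link to social media), the measurement could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's method: the domain list, the function names, and the input format (a mapping from each query to the list of result URLs collected for it) are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical, illustrative domain list (an assumption for this sketch);
# a real audit would need a carefully curated and validated taxonomy.
SOCIAL_DOMAINS = {
    "twitter.com": "Twitter",
    "youtube.com": "YouTube",
    "youtu.be": "YouTube",
    "facebook.com": "Facebook",
    "reddit.com": "Reddit",
}


def social_platform(url):
    """Return the platform name if the URL's host is a listed domain
    or one of its subdomains; otherwise return None."""
    host = (urlsplit(url).hostname or "").lower()
    for domain, platform in SOCIAL_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return platform
    return None


def social_fraction(urls):
    """Fraction of result URLs that point to a listed social media domain."""
    if not urls:
        return 0.0
    hits = sum(1 for u in urls if social_platform(u) is not None)
    return hits / len(urls)


def fraction_by_query(results):
    """Map each query to the social media fraction of its result URLs.

    `results` is assumed to be a dict of query string -> list of URLs.
    """
    return {query: social_fraction(urls) for query, urls in results.items()}
```

Comparing the per-query values returned by `fraction_by_query` would give a first look at how links to social media vary by query. Matching on the host name rather than on a substring of the URL avoids counting unrelated sites whose addresses merely contain a platform's name.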