We then showed them a common sample of videos so the recommendations could be compared like-for-like. History was disabled so that viewing the common sample would not affect personalization. Videos less than 7 days old were chosen at random (proportional to views) from our dataset of news, politics and culture-war channels. We also included an anonymous viewer as a baseline for measuring the influence of personalization.
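To make "chosen at random (proportional to views)" concrete, here is a minimal sketch of one way to draw such a sample. It is an illustration, not the code used in the study: the function name, the `views` and `uploaded` fields and the choice of weighted sampling without replacement (the Efraimidis-Spirakis key trick) are all assumptions.

```python
import random
from datetime import datetime, timedelta


def sample_recent_videos(videos, k, now, max_age_days=7, seed=None):
    """Pick up to k distinct recent videos, favouring those with more views.

    `videos` is a list of dicts with hypothetical keys "views" and "uploaded".
    """
    rng = random.Random(seed)
    cutoff = now - timedelta(days=max_age_days)
    # Keep only videos newer than the cutoff that have at least one view.
    recent = [v for v in videos if v["uploaded"] > cutoff and v["views"] > 0]
    # Efraimidis-Spirakis: give each item the key u ** (1 / weight) and keep
    # the k largest keys. This is a weighted sample without replacement.
    keyed = [(rng.random() ** (1.0 / v["views"]), v) for v in recent]
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [v for _, v in keyed[:k]]


if __name__ == "__main__":
    now = datetime(2020, 10, 7)
    demo = [
        {"id": "a", "views": 5000, "uploaded": now - timedelta(days=1)},
        {"id": "b", "views": 200, "uploaded": now - timedelta(days=3)},
        {"id": "c", "views": 90000, "uploaded": now - timedelta(days=20)},
    ]
    print([v["id"] for v in sample_recent_videos(demo, 2, now, seed=1)])
```

Sampling without replacement keeps every video in the common sample distinct. If repeats were acceptable, `random.choices` with `weights` would be a simpler option.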

Here is a sample of raw video recommendations, which is helpful for getting a sense of the recommendation system before analyzing the overall results.

These are the recommendations shown to the Partisan Left, Partisan Right and Anonymous personas when watching the 1st Presidential Debate 2020. In this Venn diagram, you can see how much the recommendations overlap between the personas. Both their history and the video being watched influence the recommendations.
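For readers who want to reproduce this kind of overlap from raw data, the regions of a Venn diagram can be computed with plain set operations. This is a small sketch under assumed inputs: the function name and the video ids are made up for illustration.

```python
def venn_regions(recs_by_persona):
    """Map each combination of personas to the videos recommended to exactly
    that combination (one entry per non-empty Venn region)."""
    regions = {}
    all_videos = set().union(*recs_by_persona.values())
    for video in all_videos:
        # The set of personas that were shown this video identifies its region.
        viewers = frozenset(
            persona for persona, recs in recs_by_persona.items() if video in recs
        )
        regions.setdefault(viewers, set()).add(video)
    return regions


if __name__ == "__main__":
    # Hypothetical video ids, not real study data.
    recs = {
        "Partisan Left": {"v1", "v2", "v3"},
        "Partisan Right": {"v3", "v4"},
        "Anonymous": {"v2", "v3", "v5"},
    }
    for personas, videos in venn_regions(recs).items():
        print(sorted(personas), sorted(videos))
```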

If you are interested in exploring, use the filter controls below to filter recommendations shown to any combination of channels and personas. Click RANDOM to see a random channel's recommendations. Otherwise, just keep scrolling down.

Recommendations seen by personas

Overall Influence of Personalization

Home page

On the home page, 34% of the videos presented to the personas were from channels they had watched within their bubble - much more tailored than the up-next recommendations. This might be less than you see in your own experience because our personas didn't subscribe to any channels, and subscribed channels would be featured more frequently.

19% of videos shown to the anonymous user were from one of the channels in our dataset (i.e. channels mainly focused on news, politics or the culture war), and 11% of those were mainstream news. Our personas were shown much more political content than the anonymous user (47%), mostly from channels they had previously watched.

Up-next Videos

When our personas were presented with up-next video recommendations, on average:

  • 16% were for channels within their bubble, 11% of those for channels they had already watched

  • 16% were back to the same channel they were watching

  • 73% were to channels they hadn't seen before

There was plenty of variety in up-next recommendations, even for the same video. When our anonymous user watched the same video in the same week, only 16% of recommendations were repeated. When our personas re-watched a video, there were 10 percentage points more repeated recommendations - a mildly more consistent influence. When they watched different videos on the same day, there were 20 points fewer repeated recommendations than when watching the same video - showing that the video has a larger impact on recommendations than who is watching it.
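One simple way to put a number on "repeated recommendations" is the share of one viewing's recommendations that also appeared in an earlier viewing. The sketch below assumes that definition, which may differ from the exact calculation behind the figures above; the function name and video ids are hypothetical.

```python
def repeated_share(first, second):
    """Fraction of the second viewing's recommendations that also appeared
    in the first viewing. Returns 0.0 when the second viewing is empty."""
    if not second:
        return 0.0
    seen = set(first)
    return sum(1 for video in second if video in seen) / len(second)


if __name__ == "__main__":
    # Hypothetical recommendation lists for two viewings of the same video.
    monday = ["a", "b", "c", "d"]
    friday = ["a", "x", "y", "z"]
    print(f"{repeated_share(monday, friday):.0%} repeated")
```

Comparing this share between two scenarios (same video vs different videos, persona vs anonymous) gives a difference in percentage points like those quoted above.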

Ideological Influence of Personalization

Now we'll look at the personalization and influence of recommendations by breaking them down by the persona watching and the category of video they lead to.

  • % of persona recommendations: the percent of all recommendations shown to this persona. Note that these add up to more than 100% because recommended video categories overlap with each other (e.g. a video's channel can be both MSM and Partisan Left).

  • vs video views: [% of recommendations] - [% of total video views]. This is for comparison vs a simple/neutral algorithm which would recommend proportional to views.

  • vs anonymous: [% of recommendations] - [% equivalent anonymous recommendations]. A comparison to a user who is not logged in.
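The three measures above can be sketched in a few lines. This is an illustrative calculation only: the function names, the input shapes (each recommendation as a set of category tags) and the sample numbers are assumptions, not the study's data or code.

```python
from collections import Counter


def category_percent(recs):
    """Percent of recommendations tagged with each category. A recommendation
    can carry several tags, so the percentages may sum to more than 100."""
    counts = Counter(tag for tags in recs for tag in set(tags))
    return {tag: 100.0 * n / len(recs) for tag, n in counts.items()}


def compare(persona_recs, anon_recs, views_percent):
    """Per category: % of persona recommendations, plus the percentage point
    differences vs share of video views and vs the anonymous viewer."""
    persona = category_percent(persona_recs)
    anon = category_percent(anon_recs)
    rows = {}
    for tag in sorted(set(persona) | set(anon) | set(views_percent)):
        pct = persona.get(tag, 0.0)
        rows[tag] = {
            "pct_of_recs": pct,
            "vs_views": pct - views_percent.get(tag, 0.0),
            "vs_anonymous": pct - anon.get(tag, 0.0),
        }
    return rows


if __name__ == "__main__":
    # Made-up inputs: each recommendation is the set of its channel's tags.
    persona_recs = [{"MSM", "Partisan Left"}, {"Partisan Left"}, {"Partisan Right"}, {"MSM"}]
    anon_recs = [{"MSM"}, {"MSM"}, {"Partisan Right"}, {"Partisan Left"}]
    views_percent = {"MSM": 40.0, "Partisan Left": 20.0, "Partisan Right": 30.0}
    for tag, row in compare(persona_recs, anon_recs, views_percent).items():
        print(tag, row)
```

A positive "vs video views" value means the category is recommended more than a views-proportional algorithm would, and a positive "vs anonymous" value means personalization pushed the category up for that persona.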

Here is a comparison of the political categories of the recommendations, starting with what was shown to an Anonymous user.

Here is the same data for all personas (left) towards videos (top). The table takes a little effort to understand, but makes it easier to spot patterns.

Percentage point difference between each persona's recommendations and an anonymous viewer's

Is YouTube a recommendation bubble?

Because our study created fake personas that aren't representative of real users, we can't draw any strong conclusions about the overall influence of YouTube. But we can talk about the mechanics.

It's safe to say the home page is "bubbly" for users who are already watching within a content niche, but it's not a radicalization pipeline. Our personas saw on average 34% of recommendations toward their own category, but were rarely introduced to new channels within their bubble.

The video recommendations are a mixed bag. To find a meaningful way to think about it, I have taken a graphic from the NYT piece Do You Live in a Political Bubble? and overlaid it with the equivalent data from our video recommendations.

The chart shows a representative 100 Democrats and 100 Republicans, placed by the percent of their neighbors (within 5 miles) who vote for the opposite party. For each persona, I have overlaid the percentage of recommendations towards channels on the opposite side of a Left/Right dichotomy. Classified like this, the left-leaning personas are in much tighter content bubbles than the right. It's a pretty loose comparison, but I would say the video recommendations for our personas are similar to living in America - an increasingly party-segregated place.

Although I have focused on the bubbles here, I believe their importance has been overstated. I think there are more powerful cultural and psychological forces that are the biggest factors in determining what people choose to watch. The recommendation algorithm deserves scrutiny, and needs more transparency, but it is a gentle breeze in a storm.

channel bubble visualization

How does this compare to other studies?

A recent NYU study on YouTube recommendations had real Americans open a variety of videos and follow recommendations according to a script. They found only minor differences in the ideological bias of recommendations. Their chart shows the self-identification of users (left axis) against the ideology of video recommendations (bottom axis).

NYU study chart

We found much stronger differences, and this is likely due to our personas having the strongest signal possible for their ideology. It's important to keep this in mind with our results - this represents an upper bound to ideological personalization and isn't representative of normal use.

A 2021 study by Hosseinmardi and others used data on real-world watching patterns from a representative sample of Americans. They found that there were in-practice ideological bubbles of news consumption on YouTube. The chart below shows the risk ratio (higher is more likely) that users (left) will watch a category of news (bottom). The labels are fL = far left, L = left, C = center, AW = anti-woke.
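For readers unfamiliar with the term, a risk ratio in its textbook form divides the rate of an event in one group by the rate in a comparison group. The sketch below shows that generic calculation only. How the study chose its groups and baseline is not described here, so the function name, arguments and numbers are illustrative assumptions.

```python
def risk_ratio(group_events, group_total, baseline_events, baseline_total):
    """Textbook risk ratio: the event rate in one group divided by the event
    rate in a comparison group. Values above 1 mean the event is more likely
    in the first group."""
    if group_total <= 0 or baseline_total <= 0:
        raise ValueError("group totals must be positive")
    baseline_rate = baseline_events / baseline_total
    if baseline_rate == 0:
        raise ValueError("risk ratio is undefined when the comparison rate is zero")
    return (group_events / group_total) / baseline_rate


if __name__ == "__main__":
    # Made-up counts: 50 of 100 users in one group watched a category,
    # compared with 25 of 100 users in the comparison group.
    print(risk_ratio(50, 100, 25, 100))
```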

channel bubble visualization

This contrasts with our finding that YouTube's recommendations are more ideologically bubbled for left users: in practice, the right are much more bubbled in their news watching habits. This isn't a contradiction; the recommendation system is only one of many factors that influence what people watch.

Another really great part of this study was the data on how users arrive at videos.

channel bubble visualization

They show that as much video watching comes from external links (e.g. a link from a Facebook group, Twitter or a blog) as from video recommendations. The extremes also get more of their viewing from external links, maybe because YouTube is less likely to recommend what they want to watch.

I hope this has helped people understand YouTube's recommendation system. I am moving on from YouTube research. If you would like to see the ongoing monitoring of YouTube recommendations on transparency.tube and recfluence.net continue, and can spare $300USD/month, then please contact me at mark@ledwich.com.au.

For details about the process, and for the data and code, see our GitHub page.

Brought to you by:

  • Anna Zaitsev, University of California, Berkeley - Advisor and author of study

  • Anton Laukemper, Rijksuniversiteit Groningen - Original idea and code for personalized data collection

  • Mark Ledwich, Unaffiliated - Code and analysis, data viz, author of this article

A paper will be available soon and this will be updated to link to that when it is published.