[YouTube] New Channel renderer fails extracting various metadata like channel_follower_count #9893

Open
10 of 11 tasks
bbilly1 opened this issue May 9, 2024 · 4 comments
Labels
site-bug Issue with a specific website

Comments

@bbilly1
Contributor

bbilly1 commented May 9, 2024

Region

Europe

Provide a description that is worded well enough to be understood

For the past few days, I have not been able to reliably extract channel_follower_count when extracting info from the main channel page.

It fails roughly 50% of the time, but not consistently, which makes me assume there is some A/B testing going on.

The verbose output below doesn't show anything useful, but it was produced as follows from the latest commit on master:

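import yt_dlp

url = "https://www.youtube.com/@Computerphile"

# Extract metadata only: no download, no playlist entries, verbose logging
yt_obs = {
    "skip_download": True,
    "playlist_items": "0,0",
    "verbose": True,
}

response = yt_dlp.YoutubeDL(yt_obs).extract_info(url)
# Expected: an integer subscriber count; observed: None on roughly half of the runs
print(response.get("channel_follower_count"))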

Here, channel_follower_count in the response is None for me around 50% of the time.

I can only reproduce this on channel pages, not on video pages.
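
To put a number on the "around 50%" observation, the extraction could be repeated and the outcomes tallied. The following is a minimal sketch, not part of yt-dlp: `count_missing` and `make_extractor` are hypothetical helper names, and the options mirror the snippet above except that `quiet` replaces `verbose` to keep the output short.

```python
from collections import Counter


def count_missing(extract, url, attempts=10, field="channel_follower_count"):
    """Call extract(url) repeatedly and tally how often `field` is None or absent."""
    results = Counter()
    for _ in range(attempts):
        info = extract(url) or {}
        outcome = "missing" if info.get(field) is None else "present"
        results[outcome] += 1
    return results


def make_extractor():
    """Build an extract(url) callable around yt-dlp (requires yt-dlp to be installed)."""
    import yt_dlp  # imported lazily so count_missing can be exercised without yt-dlp

    # Same options as in the report, with quiet output instead of verbose
    opts = {"skip_download": True, "playlist_items": "0,0", "quiet": True}

    def extract(url):
        with yt_dlp.YoutubeDL(opts) as ydl:
            return ydl.extract_info(url)

    return extract
```

Calling `count_missing(make_extractor(), "https://www.youtube.com/@Computerphile")` would then return a Counter with "present" and "missing" totals. Since the behaviour looks like server-side A/B testing, the split may vary between sessions and regions.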

Provide verbose output that clearly demonstrates the problem


Complete Verbose Output

[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2024.04.09 from yt-dlp/yt-dlp [ff0779267] API
[debug] params: {'skip_download': True, 'playlist_items': '0,0', 'verbose': True, 'compat_opts': set(), 'http_headers': {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.18 Safari/537.36', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'Accept-Language': 'en-us,en;q=0.5', 'Sec-Fetch-Mode': 'navigate'}}
[debug] Lazy loading extractors is disabled
[debug] Python 3.12.3 (CPython x86_64 64bit) - Linux-6.8.9-arch1-2-x86_64-with-glibc2.39 (OpenSSL 3.3.0 9 Apr 2024, glibc 2.39)
[debug] exe versions: ffmpeg 6.1.1 (setts), ffprobe 6.1.1
[debug] Optional libraries: certifi-2024.02.02, requests-2.31.0, sqlite3-3.45.3, urllib3-1.26.18
[debug] Proxy map: {}
[debug] Request Handlers: urllib, requests
[debug] Loaded 1810 extractors
[youtube:tab] Extracting URL: https://www.youtube.com/@Computerphile
[youtube:tab] @Computerphile: Downloading webpage
[debug] [youtube:tab] Selected tab: 'videos' (videos), Requested tab: ''
[youtube:tab] Downloading all uploads of the channel. To download only the videos in a specific tab, pass the tab's URL
[download] Downloading playlist: Computerphile - Videos
[youtube:tab] Playlist Computerphile - Videos: Downloading 0 items
[debug] The information of all playlist entries will be held in memory
[download] Finished downloading playlist: Computerphile - Videos
bbilly1 added the site-bug and triage labels on May 9, 2024
@bbilly1
Contributor Author

bbilly1 commented May 9, 2024

Further investigation shows that I'm getting wildly different input between tries in _extract_metadata_from_tabs. I found this by inspecting the content of the data argument going into that function.

On successful requests, all the relevant content is in c4TabbedHeaderRenderer under header. On failing requests, that key is not present and I'm seeing a pageHeaderRenderer instead. It looks quite different, but the subscriber count is there, nested in contentMetadataViewModel.
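
To tell the two responses apart while debugging, something like the following could be run on the data argument. It is only a sketch under assumptions: the helper names are hypothetical, the sample dictionaries are made up for illustration (including the counts), and the exact nesting around contentMetadataViewModel is assumed rather than taken from a captured response. For that reason the code searches for the key recursively instead of hard-coding a path, and it is not how yt-dlp itself extracts the value.

```python
def find_key(node, key):
    """Yield every value stored under `key` anywhere in a nested dict/list."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def iter_strings(node):
    """Yield every string value found in a nested dict/list."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for v in node.values():
            yield from iter_strings(v)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def header_variant(data):
    """Return the name of the header renderer present in `data`, or None."""
    header = data.get("header") or {}
    for name in ("c4TabbedHeaderRenderer", "pageHeaderRenderer"):
        if name in header:
            return name
    return None


def subscriber_text(data):
    """Return the first header string mentioning subscribers, or None."""
    header = data.get("header") or {}
    # New layout: search only inside contentMetadataViewModel.
    # Old layout (key absent): fall back to scanning the whole header.
    scopes = list(find_key(header, "contentMetadataViewModel")) or [header]
    for scope in scopes:
        for text in iter_strings(scope):
            if "subscriber" in text.lower():
                return text
    return None


# Hypothetical sample payloads; structure and numbers are illustrative only.
old_layout = {"header": {"c4TabbedHeaderRenderer": {
    "subscriberCountText": {"simpleText": "2.4M subscribers"}}}}
new_layout = {"header": {"pageHeaderRenderer": {"content": {"pageHeaderViewModel": {
    "metadata": {"contentMetadataViewModel": {"metadataRows": [
        {"metadataParts": [{"text": {"content": "@Computerphile"}}]},
        {"metadataParts": [{"text": {"content": "2.4M subscribers"}},
                           {"text": {"content": "800 videos"}}]},
    ]}}}}}}}

for sample in (old_layout, new_layout):
    print(header_variant(sample), subscriber_text(sample))
```

The string match on "subscriber" only works for English responses, so it is useful for inspection but too fragile for a real fix.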

@bashonly
Member

bashonly commented May 9, 2024

Seems like A/B testing similar to what YT is currently doing with comments.

@oifj34f34f

Might be related to iv-org/invidious#4681

@bbilly1
Contributor Author

bbilly1 commented May 11, 2024

Looking further into this, it's not just the channel_follower_count extraction that fails; channel thumbnails such as the banner fail too, plus probably a few other things, which supports @oifj34f34f's suggestion that the linked issue is related.

bbilly1 changed the title from "[YouTube] Extracting channel_follower_count regularly failes on channel pages" to "[YouTube] New Channel renderer fails extracting various metadata like channel_follower_count" on May 11, 2024
pukkandan removed the triage label on May 12, 2024
Projects
Status: Youtube metadata
Status: Nice to have