[Pornhub] Give better error for geo-restriction than Unable to extract title
#9889
Comments
I've been doing further debugging. Aylo (formerly MindGeek) is blocking access to its sites, including PornHub, in the US states of Utah, Virginia, Texas, Montana, Mississippi, Arkansas and North Carolina in response to recently enacted age-verification laws in those states. In my case, my desktop and my server are in different locations, so the site was only blocked on my server. yt-dlp supported sites affected:
Unable to extract title
If we intend to fix this, someone with an IP address in one of the impacted US states will need to share a page dump of the blocked page.
Sure. I've attached a basic "Save Page as..." HTML file for the geo-blocked Pornhub page to my initial issue here: Pornhub.zip (accessed from Texas). I can provide more info if needed (full curl trace, debug headers, HTML from each of the 7 geo-blocked states, and/or from all 5 of the supported Aylo websites that have geo-block restrictions, etc.). I'm happy to help debug or provide whatever info is helpful. The page contents are quite clear regarding geo-blocking.
@ClearBlueOcean could you add
Or if easier: my repository
diff --git a/yt_dlp/extractor/pornhub.py b/yt_dlp/extractor/pornhub.py
index d94f28ceb..97e2260d9 100644
--- a/yt_dlp/extractor/pornhub.py
+++ b/yt_dlp/extractor/pornhub.py
@@ -294,6 +294,11 @@ def dl_webpage(platform):
                 'PornHub said: %s' % error_msg,
                 expected=True, video_id=video_id)

+        if age_verify_msg := self._search_regex(
+                r'(your elected officials in \w+ are requiring us to verify your age before allowing you access to our website)',
+                webpage, 'age verification message', default=None):
+            self.raise_geo_restricted(f'PornHub said: {age_verify_msg}')
+
         if any(re.search(p, webpage) for p in (
                 r'class=["\']geoBlocked["\']',
                 r'>\s*This content is unavailable in your country'))
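The check added by the patch above can be tried in isolation, outside yt-dlp. The following is a minimal sketch, not the extractor itself: the regex is copied from the patch, while the helper name `find_age_verify_msg` and both sample pages are made up for illustration.

```python
import re

# Pattern copied from the patch; everything else here is hypothetical.
AGE_VERIFY_RE = re.compile(
    r'(your elected officials in \w+ are requiring us to verify your age '
    r'before allowing you access to our website)')


def find_age_verify_msg(webpage):
    """Return the age-verification sentence if the page contains it, else None."""
    match = AGE_VERIFY_RE.search(webpage)
    return match.group(1) if match else None


# Made-up sample pages, only to exercise the pattern.
blocked = ('<p>As you may know, your elected officials in Texas are requiring us '
           'to verify your age before allowing you access to our website.</p>')
normal = '<title>Example Video Title - Pornhub.com</title>'

print(find_age_verify_msg(blocked))
print(find_age_verify_msg(normal))
```

One thing worth checking against real page dumps: `\w+` matches a single word, so if the North Carolina page uses the same sentence with the two-word state name, this pattern would not match it as written.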
DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
Checklist
Region
Canada
Provide a description that is worded well enough to be understood
Parsing the title fails on each run. The issue occurs for all videos on Pornhub, using both stable and nightly builds. Same result with or without quotes around the URL. The issue has been occurring for several days, and I've had no problems downloading from other sites.
The common locations for the title look fine in the DOM:
<title>Wild College Orgy: three Hot Babes get Naughty with Students at Dorm Party - Pornhub.com</title>
and
<meta name="twitter:title" content="Wild College Orgy: Three Hot Babes Get Naughty with Students at Dorm Party">
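To confirm that a saved copy of the page really exposes the title in those two places, a quick standalone check can help. This is a rough sketch and not yt-dlp's actual extraction logic; `extract_title` and the sample HTML are hypothetical. If it returns `None` for the page fetched from the failing machine, that machine is probably being served a different page than the one seen in the browser.

```python
import re


def extract_title(webpage):
    """Try the twitter:title meta tag first, then fall back to <title>."""
    meta = re.search(
        r'<meta\s+name="twitter:title"\s+content="([^"]*)"', webpage)
    if meta:
        return meta.group(1)
    tag = re.search(r'<title>(.*?)</title>', webpage, re.DOTALL)
    if tag:
        # Drop the site suffix that the <title> tag carries.
        return re.sub(r'\s+-\s+Pornhub\.com$', '', tag.group(1).strip())
    return None


# Made-up sample pages for illustration.
video_page = (
    '<title>Example Video Title - Pornhub.com</title>\n'
    '<meta name="twitter:title" content="Example Video Title">'
)
title_only_page = '<title>Example Video Title - Pornhub.com</title>'
other_page = '<p>No title markup on this page.</p>'

print(extract_title(video_page))
print(extract_title(other_page))
```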
This issue seems similar to #7527 (Closed in Jul 2023)
Provide verbose output that clearly demonstrates the problem
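Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
If using API, add 'verbose': True to YoutubeDL params instead
Copy the WHOLE output (starting with [debug] Command-line config) and insert it below
Complete Verbose Output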