https://xiaohua.zol.com.cn/baoxiaonannv/1.html
Code I ran:
    # Import modules
    import logging
    # Pattern matching
    import re
    # HTTP requests
    import requests
    # Suppress warnings
    logging.captureWarnings(True)
    # Timing control
    import time

    # Request URL and request headers
    url = "https://xiaohua.zol.com.cn/baoxiaonannv/%d.html"
    header = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36",
    }

    # Regular expression
    pattern = re.compile(r'<div class="summary-text">(.*?)</div>')
    duanzi = url % (1)
    print(duanzi)
    requests.packages.urllib3.disable_warnings()

    # Fetch the page source; verify=False skips certificate verification
    response = requests.get(url=duanzi, headers=header, verify=False, timeout=10).text

    # Regex matching
    item = pattern.findall(response, re.S)
    time.sleep(2)
    response
    # print(item)
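With the regular expression <div class="summary-text">(.*?)</div>, all 20 items on the page should in theory be matched, so why are 3 of them not matched? re.S seems to let the pattern cover \n but not the tab character \t. Is that the problem? If so, how should the regular expression be changed so that \t is matched as well?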
Have you checked which items failed to match? If so, compare them against the regular expression.
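One way to carry out that comparison is sketched below. It assumes the unmatched items are the ones whose text spans several lines; the `html` string is a made-up stand-in for the page source, not the real markup. The sketch relies on two documented behaviours of Python's `re` module: `.` already matches `\t` without any flag, and the second positional argument of a compiled pattern's `findall()` is `pos` (a start offset), not a flags argument. If that is what happens here, `re.S` (integer value 16) is being read as "start searching at index 16", and the flag never takes effect.

```python
import re

# Hypothetical sample standing in for the page source; the real markup may differ.
html = (
    '<!DOCTYPE html>\n<html><body>\n'
    '<div class="summary-text">one line</div>\n'
    '<div class="summary-text">\n\tfirst\n\tsecond\n</div>\n'
    '<div class="summary-text">tab\there</div>\n'
)

raw = r'<div class="summary-text">(.*?)</div>'

# As in the question: re.S handed to findall() of a compiled pattern.
# It lands in the `pos` parameter, so "." still stops at newlines.
without_flag = re.compile(raw).findall(html, re.S)

# Flag given at compile time: "." now matches newlines as well.
with_flag = re.compile(raw, re.S).findall(html)

# Items that only show up once the flag is actually applied.
missed = [m for m in with_flag if m not in without_flag]

print(len(without_flag), len(with_flag))
print([repr(m) for m in missed])
```

In this sample the item containing only a tab is found either way, while the multi-line item is found only when `re.S` is passed to `re.compile()` (or when the module-level `re.findall(raw, html, re.S)` is used, where the third argument really is `flags`). Whether the 3 missing items on the real page are multi-line ones would still need to be checked against the actual page source.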