Oxylabs’ News Scraper is a data gathering solution that lets you extract real-time information from any news website. This brief guide explains how News Scraper works and provides a code example that shows how to use it.
You can get news results by providing your own URLs to our service. We can return the HTML for any news page you like.
The example below illustrates how you can get the HTML of an nbcnews.com page.
import requests
from pprint import pprint

# Structure payload.
payload = {
    'source': 'universal',
    'url': 'https://www.nbcnews.com/world'
}

# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
)

# Instead of a response with the job status and a results URL, this returns
# the JSON response with the result itself.
pprint(response.json())
Example output:

{
    "results": [
        {
            "content": "<!DOCTYPE html><html lang=\"en\"><head><link href=\"https://nodeassets.nbcnews.com/_next/static/css/525 ... </html>",
            "created_at": "2023-12-18 11:37:17",
            "updated_at": "2023-12-18 11:37:24",
            "page": 1,
            "url": "https://www.nbcnews.com/world",
            "job_id": "7142477922073389057",
            "status_code": 200
        }
    ]
}
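Once you have the JSON response, you will usually want just the HTML. The snippet below is a minimal sketch of one way to pull it out, assuming the response always has the shape shown above (a "results" list whose items carry "url", "content", and "status_code"). The helper name extract_html and the trimmed sample dictionary are illustrative only and are not part of the Oxylabs API.

```python
from typing import Any


def extract_html(response_json: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (url, html) pairs for every result with status code 200.

    Hypothetical helper: assumes the response layout shown in the example
    output above.
    """
    pages = []
    for result in response_json.get("results", []):
        # Skip results whose status code is anything other than 200.
        if result.get("status_code") == 200:
            pages.append((result["url"], result["content"]))
    return pages


# Trimmed-down stand-in for a real response; in practice you would pass
# response.json() from the request instead.
sample = {
    "results": [
        {
            "content": "<!DOCTYPE html><html lang=\"en\"><head></head></html>",
            "url": "https://www.nbcnews.com/world",
            "status_code": 200,
        }
    ]
}

for url, html in extract_html(sample):
    print(url, len(html))
```

Filtering on "status_code" keeps error pages out of whatever parsing step comes next. If you would rather be notified about failed pages, you could raise an exception instead of skipping them.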
With our News Scraper, you can seamlessly extract public data from any news web page. Gather crucial information such as breaking news, opinion articles, or editorial pieces to understand current trends and stay informed. If you have any questions, reach out to our support team via live chat or email us at hello@oxylabs.io.