5.1. Google News

Javinator9889 edited this page Aug 13, 2018 · 4 revisions
  • withSiteSearch(self, site: str): restricts the results to a single website. You must give the full domain name, for example: google.com. For Google News, it must be the domain name of a news website.
  • withRelatedPagesToADocument(self, document_url: str): searches for news related to the document at the given URL.
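
Since withSiteSearch expects a domain name rather than a full URL, it can help to normalize the value before passing it in. The following is a minimal sketch using only the Python standard library; the helper name domain_for_site_search is hypothetical and is not part of this library.

```python
from urllib.parse import urlparse


def domain_for_site_search(url_or_domain: str) -> str:
    """Return the bare host name (e.g. 'reuters.com') from a URL or a domain.

    Hypothetical helper for illustration; it is not part of the library.
    """
    text = url_or_domain.strip()
    # urlparse only fills in the host when a scheme or a leading '//' is present.
    if "://" not in text:
        text = "//" + text
    host = urlparse(text).hostname  # lowercased, without port or path
    if not host:
        raise ValueError(f"no domain name found in {url_or_domain!r}")
    return host


print(domain_for_site_search("https://www.bbc.com/news/world"))
print(domain_for_site_search("reuters.com"))
```

The returned string is what would then be given as the site argument.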

List output format

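When the job finishes, the results are returned as a list: one dictionary per result, in the shape of the first entry below, followed by a dictionary with the search statistics: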

[
    {
        'title': 'result title',
        'link': 'result link',
        'thumbnail': 'result thumbnail',
        'publisher': 'result publisher',
        'date': 'result date',
        'extra': 'result extra information',
        'description': 'result description',
        'related_articles': [
            {
                'title': 'related article title',
                'link': 'related article link',
                'publisher': 'related article publisher',
                'date': 'related article date',
                'extra': 'related article extra'  # may not be available in all cases
            },
            {... MORE RELATED ARTICLES ...}
        ],
    },
    {
        'how_many_results': 'number of results',
        'google_stats': 'google provided stats',
        'stats': {
            'overall_time': 'searching time',
            'google_search_time': 'google searching time',
            'parsing_page_time': 'reading and extracting info time'
        },
        'url': 'used URL'
    }
]
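
As an illustration of how a list in this format might be consumed, here is a minimal sketch that separates the news results from the statistics dictionary. It assumes the statistics entry is the one containing the 'how_many_results' key; the function name split_results and the sample data are hypothetical, written by hand to match the format above rather than taken from a real search.

```python
from typing import Any, Dict, List, Tuple


def split_results(
    results: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], Dict[str, Any]]:
    """Separate the news results from the statistics dictionary."""
    # Assumption: only the statistics entry has the 'how_many_results' key.
    articles = [item for item in results if "how_many_results" not in item]
    summary = next((item for item in results if "how_many_results" in item), {})
    return articles, summary


# Hand-written sample in the documented shape (not real search output).
sample = [
    {
        "title": "result title",
        "link": "https://example.com/story",
        "publisher": "Example News",
        "related_articles": [
            {"title": "related article title", "link": "https://example.com/related"},
        ],
    },
    {
        "how_many_results": "1",
        "stats": {"overall_time": "0.5"},
        "url": "https://example.com/search",
    },
]

articles, summary = split_results(sample)
for article in articles:
    # Read optional keys defensively in case a field is absent.
    related = article.get("related_articles", [])
    print(article["title"], "-", article.get("publisher", "unknown"),
          f"({len(related)} related)")
print("Results reported:", summary.get("how_many_results"))
```

Checking for the key instead of relying on the position of the last element keeps the sketch working even if the list is empty.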
