7. Exceptions
While scraping the web, we can run into different situations:
If everything went well, the Future object you got back resolves to a List of results.
When a search returns no results, one of two things can happen with pyGle:
- It throws an IndexError, which means there is no result.
- It returns only one element inside the list, which is the stats. No exception is thrown.
For the first case, we can wrap the code in a try/except block. We can also keep the Future object so the exception can be recovered later:
# pyGle has been initialized before, with the query and params as "pSearch"
try:
    ft = pSearch.doSearch()
    result = ft.result()
except IndexError:
    result = ft.exception()
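The second case raises no exception, so it has to be checked by hand. A minimal sketch, assuming that an empty search always yields a list whose only element is the stats, as described above. The has_results helper is hypothetical (not part of pyGle), and the sample lists are placeholders rather than real pyGle output:

```python
def has_results(results):
    """Return True when the list holds more than the lone stats element."""
    return len(results) > 1


# Placeholder data standing in for what ft.result() might return.
# The position of the stats element inside the list is not relied upon.
only_stats = ["stats"]
with_hits = ["first hit", "second hit", "stats"]

print(has_results(only_stats))  # False
print(has_results(with_hits))   # True
```

With the real library you would pass the value of ft.result() to the helper before reading any result from the list.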
When creating the pyGle object, you may declare that you are going to search Google Images but provide Google Patents params.
In this case, an exception is raised warning you that the combination is not valid. These exceptions derive from InvalidCombinationException, so you can catch both with a single one:
from pyGle.errors import InvalidCombinationException, TimeCombinationNonValid, MixedSearchException

try:
    # Create pSearch object with invalid combinations
    pSearch.doSearch()  # Once it is created
except InvalidCombinationException:
    # do stuff
    pass

## ANOTHER POSSIBILITY ##
try:
    # Create pSearch object with invalid combinations
    pSearch.doSearch()  # Once it is created
except (TimeCombinationNonValid, MixedSearchException):
    # do stuff
    pass
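Catching the parent class works because an except clause in Python also matches subclasses of the named exception. The sketch below illustrates this with stand-in classes defined locally so that it runs without pyGle. The hierarchy is an assumption based on the description above, and caught_by_parent is a hypothetical helper used only for the demonstration:

```python
# Stand-in classes mirroring the hierarchy described above (assumed;
# the real classes live in pyGle.errors).
class InvalidCombinationException(Exception):
    pass


class TimeCombinationNonValid(InvalidCombinationException):
    pass


class MixedSearchException(InvalidCombinationException):
    pass


def caught_by_parent(exc):
    """Raise exc and report whether the parent except clause catches it."""
    try:
        raise exc
    except InvalidCombinationException:
        return True
    except Exception:
        return False


print(caught_by_parent(TimeCombinationNonValid()))  # True
print(caught_by_parent(MixedSearchException()))     # True
print(caught_by_parent(ValueError()))               # False
```

Catching the tuple of specific classes instead is useful when only some invalid combinations should be handled and the rest should propagate.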
Also, when you declare no query, a NullQueryError is raised:
from pyGle.errors import NullQueryError

try:
    pSearch = PyGle()
    pSearch.doSearch()
except NullQueryError:
    print("No query declared")
After a large number of searches, Google may block pyGle because it detects that the requests could come from a bot. When this happens, pyGle raises an exception, so you must handle it:
from pyGle.errors import GoogleOverloadedException, GoogleBlockingConnectionsError

try:
    pSearch = PyGle(use_session_cookies=True)
    pSearch.withQuery("query")
    for i in range(10000):
        print(pSearch.doSearch().result())
except (GoogleOverloadedException, GoogleBlockingConnectionsError):
    pass
One possible solution is to torify your search, which makes it more difficult for Google to block your searches:
from pyGle.errors import GoogleOverloadedException, GoogleBlockingConnectionsError

try:
    pSearch = PyGle(use_session_cookies=True)
    pSearch.withQuery("query")
    for i in range(1000):
        print(pSearch.doSearch().result())
except (GoogleOverloadedException, GoogleBlockingConnectionsError):
    # Google blocked the direct connection: repeat the searches through Tor
    for i in range(1000):
        print(pSearch.doSearch(torify=True).result())
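In the example above, a block restarts the whole loop through Tor. A possible variation falls back to Tor only for the search that failed. This is a sketch under assumptions: search_with_fallback is a hypothetical helper, and FakeSearch and BlockedError are stand-ins so that the snippet runs without pyGle. It also assumes that doSearch returns an object behaving like concurrent.futures.Future, which the result() and exception() calls used earlier suggest but do not confirm:

```python
from concurrent.futures import Future


class BlockedError(Exception):
    """Stand-in for pyGle's blocking exceptions."""


class FakeSearch:
    """Stand-in search object: it is 'blocked' unless torify=True is passed."""

    def doSearch(self, torify=False):
        future = Future()
        if torify:
            future.set_result(["result via Tor", "stats"])
        else:
            future.set_exception(BlockedError("blocked"))
        return future


def search_with_fallback(search, blocked_errors):
    """Run one search and retry it once with torify=True if it is blocked."""
    try:
        return search.doSearch().result()
    except blocked_errors:
        return search.doSearch(torify=True).result()


print(search_with_fallback(FakeSearch(), (BlockedError,)))
```

With the real library you would pass your PyGle object and the tuple (GoogleOverloadedException, GoogleBlockingConnectionsError) instead of the stand-ins.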