Skip to content

A Python package to check input for validity and / or plausibility. Also contained: some methods to convert input into standardized formats.

License

Notifications You must be signed in to change notification settings

RuedigerVoigt/userprovided

Repository files navigation

userprovided

Supported Python Versions pypi version Last commit Downloads Coverage]

"Never trust user input!" is also true outside the security context: You cannot be sure users always provide you with valid and well formatted data.

The Python package userprovided checks input for validity and / or plausibility. Besides that it contains some methods to convert input into standardized formats.

The code has type hints (PEP 484) and aims to provide useful log and error messages.

Userprovided has functionality for the following inputs:

  • parameters:
    • Check a dictionary for valid, needed, and unknown keys.
    • Convert lists, strings and tuples into a set
    • Check if an integer or string is in a specific range.
    • ...
  • url:
  • hash:
    • Is the hash method available?
    • Calculate a file hash and (optionally) compare it to an expected value.
  • date:
    • Does a given date exist?
    • Convert English and German long format dates to ISO strings.
  • mail:
    • Check if a string is a valid email address.

Installation

Install exoskeleton using pip or pip3. For example:

sudo pip3 install userprovided

You may consider using a virtualenv.

To upgrade to the latest version accordingly:

sudo pip install userprovided --upgrade

Handle Parameters

Check a Parameter Dictionary

If your application accepts parameters in the form of a dictionary, you have to test if all needed parameters are provided and if there are any unknown keys (maybe due to typos). There is a method for that:

userprovided.parameters.validate_dict_keys(
    dict_to_check = {'a': 1, 'b': 2, 'c': 3},
    allowed_keys = {'a', 'b', 'c', 'd'},
    necessary_keys = {'b', 'c'})

Returns True if the dictionary dict_to_check contains only allowed keys and all necessary keys are present.

Avoid Keys without Value in a Dictionary

Check if all keys in a dictionary have a value. Return False if the value for any key is empty. Works for strings (including whitespace only), dictionary, list, tuple, and set.

# returns True:
parameters.keys_neither_none_nor_empty({'a': 123, 'b': 'example'})

# returns False:
parameters.keys_neither_none_nor_empty({'a': '   ', 'b': 'example'})
parameters.keys_neither_none_nor_empty({'a': None, 'b': 'example'})
parameters.keys_neither_none_nor_empty({'a': list(), 'b': 'example'})

Convert into a set

Convert a string, a tuple, or a list into a set (i.e. no duplicates, unordered):

userprovided.parameters.convert_to_set(list)

Check Range of Numbers and Strings

def numeric_in_range(parameter_name,
                     given_value,
                     minimum_value,
                     maximum_value,
                     fallback_value) -> Union[int, float]



def string_in_range(string_to_check,
                    minimum_length,
                    maximum_lenght,
                    strip_string: bool = True) -> bool

userprovided.parameters.is_port(int)
# Checks if the port is integer and within the
# valid range from 0 to 65535.

Handle URLs

Normalize URLs

Normalize an URL means:

  • remove whitespace around it,
  • convert scheme and hostname to lowercase,
  • remove ports if they are the standard port for the scheme,
  • remove duplicate slashes from the path,
  • remove fragments (like #foo),
  • remove empty elements of the query part,
  • order the elements in the query part by alphabet

The optional parameter drop_keys allows you to remove specific keys, like session ids or trackers, from the query part of the URL.

url = ' https://www.Example.com:443//index.py?c=3&a=1&b=2&d='
userprovided.url.normalize_url(url)
# returns: https://www.example.com/index.py?a=1&b=2&c=3
userprovided.url.normalize_url(url, drop_keys=['c'])
# returns: https://www.example.com/index.py?a=1&b=2

Check URLs

To check whether a string is a valid URL - including a scheme (like https) - use userprovided.url.is_url.

userprovided.url.is_url('https://www.example.com')
# => True
userprovided.url.is_url('www.example.com')
# => False

You can insist on a specific scheme:

userprovided.url.is_url('https://www.example.com', ('ftp'))
# => False (Schema does not match permitted)

userprovided.url.is_url('ftp://www.example.com', ('ftp'))
# => True

To check the URL with an actual connection attempt, you could use the salted library.

Determine a File Extension

Guess the correct filename extension from an URL and / or the mime-type returned by the server. Sometimes a valid URL does not contain a file extension (like https://www.example.com/), or it is ambiguous. So the mime type acts as a fallback. In case the correct extension cannot be determined at all, it is set to 'unknown'.

# retuns '.html'
userprovided.url.determine_file_extension(
    url='https://www.example.com',
    provided_mime_type='text/html'
)

# retuns '.pdf'
userprovided.url.determine_file_extension(
    'https://www.example.com/example.pdf',
    None
)

Check Email Addresses

userprovided.mail.is_email(None)
# => False

userprovided.mail.is_email('example@example.com')
# => True

File Hashes

You can check whether a specific hash method is available. This will raise a ValueError for MD5 and SHA1 even if they are available, because they are deprecated.

print(userprovided.hash.hash_available('md5'))
# => ValueError because md5 is deprecated

print(userprovided.hash.hash_available('sha256'))
# => True on almost any system

You can calculate hash sums for files. If you do not provide the method, this defaults to SHA256. Other supported methods are SHA224 and SHA512.

# returns the hash of the file as a string:
userprovided.hash.calculate_file_hash(pathlib.Path('./foo.txt'))

If you provide an expected value for the hash you can check for file changes or tampering. In the case the provided value and the calculated hash do not match, a ValueError exception is raised.

userprovided.hash.calculate_file_hash(
    file_path = pathlib.Path('./foo.txt'),
    hash_method = 'sha512',
    expected_hash = 'not_the_right_value')
# => raises an exception

Handle Calendar Dates

Does a specific date exist?

userprovided.date.date_exists(2020, 2, 31)
# => False

Normalize German or English long form dates:

userprovided.date.date_en_long_to_iso('October 3, 1990')
# => '1990-10-03'

userprovided.date.date_de_long_to_iso('3. Oktober 1990')
# => '1990-10-03'

Update and Deprecation Policy

  • No breaking changes in micro-versions.
  • It makes no sense to duplicate functionality already available in the Python Standard Library. Therefore, if this package contains functionality that becomes superseded by the Standard Library, it will start to log a depreciation warning. The functionality itself is planned to stay available for at least a major version of userprovided and as long as Python versions not containing this functionality are supported.

About

A Python package to check input for validity and / or plausibility. Also contained: some methods to convert input into standardized formats.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages