gpf.common.textutils module

Module that contains helper functions to improve text handling and formatting.

gpf.common.textutils.get_alphachars(text)[source]

Returns all alphabetic characters [a-zA-Z] in string text in a new (concatenated) string.

Example:

>>> get_alphachars('Test123')
'Test'
Parameters: text (str, unicode) – The string to search.
Return type: str, unicode
gpf.common.textutils.get_digits(text)[source]

Returns all numeric characters (digits) in string text in a new (concatenated) string.

Example:

>>> get_digits('Test123')
'123'
>>> int(get_digits('The answer is 42'))
42
Parameters: text (str, unicode) – The string to search.
Return type: str, unicode
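
The two filters above can be approximated in a few lines. This is a minimal sketch, not the gpf implementation: the names alphachars_sketch and digits_sketch are hypothetical, and it assumes that only ASCII letters [a-zA-Z] and the digits 0-9 should be kept.

```python
import string


def alphachars_sketch(text):
    """Keep only ASCII letters [a-zA-Z] (hypothetical helper, not gpf's code)."""
    # str.isalpha() would also accept letters such as 'é', so test membership explicitly.
    return ''.join(ch for ch in text if ch in string.ascii_letters)


def digits_sketch(text):
    """Keep only the digits 0-9 (assumption: other numeric characters are dropped)."""
    return ''.join(ch for ch in text if ch in string.digits)
```

Testing membership in string.ascii_letters rather than calling isalpha() keeps the result restricted to the [a-zA-Z] range that the description names.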
gpf.common.textutils.to_str(value, encoding='UTF-8')[source]

This function behaves similarly to the built-in str() function: it converts any value into a string. However, if value is unicode, it is encoded using the specified encoding.

Parameters:
  • value – The value to convert to string.
  • encoding – The encoding to use when value is unicode.
Return type:

str

Note

By default, the encoding is UTF-8, unless the user specifies something else. If this function fails to encode the value into a str using the specified encoding, the default system encoding is used instead (which is often cp1252). For this fallback case, the ‘replace’ error handler is used, which means that no error is raised if encoding fails. Characters that fail to encode are replaced by a question mark.
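
The fallback described in this note can be sketched as follows. This is an illustration under stated assumptions, not the gpf code: encode_with_fallback and its fallback parameter are hypothetical, and the "default system encoding" is assumed to be the locale's preferred encoding.

```python
import locale


def encode_with_fallback(text, encoding='UTF-8', fallback=None):
    """Encode text strictly first; on failure, retry leniently (hypothetical helper)."""
    try:
        # Strict attempt: raises UnicodeEncodeError for unencodable characters.
        return text.encode(encoding)
    except UnicodeError:
        # Assumption: the system default is the locale's preferred encoding.
        fallback = fallback or locale.getpreferredencoding()
        # 'replace' substitutes '?' for each character that cannot be encoded.
        return text.encode(fallback, 'replace')
```

With the fallback pinned to ASCII for illustration, encode_with_fallback(u'caf\xe9', 'ascii', fallback='ascii') yields the bytes for 'caf?' instead of raising an error.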

gpf.common.textutils.to_unicode(value, encoding='UTF-8')[source]

This function behaves similarly to the built-in unicode() function: it converts any value into a unicode object. However, if value is a str, it is decoded using the specified encoding.

Parameters:
  • value – The value to convert to unicode.
  • encoding – The encoding to use when value is a str.
Return type:

unicode

Note

By default, the encoding is UTF-8, unless the user specifies something else. If this function fails to decode the value into unicode using the specified encoding, the default system encoding is used instead (which is often cp1252). For this fallback case, the ‘replace’ error handler is used, which means that no error is raised if decoding fails. Characters that fail to decode are replaced by a question mark.

Warning

Python 2 only!
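
The decoding fallback from the note above might look like the sketch below. It is not the gpf implementation: decode_with_fallback and its fallback parameter are hypothetical, and the system default is assumed to be the locale's preferred encoding. In standard Python, the ‘replace’ handler inserts U+FFFD when decoding, which many consoles display as a question mark.

```python
import locale


def decode_with_fallback(data, encoding='UTF-8', fallback=None):
    """Decode bytes strictly first; on failure, retry leniently (hypothetical helper)."""
    try:
        # Strict attempt: raises UnicodeDecodeError for invalid byte sequences.
        return data.decode(encoding)
    except UnicodeError:
        # Assumption: the system default is the locale's preferred encoding.
        fallback = fallback or locale.getpreferredencoding()
        # 'replace' substitutes U+FFFD for each byte that cannot be decoded.
        return data.decode(fallback, 'replace')
```

For example, the cp1252 bytes b'caf\xe9' are not valid UTF-8, so the strict attempt fails and the bytes are decoded with the fallback instead.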

gpf.common.textutils.to_repr(value, encoding='UTF-8')[source]

This function behaves similarly to the built-in repr() function: it converts any value into its representation. However, if value is unicode, it is encoded using the specified encoding (defaults to UTF-8). The encoding uses the ‘replace’ error handler, which means that no error is raised if it fails. Because the value is encoded to a str, the representation of a unicode object no longer has the ‘u’ prefix.

Parameters:
  • value – The value for which to get its representation.
  • encoding – The encoding to use when value is unicode.
Return type:

str

gpf.common.textutils.capitalize(text)[source]

Function that works similarly to the built-in string method str.capitalize(), except that it only makes the first character uppercase and leaves the other characters unchanged.

Parameters: text (str, unicode) – The string to capitalize.
Return type: str, unicode
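
The difference from str.capitalize() is easiest to see side by side. The sketch below uses a hypothetical helper, capitalize_sketch, that mirrors the described behavior; it is not taken from the gpf source.

```python
def capitalize_sketch(text):
    """Uppercase only the first character (hypothetical helper, not gpf's code)."""
    # Slicing instead of indexing keeps this safe for empty strings.
    return text[:1].upper() + text[1:]


mixed = 'hello World'
print(capitalize_sketch(mixed))  # first character uppercased, rest untouched
print(mixed.capitalize())        # built-in also lowercases the remaining characters
```

The built-in method lowercases everything after the first character, which is usually unwanted for text that contains names or acronyms.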
gpf.common.textutils.unquote(text)[source]

Strips trailing quotes from a text string and returns it.

Parameters: text (str, unicode) – The string to strip.
Return type: str, unicode
gpf.common.textutils.format_plural(word, number, plural_suffix='s')[source]

Function that prefixes word with number and appends plural_suffix to it if number != 1. Note that this only works for words with a regular plural form (where the base word does not change and a suffix is simply appended). For example, words like ‘sheep’ or ‘life’ will be pluralized incorrectly (‘sheeps’ and ‘lifes’ respectively).

Examples:

>>> format_plural('{} error', 42)
'42 errors'
>>> format_plural('{} bus', 99, 'es')
'99 buses'
>>> format_plural('{} goal', 1)
'1 goal'
>>> format_plural('{} regret', 0)
'0 regrets'
Parameters:
  • word (str, unicode) – The word that should be pluralized if number != 1.
  • number (int, float) – The numeric value with which word will be prefixed and for which it will be pluralized.
  • plural_suffix (str, unicode) – The suffix to append when the plural of word is not formed with ‘s’ (e.g. ‘es’). Defaults to ‘s’.
Return type:

str, unicode
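
The behavior shown in the examples can be reproduced with a short sketch. It assumes, as in those examples, that word carries a ‘{}’ placeholder for the number; format_plural_sketch is a hypothetical name and this is not the gpf implementation.

```python
def format_plural_sketch(word, number, plural_suffix='s'):
    """Insert number into word and pluralize it unless number equals 1."""
    # Only the value 1 is treated as singular; 0 and other values get the suffix.
    suffix = '' if number == 1 else plural_suffix
    # Assumption: word contains a '{}' placeholder, as in the examples above.
    return (word + suffix).format(number)
```

Because the suffix is simply appended, irregular plurals still have to be handled by the caller.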

gpf.common.textutils.format_iterable(iterable, conjunction='and')[source]

Function that returns a pretty-printed string of the items in an iterable, separating them with commas and adding a conjunction before the last item.

Example:

>>> iterable = [1, 2, 3, 4]
>>> format_iterable(iterable)
'1, 2, 3 and 4'
Parameters:
  • iterable (list, tuple) – The iterable (e.g. list or tuple) to format.
  • conjunction (str) – The conjunction to use before the last item. Defaults to “and”.
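
One way to produce this kind of output is sketched below. The name format_iterable_sketch is hypothetical, and the handling of empty and single-item iterables is an assumption, since the description above does not cover those cases.

```python
def format_iterable_sketch(iterable, conjunction='and'):
    """Join items with commas, placing a conjunction before the last item."""
    items = [str(item) for item in iterable]
    if len(items) < 2:
        # Assumption: zero or one item needs neither commas nor a conjunction.
        return ''.join(items)
    return '{} {} {}'.format(', '.join(items[:-1]), conjunction, items[-1])
```

Passing a different conjunction changes only the final separator, so format_iterable_sketch(['a', 'b'], 'or') gives 'a or b'.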
gpf.common.textutils.format_timedelta(start, stop=None)[source]

Calculates the time difference between the start and stop datetime objects and returns it as a pretty-printed string. If stop is omitted, the current time (now()) is used. The smallest unit expressed is seconds (as a floating-point value); the largest is days.

Example:

>>> t0 = _dt(2019, 1, 1, 1, 1, 1)  # where _dt = datetime
>>> format_timedelta(t0)
'1 day, 3 hours, 4 minutes and 5.2342 seconds'
Parameters:
  • start – The start time (t0) for the time delta calculation.
  • stop – The end time (t1) for the time delta calculation or now() when omitted.
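
The breakdown into days, hours, minutes and seconds can be sketched with divmod. This is not the gpf implementation: format_timedelta_sketch is a hypothetical name, and it assumes a non-negative delta, omits units that are zero, and rounds seconds to four decimals to match the example above.

```python
from datetime import datetime, timedelta


def format_timedelta_sketch(start, stop=None):
    """Return the time between start and stop as readable text (hypothetical helper)."""
    stop = stop or datetime.now()
    total = (stop - start).total_seconds()  # assumption: stop is not before start
    minutes, seconds = divmod(total, 60)
    hours, minutes = divmod(int(minutes), 60)
    days, hours = divmod(hours, 24)
    parts = []
    for value, unit in ((days, 'day'), (hours, 'hour'), (minutes, 'minute')):
        if value:  # assumption: zero-valued units are left out
            parts.append('{} {}{}'.format(value, unit, '' if value == 1 else 's'))
    seconds = round(seconds, 4)
    parts.append('{:g} second{}'.format(seconds, '' if seconds == 1 else 's'))
    if len(parts) == 1:
        return parts[0]
    return '{} and {}'.format(', '.join(parts[:-1]), parts[-1])


t0 = datetime(2019, 1, 1, 1, 1, 1)
t1 = t0 + timedelta(days=1, hours=3, minutes=4, seconds=5.25)
print(format_timedelta_sketch(t0, t1))
```

Passing an explicit stop makes the result reproducible; when stop is omitted, the output depends on the moment the function is called.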