NumPy char.decode() Function



The NumPy char.decode() function decodes each element of an array of byte strings (typically of type bytes) into a regular string (of type str) using a specified encoding.

This function is useful when we have an array of encoded data, such as UTF-8 encoded bytes, and need to convert it into readable strings.

Syntax

Following is the syntax of the NumPy char.decode() function −

numpy.char.decode(a, encoding=None, errors=None)

Parameters

Below are the parameters of the NumPy char.decode() function −

  • a (array_like): The input array of byte strings.

  • encoding (str, optional): The name of the encoding used to decode the byte strings. The default value is None, in which case Python's default encoding for bytes.decode(), 'utf-8', is used.

  • errors (str, optional): The error handling scheme. 'strict' raises a UnicodeDecodeError on invalid bytes, 'ignore' drops the invalid bytes and 'replace' substitutes the Unicode replacement character (U+FFFD) for them. The default value is None, which behaves like 'strict'.
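The default error handling can be sketched as follows. This is a minimal illustration with made-up data, assuming errors is left unset so that Python's 'strict' behaviour applies and an invalid byte raises UnicodeDecodeError:

```python
import numpy as np

# Illustrative data: the byte \xff is never valid in UTF-8
arr = np.array([b'hello', b'world\xff'])

try:
    # errors is not passed, so strict handling is in effect
    np.char.decode(arr, encoding='utf-8')
    raised = False
except UnicodeDecodeError as exc:
    raised = True
    print("Decoding failed:", exc.reason)
```

Catching the exception like this is one way to detect bad input before deciding whether 'ignore' or 'replace' is acceptable for the data at hand.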

Return Value

This function returns an array of decoded strings with the same shape as the input array. Each element of the array is a string decoded from the corresponding byte-encoded element in the input array.
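As a small sketch of the shape-preserving behaviour described above (the 2-D array here is made up for illustration), the result keeps the input's shape while the element type changes from bytes to str:

```python
import numpy as np

# Illustrative 2x2 array of byte strings
grid = np.array([[b'a', b'bc'], [b'def', b'g']])
decoded = np.char.decode(grid, encoding='utf-8')

# Same shape in and out
print(grid.shape, decoded.shape)
# dtype kind 'S' is a byte string, 'U' is a Unicode string
print(grid.dtype.kind, decoded.dtype.kind)
```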

Example 1

Following is a basic example of the NumPy char.decode() function. Here we have an array of byte strings and decode them into regular strings −

import numpy as np

arr = np.array([b'hello', b'world', b'numPy'])
print("Original Array:",arr)
decoded_arr = np.char.decode(arr, encoding='utf-8')
print("Decoded array:",decoded_arr)

Below is the output of the basic example of numpy.char.decode() function −

Original Array: [b'hello' b'world' b'numPy']
Decoded array: ['hello' 'world' 'numPy']
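The encoding argument should match the encoding that produced the bytes. As a hedged sketch with made-up data, the array below holds Latin-1 bytes, where the character é is the single byte 0xE9, so it is decoded with encoding='latin-1' rather than 'utf-8':

```python
import numpy as np

# Illustrative data: 'café' encoded as Latin-1 (é is the byte 0xE9)
arr = np.array([b'caf\xe9', b'menu'])
decoded = np.char.decode(arr, encoding='latin-1')
print(decoded)
```

Decoding the same bytes as UTF-8 would fail under strict error handling, because 0xE9 on its own is not a complete UTF-8 sequence.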

Example 2

If the input contains bytes that are not valid in the specified encoding, we can control how they are handled by passing the errors parameter to char.decode(). In this example the invalid byte \xff is replaced with the Unicode replacement character (U+FFFD) −

import numpy as np
arr = np.array([b'hello', b'world\xff', b'numPy'])
decoded_arr = np.char.decode(arr, encoding='utf-8', errors='replace')
print(decoded_arr)

Here is the output of the above example −

['hello' 'world' 'numPy']

Example 3

When we want to drop invalid bytes during decoding, we can pass errors='ignore' to the char.decode() function. Here the invalid byte \xff is removed from the result −

import numpy as np
arr = np.array([b'hello', b'world\xff', b'numPy'])
decoded_arr = np.char.decode(arr, encoding='utf-8', errors='ignore')
print(decoded_arr)

Here is the output of the above example −

['hello' 'world' 'numPy']