"for line in…" results in UnicodeDecodeError: 'utf-8' codec can't decode byte

August 23, 2021 by James Palmer

As suggested by Mark Ransom, I found the right encoding for this file. It was ISO-8859-1, not UTF-8, so replacing open("u.item", encoding="utf-8") with open("u.item", encoding="ISO-8859-1") solves the problem.

The following also worked for me. Specifying ISO-8859-1 can save a lot of trouble, mainly when using Speech Recognition APIs.
Example:
file = open('../Resources/' + filename, 'r', encoding="ISO-8859-1")
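
To see why changing the encoding helps, here is a minimal sketch that reproduces the error and the fix. The sample line and the helper name read_lines are made up for illustration; they are not taken from the real u.item file. Keep in mind that ISO-8859-1 maps every possible byte to a character, so it never raises a decode error. If the file is actually in some other encoding, you will get wrong characters silently instead of an exception.

```python
import os
import tempfile


def read_lines(path, encoding):
    """Return the lines of a text file decoded with the given encoding."""
    with open(path, "r", encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]


# Hypothetical sample data: the byte 0xE9 is "é" in ISO-8859-1,
# but followed by a space it is not a valid UTF-8 sequence.
fd, sample_path = tempfile.mkstemp(suffix=".item")
with os.fdopen(fd, "wb") as f:
    f.write(b"1|Caf\xe9 Society (1995)\n")

# Reading as UTF-8 reproduces the error from the question.
try:
    read_lines(sample_path, "utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Reading as ISO-8859-1 decodes every byte, so the loop completes.
latin1_lines = read_lines(sample_path, "ISO-8859-1")

os.remove(sample_path)
print(utf8_failed, latin1_lines)
```

Using a with block also closes the file for you, which the bare open() call above does not.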
