When we read in a file, we use f = open(filename, encoding="latin1").
Here's what encoding="latin1" means. The file we are reading, like any file, is just a bunch of 0's and 1's:
1011100110111100011111010000011101101011001100111000100001111001101110101000100100100101000101010100111010110001001001110000101001011110000110011001001011100001000000110111100010100110000001101001011111111111010100100001010111010111100111001110010111110101111001100000111010011101111010111110010010000101110101100111011011001010010011100101100011100010011001111100110101101011010111100011110001011110101011010111110011011011110101111111000001000111010010011011011000000101010001011000101011000010111
When we specify that the file is encoded using the "latin1" encoding, Python reads the file 8 digits (bits) at a time:
10111001
10111100
01111101
00000111
01101011
...
Each group of 8 bits (one byte) corresponds to one character, so there are 2^8 = 256 possible characters in total. You can look up which 8-bit sequences correspond to which characters in any Latin-1 (ISO-8859-1) code chart.
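To make this concrete, here is a small sketch that decodes the first five 8-bit groups shown above by hand. The variable names (groups, values, raw, text) are just illustrative choices, not anything from the code we use for reading files.

```python
# The first five 8-bit groups from the bit string above.
groups = ["10111001", "10111100", "01111101", "00000111", "01101011"]

# Interpret each group of 8 bits as a number between 0 and 255.
values = [int(bits, 2) for bits in groups]

# Pack the numbers into a bytes object, then decode it as latin1:
# every byte becomes exactly one character.
raw = bytes(values)
text = raw.decode("latin1")

print(values)      # [185, 188, 125, 7, 107]
print(repr(text))  # repr() makes non-printable characters visible

# All 256 possible byte values are valid in latin1.
all_chars = bytes(range(256)).decode("latin1")
print(len(all_chars))  # 256
```

Because every possible byte maps to some character, decoding with latin1 never raises an error, even when the result is not meaningful text (as with the bytes above).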
Obviously, the world's languages cannot all be expressed using only 256 characters. For example, there are tens of thousands of Chinese characters. To encode them, more complex schemes are needed. The most widely used is Unicode (usually stored with an encoding such as UTF-8, which uses more than one byte for many characters). Unicode can also represent the writing systems of languages such as Japanese, Arabic, Hebrew, Russian, Korean, etc.
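As a rough illustration of this limit, the sketch below tries to encode a two-character Chinese string with latin1 and then with UTF-8. The sample string and the variable names are assumptions chosen for the example.

```python
word = "中文"  # two Chinese characters, chosen only as an example

# latin1 has no byte for these characters, so encoding fails.
try:
    word.encode("latin1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# UTF-8 spends several bytes on each of these characters instead.
utf8_bytes = word.encode("utf-8")

print(latin1_ok)                           # False
print(len(word), len(utf8_bytes))          # 2 6
print(utf8_bytes.decode("utf-8") == word)  # True
```

A file containing such text would be opened with open(filename, encoding="utf-8") rather than encoding="latin1".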