# Bias in Word Vectors

This is a demonstration of how machine learning models learn the biases
of the data that we provide. Something as innocuous as GloVe embeddings
can still contain biases.

Let's start by loading some embeddings. You can choose among the various
embeddings included in GloVe, or try other embeddings, to see how
biases make their way into embeddings produced by various models.

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchtext
import numpy as np

glove = torchtext.vocab.GloVe(name="6B", dim=100) # change me

In [3]:
glove["fair"].shape

torch.Size([100])

First, we will write a function that finds the closest words 
to a vector `vec` in the embedding space. We use Euclidean
distance.
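
For reference, writing $u$ and $v$ for two embedding vectors with $d$ components (notation chosen here for illustration; $d = 100$ for the embeddings loaded above), the Euclidean distance is

$$ \|u - v\|_2 = \sqrt{\sum_{i=1}^{d} (u_i - v_i)^2} $$

A smaller value means the two words sit closer together in the embedding space.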

In [4]:
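def print_closest_words(vec, n=5):
    """Print the n vocabulary words closest to `vec` in Euclidean distance."""
    # Distance from `vec` to every row of the embedding matrix
    diff = np.linalg.norm(glove.vectors.numpy() - vec.numpy(), axis=1)
    # Pair each vocabulary index with its distance and sort, nearest first
    lst = sorted(enumerate(diff), key=lambda x: x[1])
    for idx, difference in lst[:n]:
        print(glove.itos[idx], difference)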

In [6]:
print_closest_words(glove["fair"], n=5)

fair 0.0
kind 4.6599874
indeed 4.6798534
regard 4.6948347
sort 4.7112455


We can use this function to explore analogies. For example,
during the GloVe vector discussion, we saw that the embedding
space has nice structure that gives us analogies like

$$ king - man + woman \approx queen $$

In [7]:
print_closest_words(glove['king'] - glove['man'] + glove['woman'])

king 3.364068
queen 4.081079
monarch 4.6429076
throne 4.9055004
elizabeth 4.921559


In [8]:
print_closest_words(glove['king'])

king 0.0
prince 4.092166
queen 4.281252
monarch 4.4741716
brother 4.5366683


We get reasonable answers like "queen", "throne" and the name of
our current queen.

(Sidenote: I'm not sure why "king" is still the top word.
In fact in the examples below, the first word in our analogy
is returned as a closest word.)
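
A possible explanation for the sidenote: the query vector `glove['king'] - glove['man'] + glove['woman']` differs from `glove['king']` by exactly `glove['woman'] - glove['man']`, so its distance to "king" is always the distance between "man" and "woman". That would account for the value 3.364068 that appears next to the first analogy word in every man/woman example in this notebook. If that offset is smaller than the distance from the query to most other words, the original word stays among the nearest neighbours. The sketch below checks the identity numerically; the vectors are random stand-ins (an assumption for illustration, not real GloVe values), so it runs without downloading anything.

```python
import math
import random

random.seed(0)
dim = 100

# Random stand-ins for glove['king'], glove['man'], glove['woman'] (not real GloVe values)
king = [random.gauss(0, 1) for _ in range(dim)]
man = [random.gauss(0, 1) for _ in range(dim)]
woman = [random.gauss(0, 1) for _ in range(dim)]

# The analogy query: king - man + woman
query = [k - m + w for k, m, w in zip(king, man, woman)]

# Distance from the query back to "king", and distance between "man" and "woman"
dist_query_to_king = math.dist(query, king)
dist_man_to_woman = math.dist(man, woman)

# The two distances agree up to floating-point rounding
print(dist_query_to_king, dist_man_to_woman)
```

The same identity holds for any three vectors, which is why it does not depend on the particular embedding.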

We can likewise flip the analogy around:

In [9]:
print_closest_words(glove['queen'] - glove['woman'] + glove['man'])

queen 3.364068
king 4.081079
prince 5.0347586
royal 5.069449
majesty 5.1892147


Or, try different but related analogies along the gender axis:

In [10]:
print_closest_words(glove['king'] - glove['prince'] + glove['princess'])

queen 3.9443252
princess 4.092166
king 4.5588813
elizabeth 5.3060265
sister 5.355579


In [11]:
print_closest_words(glove['uncle'] - glove['man'] + glove['woman'])

aunt 3.1536405
grandmother 3.1846805
niece 3.1966183
uncle 3.364068
mother 3.4163873


In [12]:
print_closest_words(glove['grandmother'] - glove['mother'] + glove['father'])

grandfather 1.9321214
uncle 2.5486367
father 2.6408308
grandmother 3.0479465
brother 3.175673


In [13]:
print_closest_words(glove['old'] - glove['young'] + glove['father'])

father 5.0813956
old 5.085829
grandfather 5.747161
son 5.751157
grandmother 5.998024
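
Several of the outputs above list the input words themselves ("king", "father", "old") among the nearest neighbours. One common way to make analogy results easier to read is to skip the query words. The following is a minimal sketch of that idea and is not part of the original notebook: `closest_words_excluding` and its parameters are hypothetical names, and the 2-D vocabulary is made up so that the block runs without GloVe.

```python
import numpy as np

def closest_words_excluding(vec, vectors, itos, exclude=(), n=5):
    """Return the n (word, distance) pairs nearest to `vec`, skipping `exclude`."""
    # Euclidean distance from `vec` to every row of the embedding matrix
    dists = np.linalg.norm(vectors - vec, axis=1)
    skip = set(exclude)
    results = []
    for idx in np.argsort(dists):  # indices ordered nearest first
        word = itos[idx]
        if word in skip:
            continue
        results.append((word, float(dists[idx])))
        if len(results) == n:
            break
    return results

# Toy 2-D vocabulary (made-up numbers, not real GloVe vectors)
itos = ["king", "queen", "man", "woman", "apple"]
vectors = np.array([
    [1.0, 1.0],  # king
    [1.0, 2.0],  # queen
    [0.0, 1.0],  # man
    [0.0, 2.0],  # woman
    [5.0, 5.0],  # apple
])
stoi = {word: i for i, word in enumerate(itos)}

# king - man + woman, with the three input words filtered out of the results
query = vectors[stoi["king"]] - vectors[stoi["man"]] + vectors[stoi["woman"]]
result = closest_words_excluding(query, vectors, itos,
                                 exclude=["king", "man", "woman"], n=2)
print(result)
```

With the notebook's embeddings, the same function could presumably be called with `glove.vectors.numpy()` as `vectors` and `glove.itos` as `itos`, passing the three analogy words as `exclude`.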


## Biased Word Vectors: Doctors

Let's try another example:

In [14]:
print_closest_words(glove['doctor'] - glove['man'] + glove['woman'])

doctor 3.364068
nurse 4.2283154
physician 4.7054324
woman 4.873425
dentist 4.969891


The $doctor - man + woman \approx nurse$ analogy is very concerning.
Just to verify, the same result does not appear if we flip the gender terms:

In [15]:
print_closest_words(glove['doctor'] - glove['woman'] + glove['man'])

doctor 3.364068
man 4.899869
dr. 5.05853
brother 5.144743
physician 5.152549


## Biased Word Vectors: Programmers

We see similar types of gender bias with other professions.

In [16]:
print_closest_words(glove['programmer'] - glove['man'] + glove['woman'])

programmer 3.364068
cosmetologist 4.8504686
salesclerk 4.9466257
psychotherapist 5.0096955
adoptee 5.0135107


Beyond the first result, none of the other words are even related to
programming! In contrast, if we flip the gender terms, we get very
different results:

In [17]:
print_closest_words(glove['programmer'] - glove['woman'] + glove['man'])

programmer 3.364068
programmers 5.2138195
setup 5.2186975
mechanic 5.461222
hacker 5.5201344


Here are the results for "engineer":

In [18]:
print_closest_words(glove['engineer'] - glove['man'] + glove['woman'])

engineer 3.364068
technician 4.6912084
educator 5.208781
contractor 5.237237
surgeon 5.2675548


In [19]:
print_closest_words(glove['engineer'] - glove['woman'] + glove['man'])

engineer 3.364068
mechanic 5.4588532
engineers 5.537788
master 5.6934347
technician 5.8191476
