Demo: visurb.drew.hu
Code: github.com/andrewhu/visurb

One of the coolest results from the early days of the deep-learning boom is Word2Vec, which used a shallow neural network with a single hidden layer to produce word embeddings: vectors of numbers that represent words. Word2Vec lets you do arithmetic on words, e.g. King - Man + Woman ≈ Queen. You can also visualize the embeddings by using a dimensionality reduction method like t-SNE to bring the vectors down from 300 or so dimensions to 2 or 3.
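
To make the word arithmetic concrete, here is a minimal sketch in plain Python. The 3-D vectors in `TOY_VECTORS` are made up by hand for illustration (real Word2Vec vectors are learned from text and have hundreds of dimensions), and the `analogy` helper is a hypothetical name, not part of any library. It adds and subtracts vectors, then returns the remaining word whose vector points in the most similar direction.

```python
import math

# Hypothetical hand-made 3-D vectors, for illustration only.
# Real Word2Vec embeddings are learned and are roughly 300-D.
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(positive, negative, vectors):
    """Return the word nearest to sum(positive) - sum(negative).

    The query words themselves are excluded from the candidates,
    otherwise one of them would often be its own nearest neighbour.
    """
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]

    query_words = set(positive) | set(negative)
    candidates = [w for w in vectors if w not in query_words]
    return max(candidates, key=lambda w: cosine_similarity(target, vectors[w]))


# King - Man + Woman
result = analogy(positive=["king", "woman"], negative=["man"], vectors=TOY_VECTORS)
print(result)
```

The answer is the nearest neighbour rather than an exact match, which is why the analogy is better read as "approximately equals".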

Transformers are a much more versatile and powerful tool for NLP, and they also produce higher-quality text embeddings that work well on sentences and documents, not just single words. With the Hugging Face Transformers library, producing embeddings from pretrained models is super easy. I scraped some definitions from Urban Dictionary, produced a text embedding for each definition, then plotted them.
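
For a rough idea of what this can look like, here is a sketch, not the exact code behind the demo. It makes several assumptions: the model name `sentence-transformers/all-MiniLM-L6-v2` is just one pretrained model that happens to work for this, mean pooling over token vectors is one common way to get a single vector per text, and `project_2d` uses plain PCA as a simple stand-in for a method like t-SNE. The function names are hypothetical.

```python
import numpy as np


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors of each text, ignoring padding.

    token_vectors: array of shape (batch, tokens, dim)
    attention_mask: array of shape (batch, tokens), 1 = real token, 0 = padding
    """
    mask = attention_mask[:, :, None].astype(token_vectors.dtype)
    summed = (token_vectors * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against divide-by-zero
    return summed / counts


def embed_texts(texts, model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Return one embedding per text as a (len(texts), dim) array.

    The model name is an assumption; any encoder-style model on the
    Hugging Face Hub should work the same way. The heavy imports are
    local so the helpers in this file can be used without them.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        output = model(**batch)
    return mean_pool(
        output.last_hidden_state.numpy(),
        batch["attention_mask"].numpy(),
    )


def project_2d(embeddings):
    """Project embeddings to 2-D with PCA (a stand-in for t-SNE)."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of `components` are the principal directions, strongest first.
    _, _, components = np.linalg.svd(centered, full_matrices=False)
    return centered @ components[:2].T
```

Calling `project_2d(embed_texts(definitions))` on a list of definition strings would give one (x, y) point per definition, ready for a scatter plot. The first call to `embed_texts` downloads the model weights.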

Definition embeddings plotted in 2D

Still, since most definitions on Urban Dictionary are heavily subjective (e.g. names and horoscopes), the visualization should be viewed as equally subjective. In fact, I really can’t think of any intuitive explanation for why some names end up closer to each other than others. I did manage to get a pretty clear split between male and female names, though, likely because the definitions include gendered pronouns like “him” or “her”.