AI

Attention in LLMs and Extrapolation

It is now understood that the attention mechanism in large language models (LLMs) serves multiple functions. By analyzing attention, we gain insight into why LLMs succeed at in-context learning and chain-of-thought and, by extension, why they sometimes succeed at extrapolation. In this article, we unpack these questions by examining various types of attention mechanisms. Basic […]
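To ground the term before the full post, here is a minimal NumPy sketch of single-head scaled dot-product attention, the basic operation such analyses inspect. The function name, shapes, and toy inputs are illustrative assumptions, not code from the article.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head attention; the returned weights are what one inspects."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over key positions
    return weights @ V, weights                          # output and attention map

# Toy example: 3 tokens with 4-dimensional representations (made-up numbers).
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
out, attn_map = scaled_dot_product_attention(Q, K, V)
print(attn_map)  # each row sums to 1: how strongly each token attends to the others
```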


Interestingness First Classifiers

Most existing machine learning models aim to maximize predictive accuracy, but in this article, I will introduce classifiers that prioritize interestingness. What Does It Mean to Prioritize Interestingness? For example, let us consider the task of classifying whether a user is an adult based on their profile. If the profile contains an age feature, then the
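To make the adult-or-not example concrete, below is a rough Python sketch of the contrast the excerpt hints at: a rule that simply reads the age feature is maximally accurate but uninformative, while a classifier denied that feature must rely on other signals. All feature names, values, and the age-18 threshold are assumptions for illustration, not the article's definition of interestingness.

```python
# Hypothetical user profiles; feature names and values are illustrative assumptions.
profiles = [
    {"age": 25, "follows_news": 1, "night_posts": 3},
    {"age": 15, "follows_news": 0, "night_posts": 9},
    {"age": 40, "follows_news": 1, "night_posts": 1},
]
labels = [p["age"] >= 18 for p in profiles]  # "is adult" ground truth

# Accuracy-maximizing rule: trivially correct here, but it tells us nothing new.
def accuracy_first_classifier(profile):
    return profile["age"] >= 18

# Denied the age feature, a classifier has to lean on other signals; this is the
# kind of non-obvious pattern an interestingness-first classifier might surface
# (an assumption about the article's intent, based on the truncated excerpt).
def age_blind_classifier(profile):
    return profile["follows_news"] == 1 and profile["night_posts"] < 5

for p, y in zip(profiles, labels):
    print(y, accuracy_first_classifier(p), age_blind_classifier(p))
```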


Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem

In the field of Natural Language Processing (NLP), a central theme has always been “how to make computers understand the meaning of words.” One fundamental technique for this is “Word Embedding.” This technique converts words into numerical vectors (lists of numbers), with methods like Word2Vec and GloVe being well-known. Using these vectors allows for calculations
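As a small illustration of the kind of "calculations" word vectors enable, here is a sketch that compares toy embeddings with cosine similarity. The vectors are made-up numbers standing in for real Word2Vec or GloVe output.

```python
import numpy as np

# Toy 4-dimensional embeddings (made-up numbers, not real Word2Vec/GloVe vectors).
embeddings = {
    "king":  np.array([0.90, 0.10, 0.70, 0.20]),
    "queen": np.array([0.85, 0.15, 0.75, 0.80]),
    "apple": np.array([0.10, 0.90, 0.20, 0.30]),
}

def cosine_similarity(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Related words get similar vectors, so similarity scores reflect relatedness.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high (~0.9)
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low  (~0.3)
```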


Hack Your Feed: Take Control of Your Recommendations

“Recommended for You.” YouTube, Amazon, Netflix, X, news sites… we are surrounded by services displaying recommendations. They are convenient, but have you ever felt, “I keep getting recommended the same kinds of things,” or “I wish I could get recommendations from a different perspective”? Or perhaps wondered, “Are these recommendations biased?” The truth is, these “recommendations” (recommender

