XCS330 PS4 In-Context Learning & Fine-Tuning

Introduction In XCS330 Problem Set 4, we will explore methods for performing few-shot learning with pre-trained LMs. Datasets from HuggingFace: Amazon Reviews: a five-way classification problem. Given a review, predict the star rating, from 1 to 5. XSum: news articles from the BBC. Given a news article, return a one-sentence summary. Summaries are scored by n-gram overlap. bAbI: question-answering tasks that require reasoning about contextual information. It is a benchmark developed by Facebook AI Research. Fine-tuning Here, we fine-tune the entire model on k examples using two different sizes of smaller BERT models. ...

April 2, 2025

XCS330 PS3 MAML

Introduction In traditional machine learning, we have a large dataset for a single task, while in meta-learning, we have many tasks with small datasets, and the hope is that we can train a model that learns some fundamental ideas from other tasks. Put it this way: the goal of traditional ML is to optimize performance on a single task, while the goal of meta-learning is to optimize for adaptability. ...

March 14, 2025

XCS330 ProtoNet

In the 3rd assignment of XCS330, we will implement prototypical networks (protonets) for few-shot image classification on the Omniglot dataset. Protonets Algorithm This is the protonet in a nutshell, and the example comes from the assignment: In this example, we compute three class prototypes c1, c2, c3 from the support features. The decision boundaries are computed using Euclidean distance. When a new query arrives, we assign it to the class of its nearest prototype. ...
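To make the excerpt above a little more concrete, here is a minimal sketch of the nearest-prototype idea in plain Python. It is not the assignment's code: the function names `prototypes` and `classify` are hypothetical, and the toy 2-D vectors stand in for the support features that an embedding network would normally produce.

```python
import math

def prototypes(support):
    """Average the support feature vectors of each class.

    support: dict mapping a class label to a list of equal-length feature vectors.
    Returns a dict mapping each label to its mean vector (the class prototype).
    """
    protos = {}
    for label, feats in support.items():
        dim = len(feats[0])
        protos[label] = [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
    return protos

def classify(query, protos):
    """Return the label whose prototype is closest to the query in Euclidean distance."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))

# Toy support set: three classes, two support examples each (made-up numbers).
support = {
    "c1": [[0.0, 0.0], [0.0, 2.0]],
    "c2": [[4.0, 0.0], [6.0, 0.0]],
    "c3": [[0.0, 6.0], [2.0, 6.0]],
}
protos = prototypes(support)
print(classify([4.5, 0.5], protos))  # c2
```

This sketch only covers prediction. Training a protonet typically turns the distances into class probabilities with a softmax over negative distances and backpropagates through the embedding network, which is left out here.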

March 8, 2025

XCS330 PS2 Part 1 - Data Processing

Introduction In this blog, I will talk about some tricks that I learned while working on the data processing part of assignment 2 of XCS330. Here’s the link to the assignment. This is what the data looks like: Understand Dataset In this assignment, we use the Omniglot dataset, which has 1623 hand-written characters from 50 different alphabets. Each character has 20 images, each 28 x 28. Running either grader.py or main.py will download the dataset to the local filesystem. A folder named omniglot_resized will be generated under the src directory, and this is the structure of the directory: ...

February 28, 2025

XCS330 PS2 Part 2 - MANN Architecture

Introduction In this blog, I will talk about the model part of assignment 2 of XCS330. Here’s the link to the assignment. This is the MANN (memory-augmented neural network) architecture: LSTM Here are some useful reading materials for understanding LSTMs and RNNs: What is LSTM (Long Short Term Memory)?; PyTorch Tutorial - RNN & LSTM & GRU; PyTorch Tutorial - Name Classification Using A RNN. Here’s a nice diagram of an LSTM from Understanding LSTM Networks: ...

February 28, 2025