Conv1D Layer

Here’s a pretty cool article on understanding PyTorch conv1d shapes for text classification. In this article, the example uses the following shape:

- n = 1: number of batches
- d = 3: dimension of the word embedding
- l = 5: length of the sentence

```python
import torch
import torch.nn as nn

# Example represents one sentence here
example = torch.rand(1, 5, 3)   # (n, l, d)
example.shape  # torch.Size([1, 5, 3])
example
# This is the output:
# tensor([[[0.0959, 0.1674, 0.1259],
#          [0.8330, 0.5789, 0.2141],
#          [0.3774, 0.8055, 0.4218],
#          [0.1992, 0.4722, 0.3167],
#          [0.4633, 0.0352, 0.8803]]])
```

In the above output, you can imagine each row represents one word. ...
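The article’s main gotcha is worth making explicit: `nn.Conv1d` expects input shaped (batch, channels, length), with the embedding dimension acting as channels, so an (n, l, d) tensor has to be permuted first. A minimal sketch (the `out_channels` and `kernel_size` values here are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

n, l, d = 1, 5, 3                # batch, sentence length, embedding dim
example = torch.rand(n, l, d)

# Conv1d expects (batch, channels, length), so treat the embedding
# dimension as channels and permute (n, l, d) -> (n, d, l).
conv = nn.Conv1d(in_channels=d, out_channels=2, kernel_size=2)
out = conv(example.permute(0, 2, 1))

print(out.shape)  # torch.Size([1, 2, 4]): length shrinks to l - kernel_size + 1
```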

April 5, 2025

XCS330 PS4 In-Context Learning & Fine-Tuning

Introduction In the XCS330 Problem Set 4, we will explore methods for performing few-shot learning with pre-trained LMs. Datasets from HuggingFace:

- Amazon Reviews: a five-way classification problem. Given a review, predict the star rating, from 1 to 5.
- XSum: news articles from the BBC. Given an article, return a one-sentence summary. Use n-gram overlap to measure the score.
- bAbI: question-answering tasks that require reasoning about contextual information. An AI benchmark developed by Facebook AI Research.

Fine-tuning Here, we fine-tune the entire model on k examples using two different sizes of smaller BERT models. ...
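For the XSum scoring mentioned above, an n-gram overlap metric can be sketched in a few lines. `ngram_overlap` is a hypothetical helper here (a ROUGE-n-recall-style simplification, not the assignment’s exact scorer):

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_overlap(prediction, reference, n=2):
    """Fraction of reference n-grams that also appear in the prediction."""
    pred, ref = ngrams(prediction.split(), n), ngrams(reference.split(), n)
    if not ref:
        return 0.0
    matched = sum(min(count, pred[gram]) for gram, count in ref.items())
    return matched / sum(ref.values())

# 2 of the reference's 3 bigrams appear in the prediction.
print(ngram_overlap("the cat sat on the mat", "the cat sat there"))  # 0.666...
```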

April 2, 2025

XCS330 PS3 MAML

Introduction In traditional machine learning, we have a large dataset for a specific task, while in meta-learning, we have many tasks, each with a small dataset, and the hope is that the model can learn some fundamental structure shared across tasks. Put another way: the goal of traditional ML is to optimize the performance of a single task, while the goal of meta-learning is to optimize for adaptability. ...
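The "optimize for adaptability" idea shows up directly in MAML’s loop structure: the outer update is driven by the loss measured after an inner gradient step on each task. A toy first-order sketch, where every task definition and hyperparameter is an illustrative assumption (each "task" is just a scalar target a with loss (w - a)^2), not the assignment’s code:

```python
import random

# Toy first-order MAML sketch. Task loss: L_a(w) = (w - a)^2.
tasks = [-2.0, 0.0, 1.0, 5.0]
alpha, beta = 0.1, 0.05   # inner / outer learning rates (assumed values)
w = 10.0                  # meta-initialization

random.seed(0)
for step in range(2000):
    batch = random.sample(tasks, 2)
    meta_grad = 0.0
    for a in batch:
        w_inner = w - alpha * 2 * (w - a)   # one inner adaptation step
        meta_grad += 2 * (w_inner - a)      # first-order outer gradient
    w -= beta * meta_grad / len(batch)

print(w)  # fluctuates near the task mean, 1.0
```

This toy is too simple to show MAML’s advantage over plain multi-task training; it only illustrates the inner-step / outer-step structure.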

March 14, 2025

XCS330 ProtoNet

In the 3rd assignment of XCS330, we will implement prototypical networks (protonets) for few-shot image classification on the Omniglot dataset. Protonets Algorithm This is the protonet in a nutshell, and the example comes from the assignment: In this example, we compute three class prototypes c1, c2, c3 from the support features. The decision boundaries are computed using Euclidean distance. When there’s a new query, we can determine which class it belongs to. ...
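The prototype-and-distance step described above can be sketched in a few lines. The random support features below stand in for encoder outputs on Omniglot; shapes and seed are arbitrary assumptions:

```python
import numpy as np

# Support features: 3 classes x 2 shots x 4-dim embeddings (random stand-ins
# for the encoder output).
rng = np.random.default_rng(0)
support = rng.normal(size=(3, 2, 4))

# Class prototypes c1, c2, c3 are the mean of each class's support features.
prototypes = support.mean(axis=1)          # shape (3, 4)

# Classify a query by smallest squared Euclidean distance to a prototype.
query = support[1].mean(axis=0) + 0.01     # a point near class 1's prototype
dists = ((prototypes - query) ** 2).sum(axis=1)
print(dists.argmin())  # 1
```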

March 8, 2025

XCS330 PS2 Part 1 - Data Processing

Introduction In this blog, I will talk about some tricks that I learned while working on the data processing part of assignment 2 of XCS330. Here’s the link to the assignment. This is what the data looks like: Understand Dataset In this assignment, we use the Omniglot dataset, which has 1623 hand-written characters from 50 different alphabets. Each character has 20 (28 x 28) images. Running either grader.py or main.py will download the dataset to the local filesystem. A folder named omniglot_resized will be generated under the src directory, and this is the structure of the directory: ...
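The data-processing step this directory structure feeds is sampling an N-way, K-shot task from the character folders. A minimal sketch, with hypothetical in-memory paths standing in for omniglot_resized (the assignment’s actual loader differs in detail):

```python
import random

# Hypothetical stand-in for the omniglot_resized folder tree:
# each character class maps to its 20 image paths.
class_dirs = {
    f"omniglot_resized/lang/char{i}": [f"char{i}/img{j}.png" for j in range(20)]
    for i in range(10)
}

def sample_task(class_dirs, n_way=5, k_shot=1, n_query=1, seed=None):
    rng = random.Random(seed)
    classes = rng.sample(sorted(class_dirs), n_way)  # pick N character classes
    support, query = [], []
    for label, cls in enumerate(classes):
        # Draw disjoint support and query images for this class.
        imgs = rng.sample(class_dirs[cls], k_shot + n_query)
        support += [(path, label) for path in imgs[:k_shot]]
        query += [(path, label) for path in imgs[k_shot:]]
    return support, query

support, query = sample_task(class_dirs, seed=0)
print(len(support), len(query))  # 5 5
```

Sampling support and query images in one `rng.sample` call is what guarantees they never overlap within a class.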

February 28, 2025

XCS330 PS2 Part 2 - MANN Architecture

Introduction In this blog, I will talk about the model part of assignment 2 of XCS330. Here’s the link to the assignment. This is the MANN (memory-augmented neural network) architecture: LSTM Here are some useful reading materials for understanding LSTM and RNN:

- What is LSTM (Long Short Term Memory)?
- PyTorch Tutorial - RNN & LSTM & GRU
- PyTorch Tutorial - Name Classification Using A RNN

Here’s a nice diagram of LSTM from Understanding LSTM Networks: ...
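To make the LSTM diagram concrete, a single cell step can be written out in NumPy following the standard gate equations. The weights are random and the shapes arbitrary; this is an illustrative sketch, not the MANN code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h, c, W, b):
    """One LSTM step. W: (4*H, D+H), b: (4*H,).
    Gate order: input, forget, cell candidate, output."""
    H = h.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i, f, g, o = z[:H], z[H:2*H], z[2*H:3*H], z[3*H:]
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # update cell state
    h_new = sigmoid(o) * np.tanh(c_new)                # emit hidden state
    return h_new, c_new

D, H = 3, 4                           # input and hidden sizes (arbitrary)
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4 * H, D + H)), np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):     # run 5 timesteps
    h, c = lstm_cell(x, h, c, W, b)
print(h.shape)  # (4,)
```

Since h is a sigmoid times a tanh, every hidden activation stays in (-1, 1), which is why LSTM states do not blow up the way plain RNN states can.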

February 28, 2025