Machine Learning Engineering

Signal Processing – Filters

Hung Manh · Aug 24, 2023 · 2 min read

Having never really worked with signals before, I first needed a basic understanding of filters. The following are my notes on low-pass and high-pass filters, taken from the YouTube video by ritvikmath. Terminology Low-Pass and High-Pass Filter…
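As a rough illustration of the two terms, here is a minimal sketch in plain Python. It assumes a trailing moving average as the low-pass filter and the leftover residual as the high-pass filter; the function names and the `window` parameter are hypothetical, and the notes in the post may define the filters differently.

```python
def low_pass(signal, window=3):
    """Trailing moving average: smooths out fast changes, keeps the slow trend."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(signal)):
        # Average the current sample with up to window - 1 previous samples.
        chunk = signal[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def high_pass(signal, window=3):
    """Residual after removing the low-pass output: keeps only the fast changes."""
    return [x - s for x, s in zip(signal, low_pass(signal, window))]


# A signal that flips between 1 and 3 on every sample (pure high frequency).
noisy = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
slow = low_pass(noisy, window=2)   # settles at the mean level
fast = high_pass(noisy, window=2)  # keeps the oscillation around that level
```

With a window of 2, the low-pass output flattens to the mean level after the first sample, while the high-pass output retains the alternating part.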

How to divide two timestamps in equal chunks

Hung Manh · Aug 9, 2023 · 3 min read

Given two dates, how do you create equally spaced timestamps between them?
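One possible approach, sketched with only the standard library: divide the total `timedelta` by the number of chunks and step from the start. The helper name `split_interval` and its parameters are hypothetical and not taken from the post, which may solve this differently.

```python
from datetime import datetime


def split_interval(start, end, chunks):
    """Return chunks + 1 equally spaced timestamps from start to end, inclusive."""
    if chunks < 1:
        raise ValueError("chunks must be at least 1")
    step = (end - start) / chunks  # dividing a timedelta by an int gives a timedelta
    return [start + i * step for i in range(chunks + 1)]


# Split a 12-hour span into 4 chunks of 3 hours each.
points = split_interval(datetime(2023, 8, 9, 0, 0), datetime(2023, 8, 9, 12, 0), 4)
```

Both endpoints are included, so `chunks` intervals produce `chunks + 1` timestamps.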

PyTorch – expected scalar type Float but found Double

Hung Manh · Dec 14, 2022 · 2 min read

TLDR: The default datatype of a NumPy array of floats is double/float64. If a tensor is created from that array using torch.as_tensor, it adopts that datatype, which is not compatible with the default datatype of a neural network model which…
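The NumPy half of this can be checked without PyTorch. The sketch below assumes NumPy is installed and shows the float64 default plus an explicit cast to float32; the variable names are illustrative. Because torch.as_tensor keeps the dtype of the array it receives, casting the array first (or passing `dtype=torch.float32` to torch.as_tensor) is one way to avoid the mismatch, though the post may recommend something else.

```python
import numpy as np

values = np.array([0.5, 1.5, 2.5])      # Python floats become float64 by default
as_float32 = values.astype(np.float32)  # explicit cast to single precision

print(values.dtype)      # float64
print(as_float32.dtype)  # float32
```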

Training languagemodel – RuntimeError the expanded size of the tensor (100) must match the existing size (64) at non singleton dimension 1.

Hung Manh · Jul 4, 2022 · 1 min read

Context I trained a new language model from scratch using the Hugging Face framework and a preconfiguration of the RoBERTa model on a custom dataset. Now I wanted to vectorize a new dataset using the pretrained model. Observation I receive an error: Resolution This…

SentenceTransformer – float object is not subscriptable

Hung Manh · Jun 1, 2022 · 2 min read

TLDR: np.nan objects are of type float. Observation I was trying to apply the SentenceTransformer (v2.2.0) to a list of custom documents to create an embedding for each of them; however, I would get the error “TypeError: ‘float’ object is not…
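The error can be reproduced without the library. In this minimal sketch, `float("nan")` stands in for np.nan (which is itself a plain Python float), and the `clean_documents` helper is a hypothetical fix, not necessarily the one the post arrives at.

```python
nan = float("nan")  # stand-in for np.nan, which is also a plain Python float

# Indexing a float the way text is indexed raises the error from the post.
try:
    nan[0]
except TypeError as err:
    message = str(err)


def clean_documents(docs):
    """Keep only real strings, dropping NaN placeholders and other non-text values."""
    return [doc for doc in docs if isinstance(doc, str)]


docs = ["first document", nan, "second document"]
cleaned = clean_documents(docs)
```

Filtering (or replacing missing values with empty strings) before encoding keeps non-text values away from the tokenizer.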

Visual Explanation of Multi Head Attention

Hung Manh · May 28, 2022 · 5 min read

Why does changing the number of heads not change the number of parameters in the model? – In this post I want to present a short visual example of how changing the number of attention heads impacts the model architecture.
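The question can also be checked with simple arithmetic. The sketch below assumes the standard layout in which each head works on `d_model // num_heads` dimensions and every projection has a bias; the function name and the choice of `d_model = 512` are illustrative, not taken from the post.

```python
def attention_param_count(d_model, num_heads):
    """Count the weights and biases of one multi-head attention layer."""
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    # Q, K and V: each head owns a (d_model x d_head) weight plus d_head biases.
    per_head = 3 * (d_model * d_head + d_head)
    # The output projection maps the concatenated heads back to d_model.
    output = (num_heads * d_head) * d_model + d_model
    return num_heads * per_head + output


# Same total for every head count, because d_head shrinks as num_heads grows.
counts = {h: attention_param_count(512, h) for h in (1, 2, 8, 16)}
```

More heads means more but narrower projections, so the products `num_heads * d_head` and hence the totals stay constant.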

How to evaluate the Transformer Trainer

Hung Manh · May 4, 2022 · 2 min read

If you initialize a Trainer object, it handles the training boilerplate for you. Using the TrainingArguments, you can additionally customize your training process. One important argument is evaluation_strategy, which is set to “no” by default, thus no…

Difference between the Tokenizer and the PreTrainedTokenizer class

Hung Manh · Mar 17, 2022 · 4 min read

The Tokenizer and PreTrainedTokenizer classes perform different roles. The Tokenizer is a pipeline and defines the actual tokenization, while the PreTrainedTokenizer is more of a wrapper to provide additional functionality to be utilized by other components of the 🤗 Transformer…

How To Calculate the mean Average Precision (mAP) in object detection – an overview

Hung Manh · Dec 2, 2020 · 12 min read

When training an object detection model, you want to quantify and compare different models, preferably with just one comprehensive metric. Let’s take a look at the mean Average Precision (mAP).
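For orientation, here is a minimal sketch of the metric. It assumes detections have already been matched to ground truth (for example by an IoU threshold) and sorted by descending confidence, and it uses all-point interpolation; the function names and the input layout are hypothetical, and the post may use a different interpolation scheme.

```python
def average_precision(is_true_positive, num_ground_truth):
    """All-point interpolated AP for one class.

    is_true_positive: booleans for detections sorted by descending confidence.
    num_ground_truth: number of ground-truth boxes of that class.
    """
    if num_ground_truth == 0:
        return 0.0
    precisions, recalls = [], []
    tp = 0
    for rank, hit in enumerate(is_true_positive, start=1):
        tp += int(hit)
        precisions.append(tp / rank)
        recalls.append(tp / num_ground_truth)
    # Interpolate: each precision becomes the best precision at that rank or later.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class maps a class name to (is_true_positive list, num_ground_truth)."""
    aps = [average_precision(hits, n) for hits, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data: "car" has one false positive ranked second, "dog" is detected perfectly.
detections = {"car": ([True, False, True], 2), "dog": ([True, True], 2)}
m_ap = mean_average_precision(detections)
```

AP summarizes the precision-recall curve of one class as a single number, and mAP is the plain mean of those numbers across classes.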