Machine Learning Engineering
Signal Processing – Filters
Having never really worked with signals before, I first needed a basic understanding of filters. The following are my notes on low-pass and high-pass filters, taken from the YouTube video made by ritvikmath. Terminology Low-Pass and High-Pass Filter…
How to divide two timestamps in equal chunks
Given two dates, how to create equally spaced out times between them?
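One way to approach this, as a minimal sketch using only the standard library: take the difference between the two timestamps, divide the resulting timedelta by the number of chunks, and add multiples of that step to the start. The function name split_interval and the example dates are my own choices for illustration.

```python
from datetime import datetime


def split_interval(start, end, chunks):
    """Return chunks + 1 equally spaced datetimes from start to end (inclusive)."""
    if chunks < 1:
        raise ValueError("chunks must be at least 1")
    # Dividing a timedelta by an int gives the length of one chunk.
    step = (end - start) / chunks
    return [start + i * step for i in range(chunks + 1)]


# Hypothetical example: split one day into four six-hour chunks.
points = split_interval(datetime(2022, 1, 1), datetime(2022, 1, 2), 4)
for point in points:
    print(point.isoformat())
```

The result includes both endpoints, so n chunks yield n + 1 timestamps. Drop the last element if only the chunk starts are needed.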
PyTorch – expected scalar type Float but found Double
TLDR: The default datatype of a numpy array translates to double/float64. If a Tensor is created from that array using torch.as_tensor it will adopt that datatype, which is not compatible with the default datatype of a neural network model which…
Training language model – RuntimeError the expanded size of the tensor (100) must match the existing size (64) at non singleton dimension 1.
Context I trained a new language model from scratch using Hugging Face's framework and a preconfiguration of the RoBERTa model on a custom dataset. Now I wanted to vectorize a new dataset using the pretrained model. Observation I receive an error: Resolution This…
SentenceTransformer – float object is not subscriptable
TLDR: np.nan objects are of type float. Observation I was trying to apply the SentenceTransformer (v2.2.0) to a list of custom documents to create embeddings for each of them; however, I would get the error “TypeError: ‘float’ object is not…
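To illustrate the TLDR, here is a small sketch that assumes the float entries come from missing values in the document list. It uses float("nan") as a stand-in for np.nan (which is itself a plain Python float) and a hypothetical helper clean_documents that keeps only non-empty strings before anything is handed to an encoder. No SentenceTransformer call is made here.

```python
def clean_documents(docs):
    """Keep only non-empty strings; drop NaN floats, None and other non-string entries."""
    return [doc for doc in docs if isinstance(doc, str) and doc.strip()]


# Hypothetical input: a document list with missing values mixed in.
docs = ["first document", float("nan"), "second document", None]

# A missing value is a float, not a string, so it cannot be indexed like text.
print(type(docs[1]).__name__)

cleaned = clean_documents(docs)
print(cleaned)
```

Whether to drop such entries or replace them with an empty string depends on whether the embeddings need to stay aligned with the original rows.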
Visual Explanation of Multi Head Attention
Why does changing the number of heads not change the number of parameters in the model? – In this post I want to present a short visual example of how changing the number of attention heads impacts the model architecture.
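The question in the excerpt can also be checked with a bit of arithmetic. The sketch below assumes the standard formulation in which each head works on a slice of size d_model / num_heads, with query, key and value projections per head plus one shared output projection, all with biases. The function name and the example size 768 are my own choices.

```python
def attention_parameter_count(d_model, num_heads):
    """Count weights and biases of one multi-head attention block.

    Assumes head_dim = d_model / num_heads, as in the standard formulation.
    """
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    head_dim = d_model // num_heads
    # Each head has query, key and value projections of shape (d_model, head_dim) plus biases.
    per_head = 3 * (d_model * head_dim + head_dim)
    # One output projection of shape (d_model, d_model) plus bias, shared by all heads.
    output_projection = d_model * d_model + d_model
    return num_heads * per_head + output_projection


counts = {heads: attention_parameter_count(768, heads) for heads in (1, 4, 8, 12)}
for heads, count in counts.items():
    print(heads, count)
```

Because num_heads * head_dim always equals d_model, the head count cancels out: more heads means narrower heads, not more weights.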
How to evaluate the Transformer Trainer
If you initialize a Trainer object, it will handle the training boilerplate for you. Using the TrainingArguments, you can additionally customize your training process. One important argument is the evaluation_strategy, which is set to “no” by default, thus no…
Difference between the Tokenizer and the PreTrainedTokenizer class
The Tokenizer and PreTrainedTokenizer classes perform different roles. The Tokenizer is a pipeline and defines the actual tokenization, while the PreTrainedTokenizer is more of a wrapper to provide additional functionality to be utilized by other components of the 🤗 Transformer…
How To Calculate the mean Average Precision (mAP) in object detection – an overview
When training an object detection model you want to quantify and compare different models preferably with just one comprehensive metric. Let’s take a look at the mean Average Precision (mAP).
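As a rough sketch of the metric, the code below computes the Average Precision per class with all-point interpolation (one common variant, not necessarily the one used in the post) and averages it over classes. It assumes the detections are already sorted by descending confidence and already marked as true or false positives at some IoU threshold. The function names and the toy data are hypothetical.

```python
def average_precision(is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class."""
    if num_ground_truth == 0 or not is_true_positive:
        return 0.0
    true_positives = 0
    precisions, recalls = [], []
    for rank, hit in enumerate(is_true_positive, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    # Interpolate: precision at each point becomes the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the curve, one per increase in recall.
    area, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def mean_average_precision(per_class):
    """Average the per-class AP values; per_class maps name -> (flags, num_ground_truth)."""
    if not per_class:
        return 0.0
    scores = [average_precision(flags, total) for flags, total in per_class.values()]
    return sum(scores) / len(scores)


# Toy data: ranked detections flagged as true/false positive, plus ground-truth counts.
toy = {"cat": ([True, False, True], 2), "dog": ([True, True], 2)}
print(mean_average_precision(toy))
```

Benchmarks differ in the details, for example which IoU thresholds are used and how the curve is interpolated, so numbers from this sketch are not directly comparable to any particular leaderboard.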