Linear Classifier
- W gives a score for each class for a given image
- The class with the highest score is chosen as the prediction
- Need to determine which W is best
- Need some way to quantify the "badness" of a given W (a loss function)
- Need an efficient procedure for searching through the space of all possible Ws to find the best value (optimization)
Loss Function
- Tells us how good the current classifier is
- Notations
- x: input (image, data)
- y: label (class)
- In classification, usually a single integer that stands for a certain class
- $L_i$: per-example loss function
- Takes in the predicted scores and the true label
- Returns a quantitative value of how bad the prediction was
- L: average of the per-example losses over the entire dataset
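As a minimal sketch (function and variable names are my own, not from the notes), the full-dataset loss L is just the mean of the per-example losses $L_i$:

```python
import numpy as np

def dataset_loss(X, y, W, per_example_loss):
    """L = (1/N) * sum_i L_i, the mean of per-example losses."""
    return float(np.mean([per_example_loss(x_i, y_i, W)
                          for x_i, y_i in zip(X, y)]))

# Toy illustration with a 0/1 per-example loss (1 if misclassified).
zero_one = lambda x, y, W: float(np.argmax(W.dot(x)) != y)
W = np.eye(2)                 # identity weights: class score = feature value
X = np.array([[2.0, 0.0],     # highest score -> class 0
              [0.0, 3.0]])    # highest score -> class 1
y = np.array([0, 0])          # second example is wrong on purpose
print(dataset_loss(X, y, W, zero_one))  # 0.5 -> one of two examples wrong
```

Any per-example loss (such as the SVM loss below) can be plugged in for `per_example_loss`.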
Multi-class SVM Loss
$$
L_i=\sum_{j\neq y_i}\max(0,\ s_j-s_{y_i}+1)
$$
- $s_j$: score of an incorrect class $j$
- $s_{y_i}$: score of the correct class
- Generalization of the binary SVM to handle multiple classes
- For each incorrect category, compare its score against the score of the correct category
- If the correct category's score is greater than the incorrect score by some safety margin
- Loss for that term = 0
- Safety margin = 1 (in the math notation above)
- If not, the term contributes $s_j-s_{y_i}+1$; the per-example losses are then averaged over the whole dataset to get L
- Also referred to as hinge loss
- Once the margin is satisfied, the loss is 0 and stops changing, even if the correct score grows further
- Sanity check: at initialization W is very small, leading to small (near-zero) scores in every category
- The loss then ends up near $n_{classes}-1$
- The sum loops over all incorrect classes; with all scores similar, each term is roughly $\max(0,\ 0-0+1)=1$, one per incorrect class
- Squared hinge loss ($\max(0,\cdot)^2$) gets used sometimes
- Useful when you want to penalize "very wrong" much more heavily than "slightly wrong"
- Example Code
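A minimal numpy sketch of the per-example multi-class SVM loss (function and variable names are my own); it also demonstrates the initialization sanity check from above:

```python
import numpy as np

def svm_loss_single(x, y, W, margin=1.0):
    """Multi-class SVM (hinge) loss for one example.
    x: (D,) features, y: integer label, W: (C, D) weight matrix."""
    scores = W.dot(x)                                    # one score per class
    margins = np.maximum(0.0, scores - scores[y] + margin)
    margins[y] = 0.0                                     # sum skips j == y_i
    return margins.sum()

# Sanity check: with W = 0 all scores are 0, so every incorrect
# class contributes exactly the margin (1), giving C - 1.
C, D = 4, 3
W = np.zeros((C, D))
x = np.array([1.0, 2.0, 3.0])
print(svm_loss_single(x, y=0, W=W))  # 3.0  (= C - 1)
```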
- There can be multiple Ws that give 0 loss (e.g., if W gives 0 loss, so does 2W, since scaling only widens the margins)
- Always consider test/validation performance in such cases to choose among them
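A quick numeric illustration of the non-uniqueness point (toy numbers of my own): once W achieves zero loss, any positive multiple of it does too, so the data loss alone cannot prefer W over 2W.

```python
import numpy as np

def svm_loss_single(x, y, W, margin=1.0):
    # Per-example multi-class SVM (hinge) loss.
    scores = W.dot(x)
    margins = np.maximum(0.0, scores - scores[y] + margin)
    margins[y] = 0.0
    return margins.sum()

x = np.array([1.0, 1.0])
y = 0
W = np.array([[2.0, 1.0],   # correct class scores 3 on x
              [0.5, 0.5],   # scores 1
              [0.0, 0.0]])  # scores 0

print(svm_loss_single(x, y, W))      # 0.0 -> W achieves zero loss
print(svm_loss_single(x, y, 2 * W))  # 0.0 -> so does 2W
```

This degeneracy is one motivation for the regularization term introduced in the next section.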
Regularization