Computer Vision Reference

Resources

Fundamental Math

Linear Algebra:
- this or this
Vector Calculus:
- this or this

Computer Vision Fundamentals

CS131 : Computer Vision - Foundations and Applications

Deep Neural Networks

The Deep Learning Book
Gradient Descent cheatsheet
CS230 : Deep Learning
CS231n : Convolutional Neural Networks for Visual Recognition

Coursera

Machine Learning by Andrew Ng

PyTorch

We mainly use PyTorch because it has a gentle learning curve and handles much of the infrastructure needed to build neural networks. It is also taught in ML courses at UCSD, so people can apply skills learned in class to a project.

PyTorch has some very good tutorials to learn from:
60 minute intro to ML: A good starting point. It gives an overview of how PyTorch tensors work, how autograd does all the heavy lifting, and the training process for a classifier.
Transfer Learning: Networks taking too long to train? Not enough data? Start from someone else's pretrained network!

RAM

Computer Vision can be very RAM heavy, so just download more RAM

Coding cheat sheet

Below are common functions for reading, writing, and converting images that come up throughout computer vision work.

Here are the imports for the shorthand we use

import cv2  # OpenCV
import numpy as np  # numpy
from PIL import Image  # PIL
import torch  # PyTorch
import torchvision.transforms as transforms  # PyTorch transforms

Primary image objects

numpy object (OpenCV)

IMPORTANT NOTE: OpenCV reads images as BGR!
Channel values are unsigned 8-bit ints (UINT8)
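
To make the two notes above concrete, here is a minimal sketch that uses a synthetic array in place of a real cv2.imread result (the array and the names bgr and rgb are made up for illustration). Reversing the last axis is one way to go from BGR to RGB without calling OpenCV.

```python
import numpy as np

# Synthetic 2x2 image standing in for a cv2.imread result: pure blue in BGR order.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # channel 0 is blue in BGR

# Reversing the channel axis gives RGB; blue is now the last channel.
rgb = bgr[..., ::-1]

print(bgr.dtype)            # uint8
print(rgb[0, 0].tolist())   # [0, 0, 255]
```

Note that the slice returns a view, so call .copy() on it if a library needs a contiguous array.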

Image object (PIL)

Channel values are typically unsigned 8-bit ints (UINT8) in [0,255] for the common RGB and L modes

Tensor object (PyTorch)

Channel values are floats in [0,1] (as produced by transforms.ToTensor())

Reading/opening images

Open as an OpenCV (numpy) image (NOTE: this opens the image as BGR)

cv2.imread(path)

Open as a PIL image

Image.open(path)

Writing images to disk

Write OpenCV image to disk

assert isinstance(img, np.ndarray)  # expects an OpenCV (numpy) image
cv2.imwrite(path, img)

Write Tensor to disk as image

assert isinstance(img, torch.Tensor)  # expects a PyTorch tensor
transforms.ToPILImage()(img).save(path)

Converting between image objects

Convert from PIL to OpenCV

assert isinstance(img, Image.Image)  # expects a PIL image (isinstance also accepts subclasses returned by Image.open)
cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)

Convert from OpenCV to PIL

assert isinstance(img, np.ndarray)  # expects an OpenCV (numpy) image in BGR order
Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

Convert from Tensor to PIL

assert isinstance(img, torch.Tensor)  # expects a PyTorch tensor
transforms.ToPILImage()(img)

Convert from PIL to Tensor

assert isinstance(img, Image.Image)  # expects a PIL image (isinstance also accepts subclasses returned by Image.open)
transforms.ToTensor()(img)

Converting between colorspaces

OpenCV

OpenCV has several constants that specify which colorspace to convert from and to.

cv2.COLOR_RGB2BGR  # RGB -> BGR
cv2.COLOR_BGR2GRAY  # BGR -> grayscale
# etc. The reverse conversions exist too (e.g. cv2.COLOR_BGR2RGB, cv2.COLOR_GRAY2BGR)

There are many other colorspaces that you may find helpful (CIE, HSV, etc.), and you can check all the conversions in the OpenCV docs here.

To actually convert the image

# replace cv2.COLOR_RGB2BGR with whatever you need
cv2.cvtColor(img, cv2.COLOR_RGB2BGR)