Computer Vision Reference
Resources
Fundamental Math
Computer Vision Fundamentals
- CS131 : Computer Vision - Foundations and Applications
Deep Neural Networks
- The Deep Learning Book
- Gradient Descent cheatsheet
- CS230 : Deep Learning
- CS231n : Convolutional Neural Networks for Visual Recognition
Coursera
- Machine Learning by Andrew Ng
PyTorch
We mainly use PyTorch because it has a gentle learning curve and handles much of the infrastructure needed to build neural networks. It is also taught in ML courses at UCSD, so people can apply skills learned in class to a project.
- PyTorch has some very good tutorials to learn from
- 60 minute intro to ML: This is a good starting point. It gives an overview of how PyTorch tensors work, how autograd does all the heavy lifting, and the training process for a classifier.
- Transfer Learning: Networks taking too long to train? Not enough data? Use someone else's network!
RAM
Computer Vision can be very RAM heavy, so just download more RAM
Coding cheat sheet
Below are common functions used throughout computer vision for manipulating images and more.
Here are the imports for the shorthand we use:
import cv2 # OpenCV
import numpy as np # numpy
from PIL import Image # PIL
import torch # PyTorch
import torchvision.transforms as transforms # PyTorch transforms
Primary image objects
numpy object (OpenCV)
- IMPORTANT NOTE: OpenCV reads images as BGR!
- Channel values are unsigned 8-bit ints (UINT8)
Image object (PIL)
- Channel values are unsigned 8-bit ints (UINT8) in [0,255] for the common RGB and grayscale (L) modes
Tensor object (PyTorch)
- Channel values are floats in [0,1]
Reading/opening images
Open as an OpenCV (numpy) image (NOTE: this opens the image as BGR)
Open as a PIL image
Writing images to disk
Write OpenCV image to disk
Write Tensor to disk as image
Converting between image objects
Convert from PIL to OpenCV
Convert from OpenCV to PIL
Convert from Tensor to PIL
Convert from PIL to Tensor
Converting between colorspaces
OpenCV
OpenCV has several constants that specify which colorspace to convert from and to.
cv2.COLOR_RGB2BGR # RGB -> BGR
cv2.COLOR_BGR2GRAY # BGR -> grayscale
# etc... you can swap the order of the colorspaces too
To actually convert the image