Introduction
In this section I explain the steps needed to obtain the intrinsic matrix of a camera (Kin).
Perspective
Starting from the camera model, we can project a 3D point, referenced to the optical center (OC), onto the image sensor (see figure 1).
Note that f is defined as a negative value; physically, this means that the image is inverted.
We use the ideal pinhole camera model (d = 0), which is a good approximation when the camera is in focus. The image sensor is located at Fz = f, which means that the camera is focused at infinity; however, since d = 0, all objects are in focus.
Note that light rays coming from a point at infinity arrive at the lens as parallel rays and cross at the focal point.
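The projection described above can be sketched in Python. The formulas of figure 1 are not reproduced in the text, so the similar-triangles form used here (x = f·X/Z, y = f·Y/Z) is an assumption based on the standard pinhole model; the function name and the example values are hypothetical.

```python
def project_pinhole(point, f):
    """Project a 3D point (X, Y, Z), referenced to the optical center,
    onto the image plane located at Z = f (assumed pinhole model)."""
    X, Y, Z = point
    if Z == 0:
        raise ValueError("the point lies in the plane of the optical center")
    # Similar triangles: x / f = X / Z and y / f = Y / Z
    return (f * X / Z, f * Y / Z)


# Hypothetical values in meters: f = -0.05, point 2 m in front of the camera
po = project_pinhole((0.2, 0.1, 2.0), f=-0.05)
print(po)  # both coordinates change sign because f is negative
```

With a negative f, a point in the positive X, Y quadrant lands in the negative quadrant of the sensor, which is the image inversion mentioned above.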
From 2D point to pixel
To calculate the image pixel (x,y) corresponding to the previously calculated projected 2D point (po), we have to overlay that point onto the CCD (a matrix of pixels) and assign po to the nearest pixel (see figure 2).
To assign the correct pixel to po, we first have to calculate the pixel densities (Sx, Sy) of the CCD. To do that, we divide the number of pixels along each dimension by the CCD size along that dimension.

After that, we have to apply an offset correction, because pixel (0,0) is at the upper-left corner of the CCD while the origin of our coordinates (X and Y) is at the CCD center. Finally, we round to the nearest integer.
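The three steps above (pixel density, offset correction, rounding) can be sketched as follows. This is a minimal sketch under assumptions: the offset is taken as half the resolution in each dimension, the sensor Y axis is assumed to point in the same direction as the pixel row index, and the function names and example sensor values are hypothetical.

```python
def pixel_density(num_pixels, ccd_size):
    """Pixels per length unit along one dimension of the CCD."""
    return num_pixels / ccd_size


def to_pixel(po, resolution, ccd_size):
    """Map a 2D point po (length units, origin at the CCD center)
    to the nearest pixel (origin at the upper-left corner)."""
    nx, ny = resolution
    width, height = ccd_size
    sx = pixel_density(nx, width)
    sy = pixel_density(ny, height)
    # Assumed offset: the CCD center sits at half the resolution
    ox, oy = nx / 2, ny / 2
    # Note: Python's round() sends exact halves to the even integer
    return (round(sx * po[0] + ox), round(sy * po[1] + oy))


# Hypothetical sensor: 640 x 480 pixels on a 6.4 mm x 4.8 mm CCD
print(to_pixel((1.0, -0.5), (640, 480), (6.4, 4.8)))  # (420, 190)
```

With these values, Sx and Sy are both about 100 pixels per millimeter, so a point 1 mm to the right of the center falls 100 pixels to the right of column 320.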
Corrections – Skew factor
If the pixel columns and rows of the CCD are not aligned at right angles, the CCD is shaped like a parallelogram (see figure 4). We can characterize this problem with the skew factor.
The previous figure shows the poi formulas including the skew factor.
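To show how the skew factor fits together with the previous steps, the sketch below assembles an intrinsic matrix and applies it to a 3D point. The formulas of figure 4 are not reproduced in the text, so this assumes the common convention in which the skew occupies the first-row, second-column entry of the matrix and therefore shifts the x pixel coordinate in proportion to Y/Z. The function names, the parameter layout, and the numeric values are hypothetical, and the result is left unrounded.

```python
def intrinsic_matrix(f, sx, sy, ox, oy, skew=0.0):
    """Assumed layout of Kin: focal length scaled by the pixel densities,
    skew in the (0, 1) entry, offsets in the last column."""
    return [
        [f * sx, skew, ox],
        [0.0, f * sy, oy],
        [0.0, 0.0, 1.0],
    ]


def project_to_pixel(K, point):
    """Multiply K by (X, Y, Z) and divide by the third component."""
    X, Y, Z = point
    u, v, w = (row[0] * X + row[1] * Y + row[2] * Z for row in K)
    return (u / w, v / w)


# Hypothetical values: f = -50 mm, 100 pixels/mm, image center at (320, 240)
K_ideal = intrinsic_matrix(-50.0, 100.0, 100.0, 320.0, 240.0)
K_skewed = intrinsic_matrix(-50.0, 100.0, 100.0, 320.0, 240.0, skew=100.0)
point = (100.0, 50.0, 5000.0)  # millimeters, referenced to the optical center
print(project_to_pixel(K_ideal, point))
print(project_to_pixel(K_skewed, point))
```

Under this convention only the x pixel coordinate changes when the skew is nonzero; the y coordinate is the same in both cases.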
Aspect ratio
No aspect ratio correction is needed, because the aspect ratio information is already contained in Sx and Sy.
However, when working with images (especially video), it is very important to know the following concepts:
- Display Aspect Ratio (DAR): the horizontal size of the image (in length units) divided by its vertical size (directly related to the CCD size). TV usually uses DAR = 4/3 = 1.33 or DAR = 16/9 = 1.78.
- Storage Aspect Ratio (SAR): the number of horizontal active pixels divided by the number of vertical active pixels (for example, PAL TV uses SAR = 720/576 = 1.25).
- Pixel Aspect Ratio (PAR): DAR / SAR (for instance, widescreen PAL TV uses DAR = 16/9 and SAR = 720/576, so PAR = DAR/SAR = 1.422).
PAR = 1 means that the pixel is a perfect square.
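The relation between the three ratios can be checked with a few lines of Python. This is a small sketch using exact fractions; the function name is hypothetical, and the 4/3 case is added only for comparison with the widescreen example above.

```python
from fractions import Fraction


def pixel_aspect_ratio(dar, sar):
    """PAR = DAR / SAR."""
    return dar / sar


sar_pal = Fraction(720, 576)                              # reduces to 5/4
par_wide = pixel_aspect_ratio(Fraction(16, 9), sar_pal)   # widescreen PAL
par_std = pixel_aspect_ratio(Fraction(4, 3), sar_pal)     # 4/3 PAL

print(par_wide, round(float(par_wide), 3))  # 64/45 1.422
print(par_std, round(float(par_std), 3))    # 16/15 1.067
```

Neither value equals 1, so in both cases the pixels are not square.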