The world-to-camera transformation matrix is the inverse of the camera-to-world matrix. The camera-to-world matrix is the combination of a translation to the camera's position and a rotation to the camera's orientation. Thus, if M is the 3x3 rotation matrix corresponding to the camera's orientation and t is the camera's position, then the 4x4 camera-to-world matrix is:
$$
\begin{bmatrix}
M_{00} & M_{01} & M_{02} & t_x \\
M_{10} & M_{11} & M_{12} & t_y \\
M_{20} & M_{21} & M_{22} & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
$$
Note that I've assumed that vectors are column vectors which are multiplied on the right to perform transformations. If you use the opposite convention, make sure to transpose the matrix.
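As a minimal sketch of that layout, here is how the matrix could be assembled and applied in plain Python. The function names `camera_to_world` and `transform_point` are made up for illustration, and the code assumes the column-vector convention described above.

```python
def camera_to_world(M, t):
    """Build the 4x4 camera-to-world matrix from a 3x3 rotation M
    (nested lists, row-major) and a camera position t = (tx, ty, tz)."""
    return [
        [M[0][0], M[0][1], M[0][2], t[0]],
        [M[1][0], M[1][1], M[1][2], t[1]],
        [M[2][0], M[2][1], M[2][2], t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(T, p):
    """Apply a 4x4 transform T to a 3D point p, treated as the
    column vector (x, y, z, 1). Returns the transformed (x, y, z)."""
    v = (p[0], p[1], p[2], 1.0)
    return [sum(T[i][j] * v[j] for j in range(4)) for i in range(3)]
```

A quick sanity check: the camera's own origin, (0, 0, 0) in camera space, should land on the camera's position t in world space.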
To find M, you can use one of the formulas listed on Wikipedia, depending on your particular convention for roll, pitch, and yaw. Keep in mind that those formulas use the convention that vectors are row vectors which are multiplied on the left.
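For illustration only, here is a sketch that builds M by composing the three elemental rotations. It assumes one particular convention that may not match yours: column vectors, a right-handed coordinate system, roll about x, pitch about y, yaw about z, applied in that order (so M = Rz(yaw) · Ry(pitch) · Rx(roll)), with angles in radians. The function names are hypothetical.

```python
import math


def matmul3(A, B):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_from_rpy(roll, pitch, yaw):
    """Return M = Rz(yaw) * Ry(pitch) * Rx(roll) for column vectors.
    This axis assignment and ordering is an assumed convention."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    Rx = [[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]]
    Ry = [[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]]
    Rz = [[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]]
    return matmul3(Rz, matmul3(Ry, Rx))
```

If your engine uses a different axis assignment or rotation order, only the three elemental matrices and the order of the two multiplications need to change.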
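Instead of computing the camera-to-world matrix and inverting it with a general-purpose routine, a more efficient (and numerically stable) alternative is to calculate the world-to-camera matrix directly. Because M is a rotation, its inverse is simply its transpose, so the world-to-camera matrix has the transpose of M as its rotation block and -(M^T)t as its translation column. Equivalently, you can invert the camera's position (by negating all 3 coordinates) and its orientation (by negating the roll, pitch, and yaw angles), but be careful with the order: the negated elemental rotations must be applied in the reverse order of the original ones, and the negated translation must be applied before the inverse rotation rather than after it. Simply plugging the negated values into the same algorithm does not, in general, produce the inverse.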
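A minimal sketch of the direct computation, using the fact that the inverse of a rotation matrix is its transpose, so the inverse of the rigid transform [M | t] is [M^T | -(M^T)t]. The function names are hypothetical, and the column-vector convention from above is assumed.

```python
def world_to_camera(M, t):
    """Build the 4x4 world-to-camera matrix directly from the camera's
    3x3 rotation M and position t, without a general matrix inverse."""
    # Transpose of M is the inverse rotation.
    Mt = [[M[j][i] for j in range(3)] for i in range(3)]
    # Translation column of the inverse: -(M^T) t.
    inv_t = [-sum(Mt[i][j] * t[j] for j in range(3)) for i in range(3)]
    return [
        Mt[0] + [inv_t[0]],
        Mt[1] + [inv_t[1]],
        Mt[2] + [inv_t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply_transform(T, p):
    """Apply a 4x4 transform to a 3D point as the column vector (x, y, z, 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return [sum(T[i][j] * v[j] for j in range(4)) for i in range(3)]
```

A useful check is that the camera's world position maps to the origin of camera space, i.e. `apply_transform(world_to_camera(M, t), t)` is the zero vector up to rounding.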