OpenCVの座標系

座標系というやつはどう定義されているかが非常に重要です。従って参照にすべきはQiitaでもなく，このブログでもなく公式APIリファレンスです。

これに書いてあることが全てです。でもなかなか探しづらいのでここにメモを書きます。

前提of前提

超前提知識ですが，OpenCVでは以下の前提があります。

右手系
カメラの座標系において，左上が原点，横方向がX（右が正），画像の縦方向がY（下が正），奥行き方向がZ（奥が正）

OpenCVで使われる回転の作法

回転の記述法でメジャーどころと言えば以下の3つが挙げられます。

オイラー法によるDCM（Direct Cosine Matrix）
ロドリゲスの回転公式による3変数記述
クォータニオン

OpenCVではちょくちょく回転を表す3×1のベクトルを算出しますが，これは全てロドリゲスの回転公式に基づいています。

公式のcv2.Rodrigues()のレファレンスを覗くと

f:id:ossyaritoori:20190302003143p:plain

とあります。

日本語でいうと，「ベクトルのノルムが回転角，正規化されたベクトルが回転軸」を表します。

X軸[1,0,0]まわりに45度回転させたい場合は以下のように書けばいいわけです。

import cv2
import numy as np

R, Jacob = cv2.Rodrigues(np.array([math.pi/4,0,0]))
print(R)
#R = array([[ 1.        ,  0.        ,  0.        ],
#        [ 0.        ,  0.70710678, -0.70710678],
#        [ 0.        ,  0.70710678,  0.70710678]])

なお，cv2.Rodriguesは返り値を2つ返すので注意。

従って以下のようなオイラー角のDCM変換を手で書いている人はOpenCV上では使うのをやめた方がいいです。ぱっと書きやすいので私は好きなんですが…

# Calculates Rotation Matrix given euler angles.
def eulerAnglesToRotationMatrix(theta) :
     
    R_x = np.array([[1,         0,                  0                   ],
                    [0,         math.cos(theta[0]), -math.sin(theta[0]) ],
                    [0,         math.sin(theta[0]), math.cos(theta[0])  ]
                    ])
         
         
                     
    R_y = np.array([[math.cos(theta[1]),    0,      math.sin(theta[1])  ],
                    [0,                     1,      0                   ],
                    [-math.sin(theta[1]),   0,      math.cos(theta[1])  ]
                    ])
                 
    R_z = np.array([[math.cos(theta[2]),    -math.sin(theta[2]),    0],
                    [math.sin(theta[2]),    math.cos(theta[2]),     0],
                    [0,                     0,                      1]
                    ])
                     
                     
    R = np.dot(R_z, np.dot( R_y, R_x ))
 
    return R

# Checks if a matrix is a valid rotation matrix.
def isRotationMatrix(R) :
    Rt = np.transpose(R)
    shouldBeIdentity = np.dot(Rt, R)
    I = np.identity(3, dtype = R.dtype)
    n = np.linalg.norm(I - shouldBeIdentity)
    return n < 1e-6
 
 
# Calculates rotation matrix to euler angles
# The result is the same as MATLAB except the order
# of the euler angles ( x and z are swapped ).
def rotationMatrixToEulerAngles(R) :
 
    assert(isRotationMatrix(R))
     
    sy = math.sqrt(R[0,0] * R[0,0] +  R[1,0] * R[1,0])
     
    singular = sy < 1e-6
 
    if  not singular :
        x = math.atan2(R[2,1] , R[2,2])
        y = math.atan2(-R[2,0], sy)
        z = math.atan2(R[1,0], R[0,0])
    else :
        x = math.atan2(-R[1,2], R[1,1])
        y = math.atan2(-R[2,0], sy)
        z = 0
 
    return np.array([x, y, z])

OpenCVでの座標変換計算の数式

前節でOpenCVがどのような形で座標を扱うことを想定しているかがわかりました。次はそれがどのように使われることを想定されているかです。

例えばOpenCVではカメラの位置を図るような関数がいくつか存在します。既知の3次元点と投影された画像の点との対応からカメラの姿勢を計算するPNP問題を解く場合が特にそうです。

心当たりがあるのは

SolvePNP系統
arucoモジュールのEsitmatePose系

でしょうか。

rvecやtvecの扱い

これらでは得られたカメラの位置として返り値にrvecsとtvecsという3×1のベクトルを返してきます。気になるのはこれらは果たしてどの座標系に属しているのかということです。（都合等をよく考えたらカメラ座標系にきまっているんですが）

以前の記事でもカメラ変位をカメラtoグローバルなのかグローバルtoカメラなのかで扱いが変わることを書いたと思います。 ossyaritoori.hatenablog.com

結論から言うとrvecsやtvecsはカメラのLocal座標系から見た値で表されています。従って，

OpenCVで測ったカメラ姿勢 $R_c=Rodrigues(rvec_c)^\top$ から測った回転物の姿勢 $R_o=Rodrigues(rvec_o)^\top$ はグローバル座標系では

$^g R_{o} = R_c R_o$

と計算できるわけです。

検証用投影プログラム

自分はちょっと疑い深いので実際に計算して確かめました。

三次元の点（グローバル座標）をカメラの視点に投影するcv2.projectPoints()という関数があります。 imgpts, jac = cv2.projectPoints(3Dpoints, rvecs, tvecs, K, dist)のように使えば3次元点を2次元に投影することができます。

f:id:ossyaritoori:20190302005627p:plain — 親の顔よりみた式

この中身をDistortionを含めないで書くならこういう風になります。

tc=tvecs
Rc,_=cv2.Rodrigues(rvecs)
def camProjection(points,Rc,tc,K=np.eye(3),campose_is_global=0):
    '''
    Input: CameraPose, Camera Matrix, Object Points in Global Coordinate
    Output: Projected Points
    '''
    h,w = points.shape
    if h != 3:
        print("Input do not match points style!")
    if campose_is_global:#assume Rc,tc is from global frame
        Proj = K.dot(np.concatenate([Rc.T, -np.dot(Rc.T,tc)], axis=1))
    else:#assume Rc,tc is from camera frame( opencv default)
        Proj = K.dot(np.concatenate([Rc, tc], axis=1))        
    pts = np.vstack((points,np.ones((1,w))))
    converted_pts= Proj.dot(pts)
    return Projection(converted_pts)

def Projection(points):
    h,w = points.shape
    if h != 3:
        print("Input do not match points style!")
    Dep = points[2,:]
    Depth = np.tile(Dep,(2,1))
    return points[0:2,:]/Depth