for Robot Artificial Intelligence

Difference Between CPU Physical Cores and Virtual Cores


Virtual cores are also called logical cores.

A CPU core contains hardware threads, and a single core can hold several threads.

These are the physical threads.

Explanation

A virtual thread (logical thread) is what we commonly allocate to a specific program when developing software.
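
As a quick check of the distinction, the physical and logical core counts can be queried programmatically. A minimal sketch in Python, assuming the third-party psutil package is installed (os.cpu_count alone only reports logical cores):

import os
import psutil  # third-party package; assumed to be installed

logical = os.cpu_count()                    # logical (virtual) cores, i.e. hardware threads
physical = psutil.cpu_count(logical=False)  # physical cores only

print("physical cores :", physical)
print("logical cores  :", logical)
print("threads per core:", logical // physical)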


Why Use the Boost Library



SLAM KR QUESTION AND ANSWER


  • GN (Gauss-Newton) is faster than LM (Levenberg-Marquardt)
    • but it falls into local minima more easily
    • if the initialization is precise and local minima are mostly absent, choose GN
  • The matching in the SuperPoint paper is plain NN matching; SuperGlue takes the descriptors extracted by SuperPoint as input and matches them with a graph neural network. In other words, SuperPoint + SuperGlue does not use nearest-neighbor matching (KD-tree, Euclidean distance; an ML-style approach) but a DL-based approach.

  • Gimbal lock is the reason Euler angles, with their 3 parameters, are not used in optimization. Three parameters would be computationally cheaper than the 4-parameter quaternion, but because of gimbal lock the representation cannot be differentiated everywhere.
    • During optimization the parameter values change freely, and it is hard to keep the properties of a rotation matrix while they do; with quaternions this is comparatively easy.
  • In the angular-velocity formula in VINS-Mono, a 0 goes into the lower-right of the Omega function; in that case the imaginary part of the quaternion comes first and the real part last, q = (q_x, q_y, q_z, q_w).

  • If you differentiate a quaternion rotation expression such as p(t) = q(t)(p,1)q(t)^t, you get an expression related to angular velocity, and as I recall it came out as a pure imaginary quaternion of the form (w, 0); the angular velocity is not a unit quaternion and has nothing to do with an angle either. It helps to interpret a quaternion by distinguishing whether it is a unit quaternion representing a rotation or a quaternion representing some other quantity such as angular velocity.

  • People who are good at multi-threading are probably also good at optimizing production pipelines.

  • Why natural landmarks are hard to use: their geometry is not easy to estimate, and it is hard to find a correlation with an absolute ID.
  • ARM_NEON_x_x86: ARM NEON accelerated operations normally cannot run on a PC, but this library lets those accelerated operations run directly on a PC (x86).

  • The cross product carries the meaning of an area, so the amount of change between two vectors can be measured with the cross product.
  • For example, if the coordinates transformed by optical flow and the coordinates transformed by the pose are exactly the same, the two vectors have the same direction and the cross product is 0.

It is “naive” in that the language and notations are those of ordinary informal mathematics, and in that it does not deal with consistency or completeness of the axiom system. Likewise, an axiomatic set theory is not necessarily consistent: not necessarily free of paradoxes.

  • The VO process, given 2D frames: feature extraction → feature matching (NN, RANSAC) → eight-point algorithm → fundamental matrix → essential matrix → SVD → relative pose between the two frames → triangulation from the two camera poses and the 2D correspondence points → PnP for the c2 pose → initialization (SfM) is then finished
    • After that, the pipeline repeats the same steps: build triangulations and estimate the pose with PnP.
  • EKF localization (GitHub, Probabilistic Robotics)
  • The unit of depth is meters.
  • Lifelong Graph Learning: interprets feature matching as a temporally growing graph and applies graph-learning techniques to it.
  • Marker tracking and calibration need at least 4 points. The surface normal is computed, and then the rotation and translation are computed.
  • Orientation literally means the camera's current rotation matrix.
  • A checkerboard or tracker is planar, so it should not be tilted; the direction of its surface normal changes.
  • Real root: in other words, the least-squares solution amounts to finding a real root.
  • solvepnp: given 3D points and their corresponding 2D points, compute the pose (see the sketch after this list).
    • If the correspondences are wrong, incorrect values can come out.
    • If the 3D points all lie on a plane, their y coordinates are nearly identical, which makes the pose hard to recover.
    • PnP is solved from 2D-3D correspondences; in the situation above it effectively degenerates to 2D-2D and becomes numerically unstable.
    • Because the y coordinates are almost the same, the points lie on a plane and are numerically equivalent to points in 2D.
  • Save Trajectory saves the tracking pose for every frame.
  • The pixel difference between corresponding feature points in two images is the parallax.
  • Normal-equation form
  • Methods that involve Bayes' theorem use the Markov assumption to estimate the probability.
  • In an environment flooded with light, stop the robot until the light passes, switch to a tracking-lost state, and then extract new features so that relocalization against the landmarks is possible (the light-flooded frames should be discarded).
    • Alternatively, the motion during that period can be substituted with IMU or wheel-encoder information.
    • It may also be solved if the auto exposure reacts quickly enough.
  • Being positive definite roughly means there is a minimum point: an optimal value exists, and there is a way to reach the answer by descending along the manifold.
  • In BA, for the normal-equation system to be marginalized with Schur elimination and then solved with a Cholesky or LDLT decomposition, the marginalized matrix needs to be positive definite; the reason the marginalization result always satisfies this condition seems to come down to positive semi-definiteness.
  • p3p + RANSAC is faster than p6p, but RANSAC is not steady: with luck it finishes in one iteration, with bad luck it takes many.
  • Difference between a local descriptor and a global descriptor
    • A global descriptor is one descriptor per image, so it is used to check how similar two images are; a local descriptor is used for point-to-point matching between images, e.g. estimating a position with PnP.
  • Ceres Solver 2.1 is out, with manifold computation support and CUDA support.
  • USB devices can be distinguished via rules.d (udev rules).
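
To make the solvePnP note above concrete, here is a minimal OpenCV sketch. The 3D points, intrinsics, and ground-truth pose are invented for illustration only; a made-up pose is used to project the points so that the 2D-3D correspondences are consistent.

import numpy as np
import cv2

# Hypothetical, non-coplanar 3D points in the object/world frame.
# (If they all lay on a single plane, PnP would degenerate as noted above.)
object_points = np.array([[0.0, 0.0, 0.0],
                          [1.0, 0.0, 0.2],
                          [0.0, 1.0, 0.4],
                          [1.0, 1.0, 0.9],
                          [0.5, 0.2, 1.3],
                          [0.2, 0.8, 0.6]], dtype=np.float64)

# Assumed pinhole intrinsics and zero distortion (illustrative values only).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)

# Make up a ground-truth pose and project the points to get 2D correspondences.
rvec_gt = np.array([[0.1], [-0.2], [0.05]])
tvec_gt = np.array([[0.3], [-0.1], [4.0]])
image_points, _ = cv2.projectPoints(object_points, rvec_gt, tvec_gt, K, dist)

# Recover the pose from the 2D-3D correspondences.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist,
                              flags=cv2.SOLVEPNP_ITERATIVE)
R, _ = cv2.Rodrigues(rvec)   # rotation vector -> rotation matrix
print(ok, rvec.ravel(), tvec.ravel())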


Some Useful Commands for SSH, tar, and Linux


Transfer data or an archive to the SSH server

sudo scp something.tar.gz chan@xxx.xxx.xx.xxx:/home/changyo/

Transfer data from the server to the local computer

scp chan@xxx.xxx.xx.xxx:/home/changyo/2022-02-04-15-34-51.bag /home/chan

Extract a tar archive

tar xf something-noetic.tar.gz -C /opt/rit/

Linux

Search shell history in the terminal: Ctrl + R

Ctrl+Shift+F: paste the copied text.

https://askubuntu.com/questions/180336/how-to-find-the-process-id-pid-of-a-running-terminal-program

https://askubuntu.com/questions/797957/how-to-kill-a-daemon-process-in-linux

https://stackoverflow.com/questions/46479196/best-practice-pid-file-for-unix-daemon


Understanding Realsense



https://github.com/IntelRealSense/librealsense/blob/master/doc/depth-from-stereo.md

Using this tutorial you will learn the basics of stereoscopic vision, including block-matching, calibration and rectification, depth from stereo using opencv, passive vs. active stereo and relation to structured light.

Why Depth?

Regular consumer web-cams offer streams of RGB data within the visible spectrum. This data can be used for object recognition and tracking, as well as some basic scene understanding.

Even with machine learning, grasping the exact dimensions of physical objects is a very hard problem.

This is where depth cameras come-in. The goal of a depth camera is to add a brand-new channel of information, with distance to every pixel. This new channel can be used just like the rest (for training and image processing) but also for measurement and scene reconstruction.

Stereoscopic Vision

Depth from Stereo is a classic computer vision algorithm inspired by human binocular vision system. It relies on two parallel view-ports and calculates depth by estimating disparities between matching key-points in the left and right images:

Depth from Stereo algorithm finds disparity by matching blocks in left and right images
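
In quantitative terms, the relation used below to convert disparity to depth is: with focal length f_x in pixels, baseline B, and disparity d in pixels, the depth of a pixel is

Z = (f_x * B) / d

so larger disparities correspond to closer objects (the code additionally rescales by a depth-units factor).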

The most naive implementation of this idea is the SSD (Sum of Squared Differences) block-matching algorithm:

import numpy as np

fx = 942.8        # lens focal length in pixels
baseline = 54.8   # distance in mm between the two cameras
disparities = 64  # number of disparities to consider
block = 15        # block size to match
units = 0.001     # depth units

# left and right are assumed to be rectified grayscale images of equal shape
disparity = np.zeros(shape=left.shape).astype(float)

for i in range(block, left.shape[0] - block - 1):
    for j in range(block + disparities, left.shape[1] - block - 1):
        ssd = np.empty([disparities, 1])

        # calc SSD at all possible disparities
        l = left[(i - block):(i + block), (j - block):(j + block)]
        for d in range(0, disparities):
            r = right[(i - block):(i + block), (j - d - block):(j - d + block)]
            ssd[d] = np.sum((l[:, :] - r[:, :]) ** 2)

        # select the best match (disparity with the smallest SSD)
        disparity[i, j] = np.argmin(ssd)

# Convert disparity to depth
depth = np.zeros(shape=left.shape).astype(float)
depth[disparity > 0] = (fx * baseline) / (units * disparity[disparity > 0])
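
The block above assumes left and right already exist as rectified grayscale arrays. A minimal way to load them beforehand, assuming hypothetical file names left.png and right.png:

import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE).astype(float)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE).astype(float)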

Rectified image pair used as input to the algorithm

Depth map produced by the naive SSD block-matching implementation

Point-cloud reconstructed using SSD block-matching

There are several challenges that any actual product has to overcome:

  1. Ensuring that the images are in fact coming from two parallel views
  2. Filtering out bad pixels where matching failed due to occlusion
  3. Expanding the range of generated disparities from a fixed set of integers to achieve sub-pixel accuracy

Calibration and Rectification

In reality having two exactly parallel view-ports is challenging. While it is possible to generalize the algorithm to any two calibrated cameras (by matching along epipolar lines), the more common approach is image rectification. During this step left and right images are reprojected to a common virtual plane:

Software Stereo

The opencv library has everything you need to get started with depth:

  1. calibrateCamera can be used to generate extrinsic calibration between any two arbitrary view-ports
  2. stereoRectify will help you rectify the two images prior to depth generation
  3. StereoBM (block matching) and StereoSGBM (stereo processing by semi-global matching and mutual information) can be used for disparity calculation
  4. reprojectImageTo3D projects the disparity image to 3D space
import numpy as np
import cv2

fx = 942.8          # lens focal length in pixels
baseline = 54.8     # distance in mm between the two cameras
disparities = 128   # num of disparities to consider
block = 31          # block size to match
units = 0.001       # depth units

# left and right are assumed to be rectified 8-bit grayscale images
sbm = cv2.StereoBM_create(numDisparities=disparities,
                          blockSize=block)

# StereoBM returns fixed-point disparities (scaled by 16), so convert back
disparity = sbm.compute(left, right).astype(float) / 16.0

depth = np.zeros(shape=left.shape).astype(float)
depth[disparity > 0] = (fx * baseline) / (units * disparity[disparity > 0])
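
The remaining steps from the list above (rectification before matching, and reprojection to 3D afterwards) can be sketched as follows. This is a minimal sketch assuming K1, D1, K2, D2, R, T come from a prior calibration step, image_size is (width, height), left/right are the raw images, and disparity is the float disparity map computed above; the variable names are illustrative only.

import cv2

# Rectify both views onto a common virtual image plane.
R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K1, D1, K2, D2,
                                                  image_size, R, T)

map1x, map1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, image_size, cv2.CV_32FC1)
left_rect = cv2.remap(left, map1x, map1y, cv2.INTER_LINEAR)
right_rect = cv2.remap(right, map2x, map2y, cv2.INTER_LINEAR)

# After computing a disparity map (e.g. with StereoBM above),
# reproject it to a 3D point cloud using the Q matrix from stereoRectify.
points_3d = cv2.reprojectImageTo3D(disparity, Q)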

Passive vs Active Stereo

The quality of the results you will get with this algorithm depends primarily on the density of visually distinguishable points (features) for the algorithm to match. Any source of texture, natural or artificial will improve the accuracy significantly.

That’s why it is extremely useful to have an optional texture projector (usually adding details outside of the visible spectrum). As an added benefit, such projector can be used as an artificial source of light at night or in the dark.

Input images illuminated with texture projector

Left: opencv stereobm without projector. Right: stereobm with projector.

Remark on Structured-Light

Structured-Light is an alternative approach to depth from stereo. It relies on recognizing a specific projected pattern in a single image.

For customers interested in a structured-light solution Intel offers the SR300 RealSense camera

Having certain benefits, structured-light solutions are known to be fragile since any external interference (from either sunlight or another structured-light device) will prevent you from getting any depth.

In addition, since the laser projector has to illuminate the entire scene, power consumption grows with range, often requiring a dedicated power source.

Depth from stereo on the other hand, only benefits from multi-camera setup and can be used with or without projector.

D400 Intel RealSense cameras

D400 RealSense cameras offer the following basic features:

  1. The device comes fully calibrated, producing hardware-rectified pairs of images
  2. All depth calculations are done by the camera at up to 90 FPS
  3. The device offers sub-pixel accuracy and a high fill-rate
  4. There is an on-board texture projector for tough lighting conditions
  5. The device runs off a standard USB 5 V power source, drawing around 1-1.5 W

This product was designed from the ground up to address conditions critical to robotics / drones developers and to overcome the limitations of structured light.

Depth-map using D415 Intel RealSense camera

RealSense pattern matching

Because RealSense depth data is generated by pattern matching between the stereo cameras (combined with the structured-light projector), if the camera is looking at a scene that itself contains a repeating pattern, the depth data can come out wrong due to the way the pattern matching works.

Realsense Docs

https://github.com/IntelRealSense/librealsense/tree/master/doc

What is active infrared (IR) stereo

The camera “sees” both IR and visible light and performs dense stereo matching based on features it can see. In visible wavelength, such features would be visible features, like corners, texture, edges, etc…

In infrared, it also can pick up on artificial features generated by the projector (in addition to natural features)

https://www.researchgate.net/figure/The-depth-perception-based-on-active-infrared-stereo-vision-technology_fig2_328507744

To improve the 3D point cloud modeling of complex indoor environments, a novel point cloud registration method based on the Intel RealSense depth camera is proposed in this paper, which can reduce the influence of measuring errors. The Intel RealSense depth camera adopts active infrared (IR) stereo vision technology, which is shown in Figure 2. The depth perception based on stereo vision is implemented by two image sensors and an infrared projector. The infrared projector projects a non-visible structured IR pattern to improve depth accuracy in scenes. The depth calculation process is presented in the right of Figure 2. The depth image processor obtains the scene data from the two image sensors, and the depth value for each pixel can be calculated by correlating the points on the left image to those on the right image.

https://www.google.com/imgres?imgurl=https%3A%2F%2Fwww.laser2000.de%2F518-thickbox_default%2Fosela-random-pattern-projector-the-premium-laser-for-structured-lighting.jpg&imgrefurl=https%3A%2F%2Fwww.laser2000.de%2Fen%2Flaser-diode-modules%2F344-osela-random-pattern-projector-the-premium-laser-for-structured-lighting.html&tbnid=KBpXOuVs7WZ2KM&vet=12ahUKEwi_i7ScqMT2AhWRTPUHHbxmBFsQMygLegUIARDOAQ..i&docid=A9kYlC-CWSts5M&w=800&h=800&q=structured IR pattern&hl=en&ved=2ahUKEwi_i7ScqMT2AhWRTPUHHbxmBFsQMygLegUIARDOAQ

https://www.google.com/imgres?imgurl=https%3A%2F%2Fstatic-01.hindawi.com%2Farticles%2Fjs%2Fvolume-2014%2F852621%2Ffigures%2F852621.fig.001.jpg&imgrefurl=https%3A%2F%2Fwww.hindawi.com%2Fjournals%2Fjs%2F2014%2F852621%2F&tbnid=6JxHaQD2t-4Z7M&vet=12ahUKEwi_i7ScqMT2AhWRTPUHHbxmBFsQMygCegUIARC8AQ..i&docid=8-jTry2NDHZNM&w=600&h=360&q=structured IR pattern&hl=en&ved=2ahUKEwi_i7ScqMT2AhWRTPUHHbxmBFsQMygCegUIARC8AQ
