April 2023

Year 3

Posted by Daniela Rato on April 24, 2023

Improvements to HPE optimization, creation of evaluation scripts, and comparison with other HPE algorithms

Improvements to the HPE optimization algorithm

Since last month, we have added two new optimization functionalities (each can be turned on and off):

  • Frame to frame residuals: When frames are consecutive, the optimization is constrained by the joint positions of the previous frame. Movement from frame to frame should be small and smooth, so large distances between the same joint in consecutive frames are penalized.
  • Link length residuals: The length of each link should not be much higher than the average length of that link over all the previously optimized frames.
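
The two residual types above can be sketched as follows. This is only an illustration under assumptions, not the code used in the optimization: the function names, the dictionary layout (joint name to `(x, y, z)`), the `weight` parameter, and the choice to penalize only the excess over the average link length are all hypothetical.

```python
import math


def frame_to_frame_residuals(current, previous, weight=1.0):
    """One residual per joint: weighted distance moved since the previous frame.

    `current` and `previous` map joint names to (x, y, z) tuples (assumed layout).
    """
    return [
        weight * math.dist(current[joint], previous[joint])
        for joint in sorted(current)
        if joint in previous
    ]


def link_length_residuals(joints, links, mean_lengths, weight=1.0):
    """One residual per link: weighted excess over the average link length.

    `links` is a list of (parent, child) joint names and `mean_lengths` maps
    each link to its average length over the past optimized frames.
    Links with no history yet contribute a zero residual.
    """
    residuals = []
    for parent, child in links:
        length = math.dist(joints[parent], joints[child])
        mean = mean_lengths.get((parent, child))
        if mean is None:
            residuals.append(0.0)
        else:
            # Assumption: only lengths above the average are penalized.
            residuals.append(weight * max(0.0, length - mean))
    return residuals
```

Both functions return plain lists, so they could be appended to whatever residual vector the optimizer already minimizes.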

Ground truth

Since we are currently using simulation, I've created a script that, while extracting the images of each camera, also listens to the gazebo/link_states topic and extracts the Gazebo ground truth for the timestamp of each image. The difference in rates is not a problem: this topic publishes at 500 Hz and the images are published at 20 Hz, so there is always a ground truth message very close to each image timestamp. Each dataset should therefore have the folders with the images and also a file where the ground truth poses for each joint are written.

These poses are used in the optimization script to draw the ground truth skeleton for each frame in the visualization, which makes it possible to see whether the optimization is succeeding.

The following video shows, in black, the ground truth skeleton in the 3D visualization and its projections in the camera images.
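
To illustrate why the rate difference is harmless: at 500 Hz the ground truth messages are 2 ms apart, so the closest message is at most about 1 ms away from any image timestamp. A minimal sketch of the lookup, assuming the ground truth has already been stored as a sorted list of timestamps with a parallel list of states (the function name and data layout are hypothetical, not taken from the actual script):

```python
from bisect import bisect_left


def nearest_state(image_stamp, state_stamps, states):
    """Return the ground truth state whose timestamp is closest to the image's.

    `state_stamps` must be sorted in ascending order and have the same
    length as `states`.
    """
    if not state_stamps:
        raise ValueError("no ground truth messages recorded")
    i = bisect_left(state_stamps, image_stamp)
    if i == 0:
        return states[0]
    if i == len(state_stamps):
        return states[-1]
    before, after = state_stamps[i - 1], state_stamps[i]
    # Pick whichever neighbour is closer in time to the image.
    if after - image_stamp < image_stamp - before:
        return states[i]
    return states[i - 1]
```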

Creation of evaluation scripts

I created two different evaluation scripts: one to evaluate the 3D pose and one to evaluate the 2D detections. The 3D evaluation script produces tables with the RMS, X, Y and Z errors (in meters) for each joint in each frame, together with the average per frame (that is, per skeleton), and tables with the average RMS, X, Y and Z errors for each joint over all the frames. It also shows, for each frame, an image of the ground truth versus optimized skeleton joint locations.
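
As a rough sketch of how such a per-frame table could be computed: the post does not state how the RMS column is defined, so the version below assumes it is the root mean square of the three per-axis errors. The function names and the dictionary layout (joint name to `(x, y, z)` in meters) are also assumptions.

```python
import math


def joint_errors(optimized, ground_truth):
    """Per-joint absolute X, Y, Z errors and their RMS for one frame.

    Only joints present in both skeletons are evaluated.
    """
    table = {}
    for joint in sorted(set(optimized) & set(ground_truth)):
        dx, dy, dz = (
            abs(o - g) for o, g in zip(optimized[joint], ground_truth[joint])
        )
        # Assumption: RMS is taken over the three axis errors.
        rms = math.sqrt((dx * dx + dy * dy + dz * dz) / 3.0)
        table[joint] = {"x": dx, "y": dy, "z": dz, "rms": rms}
    return table


def frame_average(table):
    """Average each error column over the joints of one frame (one skeleton)."""
    if not table:
        raise ValueError("no matching joints to average")
    n = len(table)
    return {
        key: sum(row[key] for row in table.values()) / n
        for key in ("x", "y", "z", "rms")
    }
```

Averaging the same rows per joint across frames, instead of per frame across joints, would give the second kind of table.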

...

Please note that the Gazebo skeletons are different from the OpenPose skeletons, so for now I selected only the matching joints for evaluation. The visualization shows light gray link connections simply to make the skeleton easier to read; they carry no meaning and are not used in any way. Yellow lines are the connections between the same joint in the detected and ground truth skeletons.

The 2D detections evaluation script produces tables that contain the RMS, X, Y and Z errors (in pixels) for each joint, each frame and each camera, and the average per frame and per camera (that is, per skeleton per camera). It also outputs a table with the average indicators for each camera over all the frames. This script also shows the camera images with the detected points and the ground truth projections.
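
For illustration, a ground truth projection and the pixel error against a detection could be computed as below. This is a sketch under assumptions: it uses an ideal pinhole camera without distortion, expects the 3D point already expressed in the camera frame, and the function names and intrinsic parameters (`fx`, `fy`, `cx`, `cy`) are hypothetical rather than taken from the evaluation script.

```python
import math


def project_point(point_cam, fx, fy, cx, cy):
    """Project a 3D point in the camera frame to pixel coordinates (u, v).

    Returns None when the point is behind the camera.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def pixel_error(detected, projected):
    """Absolute error along each image axis and the combined pixel distance."""
    du = abs(detected[0] - projected[0])
    dv = abs(detected[1] - projected[1])
    return du, dv, math.hypot(du, dv)
```

A short usage example with made-up intrinsics:

```python
gt_pixel = project_point((0.5, -0.25, 2.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(pixel_error((448.0, 173.5), gt_pixel))
```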

...

In the image, the squares represent the ground truth projections and the crosses represent the detected joints.

You can check further details and tables here.

Comparison with other algorithms

To compare our method with different algorithms, we need to convert their output into a json file similar to the one output by our method, so that our evaluation scripts can be used.

Currently, I am trying to run the OpenPose 3D, MixSTE and MotionBert algorithms on our simulated datasets. These algorithms were selected on the Papers with Code website by checking the best 3D HPE algorithms; if I am unable to replicate them, I will move to different ones.

So far, I've made some progress with implementing the MotionBert algorithm, but the results were not as expected, as you can see in the next videos. You can read more details about the implementation in this issue.
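
A conversion step of this kind might look like the sketch below. The json layout shown here (frame index, then joint name, then `x`, `y`, `z`) is purely hypothetical, since the post does not describe the actual file format; the function name and arguments are assumptions as well.

```python
import json


def to_common_format(frames, joint_names):
    """Convert per-frame joint lists into a nested dictionary.

    `frames` is a list with one entry per frame; each entry is a list of
    (x, y, z) tuples ordered like `joint_names`.
    """
    output = {}
    for index, joints in enumerate(frames):
        if len(joints) != len(joint_names):
            raise ValueError(f"frame {index}: unexpected number of joints")
        output[str(index)] = {
            name: {"x": x, "y": y, "z": z}
            for name, (x, y, z) in zip(joint_names, joints)
        }
    return output


def save_json(data, path):
    """Write the converted poses to disk (hypothetical helper)."""
    with open(path, "w") as handle:
        json.dump(data, handle, indent=2)
```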

On-going tasks

  • Adding the link length restrictions to the optimization (optimization weights are very low when compared to other algorithms; this needs fixing)
  • Calibrate an entire video and output a video of the 3D pose
  • Discuss the possibility of describing the skeleton with Denavit–Hartenberg parameters
  • Compare with different algorithms
  • Improve the first guess using the optimized skeleton of the last frame when frames are consecutive

Issues