Video Inference on TFLite Pose Estimation using the MoveNet Lightning Model
MoveNet is an ultra-fast and accurate model that detects 17 keypoints of a body. The model is offered on TF Hub in two variants, known as Lightning and Thunder.
Lightning, as its name suggests, is built for fast inference, whereas Thunder is intended for higher accuracy. Both models run faster than real time (30+ FPS) on most modern desktops, laptops, and phones, which is crucial for live fitness, health, and wellness applications.
Today, we will look at how to run inference with TFLite models on videos.
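Before diving into the setup, the overall video-inference loop can be sketched as below. This is a rough sketch, not the repository's exact code; the model path, video path, and input size are placeholder assumptions (the imports are deferred into the function so the sketch stays importable without OpenCV or TensorFlow installed):

```python
def run_video(model_path="movenet_lightning.tflite", video_path="input.mp4", input_size=192):
    """Run a MoveNet TFLite model on every frame of a video (sketch; paths are placeholders)."""
    import cv2                     # deferred imports: only needed when actually running
    import numpy as np
    import tensorflow as tf

    # Load the TFLite model and look up its input/output tensor details.
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    in_details = interpreter.get_input_details()[0]
    out_details = interpreter.get_output_details()[0]

    cap = cv2.VideoCapture(video_path)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # Resize to the model's expected square input, add a batch dim,
        # and cast to whatever dtype this variant expects.
        img = cv2.resize(frame, (input_size, input_size))
        img = np.expand_dims(img, axis=0).astype(in_details["dtype"])
        interpreter.set_tensor(in_details["index"], img)
        interpreter.invoke()
        keypoints = interpreter.get_tensor(out_details["index"])  # [1, 1, 17, 3] (y, x, score)
        # ... draw the keypoints on `frame` here, e.g. with cv2.circle ...
        cv2.imshow("MoveNet", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```

The same loop works for a webcam by passing a device index (e.g. `cv2.VideoCapture(0)`) instead of a file path.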
Link to GitHub: https://github.com/amalaj7/Pose-Estimation-Video-Inference
Clone the Repository
git clone https://github.com/amalaj7/Pose-Estimation-Video-Inference.git
cd Pose-Estimation-Video-Inference/
Create a new conda environment (if needed)
conda create -n ENV_NAME python=3.7 -y
Activate the environment
conda activate ENV_NAME
Install the dependencies
pip install -r requirements.txt
Run Pose Estimation on an Image
python movenet_prediction.py
Result:
In this example, I have used MoveNet Lightning from TF Hub for faster inference. You can also check out the other pose estimation models listed here: https://www.tensorflow.org/lite/examples/pose_estimation/overview
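For drawing, the raw model output needs a post-processing step: MoveNet returns a [1, 1, 17, 3] tensor of (y, x, score) triples with coordinates normalized to [0, 1]. A small helper along these lines (illustrative, not necessarily the exact code in movenet_prediction.py) maps them to pixel coordinates and drops low-confidence points:

```python
import numpy as np

def keypoints_to_pixels(output, frame_height, frame_width, threshold=0.3):
    """Convert MoveNet's normalized keypoints to pixel coords.

    output: the [1, 1, 17, 3] model output of (y, x, score) triples.
    Returns a list of 17 entries: (row, col, score) tuples, or None for
    keypoints whose score falls below the confidence threshold.
    """
    kps = np.squeeze(output)  # -> (17, 3)
    pixels = []
    for y, x, score in kps:
        if score >= threshold:
            pixels.append((int(y * frame_height), int(x * frame_width), float(score)))
        else:
            pixels.append(None)  # not confident enough to draw
    return pixels
```

Each surviving point can then be drawn on the frame, e.g. with `cv2.circle`.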
Note: Thunder and Lightning use different input sizes for the model. Thunder resizes the input to 256, other variants (such as float16 or int8) use 257, whereas Lightning resizes the input to 192. Change the input size accordingly for the model you choose.
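One way to keep that straight is a small lookup keyed by variant. The variant names below are illustrative, and the sizes simply restate the note above; in practice you can also read the expected shape directly from `interpreter.get_input_details()[0]['shape']` instead of hardcoding it:

```python
# Square input resolutions per variant, restating the note above
# (variant names here are illustrative keys, not official model names).
INPUT_SIZES = {
    "lightning": 192,        # MoveNet Lightning
    "thunder": 256,          # MoveNet Thunder
    "thunder_float16": 257,  # other variants mentioned above
    "thunder_int8": 257,
}

def input_size_for(variant):
    """Return the square input resolution expected by a given variant."""
    try:
        return INPUT_SIZES[variant]
    except KeyError:
        raise ValueError(f"Unknown variant: {variant!r}")
```

Reading the shape from the interpreter avoids this bookkeeping entirely when you switch models.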
References: https://www.tensorflow.org/hub/tutorials/movenet
Hope you learned something new today. Happy learning!