Video Inference for TFLite Pose Estimation Using the MoveNet Lightning Model

MoveNet is an ultra-fast and accurate model that detects 17 keypoints of the human body. It is offered on TF Hub in two variants, Lightning and Thunder.
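To make the 17 keypoints concrete, here is a minimal sketch of decoding a single-pose MoveNet output into pixel coordinates. It assumes the output tensor has shape [1, 1, 17, 3] with each keypoint stored as (y, x, score) and coordinates normalized to [0, 1], and that the frame was stretched to the model input without padding. The function name `decode_keypoints` and the 0.3 score threshold are illustrative choices, not part of this repository.

```python
# Keypoint order used by MoveNet single-pose models (COCO convention).
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def decode_keypoints(output, frame_width, frame_height, min_score=0.3):
    """Convert a [1, 1, 17, 3] MoveNet output to pixel coordinates.

    Each keypoint is (y, x, score) with y and x normalized to [0, 1].
    Returns {name: (x_px, y_px, score)} for keypoints whose score is at
    least min_score (0.3 is an assumed, tunable threshold).
    """
    decoded = {}
    # output[0][0] works for nested lists and for NumPy arrays alike.
    for name, (y, x, score) in zip(KEYPOINT_NAMES, output[0][0]):
        if score >= min_score:
            decoded[name] = (
                int(x * frame_width),
                int(y * frame_height),
                float(score),
            )
    return decoded
```

Low-confidence keypoints are dropped rather than drawn, which tends to reduce jitter when a body part is occluded or out of frame.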

Lightning, as its name suggests, is optimized for very fast inference, whereas Thunder is intended for higher accuracy. Both models run faster than real time (30+ FPS) on most modern desktops, laptops, and phones, which is crucial for live fitness, health, and wellness applications.

Today, we will look at how to run inference with TFLite models on videos.
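As a rough sketch of what video inference with a TFLite model involves, the code below reads frames with OpenCV, feeds each one to a `tf.lite.Interpreter`, and yields the raw output per frame. This is not necessarily how the repository's script is organized: the function names `read_frames`, `make_tflite_infer`, and `run_pose_on_frames`, and the file names in the usage comment, are hypothetical. OpenCV and TensorFlow are imported lazily, so the generic loop can be tried without either installed.

```python
def read_frames(video_path):
    """Yield RGB frames from a video file, one at a time."""
    import cv2  # imported lazily; only needed when reading real videos

    cap = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame_bgr = cap.read()
            if not ok:  # end of stream or read error
                break
            # OpenCV decodes to BGR; the model expects RGB.
            yield cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    finally:
        cap.release()


def make_tflite_infer(model_path):
    """Return a function that runs one RGB frame through a TFLite model."""
    import cv2
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    # Input shape is [1, height, width, 3]; read it instead of hard-coding.
    _, height, width, _ = inp["shape"]

    def infer(frame_rgb):
        # Plain resize (no padding); this stretches non-square frames.
        resized = cv2.resize(frame_rgb, (int(width), int(height)))
        batch = np.expand_dims(resized, axis=0).astype(inp["dtype"])
        interpreter.set_tensor(inp["index"], batch)
        interpreter.invoke()
        return interpreter.get_tensor(out["index"])

    return infer


def run_pose_on_frames(frames, infer):
    """Yield (frame_index, model_output) for every frame in an iterable."""
    for index, frame in enumerate(frames):
        yield index, infer(frame)


# Usage (hypothetical file names):
#   infer = make_tflite_infer("movenet_lightning.tflite")
#   for i, output in run_pose_on_frames(read_frames("input.mp4"), infer):
#       print(i, output.shape)
```

Keeping the loop separate from the frame source and the model wrapper means the same loop can serve a webcam stream or a different TFLite model by swapping one argument.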

Clone the Repository

git clone
cd Pose-Estimation-Video-Inference/

Create a new conda environment (if needed)

conda create -n ENV_NAME python=3.7 -y

Activate the environment

conda activate ENV_NAME

Install the dependencies

pip install -r requirements.txt

Run Pose Estimation on an Image


In this example, I have used MoveNet Lightning from TF Hub for faster inference speed. You can also check out these models:

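Note: Thunder and Lightning use different input sizes. Lightning resizes the input to 192x192 and Thunder to 256x256; the float16 and int8 TFLite variants of each use the same size as their float counterpart. Change the input size to match the model you load. If in doubt, read the expected shape from the interpreter's input details (get_input_details()) rather than hard-coding it.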
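Because the input size differs per model, any mapping from model coordinates back to the original frame depends on it. The sketch below assumes the frame is resized with its aspect ratio preserved and padded to a square (letterboxing), which is one common preprocessing choice and may not be what this repository does. The helper names `letterbox_geometry` and `to_frame_coords` are hypothetical.

```python
def letterbox_geometry(frame_width, frame_height, input_size):
    """Scale and padding that fit a frame into an input_size square.

    The longer side is scaled to input_size and the shorter side is
    centered with equal padding on both ends.
    """
    scale = input_size / max(frame_width, frame_height)
    new_w = round(frame_width * scale)
    new_h = round(frame_height * scale)
    pad_x = (input_size - new_w) // 2
    pad_y = (input_size - new_h) // 2
    return scale, pad_x, pad_y


def to_frame_coords(x_norm, y_norm, frame_width, frame_height, input_size):
    """Map a normalized keypoint on the padded input back to frame pixels."""
    scale, pad_x, pad_y = letterbox_geometry(
        frame_width, frame_height, input_size
    )
    # Undo normalization, then padding, then scaling.
    x = (x_norm * input_size - pad_x) / scale
    y = (y_norm * input_size - pad_y) / scale
    return x, y


# Usage: pass 192 for Lightning or 256 for Thunder as input_size.
#   to_frame_coords(0.5, 0.5, 640, 480, 192)
```

Passing `input_size` explicitly keeps the same helpers usable for both variants, so switching from Lightning to Thunder only changes one argument.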

