# Foot Calib Pos Image Processor

This project uses the TvCalib library to calculate the homography matrix for a football pitch image. This matrix maps image points onto a standard 2D representation of the pitch (the minimap).

The project also includes a pose estimation step (ViTPose) that detects players and calculates the average color of each player's torso.

**The main result is a minimap where each player is represented by their colored skeleton, drawn at a *dynamically* reduced scale around their projected position on the pitch.**

The position is determined by projecting a reference point (feet/bottom of the bounding box) using the homography. The skeleton is then drawn using its relative coordinates from the original image, scaled and translated to the projected position. The scale depends on the **player's vertical position (Y) on the minimap** (higher on the minimap = smaller) and on a base factor adjustable via the `--target_avg_scale` option.

A minimap showing the projection of the original image can also be displayed for comparison.

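
The projection and placement described above can be sketched with NumPy alone. This is a minimal illustration, not the project's actual code: the helper names `project_point` and `place_skeleton` are hypothetical, the homography is a toy matrix, and it assumes the skeleton offsets are taken relative to the same reference point that gets projected.

```python
import numpy as np

def project_point(H, point_xy):
    """Map one image point to minimap coordinates with a 3x3 homography."""
    x, y = point_xy
    u, v, w = H @ np.array([x, y, 1.0])
    # Divide by the homogeneous coordinate to get 2D minimap coordinates.
    return np.array([u / w, v / w])

def place_skeleton(keypoints_xy, ref_xy, projected_xy, scale):
    """Scale keypoints about the reference point, then move them to the projected point."""
    offsets = np.asarray(keypoints_xy, dtype=float) - np.asarray(ref_xy, dtype=float)
    return np.asarray(projected_xy, dtype=float) + scale * offsets

# Toy homography (assumption): halve the coordinates and shift by (10, 20).
H = np.array([[0.5, 0.0, 10.0],
              [0.0, 0.5, 20.0],
              [0.0, 0.0, 1.0]])
feet = (200.0, 400.0)   # reference point in the original image
head = (200.0, 300.0)   # another keypoint of the same player

anchor = project_point(H, feet)
skeleton = place_skeleton([head, feet], feet, anchor, scale=0.35)
```

Only the reference point goes through the homography; the other keypoints keep their image-space layout, so the skeleton stays upright on the minimap instead of being distorted by the perspective mapping.
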
## Features

*   Homography calculation from a single image via TvCalib.
*   Person detection (RT-DETR) and pose estimation (ViTPose).
*   Calculation of the filtered average torso color for each player.
*   Projection of each player's reference point (feet/bbox) onto the minimap.
*   Generation of a minimap with the **original skeletons (colored, dynamically scaled based on projected Y position, and offset) drawn around the projected point**.
*   (Optional) Generation of a minimap with the projected original image.
*   Option to save the calculated homography matrix.

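
One way to compute a filtered average torso color is sketched below. The README does not say which filter the project applies, so everything here is an assumption: the hypothetical `torso_color` helper takes the bounding box of the shoulder and hip keypoints, discards pixels whose color is far from the median (for example grass or another player's shirt), and averages the rest.

```python
import numpy as np

def torso_color(image_bgr, torso_keypoints_xy, max_distance=60.0):
    """Average color of the torso box, ignoring pixels far from the median color."""
    pts = np.asarray(torso_keypoints_xy, dtype=float)
    h, w = image_bgr.shape[:2]
    # Bounding box of the torso keypoints, clipped to the image.
    x0, y0 = np.floor(pts.min(axis=0)).astype(int)
    x1, y1 = np.ceil(pts.max(axis=0)).astype(int)
    x0, x1 = max(x0, 0), min(x1, w)
    y0, y1 = max(y0, 0), min(y1, h)
    patch = image_bgr[y0:y1, x0:x1].reshape(-1, 3).astype(float)
    if patch.size == 0:
        return None  # torso lies outside the image
    # Filter step (assumed): keep pixels close to the per-channel median color.
    median = np.median(patch, axis=0)
    keep = np.linalg.norm(patch - median, axis=1) <= max_distance
    if not keep.any():
        keep[:] = True  # fall back to the plain average
    return patch[keep].mean(axis=0)

# Synthetic check: a red shirt (BGR) with a few green pixels inside the torso box.
image = np.zeros((10, 10, 3), dtype=np.uint8)
image[:, :] = (0, 0, 200)
image[2, 2:5] = (0, 255, 0)
color = torso_color(image, [(2, 2), (8, 2), (2, 8), (8, 8)])
```

The threshold `max_distance` is an illustrative value, not one taken from the project.
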
## Project Structure

```
.
├── .git/                   # Git metadata
├── .venv/                  # Python virtual environment (recommended)
├── common/                 # Common Python modules (potentially)
├── data/                   # Data (input images, etc.)
├── models/
│   └── segmentation/
│       └── train_59.pt     # Pre-trained segmentation model (TO DOWNLOAD)
├── tvcalib/                # Source code of the TvCalib library (or a fork/adaptation)
│   └── infer/
│       └── module.py       # Main module for TvCalib inference
├── .gitignore              # Files ignored by Git
├── main.py                 # Main script entry point
├── requirements.txt        # Python dependencies file
├── visualizer.py           # Module for generating visualization minimaps
├── pose_estimator.py       # Module for pose estimation and player data extraction
└── README.md               # This file
```

## Installation

1.  **Clone the repository:**
    ```powershell
    git clone <repository-url>
    cd Foot_calib_pos_image_processor
    ```

2.  **Create a virtual environment (recommended):**
    ```powershell
    python -m venv .venv
    .\.venv\Scripts\Activate.ps1
    ```

3.  **Install dependencies:**
    ```powershell
    pip install -r requirements.txt
    ```
    *(Make sure to install a PyTorch build with the appropriate CUDA support if you need GPU acceleration.)*

4.  **Download the segmentation model:**
    Place `train_59.pt` in `models/segmentation/`.

5.  **(Automatic) Download detection/pose models:**
    The RT-DETR and ViTPose models are downloaded automatically.

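
Before the first run, a small pre-flight check can confirm that the manual steps above were completed. The script below is a hypothetical helper, not part of the repository; it only checks for the weights file named in step 4 and for three of the packages listed under Key Dependencies (`torch`, `cv2` for OpenCV, `numpy`).

```python
from pathlib import Path
import importlib.util

def missing_prerequisites(project_root="."):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Step 4: the segmentation weights must be placed manually.
    weights = Path(project_root) / "models" / "segmentation" / "train_59.pt"
    if not weights.is_file():
        problems.append(f"missing segmentation weights: {weights}")
    # Step 3: a few core packages should be importable.
    for module in ("torch", "cv2", "numpy"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"module not importable: {module}")
    return problems

if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Run it from the repository root; no output means the checked prerequisites are in place.
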
## Usage

Run the `main.py` script, providing the path to the image:

```powershell
python main.py path/to/your/image.jpg [OPTIONS]
```

**Arguments and options:**

*   `image_path`: Path to the input image (required, positional).
*   `--output_homography PATH.npy`: Saves the calculated homography matrix to the given file.
*   `--optim_steps NUMBER`: Number of optimization steps for calibration (default: 500; the original README example used 1000).
*   `--target_avg_scale FLOAT`: **Target average** scale factor for drawing skeletons (default: 0.35). The script attempts to adjust the internal base scale so that the resulting average scale, after the position-dependent modulation is applied, is close to this value.

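
The relationship between `--target_avg_scale`, the internal base scale, and the per-player scale can be illustrated as follows. This is a sketch under stated assumptions, not the script's actual formula: the modulation is assumed to be linear in the projected Y coordinate, and `skeleton_scales`, `min_factor`, and `max_factor` are hypothetical names and values.

```python
import numpy as np

def skeleton_scales(projected_y, minimap_height, target_avg_scale=0.35,
                    min_factor=0.6, max_factor=1.4):
    """Return (base_scale, per-player scales) whose mean equals target_avg_scale."""
    y = np.clip(np.asarray(projected_y, dtype=float) / minimap_height, 0.0, 1.0)
    if y.size == 0:
        return target_avg_scale, y  # no players: nothing to modulate
    # Assumed modulation: players higher on the minimap (small Y) get a smaller factor.
    factors = min_factor + (max_factor - min_factor) * y
    # Pick the base scale so that the average of base_scale * factors hits the target.
    base_scale = target_avg_scale / factors.mean()
    return base_scale, base_scale * factors

# Three players at the top, middle, and bottom of a 680-pixel-high minimap.
base_scale, scales = skeleton_scales([0, 340, 680], 680, target_avg_scale=0.5)
```

In this simplified version the final average matches the target exactly. The real script reports both the requested target and the final applied average, so the two may differ in practice.
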
**Example:**

```powershell
# Simple usage (target average size 0.35)
python main.py data/img3.png

# Aim for larger skeletons on average (target 0.5)
python main.py data/img2.png --target_avg_scale 0.5
```

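
A matrix saved with `--output_homography` can be reused in other scripts. The sketch below assumes the file is a standard NumPy `.npy` array holding the 3x3 image-to-minimap matrix; the helpers `load_homography` and `minimap_to_image` are hypothetical and only show how the inverse mapping could be obtained.

```python
import numpy as np

def load_homography(path):
    """Load a saved homography and check that it is a 3x3 matrix."""
    H = np.load(path)
    if H.shape != (3, 3):
        raise ValueError(f"expected a 3x3 matrix, got shape {H.shape}")
    return H.astype(float)

def minimap_to_image(H, point_xy):
    """Map a minimap point back to image coordinates using the inverse homography."""
    x, y = point_xy
    u, v, w = np.linalg.inv(H) @ np.array([x, y, 1.0])
    return np.array([u / w, v / w])
```

For example, after `python main.py data/img3.png --output_homography homography.npy` (the file name is illustrative), `load_homography("homography.npy")` returns the matrix for further processing.
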
The script prints the following to the console:

*   Time taken and the homography matrix.
*   Estimated internal base scale.
*   Requested target average scale.
*   Final average scale actually applied.

It then opens two windows:

*   **Minimap with Original Projection**.
*   **Minimap with Offset Skeletons** (scaled dynamically by projected Y position, targeting the requested average scale).

Press any key to close the windows.

## Key Dependencies

*   PyTorch, OpenCV, NumPy, PyTorch Lightning
*   SoccerNet, Kornia, Hugging Face Transformers, Pillow