saeed/Image-and-Video-Understanding-Project

saeedkhosravi94 50b6df1c3f init commit

2025-11-08 21:39:37 +01:00

Image and Video Understanding Project

This project compares two deep learning models for instance segmentation of waste objects: Mask R-CNN (implemented with Detectron2) and YOLOv8.

Project Structure

MRCNN/: Mask R-CNN implementation using Detectron2
- Training and evaluation code in main.ipynb
- Trained models and results in results/
YOLO/: YOLOv8 segmentation implementation
- Training and evaluation code in main.ipynb
- Trained models and results in results/

Dataset

Both models are trained on the TACO (Trash Annotations in Context) dataset, using 20 classes of waste objects, including:

Plastic bottles, glass bottles, bottle caps
Drink cans, paper cups, cartons
Plastic film, wrappers, straws
Cigarettes, and other litter items
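
To get a feel for how the classes are distributed before training, it can help to count annotated instances per class. The sketch below assumes the annotations are stored in COCO-style JSON (the format TACO is distributed in, to my understanding). The helper names, the in-memory example, and its category names are placeholders for illustration; the README does not state the annotation file's location or how the 20 classes were selected.

```python
import json
from collections import Counter
from pathlib import Path


def load_annotations(path: str) -> dict:
    """Read a COCO-style JSON annotation file from disk (path is an assumption)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def count_instances_per_class(coco: dict) -> Counter:
    """Count annotated instances per category name in a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    return Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])


# Tiny in-memory stand-in for a real annotation file; names are placeholders.
example = {
    "categories": [
        {"id": 0, "name": "Plastic bottle"},
        {"id": 1, "name": "Drink can"},
    ],
    "annotations": [
        {"id": 1, "image_id": 10, "category_id": 0},
        {"id": 2, "image_id": 10, "category_id": 1},
        {"id": 3, "image_id": 11, "category_id": 0},
    ],
}

counts = count_instances_per_class(example)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

With a real file, `count_instances_per_class(load_annotations(path))` would give the same kind of summary, which is useful for spotting rare classes that may need special attention.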

Models

Mask R-CNN: ResNet-101 backbone with Feature Pyramid Network
YOLOv8: Large segmentation model (YOLOv8l-seg)
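
As a rough illustration of how these two models are typically set up, the sketch below builds a Detectron2 config for Mask R-CNN with a ResNet-101 + FPN backbone and loads the YOLOv8l-seg weights through the Ultralytics package. It is not taken from the notebooks: the choice of the `3x` model-zoo config, the use of COCO-pretrained weights, and the function names are assumptions. The heavy imports are placed inside the functions so the module can be imported without either library installed.

```python
NUM_CLASSES = 20  # from the README: 20 waste classes

# Assumption: the standard Detectron2 model-zoo config for R-101 + FPN.
MASK_RCNN_CONFIG = "COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml"

# Large YOLOv8 segmentation checkpoint named in the README.
YOLO_WEIGHTS = "yolov8l-seg.pt"


def build_mask_rcnn_cfg(num_classes: int = NUM_CLASSES):
    """Return a Detectron2 config for Mask R-CNN (ResNet-101 + FPN)."""
    from detectron2 import model_zoo
    from detectron2.config import get_cfg

    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(MASK_RCNN_CONFIG))
    # Assumption: start from COCO-pretrained weights.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(MASK_RCNN_CONFIG)
    # Resize the box and mask heads to the number of waste classes.
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = num_classes
    return cfg


def build_yolo_model(weights: str = YOLO_WEIGHTS):
    """Load a YOLOv8 segmentation model from a checkpoint name or path."""
    from ultralytics import YOLO

    return YOLO(weights)


# Example (requires ultralytics and a dataset YAML; "taco.yaml" is hypothetical):
# build_yolo_model().train(data="taco.yaml", epochs=50, imgsz=640)
```

The epoch count and image size in the commented example are placeholder values, not the settings used in this project; the actual hyperparameters are in each model's main.ipynb.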

Results

Training and evaluation results for each model are stored in its results/ directory (MRCNN/results/ and YOLO/results/).