yolo-vision-tools

v1.2.3

Use Ultralytics YOLO to perform computer vision tasks, such as detecting people or objects in images and videos, classifying images, estimating human poses, and tracking cars, people, or animals in videos.

Sourced from ClawHub, Authored by Ruoyu

Installation

Ask your agent to install the skill `yolo-vision-tools` from the official SkillHub store, or install it from the command line:

npx skills add Ruoyu05/yolo-vision-tools

Ultralytics YOLO Vision Tools

Ultralytics YOLO is a state-of-the-art computer vision framework supporting multiple tasks including object detection, instance segmentation, image classification, pose estimation, and oriented bounding box detection. This skill provides comprehensive guidance for using YOLO effectively.

Latest Model: YOLO26 (released January 2026) features end-to-end NMS-free inference and optimized edge deployment. For stable production workloads, both YOLO26 and YOLO11 are recommended.

Quick Start

1. Installation & Environment Check

# Install/update Ultralytics
pip install -U ultralytics

# Verify installation and check environment
yolo checks

The yolo checks command validates Python version, PyTorch, CUDA, GPU availability, and all dependencies. For detailed environment troubleshooting, see Environment Check or use the provided environment check script: python scripts/check_environment.py.
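As a rough illustration of the kind of check described above, the following sketch reports the Python version and whether a few packages are importable. It is not the contents of scripts/check_environment.py (that script is not shown here); the function name check_environment, the package list, and the Python 3.8 minimum are assumptions made for the example.

```python
import importlib.util
import platform
import sys


def check_environment(packages=("ultralytics", "torch", "cv2")):
    """Report the Python version and whether each package can be imported."""
    report = {
        "python": platform.python_version(),
        # Assumption: Python 3.8 is taken as the minimum supported version.
        "python_ok": sys.version_info >= (3, 8),
        "packages": {},
    }
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report["packages"][name] = importlib.util.find_spec(name) is not None
    return report


report = check_environment()
print(f"Python {report['python']} (meets 3.8 minimum: {report['python_ok']})")
for name, found in report["packages"].items():
    print(f"{name}: {'found' if found else 'missing'}")
```

A check like this only confirms that packages are present; yolo checks remains the more complete diagnostic for CUDA and GPU details.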

2. Basic Usage Examples

Python Interface

from ultralytics import YOLO

# Load a model (YOLO automatically infers task from model)
model = YOLO("yolo26n.pt")  # or your custom model path

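# Predict on various sources
# Skill convention: outputs go to the workspace's yolo-vision folder.
# In the Python API, annotated files are written only when save=True is passed.
results = model("image.jpg", save=True)                      # image file → saved to yolo-vision/outputs/images/
results = model("video.mp4", stream=True, save=True)         # returns a generator; frames are processed as you iterate → saved to yolo-vision/outputs/videos/
results = model("https://example.com/image.jpg", save=True)  # URL → saved to yolo-vision/outputs/images/
results = model(0, show=True, stream=True, save=True)        # webcam with display; stream=True keeps results from accumulating in memory → saved to yolo-vision/outputs/videos/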

# Custom output directory (optional)
results = model("image.jpg", project="/custom/path")  # save to custom directory

CLI Interface

# Basic syntax: yolo TASK MODE ARGS
# By default, outputs are saved to workspace/yolo-vision folder
yolo predict model=yolo26n.pt source="image.jpg"  # → saved to yolo-vision/runs/detect/predict/

# Task-specific examples
yolo detect predict model=yolo26n.pt source="video.mp4"  # → saved to yolo-vision/runs/detect/predict/
yolo segment predict model=yolo26n-seg.pt source="image.jpg"  # → saved to yolo-vision/runs/segment/predict/
yolo pose predict model=yolo26n-pose.pt source="image.jpg"  # → saved to yolo-vision/runs/pose/predict/

# Custom output directory (optional)
yolo predict model=yolo26n.pt source="image.jpg" project="/custom/path"  # save to custom directory
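When the CLI is driven from a script, the `yolo TASK MODE ARGS` syntax above can be assembled programmatically. The sketch below is one possible way to do that; build_yolo_command is a hypothetical helper written for this example, not part of Ultralytics or of this skill.

```python
import shlex


def build_yolo_command(mode, task=None, **args):
    """Assemble a `yolo TASK MODE key=value ...` argument list."""
    command = ["yolo"]
    if task:
        command.append(task)  # TASK is optional, as in `yolo predict ...`
    command.append(mode)
    for key, value in args.items():
        # The yolo CLI takes key=value pairs rather than --flags.
        command.append(f"{key}={value}")
    return command


cmd = build_yolo_command(
    "predict", task="segment", model="yolo26n-seg.pt", source="image.jpg", conf=0.5
)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the command as a list to subprocess.run avoids shell quoting problems with paths that contain spaces.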

3. Model Selection

For a quick start, use these default models:
  • Detection: yolo26n.pt (nano), yolo26s.pt (small), yolo26m.pt (medium)
  • Segmentation: yolo26n-seg.pt, yolo26s-seg.pt, yolo26m-seg.pt
  • Classification: yolo26n-cls.pt, yolo26s-cls.pt, yolo26m-cls.pt
  • Pose Estimation: yolo26n-pose.pt, yolo26s-pose.pt, yolo26m-pose.pt
  • Oriented Detection: yolo26n-obb.pt, yolo26s-obb.pt, yolo26m-obb.pt
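The default model names listed above follow a regular pattern: family, size letter, then an optional task suffix. A small sketch of that pattern is shown below; model_name and the two lookup tables are hypothetical helpers for illustration, and only the n, s, and m sizes named above are included.

```python
# Task suffixes as they appear in the default model names above.
TASK_SUFFIX = {
    "detect": "",
    "segment": "-seg",
    "classify": "-cls",
    "pose": "-pose",
    "obb": "-obb",
}
SIZES = ("n", "s", "m")  # nano, small, medium


def model_name(task, size="n", family="yolo26"):
    """Build a weights filename such as 'yolo26n-seg.pt'."""
    if task not in TASK_SUFFIX:
        raise ValueError(f"Unknown task: {task!r}")
    if size not in SIZES:
        raise ValueError(f"Unknown size: {size!r}")
    return f"{family}{size}{TASK_SUFFIX[task]}.pt"


print(model_name("detect"))        # yolo26n.pt
print(model_name("segment", "s"))  # yolo26s-seg.pt
print(model_name("pose", "m"))     # yolo26m-pose.pt
```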

For complete model list and selection guidance: Model Names | Model Selection

Core Workflow

Step 1: Understand YOLO Tasks

YOLO supports five main computer vision tasks. Choose the right task for your application:
  • Detection: Identify and localize objects with bounding boxes
  • Segmentation: Generate pixel-level masks for objects
  • Classification: Categorize entire images
  • Pose Estimation: Detect keypoints for pose analysis
  • Oriented Detection: Detect rotated objects with an angle parameter

Detailed comparison: Task Types

Step 2: Select Appropriate Model

Consider these factors when selecting a model:
  • Speed vs. Accuracy: Nano (fastest) → X (most accurate)
  • Hardware Constraints: GPU memory, CPU performance
  • Application Requirements: Real-time vs. batch processing

Guidance: Model Selection

Step 3: Configure Parameters

Common configuration parameters:
  • conf: Confidence threshold (default: 0.25)
  • iou: IoU threshold for NMS (default: 0.7)
  • imgsz: Input image size (default: 640)
  • device: Device ID (0 for first GPU, cpu for CPU)
  • save: Save results to disk
  • show: Display results in real-time
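To make the conf and iou thresholds concrete, the sketch below computes intersection-over-union for two boxes and applies a textbook greedy, class-agnostic NMS. It is a simplified illustration under those assumptions, not the Ultralytics implementation, and it only applies to models that use NMS; box_iou and filter_detections are names chosen for this example.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf=0.25, iou=0.7):
    """Drop low-confidence boxes, then suppress boxes overlapping a kept box."""
    candidates = sorted(
        (d for d in detections if d[1] >= conf), key=lambda d: d[1], reverse=True
    )
    kept = []
    for box, score in candidates:
        # Keep the box only if it does not overlap any kept box by more than iou.
        if all(box_iou(box, kept_box) <= iou for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 0, 11, 10), 0.8),    # IoU with the first box is 90/110, above 0.7
    ((20, 20, 30, 30), 0.6),
    ((0, 0, 5, 5), 0.1),      # below the 0.25 confidence threshold
]
print(filter_detections(detections))
```

Raising conf removes more weak detections; lowering iou suppresses overlapping boxes more aggressively.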

Complete examples: Configuration Samples

Step 4: Process Results

YOLO returns Results objects containing:
  • boxes: Bounding boxes, confidence scores, class labels
  • masks: Segmentation masks (for segmentation tasks)
  • keypoints: Pose keypoints (for pose estimation)
  • probs: Classification probabilities (for classification)
  • obb: Oriented bounding boxes (for OBB tasks)
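One way to consume the boxes attribute described above is to flatten each result into plain dictionaries. The sketch below assumes a detection result whose boxes expose xyxy, conf, and cls, and whose names attribute maps class indices to labels; summarize_detections is a hypothetical helper, and the Ultralytics call is left as a comment because it needs the library and model weights.

```python
def summarize_detections(result):
    """Turn one detection result into a list of plain dicts."""
    boxes = result.boxes
    if boxes is None:  # e.g. classification results carry probs instead of boxes
        return []
    rows = []
    for xyxy, conf, cls in zip(
        boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist()
    ):
        rows.append(
            {
                "label": result.names[int(cls)],
                "confidence": round(conf, 3),
                "box": [round(v, 1) for v in xyxy],  # x1, y1, x2, y2 in pixels
            }
        )
    return rows


# Usage with Ultralytics installed (not executed here):
# from ultralytics import YOLO
# for result in YOLO("yolo26n.pt")("image.jpg"):
#     print(summarize_detections(result))
```

Plain dictionaries like these are easy to serialize to JSON or CSV reports.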

Advanced Topics

Training Custom Models

from ultralytics import YOLO

# Load a model
model = YOLO("yolo26n.pt")

# Train on custom dataset
results = model.train(data="dataset.yaml", epochs=100, imgsz=640)
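The data argument above points at a dataset description file. As a sketch of what such a file might contain, the helper below writes a minimal YAML with a dataset root, train and val image folders, and an index-to-name class map. write_dataset_yaml, the folder names, and the two example classes are assumptions for illustration; the demo writes to a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def write_dataset_yaml(path, root, class_names, train="images/train", val="images/val"):
    """Write a minimal dataset description file and return its path."""
    lines = [f"path: {root}", f"train: {train}", f"val: {val}", "names:"]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    path = Path(path)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    yaml_path = write_dataset_yaml(
        Path(tmp) / "dataset.yaml", "datasets/my_dataset", ["cat", "dog"]
    )
    print(yaml_path.read_text(encoding="utf-8"))
```

Values are written without quoting, so paths or class names containing YAML special characters would need extra care.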

Training guide: Training Basics | Dataset Preparation

Installation Options

Multiple installation methods are available:
  • pip: pip install -U ultralytics
  • Conda: conda install -c conda-forge ultralytics
  • Docker: Pre-built images for GPU/CPU environments
  • From Source: For development and customization

Detailed instructions: Installation Guide

Performance Optimization

  • Streaming Mode: Use stream=True for videos/long sequences to reduce memory
  • Batch Processing: Process multiple images together for efficiency
  • Hardware Acceleration: Configure CUDA, TensorRT, or OpenVINO for optimal performance
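For the batch processing point above, a simple approach is to split a long list of image paths into fixed-size chunks and pass each chunk to the model in one call. The chunked helper below is a hypothetical utility for this example, and the batch size of 2 is arbitrary.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
for batch in chunked(paths, 2):
    print(batch)
    # With Ultralytics installed: results = model(batch)
```

Larger batches generally improve throughput until GPU memory becomes the limit, so the batch size is worth tuning per device.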

Reference Documentation

Document | Description
Environment Check | Comprehensive environment validation and troubleshooting
Installation Guide | All installation methods (pip, Conda, Docker, source)
Task Types | Detailed comparison of YOLO tasks and use cases
Model Names | Complete YOLO26 model list with specifications
Model Selection | Strategy for choosing models based on requirements
Configuration Samples | Parameter configuration examples for various scenarios
Dataset Preparation | Guide for preparing custom datasets for training
Training Basics | Fundamentals of training YOLO models on custom data
Parameter Reference | Complete reference for all YOLO configuration parameters

Utility Scripts

To save token usage and provide ready-to-use tools, the following Python scripts are available in the scripts/ directory:

Script | Description | Usage Example
check_environment.py | Comprehensive environment diagnostics | python scripts/check_environment.py
config_templates.py | Ready-to-use configuration templates | from scripts.config_templates import get_production_config
dataset_tools.py | Dataset preparation and conversion tools | from scripts.dataset_tools import coco_to_yolo
training_helpers.py | Training, evaluation, and model management | from scripts.training_helpers import evaluate_model
quick_tests.py | Quick functionality tests | python scripts/quick_tests.py --test environment
model_utils.py | Model selection and validation utilities | from scripts.model_utils import select_model

Benefits of using scripts:
  • Save tokens: Large code blocks are extracted from documentation
  • Ready-to-use: No need to copy-paste code from documentation
  • Modular: Import only what you need
  • Maintainable: Scripts can be updated independently

Troubleshooting

Common Issues

Q: yolo command not found after installation? A: Try python -m ultralytics yolo or check Python environment PATH.

Q: How do I use a specific GPU? A: Set device=0 for the first GPU, or device=cpu for CPU-only mode.

Q: Model downloads slowly? A: Set ULTRALYTICS_HOME environment variable to control cache location.

Q: How do I filter specific classes? A: Use the classes parameter with class indices, for example classes=[0, 2, 5].

Q: Memory issues with long videos? A: Use stream=True so that results are yielded one frame at a time by a generator instead of being collected in a list.

Q: Real-time webcam support? A: Yes, use source=0 (default camera) with show=True for live display.
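The classes filter mentioned above takes indices rather than label strings. A small lookup, sketched below, can translate labels into indices from an index-to-label mapping such as the one a loaded model exposes as model.names. class_indices is a hypothetical helper, and the mapping shown is an excerpt of the COCO label order, assumed here to match the pretrained detection models.

```python
def class_indices(names, wanted):
    """Map class labels to indices using an index -> label dict."""
    lookup = {label: index for index, label in names.items()}
    missing = [label for label in wanted if label not in lookup]
    if missing:
        raise KeyError(f"Unknown class labels: {missing}")
    return sorted(lookup[label] for label in wanted)


# Excerpt of the COCO label map (assumption: matches the model in use).
names = {0: "person", 1: "bicycle", 2: "car", 3: "motorcycle", 4: "airplane", 5: "bus"}
print(class_indices(names, ["person", "car", "bus"]))  # [0, 2, 5]
# With Ultralytics installed: model("video.mp4", classes=class_indices(model.names, [...]))
```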

Getting Help

  • Run yolo checks to diagnose environment issues
  • Check official documentation: https://docs.ultralytics.com
  • Review configuration reference: https://docs.ultralytics.com/usage/cfg/

Output Directory Convention

Default Output Location

When processing images or videos with YOLO, if the user does not specify an output directory, all generated files will be saved to the workspace's yolo-vision folder.

File Organization

The yolo-vision folder will be organized as follows:

yolo-vision/
├── inputs/            # Original input files (copied for reference)
├── outputs/           # Processed files with detection results
│   ├── images/        # Detected images
│   ├── videos/        # Detected videos  
│   └── previews/      # Preview images
├── reports/           # Analysis reports and statistics
│   ├── json/          # JSON format reports
│   ├── markdown/      # Markdown format reports
│   └── csv/           # CSV format data
├── models/            # Downloaded YOLO models
│   ├── yolo26/        # YOLO26 models
│   ├── yolo11/        # YOLO11 models
│   └── custom/        # Custom trained models
└── logs/              # Processing logs and debug information

Automatic Folder Creation

The skill will automatically:
  1. Create the yolo-vision folder if it doesn't exist
  2. Create all subdirectories as needed
  3. Organize files by date and task type
  4. Generate timestamp-based filenames for easy tracking
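The folder creation and timestamped naming described above could be sketched as follows. The directory list mirrors the layout shown earlier; ensure_layout, timestamped_name, and the exact filename pattern are assumptions for illustration, since the skill's actual naming scheme is not specified. The demo uses a temporary directory as the workspace.

```python
import tempfile
from datetime import datetime
from pathlib import Path

SUBDIRS = (
    "inputs",
    "outputs/images", "outputs/videos", "outputs/previews",
    "reports/json", "reports/markdown", "reports/csv",
    "models/yolo26", "models/yolo11", "models/custom",
    "logs",
)


def ensure_layout(workspace):
    """Create the yolo-vision tree under workspace if it is missing."""
    root = Path(workspace) / "yolo-vision"
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


def timestamped_name(source, task, now=None):
    """Build a name like 20260102_030405_detect_cat.jpg (hypothetical pattern)."""
    now = now or datetime.now()
    src = Path(source)
    return f"{now:%Y%m%d_%H%M%S}_{task}_{src.stem}{src.suffix}"


with tempfile.TemporaryDirectory() as workspace:
    root = ensure_layout(workspace)
    target = root / "outputs" / "images" / timestamped_name("cat.jpg", "detect")
    print(target.relative_to(workspace))
```

Because mkdir is called with exist_ok=True, running ensure_layout repeatedly is safe.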

Example Usage

# Without specifying an output directory - uses the default yolo-vision folder
results = model("image.jpg", save=True)  # Output saved to yolo-vision/outputs/images/

# With a custom output directory (project sets the output root; save_dir is not a prediction argument)
results = model("image.jpg", save=True, project="/custom/path")  # Uses the specified path

Benefits

  1. Consistency: All YOLO outputs in one predictable location
  2. Organization: Files automatically categorized by type
  3. Backup: Input files are preserved for reference
  4. Reproducibility: Easy to find and compare previous analyses
  5. Clean Workspace: Prevents clutter in the main workspace directory

User Override

Users can still specify custom output directories when needed:
  • By passing the project argument in Python code, for example project="/custom/path"
  • By passing project=/custom/path in CLI commands (the yolo CLI takes key=value arguments rather than --flags)
  • By setting the ULTRALYTICS_PROJECT environment variable


License Note: Ultralytics YOLO is available under AGPL-3.0 for open source use and Enterprise License for commercial applications. Review licensing at https://ultralytics.com/license.