xtranslate
v3.2.0智能翻译PDF、Word、Excel、PPT、TXT、RTF文件,保留格式、优化排版,支持多引擎与批量处理...
Installation
Xtranslate 智能翻译工具 / Intelligent Translation Tool
OpenClaw调用方式 / OpenClaw Invocation Method
当用户在OpenClaw中提出以下需求时,调用本技能: When users request the following in OpenClaw, invoke this skill:
| 用户指令 / User Commands | 执行动作 / Execution Actions |
|---|---|
| "翻译文档" / "翻译文件" / "翻译PDF/Word" "Translate document" / "Translate file" / "Translate PDF/Word" |
调用程序自动翻译指定文件 Invoke program to automatically translate specified file |
| "批量翻译" / "翻译文件夹" "Batch translation" / "Translate folder" |
批量处理文件夹内所有支持格式的文件 Batch process all supported format files in folder |
调用示例 / Invocation Examples
用户说 / User says:"帮我翻译这个PDF文件"
"Help me translate this PDF file"
Claude执行 / Claude executes:
python src/main.py --file document.pdf --engine cloud
用户说 / User says:"批量翻译这个文件夹"
"Batch translate this folder"
Claude执行 / Claude executes:
python src/main.py --dir ./documents --engine cloud --model "DeepSeek (V3/R1)"
核心指令 / Core Commands
当用户需要翻译文档时,执行以下流程:
When users need to translate documents, execute the following process:
- 配置翻译引擎 → 云端模型 / Ollama本地 / Python内置
Configure translation engine → Cloud model / Ollama local / Python built-in - 指定文件 → 支持PDF/DOCX/XLSX/PPTX/TXT/RTF等格式
Specify file → Support PDF/DOCX/XLSX/PPTX/TXT/RTF and other formats - 执行翻译 → 程序自动提取文本、翻译、保留格式、优化排版
Execute translation → Program automatically extracts text, translates, preserves format, optimizes layout - 输出结果 → 生成翻译后的文件到指定目录
Output results → Generate translated files to specified directory
快速使用 / Quick Start
# 翻译单个文件 / Translate single file
python src/main.py --file document.pdf --engine cloud
# 批量翻译文件夹 / Batch translate folder
python src/main.py --dir ./documents --output ./translated
# 指定模型和语言方向 / Specify model and language direction
python src/main.py --file doc.docx --model "GPT-4o" --sl en --tl zh-CN
# 使用Ollama本地模型 / Use Ollama local model
python src/main.py --file doc.pdf --engine ollama
配置说明 / Configuration Guide
环境变量配置 / Environment Variable Configuration
# DeepSeek API密钥 / API Key
$env:DEEPSEEK_API_KEY="your-api-key-here"
# OpenAI API密钥 / API Key
$env:OPENAI_API_KEY="your-api-key-here"
# Anthropic API密钥 / API Key
$env:ANTHROPIC_API_KEY="your-api-key-here"
# SiliconFlow API密钥 / API Key
$env:SILICONFLOW_API_KEY="your-api-key-here"
支持的翻译引擎 / Supported Translation Engines
| 引擎 / Engine | 说明 / Description | 适用场景 / Use Cases |
|---|---|---|
| cloud | 云端大模型(DeepSeek/GPT-4o/Claude等) Cloud large models (DeepSeek/GPT-4o/Claude, etc.) |
高质量翻译,推荐 High-quality translation, recommended |
| ollama | 本地Ollama模型 Local Ollama model |
隐私保护,离线使用 Privacy protection, offline use |
| python | 内置translate库 Built-in translate library |
紧急备用,质量一般 Emergency backup, average quality |
支持的云端模型 / Supported Cloud Models
- DeepSeek (V3/R1)
- GPT-4o
- Claude 3.5 Sonnet
- SiliconFlow (Qwen2.5)
- 自定义 (OpenAI兼容接口)
Custom (OpenAI compatible interface)
性能优化配置 / Performance Optimization Configuration
- 智能批次优化: 自动调整批次大小以平衡速度和质量
Smart Batch Optimization: Automatically adjust batch size to balance speed and quality - 并发处理: 支持多批次并行翻译(可配置)
Concurrent Processing: Support multi-batch parallel translation (configurable) - 参数调优: temperature=0.3 平衡准确性和创造性
Parameter Tuning: temperature=0.3 balances accuracy and creativity - 大文档优化: 针对90页以上文档的特殊优化配置
Large Document Optimization: Special optimization configuration for documents over 90 pages
输出结构 / Output Structure
output/ # 输出文件夹(自动创建)<br># Output folder (automatically created)
├── translated_原文件名.docx # 翻译后的Word文档<br># Translated Word document
├── translated_原文件名.pdf # 翻译后的PDF(通过Word转换)<br># Translated PDF (converted from Word)
├── translated_原文件名.xlsx # 翻译后的Excel<br># Translated Excel
├── translated_原文件名.pptx # 翻译后的PPT<br># Translated PPT
tmp/ # 临时文件夹<br># Temporary folder
└── [PDF转Word临时文件] # 自动清理<br># [PDF to Word temporary files] # Auto cleanup
文件说明 / File Description
- 自动排版 - Word文档自动设置宋体/Times New Roman字体,1.25倍行距
Automatic Layout - Word documents automatically set SimSun/Times New Roman font, 1.25 line spacing - 格式保留 - 保留原文档的表格、段落、样式等格式
Format Preservation - Preserve tables, paragraphs, styles and other formats of original documents - 关键词引导 - 自动提取文档关键词,提升专业术语翻译准确性
Keyword Guidance - Automatically extract document keywords to improve professional term translation accuracy
功能特点 / Features
- ✅ 多格式支持 - PDF/DOCX/XLSX/PPTX/TXT/RTF/WPS
Multi-format Support - PDF/DOCX/XLSX/PPTX/TXT/RTF/WPS - ✅ 格式保留 - 翻译后保留原文档结构和样式
Format Preservation - Preserve original document structure and style after translation - ✅ 智能排版 - 自动优化Word字体和行距
Smart Layout - Automatically optimize Word fonts and line spacing - ✅ 批量处理 - 支持文件夹批量翻译
Batch Processing - Support folder batch translation - ✅ 多引擎支持 - 云端/本地/内置三种引擎可选
Multi-engine Support - Three engines available: cloud/local/built-in -
✅ 关键词提取 - 使用jieba提取专业术语提升翻译质量
Keyword Extraction - Use jieba to extract professional terms to improve translation quality -
✅ API加密 - API Key加密存储,确保安全
API Encryption - Encrypt API Key storage to ensure security - ✅ 智能监控 - 全程记录翻译过程,自动性能分析
Smart Monitoring - Record translation process throughout, automatic performance analysis - ✅ 自动优化 - 基于历史数据提供优化建议
Auto Optimization - Provide optimization suggestions based on historical data - ✅ JSON批量 - 云模型采用JSON格式批量翻译,提升效率
JSON Batch - Cloud model uses JSON format for batch translation, improving efficiency - ✅ 性能优化 - 智能批次调整,大文档处理速度提升25-30%
Performance Optimization - Smart batch adjustment, large document processing speed improved by 25-30% - ✅ 并发加速 - 支持多批次并行处理,进一步提升效率
Concurrent Acceleration - Support multi-batch parallel processing, further improve efficiency
文件结构 / File Structure
Xtranslate/
├── SKILL.md # 核心指令 + 元数据<br># Core instructions + metadata
├── requirements.txt # Python依赖列表<br># Python dependency list
├── Xtranslate.spec # PyInstaller打包配置<br># PyInstaller packaging configuration
│
├── src/ # 源代码目录<br># Source code directory
│ ├── config.py # 全局配置(引擎、模型、语言等)<br># Global configuration (engines, models, languages, etc.)
│ ├── main.py # 命令行入口程序<br># Command line entry program
│ ├── translator.py # 翻译引擎模块<br># Translation engine module
│ ├── file_handler.py # 文件读写处理器<br># File read/write processor
│ ├── analyzer.py # 文本分析与关键词提取<br># Text analysis and keyword extraction
│ ├── formatter.py # Word排版优化器<br># Word layout optimizer
│ ├── crypto_utils.py # API Key加密工具<br># API Key encryption tool
│ ├── translation_monitor.py # 翻译过程监控器<br># Translation process monitor
│ └── ...
│
├── templates/ # 代码模板<br># Code templates
│ ├── module.py.md # Python模块模板<br># Python module template
│ └── translator_engine.py.md # 新增引擎模板<br># New engine template
│
├── examples/ # 代码示例<br># Code examples
│ ├── good.md # 优秀代码示例<br># Excellent code examples
│ └── anti-pattern.md # 反模式示例<br># Anti-pattern examples
│
├── references/ # 规范参考<br># Specification references
│ ├── naming-convention.md # 命名规范<br># Naming convention
│ └── project-structure.md # 项目结构规范<br># Project structure specification
│
├── scripts/ # 辅助脚本<br># Auxiliary scripts
│ └── check-deps.py # 依赖检查脚本<br># Dependency check script
│
├── translation_monitor.json # 翻译监控记录文件<br># Translation monitoring record file
├── analyze_performance.py # 性能分析脚本<br># Performance analysis script
├── view_monitor.py # 监控记录查看脚本<br># Monitoring record viewing script
├── view_detail.py # 详细记录查看脚本<br># Detailed record viewing script
│
├── build/ # 构建文件<br># Build files
├── dist/ # 可执行文件输出<br># Executable file output
│ └── Xtranslate.exe # Windows可执行文件<br># Windows executable file
├── output/ # 翻译结果输出<br># Translation result output
└── tmp/ # 临时文件<br># Temporary files
模块说明 / Module Description
| 模块 / Module | 功能 / Function |
|---|---|
| config.py | 全局配置管理,包含模型配置、API设置、批次参数 Global configuration management, including model configuration, API settings, batch parameters |
| main.py | 命令行入口,支持参数解析和批量处理 Command line entry, supports parameter parsing and batch processing |
| translator.py | 翻译核心,支持cloud/ollama/python三种引擎 Translation core, supports cloud/ollama/python three engines |
| file_handler.py | 多格式文件读写(Word/Excel/PPT/PDF/TXT等) Multi-format file read/write (Word/Excel/PPT/PDF/TXT, etc.) |
| analyzer.py | 使用jieba进行中文关键词提取 Use jieba for Chinese keyword extraction |
| formatter.py | Word文档自动排版优化 Word document automatic layout optimization |
| crypto_utils.py | Fernet加密API Key
Fernet encrypt API Key |
| translation_monitor.py | 翻译过程全程监控,性能统计与自动分析
Full-process monitoring of translation, performance statistics and automatic analysis |
依赖安装 / Dependency Installation
pip install -r requirements.txt
主要依赖 / Main Dependencies
- openai - 云端模型API调用
Cloud model API calls - ollama - 本地Ollama模型
Local Ollama model - python-docx - Word文档处理
Word document processing - pdf2docx - PDF转Word
PDF to Word conversion - openpyxl - Excel处理
Excel processing -
python-pptx - PPT处理
PPT processing -
jieba - 中文分词和关键词提取
Chinese word segmentation and keyword extraction - cryptography - API Key加密
API Key encryption - translate - Python内置翻译库
Python built-in translation library - striprtf - RTF文件处理
RTF file processing
智能监控系统 / Smart Monitoring System
监控功能 / Monitoring Functions
- 全程记录 - 自动记录每次翻译的详细过程
Full Process Recording - Automatically record detailed process of each translation - 性能统计 - 精确测量各阶段耗时
Performance Statistics - Accurately measure time consumption of each stage - 自动分析 - 每10次翻译后生成性能报告
Automatic Analysis - Generate performance report after every 10 translations - 优化建议 - 基于历史数据提供针对性优化方案
Optimization Suggestions - Provide targeted optimization solutions based on historical data
监控文件 / Monitoring Files
translation_monitor.json- 存储所有翻译记录translation_monitor.json- Store all translation recordsanalyze_performance.py- 性能深度分析脚本analyze_performance.py- Performance deep analysis scriptview_monitor.py- 查看记录概览view_monitor.py- View record overviewview_detail.py- 查看详细记录view_detail.py- View detailed records
使用示例 / Usage Examples
# 查看翻译记录概览 / View translation record overview
python view_monitor.py
# 查看详细记录 / View detailed records
python view_detail.py
# 性能深度分析 / Performance deep analysis
python analyze_performance.py
故障排除 / Troubleshooting
| 问题 / Issues | 解决方案 / Solutions |
|---|---|
| API Key错误 / API Key error | 检查环境变量是否正确设置 Check if environment variables are correctly set |
| PDF转换失败 / PDF conversion failed | 确保PDF非扫描件,有可选中文字 Ensure PDF is not scanned and has selectable text |
| Ollama连接失败 / Ollama connection failed | 确保Ollama服务已启动
Ensure Ollama service is started |
| 翻译质量差 / Poor translation quality | 尝试切换云端模型,或使用关键词引导
Try switching cloud models, or use keyword guidance |
| 排版错乱 / Layout disorder | 检查原文档格式是否过于复杂
Check if original document format is too complex |
| 翻译速度慢 / Slow translation speed | 使用analyze_performance.py分析性能瓶颈
Use analyze_performance.py to analyze performance bottlenecks |
| 监控记录异常 / Monitoring record exception | 检查translation_monitor.json文件权限
Check translation_monitor.json file permissions |
| 大文档处理过慢 / Large document processing too slow | 检查是否启用了性能优化配置
Check if performance optimization configuration is enabled |
| 并发处理失败 / Concurrent processing failed | 降低concurrent参数或检查API限制
Reduce concurrent parameter or check API limits |
注意:本工具仅供个人学习研究使用,请遵守相关平台规则。