12 KiB
STEP Processor Skill — Project Context
CoWork Memory Document · MPMedia Engineering
Updated: June 2026 · Status: ✅ FULLY TESTED — Smoke-tested on 3 Chinese-sourced STEP files
What This Is
A Python-based CAD processing skill for MPMedia's engineering workflow. Reads STEP/STP files (display enclosures, mounting kits, hardware assemblies) and automates: thumbnail generation, parts BOM export (MPM-branded Excel), Chinese-to-English label translation, STEP file rewriting with English labels, geometric queries, and external dimensional diagram creation.
Lives as a CoWork skill folder with a SKILL.md. Callable from Claude or directly from CLI.
Current Working State (as of June 2026)
Smoke Test Results
All three production STEP files tested end-to-end:
| File | Faces | Parts (STEP parser) | Translations | _EN.step |
|---|---|---|---|---|
| MR16s Gen1 | 5,655 | 28 | 16/16 | ✅ All English |
| MR27s Gen1 | 6,732 | 31 | 18/18 | ✅ All English |
| MR28uws Gen1 | 5,891 | 31 | 20/20 | ✅ All English |
What Works
- STEP loading: build123d primary, FreeCAD headless fallback
- GBK encoding: Chinese CAD files fully decoded (see Architecture Notes below)
- BOM extraction: STEP text parser as primary source — correctly reads Chinese names
- Translation: Claude API (
claude-haiku-4-5-20251001) — batched, single call per file - STEP rewriter:
_EN.stepproduced with English labels in both PRODUCT name fields - BOM export: MPM-branded
.xlsx(Montserrat/Open Sans, Dark Shade header, Gold border, alternating rows) - Thumbnails: 6 PNG views via pyrender (front, rear, left, right, iso_left, iso_right)
BOM Output Format
Columns in Excel output order:
| Column | Header in Excel |
|---|---|
| part_number | Part # |
| part_description | Part Description |
| quantity | Qty |
| level | Level |
| parent | Parent |
| bbox_x_mm | X (mm) |
| bbox_y_mm | Y (mm) |
| bbox_z_mm | Z (mm) |
| notes | Notes |
| part_name_supplier | Supplier Part Name |
part_description = translated English name (was part_name_english internally).
part_name_supplier = original Chinese name from supplier file (last column, was part_name_original).
Architecture Notes (Critical — Read Before Debugging)
GBK Encoding Fix
STEP files from Chinese CAD tools (SolidWorks CN, etc.) embed raw GBK bytes in PRODUCT entity name strings. This caused two separate problems, each with a different fix:
Problem 1: BOM shows garbled Chinese (mojibake) OpenCASCADE (OCC) / build123d's STEP reader applies an internal codec that maps each 2-byte GBK sequence to incorrect Unicode codepoints. This is NOT reversible via latin-1 → GBK round-trip because OCC's codec is not latin-1.
Fix: Bypass OCC entirely for part name extraction. bom.py reads the raw STEP file text directly with encoding detection (UTF-8 → GBK → latin-1 fallback) and parses PRODUCT entities via regex. This is the primary name source; OCC assembly walk is fallback only.
Problem 2: _EN.step still shows Chinese in CAD viewer
The rewriter (rewriter.py) was reading the file as UTF-8, turning GBK bytes into replacement chars (U+FFFD). Chinese names became ??? and never matched the translation map.
Fix: rewriter.py uses the same encoding-detection reader as bom.py. File is read as GBK when UTF-8 produces replacement chars.
Problem 3: Only first PRODUCT name field was replaced
ISO 10303-21 PRODUCT entity format: #N = PRODUCT('id', 'name', 'description', (#...)). Both the first and second quoted strings carry the part name. CAD viewers (including OpenCASCADE CAD Assistant) display the second field.
Fix: Updated PRODUCT_PATTERN regex to capture both fields with 5 groups. Replacement writes the translated name into both positions.
Translation API Model
Current model: claude-haiku-4-5-20251001
Previous value claude-sonnet-4-20250514 was returning 404 — updated in translator.py.
BOM Excel Library
openpyxl must be installed in the venv. The skill falls back to CSV silently if it's missing — don't rely on this fallback, install it explicitly.
Prerequisites for a New macOS Computer
Hardware Requirements
- Apple Silicon Mac (M1/M2/M3/M4) — all wheels are native arm64
- No Rosetta required
- macOS 13 Sequoia or later recommended
Software Stack
Python 3.10–3.12 (3.13 works but less tested; 3.9 too old)
Homebrew (for cairo system libs)
~/step-processor-env (Python venv — all pip packages go here)
Python Packages (pip install in venv)
pip install cadquery-ocp # OCC kernel, native arm64, ~300MB
pip install build123d # STEP loader, primary
pip install trimesh pyrender # thumbnail rendering pipeline
pip install Pillow numpy pandas # image and data processing
pip install anthropic # Claude API client
pip install openpyxl # Excel BOM output — REQUIRED
pip install cairosvg # SVG→PNG/PDF for diagrams (optional, diagrams only)
System Libraries (Homebrew)
brew install cairo pango gdk-pixbuf libffi # required for cairosvg
FreeCAD Fallback (Optional)
Only needed if build123d fails on a specific file.
- Download official arm64
FreeCAD.appfrom https://github.com/FreeCAD/FreeCAD/releases/latest - Drag to
/Applications, launch once so macOS approves it - Verify:
/Applications/FreeCAD.app/Contents/Resources/bin/freecadcmd --version
macOS 15 Sequoia: conda-forge FreeCAD is killed by Gatekeeper. Use the official signed
.apponly.
Anthropic API Key
The key must be available in the environment where the processor runs.
For interactive terminal use:
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
source ~/.zshrc
For Desktop Commander / CoWork (non-interactive shell):
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv
~/.zshrc is only sourced in interactive shells. Desktop Commander spawns non-interactive zsh — it reads ~/.zshenv instead. Both files should have the key.
Setup — Fast Path (New Mac)
# 1. Create venv
python3 -m venv ~/step-processor-env
source ~/step-processor-env/bin/activate
# 2. Install all packages
pip install --upgrade pip
pip install cadquery-ocp build123d trimesh pyrender Pillow numpy pandas anthropic openpyxl
# 3. System libs for diagram export
brew install cairo pango gdk-pixbuf libffi
pip install cairosvg
# 4. API key (do BOTH for interactive + Desktop Commander)
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv
source ~/.zshrc
# 5. Verify
python -c "import build123d, openpyxl, anthropic, trimesh; print('ALL OK')"
# 6. Test run
cd /path/to/step-skill-folder
python step_processor.py yourfile.step --no-thumbnails --verbose
Expected output for Chinese STEP file:
INFO [build123d] Loaded: yourfile.step | NNNN faces | NN parts
INFO STEP text parser found NN unique part names
INFO BOM extracted: NN parts
INFO BOM XLSX → yourfile_bom.xlsx
INFO Chinese part names detected — auto-translating
INFO API returned NN translations
INFO _EN.step written: yourfile_EN.step (NN labels replaced)
File Structure
STEP File Skill/
step_processor.py ← CLI entry point
modules/
__init__.py
loader.py ← build123d load + GBK mojibake helper
bom.py ← STEP text parser + MPM-branded xlsx output
renderer.py ← 6-view PNG thumbnails (pyrender)
translator.py ← Claude API translation (claude-haiku-4-5-20251001)
rewriter.py ← _EN.step writer (GBK-aware, both PRODUCT fields)
query_engine.py ← Natural language geometry queries + REPL
external_diagram.py ← Dimensional diagram generator
schemas/
external_diagram_schema.json
parts_mapping_schema.json
templates/
datablock_template.md
SKILL.md ← Claude skill instructions
INSTALL.md ← Library setup details
SETUP_CHECKLIST.md ← Step-by-step setup + progressive tests
COWORK_CONTEXT.md ← This file
CLI Reference
# Activate environment first (every new terminal session)
source ~/step-processor-env/bin/activate
cd /path/to/STEP\ File\ Skill
# Default: thumbnails + BOM + auto-translate if Chinese detected
python step_processor.py enclosure.step
# BOM only (no thumbnails, no translate)
python step_processor.py enclosure.step --no-thumbnails --no-translate
# Force translation even if auto-detect is off
python step_processor.py enclosure.step --translate
# Single geometric query and exit
python step_processor.py enclosure.step --query "list all mounting holes"
# Interactive geometry REPL
python step_processor.py enclosure.step --repl
# External dimensional diagram
python step_processor.py enclosure.step --diagram
python step_processor.py enclosure.step --diagram --diagram-pdf
python step_processor.py enclosure.step --diagram --diagram-mode enclosure_plus_mounting
# Verbose (shows backend selection, timing, fallback notices)
python step_processor.py enclosure.step --verbose
Known Issues and Pending Work
BOM bbox Enrichment (partial)
Rows beyond the first only get 5.0 × 5.0 × 5.0mm placeholder bounding boxes. The OCC child enumeration only retrieves the top-level shape's children correctly. Root cause: build123d's .children on compound shapes doesn't walk sub-assemblies for bbox. Fix: map part names from STEP text parser back to OCC children by label — non-trivial due to OCC label mangling on CJK files.
Mounting Hole Filter — Minimum Diameter
The query engine's mounting hole detector returns PCB vias (0.4mm diameter) alongside actual mounting holes. Needs a minimum diameter threshold (recommend 2.0mm floor). Pending update to query_engine.py.
Phase 7 Full Test
The SETUP_CHECKLIST.md Phase 7 tests have been validated for Tests 1, 2, 4 (BOM, translation, rewrite). Tests 3 (thumbnails), 5 (REPL), 6/7 (diagrams) not yet re-run post GBK fix. Thumbnails were verified in an earlier session; diagram code is scaffolded but output quality against production files not fully validated.
Conda freecad_env Cleanup
From a prior session: conda activate freecad_env && conda deactivate && conda env remove -n freecad_env. The conda FreeCAD approach was abandoned in favor of the signed FreeCAD.app. This env is dead weight on the local machine.
Library Stack Decision Log
| Decision | Choice | Reason |
|---|---|---|
| CAD kernel | build123d | Native arm64 arm; clean API; same OCC as existing viewer |
| Fallback CAD | FreeCAD.app (signed) | conda-forge builds killed by Gatekeeper on Sequoia |
| Translation | Claude Haiku API | Batched, manufacturing-context prompted, flags ambiguity |
| Diagrams | SVG-first + cairosvg | No GUI dependency; vector quality; cairosvg handles PNG/PDF |
| Excel | openpyxl | MPM brand formatting, column control, frozen panes |
| Rejected | pythonocc | Rosetta/x64 conda dependency — non-starter on Apple Silicon |
| Rejected | conda-forge FreeCAD | Unsigned binaries killed by Gatekeeper on macOS 15 |
Planned Integration Points
- Odoo V18 MRP: Model number from
meta.jsontriggers lookup of weight, BOM cost, stock status - CoWork product docs:
meta.jsonfeeds product data cards; diagram PNGs embed in Knowledge Base articles - RDMC (
rdmc.messagepoint.tv): Thumbnail PNGs for display inventory visual reference - OnSign.tv: Enclosure dimensions for content sizing reference