# STEP Processor Skill: Project Context
### CoWork Memory Document · MPMedia Engineering
**Updated:** June 2026 · **Status:** ✅ FULLY TESTED: smoke-tested on 3 Chinese-sourced STEP files

---

## What This Is

A Python-based CAD processing skill for MPMedia's engineering workflow. It reads STEP/STP files (display enclosures, mounting kits, hardware assemblies) and automates: thumbnail generation, parts BOM export (MPM-branded Excel), Chinese-to-English label translation, STEP file rewriting with English labels, geometric queries, and external dimensional diagram creation.

Lives as a CoWork skill folder with a `SKILL.md`. Callable from Claude or directly from the CLI.

---

## Current Working State (as of June 2026)

### Smoke Test Results

All three production STEP files tested end-to-end:

| File | Faces | Parts (STEP parser) | Translations | _EN.step |
|------|-------|---------------------|--------------|----------|
| MR16s Gen1 | 5,655 | 28 | 16/16 | ✅ All English |
| MR27s Gen1 | 6,732 | 31 | 18/18 | ✅ All English |
| MR28uws Gen1 | 5,891 | 31 | 20/20 | ✅ All English |

### What Works

- **STEP loading**: build123d primary, FreeCAD headless fallback
- **GBK encoding**: Chinese CAD files fully decoded (see Architecture Notes below)
- **BOM extraction**: STEP text parser as primary source; correctly reads Chinese names
- **Translation**: Claude API (`claude-haiku-4-5-20251001`), batched, single call per file
- **STEP rewriter**: `_EN.step` produced with English labels in both PRODUCT name fields
- **BOM export**: MPM-branded `.xlsx` (Montserrat/Open Sans, Dark Shade header, Gold border, alternating rows)
- **Thumbnails**: 6 PNG views via pyrender (front, rear, left, right, iso_left, iso_right)

### BOM Output Format

Columns in Excel output order:

| Column | Header in Excel |
|--------|----------------|
| part_number | Part # |
| part_description | Part Description |
| quantity | Qty |
| level | Level |
| parent | Parent |
| bbox_x_mm | X (mm) |
| bbox_y_mm | Y (mm) |
| bbox_z_mm | Z (mm) |
| notes | Notes |
| part_name_supplier | Supplier Part Name |

`part_description` = translated English name (was `part_name_english` internally).
`part_name_supplier` = original Chinese name from the supplier file (last column, was `part_name_original`).

---

## Architecture Notes (Critical: Read Before Debugging)

### GBK Encoding Fix

STEP files from Chinese CAD tools (SolidWorks CN, etc.) embed raw GBK bytes in `PRODUCT` entity name strings. This caused three separate problems, each with a different fix:

**Problem 1: BOM shows garbled Chinese (mojibake)**

OpenCASCADE (OCC) / build123d's STEP reader applies an internal codec that maps each 2-byte GBK sequence to incorrect Unicode codepoints. This is NOT reversible via a latin-1 → GBK round-trip because OCC's codec is not latin-1.

Fix: Bypass OCC entirely for part name extraction. `bom.py` reads the raw STEP file text directly with encoding detection (UTF-8 → GBK → latin-1 fallback) and parses `PRODUCT` entities via regex. This is the primary name source; the OCC assembly walk is fallback only.

**Problem 2: _EN.step still shows Chinese in CAD viewer**

The rewriter (`rewriter.py`) was reading the file as UTF-8, turning GBK bytes into replacement chars (U+FFFD). Chinese names became `???` and never matched the translation map.

Fix: `rewriter.py` uses the same encoding-detection reader as `bom.py`. The file is read as GBK when UTF-8 produces replacement chars.

**Problem 3: Only first PRODUCT name field was replaced**

ISO 10303-21 `PRODUCT` entity format: `#N = PRODUCT('id', 'name', 'description', (#...))`. Both the first and second quoted strings carry the part name. CAD viewers (including OpenCASCADE CAD Assistant) display the second field.

Fix: Updated the `PRODUCT_PATTERN` regex to capture both fields with 5 groups. Replacement writes the translated name into both positions.

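The three fixes above can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not the code in `bom.py` or `rewriter.py`: the names `decode_step_bytes`, `translate_products`, and `PRODUCT_RE` are hypothetical, the strict-decode fallback is only one way encoding detection could work, and the regex is just one possible arrangement of five groups.

```python
import re

# Hypothetical pattern (NOT the skill's actual PRODUCT_PATTERN).
# STEP strings escape an apostrophe by doubling it, hence (?:[^']|'')*.
PRODUCT_RE = re.compile(
    r"(#\d+\s*=\s*PRODUCT\s*\(\s*')"  # 1: entity header up to the first opening quote
    r"((?:[^']|'')*)"                 # 2: first quoted string
    r"('\s*,\s*')"                    # 3: separator between first and second string
    r"((?:[^']|'')*)"                 # 4: second quoted string (shown by viewers)
    r"(')"                            # 5: closing quote of the second string
)


def decode_step_bytes(raw: bytes) -> tuple[str, str]:
    """Decode raw STEP bytes, trying UTF-8, then GBK, then latin-1.

    Assumption: strict decoding is used, so a failed UTF-8 decode raises
    instead of producing U+FFFD replacement characters.
    """
    for encoding in ("utf-8", "gbk"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a codepoint, so this never fails.
    return raw.decode("latin-1"), "latin-1"


def translate_products(text: str, translations: dict[str, str]) -> tuple[str, int]:
    """Replace both PRODUCT name fields using a {original: english} map.

    Returns the rewritten text and the number of PRODUCT entities changed.
    """
    changed = 0

    def lookup(name: str) -> str:
        english = translations.get(name)
        # Unmatched names pass through untouched; new names get STEP escaping.
        return name if english is None else english.replace("'", "''")

    def swap(match: re.Match) -> str:
        nonlocal changed
        first, second = lookup(match.group(2)), lookup(match.group(4))
        if (first, second) != (match.group(2), match.group(4)):
            changed += 1
        return match.group(1) + first + match.group(3) + second + match.group(5)

    return PRODUCT_RE.sub(swap, text), changed


if __name__ == "__main__":
    # A single GBK-encoded PRODUCT line standing in for a supplier file.
    sample = "#12 = PRODUCT('盖板','盖板','',(#8));\n".encode("gbk")
    decoded, used = decode_step_bytes(sample)
    rewritten, count = translate_products(decoded, {"盖板": "Cover plate"})
    print(used, count, rewritten, end="")
```

Strict decoding is a heuristic: a GBK byte sequence can occasionally be valid UTF-8 by coincidence, so a production reader may want an additional sanity check on the decoded names.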
### Translation API Model

Current model: `claude-haiku-4-5-20251001`
The previous value, `claude-sonnet-4-20250514`, was returning 404; updated in `translator.py`.

### BOM Excel Library

`openpyxl` must be installed in the venv. The skill silently falls back to CSV if it is missing. Don't rely on this fallback; install it explicitly.

---

## Prerequisites for a New macOS Computer

### Hardware Requirements

- Apple Silicon Mac (M1/M2/M3/M4); all wheels are native arm64
- **No Rosetta required**
- macOS 13 (Ventura) or later recommended

### Software Stack

```
Python 3.10–3.12        (3.13 works but less tested; 3.9 too old)
Homebrew                (for cairo system libs)
~/step-processor-env    (Python venv; all pip packages go here)
```

### Python Packages (pip install in venv)

```bash
pip install cadquery-ocp          # OCC kernel, native arm64, ~300MB
pip install build123d             # STEP loader, primary
pip install trimesh pyrender      # thumbnail rendering pipeline
pip install Pillow numpy pandas   # image and data processing
pip install anthropic             # Claude API client
pip install openpyxl              # Excel BOM output (REQUIRED)
pip install cairosvg              # SVG→PNG/PDF for diagrams (optional, diagrams only)
```

### System Libraries (Homebrew)

```bash
brew install cairo pango gdk-pixbuf libffi   # required for cairosvg
```

### FreeCAD Fallback (Optional)

Only needed if build123d fails on a specific file.

- Download the official arm64 `FreeCAD.app` from https://github.com/FreeCAD/FreeCAD/releases/latest
- Drag to `/Applications`, launch once so macOS approves it
- Verify: `/Applications/FreeCAD.app/Contents/Resources/bin/freecadcmd --version`

> **macOS 15 Sequoia:** conda-forge FreeCAD is killed by Gatekeeper. Use the official signed `.app` only.

### Anthropic API Key

The key must be available in the environment where the processor runs.

**For interactive terminal use:**

```bash
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
source ~/.zshrc
```

**For Desktop Commander / CoWork (non-interactive shell):**

```bash
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv
```

`~/.zshrc` is only sourced in interactive shells. Desktop Commander spawns a non-interactive zsh, which reads `~/.zshenv` instead. **Both files should have the key.**

---

## Setup: Fast Path (New Mac)

```bash
# 1. Create venv
python3 -m venv ~/step-processor-env
source ~/step-processor-env/bin/activate

# 2. Install all packages
pip install --upgrade pip
pip install cadquery-ocp build123d trimesh pyrender Pillow numpy pandas anthropic openpyxl

# 3. System libs for diagram export
brew install cairo pango gdk-pixbuf libffi
pip install cairosvg

# 4. API key (do BOTH for interactive + Desktop Commander)
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv
source ~/.zshrc

# 5. Verify
python -c "import build123d, openpyxl, anthropic, trimesh; print('ALL OK')"

# 6. Test run
cd /path/to/step-skill-folder
python step_processor.py yourfile.step --no-thumbnails --verbose
```

Expected output for a Chinese STEP file:

```
INFO [build123d] Loaded: yourfile.step | NNNN faces | NN parts
INFO STEP text parser found NN unique part names
INFO BOM extracted: NN parts
INFO BOM XLSX → yourfile_bom.xlsx
INFO Chinese part names detected — auto-translating
INFO API returned NN translations
INFO _EN.step written: yourfile_EN.step (NN labels replaced)
```

---

## File Structure

```
STEP File Skill/
  step_processor.py          ← CLI entry point
  modules/
    __init__.py
    loader.py                ← build123d load + GBK mojibake helper
    bom.py                   ← STEP text parser + MPM-branded xlsx output
    renderer.py              ← 6-view PNG thumbnails (pyrender)
    translator.py            ← Claude API translation (claude-haiku-4-5-20251001)
    rewriter.py              ← _EN.step writer (GBK-aware, both PRODUCT fields)
    query_engine.py          ← Natural language geometry queries + REPL
    external_diagram.py      ← Dimensional diagram generator
  schemas/
    external_diagram_schema.json
    parts_mapping_schema.json
  templates/
    datablock_template.md
  SKILL.md                   ← Claude skill instructions
  INSTALL.md                 ← Library setup details
  SETUP_CHECKLIST.md         ← Step-by-step setup + progressive tests
  COWORK_CONTEXT.md          ← This file
```

---

## CLI Reference

```bash
# Activate environment first (every new terminal session)
source ~/step-processor-env/bin/activate
cd /path/to/STEP\ File\ Skill

# Default: thumbnails + BOM + auto-translate if Chinese detected
python step_processor.py enclosure.step

# BOM only (no thumbnails, no translate)
python step_processor.py enclosure.step --no-thumbnails --no-translate

# Force translation even if auto-detect is off
python step_processor.py enclosure.step --translate

# Single geometric query and exit
python step_processor.py enclosure.step --query "list all mounting holes"

# Interactive geometry REPL
python step_processor.py enclosure.step --repl

# External dimensional diagram
python step_processor.py enclosure.step --diagram
python step_processor.py enclosure.step --diagram --diagram-pdf
python step_processor.py enclosure.step --diagram --diagram-mode enclosure_plus_mounting

# Verbose (shows backend selection, timing, fallback notices)
python step_processor.py enclosure.step --verbose
```

---

## Known Issues and Pending Work

### BOM bbox Enrichment (partial)

Rows beyond the first only get 5.0 × 5.0 × 5.0mm placeholder bounding boxes. The OCC child enumeration only retrieves the top-level shape's children correctly.

Root cause: build123d's `.children` on compound shapes doesn't walk sub-assemblies for bbox.

Fix: map part names from the STEP text parser back to OCC children by label, which is non-trivial due to OCC label mangling on CJK files.

### Mounting Hole Filter: Minimum Diameter

The query engine's mounting hole detector returns PCB vias (0.4mm diameter) alongside actual mounting holes. It needs a minimum diameter threshold (recommend 2.0mm floor). Pending update to `query_engine.py`.

### Phase 7 Full Test

The SETUP_CHECKLIST.md Phase 7 tests have been validated for Tests 1, 2, 4 (BOM, translation, rewrite). Tests 3 (thumbnails), 5 (REPL), 6/7 (diagrams) not yet re-run post GBK fix. Thumbnails were verified in an earlier session; diagram code is scaffolded but output quality against production files not fully validated.

### Conda freecad_env Cleanup

From a prior session: `conda activate freecad_env && conda deactivate && conda env remove -n freecad_env`. The conda FreeCAD approach was abandoned in favor of the signed FreeCAD.app. This env is dead weight on the local machine.

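For the mounting hole filter issue listed above, the pending change could be as small as a diameter floor applied to the detector's results. The sketch below is an assumption about shape only: the `Hole` record, the `filter_mounting_holes` function, and the constant name are hypothetical and do not reflect how `query_engine.py` currently represents holes. The 2.0mm default is the floor recommended in that note.

```python
from dataclasses import dataclass

# Recommended floor from the Known Issues note; the constant name is hypothetical.
MIN_MOUNTING_HOLE_DIAMETER_MM = 2.0


@dataclass(frozen=True)
class Hole:
    """Hypothetical hole record: diameter plus centre point, both in millimetres."""
    diameter_mm: float
    center_mm: tuple[float, float, float]


def filter_mounting_holes(
    holes: list[Hole],
    min_diameter_mm: float = MIN_MOUNTING_HOLE_DIAMETER_MM,
) -> list[Hole]:
    """Keep only holes at or above the diameter floor, preserving input order."""
    return [hole for hole in holes if hole.diameter_mm >= min_diameter_mm]


if __name__ == "__main__":
    detected = [
        Hole(0.4, (1.0, 1.0, 0.0)),    # PCB via, should be dropped
        Hole(3.2, (10.0, 10.0, 0.0)),  # plausible mounting hole
        Hole(2.0, (20.0, 10.0, 0.0)),  # exactly at the floor, kept
    ]
    for hole in filter_mounting_holes(detected):
        print(hole)
```

Keeping the threshold as a parameter rather than a hard-coded value would let a query override it for assemblies with unusually small fasteners.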
---

## Library Stack Decision Log

| Decision | Choice | Reason |
|----------|--------|--------|
| CAD kernel | build123d | Native arm64; clean API; same OCC as existing viewer |
| Fallback CAD | FreeCAD.app (signed) | conda-forge builds killed by Gatekeeper on Sequoia |
| Translation | Claude Haiku API | Batched, manufacturing-context prompted, flags ambiguity |
| Diagrams | SVG-first + cairosvg | No GUI dependency; vector quality; cairosvg handles PNG/PDF |
| Excel | openpyxl | MPM brand formatting, column control, frozen panes |
| Rejected | pythonocc | Rosetta/x64 conda dependency; a non-starter on Apple Silicon |
| Rejected | conda-forge FreeCAD | Unsigned binaries killed by Gatekeeper on macOS 15 |

---

## Planned Integration Points

- **Odoo V18 MRP**: Model number from `meta.json` triggers lookup of weight, BOM cost, stock status
- **CoWork product docs**: `meta.json` feeds product data cards; diagram PNGs embed in Knowledge Base articles
- **RDMC** (`rdmc.messagepoint.tv`): Thumbnail PNGs for display inventory visual reference
- **OnSign.tv**: Enclosure dimensions for content sizing reference