This commit is contained in:
Jason Stedwell
2026-06-17 16:03:26 -05:00
parent fa1e9b68c7
commit c1abe36822
99 changed files with 1562887 additions and 0 deletions
+276
View File
@@ -0,0 +1,276 @@
# STEP Processor Skill — Project Context
### CoWork Memory Document · MPMedia Engineering
**Updated:** June 2026 · **Status:** ✅ FULLY TESTED — Smoke-tested on 3 Chinese-sourced STEP files
---
## What This Is
A Python-based CAD processing skill for MPMedia's engineering workflow. Reads STEP/STP files (display enclosures, mounting kits, hardware assemblies) and automates: thumbnail generation, parts BOM export (MPM-branded Excel), Chinese-to-English label translation, STEP file rewriting with English labels, geometric queries, and external dimensional diagram creation.
Lives as a CoWork skill folder with a `SKILL.md`. Callable from Claude or directly from CLI.
---
## Current Working State (as of June 2026)
### Smoke Test Results
All three production STEP files tested end-to-end:
| File | Faces | Parts (STEP parser) | Translations | _EN.step |
|------|-------|---------------------|--------------|----------|
| MR16s Gen1 | 5,655 | 28 | 16/16 | ✅ All English |
| MR27s Gen1 | 6,732 | 31 | 18/18 | ✅ All English |
| MR28uws Gen1 | 5,891 | 31 | 20/20 | ✅ All English |
### What Works
- **STEP loading**: build123d primary, FreeCAD headless fallback
- **GBK encoding**: Chinese CAD files fully decoded (see Architecture Notes below)
- **BOM extraction**: STEP text parser as primary source — correctly reads Chinese names
- **Translation**: Claude API (`claude-haiku-4-5-20251001`) — batched, single call per file
- **STEP rewriter**: `_EN.step` produced with English labels in both PRODUCT name fields
- **BOM export**: MPM-branded `.xlsx` (Montserrat/Open Sans, Dark Shade header, Gold border, alternating rows)
- **Thumbnails**: 6 PNG views via pyrender (front, rear, left, right, iso_left, iso_right)
### BOM Output Format
Columns in Excel output order:
| Column | Header in Excel |
|--------|----------------|
| part_number | Part # |
| part_description | Part Description |
| quantity | Qty |
| level | Level |
| parent | Parent |
| bbox_x_mm | X (mm) |
| bbox_y_mm | Y (mm) |
| bbox_z_mm | Z (mm) |
| notes | Notes |
| part_name_supplier | Supplier Part Name |
`part_description` = translated English name (was `part_name_english` internally).
`part_name_supplier` = original Chinese name from supplier file (last column, was `part_name_original`).
---
## Architecture Notes (Critical — Read Before Debugging)
### GBK Encoding Fix
STEP files from Chinese CAD tools (SolidWorks CN, etc.) embed raw GBK bytes in `PRODUCT` entity name strings. This caused two separate problems, each with a different fix:
**Problem 1: BOM shows garbled Chinese (mojibake)**
OpenCASCADE (OCC) / build123d's STEP reader applies an internal codec that maps each 2-byte GBK sequence to incorrect Unicode codepoints. This is NOT reversible via latin-1 → GBK round-trip because OCC's codec is not latin-1.
Fix: Bypass OCC entirely for part name extraction. `bom.py` reads the raw STEP file text directly with encoding detection (UTF-8 → GBK → latin-1 fallback) and parses `PRODUCT` entities via regex. This is the primary name source; OCC assembly walk is fallback only.
**Problem 2: _EN.step still shows Chinese in CAD viewer**
The rewriter (`rewriter.py`) was reading the file as UTF-8, turning GBK bytes into replacement chars (U+FFFD). Chinese names became `???` and never matched the translation map.
Fix: `rewriter.py` uses the same encoding-detection reader as `bom.py`. File is read as GBK when UTF-8 produces replacement chars.
**Problem 3: Only first PRODUCT name field was replaced**
ISO 10303-21 `PRODUCT` entity format: `#N = PRODUCT('id', 'name', 'description', (#...))`. Both the first and second quoted strings carry the part name. CAD viewers (including OpenCASCADE CAD Assistant) display the second field.
Fix: Updated `PRODUCT_PATTERN` regex to capture both fields with 5 groups. Replacement writes the translated name into both positions.
### Translation API Model
Current model: `claude-haiku-4-5-20251001`
Previous value `claude-sonnet-4-20250514` was returning 404 — updated in `translator.py`.
### BOM Excel Library
`openpyxl` must be installed in the venv. The skill falls back to CSV silently if it's missing — don't rely on this fallback, install it explicitly.
---
## Prerequisites for a New macOS Computer
### Hardware Requirements
- Apple Silicon Mac (M1/M2/M3/M4) — all wheels are native arm64
- **No Rosetta required**
- macOS 13 Sequoia or later recommended
### Software Stack
```
Python 3.103.12 (3.13 works but less tested; 3.9 too old)
Homebrew (for cairo system libs)
~/step-processor-env (Python venv — all pip packages go here)
```
### Python Packages (pip install in venv)
```bash
pip install cadquery-ocp # OCC kernel, native arm64, ~300MB
pip install build123d # STEP loader, primary
pip install trimesh pyrender # thumbnail rendering pipeline
pip install Pillow numpy pandas # image and data processing
pip install anthropic # Claude API client
pip install openpyxl # Excel BOM output — REQUIRED
pip install cairosvg # SVG→PNG/PDF for diagrams (optional, diagrams only)
```
### System Libraries (Homebrew)
```bash
brew install cairo pango gdk-pixbuf libffi # required for cairosvg
```
### FreeCAD Fallback (Optional)
Only needed if build123d fails on a specific file.
- Download official arm64 `FreeCAD.app` from https://github.com/FreeCAD/FreeCAD/releases/latest
- Drag to `/Applications`, launch once so macOS approves it
- Verify: `/Applications/FreeCAD.app/Contents/Resources/bin/freecadcmd --version`
> **macOS 15 Sequoia:** conda-forge FreeCAD is killed by Gatekeeper. Use the official signed `.app` only.
### Anthropic API Key
The key must be available in the environment where the processor runs.
**For interactive terminal use:**
```bash
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
source ~/.zshrc
```
**For Desktop Commander / CoWork (non-interactive shell):**
```bash
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv
```
`~/.zshrc` is only sourced in interactive shells. Desktop Commander spawns non-interactive zsh — it reads `~/.zshenv` instead. **Both files should have the key.**
---
## Setup — Fast Path (New Mac)
```bash
# 1. Create venv
python3 -m venv ~/step-processor-env
source ~/step-processor-env/bin/activate
# 2. Install all packages
pip install --upgrade pip
pip install cadquery-ocp build123d trimesh pyrender Pillow numpy pandas anthropic openpyxl
# 3. System libs for diagram export
brew install cairo pango gdk-pixbuf libffi
pip install cairosvg
# 4. API key (do BOTH for interactive + Desktop Commander)
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv
source ~/.zshrc
# 5. Verify
python -c "import build123d, openpyxl, anthropic, trimesh; print('ALL OK')"
# 6. Test run
cd /path/to/step-skill-folder
python step_processor.py yourfile.step --no-thumbnails --verbose
```
Expected output for Chinese STEP file:
```
INFO [build123d] Loaded: yourfile.step | NNNN faces | NN parts
INFO STEP text parser found NN unique part names
INFO BOM extracted: NN parts
INFO BOM XLSX → yourfile_bom.xlsx
INFO Chinese part names detected — auto-translating
INFO API returned NN translations
INFO _EN.step written: yourfile_EN.step (NN labels replaced)
```
---
## File Structure
```
STEP File Skill/
step_processor.py ← CLI entry point
modules/
__init__.py
loader.py ← build123d load + GBK mojibake helper
bom.py ← STEP text parser + MPM-branded xlsx output
renderer.py ← 6-view PNG thumbnails (pyrender)
translator.py ← Claude API translation (claude-haiku-4-5-20251001)
rewriter.py ← _EN.step writer (GBK-aware, both PRODUCT fields)
query_engine.py ← Natural language geometry queries + REPL
external_diagram.py ← Dimensional diagram generator
schemas/
external_diagram_schema.json
parts_mapping_schema.json
templates/
datablock_template.md
SKILL.md ← Claude skill instructions
INSTALL.md ← Library setup details
SETUP_CHECKLIST.md ← Step-by-step setup + progressive tests
COWORK_CONTEXT.md ← This file
```
---
## CLI Reference
```bash
# Activate environment first (every new terminal session)
source ~/step-processor-env/bin/activate
cd /path/to/STEP\ File\ Skill
# Default: thumbnails + BOM + auto-translate if Chinese detected
python step_processor.py enclosure.step
# BOM only (no thumbnails, no translate)
python step_processor.py enclosure.step --no-thumbnails --no-translate
# Force translation even if auto-detect is off
python step_processor.py enclosure.step --translate
# Single geometric query and exit
python step_processor.py enclosure.step --query "list all mounting holes"
# Interactive geometry REPL
python step_processor.py enclosure.step --repl
# External dimensional diagram
python step_processor.py enclosure.step --diagram
python step_processor.py enclosure.step --diagram --diagram-pdf
python step_processor.py enclosure.step --diagram --diagram-mode enclosure_plus_mounting
# Verbose (shows backend selection, timing, fallback notices)
python step_processor.py enclosure.step --verbose
```
---
## Known Issues and Pending Work
### BOM bbox Enrichment (partial)
Rows beyond the first only get 5.0 × 5.0 × 5.0mm placeholder bounding boxes. The OCC child enumeration only retrieves the top-level shape's children correctly. Root cause: build123d's `.children` on compound shapes doesn't walk sub-assemblies for bbox. Fix: map part names from STEP text parser back to OCC children by label — non-trivial due to OCC label mangling on CJK files.
### Mounting Hole Filter — Minimum Diameter
The query engine's mounting hole detector returns PCB vias (0.4mm diameter) alongside actual mounting holes. Needs a minimum diameter threshold (recommend 2.0mm floor). Pending update to `query_engine.py`.
### Phase 7 Full Test
The SETUP_CHECKLIST.md Phase 7 tests have been validated for Tests 1, 2, 4 (BOM, translation, rewrite). Tests 3 (thumbnails), 5 (REPL), 6/7 (diagrams) not yet re-run post GBK fix. Thumbnails were verified in an earlier session; diagram code is scaffolded but output quality against production files not fully validated.
### Conda freecad_env Cleanup
From a prior session: `conda activate freecad_env && conda deactivate && conda env remove -n freecad_env`. The conda FreeCAD approach was abandoned in favor of the signed FreeCAD.app. This env is dead weight on the local machine.
---
## Library Stack Decision Log
| Decision | Choice | Reason |
|----------|--------|--------|
| CAD kernel | build123d | Native arm64 arm; clean API; same OCC as existing viewer |
| Fallback CAD | FreeCAD.app (signed) | conda-forge builds killed by Gatekeeper on Sequoia |
| Translation | Claude Haiku API | Batched, manufacturing-context prompted, flags ambiguity |
| Diagrams | SVG-first + cairosvg | No GUI dependency; vector quality; cairosvg handles PNG/PDF |
| Excel | openpyxl | MPM brand formatting, column control, frozen panes |
| Rejected | pythonocc | Rosetta/x64 conda dependency — non-starter on Apple Silicon |
| Rejected | conda-forge FreeCAD | Unsigned binaries killed by Gatekeeper on macOS 15 |
---
## Planned Integration Points
- **Odoo V18 MRP**: Model number from `meta.json` triggers lookup of weight, BOM cost, stock status
- **CoWork product docs**: `meta.json` feeds product data cards; diagram PNGs embed in Knowledge Base articles
- **RDMC** (`rdmc.messagepoint.tv`): Thumbnail PNGs for display inventory visual reference
- **OnSign.tv**: Enclosure dimensions for content sizing reference