Files
step-parse/skill.src/COWORK_CONTEXT.md
T
Jason Stedwell c1abe36822 phase 0
2026-06-17 16:03:26 -05:00

277 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# STEP Processor Skill — Project Context
### CoWork Memory Document · MPMedia Engineering
**Updated:** June 2026 · **Status:** ✅ FULLY TESTED — Smoke-tested on 3 Chinese-sourced STEP files
---
## What This Is
A Python-based CAD processing skill for MPMedia's engineering workflow. Reads STEP/STP files (display enclosures, mounting kits, hardware assemblies) and automates: thumbnail generation, parts BOM export (MPM-branded Excel), Chinese-to-English label translation, STEP file rewriting with English labels, geometric queries, and external dimensional diagram creation.
Lives as a CoWork skill folder with a `SKILL.md`. Callable from Claude or directly from CLI.
---
## Current Working State (as of June 2026)
### Smoke Test Results
All three production STEP files tested end-to-end:
| File | Faces | Parts (STEP parser) | Translations | _EN.step |
|------|-------|---------------------|--------------|----------|
| MR16s Gen1 | 5,655 | 28 | 16/16 | ✅ All English |
| MR27s Gen1 | 6,732 | 31 | 18/18 | ✅ All English |
| MR28uws Gen1 | 5,891 | 31 | 20/20 | ✅ All English |
### What Works
- **STEP loading**: build123d primary, FreeCAD headless fallback
- **GBK encoding**: Chinese CAD files fully decoded (see Architecture Notes below)
- **BOM extraction**: STEP text parser as primary source — correctly reads Chinese names
- **Translation**: Claude API (`claude-haiku-4-5-20251001`) — batched, single call per file
- **STEP rewriter**: `_EN.step` produced with English labels in both PRODUCT name fields
- **BOM export**: MPM-branded `.xlsx` (Montserrat/Open Sans, Dark Shade header, Gold border, alternating rows)
- **Thumbnails**: 6 PNG views via pyrender (front, rear, left, right, iso_left, iso_right)
### BOM Output Format
Columns in Excel output order:
| Column | Header in Excel |
|--------|----------------|
| part_number | Part # |
| part_description | Part Description |
| quantity | Qty |
| level | Level |
| parent | Parent |
| bbox_x_mm | X (mm) |
| bbox_y_mm | Y (mm) |
| bbox_z_mm | Z (mm) |
| notes | Notes |
| part_name_supplier | Supplier Part Name |
`part_description` = translated English name (was `part_name_english` internally).
`part_name_supplier` = original Chinese name from supplier file (last column, was `part_name_original`).
---
## Architecture Notes (Critical — Read Before Debugging)
### GBK Encoding Fix
STEP files from Chinese CAD tools (SolidWorks CN, etc.) embed raw GBK bytes in `PRODUCT` entity name strings. This caused two separate problems, each with a different fix:
**Problem 1: BOM shows garbled Chinese (mojibake)**
OpenCASCADE (OCC) / build123d's STEP reader applies an internal codec that maps each 2-byte GBK sequence to incorrect Unicode codepoints. This is NOT reversible via latin-1 → GBK round-trip because OCC's codec is not latin-1.
Fix: Bypass OCC entirely for part name extraction. `bom.py` reads the raw STEP file text directly with encoding detection (UTF-8 → GBK → latin-1 fallback) and parses `PRODUCT` entities via regex. This is the primary name source; OCC assembly walk is fallback only.
**Problem 2: _EN.step still shows Chinese in CAD viewer**
The rewriter (`rewriter.py`) was reading the file as UTF-8, turning GBK bytes into replacement chars (U+FFFD). Chinese names became `???` and never matched the translation map.
Fix: `rewriter.py` uses the same encoding-detection reader as `bom.py`. File is read as GBK when UTF-8 produces replacement chars.
**Problem 3: Only first PRODUCT name field was replaced**
ISO 10303-21 `PRODUCT` entity format: `#N = PRODUCT('id', 'name', 'description', (#...))`. Both the first and second quoted strings carry the part name. CAD viewers (including OpenCASCADE CAD Assistant) display the second field.
Fix: Updated `PRODUCT_PATTERN` regex to capture both fields with 5 groups. Replacement writes the translated name into both positions.
### Translation API Model
Current model: `claude-haiku-4-5-20251001`
Previous value `claude-sonnet-4-20250514` was returning 404 — updated in `translator.py`.
### BOM Excel Library
`openpyxl` must be installed in the venv. The skill falls back to CSV silently if it's missing — don't rely on this fallback, install it explicitly.
---
## Prerequisites for a New macOS Computer
### Hardware Requirements
- Apple Silicon Mac (M1/M2/M3/M4) — all wheels are native arm64
- **No Rosetta required**
- macOS 13 Sequoia or later recommended
### Software Stack
```
Python 3.103.12 (3.13 works but less tested; 3.9 too old)
Homebrew (for cairo system libs)
~/step-processor-env (Python venv — all pip packages go here)
```
### Python Packages (pip install in venv)
```bash
pip install cadquery-ocp # OCC kernel, native arm64, ~300MB
pip install build123d # STEP loader, primary
pip install trimesh pyrender # thumbnail rendering pipeline
pip install Pillow numpy pandas # image and data processing
pip install anthropic # Claude API client
pip install openpyxl # Excel BOM output — REQUIRED
pip install cairosvg # SVG→PNG/PDF for diagrams (optional, diagrams only)
```
### System Libraries (Homebrew)
```bash
brew install cairo pango gdk-pixbuf libffi # required for cairosvg
```
### FreeCAD Fallback (Optional)
Only needed if build123d fails on a specific file.
- Download official arm64 `FreeCAD.app` from https://github.com/FreeCAD/FreeCAD/releases/latest
- Drag to `/Applications`, launch once so macOS approves it
- Verify: `/Applications/FreeCAD.app/Contents/Resources/bin/freecadcmd --version`
> **macOS 15 Sequoia:** conda-forge FreeCAD is killed by Gatekeeper. Use the official signed `.app` only.
### Anthropic API Key
The key must be available in the environment where the processor runs.
**For interactive terminal use:**
```bash
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
source ~/.zshrc
```
**For Desktop Commander / CoWork (non-interactive shell):**
```bash
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv
```
`~/.zshrc` is only sourced in interactive shells. Desktop Commander spawns non-interactive zsh — it reads `~/.zshenv` instead. **Both files should have the key.**
---
## Setup — Fast Path (New Mac)
```bash
# 1. Create venv
python3 -m venv ~/step-processor-env
source ~/step-processor-env/bin/activate
# 2. Install all packages
pip install --upgrade pip
pip install cadquery-ocp build123d trimesh pyrender Pillow numpy pandas anthropic openpyxl
# 3. System libs for diagram export
brew install cairo pango gdk-pixbuf libffi
pip install cairosvg
# 4. API key (do BOTH for interactive + Desktop Commander)
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv
source ~/.zshrc
# 5. Verify
python -c "import build123d, openpyxl, anthropic, trimesh; print('ALL OK')"
# 6. Test run
cd /path/to/step-skill-folder
python step_processor.py yourfile.step --no-thumbnails --verbose
```
Expected output for Chinese STEP file:
```
INFO [build123d] Loaded: yourfile.step | NNNN faces | NN parts
INFO STEP text parser found NN unique part names
INFO BOM extracted: NN parts
INFO BOM XLSX → yourfile_bom.xlsx
INFO Chinese part names detected — auto-translating
INFO API returned NN translations
INFO _EN.step written: yourfile_EN.step (NN labels replaced)
```
---
## File Structure
```
STEP File Skill/
step_processor.py ← CLI entry point
modules/
__init__.py
loader.py ← build123d load + GBK mojibake helper
bom.py ← STEP text parser + MPM-branded xlsx output
renderer.py ← 6-view PNG thumbnails (pyrender)
translator.py ← Claude API translation (claude-haiku-4-5-20251001)
rewriter.py ← _EN.step writer (GBK-aware, both PRODUCT fields)
query_engine.py ← Natural language geometry queries + REPL
external_diagram.py ← Dimensional diagram generator
schemas/
external_diagram_schema.json
parts_mapping_schema.json
templates/
datablock_template.md
SKILL.md ← Claude skill instructions
INSTALL.md ← Library setup details
SETUP_CHECKLIST.md ← Step-by-step setup + progressive tests
COWORK_CONTEXT.md ← This file
```
---
## CLI Reference
```bash
# Activate environment first (every new terminal session)
source ~/step-processor-env/bin/activate
cd /path/to/STEP\ File\ Skill
# Default: thumbnails + BOM + auto-translate if Chinese detected
python step_processor.py enclosure.step
# BOM only (no thumbnails, no translate)
python step_processor.py enclosure.step --no-thumbnails --no-translate
# Force translation even if auto-detect is off
python step_processor.py enclosure.step --translate
# Single geometric query and exit
python step_processor.py enclosure.step --query "list all mounting holes"
# Interactive geometry REPL
python step_processor.py enclosure.step --repl
# External dimensional diagram
python step_processor.py enclosure.step --diagram
python step_processor.py enclosure.step --diagram --diagram-pdf
python step_processor.py enclosure.step --diagram --diagram-mode enclosure_plus_mounting
# Verbose (shows backend selection, timing, fallback notices)
python step_processor.py enclosure.step --verbose
```
---
## Known Issues and Pending Work
### BOM bbox Enrichment (partial)
Rows beyond the first only get 5.0 × 5.0 × 5.0mm placeholder bounding boxes. The OCC child enumeration only retrieves the top-level shape's children correctly. Root cause: build123d's `.children` on compound shapes doesn't walk sub-assemblies for bbox. Fix: map part names from STEP text parser back to OCC children by label — non-trivial due to OCC label mangling on CJK files.
### Mounting Hole Filter — Minimum Diameter
The query engine's mounting hole detector returns PCB vias (0.4mm diameter) alongside actual mounting holes. Needs a minimum diameter threshold (recommend 2.0mm floor). Pending update to `query_engine.py`.
### Phase 7 Full Test
The SETUP_CHECKLIST.md Phase 7 tests have been validated for Tests 1, 2, 4 (BOM, translation, rewrite). Tests 3 (thumbnails), 5 (REPL), 6/7 (diagrams) not yet re-run post GBK fix. Thumbnails were verified in an earlier session; diagram code is scaffolded but output quality against production files not fully validated.
### Conda freecad_env Cleanup
From a prior session: `conda activate freecad_env && conda deactivate && conda env remove -n freecad_env`. The conda FreeCAD approach was abandoned in favor of the signed FreeCAD.app. This env is dead weight on the local machine.
---
## Library Stack Decision Log
| Decision | Choice | Reason |
|----------|--------|--------|
| CAD kernel | build123d | Native arm64 arm; clean API; same OCC as existing viewer |
| Fallback CAD | FreeCAD.app (signed) | conda-forge builds killed by Gatekeeper on Sequoia |
| Translation | Claude Haiku API | Batched, manufacturing-context prompted, flags ambiguity |
| Diagrams | SVG-first + cairosvg | No GUI dependency; vector quality; cairosvg handles PNG/PDF |
| Excel | openpyxl | MPM brand formatting, column control, frozen panes |
| Rejected | pythonocc | Rosetta/x64 conda dependency — non-starter on Apple Silicon |
| Rejected | conda-forge FreeCAD | Unsigned binaries killed by Gatekeeper on macOS 15 |
---
## Planned Integration Points
- **Odoo V18 MRP**: Model number from `meta.json` triggers lookup of weight, BOM cost, stock status
- **CoWork product docs**: `meta.json` feeds product data cards; diagram PNGs embed in Knowledge Base articles
- **RDMC** (`rdmc.messagepoint.tv`): Thumbnail PNGs for display inventory visual reference
- **OnSign.tv**: Enclosure dimensions for content sizing reference