Files
step-parse/skill.src/COWORK_CONTEXT.md
T
Jason Stedwell c1abe36822 phase 0
2026-06-17 16:03:26 -05:00

12 KiB
Raw Blame History

STEP Processor Skill — Project Context

CoWork Memory Document · MPMedia Engineering

Updated: June 2026 · Status: FULLY TESTED — Smoke-tested on 3 Chinese-sourced STEP files


What This Is

A Python-based CAD processing skill for MPMedia's engineering workflow. Reads STEP/STP files (display enclosures, mounting kits, hardware assemblies) and automates: thumbnail generation, parts BOM export (MPM-branded Excel), Chinese-to-English label translation, STEP file rewriting with English labels, geometric queries, and external dimensional diagram creation.

Lives as a CoWork skill folder with a SKILL.md. Callable from Claude or directly from CLI.


Current Working State (as of June 2026)

Smoke Test Results

All three production STEP files tested end-to-end:

File Faces Parts (STEP parser) Translations _EN.step
MR16s Gen1 5,655 28 16/16 All English
MR27s Gen1 6,732 31 18/18 All English
MR28uws Gen1 5,891 31 20/20 All English

What Works

  • STEP loading: build123d primary, FreeCAD headless fallback
  • GBK encoding: Chinese CAD files fully decoded (see Architecture Notes below)
  • BOM extraction: STEP text parser as primary source — correctly reads Chinese names
  • Translation: Claude API (claude-haiku-4-5-20251001) — batched, single call per file
  • STEP rewriter: _EN.step produced with English labels in both PRODUCT name fields
  • BOM export: MPM-branded .xlsx (Montserrat/Open Sans, Dark Shade header, Gold border, alternating rows)
  • Thumbnails: 6 PNG views via pyrender (front, rear, left, right, iso_left, iso_right)

BOM Output Format

Columns in Excel output order:

Column Header in Excel
part_number Part #
part_description Part Description
quantity Qty
level Level
parent Parent
bbox_x_mm X (mm)
bbox_y_mm Y (mm)
bbox_z_mm Z (mm)
notes Notes
part_name_supplier Supplier Part Name

part_description = translated English name (was part_name_english internally). part_name_supplier = original Chinese name from supplier file (last column, was part_name_original).


Architecture Notes (Critical — Read Before Debugging)

GBK Encoding Fix

STEP files from Chinese CAD tools (SolidWorks CN, etc.) embed raw GBK bytes in PRODUCT entity name strings. This caused two separate problems, each with a different fix:

Problem 1: BOM shows garbled Chinese (mojibake) OpenCASCADE (OCC) / build123d's STEP reader applies an internal codec that maps each 2-byte GBK sequence to incorrect Unicode codepoints. This is NOT reversible via latin-1 → GBK round-trip because OCC's codec is not latin-1.

Fix: Bypass OCC entirely for part name extraction. bom.py reads the raw STEP file text directly with encoding detection (UTF-8 → GBK → latin-1 fallback) and parses PRODUCT entities via regex. This is the primary name source; OCC assembly walk is fallback only.

Problem 2: _EN.step still shows Chinese in CAD viewer The rewriter (rewriter.py) was reading the file as UTF-8, turning GBK bytes into replacement chars (U+FFFD). Chinese names became ??? and never matched the translation map.

Fix: rewriter.py uses the same encoding-detection reader as bom.py. File is read as GBK when UTF-8 produces replacement chars.

Problem 3: Only first PRODUCT name field was replaced ISO 10303-21 PRODUCT entity format: #N = PRODUCT('id', 'name', 'description', (#...)). Both the first and second quoted strings carry the part name. CAD viewers (including OpenCASCADE CAD Assistant) display the second field.

Fix: Updated PRODUCT_PATTERN regex to capture both fields with 5 groups. Replacement writes the translated name into both positions.

Translation API Model

Current model: claude-haiku-4-5-20251001 Previous value claude-sonnet-4-20250514 was returning 404 — updated in translator.py.

BOM Excel Library

openpyxl must be installed in the venv. The skill falls back to CSV silently if it's missing — don't rely on this fallback, install it explicitly.


Prerequisites for a New macOS Computer

Hardware Requirements

  • Apple Silicon Mac (M1/M2/M3/M4) — all wheels are native arm64
  • No Rosetta required
  • macOS 13 Sequoia or later recommended

Software Stack

Python 3.103.12          (3.13 works but less tested; 3.9 too old)
Homebrew                  (for cairo system libs)
~/step-processor-env      (Python venv — all pip packages go here)

Python Packages (pip install in venv)

pip install cadquery-ocp        # OCC kernel, native arm64, ~300MB
pip install build123d           # STEP loader, primary
pip install trimesh pyrender    # thumbnail rendering pipeline
pip install Pillow numpy pandas # image and data processing
pip install anthropic           # Claude API client
pip install openpyxl            # Excel BOM output — REQUIRED
pip install cairosvg            # SVG→PNG/PDF for diagrams (optional, diagrams only)

System Libraries (Homebrew)

brew install cairo pango gdk-pixbuf libffi   # required for cairosvg

FreeCAD Fallback (Optional)

Only needed if build123d fails on a specific file.

macOS 15 Sequoia: conda-forge FreeCAD is killed by Gatekeeper. Use the official signed .app only.

Anthropic API Key

The key must be available in the environment where the processor runs.

For interactive terminal use:

echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
source ~/.zshrc

For Desktop Commander / CoWork (non-interactive shell):

echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv

~/.zshrc is only sourced in interactive shells. Desktop Commander spawns non-interactive zsh — it reads ~/.zshenv instead. Both files should have the key.


Setup — Fast Path (New Mac)

# 1. Create venv
python3 -m venv ~/step-processor-env
source ~/step-processor-env/bin/activate

# 2. Install all packages
pip install --upgrade pip
pip install cadquery-ocp build123d trimesh pyrender Pillow numpy pandas anthropic openpyxl

# 3. System libs for diagram export
brew install cairo pango gdk-pixbuf libffi
pip install cairosvg

# 4. API key (do BOTH for interactive + Desktop Commander)
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshrc
echo 'export ANTHROPIC_API_KEY="sk-ant-YOUR-KEY"' >> ~/.zshenv
source ~/.zshrc

# 5. Verify
python -c "import build123d, openpyxl, anthropic, trimesh; print('ALL OK')"

# 6. Test run
cd /path/to/step-skill-folder
python step_processor.py yourfile.step --no-thumbnails --verbose

Expected output for Chinese STEP file:

INFO  [build123d] Loaded: yourfile.step | NNNN faces | NN parts
INFO  STEP text parser found NN unique part names
INFO  BOM extracted: NN parts
INFO  BOM XLSX → yourfile_bom.xlsx
INFO  Chinese part names detected — auto-translating
INFO  API returned NN translations
INFO  _EN.step written: yourfile_EN.step (NN labels replaced)

File Structure

STEP File Skill/
  step_processor.py           ← CLI entry point
  modules/
    __init__.py
    loader.py                 ← build123d load + GBK mojibake helper
    bom.py                    ← STEP text parser + MPM-branded xlsx output
    renderer.py               ← 6-view PNG thumbnails (pyrender)
    translator.py             ← Claude API translation (claude-haiku-4-5-20251001)
    rewriter.py               ← _EN.step writer (GBK-aware, both PRODUCT fields)
    query_engine.py           ← Natural language geometry queries + REPL
    external_diagram.py       ← Dimensional diagram generator
  schemas/
    external_diagram_schema.json
    parts_mapping_schema.json
  templates/
    datablock_template.md
  SKILL.md                    ← Claude skill instructions
  INSTALL.md                  ← Library setup details
  SETUP_CHECKLIST.md          ← Step-by-step setup + progressive tests
  COWORK_CONTEXT.md           ← This file

CLI Reference

# Activate environment first (every new terminal session)
source ~/step-processor-env/bin/activate
cd /path/to/STEP\ File\ Skill

# Default: thumbnails + BOM + auto-translate if Chinese detected
python step_processor.py enclosure.step

# BOM only (no thumbnails, no translate)
python step_processor.py enclosure.step --no-thumbnails --no-translate

# Force translation even if auto-detect is off
python step_processor.py enclosure.step --translate

# Single geometric query and exit
python step_processor.py enclosure.step --query "list all mounting holes"

# Interactive geometry REPL
python step_processor.py enclosure.step --repl

# External dimensional diagram
python step_processor.py enclosure.step --diagram
python step_processor.py enclosure.step --diagram --diagram-pdf
python step_processor.py enclosure.step --diagram --diagram-mode enclosure_plus_mounting

# Verbose (shows backend selection, timing, fallback notices)
python step_processor.py enclosure.step --verbose

Known Issues and Pending Work

BOM bbox Enrichment (partial)

Rows beyond the first only get 5.0 × 5.0 × 5.0mm placeholder bounding boxes. The OCC child enumeration only retrieves the top-level shape's children correctly. Root cause: build123d's .children on compound shapes doesn't walk sub-assemblies for bbox. Fix: map part names from STEP text parser back to OCC children by label — non-trivial due to OCC label mangling on CJK files.

Mounting Hole Filter — Minimum Diameter

The query engine's mounting hole detector returns PCB vias (0.4mm diameter) alongside actual mounting holes. Needs a minimum diameter threshold (recommend 2.0mm floor). Pending update to query_engine.py.

Phase 7 Full Test

The SETUP_CHECKLIST.md Phase 7 tests have been validated for Tests 1, 2, 4 (BOM, translation, rewrite). Tests 3 (thumbnails), 5 (REPL), 6/7 (diagrams) not yet re-run post GBK fix. Thumbnails were verified in an earlier session; diagram code is scaffolded but output quality against production files not fully validated.

Conda freecad_env Cleanup

From a prior session: conda activate freecad_env && conda deactivate && conda env remove -n freecad_env. The conda FreeCAD approach was abandoned in favor of the signed FreeCAD.app. This env is dead weight on the local machine.


Library Stack Decision Log

Decision Choice Reason
CAD kernel build123d Native arm64 arm; clean API; same OCC as existing viewer
Fallback CAD FreeCAD.app (signed) conda-forge builds killed by Gatekeeper on Sequoia
Translation Claude Haiku API Batched, manufacturing-context prompted, flags ambiguity
Diagrams SVG-first + cairosvg No GUI dependency; vector quality; cairosvg handles PNG/PDF
Excel openpyxl MPM brand formatting, column control, frozen panes
Rejected pythonocc Rosetta/x64 conda dependency — non-starter on Apple Silicon
Rejected conda-forge FreeCAD Unsigned binaries killed by Gatekeeper on macOS 15

Planned Integration Points

  • Odoo V18 MRP: Model number from meta.json triggers lookup of weight, BOM cost, stock status
  • CoWork product docs: meta.json feeds product data cards; diagram PNGs embed in Knowledge Base articles
  • RDMC (rdmc.messagepoint.tv): Thumbnail PNGs for display inventory visual reference
  • OnSign.tv: Enclosure dimensions for content sizing reference