Fine-Tune a Local SLM to Clean Master Data on Your Mac

⬅️ Back to Tutorials

By the end you have a ~600 MB model on disk that takes messy vendor, customer, or material JSON and returns normalized fields (ISO country codes, trimmed text, canonical IBANs, fixed dates and amounts) with a rule-based safety net behind it. Plan on 30–45 minutes on an Apple Silicon Mac; most of that is downloads and training, not typing.

TLDR:

Clone Local-SLM-Data-Cleaner, run make setup then make model.
make data builds 1,000 synthetic messy→clean pairs from deterministic rules in convention_spec.py (no real client data).
make baseline-serve + make baseline scores the stock Qwen3-0.6B before training; write down field accuracy.
make train fine-tunes with MLX LoRA; make fuse and make gguf produce qwen3-0.6b-cleaner-q8_0.gguf.
make serve + make eval + make demo prove the after score beats baseline and clean one live record.

Prerequisites: Mac with Apple Silicon (M1 or later), 8 GB RAM, ~5 GB disk, Homebrew. Intel Macs cannot run the MLX training step.

Step 1: Install tools and clone the repo

brew install python git llama.cpp
git clone https://github.com/TMFNK/Local-SLM-Data-Cleaner.git
cd Local-SLM-Data-Cleaner
make setup

make setup installs Python deps and mlx-lm. Success looks like >> Done. Next: make model with no red ERROR lines above it.

Step 2: Download the base model

make model

Pulls Qwen3-0.6B (~1.2 GB) from Hugging Face into your cache. No account needed. Done when it prints model ready.

Step 3: Generate synthetic training data

make data
make sanity

make data writes data/train.jsonl, valid.jsonl, and test.jsonl (default 800/100/100). The generator invents clean records, corrupts them like real messy master data, then labels each pair with the same deterministic algorithm the runtime uses later.

make sanity should report 100% field accuracy on the test split. That is the answer key checking itself, not a model score yet.

Want more examples? make data N=2000.

Step 4: Score the model before training

Terminal 1:

make baseline-serve

Wait for listening on http://127.0.0.1:8080. First run downloads ~600 MB.

Terminal 2 (same project folder):

make baseline

Note the field accuracy line. That is your before number. Stop the server in Terminal 1 with Ctrl+C before training.

Step 5: Fine-tune with MLX LoRA

make train

Loss should trend down over a few minutes. Output lands in adapters/. If the Mac runs out of memory, close browser tabs and retry with make train BATCH=2.

Step 6: Fuse and export to GGUF

make fuse
make gguf

If make gguf cannot find llama.cpp sources:

cd .. && git clone https://github.com/ggml-org/llama.cpp && cd Local-SLM-Data-Cleaner

You should see qwen3-0.6b-cleaner-q8_0.gguf (~600 MB) when ls *.gguf runs clean.

Step 7: Serve, evaluate, and demo

Terminal 1:

make serve

Terminal 2:

make eval
make demo

make eval should beat your Step 4 baseline on field accuracy. make demo sends one messy JSON record through the server and prints cleaned output plus a changes audit list.

Pro tip: Steps that call make baseline-serve, make serve, or make eval need two Terminal windows. The server holds port 8080 until you Ctrl+C it. If the port is busy, use make serve PORT=8081 and make eval PORT=8081.

Cleanup

# stop the model server in Terminal 1 with Ctrl+C
# optional: remove cloned llama.cpp sibling if you only needed it for gguf

If it breaks: Cannot reach the model server means the serve step is not listening yet. Address already in use means an old server is still on 8080. Training killed mid-run? Re-open Terminal, cd back into the repo, and continue; finished downloads and data/ files are still there. Full troubleshooting lives in the project README.