Fine-Tune a Local SLM to Clean Master Data on Your Mac
By the end you have a ~600 MB model on disk that takes messy vendor, customer, or material JSON and returns normalized fields (ISO country codes, trimmed text, canonical IBANs, fixed dates and amounts) with a rule-based safety net behind it. Plan on 30–45 minutes on an Apple Silicon Mac; most of that is downloads and training, not typing.
TLDR:
- Clone Local-SLM-Data-Cleaner, run make setup, then make model.
- make data builds 1,000 synthetic messy→clean pairs from deterministic rules in convention_spec.py (no real client data).
- make baseline-serve + make baseline score the stock Qwen3-0.6B before training; write down the field accuracy.
- make train fine-tunes with MLX LoRA; make fuse and make gguf produce qwen3-0.6b-cleaner-q8_0.gguf.
- make serve + make eval + make demo prove the after score beats the baseline and clean one live record.
Prerequisites: Mac with Apple Silicon (M1 or later), 8 GB RAM, ~5 GB disk, Homebrew. Intel Macs cannot run the MLX training step.
Step 1: Install tools and clone the repo
brew install python git llama.cpp
git clone https://github.com/TMFNK/Local-SLM-Data-Cleaner.git
cd Local-SLM-Data-Cleaner
make setup
make setup installs the Python dependencies and mlx-lm. Success looks like >> Done. Next: make model with no red ERROR lines above it.
Step 2: Download the base model
make model
This pulls Qwen3-0.6B (~1.2 GB) from Hugging Face into your cache. No account needed. Done when it prints model ready.
Step 3: Generate synthetic training data
make data
make sanity
make data writes data/train.jsonl, valid.jsonl, and test.jsonl (default split 800/100/100). The generator invents clean records, corrupts them like real messy master data, then labels each pair with the same deterministic algorithm the runtime uses later.
make sanity should report 100% field accuracy on the test split. That is the answer key checking itself, not a model score yet.
Want more examples? make data N=2000.
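The generate, corrupt, and label loop can be sketched in a few lines of Python. This is a minimal illustration, not the repo's code: the field names, the COUNTRY_TO_ISO table, and the normalize, corrupt, and make_pair helpers are all hypothetical stand-ins for whatever convention_spec.py actually defines, and the IBAN is a commonly used example value.

```python
import random

# Hypothetical lookup table; the real rules live in convention_spec.py.
COUNTRY_TO_ISO = {"germany": "DE", "deutschland": "DE", "italy": "IT", "italia": "IT"}


def normalize(record: dict) -> dict:
    """Deterministic cleaning rules: the same input always yields the same output."""
    name = " ".join(record["name"].split())          # trim and collapse whitespace
    country = record["country"].strip()
    country = COUNTRY_TO_ISO.get(country.lower(), country.upper())
    iban = "".join(record["iban"].split()).upper()   # no spaces, uppercase
    return {"name": name, "country": country, "iban": iban}


def corrupt(clean: dict, rng: random.Random) -> dict:
    """Inject the kind of noise found in messy master data."""
    iso_to_names = {"DE": ["Germany", "deutschland "], "IT": ["Italy", " ITALIA"]}
    iban = clean["iban"]
    spaced_iban = " ".join(iban[i:i + 4] for i in range(0, len(iban), 4))
    return {
        "name": f"  {clean['name'].replace(' ', '   ')} ",
        "country": rng.choice(iso_to_names[clean["country"]]),
        "iban": rng.choice([spaced_iban, iban.lower()]),
    }


def make_pair(clean: dict, rng: random.Random) -> dict:
    messy = corrupt(clean, rng)
    # Label with the cleaning rules themselves, so training targets and
    # runtime behavior cannot drift apart.
    return {"input": messy, "output": normalize(messy)}


rng = random.Random(0)
clean = {"name": "Acme Supplies GmbH", "country": "DE", "iban": "DE89370400440532013000"}
pair = make_pair(clean, rng)
print(pair["input"])
print(pair["output"])
```

Because the label comes from running the rules on the messy input, a sanity check that re-applies the rules to the test split has to score 100%; anything less would point to a bug in the generator rather than in a model.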
Step 4: Score the model before training
Terminal 1:
make baseline-serve
Wait for listening on http://127.0.0.1:8080. The first run downloads ~600 MB.
Terminal 2 (same project folder):
make baseline
Note the field accuracy line. That is your before number. Stop the server in Terminal 1 with Ctrl+C before training.
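This walkthrough does not spell out how field accuracy is computed. One plausible reading, shown here as an assumption rather than the repo's actual metric, is the share of expected fields that the model reproduces exactly. The field_accuracy helper and the sample records are hypothetical.

```python
def field_accuracy(predictions: list[dict], references: list[dict]) -> float:
    """Share of reference fields whose predicted value matches exactly."""
    correct = total = 0
    for pred, ref in zip(predictions, references):
        for field, expected in ref.items():
            total += 1
            # A missing field counts as wrong.
            if field in pred and pred[field] == expected:
                correct += 1
    return correct / total if total else 0.0


references = [{"country": "DE", "city": "Berlin"}, {"country": "IT", "city": "Milano"}]
predictions = [{"country": "DE", "city": "Berlin"}, {"country": "Italy", "city": "Milano"}]
# 3 of the 4 reference fields match
print(f"field accuracy: {field_accuracy(predictions, references):.2%}")
```

Scoring per field instead of per record gives partial credit: a record with one wrong field out of ten still contributes nine correct fields, which makes small before/after improvements visible.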
Step 5: Fine-tune with MLX LoRA
make train
Loss should trend down over a few minutes. Output lands in adapters/. If the Mac runs out of memory, close browser tabs and retry with make train BATCH=2.
Step 6: Fuse and export to GGUF
make fuse
make gguf
If make gguf cannot find the llama.cpp sources, clone them next to the repo:
cd .. && git clone https://github.com/ggml-org/llama.cpp && cd Local-SLM-Data-Cleaner
When the export succeeds, ls *.gguf should list qwen3-0.6b-cleaner-q8_0.gguf (~600 MB).
Step 7: Serve, evaluate, and demo
Terminal 1:
make serve
Terminal 2:
make eval
make demo
make eval should beat your Step 4 baseline on field accuracy. make demo sends one messy JSON record through the server and prints the cleaned output plus a changes audit list.
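A changes audit list can be as simple as a field-by-field diff between the messy input and the cleaned output. The sketch below shows one way to build such a list; the audit_changes helper and its from/to format are assumptions for illustration and may not match what make demo prints.

```python
def audit_changes(before: dict, after: dict) -> list[dict]:
    """List every field whose value differs between the messy and cleaned record."""
    changes = []
    for field in sorted(before.keys() | after.keys()):
        old, new = before.get(field), after.get(field)
        if old != new:
            changes.append({"field": field, "from": old, "to": new})
    return changes


messy = {"name": "  acme  gmbh ", "country": "Germany", "city": "Berlin"}
cleaned = {"name": "acme gmbh", "country": "DE", "city": "Berlin"}
for change in audit_changes(messy, cleaned):
    print(change)
```

An audit list like this lets a reviewer confirm that the model only touched the fields it was supposed to normalize and left everything else alone.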
Pro tip: Steps that call make baseline-serve, make serve, or make eval need two Terminal windows. The server holds port 8080 until you stop it with Ctrl+C. If the port is busy, use make serve PORT=8081 and make eval PORT=8081.
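To see whether a port is already taken before starting the server, a short standard-library check is enough. This is a convenience sketch and not part of the repo; port_in_use is a hypothetical helper name.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


for port in (8080, 8081):
    print(port, "busy" if port_in_use(port) else "free")
```

If 8080 shows as busy and no server should be running, an old serve process is probably still alive in another Terminal window.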
Cleanup
# stop the model server in Terminal 1 with Ctrl+C
# optional: remove cloned llama.cpp sibling if you only needed it for gguf
If it breaks: "Cannot reach the model server" means the serve step is not listening yet. "Address already in use" means an old server is still on 8080. Training killed mid-run? Re-open Terminal, cd back into the repo, and continue; finished downloads and data/ files are still there. Full troubleshooting lives in the project README.
Related TMFNK Content
- Running Local LLMs: From First Run to Fine-Tuned
  Hardware, quantization, and runtime layers that explain why a 0.6B GGUF model is enough on a laptop.
- llmfit: Find Which LLM Models Run on Your Hardware
  Pick other small models that actually fit your RAM before you fine-tune the next experiment.
- Set Up LEANN for Private Local RAG on macOS
  Same privacy story, different job: local retrieval instead of record normalization.
Crepi il lupo! 🐺