GLM-4-9B-0414 Neon v2
Base Model: THUDM/GLM-4-9B-0414
by Auri/Aurorae
Description: RP finetune of GLM-4-9B-0414. Feels nice, lots of personality, if a bit quirky at times. Nice prose, not too Claude-ish or Gemini-ish. It doesn't seem to like overly long system prompts or character cards, but it does respond well to JSON-formatted system prompts.
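For example, a JSON-style system prompt might look like the sketch below; the field names are purely illustrative, not a schema the model was trained on:
```python
import json

# Hypothetical character-card fields; the point is a short, well-structured
# JSON object rather than a long free-form prompt.
system_prompt = json.dumps(
    {
        "name": "Mira",
        "personality": "curious, dry-witted, fiercely loyal",
        "scenario": "A rain-soaked night market in a port city.",
        "style": "third person, past tense, 2-3 paragraphs per reply",
    },
    indent=2,
)
```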
Use Cases:
• Character roleplay
• Creative writing
• Story generation
• Interactive fiction
Training Details:
• 77M tokens of synthetic RP and short story data
• 1 epoch training
• 11 hours on 2xRTX 3090 (provided by OwenArli)
• QLoRA + CCE (Cut Cross-Entropy) for memory optimization
Links:
• Huggingface (Full Weights)
• GGUF Quantizations
Usage:
Format: GLM4 instruct formatting
Template:
[gMASK]<sop><|system|>
{system_prompt}<|user|>
{prompt}<|assistant|>
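For reference, a minimal sketch of assembling a raw prompt string from the template above; most frontends (and the tokenizer's own chat template, if you use `apply_chat_template`) should do this for you, so the helper below is just illustrative:
```python
def build_prompt(system_prompt: str, user_message: str) -> str:
    # GLM4 instruct format, special tokens copied verbatim from the template above.
    return (
        "[gMASK]<sop><|system|>\n"
        f"{system_prompt}<|user|>\n"
        f"{user_message}<|assistant|>"
    )

prompt = build_prompt("You are Mira, a dry-witted market vendor.",
                      "Describe the night market as I arrive.")
```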
Recommended Samplers:
• Temperature: 1.0
• Min-P: 0.1
• Repetition Penalty: 1.03
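As an example, these values map onto vLLM's SamplingParams roughly like this (a sketch; max_tokens is an illustrative cap, not part of the recommendation):
```python
from vllm import SamplingParams

sampling = SamplingParams(
    temperature=1.0,        # recommended temperature
    min_p=0.1,              # recommended Min-P
    repetition_penalty=1.03,
    max_tokens=512,         # illustrative output cap
)
```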
Backend Notes:
• KoboldCPP: Use latest version + `--overridekv glm4.rope.dimension_count=int:64`
• vLLM: Works OOTB on vLLM >= 0.8.5
• EXL2/3: Should work out of the box
• llama.cpp: Latest versions support GGUFs OOTB
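A rough end-to-end sketch for the GGUF route via llama-cpp-python, assuming a build recent enough to expose `min_p`; the model path and quant filename are placeholders:
```python
from llama_cpp import Llama

# Placeholder path: substitute whichever GGUF quant you downloaded.
llm = Llama(model_path="./GLM-4-9B-0414-Neon-v2-Q5_K_M.gguf", n_ctx=8192)

# Prompt built according to the GLM4 instruct template shown under Usage.
prompt = (
    "[gMASK]<sop><|system|>\n"
    "You are Mira, a dry-witted market vendor.<|user|>\n"
    "Describe the night market as I arrive.<|assistant|>"
)

out = llm.create_completion(
    prompt=prompt,
    max_tokens=400,
    temperature=1.0,
    min_p=0.1,
    repeat_penalty=1.03,   # llama.cpp's name for repetition penalty
    stop=["<|user|>"],
)
print(out["choices"][0]["text"])
```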
Special Thanks:
• OwenArli for compute and tuning help
• ArliAI for collaboration
• Artus for free inference
• BeaverAI community for feedback