🌈 GLM-4-32B-0414 Neon v2 🌈
32B RP finetune with personality and variety
Base Model: THUDM/GLM-4-32B-0414
by Auri/Aurorae

Description: RP finetune of GLM-4-32B-0414. Feels nice, lots of personality, lots of variety, if a bit quirky sometimes. Pretty smart, but sometimes plays dumb for a swipe; just let it be itself. Nice prose, not too Claude-ish or Gemini-ish. A bit of structural repetition happens sometimes, but that's how modern LLMs are, so ¯\_(ツ)_/¯. Seems to like JSON-formatted system prompts.
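
Since it responds well to JSON-formatted system prompts, here is a minimal sketch of what one might look like. The field names (`name`, `persona`, `scenario`, `style`) are purely illustrative, not a schema the model requires.

```python
import json

# Illustrative only: the model has no fixed schema, it just seems to handle
# structured JSON system prompts well. Field names here are made up.
system_prompt = json.dumps(
    {
        "name": "Neon",
        "persona": "A sardonic android bartender on a derelict orbital station.",
        "scenario": "The user wanders in after hours looking for a drink and a story.",
        "style": "Third person, past tense, 2-4 paragraphs per reply.",
    },
    indent=2,
)
print(system_prompt)
```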

Use Cases:
• Character roleplay
• Creative writing
• Story generation
• Interactive fiction

Training Details:
• 77M tokens of synthetic RP and short story data
• 1 epoch training
• 28 hours on 4xRTX 3090 (provided by OwenArli)
• QLoRA + CCE with sequence parallelism

Links:
Hugging Face (Full Weights)
GGUF Quantizations

Usage:
Format: GLM4 instruct formatting
Template:
[gMASK]<sop><|system|>
{system_prompt}<|user|>
{prompt}<|assistant|>
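
A minimal sketch of producing that format with the Transformers chat template, assuming the finetune ships the stock GLM-4 template. The repo id below is the base model, used as a placeholder; point it at the Neon v2 weights instead.

```python
from transformers import AutoTokenizer

# Placeholder repo id (base model); swap in the Neon v2 full-weights repo.
tokenizer = AutoTokenizer.from_pretrained("THUDM/GLM-4-32B-0414")

messages = [
    {"role": "system", "content": "You are Neon, a roleplay partner."},
    {"role": "user", "content": "Set the opening scene."},
]

# Renders [gMASK]<sop><|system|>...<|user|>...<|assistant|> as shown above.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(text)
```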


Recommended Samplers:
• Temperature: 1.0
• Min-P: 0.1
• Repetition Penalty: 1.03
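
A sketch of those sampler values with vLLM's offline API, which accepts `min_p` and `repetition_penalty` in `SamplingParams`. The model path is a placeholder for wherever the Neon v2 weights live.

```python
from vllm import LLM, SamplingParams

# Placeholder model path; substitute the Neon v2 full-weights repo.
llm = LLM(model="THUDM/GLM-4-32B-0414")

params = SamplingParams(
    temperature=1.0,          # recommended
    min_p=0.1,                # recommended
    repetition_penalty=1.03,  # recommended
    max_tokens=512,
)

outputs = llm.chat(
    [
        {"role": "system", "content": "You are a roleplay partner."},
        {"role": "user", "content": "Set the scene in a rain-soaked neon city."},
    ],
    sampling_params=params,
)
print(outputs[0].outputs[0].text)
```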

Backend Notes:
KoboldCPP: Use latest version + `--overridekv glm4.rope.dimension_count=int:64`
vLLM: Works OOTB on vLLM >= 0.8.5
EXL3: Should work out of the box
llama.cpp: Latest versions support GGUFs OOTB
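
All of these backends expose an OpenAI-compatible endpoint, so a request might look like the sketch below. The base URL and model name are placeholders, and the extra sampler keys are passed via `extra_body` where the backend accepts them (exact names can vary, e.g. llama.cpp's server uses `repeat_penalty`).

```python
from openai import OpenAI

# Placeholder endpoint; point at your local KoboldCPP, llama.cpp server,
# or vLLM OpenAI-compatible API.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="glm-4-32b-0414-neon-v2",  # placeholder; use the name your server reports
    messages=[
        {"role": "system", "content": "You are a roleplay partner."},
        {"role": "user", "content": "Pick up where the scene left off."},
    ],
    temperature=1.0,
    extra_body={"min_p": 0.1, "repetition_penalty": 1.03},  # backend-dependent keys
)
print(resp.choices[0].message.content)
```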

Special Thanks:
• OwenArli for compute and tuning help
• ArliAI for collaboration
• Artus for free inference
• BeaverAI community for feedback

