GLM-4-9B-0414 Neon v2
Base Model: THUDM/GLM-4-9B-0414
by Auri/Aurorae
Description: RP finetune of GLM-4-9B-0414. Feels nice, lots of personality, if a bit quirky at times. Nice prose, not too Claude-ish or Gemini-ish. It doesn't seem to like overly long system prompts or character cards, but it does respond well to JSON-formatted system prompts.
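For example, a JSON-style system prompt might look like the sketch below; the field names are purely illustrative, not a schema the model was trained on:
```python
import json

# Hypothetical character-card fields; the point is a short, well-structured
# JSON object rather than a long free-form prompt.
system_prompt = json.dumps(
    {
        "name": "Mira",
        "personality": "curious, dry-witted, fiercely loyal",
        "scenario": "A rain-soaked night market in a port city.",
        "style": "third person, past tense, 2-3 paragraphs per reply",
    },
    indent=2,
)
```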
Use Cases:
• Character roleplay
• Creative writing
• Story generation
• Interactive fiction
Training Details:
• 77M tokens of synthetic RP and short story data
• 1 epoch training
• 11 hours on 2xRTX 3090 (provided by OwenArli)
• QLoRA + CCE (Cut Cross-Entropy) for memory optimization
Links:
• Huggingface (Full Weights)
• GGUF Quantizations
Usage:
Format: GLM4 instruct formatting
Template:
[gMASK]<sop><|system|>
{system_prompt}<|user|>
{prompt}<|assistant|>
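For reference, a minimal sketch of assembling a raw prompt string from the template above; most frontends (and the tokenizer's own chat template, if you use `apply_chat_template`) should do this for you, so the helper below is just illustrative:
```python
def build_prompt(system_prompt: str, user_message: str) -> str:
    # GLM4 instruct format, special tokens copied verbatim from the template above.
    return (
        "[gMASK]<sop><|system|>\n"
        f"{system_prompt}<|user|>\n"
        f"{user_message}<|assistant|>"
    )

prompt = build_prompt("You are Mira, a dry-witted market vendor.",
                      "Describe the night market as I arrive.")
```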
Recommended Samplers:
• Temperature: 1.0
• Min-P: 0.1
• Repetition Penalty: 1.03
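As an example, these values map onto vLLM's SamplingParams roughly like this (a sketch; max_tokens is an illustrative cap, not part of the recommendation):
```python
from vllm import SamplingParams

sampling = SamplingParams(
    temperature=1.0,        # recommended temperature
    min_p=0.1,              # recommended Min-P
    repetition_penalty=1.03,
    max_tokens=512,         # illustrative output cap
)
```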
Backend Notes:
• KoboldCPP: Use latest version + `--overridekv glm4.rope.dimension_count=int:64`
• vLLM: Works OOTB on vLLM >= 0.8.5
• EXL2/3: Should work out of the box
• llama.cpp: Latest versions support GGUFs OOTB
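A rough end-to-end sketch for the GGUF route via llama-cpp-python, assuming a build recent enough to expose `min_p`; the model path and quant filename are placeholders:
```python
from llama_cpp import Llama

# Placeholder path: substitute whichever GGUF quant you downloaded.
llm = Llama(model_path="./GLM-4-9B-0414-Neon-v2-Q5_K_M.gguf", n_ctx=8192)

# Prompt built according to the GLM4 instruct template shown under Usage.
prompt = (
    "[gMASK]<sop><|system|>\n"
    "You are Mira, a dry-witted market vendor.<|user|>\n"
    "Describe the night market as I arrive.<|assistant|>"
)

out = llm.create_completion(
    prompt=prompt,
    max_tokens=400,
    temperature=1.0,
    min_p=0.1,
    repeat_penalty=1.03,   # llama.cpp's name for repetition penalty
    stop=["<|user|>"],
)
print(out["choices"][0]["text"])
```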
Special Thanks:
• OwenArli for compute and tuning help
• ArliAI for collaboration
• Artus for free inference
• BeaverAI community for feedback