Hermes 4.3 | NLP.COM.AI

Parameters

36B (based on ByteDance Seed-OSS-36B-Base; dense)
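
As a rough guide to what 36B dense parameters means for memory, the sketch below estimates weight storage at a few precisions. The bits-per-weight values are illustrative assumptions (16 for half precision, 8.5 and 4.5 as typical of 8-bit and 4-bit block-quantized formats), not figures from the model release, and the estimate covers weights only, ignoring the KV cache and runtime overhead.

```python
# Weights-only memory estimate for a dense model.
# PARAMS comes from the entry above; the bits-per-weight values are
# illustrative assumptions, not published numbers for this model.
PARAMS = 36e9


def weight_gb(num_params: float, bits_per_weight: float) -> float:
    """Return approximate weight storage in decimal gigabytes (1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for label, bits in [("16-bit", 16.0), ("~8-bit", 8.5), ("~4-bit", 4.5)]:
        print(f"{label}: {weight_gb(PARAMS, bits):.2f} GB")
```

Under these assumptions a 4-bit-class quantization brings the weights to roughly 20 GB, which is the range where a single high-end consumer GPU becomes plausible.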

License

Apache 2.0

Key Features

First production model trained entirely on the Psyche distributed network; matches or exceeds Hermes 4 70B performance at roughly half the parameter count; 512K context (extended from 128K); hybrid reasoning with <think> tags; state-of-the-art results on RefusalBench; trained twice, once centralized and once distributed over Psyche, with the Psyche-trained version outperforming the centralized one; uses the DisTrO optimizer for internet-scale distributed training, secured by the Solana blockchain; 144k tokens/sec across 24 globally distributed nodes; neutrally aligned; fits on consumer GPUs with GGUF quantization.
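
For the hybrid reasoning feature, client code usually needs to separate the reasoning trace from the final answer. The following is a minimal sketch, assuming the model wraps its reasoning in literal `<think>...</think>` tags inside the returned text; the function name `split_reasoning` is hypothetical and not part of any official tooling.

```python
import re

# Matches one <think>...</think> span; DOTALL lets the reasoning cover
# multiple lines. The tag format is an assumption based on the feature
# list above.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[list[str], str]:
    """Return (reasoning_blocks, answer) for a raw model response.

    Hypothetical helper: reasoning_blocks holds the stripped contents of
    each <think> span, and answer is the remaining text with those spans
    removed.
    """
    reasoning = [block.strip() for block in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


if __name__ == "__main__":
    raw = "<think>2 + 2 is 4</think>The answer is 4."
    thoughts, final = split_reasoning(raw)
    print(thoughts)
    print(final)
```

A response with no `<think>` span yields an empty reasoning list and the original text as the answer, so the same helper works whether or not reasoning mode is enabled.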

Paper / Source

https://arxiv.org/abs/2508.18255