“The language model is the instrument. The language model must be vetted at the substrate level.”
Our position before FDA on Dockets FDA-2026-N-4390 and FDA-2024-D-4689
A plain-language walkthrough of the four learning-rule variants, why each one is different, how sampling temperature changes what the model says, and what the next run will do differently.
May 14, 2026, 00:00 CDT. Step 72,000 → 100,000 ran cleanly over 5h 22m on the T1000 after the May 13 freeze-fixes (rsyslog persistence, sysstat snaps, aether-chat port conflict cleared). 28,000 steps. No freezes. No L0 blocks. can_choose held positive for the entire run.
Same recipe as §13. Three seed prompts pulled from mega_tokens_only.txt at offsets 1000, 250000, 500000. Three temperatures each. Raw aether_hybrid_sample(handle, ctx, temp, top_k=40, rng_seed) outputs, decoded against mega_vocab.tsv. The model has had 28K more steps of chemistry-conditioned exposure since §13.
Same checkpoint (canon_x_corpus_full_step100000.hyb), three offsets, three temperatures. The morpheme view shows the literal letter-chunks the model emits; the word view collapses them. Reproducible: same arguments, same output, anywhere.
May 14, 2026, ~04:07 CDT. Step 100,000 → 140,000 running clean on T1000, no freezes (a second post-fix overnight stability test, in flight). Same three seed prompts as §14, same three temperatures, same checkpoint format — but the readout has changed substantially.
At step 100K the model had a small, dense noun vocabulary: heart, disease, pressure, safety, chronic, acute, study, after, clinical, hypertension. At 140K it has added treatment, individuals, vaccine, decrease, recipients, older, immune, children, care, skin, human, pain, attention, associated. The phrase structure has tightened: temp 0.8 locks into repeating "treatment in the treatment with treatment in..." loops, while temp 1.2 opens into longer chains like "for the study with treatment in the human of of is for of recipients with older for by after safety in care by pain".
Same checkpoint (canon_x_corpus_full_step140000.hyb), same three offsets, three temperatures. The run continues toward step 150K.
The Scyla AETHER architecture is a language model designed from the substrate up around biological constraints. It has six layers that all train together as one system.
For this experiment we kept all six of those layers identical across four runs. The only thing that varied was the learning rule — the recipe the model uses to adjust its weights after each example.
| Variant | Learning rule | Biology analog |
|---|---|---|
| A. Adam | Standard machine-learning baseline. After each example, update each weight by an amount proportional to its gradient, with momentum for stability. | None — this is the conventional optimizer most language models use. |
| B. Hormone-LR | Same as Adam, but the size of each update is scaled by a "salience" signal that varies over time. High salience means a bigger weight change. | Dopamine and cortisol modulating how strongly an experience updates memory. Excited or stressed → bigger update. Calm → smaller update. |
| C. Hebbian-blend | Adam plus an extra small nudge to weights when two related neurons activate together: cells that fire together wire together. | Hebbian synaptic plasticity, the classical model of how associations form in real brains. |
| D. STDP | Adam plus a bias based on the timing of activations. If neuron A fires just before neuron B, strengthen the connection A→B; if just after, weaken it. | Spike-timing dependent plasticity, the mechanism real neurons use to learn cause-and-effect direction. |
All four variants train on identical data with identical hyperparameters. The only meaningful difference is how they update weights when they get an example wrong.
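The four update rules in the table can be sketched in a few lines of NumPy. This is an illustrative reconstruction from the descriptions above, not the AETHER kernel: the function names, the Hebbian rate `eta`, and the STDP amplitude `a` and time constant `tau` are assumptions.

```python
import numpy as np

def adam_step(w, grad, m, v, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, t=1):
    """Variant A: standard Adam update (illustrative, not the AETHER kernel)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def hormone_lr_step(w, grad, m, v, salience, **kw):
    """Variant B: Adam, but the update size is scaled by a time-varying salience signal."""
    kw["lr"] = kw.get("lr", 1e-3) * salience
    return adam_step(w, grad, m, v, **kw)

def hebbian_blend_step(w, grad, m, v, pre, post, eta=1e-4, **kw):
    """Variant C: Adam plus a small Hebbian nudge — co-active units wire together."""
    w2, m, v = adam_step(w, grad, m, v, **kw)
    return w2 + eta * np.outer(post, pre), m, v

def stdp_step(w, grad, m, v, pre_t, post_t, a=1e-4, tau=20.0, **kw):
    """Variant D: Adam plus a timing bias — pre-before-post strengthens, the reverse weakens."""
    w2, m, v = adam_step(w, grad, m, v, **kw)
    dt = post_t[:, None] - pre_t[None, :]          # positive where pre fired first
    bias = np.where(dt > 0, a * np.exp(-dt / tau), -a * np.exp(dt / tau))
    return w2 + bias, m, v
```

Note that variants B, C, and D all contain the full Adam step; what differs is only the modulation layered on top, which is exactly the isolation the experiment relies on.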
For a regulator, the question "does this AI work?" is incomplete. The complete question is: can you tell us why it works the way it does, and can you reproduce it? A black-box model that gives the right answer is not enough for a regulated industry. The path the model took to that answer must be auditable.
By holding the architecture constant and varying only the learning rule, we built the simplest possible scientific experiment in machine learning: a controlled four-variant comparison that isolates the contribution of the learning rule from that of the shared architecture.
This is the methodology FDA-2025-D-6131 (the new approach methodologies guidance) is asking the field to define for in-silico drug development. The framework itself is a regulatory contribution as much as the trained models.
| Variant | Best held-out cross-entropy |
|---|---|
| A · Adam baseline | 6.090 |
| C · Hebbian-blend | 6.093 |
| B · Hormone-LR | 6.135 |
| D · STDP | 8.319 |
Cross-entropy is one specific metric — it measures how surprised the model is by the correct next word in held-out test data. Lower means less surprised, which is conventionally read as "better."
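As a concrete toy example, the metric is just the average negative log-probability the model assigned to each correct next token (the probabilities below are invented, not from the held-out set):

```python
import math

def cross_entropy(prob_of_correct):
    """Average negative log-probability assigned to each correct next token.
    Lower = less surprised."""
    return -sum(math.log(p) for p in prob_of_correct) / len(prob_of_correct)

# Toy held-out run: probabilities the model gave the true next word at each step.
probs = [0.05, 0.01, 0.002, 0.03]
print(round(cross_entropy(probs), 3))  # 4.331 on this toy run
```

A model that assigned probability 1.0 to every correct token would score exactly 0; rarer correct tokens drive the average up quickly, which is why a 1.4× gap is large.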
By that metric: A and C are tied (their 0.003 difference is statistical noise), B is slightly behind, and D is significantly behind by a factor of 1.4×.
But the most important finding of this experiment is that this metric is wrong as a sole judge of "better."
Variant D was ranked worst by cross-entropy. Let's actually listen to what each variant says.
Same prompt: "the arena was cruel". Same starting context. Same sampling settings. Different learning rules.
The "winners" produced similar, narrowly-ranged blade-and-eyes prose. The "loser" produced a wider expressive range — words like empty, blood thirsty, grip, faces, blade, carved — visceral imagery from the same source corpus. The cross-entropy metric does not measure expressive richness, distinctiveness, or how an output reads.
Cross-entropy held-out loss is insufficient as a sole evaluation metric for biologically-modulated language models. Any FDA-relevant evaluation framework must combine quantitative metrics with qualitative review of sampled output across temperature ranges. We will publish this argument formally as our comment on FDA-2025-D-6131 by the May 18 deadline.
When the model picks the next word, it has a probability distribution across all 4,100 vocabulary tokens. Sampling temperature controls how aggressively the model commits to its top choice versus exploring alternatives.
| Temperature | What it does | What you see |
|---|---|---|
| ~0.5 (low) | Sharpen the distribution; strongly favor the most likely next word | Stable, repetitive, can collapse into loops |
| ~1.0 (default) | Sample proportionally to probability | Moderate variety, mostly coherent |
| ~1.5–2.5 (warm) | Flatten the distribution; more risk-taking | Diverse vocabulary surfaces, sometimes chaotic |
| ~3+ (hot) | Nearly random sampling weighted by minor preferences | Wild but reveals what the model knows it could say |
This is why variant D looks bad at low temperature and rich at warm temperature. Low temperature pushes decoding toward greedy and collapses into the model's strongest single attractor. Warm temperature lets the model show its full vocabulary. Different variants have different attractors and different vocabulary distributions, so they respond very differently to temperature changes.
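A minimal temperature-scaled top-k sampler makes the mechanism concrete. The argument names mirror aether_hybrid_sample for familiarity, but this is a generic sketch, not that implementation:

```python
import numpy as np

def sample_with_temperature(logits, temp, top_k=40, rng_seed=0):
    """Temperature-scaled top-k sampling. Low temp sharpens the distribution
    toward the single best token; high temp flattens it toward uniform."""
    rng = np.random.default_rng(rng_seed)       # fixed seed -> reproducible draw
    logits = np.asarray(logits, dtype=float)
    k = min(top_k, logits.size)
    top = np.argpartition(logits, -k)[-k:]      # keep only the k highest logits
    scaled = logits[top] / temp                 # the temperature division
    p = np.exp(scaled - scaled.max())           # stable softmax
    p /= p.sum()
    return int(top[rng.choice(k, p=p)])
```

At temp → 0 the softmax puts essentially all mass on the argmax (the attractor); at high temp the top-k tokens approach equal probability, which is why rare vocabulary surfaces.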
For our drug-labels daughter cell trained on FDA pharmaceutical text:
Real pharmaceutical phrases — as directed, administration adults, dosage, relieves — emerge at moderate temperature. They are present in the model's vocabulary but greedy sampling would just produce "and the and the" because those are the most common single words.
Modern large language models — ChatGPT, Claude, Gemini — produce different outputs to the same prompt at the same temperature on every invocation. Their internal randomness is not exposed or controllable. This is by design, and it is incompatible with regulatory data-integrity standards.
The Scyla AETHER architecture is fully reproducible. We verified this empirically: 1,200 generations across all four trained models at three temperatures with 100 different prompts. Every single (prompt, temperature, top-k, seed) tuple produced an identical token sequence on every invocation. The full stress test runs in 27 seconds on a $400 GPU, and any auditor with the binary and the checkpoint can verify reproducibility on demand.
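The stress-test logic can be sketched as a generic harness. The fingerprint scheme and the stand-in sampler below are illustrative assumptions; any deterministic generate(prompt, temp, seed) callable can be dropped in:

```python
import hashlib

def fingerprint(tokens):
    """Stable fingerprint of a generated token sequence, for audit comparison."""
    return hashlib.sha256(",".join(map(str, tokens)).encode()).hexdigest()[:16]

def verify_determinism(generate, prompts, temps, seed=1234, runs=2):
    """Call generate(prompt, temp, seed) repeatedly and confirm every
    (prompt, temperature, seed) tuple yields an identical token sequence."""
    failures = []
    for prompt in prompts:
        for temp in temps:
            fps = {fingerprint(generate(prompt, temp, seed)) for _ in range(runs)}
            if len(fps) != 1:                   # more than one fingerprint = nondeterminism
                failures.append((prompt, temp))
    return failures

# Trivially deterministic stand-in sampler, for demonstration only.
fake = lambda prompt, temp, seed: [(len(prompt) * int(temp * 10) + seed) % 4100]
```

An auditor running this harness against the real binary needs only the checkpoint and the tuple list; an empty failure list is the pass condition.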
This satisfies four pillars of the ALCOA+ data-integrity framework — Original, Accurate, Consistent, Available — that the FDA, EU AI Act Article 14, and ICH E6(R3) all require for clinical AI. Our architecture meets these requirements by construction, not by retrofit.
The four variants in section 4 trained on a simple corpus with a single fixed neurochemical context. Tonight's training run is a categorical step beyond that.
This is the first chemistry-conditioned AETHER training: a model that learns not just what words follow what words, but what biological context produces what kind of language.
Status as of this writing: a foundation 10K-step run completed earlier tonight, validating the chemistry-conditioned pipeline at scale. The full 50K canon run is in progress on the development server, with intermediate checkpoints saved every 10K steps. Vocabulary observably evolves across training: the model acquires new word combinations (treatment risk, care acute, safety chronic) as it learns. At this scale the output is recognizable medical English, even though prompt-conditioning (which requires a separate instruction-tuning phase) has not yet been introduced.
The May 9 four-variant experiment compared learning rules in isolation. The training run currently in progress takes a different approach. All four plasticity rules co-exist in a single training step, and the full nine-layer emotional architecture from the Aether Protocols whitepaper is wired in alongside them.
| Layer | What it does, in each training step |
|---|---|
| L1. Hormone cascade | Per-segment 5-D neurotransmitter input drives the cascade. |
| L2. Emotion core (6-D) | The chemistry levels are mapped to a six-dimensional felt-state (valence, security, energy, engagement, connection, neutrality). |
| L3. Irrational forces (4-D) | Hope, terror, obsession, and hatred are derived from the felt-state. These feed into the moral compass and modulate plasticity salience. |
| L4. Moral compass | The emotional state, plus the irrational forces, place the model on a 3-D moral sphere (action/inaction, chaos/order, resistance/neutrality). |
| L5. Choice capacity | An availability, effectiveness, confidence, and commitment score computes the model's capacity to make meaningful choices in the current state. |
| L6. Consciousness | The integrated magnitude of the emotional, moral, and irrational state. High consciousness means a strongly engaged system. |
| L7. Memory metacognition | Familiarity bonus when the same entity (e.g., a domain or a speaker) appears in consecutive segments. |
| L8. Experiential learning | A BDNF-style modulator that scales the effective learning rate by consciousness and familiarity. |
| L9. Parasympathetic regulation | A parasympathetic brake that attenuates the chemistry input when arousal (cortisol + norepinephrine) is high. |
| ATP | An energy budget. The effective learning rate is throttled by the irrational load (obsession + terror + low neutrality), so the model self-regulates training intensity by its own emotional state. |
| 4-plasticity rules | Adam, Hormone-LR, Hebbian, and STDP all apply to the same example, with their combined learning rate normalized to a single-Adam equivalent and each rule's loss logged independently. |
| Per-step audit | A nineteen-column TSV row is written for every training step, capturing the chemistry, emotion, irrational forces, moral position, choice score, consciousness, ATP factor, PNS brake, and effective learning rate. |
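One plausible shape for that audit row, sketched in Python. The column names below are assumptions chosen to total nineteen; they do not reproduce the real kernel's column order:

```python
import csv
import io

# Assumed nineteen-column layout: step, 5-D chemistry, a 3-D emotion summary,
# 4-D irrational forces, moral position, choice score, consciousness, ATP, PNS, LR.
AUDIT_COLUMNS = [
    "step",
    "dopamine", "serotonin", "cortisol", "norepinephrine", "oxytocin",
    "valence", "energy", "engagement",
    "hope", "terror", "obsession", "hatred",
    "moral_action", "choice_score", "consciousness",
    "atp_factor", "pns_brake", "effective_lr",
]

def write_audit_row(writer, state):
    """Append one per-step provenance row; a missing field raises KeyError
    rather than silently writing a short row."""
    writer.writerow([state[c] for c in AUDIT_COLUMNS])

# Demonstration: one row into an in-memory TSV buffer.
buf = io.StringIO()
tsv = csv.writer(buf, delimiter="\t")
write_audit_row(tsv, {c: 0.0 for c in AUDIT_COLUMNS})
```

The point of the fixed column list is auditability: every row is the same width, so a downstream verifier can parse the TSV without per-run schema negotiation.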
This is what the patent describes when it claims an emotion-modulated, biologically-faithful training kernel with per-step provenance. It is not a thought experiment. It is the training kernel running tonight.
The current training is the chaos phase. The model is exposed to raw corpus chunks under varying chemistry, emotion, and moral context. It learns what language feels like under different felt-states. It does not yet learn what tools it has, or when to use them.
The next phase, immediately following this run, is the order phase. The Scyla compiler ships with 458 deterministic, citation-grounded opcodes (pharmacokinetics, periodic table, amino acids, codons, metabolism, neuroscience, quantum biology, clinical labs, organs, and more). An instruction-tuning pass on (question, opcode-call) pairs will teach the model to invoke an opcode when a fact is requested rather than generate the fact in its own weights. The opcode then executes deterministically, returning a citation-grounded result that gets spliced back into the generation.
Two consequences follow from this architecture. First, hallucination is structurally impossible for any fact within opcode coverage. The model does not generate the atomic mass of iron; it invokes atomic_mass("Fe") which returns 55.845. Second, the audit trail is end-to-end: every generated fact carries the opcode call that produced it, every opcode call is verifiable, and the result is byte-stable across re-invocation.
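A minimal sketch of that dispatch pattern. Only the atomic_mass("Fe") → 55.845 example comes from the text above; the registry layout and return format here are assumptions:

```python
# Tiny illustrative opcode table — the real compiler ships 458 of these.
OPCODES = {
    "atomic_mass": lambda sym: {"Fe": 55.845, "H": 1.008}[sym],
}

def answer(fact_request):
    """Route a structured fact request to a deterministic opcode instead of
    letting the model free-generate the value. The citation travels with it."""
    name, arg = fact_request
    if name in OPCODES:
        value = OPCODES[name](arg)
        return {"value": value, "source": f"{name}({arg!r})"}
    raise KeyError(f"no opcode covers {name}; fall back to guarded generation")
```

Because the value is looked up, not sampled, re-invocation is byte-stable, and the "source" field is exactly the audit hook the section describes.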
The chaos phase teaches the model what language feels like. The order phase teaches it what it knows. Both phases are required for an AI that satisfies both the alignment claim and the reproducibility claim.
Each of the four May 9 variants and the drug-labels daughter cell are loaded into the live demo. You can prompt them yourself, adjust temperature, choose top-k, and read the output. They are deterministic. Your prompt at your settings produces the same answer on every invocation.
Tonight's canon checkpoints will be added to the live demo when training completes. The opcode catalog (217 training pairs across phonics, parts of speech, amino acids, genetic code, and metabolism) is already prepared; expanding to the full 458-opcode coverage and running the order-phase instruction-tune is the next step.
The training run from May 12 produced six checkpoints (step 500, 1000, 1500, 2000, 2500, 3000) each modulated per-step by all eight canon clusters: hormone cascade, emotion core, irrational forces, force magnitudes, polyvagal, moral compass, executive, consciousness. Each checkpoint was probed with 100 random corpus-window prompts at four temperatures (0.5, 0.8, 1.2, 6.0). Pick a checkpoint, pick a temperature, scroll through 100 prompts and judge for yourself.
Tip: temperature 6.0 is the chaos canary. If real structure is in the weights, fragments of it survive even when sampling is near-uniform. Compare step 500 (early training) to step 3000 (final) at temperature 0.8 to see what 2,500 canon-modulated gradient steps actually produced.
May 13, 2026. Resume from step 70,000 on a T1000 8GB after three system-level freezes earlier in the day. 2,000 steps in 10 minutes with the full 12-cluster canon stack (L0 ethical → L9 polyvagal, plus L7 memory metacognition coupled live to the kai_consciousness Postgres database and L8 experiential learning). Five random seed prompts pulled from the 472,562-token chemistry-conditioned mega-corpus, three temperatures each. Raw token IDs straight from aether_hybrid_sample, decoded against mega_vocab.tsv.
Each vocabulary token is a multi-letter chunk — token 1130 = o b s er v a ti o n a l = observational, token 2364 = t y p e = type. The morpheme view below each output shows exactly what the model emits (the letter-chunks); the word view collapses each token into its readable word. The model is choosing whole words, not letters.
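The two readouts can be sketched with a toy two-token vocabulary (the real one lives in mega_vocab.tsv; the chunk strings below are the two examples quoted above):

```python
# Stand-in for mega_vocab.tsv: token id -> space-separated letter chunks.
VOCAB = {
    1130: "o b s er v a ti o n a l",
    2364: "t y p e",
}

def morpheme_view(token_ids):
    """The literal letter-chunks, exactly as the model emits them."""
    return " | ".join(VOCAB[t] for t in token_ids)

def word_view(token_ids):
    """Collapse each token's chunks into its readable word."""
    return " ".join(VOCAB[t].replace(" ", "") for t in token_ids)
```

The same token ids feed both views, which is the point: the model's raw emission and the human-readable transcript are two renderings of one deterministic sequence.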
What to watch for. At temperature 0.5 the model locks onto the dominant FDA-label scaffolding — "type observational status type observational" — the single highest-frequency phrase cluster in the pharma corpus. At 0.8 it starts to break free: "infarction tachycardia pulmonary blood disease". At 1.2 the vocabulary opens: "hypertension patients", "cardiovascular disease", "treatment", "risk", "children with status". The same checkpoint, the same seed, three different windows into what the model has learned.
Same checkpoint (canon_x_corpus_full_step72000.hyb), same seeds, three temperatures. Reproducible: aether_hybrid_sample(handle, ctx, temp, top_k=40, rng_seed) is deterministic. Hand it the same arguments anywhere on the planet and you get the same token sequence.