Part 1 built the rules: language precision, material physics, gradient backgrounds, rescue strategies. Part 2 tested new subjects and discovered the mercury bust at 9.85 with the highest set average in the project.

Part 2 also left an unanswered question.

Fire + Ice scored 9.20 on a gradient and 8.25 in a winter cafe. The gradient won by nearly a full point. But the mercury bust averaged 9.44 on a gradient and 9.63 in a dark museum. The museum won.

Same project. Same evaluation framework. Opposite results. Gradients beat environments for one material. Environments beat gradients for another.

Part 3 is the investigation. I generated 36 more images across 9 variations, each one designed to isolate why.

The answer rewrites the gradient rule entirely. And it comes down to a single question: who controls the light?

The Gap: 0.95 Points

Fire + Ice on a dark charcoal gradient: 9.20 peak, 8.55 average.

Fire + Ice at an outdoor winter cafe: 8.25 peak, 8.01 average.

A 0.95-point gap on peak score. A 0.54-point gap on average. The cafe version had everything going for it. Falling snow echoing the ice side. Warm string lights echoing the fire side. A steaming coffee cup mirroring the steam at the transition zone. Thematically perfect.

But the materials looked like VFX overlays. The AI rendered a normal face and then added fire and ice on top. The overcast daylight in the cafe provided its own illumination. The materials were additions to the scene, not the structural reality of the face.

I wanted to know exactly why. So I attacked the problem from four different angles.

Four Fixes, Ranked

Fix 1: Kill the Competing Light

Same Fire + Ice concept. But instead of a bright cafe, I placed the subject in a dimly lit bar at night. Single overhead pendant light. Dark wood counter. Glass on the bar.

Peak score: 9.45. Set average: 8.74. That's 1.20 points above the cafe on peak.

The materials became the light sources. In darkness, the fire side was the warm light. The ice side was the cool light. The bar counter split clean down the middle: warm orange on the fire side, cool blue on the ice. Not because I prompted it. Because the face was the only illumination in the scene, and the AI had to render the materials as genuine light-emitting objects to make the image physically coherent.

The glass on the bar caught the ice side's blue. The dark wood reflected the fire's orange. Every surface in the environment became proof that the materials were physically real.

And the emotional register shifted completely. The cafe version was whimsical. The dark bar was menacing. Noir. Furrowed brow carrying genuine threat through the materials. That emotional range is impossible on a gradient background.

Fix 2: Make the Environment React

Same bright cafe. Same overcast daylight. But the environment serves the materials instead of ignoring them. Fire melts the snow on the table, scorching the wood. Ice freezes the coffee cup solid with frost spreading across the ceramic and icicles forming on the table edge.

Peak score: 8.70. Set average: 8.13.

The frozen coffee cup in the best image was the single most memorable environmental detail in 164 images. A mundane object transformed by proximity to an impossible face. But the fix closed only 47% of the 0.95-point peak gap. Reactive environments prove the materials are real. They don't fix the lighting competition.

Fix 3: Overpower With Language

Same bright cafe. Same everything. But the prompt language went aggressive. "CONSTRUCTED ENTIRELY from flickering flames replacing skin and bone." "NO human skin visible on either half." All caps for the critical instructions. Destruction language instead of construction language.

Peak score: 8.85. Set average: 8.44.

Language closed 63% of the gap. "Replacing skin and bone" outperformed "constructed entirely from" because destruction language creates more commitment than construction language. "NO human skin visible" pushed material coverage from roughly 60% to 85% of the face.

But even with the most aggressive language I could write, bright daylight still softened the result. Fire still read as an effect rather than a structure.

Fix 4: Fog as Natural Gradient

Fog should be the best of both worlds. Environmental context with gradient-like isolation. Subject emerges from dense winter fog on an empty street at night.

Peak score: 8.60. Set average: 7.73.

The worst of the four fixes. Fog softened everything including the materials. Fire lost sharp flame edges. Ice lost crystalline definition. The dreamy, ethereal aesthetic fought against the sharp "material construction" quality that drives high scores.

One exception: the best image in the set had LOW fog pooling at chest level while the face rose above it, mostly clear. Low fog isolates the body while keeping the face sharp. High fog kills material definition.

The Hierarchy

| Fix | Peak | Avg | Points Gained Over Cafe |
|---|---|---|---|
| Dark environment (kill the light) | 9.45 | 8.74 | +1.20 |
| Extreme language (overpower) | 8.85 | 8.44 | +0.60 |
| Reactive environment (prove it) | 8.70 | 8.13 | +0.45 |
| Low fog (isolate) | 8.60 | 7.73 | +0.35 |
| Gradient baseline | 9.20 | 8.55 | reference |
| Original cafe | 8.25 | 8.01 | baseline |

Language matters. Lighting matters twice as much.
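
The gap percentages quoted for each fix follow directly from the peak scores. Here is a minimal sketch of that arithmetic in Python. The scores are the ones reported above; the names `FIX_PEAKS` and `gap_closed` are only illustrative.

```python
# Peak scores reported above for Fire + Ice.
CAFE_PEAK = 8.25       # original bright cafe
GRADIENT_PEAK = 9.20   # dark charcoal gradient

FIX_PEAKS = {
    "dark environment": 9.45,
    "extreme language": 8.85,
    "reactive environment": 8.70,
    "low fog": 8.60,
}

def gap_closed(peak: float) -> tuple[float, float]:
    """Return (points gained over the cafe, share of the cafe-to-gradient gap closed)."""
    gap = GRADIENT_PEAK - CAFE_PEAK
    gained = peak - CAFE_PEAK
    return round(gained, 2), round(gained / gap, 2)

for name, peak in FIX_PEAKS.items():
    gained, share = gap_closed(peak)
    print(f"{name}: +{gained:.2f} points, {share:.0%} of the gap")
```

A share above 100% means the fix beat the gradient itself. Only the dark environment does that.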

The problem was never environments versus gradients. It was lighting competition. In bright environments, ambient light provides its own illumination. Materials become decorative additions. In dark environments, materials become the primary light source. The AI has to render them as physically real objects that emit, reflect, or transmit light because nothing else in the scene does.

Fire + Ice in a dark bar, scored 9.45. The fire side lights the counter orange. The ice side turns the glass blue. Not because I prompted those reflections, but because the face is the only illumination in the room. The AI had to treat the materials as real light sources to make the scene work.

Prompt: Portrait of person seated alone in dimly lit bar at night, face showing left half constructed entirely from actively burning flames with flickering fire casting orange light onto the bar counter, right half formed of crystalline ice with frost patterns and frozen surface creating condensation on nearby glass, face is BUILT from these materials not merely glowing, dark moody interior with single overhead spotlight on subject, warm amber light from fire side illuminating dark wood surfaces, cold blue reflections from ice side visible on bar counter, 50mm lens, shallow depth of field isolating subject in darkness, hyper-realistic material physics demonstration in minimal environment, weird but photographic

Does Darkness Work for Everything?

The dark bar proved the principle for Fire + Ice. Emissive materials that generate their own light. Of course darkness helps them. The fire IS the light.

But what about materials that don't emit light? Reflective metals? Dark solids? Organic structures?

I tested four materials in dark environments matched to their optical properties.

Mercury in a Museum: The Floor Rises

Covered in Part 2. Mercury bust in a dark museum with a single spotlight scored 9.63 average versus 9.44 on a gradient. The spotlight gave mercury something to reflect. The museum floor showed those reflections. The dark silhouettes added narrative scale.

Dark environment wins for liquid reflective materials. Higher average, dramatically tighter consistency (0.40 range versus 1.00).

Obsidian by Candlelight: When AI Gives You Kintsugi

Polished obsidian volcanic glass. Sharp fracture planes. Mirror-black surfaces. I placed it at a dark wooden table lit only by a cluster of candles.

Peak score: 9.35. Set average: 8.79. Both higher than the amber gradient version (9.20 peak, 8.56 avg).

But the scores aren't the story. The story is what Firefly did with the candlelight.

Obsidian is natural glass. In thin sections, it's translucent. I'd prompted "occasional translucent edges revealing deep amber glow." Candlelight was the only light source. So the only visible detail besides mirror reflections was warm light passing through thin obsidian fracture edges.

Golden lines running through black glass.

The AI rendered kintsugi. The Japanese art of repairing broken pottery with gold. Nobody prompted it. The combination of obsidian's natural translucency at thin sections, warm candlelight color temperature, and fracture line geometry produced golden repair veins across a black face.

Four images. Four different fracture interpretations. Gold kintsugi veins. Ember seam lines. Geometric low-poly with amber fill. Fine hairline cracks with delicate golden threads. Every image discovered a new way to express the same emergent physics.

Your best results come from material physics you didn't think to ask for. I described candlelight and obsidian. Firefly gave me the Japanese art of golden repair. The AI understood the optical physics of translucent volcanic glass better than my prompt did.

Obsidian by candlelight, scored 9.35. I prompted fracture planes and translucent edges. Firefly gave me kintsugi: golden repair veins running through black volcanic glass. The AI understood how warm light passes through thin obsidian sections and rendered the result as the Japanese art of golden repair. Nobody asked for this.

Prompt: Portrait where facial features are formed from polished obsidian volcanic glass with sharp conchoidal fracture planes, mirror-polished black surface reflecting warm candlelight, occasional translucent edges revealing deep amber glow where light passes through thin obsidian sections, face maintains human expression through angular obsidian construction, seated at dark wooden table with cluster of lit candles providing the only illumination, candlelight reflecting off obsidian facets creating multiple warm points of light across dark face, deep shadows between fracture planes, 85mm lens, shallow depth of field with candle flames soft in foreground, hyper-realistic geological material physics, weird but photographic

Coral in the Dark: The Exception That Proves the Rule

Living coral on a turquoise gradient scored 9.20 peak and 8.85 average. One of the most consistent performers in the project.

I placed coral in dark ocean water with bioluminescent organisms providing the only light.

Peak score: 8.00. Set average: 7.43. A 1.42-point drop in average. The single biggest negative gap of any comparison in Session 3.

Three problems compounding.

Bioluminescence is too dim to function as a light source. Fire illuminates a bar. A spotlight illuminates mercury. Candlelight illuminates obsidian. But scattered faint bioluminescent dots are decorative, not functional. The coral face sat in near-darkness with no real key light.

Coral is brown and mauve. Low contrast against dark water. Mercury is mirror-reflective. Obsidian has polished facets. Fire emits light. Coral is muted warm-brown organic material. Against a dark background it nearly disappeared. The turquoise gradient worked specifically because it provided complementary color contrast.

Organic complexity needs light to be visible. Coral's entire visual appeal is structural detail. Polyp textures, branching patterns, translucent membranes. Those require good, even lighting. In near-darkness, intricate structure collapses into murky shadow.

Dark environments fail for low-contrast organic materials. Coral needs bright, complementary-colored, even lighting to reveal the detail that makes it score well. Darkness doesn't amplify coral's properties because coral has no strong optical property to amplify.

Coral in dark ocean water, scored 8.00 peak with 7.43 average. Down 1.42 points from the turquoise gradient version. Bioluminescence is too dim to light a scene. Brown coral disappears against dark water. Organic detail collapses into murky shadow. The exception that proved the rule: dark environments fail materials with nothing to amplify.

Copper in the Rain: The Lottery

Oxidized copper with progressive verdigris patina. On a gradient: 9.00 peak, 8.74 average, roughly 0.50 range. The most consistent material in the entire project.

I placed it on a rain-slicked city street at night. Neon signs reflecting colored light onto wet pavement. Rain droplets on the polished copper areas. Street light overhead.

Peak score: 9.00. Same as the gradient. Set average: 8.10. Down 0.64 points. Range: 2.27. More than four times the gradient's spread.

The best image was spectacular. Multi-colored neon reflections playing directly on the copper surface. Purple on the left temple, green on the right cheek, warm orange on the crown. The rain droplets sat convincingly on polished areas. The stoic expression read clearly despite full metal transformation. This image did something the gradient version literally cannot do. It justified the environment.

The worst image scored 6.73. The lowest score of any material in any setting across the entire Phase 2 dataset. Patina transition was an abrupt horizontal line across the forehead. Eyes were blank white. The verdigris face was flat and barely caught any neon color. It read as body paint, not metal.

Same prompt. Same environment. 2.27-point spread.

Copper broke my clean rule. It's reflective, so a dark environment should win. But copper has two competing surface states. Polished copper catches every photon aggressively. Verdigris patina absorbs light like a sponge. The dark environment amplified both behaviors simultaneously. When Firefly committed to the dual-surface physics, the result was the best image in the set. When it got confused about which surface state dominated, it produced the worst.

The dark environment turned copper into a lottery ticket.

Copper in neon rain, scored 9.00 (best) vs 6.73 (worst). Same prompt. Same environment. 2.27-point spread. When Firefly committed to both surface states simultaneously, the neon reflections playing across polished copper were stunning. When it didn't commit, the result was body paint. The dark environment turned copper into a lottery.

The Unified Principle

Dark environments amplify whatever optical property a material already has.

Reflective materials become more reflective. Mercury's mirror surface catches every photon from the spotlight. Light-emitting materials become more dominant. Fire becomes the only illumination in the room. Translucent materials reveal their internal behavior. Obsidian's thin edges glow gold from candlelight passing through.

But materials with no strong optical property have nothing to amplify. Coral is matte, brown, non-reflective. Darkness makes it invisible.

And materials with competing optical behaviors on the same surface introduce a consistency penalty. Copper's polished zones and patina zones respond to light in opposite ways. Dark environments amplify both simultaneously, and when the AI has to render two contradictory light responses, it sometimes fails catastrophically.
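
To see the principle in the numbers, it helps to line up the set averages for each material on a gradient and in its dark environment. The sketch below uses only the averages reported in this series. The dictionary layout and the function name are my own shorthand, not part of any tool.

```python
# Set averages reported in this series: (gradient, dark environment).
SET_AVERAGES = {
    "fire + ice (dark bar)": (8.55, 8.74),
    "mercury (museum)": (9.44, 9.63),
    "obsidian (candlelight)": (8.56, 8.79),
    "coral (dark ocean)": (8.85, 7.43),
    "copper (neon rain)": (8.74, 8.10),
}

def dark_minus_gradient(averages):
    """Return each material's change in set average when moved from gradient to dark."""
    return {name: round(dark - grad, 2) for name, (grad, dark) in averages.items()}

deltas = dark_minus_gradient(SET_AVERAGES)
# Print the biggest winners first.
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {delta:+.2f}")
```

The three materials with one consistent optical behavior gain about 0.2 points on average. Coral and copper are the only ones that lose.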

The question isn't "should I use a dark environment?" It's three questions:

Can this material interact with light? If no (coral, wood, water), use bright environments with complementary-colored gradients.

Does it interact in one consistent way? If yes (mercury, obsidian, fire, ice), dark environments will match or beat gradients with stronger narrative and emotional range.

Does it interact in two competing ways? If yes (oxidized copper, potentially rusted iron, tarnished silver), gradients are safer. Dark environments are lottery tickets. Use them only if you're willing to generate extra and cherry-pick.

Who controls the light? If your material can take control, let it. If it can't, provide the light it needs.
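
The three questions can be written down as a small lookup. This is a sketch only: the `Material` class, its two flags, and `recommend_setting` are hypothetical names, and deciding how to set the flags for a new material is still a judgment call.

```python
from dataclasses import dataclass

@dataclass
class Material:
    name: str
    interacts_with_light: bool  # emits, reflects, or transmits light strongly
    competing_surfaces: bool    # e.g. polished metal next to matte patina

def recommend_setting(m: Material) -> str:
    """Apply the three questions in order and return a setting recommendation."""
    if not m.interacts_with_light:
        return "bright environment, complementary-colored gradient"
    if m.competing_surfaces:
        return "gradient (dark only if you can generate extra and cherry-pick)"
    return "dark environment"

# Flags below reflect how the materials behaved in these tests.
for m in (
    Material("coral", interacts_with_light=False, competing_surfaces=False),
    Material("mercury", interacts_with_light=True, competing_surfaces=False),
    Material("oxidized copper", interacts_with_light=True, competing_surfaces=True),
):
    print(f"{m.name}: {recommend_setting(m)}")
```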

What 164 Images Taught Me

Three parts. Five sessions of original testing. Three sessions of expansion and environmental investigation. 164 scored images. 30+ prompt variations. One evaluation framework applied consistently across every single generation.

The patterns that survived all of that aren't tips. They're principles.

Language is the first lever. "Constructed from transparent glass with visible internal structure" versus "face made of glass" is the difference between 9.70 and 5.45. Specific physics language forces the AI past texture overlays into structural transformation. But language alone has a ceiling. In the cafe test it closed about 63% of the peak gap while the lighting stayed wrong.

Lighting is the bigger lever. In the Fire + Ice test it was worth twice as much as language: +1.20 points for the dark bar versus +0.60 for aggressive wording. The dark bar peaked roughly 15% above the bright cafe (9.45 versus 8.25). The AI treats light-controlling materials as physically real objects because the scene's coherence depends on it.

Materials tell stories. Mercury dissolving a marble bust implies time. Fire acting on a face implies transformation. Coral growing along bone structure implies biology. Every top scorer carries narrative weight beyond visual spectacle. Colored glass has color. Fire + Ice has conflict. Story beats surface.

Weak materials can be rescued. Water failed at 5.45. Paired with diamond: 8.86. Wood failed at 6.36. Paired with crystal: 8.76. The partner provides structural reference. Give the AI something it knows how to render, and the weak material comes along for the ride.

Art historical context multiplies scores. A mercury face on a gradient is a tech demo. A Greek bust dissolving into liquid gold in a dark museum is a contemporary art installation. Cultural reference gives the AI stronger training data and gives viewers deeper interpretive access. The mercury bust averaged 9.63 in the museum. Nothing else came close.

Your best images come from physics you didn't prompt. I described candlelight and obsidian. Firefly gave me kintsugi. I prompted silver mercury. Warm spotlights gave me liquid gold. The AI understands material optics well enough to produce emergent behaviors. Your job is to set up the conditions. The physics will follow.

Consistency has a cost. Dark environments can beat gradients on peak scores. But as material complexity increases, so does variance. Mercury museum: 0.40 range. Fire + Ice dark bar: 1.00 range. Copper rain: 2.27 range. Gradients give you a tight cluster every time. Dark environments give you lottery tickets. Know which one you need before you burn credits.

The Prompt Decision Tree

Before you write your next material transformation prompt, run through this:

Step 1: Material selection. Does it have three-dimensional structure? (Yes: proceed. No: pair it with something that does.)

Step 2: Language. "Constructed from" or "composed entirely of" with specific physics properties. Never "made of."

Step 3: Lighting environment. Can the material control light? Emissive materials and uniform reflective materials go dark. Organic/matte materials go bright with complementary gradients. Dual-state materials go gradient for safety, dark for potential peaks.

Step 4: Background. Turquoise for warm metals. Amber for dark materials. Dark charcoal for thermal/elemental. Blue for cool materials. Never white.

Step 5: Subject. Sculpture if you want to avoid the mannequin problem. Props if you're doing hands. Full body for maximum drama. Portraits for maximum material detail.
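
Steps 2 and 4 are mechanical enough to sketch as a small helper. What follows is an illustration, not the template behind the prompts in this series. The `BACKGROUNDS` keys, `build_prompt`, and `lint_prompt` are hypothetical names, and the phrasing inside `build_prompt` is one possible wording.

```python
# Step 4 background choices, keyed by material family.
BACKGROUNDS = {
    "warm metal": "turquoise",
    "dark material": "amber",
    "thermal/elemental": "dark charcoal",
    "cool material": "blue",
}

def build_prompt(subject: str, material: str, physics: str, family: str) -> str:
    """Assemble a gradient-background prompt using construction language (Step 2)."""
    background = BACKGROUNDS[family]
    return (
        f"{subject} constructed from {material} with {physics}, "
        f"{background} gradient background, weird but photographic"
    )

def lint_prompt(prompt: str) -> list[str]:
    """Flag the two phrasings the decision tree says to avoid."""
    problems = []
    lowered = prompt.lower()
    if "made of" in lowered:
        problems.append('replace "made of" with "constructed from" or "composed entirely of"')
    if "white background" in lowered or "white gradient" in lowered:
        problems.append("avoid white backgrounds")
    return problems

prompt = build_prompt(
    "Portrait of a face",
    "oxidized copper",
    "progressive verdigris patina",
    "warm metal",
)
print(prompt)
print(lint_prompt("face made of glass on a white background"))
```

Steps 1, 3, and 5 stay manual. They depend on how the material behaves, which no string check can tell you.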

That decision tree captures what 164 images taught me. Not all of it. But the principles that survived every test I threw at them.

This is Part 3 of "164 Images Later," a 3-part series on systematic AI material transformation testing.

Every prompt shared in full. Every score documented. The complete prompt companion with generation notes is linked below.

Built with Adobe Firefly Image 5.
