I typed "portrait of person with face made of clear flowing water" into Adobe Firefly Image 5 and hit generate.

The AI gave me an underwater photograph. Score: 5.45 out of 10.

I typed "portrait where facial features are constructed from transparent glass with visible internal structure and prismatic light refraction" and hit generate again.

Score: 9.70.

Same tool. Same concept. The difference was language. And that single discovery launched a testing project that would eventually span 164 images, 30+ prompt variations, and three months of systematic experimentation with material transformation portraits.

This is Part 1 of that story. Not theory. Not "10 tips for better prompts." This is what 164 scored, documented, and compared images actually taught me about how AI interprets materials.

The Baseline: Everything I Thought I Knew Was Wrong

The idea was simple. Photorealistic portraits where human faces are made of different materials. Glass, water, marble, wood, metal. The "weird but photographic" aesthetic I've been developing. Surrealist scenarios rendered with the precision of studio photography.

I started with six materials and generated 24 images. Four per material, scored on a weighted rubric covering visual quality, prompt alignment, consistency, uniqueness, and engagement potential. Each dimension weighted differently. Visual quality at 30%, prompt alignment at 25%, the rest splitting the remaining 45%.
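
The rubric can be sketched as a small weighted sum. This is my reconstruction, not the project's actual scoring code: the 30% and 25% weights come from the text, but the even 15% split across the remaining three dimensions is an assumption.

```python
# Hypothetical scoring sketch. The 0.30 and 0.25 weights are from the
# article; the even 0.15 split across the other three dimensions is an
# assumption (the article only says they share the remaining 45%).
WEIGHTS = {
    "visual_quality": 0.30,
    "prompt_alignment": 0.25,
    "consistency": 0.15,
    "uniqueness": 0.15,
    "engagement": 0.15,
}

def composite_score(scores: dict[str, float]) -> float:
    """Weighted composite on a 0-10 scale."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights must sum to 1
    return round(sum(scores[dim] * w for dim, w in WEIGHTS.items()), 2)

# Example: strong visuals but weak prompt alignment drags the composite down.
image = {
    "visual_quality": 9.0,
    "prompt_alignment": 4.0,
    "consistency": 7.0,
    "uniqueness": 7.0,
    "engagement": 6.0,
}
print(composite_score(image))  # 6.7
```

Note how heavily prompt alignment punishes an otherwise beautiful image: that weighting is exactly why "gorgeous but wrong material" generations scored in the 5s.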

Session 1 average: 6.48 out of 10. Two images out of 24 cleared the 8.0 competition-worthy threshold. An 8% success rate.

Here's what went wrong.

Water (5.45 avg): "Face made of water" triggered underwater photography. Firefly saw "water" and "face" and placed the subject underwater. The material transformation never happened.

Marble (6.28 avg): "Face sculpted from white marble with classical sculpture quality" produced actual classical sculptures. Not living faces made of marble. Literal statues.

Fabric (5.04 avg): "Face formed from flowing silk fabric" created fashion photography of fabric wrapped around faces. An accessory, not a transformation.

Wood (6.36 avg): "Face carved from ancient weathered oak" read as a texture overlay. Like someone Photoshopped a wood grain onto a portrait. No structural depth.

Two materials showed promise. Crystal glass scored 7.64 with one image hitting 8.30. Molten gold metal scored 7.68 with one reaching 8.00. Both had something the failures lacked: specific physical properties in the prompt that forced the AI beyond surface-level interpretation.

The crystal prompt specified "prismatic light refraction" and "geometric crystal facets forming facial structure." The molten metal prompt described "visible heat distortion and glow" and "liquid metal surface with metallic shine."

Physics language. Not description language. That was the thread worth pulling.

The Fix That Changed Everything: +24% in One Session

Session 2 tested four materials with rebuilt prompts. The core change was surgical.

Old language: "Face made of [material]."

New language: "Facial features constructed from [material] with [specific physics]."

Three specific fixes:

  1. Construction verbs replaced vague descriptors. "Constructed from," "composed entirely of," and "cast in" replaced "made of" and "formed from." These verbs carry structural intent. The AI treated them differently.

  2. Material physics replaced material names. Instead of "glass face," the prompt described "transparent glass with visible light refraction, glass edges catching rim light, internal structure visible through transparency." The AI had to render those physics, not just apply a texture.

  3. Expression grounding kept the result human. "Face maintains human expression while composed of [material]" prevented the AI from drifting into pure object rendering.
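
Folded together, the three fixes amount to a reusable template. The sketch below is my own illustration of that pattern, not the project's actual tooling; the function name and the example physics strings are hypothetical.

```python
# Illustrative prompt builder applying the three fixes: construction
# verbs, explicit material physics, and expression grounding.
# The function name and default strings are hypothetical.
def build_material_prompt(material: str, physics: list[str],
                          background: str = "blue gradient background") -> str:
    parts = [
        # Fix 1: construction verb instead of "made of"
        f"Close-up portrait where facial features are constructed from {material}",
        # Fix 2: physics the model must render, not just a texture to apply
        *physics,
        # Fix 3: expression grounding to keep the result human
        f"face maintains human expression while composed of {material}",
        "professional studio photography, 85mm portrait lens",
        background,
        "weird but photographic",
    ]
    return ", ".join(parts)

prompt = build_material_prompt(
    "clear transparent glass",
    ["visible light refraction",
     "internal structure visible through transparency",
     "glass edges catching rim light"],
)
print(prompt)
```

Swapping the material and its physics list is the entire iteration loop: the structural scaffolding around them stays fixed, which is what makes scores comparable across sessions.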

Session 2 results: 8.06 average. A 24% improvement. Success rate jumped from 8% to 75%.

The glass prompt went from 7.64 to 8.64 average, with the best image hitting 9.70. The bronze prompt scored 8.84 average with a 9.40 peak. Even liquid mercury, which should have failed like water did, scored 8.41 because the prompt specified "surface tension visible" and "droplets suspended in composition."

The turquoise gradient background also emerged here as a universal technique. Bronze on turquoise scored 9.40. Glass on blue gradient scored 8.64. Every material that performed well sat on a colored gradient, never a white background. White backgrounds averaged 7.20 across all tests. Gradients averaged 8.5+. That's not a stylistic preference. It's a 1.3-point scoring gap.

Glass portrait, scored 9.70. Same concept as the 5.45 water failure. Different language. "Constructed from transparent glass with visible internal structure" did what "face made of water" couldn't.

Prompt: Close-up portrait where entire head and facial features are constructed from clear transparent glass, skin is glass surface with visible light refraction, eyes are glass spheres, lips and facial contours are smooth glass forms maintaining human expression, internal light passing through transparent structure, professional studio photography, dramatic backlighting creating rim light on glass edges with subtle front fill, 85mm portrait lens, deep focus showing glass internal structure, blue gradient background, prismatic light refraction visible along facial planes, hyper-realistic glass physics, weird but photographic, sophisticated composition

Combinations: When Two Materials Outscore One

With single materials mapped, Session 3 tested combinations. The hypothesis: pairing two proven materials should score near their individual peaks.

It did better than that.

Glass + Bronze (9.21 avg, 9.70 peak): The left half transparent glass with prismatic refraction, right half polished bronze with green patina, turquoise gradient background. The transition zone where glass becomes opaque and metallic was the detail that pushed scores up. Lighting had to demonstrate both transparency AND reflectivity in a single frame.

Mercury freezing to chrome (8.80 avg): Liquid mercury on top crystallizing into solid chrome below. Phase change physics. The crystallization pattern at the transition zone created a narrative. Not just two materials side by side, but a process caught mid-transformation.

Wood + Crystal (8.76 avg): This one mattered. Wood alone scored 6.36. A failure. But paired with crystal, it jumped to 8.76. A 37% improvement. Crystal rescued the weak material by providing structural contrast. The AI could render wood convincingly when it had crystal to define what "structural transformation" looked like by comparison.

This became the rescue strategy. Weak materials paired with strong partners consistently produced publication-worthy results. The partner didn't just add visual interest. It gave the AI a reference point for how material transformation should work.

Fire + Ice: The Universal Champion

Session 4 pushed into extreme material contrasts. Thermal (fire and ice), clarity (water and diamond), state (smoke and stone), luminosity (light and shadow).

Fire + Ice scored 9.43 average across all four generations. Two images tied at 9.85, the highest composite score in the entire testing framework. It became the single best-performing variation across all sessions.

Why it worked comes down to three factors:

Universal symbolism. Everyone understands fire and ice. The contrast is primal. No interpretation needed.

Clear visual separation. Warm orange-red versus cool blue. The eye reads the split instantly. No ambiguity about where one material ends and another begins.

Physics demonstration. The transition zone where flames meet frost, with steam and heat shimmer, proves both materials are physically real within the image. The interaction sells the transformation.

Water also got redeemed here. Water alone: 5.45. Water paired with diamond: 8.86. A 62% improvement through strategic pairing and better language. "Flowing transparent water with visible ripples and surface tension maintained in face shape by impossible physics" gave the AI structural constraints that "face made of water" never did.

Light and shadow failed at 7.06. Too abstract. The AI lost the "photographic" quality and drifted into graphic design. The lesson: materials need physical presence. Fire is tangible. Light is not.

Fire + Ice portrait, scored 9.85. The highest composite score in the testing framework. Two images out of four tied at this score. Universal symbolism, clear visual separation, and physics interaction at the transition zone.

Prompt: Portrait where left half of face is constructed from blazing fire with visible flames, heat distortion, and glowing embers integrated into facial features, seamlessly transitioning to right half frozen in clear crystalline ice with frost patterns and frozen vapor visible, transition zone shows flames meeting ice with steam and heat shimmer where materials interact, face maintains human expression despite extreme thermal opposition, professional studio photography, dramatic lighting with warm orange-red glow from fire side contrasting cool blue illumination on ice side, 85mm portrait lens, deep focus showing both fire movement and ice crystalline structure, split gradient background with warm red-orange transitioning to cool blue, hyper-realistic thermal physics demonstrating fire and ice interaction, weird but photographic, dramatic composition

Pattern Universality: Does It Work on Everything?

Session 5 answered the question that would determine whether these patterns were tricks or principles. Can Fire + Ice, Glass + Bronze, and Water + Diamond produce the same results on hands, full bodies, animals, and objects as they do on portraits?

Yes.

Fire + Ice on full body figures: 9.60 average. Higher than portraits (9.43). The larger canvas gave the materials more room. Dynamic poses added energy. A figure mid-stride with one leg aflame and the other crystallized created movement that a static portrait can't match.

Glass + Bronze on hands: 9.26 average. Two hands reaching toward each other, one glass, one bronze. The best image (9.45) evoked a "Creation of Adam" reference that nobody prompted. The AI found the art historical resonance on its own.

Water + Diamond on an eagle in flight: 8.84 average. Matching the portrait score almost exactly. One wing flowing water, the other faceted diamond. Wildlife photography grounding kept it photorealistic.

Fire + Ice on a Greek vase: 9.38 average. Objects scored slightly lower than human subjects. Consistent 0.3 to 0.5 point gap. The visual quality was still 10/10, but the engagement dimension dropped. Human subjects carry emotional connection that objects don't.

Full body beat portraits. That was the surprise. Scale enhances drama. The pattern held across every subject type tested.

Expanding the Material Library: Six New Tests

With the core patterns proven across 88 images, I tested six new materials to extend the tier rankings. Same portrait subject, same evaluation framework. Direct comparison.

Colored Glass Tax

Ruby red glass: 8.20 peak (clear glass: 9.70). A 1.50-point drop.

Emerald green glass: 8.60 peak. A 1.10-point drop.

Color absorbs the transparency that made clear glass score so high. Ruby's deep red saturation blocked the internal light refraction that was the entire visual appeal. Emerald partially recovered through crystalline structure, but both lagged behind colorless glass significantly.

Not all glass is created equal. Color costs you the transparency that drives the score.

Three Materials Work

Fire + Ice + Crystal: 9.20 peak, 8.55 average.

Its 9.20 peak sits only 0.23 points below the two-material Fire + Ice average of 9.43. The crystal didn't compete with fire and ice. It naturally became a mediator, a bridge between the thermal extremes. In the best image, diamond-cut crystal facets ran vertically between the fire and ice zones, concentrating and redirecting light between them.

The AI figured out the role of the third material before I did.

The Most Reliable Prompt I Found

Oxidized copper with verdigris: 9.00 peak, 8.76 average. Range of only 0.70 points across all four generations.

Every single image was usable. No failures. No lottery. The progressive patina from polished copper at the forehead through green-yellow oxidation around the eyes to deep turquoise-green at the chin told a story about time passing. Oxidation concentrated on lower surfaces, matching real-world gravity patterns.

Peak score 0.40 below bronze (9.40). But copper never produced a bad image. Sometimes the best prompt is not the one that peaks highest. It's the one that never fails.
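
That peak-versus-reliability tradeoff is easy to make concrete. A minimal sketch, assuming hypothetical per-generation scores for copper chosen only to be consistent with the reported 9.00 peak, 8.76 average, and 0.70 range (the individual scores are not published):

```python
# Peak-vs-reliability summary for a set of generations.
# The copper scores are hypothetical, picked to match the article's
# reported statistics (9.00 peak, 8.76 average, 0.70 range).
from statistics import mean

def summarize(scores: list[float]) -> dict[str, float]:
    return {
        "peak": max(scores),
        "avg": round(mean(scores), 2),
        "range": round(max(scores) - min(scores), 2),
    }

copper = [9.00, 8.90, 8.84, 8.30]
print(summarize(copper))  # {'peak': 9.0, 'avg': 8.76, 'range': 0.7}
```

If you only publish your single best image, chase peak. If every generation costs money or time, a narrow range matters more than a high ceiling.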

Breaking the Organic Material Curse

Wood scored 6.36. Water scored 5.45. Organic materials were the graveyard of this project.

Living coral scored 9.20. Set average: 8.95 with only a 0.60 range.

Coral demolished the organic ceiling by 2.84 points over wood. The reason rewrites the entire "organic materials fail" narrative. AI doesn't struggle with organic materials. It struggles with materials that have no structure to render.

Wood grain is a two-dimensional texture. Water is formless liquid. Neither gives the AI a three-dimensional structure to build a face from. But coral has rigid branching architecture. Polyp details along ridges. Translucent membranes between branches. The AI had geometry to work with.

The organic material problem was actually a structural definition problem. Structure beats origin.

Living coral portrait, scored 9.20 with a set average of 8.95. Wood scored 6.36. Water scored 5.45. Coral demolished both because it has three-dimensional branching architecture the AI can build from. The organic material problem was a structural definition problem.

Prompt: Portrait where facial features are constructed from living coral formation, intricate branching coral structures forming cheekbones and brow ridge, tiny polyp details visible along coral ridges, skin-like membrane visible between coral branches showing translucent organic depth, natural coral coloration ranging from deep orange to pale pink, face maintains human expression while composed of marine biological architecture, professional studio photography, soft diffused lighting mimicking underwater caustic light patterns, 85mm portrait lens, deep focus showing coral polyp detail, blue gradient background suggesting ocean depth, hyper-realistic biological material demonstration, weird but photographic

Obsidian and the Amber Gradient Discovery

Obsidian volcanic glass scored 9.20 peak on an amber gradient background. Not turquoise. Amber.

The warm amber gradient behind jet-black obsidian created the illusion of inner fire. Light glowing through eye slits. Something alive behind the polished black shell. An emotional register achieved entirely without facial expression.

Turquoise works universally. But amber behind dark materials creates a specific narrative. The background implied a light source that doesn't exist, and viewers read it as an internal glow. Background color is not decoration. It's a narrative tool.

Obsidian portrait on amber gradient, scored 9.20. The warm amber background behind jet-black volcanic glass creates the illusion of inner fire. Something looking out from inside the obsidian shell. Background color as narrative tool, not decoration.

Transformation vs. Replacement

The three-material test revealed something I hadn't articulated until I saw it side by side. Fire and ice acting ON a face preserved human expression. The face was still alive underneath, being transformed. But a face that IS glass or IS coral created mannequins. Objects, not people.

Transformation materials (fire, ice, mercury) feel alive. Replacement materials (glass, coral, obsidian) feel like sculptures.

Both can score 9.0+. But the emotional register is fundamentally different. Know which one you want before you start prompting.

The Material Tier List (After 128 Images)

Tier 1: Competition-Ready

  • Fire + Ice: 9.85 peak / 9.43 avg (universal champion)

  • Glass + Bronze: 9.70 peak / 9.21 avg (dual-material excellence)

  • Fire + Ice + Crystal: 9.20 peak / 8.55 avg (three materials validated)

  • Obsidian + amber gradient: 9.20 peak / 8.56 avg (implied inner fire)

  • Living Coral: 9.20 peak / 8.95 avg (organic curse-breaker, most consistent)

Tier 2: Strong Reliable

  • Oxidized Copper: 9.00 peak / 8.76 avg (never fails, 0.70 range)

  • Water + Diamond: 9.45 peak / 8.86 avg (water redeemed by pairing)

  • Emerald Glass: 8.60 peak / 7.95 avg (partial transparency recovery)

  • Ruby Glass: 8.20 peak / 7.23 avg (color kills transparency)

Tier 3: Needs a Partner (below 8.0 solo)

  • Wood: 6.36 avg alone / 8.76 avg with crystal (+37%)

  • Water: 5.45 avg alone / 8.86 avg with diamond (+62%)

What Part 1 Taught Me

Five sessions. 88 original images plus 40 new material tests. 128 scored and compared generations. The patterns that emerged aren't opinions. They're data.

  1. Language precision is worth 3+ points. "Constructed from transparent glass with visible internal structure" versus "face made of water" is the difference between 9.70 and 5.45.

  2. Material physics drive scores more than material names. Describing refraction, surface tension, patina progression, and crystalline structure forces the AI past texture overlays into structural transformation.

  3. Weak materials can be rescued. Wood failed alone. Water failed alone. Both scored 8.5+ when paired with strong partners. The partner provides a structural reference point.

  4. Gradient backgrounds outperform white by 1.3+ points. Turquoise is universal. Amber works behind dark materials. White looks generic.

  5. Structure beats origin. Coral at 9.20 demolished wood at 6.36. The AI needs three-dimensional geometry to build from. If a material has no inherent structure, it has no path to transformation.

  6. Patterns transfer across subjects. Fire + Ice works on portraits, full bodies, hands, animals, and objects. Full body actually scores higher than portraits.

Part 2 takes these materials to new subjects. Hands on a piano. A dancer mid-leap. A classical Greek bust dissolving into liquid mercury. And the single highest-scoring set average in the entire project.

The statue that lost itself.

This is Part 1 of "164 Images Later," a 3-part series on systematic AI material transformation testing. Part 2 drops next week.

Every prompt in this article is shared in full. Every score is documented. If you want the complete prompt companion with generation notes, it's linked below.

Built with Adobe Firefly Image 5.
