1. Summary
Recraft ships in two flavors, and the difference matters more than the naming suggests. Recraft V4 PRO is the 2K photorealism model for commercial product imagery. Recraft V4 is the fast, design-led model for branding, layouts, and vector work. Recraft V4 Pro is best understood as the production-oriented version of Recraft V4. It follows the same design-led visual direction, but adds higher native resolution and finer detail. Use V4 to find the direction; use V4 Pro when the image needs to become a polished final asset
Pick V4 PRO when the output has to look like a paid studio shoot. Pick V4 when the output is a poster, an icon set, or any asset where composition and color taste matter more than photographic finish.
Top strengths
- First-try quality (5.0/5.0) – highest first-try score in the lineup. Outputs are production-ready without re-prompting, even from simple prompts.
- Surface quality (4.8/5.0) – exceptional material rendering across metal, glass, fabric, and multi-material compositions.
- Text rendering (4.7/5.0) – typography integrates structurally into compositions for posters, packaging, signage, and editorial layouts.
Top gaps
- Generation speed (3.5/5.0) – ~10s per image is slower than speed-focused models. Not built for rapid iteration.
- Spatial reasoning (3.5/5.0) – complex multi-element compositions lose detail. Grid layouts and dense arrangements are inconsistent.
- Style fidelity (3.7/5.0) – character consistency across scenes is fragile. Style reference adherence is weaker than top competitors.
Best-fit use cases
- SVG vector assets – logos, icons, brand marks, and illustrations exported as real editable paths for print and web.
- Editorial campaign layouts – posters, magazine covers, and brand visuals that feel art-directed without a designer.
- Merch-ready designs for POD – print-ready vectors and compositions that go straight to production with no tracing or cleanup.
How We Tested
Test environment
- Platform: Kittl (internal) + Recraft API.
- Date range: April to May 2026.
- Model versions tested: Recraft V4 PRO (2048×2048) and Recraft V4 (1024×1024 raster, plus vector variant).
Prompt set
- Total prompts: ~200 across both models.
- Categories covered: product photography, editorial spreads, food and beverage, fashion lookbooks, branding posters, icon and logo sets, illustration, vector layouts, and material studies.
Scoring scale
Each capability is scored 1 to 5 against a fixed definition. Scores are absolute, not relative.
- 5 = Executes the definition without meaningful failure.
- 4 = Executes reliably with minor inconsistencies.
- 3 = Execution is inconsistent or partial.
- 2 = Execution frequently fails or requires significant workarounds.
- 1 = Unusable for this capability.
Capability definitions

3. Capability Scores and Breakdown
Overall scores:
- Recraft V4 overall score: 4.26/5.0

3.1 Prompt adherence: Score 4.5
What this measures: Output matches the request exactly.
What works:
- Interprets detailed creative direction accurately, following visual relationships like reflections, materials, and background interactions.
- Short prompts produce surprisingly strong results — the model fills in design-quality decisions about composition, color, and lighting without explicit instruction.
What breaks:
- Highly complex multi-constraint prompts lose later instructions. The model deprioritizes details listed further in the prompt.
- Specific character actions (mid-bite, precise hand gestures) are sometimes interpreted loosely rather than followed literally.
Examples:



See the prompt here
Steaming handcrafted latte with intricate rosetta foam art in a matte ceramic cup, resting on a reclaimed walnut bar top with visible wood grain and soft natural wear. In the softly blurred background, a rustic ceramic pour over dripper, a loosely folded burlap coffee bean sack, scattered roasted coffee beans, and a small stainless steel spoon create a warm artisan café setting. A faint curl of steam rises from the cup, while warm amber pendant light casts a golden glow across the scene, creating subtle highlights on the cup rim and gentle shadows on the wood surface. The palette is rich espresso brown, creamy ivory, toasted caramel, and muted walnut tones. Shallow depth of field keeps the foam art in crisp focus, with the surrounding objects softly fading into bokeh. Fine texture is visible in the ceramic glaze, burlap fibers, and polished wood grain. Cozy, inviting mood, premium café aesthetic, highly realistic, detailed food photography, 2K resolution.
3.2 Text rendering accuracy: Score 4.7
What this measures: Spelling, legibility and font fidelity.
Strengths:
- Typography integrates structurally into compositions — text on posters, packaging, and signage feels art-directed rather than stamped on.
- Short to medium text strings render with high fidelity: headlines, brand names, event details, and product labels land clean.
- Handles serif, sans-serif, and display typefaces with strong stylistic control.
Limitations:
- Dense text blocks (ingredient lists, legal disclaimers, long paragraphs) degrade in accuracy.
- Custom letterforms and calligraphic logotypes need human refinement after generation.
Examples:



See the prompt here
Create a street art fashion editorial photoshoot with a bold, rebellious, high-energy vibe. Feature a stylish model in edgy streetwear, such as oversized jackets, layered pieces, statement sneakers, graphic textures, chains, sunglasses, and trend-forward accessories. Set the photoshoot in an urban environment filled with graffiti walls, pasted posters, spray paint textures, neon signs, concrete backdrops, alleyways, and raw city details. Fill the composition with lots of wording and typographic elements, making the text a major visual feature. Include layered phrases, slang, fashion statements, and expressive words scattered across the scene like a street-style magazine editorial meets graffiti culture.
3.3 Camera & composition: Score 4.5
What this measures: Perspective, angles, depth of field, framing.
Strengths:
- Art-directed composition is the model’s core identity — balanced negative space, intentional framing, and cohesive color relationships.
- Camera language in prompts (lens, aperture, ISO) produces believable depth of field, lighting falloff, and texture.
- Best negative space handling in the lineup — ideal for layouts where text overlays need breathing room.
Limitations:
- Composition choices can override prompt intent. The model’s strong aesthetic opinion sometimes prioritizes visual taste over literal instruction.
- Extreme fisheye or unconventional perspective prompts are interpreted loosely.
Examples :



See the prompt here
Full-length portrait of a model wearing an oversized oatmeal linen blazer and wide-leg sage trousers, standing barefoot on sun-warmed sandstone steps in a Mediterranean courtyard, captured with a wide-angle lens from a low bottom-right angle in a three-quarter perspective to emphasize the drape and silhouette of the outfit while including more of the surrounding architecture. Cascading bougainvillea in vivid magenta frames the archway behind, while soft golden-hour side light creates gentle highlights along the fabric folds and natural contours of the body. Natural skin texture, visible linen weave, and subtle sandstone surface detail are clearly rendered. The composition uses muted earth tones and botanical greens with a pop of fuchsia, creating an elegant editorial fashion mood. Crisp focus on the model, softly layered background depth, refined lifestyle photography aesthetic, 2K clarity.
3.4 Generation speed: Score 3.5
What this measures: Time from prompt to finished output.
Strengths:
- ~10 seconds per image for V4 standard is acceptable for focused creative work.
- Exploration Mode offsets the speed gap by generating eight visual directions from a single prompt.
Limitations:
- Slower than speed-focused models like Nano Banana 2 (~3-5s). Not suited for rapid A/B testing or high-volume iteration.
- V4 Pro at ~28s is particularly slow for workflows requiring quick turnaround.
3.5 First-try quality: Score 5.0
What this measures: Usable output without re-prompting.
What works:
- Highest first-try score in the lineup. Even minimal prompts produce outputs that feel intentionally designed rather than randomly generated.
- The model’s “design taste” means first outputs already have balanced composition, cohesive color, and refined detail.
- Editorial and campaign-style outputs are frequently usable as-is — no re-rolling needed.
What breaks:
- The model’s strong aesthetic opinion means the first output may not match what the user had in mind, even if it’s objectively well-composed. Re-prompting is about steering direction, not fixing quality.
Examples:



See the prompt here
A warm overhead shot of freshly baked sourdough bread on a rustic wooden cutting board, golden crust with deep scoring lines, scattered flour dust, a small chalkboard sign reading “Baked Fresh Daily” in elegant hand-lettered script, soft morning light from a nearby window, cozy earth tones with pops of warm amber, clean and balanced composition, natural artisan bakery styling
3.6 Style Fidelity: Score 3.7
What this measures: Era/aesthetic match and character consistency when using a reference image.
What works:
- Strong default editorial aesthetic — warm, cinematic, campaign-quality across most prompts.
- Design-style prompts (badges, stickers, icons, emblems) produce consistent, production-ready results.
- Low style drift within a single generation session.
What breaks:
- Character consistency across different scenes is fragile – characters drift significantly between generations.
- Reference image adherence is weaker than competitors. The model takes creative liberties that can override brand guidelines.
- Illustration and artistic styles outside the model’s editorial default are less convincing.
Examples :



See the prompt here
Art-directed flat illustration of a creative co-working loft with floor-to-ceiling windows overlooking a city skyline at golden hour, two designers collaborating around a large monitor displaying colorful UI mockups, vibrant teal and burnt orange accent palette against warm neutral tones, clear visual hierarchy with foreground desk props and background city glow
3.7 Skin Realism: Score 4.2
What this measures: Pore detail, tonal range, melanin accuracy.
What works:
- Cinematic skin rendering, warm, textured, and emotionally specific. Portraits feel editorial rather than clinical.
- Natural tonal variation and lighting interaction on skin surfaces.
- Strong performance for campaign-style portraits where mood matters more than clinical accuracy.
What breaks:
- Pore-level detail is weaker than GPT Image 2 and Nano Banana Pro at extreme close-up.
- Skin texture under dramatic or sharp lighting setups can feel smoothed or stylized rather than photographic.
Examples :



See the prompt here
Editorial shampoo beauty photoshoot featuring a female model with healthy, hair that is floating on water, styled to look fresh and naturally hydrated. Close-up to mid-shot composition, luminous skin, clean makeup, confident expression, and elegant pose. Hair appears smooth, soft, and nourished with subtle moisture droplets and defined strands catching the light. Luxury beauty campaign aesthetic, minimal studio background, soft diffused lighting with crisp highlights, clean composition, premium shampoo advertisement style with flowers, realistic texture, high-end editorial photography, fresh and refined mood, 2K clarity.
3.8 Hand Anatomy: Score 3.8
What this measures: Correct finger count, natural poses.
What works:
- Simple hand poses (resting, holding objects, gesturing) render correctly in most runs.
- Improvement over previous Recraft generations — fewer spontaneous extra fingers.
What breaks:
- Tight close-ups with precise hand actions (cutting, piano playing, interlocking fingers) still break.
- Dynamic or complex hand poses deform, particularly when multiple hands interact.
Example:



See the prompt here :
Close-up of two hands cradling a warm matte ceramic mug filled with steaming oat milk latte, soft natural window light, visible skin texture and subtle hand veins, clean neutral linen sleeves, shallow depth of field with crisp focus on the fingers and cup rim, cozy lifestyle photography aesthetic, 2K detail.
3.9 Surface Quality: Score 4.8
What this measures: Metal, glass, fabric, multi-material rendering.
Strengths:
- Strongest material rendering in the lineup. Metal, glass, fabric, and multi-material compositions are physically convincing.
- Micro-detail in surface textures, brushed steel reflections, fabric weave, glass refractions resolves at a level that passes professional inspection.
- Product photography benefits significantly: packaging, jewelry, cosmetics, and fashion items render at studio quality.
Limitations:
- Seamless texture patterns and tileable surfaces are not a strength. Avoid for textile repeats or wallpaper patterns :
Example:



See the prompt here
A photo featuring a stylized dramatic editorial scene. Two figures with strong androgynous features, shaved heads, extremely pale skin, wearing sculptural avant-garde white garments with ruffled high collars. Positioned against a rich cobalt blue background. Theatrical lighting — cold and directional. The makeup is graphic: bold geometric shapes in black over the eyes. Moody, conceptual, high fashion editorial.
3.10 Spatial Reasoning: Score 3.5
What this measures: Left/right, counting, grid layouts.
What works:
- Simple compositions with few elements are handled correctly.
- Structured prompts for icon sets and grid layouts produce usable results with careful prompt engineering.
What breaks:
- Counts above 5-6 items become unreliable.
- Complex multi-element scenes lose spatial accuracy and simplify detail.
- Grid layouts and dense arrangements are inconsistent, items blend or overlap.



See the prompt here
Ultra-realistic editorial food photograph of a blank white menu held by warm olive-toned hands, viewed from directly above, resting on a richly saturated deep teal linen tablecloth. The menu is placed diagonally across the frame, creating elegant visual movement and negative space. To the left, a handcrafted ceramic plate in warm ivory holds a carefully arranged octopus carpaccio, thin translucent slices glistening with olive oil, sprinkled with bright green herbs, pink peppercorns, and small flakes of sea salt. The surface reflects light subtly, emphasizing freshness and moisture. To the right, a matte ceramic plate in burnt orange holds grilled lemon halves and a small portion of charred vegetables, their caramelized surfaces glowing with warm highlights. A crystal glass filled with pale golden white wine sits near the top of the frame, refracting warm light and casting subtle golden reflections across the tablecloth. Brushed gold cutlery is arranged with intentional spacing, adding warmth and luxury. Lighting is soft and directional, evoking natural window light from the upper side, producing gentle shadows and enhancing textures in skin, ceramic glaze, and food surfaces. Tasteful Mediterranean editorial composition, rich harmonious colors (deep teal, olive skin tones, ivory, citrus yellow, burnt orange), refined fine dining aesthetic, ultra-realistic food textures, magazine-quality photography, 8K resolution, no text.
3.11 Lighting Quality: Score 4.7
What this measures: Named techniques, color temperature, multi-source.
What works:
- Cinematic lighting is a core strength, dramatic side lighting, half-shadow portraits, golden hour, and studio setups all render with high fidelity.
- Color temperature control is precise. The model responds well to specific Kelvin values and named lighting techniques.
- Multi-source lighting with complex shadow interactions resolves cleanly.
What breaks:
- Flat, utilitarian lighting setups (product catalog, ID badge) default toward the model’s editorial bias.
Examples:



See the prompt here
Editorial beauty photoshoot featuring a Korean makeup look with flawless dewy skin, soft gradient eyeshadow, subtle shimmer on the eyelids, precise eyeliner, naturally defined brows, glossy rose lips, and a radiant glass-skin finish. Show a female model with delicate, refined features, smooth luminous complexion, and a poised confident expression. Use a close-up to mid-shot composition with dramatic lighting, strong contrast, and sculpted highlights that accentuate the cheekbones, skin texture, and glossy makeup finish. Include moody shadow play, directional studio light, and a dark minimal background to enhance the high-fashion editorial atmosphere. Focus on fresh, elegant, polished Korean beauty aesthetics with a cinematic luxury feel, clean styling, premium beauty campaign mood, ultra-detailed skin and makeup texture, high-end editorial photography, 2K clarity.
4. Competitive Position
Overall rank: #5 of 8 image models (Score: 4.26/5.0).

Where Recraft V4 leads :
First-try quality: (Score 5.0) Only model scoring a perfect 5. Outputs are production-ready from the first generation, the “design taste” advantage means less time re-rolling and more time shipping.
Surface quality (Score 4.8): Tied for best with GPT Image 2. Brushed metal, glazed ceramic, wet stone, and woven linen all render with material-accurate light interaction. Multi-material compositions hold their physical logic, which is where most other models start to break down. Also, very strong on physics realism for still imagery. Water droplets, condensation, fabric drape, and specular highlights all behave correctly. V4 PRO is the model for shots where the product has to look like it exists in the room.
Lighting quality (Score 4.7): Golden hour, soft window light, and multi-source studio setups all behave correctly. For a candle founder shooting product imagery or a coffee brand prepping packaging proofs, V4 PRO is the model for shots where the light has to look like it’s coming from somewhere real.
Skin realism (Score 4.6): Visible pore detail, natural tonal range, and lifelike highlights make V4 PRO the model for editorial fashion and beauty work.
Unique capabilities:
Print-ready resolution: Native 2K headline advantage. The 2048×2048 output lands at full resolution for print without an upscaling pass. For a candle founder building a printed lookbook, or a coffee brand prepping packaging proofs, that saves a step and keeps detail intact at large scale.
Native SVG vectors: The only model producing real editable vector paths. No tracing, no cleanup, straight from prompt to Figma, Illustrator, or print production.


See the prompt here
Overhead flat lay of a vibrant Mediterranean mezze spread on a handmade terracotta platter — jewel-toned roasted beets, bright green tabbouleh, golden hummus drizzled with paprika-infused olive oil, scattered pomegranate seeds and fresh mint leaves — set on a sun-bleached linen tablecloth, soft natural window light from the left, rich saturated earth tones, print-ready resolution


See the prompt here
Editorial fashion poster with bold streetwear energy. Close-up portrait of a young woman with bleached eyebrows and glossy skin, wearing an oversized vintage leather jacket, gold chain layered necklaces, confident nonchalant pose.
Flat electric red background with large distorted bubble-style lettering reading “KITTL” behind the subject, slightly warped and glowing. Micro-typography details scattered around the edges. Halftone texture and print grain across the entire image.
Where Recraft V4 is outperformed:
Spatial reasoning: (Score 3.5). Behind GPT Image 2 (4.7) and Seedream 5.0 Lite (4.8) on complex layouts and multi-element scenes.
Generation speed: (Score 3.5). Nano Banana 2 (5.0) delivers images in 3-5s vs Recraft V4’s ~10s. For rapid iteration workflows, Recraft is not the right choice.
5. Use Case Guide
Best fit for Recraft V4: Brand designers and POD creators
- SVG vector production: Logos, icons, brand marks, and illustrations as real editable paths. No tracing step, no Illustrator cleanup.
- Editorial campaign layouts: Posters, magazine covers, and social campaigns that feel art-directed without hiring a designer.
- Product photography with text: Packaging labels, signage, and marketing visuals where typography and materials both need to be production-grade.
- Brand identity exploration: Exploration Mode generates eight visual directions from one prompt, ideal for early-stage brand development.
Best fit for Recraft V4 Pro: Product-based small business (SMB)
- Product hero images: Skincare, candles, coffee, and specialty food brands get studio-quality hero shots without a studio. Material rendering and lifelike lighting carry the cost of the model on this use case alone.
- Product photography: Watch ads, jewelry close-ups, and luxury accessories benefit from the 2K resolution and the model’s precision on metal, glass, and stone.
- Food and beverage visuals: Overhead flat lays, restaurant menu shots, and recipe spreads land with the saturation, texture, and lighting of a paid food photographer.
- Fashion and beauty imagery: Full-length editorial portraits with believable fabric weave, natural skin texture, and golden-hour location lighting. Multi-image sets hold visual identity across a campaign.
- Material-heavy scenes: Compositions with metal, glass, ceramic, fabric, or wet surfaces in the same frame. Surface quality holds where most other models start to break down.
- Final marketing assets: Production-ready outputs for campaigns where finish and fidelity matter more than iteration speed. The right model for the last step, not the concept stage.
- Print-facing visuals: Catalogs, magazine ads, packaging mockups, and stockist pitch decks where the native 2K resolution carries detail to full bleed without an upscaling pass.
6. Kittl’s Verdict
Recraft V4 is the model you reach for when the output needs to look designed, not generated. Its “design taste” – the way it handles composition, negative space, color relationships, and typography, produces work that feels art-directed even from short prompts. The native SVG vector capability is unique and genuinely useful: POD creators, brand designers, and anyone shipping to print can go from prompt to production file without a tracing or cleanup step.
Don’t use it when speed matters most. At ~10 seconds per image (28s for Pro), it’s not built for high-volume iteration or A/B testing at pace, Nano Banana 2 delivers comparable commercial quality in a third of the time. Also avoid when the brief requires character consistency across scenes (characters drift), complex spatial compositions (multi-element prompts simplify), or skin-level photorealism at extreme close-up (GPT Image 2 leads there). The model’s strong aesthetic opinion is a strength when you want editorial polish, but a limitation when you need neutral, predictable output for catalogs or simple mockups.
Bottom line: Use Recraft V4 when the brief is “make this look like a designer touched it” — especially when the deliverable is a vector, a campaign visual, or anything going to print. Pair it with Nano Banana 2 for speed and GPT Image 2 for complex scenes. Use Recraft V4 PRO for product visuals, mockups, and clean commercial layouts where finish matters more than speed.
Last updated: May 2026. Tested on Recraft V4 PRO and Recraft V4 via Kittl + Recraft API. Model capabilities subject to change.

Kittl brings pro-quality templates, premium fonts, and AI-powered tools right to your browser. Drag, drop, customize, and share. Perfect for brands, creators, and anyone who loves great design. Kittl was made to give you seamless and fast design process.


