a.i.envisions.

Controlling Midjourney using Remix Mode and Variations

by | Sep 9, 2023 | Featured, Interior design

Today:

  • How prompt terms affect other things too
  • Remix Mode
  • Varying Region
  • Varying Whole Image
  • Image Prompt from Collage

I want to combine several concepts together. I want a character who’s an elderly lady, who’s smartly attired in a business setting, and I also want that character to be a cyborg. Let’s go into Midjourney and craft a prompt to make that happen.

Let’s start with a deliberately simplistic construction:

a character who’s an elderly lady, who’s smartly attired in a business setting, and also a cyborg

How prompt terms affect other things too

This is not a bad start, but it’s not quite what I was going for. Midjourney has inferred some things here.

  • The style is not realistic, with the second two having quite a painterly look. I think this is because I have used the word “character”, and that is associated with character design, concept art, game design etc. in the model’s vector embeddings.
  • The cyborg element and the character element are fused rather simplistically, with a woman’s head on top of a robotic body.
  • It’s hard to say what the setting is, but it’s not what I meant when I said “business”. Option 1 has the prop of a kind of tablet or notebook, which helps, and while the background appears smart and high-status, the overall feeling is definitely far-future sci-fi rather than a near-future business setting. I believe this is because the word “cyborg” sits close to other sci-fi concepts.

My hypothesis after this first result is therefore that the words “cyborg” and “character” are strongly associated with sci-fi concept art.
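To make “sits close” a little more concrete, here is a minimal Python sketch of how closeness between embeddings is commonly measured, using cosine similarity. The three-dimensional vectors are invented purely for illustration; they are not taken from Midjourney or any real model, and I’m only assuming that the model’s internal representation behaves in a broadly similar way.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy vectors (assumption, NOT real model data).
# Real embeddings have hundreds or thousands of dimensions.
TOY_EMBEDDINGS = {
    "cyborg": [0.9, 0.1, 0.2],
    "sci-fi concept art": [0.8, 0.2, 0.1],
    "business setting": [0.1, 0.9, 0.3],
}

if __name__ == "__main__":
    anchor = TOY_EMBEDDINGS["cyborg"]
    for term in ("sci-fi concept art", "business setting"):
        score = cosine_similarity(anchor, TOY_EMBEDDINGS[term])
        print(f"cyborg vs {term}: {score:.2f}")
```

With these made-up numbers, “cyborg” scores much higher against “sci-fi concept art” than against “business setting”, which is the kind of pull the hypothesis describes.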

Now let’s change one thing at a time and see what happens, starting with the word “character”. Will removing this affect the concept art style, or will the word “cyborg” still anchor it to this cluster of meanings?

Remember, we’re limiting ourselves here for now: no inpainting, no variations, no processing in other applications, no term weights and no parameters. How close can we get to what we want with just a simple prompt?

an elderly lady, who’s smartly attired in a business setting, and also a cyborg

Some things have happened. One of the four images is now external. Two of the four images now feature something that looks a little like a business suit. One of the images retains hints of a painterly style.

I would say Image 2 has a colour tone, posing and backdrop that most resembles the near-future business scenario I’m looking for. By contrast, Image 4 has an overtly sci-fi concept art feel.

I’m now going to branch this tree at Image 2 and continue working there. The outfit is the weak part: it doesn’t convey the business character strongly enough and is too sci-fi. Let’s address that with some inpainting.
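The one-change-at-a-time testing used in this section can also be planned out ahead of time. The following is a rough Python sketch, not a Midjourney feature: the helper name `drop_one_term` and the way the prompt is split into terms are my own assumptions. It simply lists every prompt you get by removing a single term, so each variant can be pasted into Midjourney and compared against the baseline.

```python
from typing import List, Tuple


def drop_one_term(terms: List[str], separator: str = ", ") -> List[Tuple[str, str]]:
    """Return (dropped_term, prompt_without_it) pairs, one per term."""
    variants = []
    for i, term in enumerate(terms):
        kept = terms[:i] + terms[i + 1:]  # every term except the i-th
        variants.append((term, separator.join(kept)))
    return variants


if __name__ == "__main__":
    # Hypothetical split of the starting prompt into separate terms.
    terms = [
        "a character",
        "an elderly lady",
        "smartly attired in a business setting",
        "a cyborg",
    ]
    for dropped, prompt in drop_one_term(terms):
        print(f"without {dropped!r}: {prompt}")
```

Changing only one term per run makes it easier to attribute a change in the images to that term, although Midjourney’s randomness means a single grid is never conclusive.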

Remix Mode

None of these are quite what I am going for. Still too sci-fi. We’ll now activate some additional functionality in Midjourney by using the Remix mode.

To do this, enter the command /settings to open the settings menu, and select Remix mode from the list of buttons. By default, it will be off.

This fires up a very important function – the ability to add new prompt terms when using Vary Region.

I will enter new prompt terms here in order to give more control over the outfit.

smart business suit jacket

This is clearly no good. We’ve now swung too far away from the cyborg look, the pink colour is at odds with the rest of the composition, the generator doesn’t know whether it’s meant to be generating something more masculine or feminine, and the hands that have appeared are distracting. Let’s add some more specificity.

Here, I’ve used the prompt women’s black suit jacket, and I’ve also redrawn the infill boundary to include more of the mechanical sections at her neck.

I like Image 3 best. Let’s now move onto the setting.

Although the near-future business setting is spot on, I want her to be inside rather than outside. The image style is also much closer to the photographic style I’m after than some of those earlier graphic art styles, and I believe this has emerged through removal of the term “character”.

I’m going to experiment with three different methods:

  1. varying + remixing the image,
  2. varying + remixing a region, with the background selected, and
  3. separately generating a new background, compositing the foreground with the background externally to Midjourney, and then using the collage as an image prompt.

1. Varying + Remixing Entire Image

In the following, I’ve subtle-varied the earlier image, using the same prompt with new terms:

an elderly lady, who’s smartly attired in a business setting, and also a cyborg

This has mostly not worked. The setting looks more like an interior, but it’s ambiguous and there are some inconsistencies and discordant uses of colour. We’ve also lost the character. I will now try a strong vary instead, adjusting the prompt and adding the “seed” parameter and an image prompt to keep the result closer to the previous image.

an elderly lady who’s also a cyborg, smartly dressed, inside an office with desks, chairs and glass

I really like some of what is going on here. The robotic eyes and digital glasses of Image 1 are cool, and the character is clearly posed inside the correct environment and wearing the right outfit. Images 3 and 4 compositionally match the original, but 1 and 2 are different. You’ll note another interesting effect here, which is that the character appears to be getting younger. Or at any rate, they appear to have had some plastic surgery. The character in Image 2 could quite easily pass for 25 years old, were their hair darker.

I guess this also proves that Karl Lagerfeld was from the future.

Here, I’ve run the exact same prompt again, but using subtle-vary instead of strong-vary. As you can see, the results are far more similar to the original. We’ve still lost our character though, and there are some strange things going on in one or two of them. I think Images 1 and 2 have been the most successful.

Now let’s try a different method.

2. Varying + Remixing a Region

Again, starting with that same image, I’ve masked out only the background and used the following prompt:

office interior, smart modern, cool blues and greys, chairs, desks and glass

Clearly a disaster! The colours don’t match, the lighting doesn’t match, the content doesn’t match.

I’ll do that again, but now with an image prompt and seed.

This is much more like it!
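For reference, a Midjourney prompt that combines these pieces follows the order image URL first, then text, then parameters such as --seed. Here is a small Python sketch that assembles a prompt string in that order. The function name `build_prompt` and the example.com image URL are placeholders of my own, and I’m assuming the Remix prompt box accepts the same syntax as a normal prompt; the seed is the one used throughout this post.

```python
from typing import Optional, Sequence


def build_prompt(
    text: str,
    image_urls: Sequence[str] = (),
    seed: Optional[int] = None,
) -> str:
    """Join image URLs, prompt text and an optional --seed parameter."""
    parts = list(image_urls) + [text.strip()]
    if seed is not None:
        parts.append(f"--seed {seed}")  # parameters always go last
    return " ".join(parts)


if __name__ == "__main__":
    # Placeholder URL: substitute the link to your own uploaded image.
    reference_image = "https://example.com/cyborg-lady.png"
    print(
        build_prompt(
            "office interior, smart modern, cool blues and greys, "
            "chairs, desks and glass",
            image_urls=[reference_image],
            seed=477419480,
        )
    )
```

Keeping the seed and image prompt fixed while only the text changes is what makes the before and after results comparable.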

Finally, let’s try a third way.

3. Image Prompt from Collage

I’m now going to generate a background by itself, isolate the character, composite them using image editing software, and then use the collage as an image prompt. I start by creating a new prompt using the same text as the background inpaint that generated that last image, plus the same seed I’ve been using throughout, but no image prompt as I don’t want it to include the character this time.

glassy office interior, smart, modern, cool blues and greys --seed 477419480

Here we have some options to play with. You can probably already see the first problem though – the camera vantage point is not consistent between our background and our character.

Using an image editor, I cut out the character and overlay it onto the different backgrounds to see which is the best match. I choose Image 1, but position the character higher within the frame to better suit the camera angle. This necessitates some rough manual painting to extend the character downwards. Remember, it doesn’t need to be neat as it’s just a prompt.

Uploading that image to Discord, I’m able to use it as an image prompt, along with the text from the more successful whole-image variations above, and the same seed.

an elderly lady who’s also a cyborg, smartly dressed, inside an office with desks, chairs and glass

This feels like it’s losing control. I believe it could be a useful technique under certain circumstances, but it may be more effective with Stable Diffusion + ControlNet rather than Midjourney.

Of these three, in order to maintain the character while having flexibility over the background, inpainting the background seems most effective. To finish, I’ll return to that result and see if I can get a better composition with some more work.

Here it is – extended sideways using the pan right button and terms relating to the desk and window added using Remix Mode. If this image were needed somewhere important I’d do more work at this point, but this illustrates how a targeted and problem-solving approach to image generation is practical using Midjourney’s current tools.

If you found this exploration helpful, please share it

🙂