Researchers at Apple have unveiled a new model that lets users describe the changes they want to make to a photo in plain language, without ever opening photo editing software.
The MGIE model, which Apple developed with the University of California, Santa Barbara, can crop, resize, flip, and add effects to photographs, all through text prompts.
MGIE, or MLLM-Guided Image Editing, can be used for both simple and more advanced image editing tasks, such as changing the shape or brightness of an object within a shot. The model combines two distinct uses of multimodal large language models: it first learns to interpret the user’s instruction, and then “imagines” what the resulting edit would look like.
To edit a photo with MGIE, users simply type out the changes they want to make. The paper gives the example of a photo of a pepperoni pizza: typing “make it more healthy” adds vegetable toppings. Instructed to “add more contrast to simulate more light,” the model brightens a dark photograph of tigers in the Sahara.
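For readers curious how that two-stage, prompt-driven flow might be wired together, here is a minimal, hypothetical Python sketch. The mllm and diffusion_editor objects and their methods are assumptions made for illustration only, not MGIE’s actual API; the real usage lives in the project’s GitHub repository.

```python
# Hypothetical sketch of a two-stage, prompt-driven editing flow like MGIE's.
# The objects and method names below are illustrative assumptions, not Apple's API.
from PIL import Image


def expressive_instruction(mllm, image: Image.Image, prompt: str) -> str:
    """Stage 1 (assumed): the multimodal LLM expands a terse prompt such as
    'make it more healthy' into an explicit, visual-aware instruction,
    e.g. 'add vegetable toppings such as peppers and mushrooms'."""
    return mllm.generate(image=image, text=prompt)


def apply_edit(diffusion_editor, image: Image.Image, instruction: str) -> Image.Image:
    """Stage 2 (assumed): an editing model renders the imagined edit
    described by the expanded instruction."""
    return diffusion_editor.edit(image=image, instruction=instruction)


def mgie_style_edit(mllm, diffusion_editor, path: str, prompt: str) -> Image.Image:
    """Run both stages on an image file and return the edited image."""
    image = Image.open(path)
    instruction = expressive_instruction(mllm, image, prompt)
    return apply_edit(diffusion_editor, image, instruction)


# Hypothetical usage:
# edited = mgie_style_edit(mllm, editor, "pizza.jpg", "make it more healthy")
# edited.save("pizza_edited.jpg")
```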
“Instead of brief but ambiguous guidance, MGIE derives explicit visual-aware intention and leads to reasonable image editing. We conduct extensive studies from various editing aspects and demonstrate that our MGIE effectively improves performance while maintaining competitive efficiency. We also believe the MLLM-guided framework can contribute to future vision-and-language research,” the researchers wrote in the paper.
According to VentureBeat, Apple has made MGIE available for download on GitHub and has also released an online demo on Hugging Face Spaces. The company has not said what its plans are for the model beyond research.
Some image generation platforms, such as OpenAI’s DALL-E 3, can make basic edits to the images they generate based on text inputs. Adobe, the maker of Photoshop, the software most people reach for when editing images, has its own AI editing model: its Firefly AI model powers generative fill, which can add generated backgrounds to photographs.
Unlike Microsoft, Meta, or Google, Apple has not been a major player in the generative AI sector; nevertheless, CEO Tim Cook has stated that the company hopes to introduce more AI features to its devices this year. To facilitate the training of AI models on Apple Silicon chips, Apple researchers published an open-source machine learning framework dubbed MLX in December.