Abstract
‘Prompt engineering’, whereby a user writes a text prompt to generate an image using AI, is a relatively new skill and relies on Natural Language Processing (NLP). The AI image generator (DALL-E 3, Midjourney, etc.) uses NLP to interpret the user’s requirements, and although initial results can be very effective, the more specific the requirements, the harder it can be to generate the desired result. Simple commands (‘A horse standing in a field’, for example) are easy to generate, but specific details, such as the actions of the horse and its placement in relation to other features, are harder for NLP to interpret and can be hard to describe using text alone. In addition, the current generation of AI image generators cannot interpret the semiotic meaning of objects (the systems could be said to ‘understand’ what something is, but not what it means). This requires the user to describe a scene or object that carries semiotic meaning using a semantic description. This paper investigates how a visual language (drawing), rather than a written one, can be more effective when using AI image generators, and examines the implications of this for industry and education.
Presenters
Nicholas Lewis, Senior Lecturer, Faculty of Arts and Creative Industries, University of Sunderland, Sunderland, United Kingdom
Details
Presentation Type
Paper Presentation in a Themed Session
Theme
2024 Special Focus—Images and Imaginaries from Artificial Intelligence
Keywords
Drawing, AI, Generative AI, NLP, Creative Industry, Illustration, Education, Semiotics