Abstract
‘Prompt engineering’, whereby a user writes a text prompt to generate an image using AI, is a relatively new skill and relies on Natural Language Processing (NLP). The AI image generator (DALL-E 3, Midjourney, etc.) uses NLP to interpret the user’s requirements, and although initial results can be very effective, the more specific the requirements, the harder it can be to generate the desired result. Simple commands (‘A horse standing in a field’, for example) are easy to generate, but specific details, such as the actions of the horse and its placement in relation to other features, are harder for NLP to interpret and can be hard to describe using text alone. In addition, the current generation of AI image generators cannot interpret the semiotic meaning of objects (the systems could be said to ‘understand’ what something is, but not what it means). This requires the user to describe a scene or object that carries semiotic meaning using a semantic description. This paper investigates how a visual language (drawing), rather than a written one, can be more effective when using AI image generators, and examines the implications of this for industry and education.
Presenters
Nicholas Lewis, Senior Lecturer, Faculty of Arts and Creative Industries, University of Sunderland, Sunderland, United Kingdom
Details
Presentation Type
Paper Presentation in a Themed Session
Theme
2024 Special Focus—Images and Imaginaries from Artificial Intelligence
Keywords
Drawing, AI, Generative AI, NLP, Creative Industry, Illustration, Education, Semiotics