Can computers invent new things that change the world, even pharmaceuticals and solutions outside the scope of human capacity? The answer is a resounding “yes”, according to Hafsteinn Einarsson, assistant professor of computer science at the University of Iceland.
“There is no doubt that computers will invent more and more things. It is already happening. AI is already being used in research and development in the pharmaceutical industry. Of course, until now there has always been a human element when AI is utilised, but this will become less prominent in the future in my opinion,” says Hafsteinn.
Christmas pictures created by computers using an AI interface built by Hafsteinn have caused a bit of a stir. Hafsteinn writes a textual description and feeds it to the computer, which promptly creates images based on it. The process can be viewed as art creation using AI methods.
“The pictures are created using a specific type of diffusion model, which removes noise from images. When you apply such a model to an image consisting only of noise, you can add other models that control which parts of the image emerge. For example, I can use another model trained on the interplay of text and images. This lets me input a textual description and direct the creative process so that with each step the image better fits the description.”
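The idea of starting from noise and steering each step toward the description can be illustrated with a toy sketch. This is not Hafsteinn's pipeline and not a real image model: the "image" is a short list of numbers, and `similarity`, `guided_step`, `generate` and `target` are hypothetical names standing in for the text-image model and the step-by-step refinement. It only shows the loop structure, in which every step nudges the image toward a better fit.

```python
import random


def similarity(image, target):
    """Hypothetical stand-in for a text-image model: higher means a better fit."""
    return -sum((p - t) ** 2 for p, t in zip(image, target))


def guided_step(image, target, step_size=0.2):
    """Move each pixel slightly in the direction that raises the similarity."""
    # The gradient of the similarity with respect to a pixel p is 2 * (t - p).
    return [p + step_size * 2 * (t - p) for p, t in zip(image, target)]


def generate(target, steps=20, seed=0):
    """Start from pure noise and refine it step by step."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in target]
    scores = [similarity(image, target)]
    for _ in range(steps):
        image = guided_step(image, target)
        scores.append(similarity(image, target))
    return image, scores


# Assumption: these four numbers play the role of "what the text describes".
target = [0.0, 0.5, 1.0, 0.5]
image, scores = generate(target)
```

In a real system the refinement would come from a trained denoising network and the score from a trained text-image model; here both are replaced by simple arithmetic so the loop can be run as is.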
AI models that create images from text
When Hafsteinn is asked to explain further, it becomes clear that the process is complex, and that language may be inadequate to explain exactly what happens.
“Several models are used in the process, but the diffusion model is the one that ultimately unveils the image, by removing noise step by step,” says Hafsteinn.
“The model trained for text and images that I use is based on contrastive learning. A neural network attempts to embed text and image in the same space; thus the embedding of the text used to describe a picture is similar to the embedding of the image. Embedding of a text that does not fit the image contrasts to it, and should not be similar to the embedding of the image in the allotted space.”
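A minimal sketch of the contrastive idea described in the quote, under stated assumptions: the embeddings are hand-written two-dimensional vectors rather than the output of a neural network, and `cosine`, `contrastive_loss` and the `temperature` value are illustrative choices, not details from the interview. The loss is low when each text is most similar to its own image and high when the pairs are mismatched.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(text_embs, image_embs, temperature=0.1):
    """Average cross-entropy of picking the matching image for each text."""
    n = len(text_embs)
    total = 0.0
    for i in range(n):
        logits = [cosine(text_embs[i], image_embs[j]) / temperature
                  for j in range(n)]
        # Numerically stable log of the softmax normaliser.
        top = max(logits)
        log_norm = top + math.log(sum(math.exp(x - top) for x in logits))
        # Penalise a low probability for the true (text i, image i) pair.
        total += log_norm - logits[i]
    return total / n


# Hypothetical embeddings: text i is meant to describe image i.
texts = [[1.0, 0.0], [0.0, 1.0]]
images_aligned = [[0.9, 0.1], [0.1, 0.9]]
images_swapped = [[0.1, 0.9], [0.9, 0.1]]

aligned_loss = contrastive_loss(texts, images_aligned)
swapped_loss = contrastive_loss(texts, images_swapped)
```

Training would adjust the network so that real caption-image pairs end up like the aligned case: matching pairs close together in the shared space, non-matching pairs far apart.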
At this point the technical jargon has become a bit dense, but Hafsteinn says that, through training, the AI's representations of text and image come to contain information about the content of the text and what can be found in the picture. In other words, the computer learns to create images that fit the text.
“I also use less essential models, for example to enlarge the image or clean it up a little.”
Pictures of virtually anything can be created
These days Hafsteinn is researching both text and images using neural network methods. This work relies on the contrastive learning methods mentioned above, which is how he came across diffusion models.
“Interest in the interaction between models of the kind I have been using has been growing in recent months, and an explosion in the field of artistic creation using AI is taking place. This is the first time a convincing image of literally anything can be created with relatively little effort. I have been in touch with international artists using methods similar to mine, and it varies greatly how the methods enter the creative process. This development is very exciting and I believe we will see a lot of progress in this field in the near future.”
When the discussion turns to innovation, Hafsteinn says that, for image creation, the results we see have more to do with data access and processing power than with novel methods.
“Of course, it is important to be familiar with the methodology to use the data correctly. Therefore, processing infrastructure and data must be present so this kind of work can be pursued. Training models with images and text enables us to better understand and analyse data that was not easily accessible before.”
Hafsteinn will give a presentation at UTmessan at the beginning of February, where he will discuss this process and the methods used in the field of automated art creation.