AI Art Critic
Too many people are unable to receive a proper art education due to financial, time, or societal constraints. My AI Art Critic addresses this by serving as an easily accessible yet reliable tool that gives feedback on an image the user provides, helping them develop technical skills alongside creativity. Through this project, I hope to inspire fellow engineers and artists to combine the power of STEM with the beauty of art.
Area of Interest
Information Science, Data Science, Artificial Intelligence
Monta Vista High School
AI Art Critic: ML Features
My goal for the end of these three weeks is to make a Minimum Viable Product (MVP). In the second week, I researched various object detection methods in depth, as well as tools I can use to build machine learning models.
I used TensorFlow’s Object Detection API to determine the realistic components of a piece of artwork.
- Uses Google’s TensorFlow object detection model, which can detect up to 80 objects and works exceptionally well for my purposes.
- Spent hours trying to install the correct dependencies, download the specific versions of TensorFlow and NumPy needed, and find the files to run in the Jupyter Notebook.
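As a rough sketch of how the detector's raw output could be turned into a readable list of objects: the Object Detection API reports, for each image, parallel lists of class IDs and confidence scores. The function name `confident_objects`, the small label map, and the 0.5 threshold below are assumptions for illustration, not the project's actual code.

```python
def confident_objects(classes, scores, label_map, min_score=0.5):
    """Return (label, score) pairs for detections at or above min_score."""
    results = []
    for class_id, score in zip(classes, scores):
        if score >= min_score:
            # Fall back to "unknown" for IDs missing from the label map.
            label = label_map.get(int(class_id), "unknown")
            results.append((label, float(score)))
    # Highest-confidence detections first.
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


# Hypothetical values shaped like one image's detection output.
label_map = {1: "person", 17: "cat", 18: "dog"}
classes = [18, 1, 17]
scores = [0.91, 0.42, 0.77]
print(confident_objects(classes, scores, label_map))
```

Filtering by score keeps low-confidence guesses out of the feedback; the right threshold would need tuning on real artwork.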
This feature returns the determined mood of an input piece of art.
- To speed up the process, I used Nanonets, a free alternative to other costly auto ML services, such as Google’s AutoML or AWS.
- I collected around 430 pieces of art from my friends, art studio, brother, and myself.
- To label the dataset, I grouped pieces into folders by mood (energetic, calm, pleasant, unpleasant) and uploaded them onto Nanonets. Nanonets automatically trains an object classification model and returns a string containing the certainty of each mood for the input image.
- In order to give nuanced feedback, I wrote some code to determine the specific mood utilizing this mood chart in Figure 13 of this research paper.
- It works quite well!
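One way the four certainties could be combined into a more specific mood is sketched below. It assumes the mood chart arranges moods around two axes, pleasant versus unpleasant and energetic versus calm; that layout, the function name `describe_mood`, and the quadrant wording are all assumptions, not taken from the paper or from my actual code.

```python
import math


def describe_mood(certainties):
    """Place an artwork on a two-axis mood plane from four class certainties."""
    # Horizontal axis: pleasant minus unpleasant.
    valence = certainties["pleasant"] - certainties["unpleasant"]
    # Vertical axis: energetic minus calm.
    arousal = certainties["energetic"] - certainties["calm"]
    # Angle in degrees, counter-clockwise from the "pleasant" axis.
    angle = math.degrees(math.atan2(arousal, valence)) % 360
    if angle < 90:
        quadrant = "energetic and pleasant"
    elif angle < 180:
        quadrant = "energetic and unpleasant"
    elif angle < 270:
        quadrant = "calm and unpleasant"
    else:
        quadrant = "calm and pleasant"
    return quadrant, angle


# Hypothetical certainties for one artwork.
example = {"energetic": 0.6, "calm": 0.1, "pleasant": 0.25, "unpleasant": 0.05}
quadrant, angle = describe_mood(example)
print(quadrant, round(angle, 1))
```

The angle could then be matched against finer-grained mood labels on the chart instead of just a quadrant.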
HANDWRITING DIGIT CLASSIFICATION:
This was just a test to try out AutoKeras.
- Researched AutoKeras
- Wrote and trained an image classification model using the Keras API. Used the standard MNIST dataset as the data. Used 10 epochs to increase accuracy.
- The downside is that all images must have the same dimensions, which will not work well for my purposes.
This feature returns the European art era the input piece most aligns with.
- Again, I used Nanonets just as an MVP before I write my own algorithms in the future.
- Downloaded ~80,000 images (the entire WikiArt dataset as of 2016), which surprisingly took up only 27 GB.
- Used 1,000 images for each of the 27 eras, or all available images when an era had fewer than 1,000.
- I don’t think the Nanonets free trial used all of the images I uploaded to train the model. The accuracy is just under 50%, which is acceptable given the minuscule differences between styles.
- To compensate for this, I wrote some code to parse the output and display both the winner and the runner-up era.
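The winner and runner-up step can be sketched as follows. The shape of the prediction list and the field names `label` and `probability` are hypothetical stand-ins for whatever the classifier actually returns, and the probabilities are made-up example values.

```python
def top_two_eras(predictions):
    """Return the (era, probability) pairs with the two highest probabilities."""
    ranked = sorted(predictions, key=lambda p: p["probability"], reverse=True)
    return [(p["label"], p["probability"]) for p in ranked[:2]]


# Hypothetical classifier output for one painting.
predictions = [
    {"label": "Impressionism", "probability": 0.41},
    {"label": "Baroque", "probability": 0.08},
    {"label": "Post-Impressionism", "probability": 0.33},
]
winner, runner_up = top_two_eras(predictions)
print(f"Most likely era: {winner[0]} ({winner[1]:.0%})")
print(f"Runner-up: {runner_up[0]} ({runner_up[1]:.0%})")
```

Showing the runner-up helps when two closely related styles split the probability between them.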
AI Art Critic: Non-ML Features
I spent my first week developing the non-machine-learning features of my AI Art Critic, which served as a rigorous introduction to the world of Python.
The first feature that I developed is a value analysis program that determines the ratios of dark, medium, and light pixels in the artwork.
- Takes a specified image and converts it to 8-bit grayscale to simplify the process.
- Stores the returned values from the .getdata() method in a list, which is iterated through using a for loop to increment the dark, medium, and light pixel counts by comparing the current pixel value with threshold values.
- Performs some arithmetic to give the ratios of dark, medium, and light pixels.
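The steps above can be sketched as a short function. The threshold values 85 and 170 (splitting the 0-255 range into thirds) and the name `value_ratios` are assumptions; my actual thresholds may differ.

```python
def value_ratios(pixels, dark_max=85, light_min=170):
    """Return the fractions of dark, medium, and light pixels.

    `pixels` is an iterable of 8-bit grayscale values (0-255).
    """
    dark = medium = light = 0
    for value in pixels:
        if value <= dark_max:
            dark += 1
        elif value >= light_min:
            light += 1
        else:
            medium += 1
    total = dark + medium + light
    if total == 0:
        raise ValueError("image has no pixels")
    return dark / total, medium / total, light / total


# With Pillow, the pixel list could come from:
#   from PIL import Image
#   pixels = list(Image.open("artwork.png").convert("L").getdata())
pixels = [0, 30, 90, 128, 200, 255, 255, 60]
print(value_ratios(pixels))  # (0.375, 0.25, 0.375)
```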
The second feature I developed determines the average color, the dominant color, and the N most dominant colors, and saves a copy of the image using only N colors.
- Relies on NumPy, SciPy, and ImageIO
- Opens image and shrinks it for lower calculation times. Stores pixel values in an array and determines clustered colors using k-means clustering.
- Processes the array and sorts it to find the most frequent cluster, which is the dominant color. Converts it to hexadecimal to print alongside the RGB value. I modified code from Stack Overflow for this step.
- Converts the image to only N colors by comparing the first term of the enumerated values with the previously obtained locations of the color clusters and determining which of the N colors to fill each pixel with. I modified code from Stack Overflow for this step.
- Average color is not too useful, so it is put in a function that is only called if necessary. Iterates across the image pixels and sums the red, green, blue, and alpha values. Divides by the total number of pixels to get the average RGBA, and converts it to hex.
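The clustering idea can be illustrated with a minimal sketch. To keep it dependency-free, it swaps SciPy's k-means for a small hand-written version and works on a plain list of RGB tuples (alpha is ignored); every function name here is hypothetical, and this is not the Stack Overflow code I adapted.

```python
def rgb_to_hex(color):
    """Format an (r, g, b) tuple as a hex string such as '#ff8000'."""
    return "#{:02x}{:02x}{:02x}".format(*(int(round(c)) for c in color))


def average_color(pixels):
    """Mean of each RGB channel across all pixels."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def nearest(pixel, centers):
    """Index of the center closest to pixel (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, centers[i])))


def kmeans_colors(pixels, k, iterations=10):
    """Cluster pixels into at most k colors; return (centers, counts)."""
    # Seed with the first k distinct colors so the result is deterministic.
    centers = []
    for p in pixels:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in pixels:
            groups[nearest(p, centers)].append(p)
        # Move each center to its group's mean; keep it if the group is empty.
        centers = [average_color(g) if g else c for g, c in zip(groups, centers)]
    counts = [0] * len(centers)
    for p in pixels:
        counts[nearest(p, centers)] += 1
    return centers, counts


# Tiny made-up "image": mostly red, some blue.
pixels = [(250, 10, 10)] * 5 + [(10, 10, 240)] * 2 + [(240, 20, 20)]
centers, counts = kmeans_colors(pixels, k=2)
dominant = centers[counts.index(max(counts))]
print("dominant:", rgb_to_hex(dominant))
print("average:", rgb_to_hex(average_color(pixels)))
# Replacing each pixel with its nearest center gives the N-color image.
quantized = [centers[nearest(p, centers)] for p in pixels]
```

On a real image, shrinking it first (as described above) keeps the pixel list small enough for the loops to finish quickly.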
Object Detection + Image Recognition with Deep Learning
My Starter project is a combination of the Object Detection with Deep Learning and the Image Recognition projects.
I downloaded Raspbian OS and installed it on the Raspberry Pi using an SD card, then set up Bluetooth on the Pi. To simplify things, I installed TensorFlow Lite to run a pre-trained model. Due to the low processing power of the Raspberry Pi 3, the model runs at only 2 frames per second. Nonetheless, the accuracy is surprisingly high.
These projects were straightforward, so I completed them on the first day by following a tutorial; I spent the next day researching and analyzing the code to work out exactly what I had done.