Looking for sequences with an image classifier
After playing around with combining audio spectrograms with an image classifier, I decided to try identifying sequences using the image classifier with a different input.
I settled on drawing traces from PoseNet output and training the image classifier on those traces to try to identify specific gestures.
The training images for the image classifier were generated as follows:
- A sketch that draws traces of head position data from PoseNet
- I recorded a video of myself making a specific gesture for each class I wanted to train, using the macOS screen capture utility
- I extracted frames from the video using VLC
- I resized the images using Preview.app
- I loaded the images into Teachable Machine and trained the image classification model
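The trace-drawing and resizing steps above can be sketched in a few lines of Python. This is only an illustration, not the sketch I actually used: the function name `rasterize_trace`, the 0/1 pixel grid, and the default size are all assumptions, and the input is assumed to be a list of (x, y) head positions such as PoseNet might report for one gesture.

```python
def rasterize_trace(points, size=32):
    """Draw a polyline through `points` onto a size x size grid of 0/1 pixels.

    `points` is a list of (x, y) positions (hypothetical head positions,
    one per video frame). Returns a list of rows, each a list of 0 or 1.
    """
    grid = [[0] * size for _ in range(size)]
    if not points:
        return grid

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Use one scale for both axes so the shape of the gesture is preserved.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0

    def to_pixel(p):
        col = round((p[0] - min_x) / span * (size - 1))
        row = round((p[1] - min_y) / span * (size - 1))
        return row, col

    pixels = [to_pixel(p) for p in points]
    first_row, first_col = pixels[0]
    grid[first_row][first_col] = 1

    # Connect consecutive positions by stepping along the longer axis.
    for (r0, c0), (r1, c1) in zip(pixels, pixels[1:]):
        steps = max(abs(r1 - r0), abs(c1 - c0), 1)
        for i in range(steps + 1):
            r = round(r0 + (r1 - r0) * i / steps)
            c = round(c0 + (c1 - c0) * i / steps)
            grid[r][c] = 1
    return grid
```

Because the trace is scaled to its own bounding box, the same gesture made closer to or farther from the camera should produce a similar image, which may be part of why the classifier tolerated my rough capture process.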
Compared to a standard data processing pipeline, it is amazing how resilient the image pipeline was to my hacking.
I spent most of my time for this week’s assignment playing around with audio.
I was originally inspired by the visualizations for the Sounds Teachable Machine, which show a sliding window extracting images from a streaming spectrogram.
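To make the sliding-window idea concrete, here is a minimal sketch of how overlapping windows could be cut from a stream of spectrogram frames. I do not know how Teachable Machine implements this internally; the function name `sliding_windows` and the `width` and `hop` parameters are my own assumptions, and `hop` is assumed to be no larger than `width`.

```python
from collections import deque


def sliding_windows(frames, width, hop=1):
    """Yield overlapping windows of `width` consecutive frames.

    `frames` is any iterable of spectrogram columns (for example, lists of
    frequency-bin magnitudes). A new window is emitted every `hop` frames
    once the first `width` frames have arrived.
    """
    window = deque(maxlen=width)  # the oldest frame drops off automatically
    since_last = 0
    for frame in frames:
        window.append(frame)
        since_last += 1
        if len(window) == width and since_last >= hop:
            yield list(window)
            since_last = 0
```

Each yielded window can be treated as a small image, with time on one axis and frequency on the other, and handed to an image classifier.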
The two aspects I found interesting were the timbre and timing.
- differentiating timing / volume / timbre: playing piano scales with my left and right hand
- differentiating timbre: clapping my hands in different rooms
- training on a cloud compute instance
- drawing more elaborate traces to allow for more expressive gestures