AI does such a great job of using vision to read and solve complex math problems that surely it could convert sheet music to MIDI.
Well, it can’t, and don’t call me Shirley.
ChatGPT
One interesting thing about ChatGPT is that in each new session I start, it loads a different Python library, seemingly at random, to generate the MIDI file. It also consistently generated an error related to f-strings. I suppose I should incorporate that into my prompt. After all that, it still got the durations wrong and generated only one note per chord.
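For reference, the one-note-per-chord problem comes down to how note events are timed. The following is a minimal sketch, not the code ChatGPT produced. It assumes the usual MIDI convention of delta times measured in ticks; the function name `chord_events`, the tuple layout, and the 480-tick quarter note are my own illustrative choices, not any library's API.

```python
def chord_events(pitches, duration_ticks):
    """Return (delta_ticks, message, pitch) tuples for one chord.

    Every note in the chord starts at the same moment, so each
    note_on has a delta of 0 relative to the previous event.
    """
    events = [(0, "note_on", pitch) for pitch in pitches]
    for index, pitch in enumerate(pitches):
        # Only the first note_off waits for the chord's duration;
        # the others follow immediately so all notes end together.
        delta = duration_ticks if index == 0 else 0
        events.append((delta, "note_off", pitch))
    return events


# C major triad (MIDI pitches 60, 64, 67) held for an assumed 480 ticks.
c_major = chord_events([60, 64, 67], 480)
for event in c_major:
    print(event)
```

A converter that emits only the first note_on and note_off pair per beat would produce exactly the single-note output I kept getting.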
Gemini
Since Google has been touting Gemini’s extraordinary vision, I had high hopes that it would do better. It was an utter failure. It didn’t recognize anything and asked for more time, which was strange. I checked back after about 15 minutes and it asked for more time. I checked back the next day and it asked for more time. It sensed my frustration and gave up.
Claude
Claude performed the best of the three I tested, but still could not get the timing correct. The chords and bass note were correct, but despite numerous tries, it could not interpret the dotted quarter notes or tied notes.
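The arithmetic behind those two cases is small, which makes the failure more surprising. Here is a minimal sketch of it, assuming a resolution of 480 ticks per quarter note; the names `note_ticks` and `tied_ticks` are hypothetical helpers of mine, not part of any MIDI library.

```python
from fractions import Fraction

TICKS_PER_QUARTER = 480  # assumed resolution; real files may differ


def note_ticks(quarters, dots=0):
    """Length in ticks of a note lasting `quarters` quarter notes.

    Each dot adds half of the previously added value, so a dotted
    quarter note is 1 + 1/2 quarter notes long.
    """
    base = Fraction(quarters)
    total = base
    addition = base
    for _ in range(dots):
        addition /= 2
        total += addition
    return int(total * TICKS_PER_QUARTER)


def tied_ticks(*durations):
    """Tied notes sound as one note whose length is the sum of the parts."""
    return sum(durations)


dotted_quarter = note_ticks(1, dots=1)
# A dotted quarter tied to an eighth note (half of a quarter).
tied = tied_ticks(dotted_quarter, note_ticks(Fraction(1, 2)))
print(dotted_quarter, tied)
```

With these assumptions a dotted quarter comes out to 720 ticks, and tying it to an eighth note gives 960 ticks, the length of a half note.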
Next Steps
This seems like a good candidate for training (and more research). Bonus points if you can name this song.
If you’d like to see transcripts of these chat sessions, click here.