OCRing Music from YouTube with Common Lisp

by superdisk on 1/5/2025, 12:29 PM with 18 comments

by notpublic on 1/6/2025, 12:10 PM

Instead of doing a diff, I'm curious whether normalized compression distance (NCD)[1] would yield a better result. It is a very simple algorithm:

to compare two images, i1 and i2

  l1  = length(gzip(i1))
  l2  = length(gzip(i2))
  l12 = length(gzip(concatenate(i1, i2)))

  ncd = (l12 - min(l1, l2))/max(l1, l2)

Here is a nice article where I found out about this long ago.

https://yieldthought.com/post/95722882055/machine-learning-t...

From the article:

"Basically it states that the degree of similarity between two objects can be approximated by the degree to which you can better compress them by concatenating them into one object rather than compressing them individually."

[1] https://en.wikipedia.org/wiki/Normalized_compression_distanc...
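
The steps above can be sketched in Python roughly as follows. This is only an illustration, not the method used in the linked post: the names `compressed_len`, `ncd`, `frame_a`, and `frame_b` are made up for the example, and the random buffers stand in for the raw (decoded) pixel bytes of two video frames.

```python
import gzip
import os


def compressed_len(data: bytes) -> int:
    """Length in bytes of the gzip-compressed input."""
    return len(gzip.compress(data))


def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the inputs share most of their content;
    values near 1 suggest they have little in common.
    """
    la = compressed_len(a)
    lb = compressed_len(b)
    lab = compressed_len(a + b)  # compress the concatenation
    return (lab - min(la, lb)) / max(la, lb)


# Hypothetical stand-ins for two frames: unrelated random byte buffers.
# In practice these would be decoded pixel data, not JPEG/PNG file bytes.
frame_a = os.urandom(4096)
frame_b = os.urandom(4096)

same = ncd(frame_a, frame_a)       # a frame compared with itself
different = ncd(frame_a, frame_b)  # two unrelated frames

print(f"identical: {same:.3f}, unrelated: {different:.3f}")
```

Two caveats if you try this on real frames: image files that are already compressed (PNG, JPEG) barely shrink further, so the comparison likely only makes sense on decoded pixels; and DEFLATE looks back at most 32 KiB, so for larger inputs gzip cannot "see" the first image while compressing the second, and a compressor with a bigger window may be needed.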

by varjag on 1/6/2025, 9:59 AM

If you're also getting a 500:

https://web.archive.org/web/20250106075631/https://nickfa.ro...

by xenonite on 1/6/2025, 12:17 PM

To OCR music scores, see e.g., https://digitalcollection.zhaw.ch/items/276365b9-0a20-4286-a...

by rcarmo on 1/6/2025, 8:58 AM

Holy cow.

by kanwisher on 1/6/2025, 8:03 AM

Honestly, this would be better with an AI model.