Here are some fun examples!
Laughing Kookaburra: <https://search.acousticobservatory.org/search/index.html?q=h...>
Pacific Koel: <https://search.acousticobservatory.org/search/index.html?q=h...>
Chiming Wedgebill: <https://search.acousticobservatory.org/search/index.html?q=h...>
How it works, in a nutshell: We use audio source separation (<https://blog.research.google/2022/01/separating-birdsong-in-...>) to pull apart the A2O data, and then run an embedding model (<https://arxiv.org/abs/2307.06292>) on each channel of the separated audio to produce a 'fingerprint' of the sound. Each fingerprint is stored in a vector database with a link back to the original audio. When someone performs a search, we embed their audio and match it against all of the embeddings in the vector database.
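To make the matching step concrete, here is a minimal sketch in Python, not the actual implementation. It assumes embeddings are plain float vectors compared by cosine similarity with a brute-force scan; the real system's similarity metric and index structure aren't described here, and a production vector database would more likely use approximate nearest-neighbour search. The `IndexedClip` record, its field names, the `search` function, and the 3-dimensional toy vectors are all hypothetical stand-ins for illustration.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class IndexedClip:
    """Hypothetical index entry: one separated channel of one audio window."""
    source: str             # placeholder for the link back to the original audio
    channel: int            # which separated channel the embedding came from
    embedding: list[float]  # the 'fingerprint' produced by the embedding model


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def search(query: list[float], index: list[IndexedClip], top_k: int = 3):
    """Score every indexed clip against the query and return the best top_k."""
    scored = [(cosine_similarity(query, clip.embedding), clip) for clip in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data: real embeddings would come from the embedding model and have
# far more dimensions. These values are made up for the example.
index = [
    IndexedClip(source="recording-a", channel=0, embedding=[0.9, 0.1, 0.0]),
    IndexedClip(source="recording-b", channel=1, embedding=[0.0, 1.0, 0.2]),
    IndexedClip(source="recording-c", channel=0, embedding=[0.1, 0.0, 0.95]),
]

# Stand-in for the embedding of the audio a user searches with.
query_embedding = [1.0, 0.2, 0.0]

results = search(query_embedding, index, top_k=2)
for score, clip in results:
    print(f"{clip.source} (channel {clip.channel}): similarity {score:.3f}")
```

The brute-force scan is only practical for small collections, which is part of why a dedicated vector database is a natural fit once the index grows.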
Right now, about 1% of the A2O data is indexed (the first minute of every recording, evenly sampled across the day). We're looking to get initial feedback and will then continue to iterate and expand coverage.
(Oh, and here's a bit of further reading: <https://blog.google/intl/en-au/company-news/technology/ai-ec...>)