Record Bird Sounds and Make Merlin Better!

By Scott Crabtree

Singing Rufous-winged Sparrow, Dan Weisz

How often have you thought, “I wish the Merlin app was better at identifying the birds I’m hearing. Why haven’t those people at Cornell fixed this?”

The developers at Cornell’s Laboratory of Ornithology use computer vision and machine learning to create the sound identification algorithms used in the Merlin Sound ID app. It’s all based on the sound recordings held in the Macaulay Library, and those are the recordings submitted by birders like you and me!

Cornell needs a minimum of 100 quality recordings of a single species to train that species’ model. They need good spectrograms (a visual representation of the spectrum of frequencies of a signal as it varies with time) because the Merlin app uses them for identification.
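To make the idea of a spectrogram concrete, here is a minimal sketch in plain Python. It is only an illustration of the concept, not Cornell’s or Merlin’s actual code: the function name, frame size, and test tone are all assumptions made up for this example. It chops a signal into short overlapping frames and measures how much energy sits at each frequency in each frame.

```python
import cmath
import math

def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (one per frequency bin) for each time frame."""
    # Hann window: fades each frame in and out to reduce smearing between bins
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        # Plain discrete Fourier transform, keeping only the non-negative frequencies
        for k in range(frame_size // 2 + 1):
            total = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames

# Hypothetical test signal: a steady 4000 Hz whistle sampled at 16000 Hz
SAMPLE_RATE = 16000
tone = [math.sin(2 * math.pi * 4000 * t / SAMPLE_RATE) for t in range(1024)]

spec = spectrogram(tone)
# Each bin is SAMPLE_RATE / frame_size = 250 Hz wide
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / 64
print(peak_hz)  # 4000.0
```

Plotting time across the page, frequency up the page, and magnitude as brightness gives the kind of picture an audio editor shows. A real birdsong would trace a changing pattern rather than the single flat line this test tone produces.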

How do you get a recording of high technical quality? Here are some basic tips:

  • Get close to the bird!
  • Know which microphone your phone is using
  • Point the microphone at the target bird
  • Don’t move or talk while you are recording

Once you’ve got a recording, you should edit it (following Cornell’s recommendations) to increase the chances of it being used by Merlin to refine the model for that species.

I recently recorded a Rufous-winged Sparrow near the Rillito River Park. I used my Android phone, the RecForge II app (an app recommended by Cornell; the Merlin app would work as well), and a Røde VideoMic-Me plug-in microphone, which helps with directionality.

Following Cornell’s tutorial on editing sound recordings, I used the freeware program OcenAudio. Once the program was set up and the recording was opened, I got this first look:


 
Here I can see the beginnings of a high technical quality recording with strong amplitude/loudness in the waveform (top) trace, and a good frequency capture in the spectrogram (bottom) trace. (Of note, I was about 10 feet from the bird.)

The simple editing steps Cornell asks of you are trimming and normalizing. Trimming is just removing the irrelevant ends of the recording, deleting handling noise and footsteps in order to isolate the bird’s sounds. Normalizing is amplifying the resulting selection to a standardized level of -3.0 dB. It’s all covered in the tutorial, and normalization only needs to be set up once.
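For the curious, trimming and normalizing boil down to very little arithmetic. The sketch below is an illustration only, not what OcenAudio does internally. It assumes samples are numbers between -1.0 and 1.0, and that the -3.0 dB target refers to the loudest peak relative to full scale. The function names and the quiet test tone are hypothetical.

```python
import math

def trim(samples, sample_rate, start_s, end_s):
    """Keep only the part of the recording between start_s and end_s (seconds)."""
    return samples[int(start_s * sample_rate):int(end_s * sample_rate)]

def normalize_peak(samples, target_db=-3.0):
    """Scale the samples so the loudest one sits at target_db relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to amplify
    target_amplitude = 10 ** (target_db / 20)  # -3.0 dB is about 0.708 of full scale
    gain = target_amplitude / peak
    return [s * gain for s in samples]

# Hypothetical quiet recording: two seconds of a 1000 Hz tone at 10% of full scale
SAMPLE_RATE = 8000
recording = [0.1 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
             for t in range(2 * SAMPLE_RATE)]

clip = trim(recording, SAMPLE_RATE, 0.5, 1.5)   # drop the first and last half second
loud = normalize_peak(clip)                     # bring the peak up to -3.0 dB
peak_db = 20 * math.log10(max(abs(s) for s in loud))
print(round(peak_db, 2))  # -3.0
```

Note that every sample is multiplied by the same gain, so normalizing makes the bird louder but makes the background noise louder by the same amount. That is one more reason to get close to the bird in the first place.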

If needed, a High Pass filter that removes sound below 250 Hz can be used to reduce road and wind noise: Effects ➜ Filter ➜ High Pass Filter.
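Here is a rough sketch of what a high-pass filter does, using the simplest possible design (a first-order filter). OcenAudio’s own filter is almost certainly a different and steeper design, so treat this as an illustration of the idea only. The function names and both test tones are assumptions invented for the example.

```python
import math

def high_pass(samples, sample_rate, cutoff_hz=250.0):
    """Simple first-order high-pass: lets high frequencies through, reduces low ones."""
    if not samples:
        return []
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Each output follows the change in the input, so slow drifts fade away
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out

def settled_peak(samples):
    """Loudest sample in the second half, after the filter has settled."""
    return max(abs(s) for s in samples[len(samples) // 2:])

# Hypothetical test tones, one second each: low road rumble and a high bird song
SAMPLE_RATE = 44100
rumble = [math.sin(2 * math.pi * 50 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
song = [math.sin(2 * math.pi * 4000 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]

rumble_peak = settled_peak(high_pass(rumble, SAMPLE_RATE))
song_peak = settled_peak(high_pass(song, SAMPLE_RATE))
print(rumble_peak < 0.25, song_peak > 0.9)  # True True
```

With these settings the 50 Hz rumble comes out at roughly a fifth of its original level while the 4000 Hz tone passes almost untouched. Most songbird sounds sit well above 250 Hz, so the filter targets the noise and spares the song.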


 
My final result is a quality recording useful for refining the Merlin model for the Rufous-winged Sparrow. It’s a product of good tradecraft and simple editing.

Once I’ve exported the recording (use the File menu) to a folder on my computer, it’s ready to upload to my eBird checklist using the Add Media button on the web version of eBird. This puts it in the Macaulay Library automatically. The final recording can be heard here.

How does your recording end up helping the Merlin app get better? There isn’t some wizard who magically changes those recordings into identification algorithms. First, those recordings have to be annotated by volunteers using Cornell’s MerlinVision where the desired bird sounds are isolated so that the sample can go into the Lab’s machine learning system. These are extracts from the guidance for MerlinVision volunteers:
 
 



Annotating recordings with MerlinVision is a lot of work (believe me!), and the better the quality of the recordings that go into the Macaulay Library via eBird, the easier it is for model training.

Cornell’s Laboratory of Ornithology depends on eBirders, like you and me, to provide the kinds of recordings they need for developing and refining Merlin. Go out and make some recordings, and you’ll be helping to make Merlin better!


Scott Crabtree has been a birder for over 50 years, and still hears birds pretty well! A resident of Tucson since 2018, Scott frequently leads birding trips for Tucson Audubon. 


