Custom HRTF without Measuring
After getting some bone conduction headphones, I noticed that things like the old virtual barbershop demo worked eerily well, which got me thinking again about getting my HRTF measured. After some searching, it seems there are still no good DIY methods for doing so at home. One company, earfish, does offer at-home HRTF measurement, but you have to apply to get access to the equipment. I also saw some open source options built around Vive trackers (they explicitly say not the headset, only the trackers), which gets expensive when the in-ear mics alone run over $100. Since I have multiple VR headsets, I considered writing an app that would show a set of arrows in VR to look at, record the head position and orientation along with the sound itself, and afterwards process all of that into an HRTF file. The problem: how do you even compute the HRTF from that information in the first place? I know it's possible, since that's essentially what an official measurement is, just with a moving sound source instead of a moving receiver. (I also realize that room acoustics could cause issues, but that matters less to me if I can't generate anything in the first place.)
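For reference, here's a minimal sketch of what each recorded sample in such an app might look like. The field names and types are my own assumptions, not from any existing tool:

```python
from dataclasses import dataclass

@dataclass
class CaptureSample:
    """One in-ear recording taken while looking at a target arrow in VR.
    All field names here are illustrative placeholders."""
    timestamp: float                                      # seconds since session start
    head_position: tuple[float, float, float]             # headset position (meters)
    head_orientation: tuple[float, float, float, float]   # quaternion (x, y, z, w)
    left_ear_audio: bytes                                 # raw left in-ear mic capture
    right_ear_audio: bytes                                # raw right in-ear mic capture
```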
The idea I landed on after all of this: what if, instead of trying to compute an HRTF from a recorded sound, I wear the headphones (or earbuds, or bone conduction headphones) and try to generate a new HRTF from already existing ones? I wrote a few test programs and got some halfway decent results. I used an Xbox controller: the goal was to push the left analog stick in the direction of the sound and use the right stick for its height. The program kept playing the sound through different HRTFs and asking the user where they heard it from, and I also allowed them to answer "I don't know."
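A rough sketch of that trial loop, with the controller input and audio playback abstracted behind placeholder functions. The function names, the example file names, and the direction ranges are all assumptions on my part, not the exact code I used:

```python
import random

# Placeholder hooks -- the real versions would wrap whatever audio renderer and
# gamepad library you actually use.
def play_sound_with_hrtf(hrtf_file, azimuth_deg, elevation_deg):
    """Render a short test sound at (azimuth, elevation) through the given HRTF."""
    raise NotImplementedError

def read_response():
    """Read the analog sticks: return (azimuth_deg, elevation_deg), or None for 'I don't know'."""
    raise NotImplementedError

HRTF_FILES = ["subject_a.sofa", "subject_b.sofa"]  # example file names

def run_trials(n_trials=50):
    """Play sounds through randomly chosen HRTFs at random directions and log the answers."""
    results = []
    for _ in range(n_trials):
        hrtf = random.choice(HRTF_FILES)
        true_az = random.uniform(-180.0, 180.0)
        true_el = random.uniform(-40.0, 90.0)
        play_sound_with_hrtf(hrtf, true_az, true_el)
        heard = read_response()  # None means "I don't know"
        results.append({"hrtf": hrtf, "true": (true_az, true_el), "heard": heard})
    return results
```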
From there things get a little iffy. I personally just used an accuracy cutoff based on great-circle distance. The idea from there is that, given the direction of the sound, we could either interpolate from the closest three points or, if a point is close enough, just use it directly. I also considered taking the "wrong" directions users entered earlier and mixing them into the average: even though the source was technically in a different direction, if the user heard it in that direction, that response should still be usable for that direction.
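Here's a hedged sketch of the scoring and blending step as I think about it. The cutoff and snap thresholds are arbitrary example numbers, and `results` is the list produced by the trial loop above:

```python
import math

def angular_distance_deg(a, b):
    """Great-circle (angular) distance in degrees between two (azimuth, elevation) pairs."""
    az1, el1 = map(math.radians, a)
    az2, el2 = map(math.radians, b)
    # Convert both directions to unit vectors and take the angle between them.
    v1 = (math.cos(el1) * math.cos(az1), math.cos(el1) * math.sin(az1), math.sin(el1))
    v2 = (math.cos(el2) * math.cos(az2), math.cos(el2) * math.sin(az2), math.sin(el2))
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(v1, v2))))
    return math.degrees(math.acos(dot))

def score_hrtfs(results, cutoff_deg=30.0):
    """Fraction of answered trials per HRTF that landed within the cutoff.
    The 30-degree cutoff is an example value, not a recommendation."""
    hits = {}
    for r in results:
        if r["heard"] is None:  # skip "I don't know" answers
            continue
        ok = angular_distance_deg(r["true"], r["heard"]) <= cutoff_deg
        hits.setdefault(r["hrtf"], []).append(ok)
    return {h: sum(v) / len(v) for h, v in hits.items()}

def blend_weights(target, measured_dirs, snap_deg=5.0):
    """Either snap to a single measured direction (if within snap_deg) or return
    inverse-distance weights over the three nearest points. snap_deg is an example value."""
    ranked = sorted((angular_distance_deg(target, d), i) for i, d in enumerate(measured_dirs))
    if ranked[0][0] <= snap_deg:
        return {ranked[0][1]: 1.0}
    inv = [(1.0 / max(dist, 1e-6), i) for dist, i in ranked[:3]]
    total = sum(w for w, _ in inv)
    return {i: w / total for w, i in inv}
```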
The most I really got out of this, though, was finding the most accurate existing HRTF for me. I stopped short of generating a new HRTF or mixing them live. I was also concerned that distance information is missing: some, if not all, HRTF files can hold it, but I had no way of inputting it, and on most pre-existing HRTFs I couldn't really perceive a distance, so all of that information was lost. On top of that, I don't know if it's a universal issue or if my head is just the perfectly weird shape for this, but with most existing HRTFs a sound placed in front of me sounds like it's behind my head; there were very few sounds I could explicitly identify as being in front of me.

The last issue was visualization. After collecting all of the information I really wanted to see where the true direction was versus the direction I had entered, but after rendering a sphere with all of the points, everything directly to the right or left was so crowded that it was hard to see any clear patterns I could work from. I think during testing the program needs to find a number of areas where information needs to be filled in and test those specifically, but at the same time it needs to vary enough that the user can't just figure out the pattern after three samples or so. The other option is mirroring: while not applicable to all users, most users (I would assume) have a mostly symmetrical head, so mirroring a source, as sketched below, might be a good way to extend the data set.
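A minimal sketch of the mirroring idea, assuming azimuth is a signed left/right angle around the median plane and each measurement carries a left- and right-ear impulse response:

```python
def mirror_measurement(azimuth_deg, elevation_deg, hrir_left, hrir_right):
    """Reflect one measured response across the median (left/right) plane.
    Assumes a roughly symmetrical head: a source at +30 degrees azimuth heard through
    the left/right ears should behave like a source at -30 degrees heard through the
    right/left ears. Elevation stays the same; the ear channels swap with the sign."""
    return -azimuth_deg, elevation_deg, hrir_right, hrir_left
```

The same swap could be applied to the user's responses, so every answered trial would contribute two data points instead of one, as long as the symmetry assumption roughly holds.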