On Fri, Feb 17, 2012 at 5:11 PM, Nowlan, Sean <Sean.Nowlan@gtri...> wrote:
I built Tom's safe_align branch on E100 and ran volk_profile. It segfaulted on "RUN_VOLK_TESTS:volk_32fc_s32fc_multiply_32fc_a". I'll get a stack trace for you.
Really interesting that it's the same block. Hopefully, it's a single, simple fix. I'll look into it when you can get me the stack trace.
We are interested in determining the best architecture at instantiation
time. What would be the best strategy? We thought about running the
same operations several times for each architecture, measuring the
results, and using the fastest one for the processing blocks. Would this
be the right approach?
Run volk_profile. It does exactly what you said, and writes the results to ~/.volk/volk_config. Volk reads this file when it is involked (sorry) to determine which particular function to execute. So all you do is run volk_profile once on any given machine, and it's optimized.
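
To make the measure-and-pick idea concrete, here is a rough sketch in Python rather than Volk's own C. It is not how volk_profile is implemented; the function names (multiply_generic, multiply_comprehension, pick_fastest) are invented for illustration and are not part of Volk. It just times each candidate implementation of the same operation a few times and keeps the name of the fastest one.

```python
import timeit


def multiply_generic(a, b):
    # Hypothetical "generic" implementation: explicit loop.
    out = []
    for x, y in zip(a, b):
        out.append(x * y)
    return out


def multiply_comprehension(a, b):
    # Hypothetical alternative implementation of the same operation.
    return [x * y for x, y in zip(a, b)]


def pick_fastest(candidates, args, repeats=5, number=20):
    """Time every candidate on the same input and return the fastest name.

    Each candidate is run `number` times per measurement, and the best of
    `repeats` measurements is kept to reduce noise from other processes.
    """
    timings = {}
    for name, func in candidates.items():
        runs = timeit.repeat(lambda: func(*args), repeat=repeats, number=number)
        timings[name] = min(runs)
    best = min(timings, key=timings.get)
    return best, timings


candidates = {
    "generic": multiply_generic,
    "comprehension": multiply_comprehension,
}
data = [complex(i, -i) for i in range(1024)]
best, timings = pick_fastest(candidates, (data, data))
print("fastest implementation on this machine:", best)
```

The winner depends on the machine, which is why the measurement is done once per machine and the result saved, rather than hard-coded.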