<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Kev &amp; Piers on hypermusic</title>
	<atom:link href="http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/feed/" rel="self" type="application/rss+xml" />
	<link>http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/</link>
	<description>internet music technology since ~2002</description>
	<lastBuildDate>Sat, 04 Sep 2010 20:04:47 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<item>
		<title>By: Kevin Prichard</title>
		<link>http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/comment-page-1/#comment-5433</link>
		<dc:creator>Kevin Prichard</dc:creator>
		<pubDate>Wed, 31 Mar 2010 18:15:49 +0000</pubDate>
		<guid isPermaLink="false">http://gonze.com/blog/?p=2362#comment-5433</guid>
		<description>Took me a while to get to it, but I&#039;ve written a C app that decodes an MP3 file into a stream of CD audio-type samples, and delivers that to an FFT preprocessor.  Working on the FFT in and out part now, will post back when I finish that bit.</description>
		<content:encoded><![CDATA[<p>Took me a while to get to it, but I&#8217;ve written a C app that decodes an MP3 file into a stream of CD audio-type samples, and delivers that to an FFT preprocessor.  Working on the FFT in and out part now, will post back when I finish that bit.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin Prichard</title>
		<link>http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/comment-page-1/#comment-5375</link>
		<dc:creator>Kevin Prichard</dc:creator>
		<pubDate>Thu, 11 Mar 2010 22:57:16 +0000</pubDate>
		<guid isPermaLink="false">http://gonze.com/blog/?p=2362#comment-5375</guid>
		<description>Good info, Luc.  I keep hitting patent pages while searching about this area, it&#039;s a bit of a minefield.  I don&#039;t suppose making it a FOSS offering would change matters, hm?

I switched to fftw - it&#039;s a tiny library compared with openframeworks, with super-fast execution, plus it handles n-dimensional transforms.

The idea of jumping up an abstraction level to the methods used in facial recognition - intriguing.  I&#039;ll hafta look into that.

&quot;...the identifier for a time segment is the difference between the highest and lowest amplitude.&quot;

Another interesting idea, I&#039;ll add it to my test case list.  First, I gotta get my code working under fftw.

Another cool thing about fftw: it plans each transform at runtime.  Building the plan takes hundreds of milliseconds up front (depending on CPU), and the actual FFTs on live samples then take microseconds.

I&#039;m gonna keep chipping away at this til I got some answers... just curious.</description>
		<content:encoded><![CDATA[<p>Good info, Luc.  I keep hitting patent pages while searching about this area, it&#8217;s a bit of a minefield.  I don&#8217;t suppose making it a FOSS offering would change matters, hm?</p>
<p>I switched to fftw &#8211; it&#8217;s a tiny library compared with openframeworks, with super-fast execution, plus it handles n-dimensional transforms.</p>
<p>The idea of jumping up an abstraction level to the methods used in facial recognition &#8211; intriguing.  I&#8217;ll hafta look into that.</p>
<p>&#8220;&#8230;the identifier for a time segment is the difference between the highest and lowest amplitude.&#8221;</p>
<p>Another interesting idea, I&#8217;ll add it to my test case list.  First, I gotta get my code working under fftw.</p>
<p>Another cool thing about fftw: it plans each transform at runtime.  Building the plan takes hundreds of milliseconds up front (depending on CPU), and the actual FFTs on live samples then take microseconds.</p>
<p>I&#8217;m gonna keep chipping away at this til I got some answers&#8230; just curious.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lucas Gonze</title>
		<link>http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/comment-page-1/#comment-5374</link>
		<dc:creator>Lucas Gonze</dc:creator>
		<pubDate>Thu, 11 Mar 2010 17:51:53 +0000</pubDate>
		<guid isPermaLink="false">http://gonze.com/blog/?p=2362#comment-5374</guid>
		<description>p.s. caution: this area of work is heavily patented.</description>
		<content:encoded><![CDATA[<p>p.s. caution: this area of work is heavily patented.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lucas Gonze</title>
		<link>http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/comment-page-1/#comment-5373</link>
		<dc:creator>Lucas Gonze</dc:creator>
		<pubDate>Thu, 11 Mar 2010 17:51:12 +0000</pubDate>
		<guid isPermaLink="false">http://gonze.com/blog/?p=2362#comment-5373</guid>
		<description>Sorry to be so slow to respond, Kev.  

I love the idea of identifying locations in a song via a frequency table mapped to some window.  

Though, I think that the endpoint of this line of dev is to search for an arbitrary acoustic fingerprint within a song and treat the location where it&#039;s found as the timecode.  

The concept would be: here&#039;s an acoustic fingerprint for a five second snippet, including patterns in frequency and amplitude and anything else you can find.  

One way to identify frequency and amplitude patterns is to treat the acoustic data as if it were a face and use face recognition algorithms, e.g. principal component analysis.

A simpler method is to measure variance in some dimension over a fixed time window.  So say that the identifier for a time segment is the difference between the highest and lowest amplitude.

It&#039;s fun hacking.  Makes me wish, yet again, that I did comp sci grad school.</description>
		<content:encoded><![CDATA[<p>Sorry to be so slow to respond, Kev.  </p>
<p>I love the idea of identifying locations in a song via a frequency table mapped to some window.  </p>
<p>Though, I think that the endpoint of this line of dev is to search for an arbitrary acoustic fingerprint within a song and treat the location where it&#8217;s found as the timecode.  </p>
<p>The concept would be: here&#8217;s an acoustic fingerprint for a five second snippet, including patterns in frequency and amplitude and anything else you can find.  </p>
<p>One way to identify frequency and amplitude patterns is to treat the acoustic data as if it were a face and use face recognition algorithms, e.g. principal component analysis.</p>
<p>A simpler method is to measure variance in some dimension over a fixed time window.  So say that the identifier for a time segment is the difference between the highest and lowest amplitude.</p>
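<p>A minimal sketch of that simpler method, in Python. The window size, the sample values, and the function names (window_ranges, find_snippet) are made up for illustration. It assumes the snippet starts on a window boundary, and exact matching like this would not survive re-encoding, so a real matcher would need some tolerance.</p>
<pre>
```python
def window_ranges(samples, window):
    """Return one identifier per full window: max amplitude minus min amplitude."""
    ids = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        ids.append(max(chunk) - min(chunk))
    return ids


def find_snippet(song_ids, snippet_ids):
    """Return the index of the first window where snippet_ids occurs in song_ids, or -1."""
    n = len(snippet_ids)
    for i in range(len(song_ids) - n + 1):
        if song_ids[i:i + n] == snippet_ids:
            return i
    return -1


# Hypothetical sample data: 16 samples, cut into windows of 4.
song = [0, 3, -2, 1, 5, -5, 2, 0, 1, 1, 0, -1, 9, -1, 4, 2]
snippet = song[4:12]  # covers windows 1 and 2 of the song

song_ids = window_ranges(song, 4)
snippet_ids = window_ranges(snippet, 4)
print(song_ids, snippet_ids, find_snippet(song_ids, snippet_ids))
```
</pre>
<p>The window index that comes back, times the window length, divided by the sample rate, would be the timecode in seconds.</p>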
<p>It&#8217;s fun hacking.  Makes me wish, yet again, that I did comp sci grad school.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin Prichard</title>
		<link>http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/comment-page-1/#comment-5356</link>
		<dc:creator>Kevin Prichard</dc:creator>
		<pubDate>Thu, 04 Mar 2010 20:26:45 +0000</pubDate>
		<guid isPermaLink="false">http://gonze.com/blog/?p=2362#comment-5356</guid>
		<description>Update: I&#039;ve built a rough little C++ prog that generates the frequency sets using the OF FFT method.  More work to go before I&#039;ll know if there&#039;s any unique aspect to this data.

Maybe this is an all-around silly idea that those with more digital audio experience would get right away, but I do enjoy experimenting.</description>
		<content:encoded><![CDATA[<p>Update: I&#8217;ve built a rough little C++ prog that generates the frequency sets using the OF FFT method.  More work to go before I&#8217;ll know if there&#8217;s any unique aspect to this data.</p>
<p>Maybe this is an all-around silly idea that those with more digital audio experience would get right away, but I do enjoy experimenting.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin Prichard</title>
		<link>http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/comment-page-1/#comment-5327</link>
		<dc:creator>Kevin Prichard</dc:creator>
		<pubDate>Thu, 25 Feb 2010 21:21:10 +0000</pubDate>
		<guid isPermaLink="false">http://gonze.com/blog/?p=2362#comment-5327</guid>
		<description>A friend pointed me to openframeworks and says it has FFT capability.  There&#039;s some discussion of FFT usage here:

http://www.openframeworks.cc/forum/viewtopic.php?f=10&amp;t=2184&amp;view=unread

Quite possibly worth a quick hack, just to know if there&#039;s anything to this!</description>
		<content:encoded><![CDATA[<p>A friend pointed me to openframeworks and says it has FFT capability.  There&#8217;s some discussion of FFT usage here:</p>
<p><a href="http://www.openframeworks.cc/forum/viewtopic.php?f=10&amp;t=2184&amp;view=unread" rel="nofollow">http://www.openframeworks.cc/forum/viewtopic.php?f=10&amp;t=2184&amp;view=unread</a></p>
<p>Quite possibly worth a quick hack, just to know if there&#8217;s anything to this!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin Prichard</title>
		<link>http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/comment-page-1/#comment-5326</link>
		<dc:creator>Kevin Prichard</dc:creator>
		<pubDate>Thu, 25 Feb 2010 20:00:15 +0000</pubDate>
		<guid isPermaLink="false">http://gonze.com/blog/?p=2362#comment-5326</guid>
		<description>I just had a look at http://en.wikipedia.org/wiki/Acoustic_fingerprint

It seems that existing fingerprinting methods split between identifying an audio source as a whole, and just an audio fragment.  Gonna look deeper into this...</description>
		<content:encoded><![CDATA[<p>I just had a look at <a href="http://en.wikipedia.org/wiki/Acoustic_fingerprint" rel="nofollow">http://en.wikipedia.org/wiki/Acoustic_fingerprint</a></p>
<p>It seems that existing fingerprinting methods split between identifying an audio source as a whole, and just an audio fragment.  Gonna look deeper into this&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin Prichard</title>
		<link>http://gonze.com/blog/2010/02/18/kev-piers-on-hypermusic/comment-page-1/#comment-5325</link>
		<dc:creator>Kevin Prichard</dc:creator>
		<pubDate>Thu, 25 Feb 2010 17:44:48 +0000</pubDate>
		<guid isPermaLink="false">http://gonze.com/blog/?p=2362#comment-5325</guid>
		<description>Simplicity and pragmatism tend to prevail: the least end-user effort wins.

To recollect some of this, the binding of lyrics and other metadata to timed points in a song is complicated by encoding and editing changes to a track.  There is no one-dimensional identifier that isn&#039;t subject to morphing when a track is ripped, re-encoded or otherwise messed about with.

Discarding timecode as a bind-point, the next thing that comes to mind is the fingerprinting of concurrent frequencies within a frame or time period.  Frequency analysis is something that players already do, for equalization and visualization and other things.

Say there was a way to abstract the frequencies in a given frame or snippet of audio, such that we could reduce a 0.1 second sound slice to a few numbers:

[881.2hz, 220hz, 4150hz, 338hz]

Adding in the amplitude of each frequency as a percent of the source medium&#039;s dynamic range:

[881.2hz@25%, 220hz@40%, 4150hz@50%, 338hz@76%]

Would that be enough information to provide a unique location fingerprint for a point in a song?  Only experimentation could tell just how unique these characterizations are.  I know of libraries and applications which provide frequency analysis methods (e.g. Squeak.)

Say that frequency spans 0Hz-16KHz, a range that fits nicely into 14 bits.  Amplitude, expressed in decibels or % of max, another 6 bits.  So, 20 bits, or two and a half bytes, to describe a given frequency sample.

With a pipeline of frequency@amplitude sets, a plugin would need lyrics to be indexed by unique set, so it would make sense to sort the freq@amp values in each set coming down the pipe:

[220hz, 338hz, 881hz, 4150hz]
[105hz, 240hz, 881hz, 2300hz, 4150hz]
[95hz, 262hz, 881hz, 2300hz, 4150hz]
...

The first thing to do is experiment, reduce test tracks to streams of freq@amp sets, and evaluate whether uniqueness exists.

Just some ideas... Hey, I know a developer who toys with digital audio for a living, he introduced me to Squeak.  Gonna have a word with that boy and see what he knows.</description>
		<content:encoded><![CDATA[<p>Simplicity and pragmatism tend to prevail: the least end-user effort wins.</p>
<p>To recollect some of this, the binding of lyrics and other metadata to timed points in a song is complicated by encoding and editing changes to a track.  There is no one-dimensional identifier that isn&#8217;t subject to morphing when a track is ripped, re-encoded or otherwise messed about with.</p>
<p>Discarding timecode as a bind-point, the next thing that comes to mind is the fingerprinting of concurrent frequencies within a frame or time period.  Frequency analysis is something that players already do, for equalization and visualization and other things.</p>
<p>Say there was a way to abstract the frequencies in a given frame or snippet of audio, such that we could reduce a 0.1 second sound slice to a few numbers:</p>
<p>[881.2hz, 220hz, 4150hz, 338hz]</p>
<p>Adding in the amplitude of each frequency as a percent of the source medium&#8217;s dynamic range:</p>
<p>[881.2hz@25%, 220hz@40%, 4150hz@50%, 338hz@76%]</p>
<p>Would that be enough information to provide a unique location fingerprint for a point in a song?  Only experimentation could tell just how unique these characterizations are.  I know of libraries and applications which provide frequency analysis methods (e.g. Squeak.)</p>
<p>Say that frequency spans 0Hz-16KHz, a range that fits nicely into 14 bits.  Amplitude, expressed in decibels or % of max, another 6 bits.  So, 20 bits, or two and a half bytes, to describe a given frequency sample.</p>
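<p>A rough sketch of that bit budget, in Python. The names pack and unpack are hypothetical, and two things are assumptions on top of the text: frequency is rounded to whole Hz, and amplitude percent is quantized to 64 levels, since 0 to 100 does not fit in 6 bits as-is. One frequency and amplitude pair comes to 20 bits, which fits inside three bytes.</p>
<pre>
```python
FREQ_BITS = 14   # 0..16383 Hz, enough for the 0Hz-16KHz range
AMP_BITS = 6     # 64 amplitude levels (assumed quantization of 0-100%)
AMP_LEVELS = 2 ** AMP_BITS


def pack(freq_hz, amp_percent):
    """Pack a frequency and an amplitude percent into one 20-bit integer."""
    freq = int(round(freq_hz))
    if freq not in range(2 ** FREQ_BITS):
        raise ValueError("frequency does not fit in 14 bits")
    amp = min(max(amp_percent, 0), 100)            # clamp to 0..100
    level = round(amp / 100 * (AMP_LEVELS - 1))    # quantize to 0..63
    return freq * AMP_LEVELS + level               # frequency in the high bits


def unpack(packed):
    """Return (frequency in Hz, approximate amplitude percent)."""
    freq, level = divmod(packed, AMP_LEVELS)
    return freq, level / (AMP_LEVELS - 1) * 100


packed = pack(881.2, 25)
print(packed, packed.to_bytes(3, "big").hex(), unpack(packed))
```
</pre>
<p>The round trip is lossy: 881.2hz comes back as 881hz, and 25% comes back as roughly 25.4%.</p>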
<p>With a pipeline of frequency@amplitude sets, a plugin would need lyrics to be indexed by unique set, so it would make sense to sort the freq@amp values in each set coming down the pipe:</p>
<p>[220hz, 338hz, 881hz, 4150hz]<br />
[105hz, 240hz, 881hz, 2300hz, 4150hz]<br />
[95hz, 262hz, 881hz, 2300hz, 4150hz]<br />
&#8230;</p>
<p>The first thing to do is experiment, reduce test tracks to streams of freq@amp sets, and evaluate whether uniqueness exists.</p>
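<p>A first experiment along these lines could look like the Python sketch below. It is only an illustration under assumptions: the sample rate, the test tones, and the function names (dft_magnitudes, freq_amp_set) are invented, and the slow naive DFT stands in for a real FFT library. It reduces one 0.1 second slice to its strongest frequencies, with amplitude as a percent of full scale, sorted by frequency.</p>
<pre>
```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive DFT: return the amplitude of each bin from 0 up to (not including) Nyquist."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(2 * abs(acc) / n)  # scale so a full-scale sine reads 1.0
    return mags


def freq_amp_set(samples, sample_rate, count):
    """Return the `count` strongest bins as (Hz, percent of full scale), sorted by Hz."""
    mags = dft_magnitudes(samples)
    bin_hz = sample_rate / len(samples)
    strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:count]
    return [(round(k * bin_hz), round(100 * mags[k])) for k in sorted(strongest)]


RATE = 8000   # samples per second (assumed)
N = 800       # a 0.1 second slice at that rate
# Synthetic test slice: a 220 Hz tone at 40% plus an 880 Hz tone at 25%.
slice_ = [0.4 * math.sin(2 * math.pi * 220 * t / RATE)
          + 0.25 * math.sin(2 * math.pi * 880 * t / RATE) for t in range(N)]
print(freq_amp_set(slice_, RATE, 2))
```
</pre>
<p>The test tones here land exactly on 10 Hz bin boundaries, so the peaks come out clean. Real music would smear energy across neighboring bins, so a windowing function and some peak picking would probably be needed before the sets are stable enough to compare.</p>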
<p>Just some ideas&#8230; Hey, I know a developer who toys with digital audio for a living, he introduced me to Squeak.  Gonna have a word with that boy and see what he knows.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
