Partitioning of prosodic features for audio similarity comparison
Several methods of partitioning a feature space for comparing audio samples by their prosodic features are examined. Specific prosodic features are chosen for use in an online system that lets users submit audio clips and receive matches. Before audio is input to the system, it requires processing that comprises multiple steps. Existing methodologies based on classifier systems that require training are discussed and deemed unsuitable for this application. The work focuses on partitioning the extracted features into representative points or regions in the search space, using two approaches: k-means clustering, examined with several validity measures, and vector quantization using a scalar quantizer. Experimental results show that clustering is ill-suited to this use and that finding a good k is unlikely. A scalar quantizer is implemented based on its ability to effectively quantize the space without changing how the space is discretized. It is also concluded that a method of trimming the input data to reduce the quantizer's codebook size is not inherently better, yielding more representative points compared to using all the input data.
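As a rough illustration of what scalar quantization of a feature space can look like, the sketch below quantizes each feature independently into uniform bins and treats the bin midpoints as representative points. It is not the thesis's implementation: the uniform bin layout, the two example features (pitch and intensity), their ranges, the number of levels, and the function names `quantize` and `reconstruct` are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def quantize(vector: Sequence[float], lows: Sequence[float],
             highs: Sequence[float], levels: int) -> Tuple[int, ...]:
    """Map each feature to the index of its uniform bin (one scalar quantizer per feature)."""
    cell = []
    for x, lo, hi in zip(vector, lows, highs):
        step = (hi - lo) / levels            # width of one bin for this feature
        idx = int((x - lo) // step)          # bin the value falls into
        cell.append(min(max(idx, 0), levels - 1))  # clamp out-of-range values
    return tuple(cell)


def reconstruct(cell: Sequence[int], lows: Sequence[float],
                highs: Sequence[float], levels: int) -> List[float]:
    """Return the representative point of a cell: the midpoint of each bin."""
    return [lo + (i + 0.5) * (hi - lo) / levels
            for i, lo, hi in zip(cell, lows, highs)]


# Hypothetical feature ranges: pitch in Hz, intensity in dB (assumed values).
LOWS = [50.0, 0.0]
HIGHS = [450.0, 100.0]
LEVELS = 4

cell = quantize([180.0, 62.0], LOWS, HIGHS, LEVELS)
point = reconstruct(cell, LOWS, HIGHS, LEVELS)
```

Under these assumptions, two clips whose feature vectors fall into the same cell would be candidate matches, and the bin boundaries are fixed in advance rather than depending on the data, unlike cluster centers found by k-means.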
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Geimer, Matthew Steven
- Thesis Advisors: Owen, Charles
- Committee Members: Dyksen, Wayne; Rehberger, Dean; Pramanik, Sakti
- Date Published: 2010
- Subjects: Cluster analysis--Data processing; Prosodic analysis (Linguistics); Data compression (Telecommunication); Vector processing (Computer science)
- Program of Study: Computer Science
- Degree Level: Masters
- Language: English
- Pages: 34 pages
- ISBN: 9781124382876; 1124382879
- Permalink: https://doi.org/doi:10.25335/65yc-ww52