Music & Image Retrieval

December 2024

Music & Image Retrieval System

PCA-based Image Search and Query-by-Humming Music Recognition

Go, Python, React, FastAPI, PCA, SVD, Linear Algebra, MIDI Processing

An information retrieval system that supports both image search and music search. It uses Principal Component Analysis (PCA) for image similarity matching in a reduced-dimensional space, and Query-by-Humming techniques to identify music from MIDI files.

Developed as part of the Linear Algebra and Geometry course at Institut Teknologi Bandung, this project demonstrates practical applications of linear algebra concepts including SVD, eigenvalue decomposition, and vector similarity measures in multimedia retrieval systems.

Key Features

  • Implemented PCA-based image retrieval using Singular Value Decomposition (SVD) for dimensionality reduction
  • Developed Query-by-Humming system with MIDI processing and windowing techniques for music identification
  • Built feature extraction pipeline with Absolute, Relative, and First Tone Based (ATB, RTB, FTB) histograms
  • Implemented cosine similarity and Euclidean distance measures for ranking matches
  • Created full-stack web application with Go backend, Python FastAPI, and React frontend

Technical Implementation

Image Processing

Grayscale conversion, normalization, and PCA projection for similarity matching

Audio Processing

MIDI parsing, pitch normalization, and feature vector extraction

Linear Algebra

SVD, eigenvalue analysis, and principal component computation

Similarity Measures

Cosine similarity and Euclidean distance for content matching

Backend Architecture

Go with Gin framework and Python FastAPI for microservices

Frontend

React with modern UI for file upload and search visualization
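As a rough illustration of the two similarity measures listed above, here is a minimal pure-Python sketch. The function names and the zero-vector convention are assumptions made for this example, not taken from the project code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention: a zero vector matches nothing
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

u = [1, 2, 3]
v = [2, 4, 6]  # same direction as u, twice the length
print(cosine_similarity(u, v))   # approximately 1.0: direction is identical
print(euclidean_distance(u, v))  # greater than 0: magnitudes differ
```

Cosine similarity ignores vector length, which suits normalized histograms, while Euclidean distance is sensitive to magnitude, which suits coordinates in a projected space.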

PCA Image Retrieval Algorithm

  1. Preprocessing: Convert images to grayscale and normalize to consistent dimensions
  2. Vectorization: Flatten image matrices into 1D vectors for mathematical processing
  3. Standardization: Center data around zero mean for each pixel position
  4. SVD Computation: Apply Singular Value Decomposition to extract principal components
  5. Projection: Transform images into lower-dimensional PCA space
  6. Similarity: Calculate Euclidean distances in PCA space for ranking
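The steps above can be sketched with NumPy roughly as follows. This is a minimal illustration, not the project's implementation: it assumes the images are already grayscale and resized to a common shape, and the names `fit_pca` and `query` and the parameter `k` are hypothetical.

```python
import numpy as np

def fit_pca(images, k):
    """Fit PCA on a stack of grayscale images of shape (n, height, width).

    Returns the per-pixel mean, the top-k principal directions,
    and the dataset projected into the k-dimensional PCA space.
    """
    n = images.shape[0]
    X = images.reshape(n, -1).astype(np.float64)  # vectorization: one row per image
    mean = X.mean(axis=0)                         # mean of each pixel position
    Xc = X - mean                                 # center the data around zero
    # Economy-size SVD: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                           # shape (k, n_pixels)
    projected = Xc @ components.T                 # shape (n, k)
    return mean, components, projected

def query(image, mean, components, projected, top=3):
    """Rank dataset images by Euclidean distance to the query in PCA space."""
    q = (image.reshape(-1).astype(np.float64) - mean) @ components.T
    dists = np.linalg.norm(projected - q, axis=1)
    order = np.argsort(dists)[:top]
    return [(int(i), float(dists[i])) for i in order]

# Usage with synthetic data standing in for real images.
rng = np.random.default_rng(0)
dataset = rng.random((6, 8, 8))
mean, components, projected = fit_pca(dataset, k=3)
print(query(dataset[2], mean, components, projected, top=2))
```

Querying with an image that is already in the dataset should rank that image first, at a distance of effectively zero. Taking the SVD of the centered data matrix avoids forming the pixel-by-pixel covariance matrix explicitly, which matters because the pixel count is usually much larger than the number of images.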

Query-by-Humming Features

  • MIDI Processing: Extract melody tracks and normalize tempo/pitch variations
  • Windowing: Sliding window technique for flexible melody segment matching
  • Feature Extraction: ATB, RTB, and FTB histogram generation for tone analysis
  • Similarity Matching: Cosine similarity computation for melody identification
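The windowing, histogram, and matching steps above might look roughly like the sketch below. The source does not define the three histograms, so this assumes one common reading: ATB counts absolute MIDI pitches (0 to 127), RTB counts intervals between consecutive notes (-127 to 127), and FTB counts each note's offset from the first note of the window. Measuring windows in notes rather than beats, weighting the three histograms equally, and all function names are further assumptions made for illustration.

```python
import math

def sliding_windows(notes, size, step):
    """Split a pitch sequence into overlapping segments of `size` notes."""
    if len(notes) <= size:
        return [notes]
    return [notes[i:i + size] for i in range(0, len(notes) - size + 1, step)]

def histogram(values, low, high):
    """Normalized histogram over the integer bins low..high inclusive."""
    bins = [0.0] * (high - low + 1)
    for v in values:
        bins[v - low] += 1
    total = sum(bins)
    return [b / total for b in bins] if total else bins

def atb(notes):
    """Absolute Tone Based: distribution of MIDI pitches."""
    return histogram(notes, 0, 127)

def rtb(notes):
    """Relative Tone Based: distribution of consecutive intervals."""
    return histogram([b - a for a, b in zip(notes, notes[1:])], -127, 127)

def ftb(notes):
    """First Tone Based: distribution of offsets from the first note."""
    return histogram([n - notes[0] for n in notes], -127, 127)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def melody_similarity(x, y):
    """Equal-weight average of the three histogram similarities (assumed)."""
    return sum(cosine(f(x), f(y)) for f in (atb, rtb, ftb)) / 3

def best_match(query_notes, song_notes, size, step):
    """Highest similarity between the query and any window of the song."""
    windows = sliding_windows(song_notes, size, step)
    return max(melody_similarity(query_notes, w) for w in windows)

song = [60, 62, 64, 65, 67, 69, 71, 72, 71, 69, 67, 65]
hummed = [67, 69, 71, 72]  # a fragment taken from the middle of the song
print(best_match(hummed, song, size=4, step=1))
```

Under these assumptions, RTB and FTB are unchanged when a melody is transposed, which helps when a hummed query is in a different key from the reference, while ATB still rewards matches in the original key.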