December 2024
Music & Image Retrieval System
PCA-based Image Search and Query-by-Humming Music Recognition
An information retrieval system that supports both image and music search. It uses Principal Component Analysis (PCA) to compare images in a reduced-dimension space and Query-by-Humming techniques to identify music from MIDI files.
Developed as part of the Linear Algebra and Geometry course at Institut Teknologi Bandung, this project demonstrates practical applications of linear algebra concepts including SVD, eigenvalue decomposition, and vector similarity measures in multimedia retrieval systems.
Key Features
- Implemented PCA-based image retrieval using Singular Value Decomposition (SVD) for dimensionality reduction
- Developed a Query-by-Humming system with MIDI processing and windowing techniques for music identification
- Built a feature extraction pipeline with Absolute Tone Based (ATB), Relative Tone Based (RTB), and First Tone Based (FTB) histograms
- Implemented cosine similarity and Euclidean distance measures to rank candidate matches
- Created a full-stack web application with a Go backend, a Python FastAPI service, and a React frontend
Technical Implementation
Image Processing
Grayscale conversion, normalization, and PCA projection for similarity matching
Audio Processing
MIDI parsing, pitch normalization, and feature vector extraction
Linear Algebra
SVD, eigenvalue analysis, and principal component computation
Similarity Measures
Cosine similarity and Euclidean distance for content matching
Backend Architecture
Go with Gin framework and Python FastAPI for microservices
Frontend
React with modern UI for file upload and search visualization
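The two similarity measures listed above behave differently, which is worth seeing on a small example. The following is a minimal sketch in plain Python, not the project's actual code; the function names `cosine_similarity` and `euclidean_distance` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two vectors pointing in the same direction but with different magnitudes
u = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 6.0]
similarity = cosine_similarity(u, v)   # close to 1: direction is identical
distance = euclidean_distance(u, v)    # clearly non-zero: magnitudes differ
```

Cosine similarity ignores vector length, which suits normalized histograms, while Euclidean distance is sensitive to magnitude and fits coordinates in a projected space.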
PCA Image Retrieval Algorithm
- 1. Preprocessing: Convert images to grayscale and resize them to consistent dimensions
- 2. Vectorization: Flatten image matrices into 1D vectors for mathematical processing
- 3. Standardization: Subtract the mean of each pixel position across the dataset so the data is centered at zero
- 4. SVD Computation: Apply Singular Value Decomposition to the centered data to extract principal components
- 5. Projection: Transform the dataset images, and the query image using the same mean and components, into lower-dimensional PCA space
- 6. Similarity: Compute the Euclidean distance from the query to each image in PCA space and rank results by ascending distance
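The steps above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the project's implementation: the function names (`to_gray_vector`, `fit_pca`, `rank_by_distance`) are hypothetical, the grayscale weights are the common 0.299/0.587/0.114 luminance coefficients (the source does not state which formula was used), images are assumed to be already resized to a common shape, and random data stands in for a real image set.

```python
import numpy as np


def to_gray_vector(rgb):
    """Convert an (H, W, 3) RGB array to a flattened grayscale vector in [0, 1]."""
    # Assumed luminance weights; the original project may use a different formula
    gray = rgb[..., :3] @ np.array([0.299, 0.587, 0.114])
    return (gray / 255.0).ravel()


def fit_pca(images, n_components):
    """Fit PCA on flattened images of shape (n_images, n_pixels)."""
    mean = images.mean(axis=0)            # per-pixel mean across the dataset
    centered = images - mean              # center each pixel position at zero
    # Thin SVD: rows of vt are principal directions, ordered by singular value
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]        # shape (n_components, n_pixels)
    projected = centered @ components.T   # dataset coordinates in PCA space
    return mean, components, projected


def rank_by_distance(query, mean, components, projected):
    """Return dataset indices and distances, sorted from nearest to farthest."""
    # The query must be centered and projected with the dataset's mean and components
    q = (query - mean) @ components.T
    distances = np.linalg.norm(projected - q, axis=1)
    order = np.argsort(distances)
    return order, distances[order]


# Toy data standing in for ten 8x8 grayscale images, already flattened
rng = np.random.default_rng(0)
dataset = rng.random((10, 64))
mean, components, projected = fit_pca(dataset, n_components=5)

# Querying with an image from the dataset should rank that image first
order, distances = rank_by_distance(dataset[3], mean, components, projected)
```

Taking the SVD of the centered data matrix avoids forming the pixel-by-pixel covariance matrix explicitly, which becomes very large for realistic image sizes.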
Query-by-Humming Features
- MIDI Processing: Extract melody tracks and normalize tempo/pitch variations
- Windowing: Slide a fixed-size window over each melody so that a hummed fragment can match any segment of a song
- Feature Extraction: ATB, RTB, and FTB histogram generation for tone analysis
- Similarity Matching: Cosine similarity computation for melody identification
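The feature extraction and matching steps above could look roughly like the sketch below. It rests on several assumptions that the source does not confirm: ATB is taken to be a 128-bin histogram of MIDI pitches, RTB a 255-bin histogram of intervals between consecutive notes, and FTB a 255-bin histogram of each note's interval from the first note in the window. The function names, the window parameters `size` and `step` (counted in notes), and the `weights` used to combine the three scores are all hypothetical.

```python
import math


def normalize(hist):
    """Scale a histogram so its bins sum to 1 (left unchanged if empty)."""
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def atb(pitches):
    """Absolute Tone Based: distribution of MIDI pitches, bins 0..127."""
    hist = [0.0] * 128
    for p in pitches:
        hist[p] += 1
    return normalize(hist)


def interval_histogram(intervals):
    """Histogram of pitch differences, 255 bins covering -127..+127."""
    hist = [0.0] * 255
    for d in intervals:
        hist[d + 127] += 1
    return normalize(hist)


def rtb(pitches):
    """Relative Tone Based: intervals between consecutive notes."""
    return interval_histogram(b - a for a, b in zip(pitches, pitches[1:]))


def ftb(pitches):
    """First Tone Based: interval of every note from the first note."""
    return interval_histogram(p - pitches[0] for p in pitches)


def windows(pitches, size, step):
    """Yield overlapping windows of `size` notes, advancing `step` notes each time."""
    if len(pitches) <= size:
        yield pitches
        return
    for start in range(0, len(pitches) - size + 1, step):
        yield pitches[start:start + size]


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def score(query, song, size=8, step=2, weights=(0.2, 0.4, 0.4)):
    """Best weighted cosine similarity between the query and any window of the song."""
    q = (atb(query), rtb(query), ftb(query))
    best = 0.0
    for w in windows(song, size, step):
        f = (atb(w), rtb(w), ftb(w))
        s = sum(wt * cosine(qa, fa) for wt, qa, fa in zip(weights, q, f))
        best = max(best, s)
    return best


# Hypothetical melody as MIDI note numbers, and a query copied from notes 2-5
song = [60, 62, 64, 65, 67, 65, 64, 62, 60, 67, 72, 67]
query = [62, 64, 65, 67]
best = score(query, song, size=4, step=1)
```

Under these definitions, RTB and FTB depend only on pitch differences, so they are unchanged when a melody is hummed in a different key, whereas ATB is not.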