A new comprehensive and diverse few-shot acoustic classification benchmark. If you use any code or results from this work, please cite the following (the paper will go live shortly):

@misc{heggan2022metaaudio,
  title = {MetaAudio: A Few-Shot Audio Classification Benchmark},
  doi = {10.48550/ARXIV.2204.02121},
  url = {},
  author = {Heggan, Calum and Budgett, Sam and Hospedales, Timothy and Yaghoobi, Mehrdad},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution Non Commercial Share Alike 4.0 International}
}

This work is licensed under Creative Commons Attribution-NonCommercial (CC BY-NC).


We use Miniconda for our experimental setup. For reproduction, we include the environment file; the environment can be created with the following command:

conda env create --file torch_gpu_env.txt

Contents Overview

This repo contains the following:

  • Multiple problem-setting setups with accompanying results, which can serve moving forward as baselines for few-shot acoustic classification. These include:
    • Standard within-dataset generalisation
    • Joint training, applied to both within- and cross-dataset settings
    • Additional data -> simple classifier for cross-dataset evaluation
    • Length-shifted and length-stratified problems for the variable-length dataset setting
  • Standardised meta-learning/few-shot splits for 5 distinct datasets from a variety of sound domains. This includes both baseline splits (randomly generated) and more purposeful ones, such as those based on available meta-data and sample-length distributions
  • A variety of algorithm implementations designed for few-shot classification, ranging from 'cheap' traditional training pipelines to SOTA Gradient-Based Meta-Learning (GBML) models
  • Both fixed- and variable-length dataset processing pipelines
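To make the few-shot setup above concrete, here is a minimal sketch of how an N-way K-shot episode could be drawn from a labelled split. This is an illustrative example only, not the repo's actual API; the function and variable names are hypothetical.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=5, k_shot=1, q_queries=5, rng=None):
    """Sample one N-way K-shot episode from a flat list of
    (sample_id, class_label) pairs. Hypothetical helper, not the repo's API.
    Returns support and query lists of (sample_id, class_label) pairs."""
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for idx, label in labels:
        by_class[label].append(idx)
    # Keep only classes with enough samples for both support and query sets
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + q_queries]
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        chosen = rng.sample(by_class[c], k_shot + q_queries)
        support += [(i, c) for i in chosen[:k_shot]]   # K labelled examples per class
        query += [(i, c) for i in chosen[k_shot:]]     # held-out examples to classify
    return support, query
```

For a 5-way 1-shot episode this yields a support set of 5 labelled clips (one per class) and a disjoint query set evaluated against them.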

Algorithm Implementations

All algorithms are custom-built and operate on a shared framework with a common set of scripts. Those included in the paper are as follows:

For both MAML & Meta-Curvature we also make use of the Learn2Learn framework.
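To illustrate the core idea behind gradient-based meta-learning such as MAML, here is a first-order (FOMAML-style) sketch on a toy one-parameter regression family. This is a didactic example with analytic gradients, not the Learn2Learn API or the repo's implementation; all names and the task family are made up for illustration.

```python
import random

def loss_grad(theta, target):
    # Quadratic task loss L(theta) = (theta - target)^2, gradient 2*(theta - target)
    return 2.0 * (theta - target)

def fomaml_step(theta, task_targets, inner_lr=0.1, outer_lr=0.05):
    """One first-order MAML meta-update over a batch of tasks: each task
    adapts theta with a single inner gradient step, then the gradients
    evaluated at the adapted parameters are averaged for the outer update."""
    meta_grad = 0.0
    for target in task_targets:
        adapted = theta - inner_lr * loss_grad(theta, target)  # inner adaptation step
        meta_grad += loss_grad(adapted, target)                # post-adaptation gradient
    return theta - outer_lr * meta_grad / len(task_targets)

theta = 0.0
rng = random.Random(0)
for _ in range(200):
    # Hypothetical task family: each task is "regress to a target near 3.0"
    tasks = [3.0 + rng.uniform(-0.5, 0.5) for _ in range(4)]
    theta = fomaml_step(theta, tasks)
```

After meta-training, theta sits near the centre of the task family (about 3.0), so a single inner step adapts it well to any sampled task; this is exactly the "train to be easy to fine-tune" objective that MAML and Meta-Curvature pursue with neural networks.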


We primarily cover 5 datasets for the majority of our experimentation; these are as follows:

In addition to these, we also include 2 extra datasets for cross-dataset testing:

as well as a proprietary version of AudioSet that we use for pre-training with simple classifiers. We obtained/scraped this dataset using the code from here:

We include sources for all of these datasets in Dataset Processing.
