Abstract: Deploying large-scale foundation models (FMs) on resource-constrained devices presents critical challenges due to their substantial computational and memory requirements. This is ...