Evaluating and Fine-Tuning Vision-Language-Action Models for Robotic Control in Novel Environments
Author(s): Kilian Preuss
Following the reviewers' suggestions, I made several improvements to the extended abstract. I added a Keywords section after the summary and clarified the expressions “general reasoning capabilities” and “strong performance across”. I also corrected various formatting issues and added a short explanation before the results table to put our hypotheses in context. In addition, I added an interpretation of the results after the table and a brief conclusion section to complete the extended abstract. In the Related Work section, I replaced a method I had originally intended to compare against with another approach that is more relevant to our study. The table now reports the actual experimental results and includes a column indicating the percentage of trainable parameters. In the Methods section, particularly in the PEFT Methods Considered subsection, I made the descriptions more precise and added clarifications. Finally, I updated the references so that the publication venue of each cited paper is clearly indicated.