Machine Learning (ML)-based Intrusion Detection Systems (IDS) are increasingly proposed for deployment in Industrial Control Systems (ICS) to detect evolving and previously unseen attacks. However, ML models are vulnerable to adversarial examples, i.e., carefully crafted inputs that induce misclassification while remaining functionally valid and physically plausible. In safety-critical ICS environments, this vulnerability makes systematic robustness benchmarking essential prior to deployment. In this paper, we introduce the Framework for Evasion and Validation for Industrial Control Systems (FEVA-ICS), a novel end-to-end benchmarking platform designed to assess the robustness of ML-based IDS in a realistic black-box setting. FEVA-ICS incorporates two attack strategies: a query-based approach and a surrogate-model-based approach. For the former, we propose Correlation-Driven Feature Shift (CorrShift), a novel query-based adversarial attack tailored for ICS that preserves physical plausibility and temporal consistency. For the latter, we include transfer attacks that craft adversarial examples on a surrogate model using gradient-based methods, such as the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD). Through comprehensive experiments, we show that CorrShift consistently outperforms surrogate-based attacks in effectiveness and generalizability, highlighting the importance of ICS-aware adversarial design. The results underscore the need for adversarial robustness evaluation in ML-based IDS pipelines. FEVA-ICS establishes a practical and extensible benchmark for adversarial robustness assessment, supporting safer and more reliable deployment of ML-based IDS in real-world ICS environments.
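To make the gradient-based baselines concrete, the following is a minimal sketch of FGSM and PGD, not the FEVA-ICS implementation. It assumes a logistic-regression surrogate with known weights `w` and bias `b`, features min-max normalized to [0, 1], and an L-infinity perturbation budget `eps`; the function names and parameters are hypothetical and chosen for illustration only. In a transfer attack, the resulting `x_adv` would then be submitted to the black-box target IDS. The sketch enforces no physical-plausibility or temporal-consistency constraints.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping a score to an attack probability."""
    return 1.0 / (1.0 + np.exp(-z))


def loss_gradient(x, y, w, b):
    """Gradient of binary cross-entropy w.r.t. the input x.

    Assumption: the surrogate is logistic regression, so the
    gradient has the closed form (p - y) * w.
    """
    p = sigmoid(x @ w + b)
    return (p - y) * w


def fgsm(x, y, w, b, eps, lo=0.0, hi=1.0):
    """Single-step FGSM: move eps along the sign of the loss gradient."""
    x_adv = x + eps * np.sign(loss_gradient(x, y, w, b))
    # Keep features inside the assumed normalized range [lo, hi].
    return np.clip(x_adv, lo, hi)


def pgd(x, y, w, b, eps, step, n_iter, lo=0.0, hi=1.0):
    """Iterative PGD with projection onto the L-infinity ball around x."""
    x_adv = x.copy()
    for _ in range(n_iter):
        x_adv = x_adv + step * np.sign(loss_gradient(x_adv, y, w, b))
        # Project back into the eps-ball, then into the feature range.
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, lo, hi)
    return x_adv
```

For a sample labeled as an attack (`y = 1`), both functions increase the surrogate's loss, which lowers its attack score. PGD takes several smaller steps and re-projects after each one, so it can use a budget `eps` more fully than a single FGSM step of the same size.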