
Adversarial machine learning

Adversarial machine learning definition

Adversarial machine learning is a field of study that focuses on the vulnerabilities and risks of machine learning models. Its goal is to understand how models can be attacked and to develop techniques that defend against such attacks.

In regular machine learning, models are trained on sound data that represents real-world situations. In adversarial machine learning, researchers instead study how attackers can craft inputs that trick the model.

See also: artificial intelligence, machine learning

History of adversarial attacks:

Early Discoveries (2000s): Researchers first observed that machine learning models could be vulnerable, showing that classifiers such as support vector machines and decision trees could be manipulated by carefully crafted inputs.

Exploration of Adversarial Examples (2013): Szegedy et al. coined the term “adversarial examples” in their paper “Intriguing properties of neural networks.” They demonstrated that small, imperceptible changes to input data could cause deep neural networks to misclassify objects. This discovery raised awareness of the need for defenses against adversarial attacks.

Breakthrough in Deep Learning (2014): Deep learning models, particularly deep neural networks, gained popularity for achieving remarkable performance in various tasks, such as image and speech recognition. However, researchers discovered that these powerful models were highly susceptible to adversarial attacks.
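
To make this susceptibility concrete, below is a minimal sketch of the fast gradient sign method (FGSM), introduced by Goodfellow et al. in 2014 and one of the simplest attacks of this era. The tiny linear model, random input, and epsilon value are illustrative stand-ins, not a reproduction of any published experiment:

```python
import torch
import torch.nn as nn

# Illustrative stand-ins: a tiny linear classifier over 28x28 inputs.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
loss_fn = nn.CrossEntropyLoss()

def fgsm_attack(x, y, epsilon=0.1):
    """Perturb x by epsilon in the direction that most increases the loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    # Each pixel moves by +/- epsilon along the sign of the loss gradient.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

x = torch.rand(1, 1, 28, 28)   # stand-in "image"
y = torch.tensor([3])          # stand-in label
x_adv = fgsm_attack(x, y)
print(model(x).argmax(1), model(x_adv).argmax(1))  # predictions may now differ
```

The perturbation is bounded by epsilon per pixel, which is why adversarial examples can look unchanged to a human while flipping the model's prediction.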

Increase in Adversarial Research (2016-2018): Adversarial machine learning gained significant attention in academia and industry during this period, as researchers from many institutions published a growing number of papers on adversarial attacks, defenses, and their impact on different machine learning algorithms.

Real-World Impact (2018-present): Adversarial attacks are no longer limited to academic demonstrations; they now have real-world impact, especially in computer vision and autonomous systems. For example, researchers showed that physical perturbations, such as stickers placed on stop signs, could deceive the object detection systems used in self-driving cars.

Development of Defense Techniques (ongoing): As adversarial attacks continue to pose a challenge, researchers and practitioners are developing defense techniques that improve the robustness of machine learning models. These include adversarial training (sketched below), defensive distillation, and input-preprocessing methods.
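
Below is a minimal sketch of a single adversarial-training step, reusing the FGSM idea from the earlier sketch. The model, random batch, 50/50 loss weighting, and epsilon are all illustrative assumptions, not a prescribed recipe:

```python
import torch
import torch.nn as nn

# Illustrative stand-ins: same tiny classifier, random batch, SGD optimizer.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def fgsm_attack(x, y, epsilon=0.1):
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

x = torch.rand(32, 1, 28, 28)      # stand-in training batch
y = torch.randint(0, 10, (32,))    # stand-in labels

# Adversarial training: optimize on a mix of clean and adversarial
# examples so the model learns to resist the perturbations it will face.
x_adv = fgsm_attack(x, y)
optimizer.zero_grad()
loss = 0.5 * loss_fn(model(x), y) + 0.5 * loss_fn(model(x_adv), y)
loss.backward()
optimizer.step()
```

In practice, the attack is rerun on every batch so the adversarial examples always track the model's current weights.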

Types of adversarial attacks:

  • Adversarial Examples: Crafting inputs with small perturbations that trick the model into making incorrect predictions.
  • Evasion Attacks: Manipulating inputs at inference time to mislead an already-trained model.
  • Poisoning Attacks: Tampering with the training data to change the model's behavior or performance (a toy example follows this list).
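
As a concrete illustration of a poisoning attack, the sketch below flips a fraction of training labels before fitting a classifier and compares its test accuracy against a model trained on clean data. The synthetic dataset, logistic regression model, and 20% flip rate are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data as a stand-in for a real dataset.
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clean = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Poison the training set by flipping 20% of the labels.
rng = np.random.default_rng(0)
idx = rng.choice(len(y_train), size=len(y_train) // 5, replace=False)
y_poisoned = y_train.copy()
y_poisoned[idx] = 1 - y_poisoned[idx]

poisoned = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
print("clean accuracy:   ", clean.score(X_test, y_test))
print("poisoned accuracy:", poisoned.score(X_test, y_test))
```

Label flipping is only the simplest form of poisoning; more sophisticated attacks inject carefully crafted training points rather than corrupting labels at random.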