
Adversarial attacks

Adversarial attacks definition

Adversarial attacks are malicious techniques designed to trick artificial intelligence (AI) and machine learning (ML) systems into making mistakes. Attackers do this by making subtle, carefully planned changes to the data they feed into the AI model. These small modifications can cause the system to produce completely wrong results.

These attacks are becoming a serious concern in AI security, especially in critical applications like facial recognition, autonomous vehicles, medical diagnosis systems, and cybersecurity tools where accuracy is essential.

See also: adversarial machine learning

Types of adversarial attacks

  • Evasion attacks. An attacker feeds a subtly perturbed input to a trained model to cause it to make incorrect predictions. The changes to the input data are often minimal but specifically crafted to mislead the model (see the first sketch after this list).
  • Poisoning attacks. In this type of attack, the adversary contaminates the training data used to build the ML model. By injecting malicious data points into the training set, attackers can manipulate the model's learning process, causing it to learn incorrect patterns and make biased or erroneous predictions when deployed (see the second sketch after this list).
  • Model extraction attacks. These attacks aim to replicate or steal an ML model by querying it repeatedly and observing its outputs. The attacker then uses this information to create a substitute model that mimics the behavior of the original, potentially bypassing its security measures. 
  • Byzantine attacks. In distributed or federated learning, malicious participants submit corrupted model updates or gradients to poison the shared global model. This can degrade the model's performance or introduce backdoors.
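
As a rough illustration of an evasion attack, the sketch below crafts an FGSM-style perturbation in PyTorch. The names model, x, y, and the epsilon budget are placeholders for a pretrained classifier, a correctly labeled input, and a perturbation limit; this is a minimal sketch of the general idea, not any specific product's implementation.

    import torch
    import torch.nn.functional as F

    def fgsm_evasion(model, x, y, epsilon=0.03):
        # Work on a detached copy so the gradient is taken w.r.t. the input itself.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)  # loss against the true label y
        loss.backward()                          # gradient of the loss w.r.t. the input
        # Step in the direction that increases the loss, bounded by epsilon per feature.
        perturbation = epsilon * x_adv.grad.sign()
        return (x_adv + perturbation).clamp(0, 1).detach()

The returned input looks almost identical to the original, yet is nudged in exactly the direction that makes the model's loss, and therefore its chance of misclassification, as large as the small budget allows.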
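In the same spirit, the next sketch shows a crude poisoning attack: flipping the labels of a fraction of the training set before fitting a scikit-learn classifier on synthetic data. The dataset, the flip_labels helper, and the 30% poisoning rate are illustrative assumptions, meant only to show how contaminated training data degrades the deployed model.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    def flip_labels(labels, fraction, rng=np.random.default_rng(0)):
        # Flip the labels of a random fraction of training points (binary case).
        poisoned = labels.copy()
        idx = rng.choice(len(labels), size=int(fraction * len(labels)), replace=False)
        poisoned[idx] = 1 - poisoned[idx]
        return poisoned

    clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, flip_labels(y_train, 0.3))

    print("clean accuracy:   ", clean_model.score(X_test, y_test))
    print("poisoned accuracy:", poisoned_model.score(X_test, y_test))

Running this typically shows the poisoned model scoring noticeably lower on the held-out test set than the clean one, which is exactly the effect a poisoning adversary is after.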