# Linear discriminant analysis

## Linear discriminant analysis definition

Linear discriminant analysis (LDA) is a statistical method that identifies the linear combination of features that best separates different groups of objects or events.

LDA traces back to the work of Ronald A. Fisher, who introduced the linear discriminant in 1936 to solve two-class classification problems. It has since been extended to problems with more than two classes and is used in fields ranging from biology to finance.

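LDA is simple and efficient, but it relies on certain assumptions: the features in each class should be roughly normally distributed, and all classes should have a similar spread (that is, a shared covariance). If these assumptions aren’t met, the method may not work well. In such situations, quadratic discriminant analysis (QDA), which allows each class its own covariance, or logistic regression could be better.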

## How linear discriminant analysis works

1. Compute the within-class and between-class scatter. It’s ideal for the within-class scatter to be low, which means data points in each class are closely clustered together. Meanwhile, the between-class scatter should be high, so that the different classes are well separated.
2. Determine the linear discriminants. LDA then finds the linear combinations of features that maximize the ratio of between-class scatter to within-class scatter. These linear combinations serve as the new axes (or directions) that best separate the data into classes. For a problem with C classes, there are at most C − 1 such axes.
3. Project data. The original data points are then projected onto these new axes. In the resulting lower-dimensional space, the classes are as separated as a linear projection allows.
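The three steps above can be sketched in a few lines of plain Python for the simplest case Fisher considered: two classes and two features. In that case the best direction is proportional to the inverse of the within-class scatter matrix multiplied by the difference of the class means. The data points and the helper names (`mean_vector`, `scatter_matrix`, `fisher_direction`, `project`) are made up for illustration. With more features or classes, a numerical library would normally handle the matrix algebra, and the multi-class case needs an eigenvalue problem in place of the closed form used here.

```python
def mean_vector(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter_matrix(points, mean):
    """2x2 scatter matrix: sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Steps 1 and 2: direction w proportional to S_W^-1 (m_b - m_a)."""
    m_a, m_b = mean_vector(class_a), mean_vector(class_b)
    s_a = scatter_matrix(class_a, m_a)
    s_b = scatter_matrix(class_b, m_b)
    # Within-class scatter is the sum of the per-class scatter matrices.
    s_w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    # The difference of the class means carries the between-class information.
    diff = [m_b[0] - m_a[0], m_b[1] - m_a[1]]
    # Apply the closed-form inverse of a 2x2 matrix to diff.
    return [
        (s_w[1][1] * diff[0] - s_w[0][1] * diff[1]) / det,
        (-s_w[1][0] * diff[0] + s_w[0][0] * diff[1]) / det,
    ]


def project(points, w):
    """Step 3: project each point onto the direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Made-up toy data: two small, well-separated groups.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.0, 6.0)]

w = fisher_direction(class_a, class_b)
proj_a = project(class_a, w)
proj_b = project(class_b, w)

print("direction:", w)
print("class A projections:", proj_a)
print("class B projections:", proj_b)
```

On this toy data, every projected value of the first class falls below every projected value of the second, so a single threshold on the new axis separates the two groups.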

## Practical uses of linear discriminant analysis

• Biology and medicine. LDA helps genomic scientists distinguish gene expression patterns across different conditions. In medical imaging, it helps distinguish between benign and malignant tumors based on certain features. It’s also used for differential diagnosis based on patient measurements.
• Face recognition. Biometric authentication systems use LDA to extract the most distinguishable features of faces.
• Finance. LDA detects patterns in financial data, such as those that point to potentially fraudulent activities. It can also predict whether a customer is likely to default based on their financial data.
• Marketing. LDA assigns customers to known segments based on their purchasing behavior, demographics, and other data. It also helps identify customers who are likely to stop using a service or product.
• Environmental science. LDA helps classify ecological zones based on various environmental factors. It can also predict the presence or absence of a particular species based on habitat characteristics.