**What is the Classification Algorithm?**

Our task in the analysis phase starts with identifying the target class, and this whole process is called classification. An algorithm is a procedure or formula for solving a problem in mathematics or computer science, based on carrying out a sequence of specified actions. A computer program can be viewed as an elaborate algorithm, and algorithms are used in almost all areas of information technology.

For example, consider a search engine algorithm: it takes search strings of keywords as input, matches them against the relevant web pages, and returns results accordingly. An encryption algorithm, such as the US Department of Defense's Data Encryption Standard (DES), uses a secret-key algorithm to protect data from being hacked or leaked, because leakage of a country's information can put it in danger. As long as the algorithm is sufficiently sophisticated, no one lacking the key can decrypt the secured data.

**Some of the Examples of the Target Class**

Analyzing buyer data to predict whether a customer will buy computer accessories (target class: yes or no)

Grouping and differentiating fruits based on their colour, taste, size, and weight (target class: apple, mango, litchi, cherry, papaya, orange, melon, and tomato)

Differentiating gender based on hair length (target class: male or female)

Now we will illustrate the concept of a classification algorithm by differentiating people according to gender based on their hair length (by no means am I trying to stereotype by gender; this is only for example's sake). We need a proper hair-length value to split on. Suppose the decision-boundary hair length is 25.0 cm; then if the hair length is greater than 25.0 cm, we classify the gender as female, and otherwise as male.
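This single-feature rule can be sketched as a tiny threshold classifier in Python. The 25.0 cm boundary comes from the example above; the sample values and the choice of which label sits above the boundary are illustrative:

```python
# Minimal threshold classifier for the hair-length example.
# The 25.0 cm decision boundary comes from the text; the data is made up.

def classify_by_hair_length(length_cm, boundary=25.0):
    """Label a sample 'female' if hair length exceeds the boundary, else 'male'."""
    return "female" if length_cm > boundary else "male"

samples = [10.0, 30.5, 25.0, 40.2]
predictions = [classify_by_hair_length(s) for s in samples]
print(predictions)  # ['male', 'female', 'male', 'female']
```

Real classifiers learn the boundary from labelled data rather than having it fixed by hand, but the prediction step works the same way.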

**Dataset Sources and Content**

The dataset contains salaries. The following describes our dataset:

- Number of classes: 2 (">50K" and "<=50K")
- Number of attributes (columns): 7
- Number of instances (rows): 48,842

This data was taken from the Census Bureau database.

**Explanation**

Two salary classes are taken into account: the first is greater than 50K, and the second is less than or equal to 50K. Given the 7 attributes (columns) and 48,842 instances (rows) drawn from the Census Bureau database, we can easily assign each person to one of the two salary groups based on those attributes. As a result, a great deal of manual calculation and labour can be avoided.
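As a small sketch of that grouping step, here is the two-class split on a handful of made-up records (the field names and values are hypothetical; the real dataset has 48,842 rows and 7 attributes):

```python
from collections import Counter

# Hypothetical mini-records standing in for the census income data.
records = [
    {"age": 39, "income": "<=50K"},
    {"age": 50, "income": "<=50K"},
    {"age": 38, "income": ">50K"},
    {"age": 53, "income": "<=50K"},
    {"age": 28, "income": ">50K"},
]

# Group the records into the two salary classes described above.
counts = Counter(r["income"] for r in records)
print(counts["<=50K"], counts[">50K"])  # 3 2
```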

**Applications of Classification Algorithms**

- Email spam classification
- Predicting whether bank customers will repay their loans
- Identifying cancer tumour cells
- Sentiment analysis
- Drug classification
- Facial keypoint detection
- Pedestrian detection in autonomous driving

**Types of Classification Algorithms**

Classification algorithms can be broadly categorized as follows:

- Linear classifiers
- Logistic regression
- Naïve Bayes classifier
- Fisher's linear discriminant
- Support vector machines
- Least squares support vector
- Quadratic classifiers
- Kernel estimation
- K-nearest neighbour
- Decision trees
- Random forests
- Neural networks
- Learning vector quantization

**Explanation of Some Important Types of Classification Algorithms**

**1. Logistic Regression**

Despite its name, logistic regression is a classification algorithm, not a regression algorithm. It uses the logistic (sigmoid) function to estimate the probability that an observation belongs to a particular class.

**R-code**

```r
x <- cbind(x_train, y_train)

# train the model using the training set and check the score
logistic <- glm(y_train ~ ., data = x, family = "binomial")
summary(logistic)

# predict output
predicted <- predict(logistic, x_test)
```

There are several steps that can help us improve the model:

- Include interaction terms
- Remove features
- Use regularization techniques
- Use a non-linear model
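One of those steps, regularization, can be sketched with scikit-learn on synthetic data (the dataset and the `C` value here are illustrative; a smaller `C` means stronger regularization):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# L2-regularized logistic regression; smaller C = stronger regularization.
model = LogisticRegression(C=0.1, penalty="l2")
model.fit(X, y)

print(model.score(X, y))  # training accuracy on this synthetic data
```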

**Advantage**

- It is designed for classification and is most useful for understanding the influence of several independent variables on a single outcome variable.

**Disadvantages**

- It works only when the predicted variable is binary.


**2. Decision Trees**

The decision tree is a supervised learning algorithm used for classification problems. It splits the data into branches by asking a sequence of questions about the attribute values, until each leaf corresponds to a class label.

**R-code**

```r
library(rpart)

x <- cbind(x_train, y_train)

# grow tree
fit <- rpart(y_train ~ ., data = x, method = "class")
summary(fit)

# predict output
predicted <- predict(fit, x_test)
```

**Advantages**

- A decision tree is simple to understand and visualize, requires little data preparation, and can handle both numerical and categorical data.

**Disadvantages**

- It can create overly complex trees that do not generalize well to new data (overfitting).

**3. Naive Bayes Classifier**

The naïve Bayes classifier is based on Bayes' theorem and assumes independence between predictors.

It helps us calculate the posterior probability P(c|x) from P(c), P(x), and P(x|c):

P(c|x) = (P(x|c) × P(c)) / P(x)

Here,

P(c|x) is the posterior probability of the class (target) given the predictor (attribute), P(c) is the prior probability of the class, P(x|c) is the likelihood (the probability of the predictor given the class), and P(x) is the prior probability of the predictor.

**Example:**

Now we will classify, on the basis of the weather, whether the players will play or not.

**Step 1:** Convert the data set into a frequency table.

**Step 2:** Create a likelihood table by finding the probabilities: the probability of overcast is 0.29, and the probability of playing is 0.64.

**Step 3:** Calculate the posterior probability for each class using the naïve Bayes equation.
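Steps 1 and 2 can be sketched in Python with a made-up outlook column whose counts match the figures quoted in this article (14 days, 9 of them "yes", 4 of them overcast, 5 of them sunny):

```python
from collections import Counter

# Made-up weather/play data matching the counts quoted in the text.
weather = (["sunny"] * 3 + ["overcast"] * 4 + ["rainy"] * 2   # days with play = yes
           + ["sunny"] * 2 + ["rainy"] * 3)                   # days with play = no
play = ["yes"] * 9 + ["no"] * 5

# Step 1: frequency table of weather outlooks.
freq = Counter(weather)

# Step 2: likelihood-table entries.
p_overcast = freq["overcast"] / len(weather)
p_yes = play.count("yes") / len(play)
print(round(p_overcast, 2), round(p_yes, 2))  # 0.29 0.64
```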

**Example:**

A golf player will play if the weather is sunny. Is this statement correct?

We solve this using Bayes' theorem:

P(yes | sunny) = P(sunny | yes) × P(yes) / P(sunny)

Now, P(sunny | yes) = 3/9 = 0.33,

P(sunny) = 5/14 = 0.36,

P(yes) = 9/14 = 0.64.

So, P(yes | sunny) = 0.33 × 0.64 / 0.36 = 0.60

Since this is the higher probability, we predict that the player will play when the weather is sunny.
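The arithmetic in this worked example can be checked with a few lines of Python (the counts are taken directly from the text):

```python
# Posterior P(yes | sunny) via Bayes' theorem.
# Counts from the worked example: 3 of the 9 "yes" days were sunny,
# 5 of the 14 days were sunny, and 9 of the 14 days were "yes".
p_sunny_given_yes = 3 / 9
p_sunny = 5 / 14
p_yes = 9 / 14

p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
print(round(p_yes_given_sunny, 2))  # 0.6
```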

**R-code**

```r
library(e1071)

x <- cbind(x_train, y_train)

# fitting model
fit <- naiveBayes(y_train ~ ., data = x)
summary(fit)

# predict output
predicted <- predict(fit, x_test)
```

**Advantages**

- This type of algorithm needs a small amount of training data to estimate the required parameters.
- This method is extremely fast compared to more sophisticated methods.

**Disadvantages**

- It is known to be a poor estimator of probabilities, even when its class predictions are good.

**4. SVM (Support Vector Machine)**

A support vector machine separates groups with different features by finding a dividing hyperplane. For example, if we had only two features, such as the height and hair length of an individual, we would first plot the data in two-dimensional space, where each point has two coordinates; the points lying closest to the separating boundary are known as support vectors.

**R-code**

```r
library(e1071)

x <- cbind(x_train, y_train)

# fitting model
fit <- svm(y_train ~ ., data = x)
summary(fit)

# predict output
predicted <- predict(fit, x_test)
```

**Advantages**

- It is effective in high-dimensional spaces and uses only a subset of the training points (the support vectors) in the decision function, so it is also memory efficient.

**Disadvantages**

- Its outputs can be complex and difficult to understand and analyze.

- It relies on the groupings and parameters specified by the user, even when these are not relevant to the problem.

**5. Stochastic Gradient Descent**

Stochastic gradient descent is used when the sample size is very large, because it updates the model from one training example (or a small batch) at a time instead of from the whole dataset.

**Python code**

```python
from sklearn.linear_model import SGDClassifier

sgd = SGDClassifier(loss="modified_huber", shuffle=True, random_state=101)
sgd.fit(x_train, y_train)

y_pred = sgd.predict(x_test)
```

**Advantages**

- Efficiency and ease of implementation.

**Disadvantages**

- It requires several hyper-parameters, and it is sensitive to feature scaling.
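Because SGD is sensitive to feature scaling, it is common to standardize the features before fitting; a minimal sketch with scikit-learn on synthetic data (the dataset and parameter values are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Standardizing the features first addresses SGD's sensitivity to scaling.
model = make_pipeline(StandardScaler(),
                      SGDClassifier(loss="modified_huber", random_state=101))
model.fit(X, y)

print(model.score(X, y))  # training accuracy
```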

**Conclusion**

Classification algorithms help in the effective analysis of buyer data to predict whether a customer will buy computer accessories. They also help in grouping items and differentiating inputs from one another, which saves a great deal of time and effort. As a result, analysis becomes easier, and classification helps speed up the decision-making process, which is vital for sustaining and growing a business in a highly competitive world.

