# Binary Classification `predict()` Method: sklearn vs keras

As a data scientist or software engineer, you may have come across the task of binary classification. This is a fundamental problem in machine learning where the goal is to predict a binary outcome, i.e., either a 0 or a 1. There are many algorithms and libraries available to solve this problem, but two of the most popular are scikit-learn (sklearn) and Keras. In this blog post, we will compare the `predict()` method of these two libraries for binary classification.

## Table of Contents

- What is the `predict()` method?
- Binary classification with sklearn
- Binary classification with Keras
- Comparison of `predict()` method in sklearn and Keras
- When to Use sklearn `predict()` Method
- When to Use Keras `predict()` Method
- Conclusion

## What is the `predict()` method?

Before we dive into the comparison of the `predict()` method of sklearn and Keras, let’s first understand what this method does. The `predict()` method is used to make predictions on new data using a trained model. In binary classification, it takes in a set of features and returns one prediction per sample. In sklearn, that prediction is the class label itself (0 or 1). In Keras, it is the raw output of the network, typically a probability that still has to be converted into a label.

## Binary classification with sklearn

sklearn is a popular machine learning library in Python that provides a variety of algorithms and tools for data scientists and software engineers. To perform binary classification with sklearn, we first need to import the necessary modules and load our data. We will use the breast cancer dataset from sklearn as an example.

```
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
# Load data
data = load_breast_cancer()
X = data.data
y = data.target
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train model (max_iter is raised because the default of 100 iterations
# typically triggers a ConvergenceWarning on these unscaled features)
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)
# Make predictions: a 1D array of class labels (0 or 1)
y_pred = model.predict(X_test)
```

In the above code, we first load the breast cancer dataset and split it into training and testing sets. We then train a logistic regression model on the training data and make predictions on the testing data using the `predict()` method. The output of the `predict()` method is stored in the `y_pred` variable.
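To see how the labels relate to the underlying probabilities, here is a minimal, self-contained sketch. It assumes the same dataset and split as above; the names `clf`, `proba`, and `labels_from_proba` are illustrative and not part of the original example. It compares `predict()` with `predict_proba()`, which returns one probability column per class.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# max_iter is raised only so the solver converges on unscaled features
clf = LogisticRegression(max_iter=10000)
clf.fit(X_train, y_train)

labels = clf.predict(X_test)       # shape (n_samples,), values are 0 or 1
proba = clf.predict_proba(X_test)  # shape (n_samples, 2), one column per class

# Picking the most probable column reproduces the labels from predict()
labels_from_proba = clf.classes_[np.argmax(proba, axis=1)]

print(labels.shape, proba.shape)
print("agreement:", (labels == labels_from_proba).mean())
```

Here `argmax()` is appropriate because `predict_proba()` returns two columns, one for each class.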

## Binary classification with Keras

Keras is a deep learning library that provides a high-level API for building and training neural networks. To perform binary classification with Keras, we need to define our model architecture and compile it before training and making predictions. Let’s see an example of binary classification with Keras using the same breast cancer dataset.

```
from keras.models import Sequential
from keras.layers import Dense
# Define model architecture: one hidden layer and a single sigmoid output unit
model = Sequential([
    Dense(30, activation='relu', input_shape=(30,)),
    Dense(1, activation='sigmoid')
])
# Compile model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Train model
model.fit(X_train, y_train, epochs=50, batch_size=32)
# Make predictions: predict() returns probabilities with shape (n_samples, 1)
y_prob_keras = model.predict(X_test)
# Threshold at 0.5 to get 0/1 labels; argmax over a single column would always return 0
y_pred_keras = (y_prob_keras > 0.5).astype(int).ravel()
```

In the above code, we first define our model architecture using the Sequential API of Keras. We then compile the model with the Adam optimizer and binary crossentropy loss function. We train the model for 50 epochs, call the `predict()` method on the testing data, and store the resulting class labels in the `y_pred_keras` variable.

## Comparison of `predict()` method in sklearn and Keras

Now that we have seen examples of binary classification with sklearn and Keras, let’s compare the `predict()` method of these two libraries. The `predict()` method of sklearn returns a 1D array of predicted class labels, whereas the `predict()` method of Keras returns a 2D array of predicted probabilities. With a single sigmoid output unit, that array has shape `(n_samples, 1)` and each value is the predicted probability of class 1. To get the predicted class labels, we threshold the probabilities at 0.5 and flatten the result. The `argmax()` method of numpy is the right tool only when the output layer has one unit per class (for example, a softmax layer); applied to a single-column array, it always returns 0.

```
# Get predicted probabilities and class labels from Keras
y_prob_keras = model.predict(X_test, verbose=0)
y_pred_keras = (y_prob_keras > 0.5).astype(int).ravel()
# Compare output shapes
print("Keras probabilities shape:", y_prob_keras.shape)
print("sklearn labels shape:", y_pred.shape)
print("Keras labels shape:", y_pred_keras.shape)
```

Output:

```
Keras probabilities shape: (114, 1)
sklearn labels shape: (114,)
Keras labels shape: (114,)
```

In the above code, we take the probabilities returned by the `predict()` method of Keras, threshold them at 0.5, and flatten the 2D array into a 1D array of labels with the same shape as the sklearn output. The label values themselves depend on the trained weights, so for the Keras model they can vary from run to run.
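The shape of the returned array decides how labels should be extracted. The following NumPy-only sketch uses made-up probabilities in place of real model output (the array `probs` is an assumption for illustration) to show what each conversion does to a single-column array.

```python
import numpy as np

# Stand-in for the output of a Keras model with one sigmoid unit:
# shape (n_samples, 1), each value is P(class = 1). Values are made up.
probs = np.array([[0.92], [0.08], [0.61], [0.35]])

# argmax over axis=1 has only one column to choose from, so it is always 0
wrong = np.argmax(probs, axis=1)

# Thresholding compares each probability with 0.5 and flattens to 1D
right = (probs > 0.5).astype(int).ravel()

print(wrong)  # [0 0 0 0]
print(right)  # [1 0 1 0]
```

If the network instead ended in a two-unit softmax layer, the output would have two columns and `np.argmax(..., axis=1)` would be the appropriate conversion.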

## When to Use sklearn `predict()` Method

Use scikit-learn’s `predict()` method when you need a straightforward and easy-to-use solution for binary classification. Scikit-learn provides a wide range of algorithms that all share the same `fit()` and `predict()` interface and return class labels directly, making it a go-to choice for quick implementations and prototyping. If simplicity and flexibility are your priorities, sklearn’s `predict()` method might be the right fit.
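As a rough illustration of that flexibility, the sketch below fits three different sklearn classifiers on the same split and calls `predict()` on each. The choice of estimators and the names `candidates` and `results` are assumptions made for this example, not a recommendation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Three unrelated algorithms, all exposing the same fit()/predict() interface
candidates = {
    "logistic regression": LogisticRegression(max_iter=10000),
    "random forest": RandomForestClassifier(random_state=42),
    "k-nearest neighbors": KNeighborsClassifier(),
}

results = {}
for name, clf in candidates.items():
    clf.fit(X_train, y_train)
    results[name] = clf.predict(X_test)  # always a 1D array of 0/1 labels
    accuracy = clf.score(X_test, y_test)
    print(f"{name}: shape={results[name].shape}, accuracy={accuracy:.3f}")
```

Swapping one algorithm for another changes only the constructor call; the prediction code stays the same.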

## When to Use Keras `predict()` Method

Keras, on the other hand, is a high-level neural networks API that is well-suited for deep learning tasks. If your binary classification problem involves complex patterns and large datasets, Keras may be more appropriate. The `predict()` method in Keras runs inference in batches (controlled by its `batch_size` argument) and returns the raw output of the final layer, so you decide how to turn that output into class labels, for example by choosing a decision threshold other than 0.5.

## Conclusion

In this blog post, we have compared the `predict()` method of sklearn and Keras for binary classification. Both libraries provide an easy-to-use API for making predictions on new data. The `predict()` method of sklearn returns a 1D array of predicted class labels, whereas the `predict()` method of Keras returns a 2D array of predicted probabilities, with shape `(n_samples, 1)` for a single sigmoid output. To get the predicted class labels from Keras, we threshold those probabilities at 0.5; the `argmax()` method of numpy applies only when the output layer has one unit per class. When choosing between sklearn and Keras for binary classification, it is important to consider the complexity of the problem and the size of the dataset. For simple problems with small datasets, sklearn may be sufficient. For more complex problems with larger datasets, Keras may provide better performance.
