Python SDK

Overview

Whether you're working in traditional Python programs or in Jupyter notebooks, the Barbara Client Libraries for Python enable in-code deployment of trained machine learning models. This article explores the functionality of the SDK, guiding you through integrating Barbara's model library and deploying models to edge nodes, all within your familiar Python environment.

Barbara VSCode Extension

Features

  • Seamless Uploads: Upload your models directly from Python code or Jupyter Notebooks to the Panel Library.
  • Simplified Deployment: Deploy models directly to edge nodes with just a few lines of code.
  • Enhanced Efficiency: Automate model training & deployment tasks and save valuable development time.
  • Streamlined Workflow: Integrate edge AI development seamlessly into your existing Python environment.

Prerequisites

To use the barbara-sdk, you must have the following:

  1. Python (version 3.8 or higher)
  2. A Barbara platform account
  3. Barbara API credentials. Access to Barbara's API requires these four credentials:
    • Username: Your Barbara username.
    • Password: Your Barbara password.
    • Client Secret: Only available to users with an Enterprise License.
    • Client Id: Only available to users with an Enterprise License.
Barbara API Credentials

API Credentials are only available with an Enterprise License.
Contact our Support Team (support@barbara.tech) to upgrade your license if necessary.

Example of Usage

Let's train a survival prediction model for the Titanic disaster in Jupyter Notebook!

We'll use a dataset called titanic.csv that contains information about the passengers, including their survival status, tickets, and gender. By training a model on this data, we'll be able to predict the likelihood of survival for new passengers based on their characteristics.

Install the library

To install the library, open a console and run:

pip install barbara-sdk

Download the example

To get started, download the Titanic Jupyter Notebook example which includes all the necessary files to train a survival prediction model.

note

Download the Titanic example in a zip file from here.

What's included: Once downloaded, unzip the file to find:

  1. titanic_example.ipynb: This Jupyter Notebook file contains the Python code for the example.
  2. titanic.csv: This file contains the passenger data for the Titanic disaster.
  3. credentials.json: This file is to be filled in with your API credentials.

Fill in Credentials

Open the credentials.json file and fill in the four credentials described in the Prerequisites section to access the Barbara API.
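For reference, the keys in credentials.json match the four arguments later passed to barbara.ApiClient. A minimal sketch of the file's structure (the placeholder values are assumptions; replace them with your own credentials):

```json
{
  "username": "your-barbara-username",
  "password": "your-barbara-password",
  "client_id": "your-client-id",
  "client_secret": "your-client-secret"
}
```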

Fill in credentials

Open the example in Jupyter Notebook

Jupyter Notebook

Train the model using titanic data

Import all necessary libraries

import logging
import json
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import metrics
from keras.models import Sequential
from keras.layers import Dense
import seaborn as sns
import matplotlib.pyplot as plt

Import barbara-sdk client library

import barbara

Get the titanic data from csv file

data = pd.read_csv('titanic.csv')

This line reads a comma-separated values (CSV) file named titanic.csv into a pandas DataFrame object and stores it in the variable data.

Raw data imported

Clean the data imported from csv file

Overall, this code snippet cleans, transforms, and prepares a data object for further analysis. It handles missing values, converts data types, encodes categorical data, and creates a new feature based on existing ones. Finally, it selects relevant columns, removes rows with missing data and shows the dataframe.

data.replace('?', np.nan, inplace= True)
data = data.astype({'age': np.float64, 'fare': np.float64})
data.replace({'male': 1, 'female': 0}, inplace=True)
data['relatives'] = data.apply(lambda row: int((row['sibsp'] + row['parch']) > 0), axis=1)
data.corr(numeric_only=True).abs()[["survived"]]
data = data[['sex', 'pclass','age','relatives','fare','survived']].dropna()
data.head()

Cleaned data

Visualize data in graphs

In this part of the code, the data is presented graphically:

  1. Age distribution by survival and sex (violin plot type)
  2. Survival distribution by Passenger Class and sex (point plot type)
  3. Fare distribution by survival and sex (violin plot type)

fig, axs = plt.subplots(ncols=3, figsize=(20,3))
sns.violinplot(x="survived", y="age", hue="sex", data=data, ax=axs[0])
sns.pointplot(x="pclass", y="survived", hue="sex", data=data, ax=axs[1])
sns.violinplot(x="survived", y="fare", hue="sex", data=data, ax=axs[2])

Plot Data

Split the data into train and test sets

This code splits the data from the Titanic dataset into training and testing sets. The training set (80%) will be used to train a machine learning model to predict passenger survival based on features like sex, class, age, etc. The testing set (20%) will be used to evaluate the model's performance on unseen data.

x_train, x_test, y_train, y_test = train_test_split(data[['sex','pclass','age','relatives','fare']], data.survived, test_size=0.2, random_state=0)

Standardize data

This code prepares the data for machine learning by performing standardization. It creates a StandardScaler object that analyzes the training data to understand the distribution of each feature (e.g. age, fare). Then, it uses this information to transform both the training and testing data. This transformation scales the features to have a zero mean and unit standard deviation, ensuring all features have similar influence during model training and potentially improving the machine learning model's performance.

sc = StandardScaler()
X_train = sc.fit_transform(x_train)
X_test = sc.transform(x_test)

Did you know...

Standardization is a common data preprocessing step in machine learning because it improves the performance of many machine learning algorithms. By scaling the features to have a similar range, it ensures that all features contribute equally to the model during training. This can lead to faster convergence and better overall accuracy of the model.
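To see what StandardScaler is doing under the hood, here is a minimal pure-Python sketch of the same transformation (an illustration of the idea, not the sklearn implementation):

```python
# Illustration of standardization: rescale values to zero mean and unit
# standard deviation, as StandardScaler does for each feature column.

def standardize(values):
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [(v - mean) / std for v in values]

ages = [22.0, 38.0, 26.0, 35.0, 29.0]  # hypothetical sample of the age feature
scaled = standardize(ages)
print([round(v, 3) for v in scaled])
```

Note that in the notebook the scaler is fitted on the training data only (fit_transform) and the same mean and standard deviation are reused on the test data (transform), so the model is never tuned on statistics of the test set.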

Define the model

This code defines a neural network architecture for predicting passenger survival on the Titanic:

  • It creates a sequential model with two hidden layers, each containing 5 neurons and using the ReLU activation function.
  • The input layer size is set to the number of features (5: sex, pclass, age, relatives, and fare).
  • Finally, a single output neuron with a sigmoid activation function predicts the probability of survival (between 0 and 1).

model = Sequential()
model.add(Dense(5, kernel_initializer = 'uniform', activation = 'relu', input_dim = 5))
model.add(Dense(5, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dense(1, kernel_initializer = 'uniform', activation = 'sigmoid'))
model.summary()

The model summary is printed to show the network's structure.

Model Summary

Train the model

This code trains the neural network for predicting passenger survival:

  • It configures the training process with the Adam optimizer to adjust the network's weights, uses binary crossentropy to measure prediction errors, and tracks accuracy as a performance metric.
  • Finally, it trains the model for 50 epochs, feeding it the training data (features and survival labels) in batches of 32 samples. By iterating through the data and adjusting the weights, the model learns to predict survival from passenger information.

model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=32, epochs=50)

Test the trained model

This code evaluates the model's performance on unseen data:

  • It uses the trained model to predict survival probabilities for the testing set.
  • Then, it rounds these probabilities to 0 (not survived) or 1 (survived) to create binary predictions.
  • Finally, it calculates the accuracy score to see how often the model's predictions match the actual survival labels in the testing data. This score indicates how well the model generalizes to unseen data.

y_pred = np.rint(model.predict(X_test).flatten())
print(metrics.accuracy_score(y_test, y_pred))
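The rounding-and-accuracy step can be sketched in plain Python (toy numbers for illustration; the notebook uses np.rint and metrics.accuracy_score on the real test set):

```python
# Toy illustration: round predicted survival probabilities to 0/1,
# then compute the fraction of predictions that match the true labels.

probs = [0.92, 0.13, 0.48, 0.75, 0.31]   # hypothetical model outputs
y_true = [1, 0, 1, 1, 0]                 # hypothetical ground-truth labels

y_pred = [round(p) for p in probs]       # e.g. 0.48 rounds down to 0
accuracy = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)
print(y_pred, accuracy)  # 4 of 5 predictions match
```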

Model Accuracy in test data

Save the trained model

This code snippet exports the trained machine learning model (stored in model) to the path titanic/1, where titanic is the model name and 1 is its version number.

model.export("titanic/1")

Deploy the model to your Edge Node

Get Barbara Handler for the SDK

This code sets up authentication and logging for interacting with the Barbara API. Here's a breakdown:

  1. Load the credentials from the credentials.json file:

f = open('./credentials.json')
cred = json.load(f)
f.close()

  2. Configure logging:

logging.basicConfig(format='%(asctime)s %(levelname)s %(message)s', level=logging.INFO)

  3. Create an instance of ApiClient from the barbara library using the credentials from the JSON file:

bbr = barbara.ApiClient(cred['client_id'], cred['client_secret'], cred['username'], cred['password'])

Upload the trained model to the library

You can upload the model, assigning it the name defined in the variable model_name, with the following code:

model_name = "Titanic"
bbr.models.upload("./titanic", model_name)

Use the bbr.models.list() function to view a list of all models currently stored in your library.

bbr.models.list()

Model list

For a specific model (like Titanic in this example), use bbr.models.list_versions(model_name) to see all the different versions you've uploaded. This helps track changes made to your model over time.

bbr.models.list_versions(model_name)

Version list

Deploy the model to an Edge Node

First we can list our available Edge Nodes to check the name of the node we want to deploy to:

bbr.nodes.list()

Edge Nodes list

Once we know the node name (in this case we want to deploy to Virtual Machine (amd64)) we can use the following function:

node_name = "Virtual Machine (amd64)"
model_name = "Titanic"
bbr.models.deploy(node_name, model_name)

The model will be deployed to the Edge Node and served automatically. We can check it by listing every workload:

bbr.workloads.list()

Workload list

Start or stop serving a model

Once we know the workload_id of our model, we can stop serving it:

workload_id = '666844535f6107c90fc2d6f6'
bbr.workloads.stop(workload_id)

Or start serving it again:

bbr.workloads.start(workload_id)

Uploading a new version of the model

If we have modified the trained model and want to deploy it again, we can create another version as follows:

  1. Upload the new version of the model, located in ./titanic, to the library:

bbr.models.upload("./titanic", model_name)

  2. Deploy the new version of the model to the node defined in the variable node_name:

bbr.models.deploy(node_name, model_name)

Removing the model from the Edge Node

We can remove the model from the Edge Node by removing its workload:

bbr.workloads.remove(workload_id)

Removing the model from the library

Finally, we can remove the model from the library:

bbr.models.delete(model_name)