Introducing

Guidepad's ML Plugin

The guidepad-ML plugin is an extension of the Guidepad platform that supports the end-to-end ML lifecycle. The plugin includes a feature store, a model store, and model deployment tools. The feature store retrieves feature data for model training and inference. The model store manages model metadata, keeps track of Python package requirements and model artifacts, and contains tools for uploading model artifacts to, and retrieving them from, cloud object storage. The model deployment tools assist with building and pushing a Docker image that hosts a model behind a web server. This notebook demonstrates these tools in a use case where a customer-to-customer fashion store wants to predict the rating its users would give the company's mobile app.

Demo Problem Description

We have a dataset of mobile app reviews for a company's customer-to-customer fashion store app. The company wants to prompt users to rate the app if a user is likely to leave a positive rating. The company's data science team is tasked with training a model for this purpose and decides to use an additional dataset of historical user statistics. To produce a training dataset, the team needs to combine these two datasets with a point-in-time join, which ensures the model doesn't have access to future data during training. After training a model to predict app ratings, the company wants to deploy it with access to online user feature data so that it can generate predictions on demand.

In this demo, we will:

  1. Represent our feature data as a Guidepad type
  2. Load data into our offline feature store
  3. Create a training dataset using a dataset of time series features and labels
  4. Move data from the offline store to the online store
  5. Create a training dataset from our offline store and use it to train a model
  6. Build and push a Docker image for the model using the Guidepad CLI
  7. Deploy the model, which retrieves features from the online store to generate predictions

Walkthroughs of using model versioning, scheduling jobs to move data from the offline store to online store, and scheduling jobs for model retraining will be included in other demos.

import pandas as pd
import numpy as np
import datetime
import pytz

import sys
sys.path.append('/home/tommy/Projects/guidepad')
sys.path.append('/home/tommy/Projects/guidepad-ml')

import os
os.environ['GUIDEPAD_ENV_FILENAME'] = '.env'

import guidepad
from guidepad.types import attributes
from guidepad_ml.types import Entity, EntityView, TimeDelta

guidepad.initialize()

This dataset of e-commerce users contains the following columns:

  • user_id: a unique identifier for a user
  • country_code: the country where a user lives
  • products_sold: the number of products a user has sold
  • has_ios_app: a boolean indicating if the user has the e-commerce iOS app
  • has_profile_picture: a boolean indicating whether the user has a profile picture
  • days_since_last_login: the number of days since the user has most recently logged in
  • num_products_liked: the number of products the user has liked
  • num_followers: the number of users who follow the user
  • num_follows: the number of users that the user is following
  • event_timestamp: the timestamp of when this data was recorded
users_df = pd.read_parquet('users.parquet')
users_df['event_timestamp'] = pd.to_datetime(users_df['event_timestamp'])
users_df.head()
     user_id               country_code  products_sold  has_ios_app  has_profile_picture  days_since_last_login  num_products_liked  num_followers  num_follows  event_timestamp
46   -1369438276320193587  fr            1              True         False                127                    0                   31             8            2020-01-01 00:00:00+00:00
114  4907046938384800140   us            0              True         False                15                     0                   28             29           2020-01-01 00:00:00+00:00
164  -5922910413286749505  gb            3              True         True                 665                    0                   10             8            2020-01-01 00:00:00+00:00
295  27416093258331010     it            0              True         True                 488                    150                 18             9            2020-01-01 00:00:00+00:00
863  -5083503492646689429  hk            0              True         False                173                    3                   27             8            2020-01-01 00:00:00+00:00

Define a Guidepad Entity type that models your feature data

Code in the Guidepad ML plugin resides in a separate, dedicated git repo. Each of your ML projects can be organized in a directory in this repo. Each directory can contain code to define your entity, entity view, and model classes, as well as code for loading your model, inference, and model retraining.

%%writefile ../guidepad_ml/service/catalog/app_rating/entities.py
from guidepad_ml.types import Entity, TimeDelta
from guidepad.types import attributes
import datetime

class C2CStoreUser(Entity):
    user_id = attributes.String(default='')
    country_code = attributes.String(default='')
    products_sold = attributes.Int(default=0)
    has_ios_app = attributes.Bool(default=False)
    has_profile_picture = attributes.Bool(default=False)
    days_since_last_login = attributes.Int(default=0)
    num_products_liked = attributes.Int(default=0)
    num_follows = attributes.Int(default=0)
    num_followers = attributes.Int(default=0)
    # Note: utcnow() is evaluated once, when the class is defined, not per instance.
    event_timestamp = attributes.DateTime(default=datetime.datetime.utcnow())

    class TypeConfig:
        instancestore_name = 'c2cStoreUser'

from guidepad_ml.service.catalog.app_rating.entities import C2CStoreUser

Create an entity view that represents a logical grouping of the feature data you will use for your model.

The entity view defines additional metadata about your entity such as where online features are stored, how long online features are active (ttl), and the column names used for the entity id and the event timestamp.

# Define an EntityView for user stats
user_entity_view = EntityView()
user_entity_view._id = 'a1129bc8ea7d4df9a08f2dc0d354c54f'
user_entity_view.name = 'c2c_store_user_stats'
user_entity_view.online_store_name = 'onlineStore'
user_entity_view.online_store_uri = '<>'
user_entity_view.ttl = TimeDelta()
user_entity_view.ttl.first.days = 2000
user_entity_view.entity_type = C2CStoreUser
user_entity_view.entity_id_column = 'user_id'
user_entity_view.event_timestamp_column = 'event_timestamp'
user_entity_view.most_recent_materialization_timestamp = datetime.datetime(1900, 1, 1, 12)
user_entity_view.save()
0
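
To make the ttl setting concrete, here is a minimal sketch of how a time-to-live window can be read. It assumes a feature row counts as active while its age is at most the ttl; that interpretation and the helper name is_active are illustrative assumptions, not Guidepad's actual implementation.

```python
from datetime import datetime, timedelta, timezone


def is_active(event_timestamp: datetime, as_of: datetime, ttl: timedelta) -> bool:
    """Return True if a feature row is still inside its time-to-live window.

    Assumption (not taken from Guidepad's source): a row is active when it was
    recorded at or before `as_of` and is no older than `ttl`.
    """
    age = as_of - event_timestamp
    return timedelta(0) <= age <= ttl


ttl = timedelta(days=2000)  # same value as user_entity_view.ttl in this demo
recorded = datetime(2020, 1, 1, tzinfo=timezone.utc)
as_of = datetime(2023, 1, 6, tzinfo=timezone.utc)

print((as_of - recorded).days)          # 1101
print(is_active(recorded, as_of, ttl))  # True
```

Under this reading, a 2000-day ttl keeps the 2020 snapshots eligible for the online store well beyond the dates used in this demo.
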
# Load the user rows into the offline store, but only if no C2CStoreUser record exists yet.
if C2CStoreUser.list_single() is None:
    for _, row in users_df.iterrows():
        C2CStoreUser(**row).save()

App rating dataset

Our app rating dataset contains the following columns:

  • user_id: a unique identifier for a user
  • rating: the rating the user gave to the app
  • event_timestamp: the timestamp of when the user gave the rating
user_app_ratings_df = pd.read_parquet('app_ratings.parquet')
user_app_ratings_df['event_timestamp'] = pd.to_datetime(user_app_ratings_df['event_timestamp'])
user_app_ratings_df.head()
   user_id               rating  event_timestamp
0  6870940546848049750   4.5     2020-08-03 13:02:14+00:00
1  5724841529455566712   2.0     2020-02-05 08:15:37+00:00
2  -685666612473067274   5.0     2020-01-19 18:40:58+00:00
3  -3692647451487730241  5.0     2020-03-24 21:12:06+00:00
4  245123921236534929    4.0     2020-04-11 18:04:30+00:00

Build a training set. Use the feature store to retrieve user data from within a specific time range

We want the feature values of each entity as of the time of each rating.

[Figure: visual representation of the point-in-time join performed by user_entity_view.get_historical_features()]

Guidepad performs a point-in-time join of your app ratings dataset with the user feature data stored in the offline feature store: each rating is matched with the most recent feature values recorded for that user as of the rating's timestamp.
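
For readers who want to see the idea in isolation, a point-in-time join can be sketched with pandas.merge_asof. This is a stand-in technique on made-up toy data (the ids u1 and u2 and their values are hypothetical), not a description of how Guidepad implements get_historical_features().

```python
import pandas as pd

# Toy stand-ins for the offline feature store and the ratings dataset.
features = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2"],
    "event_timestamp": pd.to_datetime(
        ["2020-01-01", "2020-07-01", "2021-01-01", "2020-01-01"], utc=True
    ),
    "products_sold": [163, 152, 177, 5],
})
ratings = pd.DataFrame({
    "user_id": ["u1", "u2"],
    "event_timestamp": pd.to_datetime(
        ["2020-08-03 13:02:14", "2020-02-05 08:15:37"], utc=True
    ),
    "rating": [4.5, 2.0],
})

# merge_asof requires both frames to be sorted by their time key.
joined = pd.merge_asof(
    ratings.sort_values("event_timestamp"),
    features.sort_values("event_timestamp").rename(
        columns={"event_timestamp": "feature_timestamp"}
    ),
    left_on="event_timestamp",
    right_on="feature_timestamp",
    by="user_id",
    direction="backward",  # latest feature row at or before each rating
)
print(joined[["user_id", "rating", "feature_timestamp", "products_sold"]])
```

Because the match only looks backward in time, no rating row can pick up feature values that were recorded after the rating was given.
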

train_df = user_entity_view.get_historical_features(
    entity_df=user_app_ratings_df
)
train_df['has_profile_picture'] = train_df['has_profile_picture'].astype(int)

train_df.head()
   user_id               rating  event_timestamp            event_id  country_code  products_sold  has_ios_app  has_profile_picture  days_since_last_login  num_products_liked  num_follows  num_followers
0  -685666612473067274   5.0     2020-01-19 18:40:58+00:00  100       it            41             True         0                    11                     544                 450          100
1  -7659168937010608427  3.0     2020-01-25 12:22:05+00:00  100       es            65             True         1                    11                     66                  9            51
2  1205881756843387030   1.5     2020-01-31 17:57:10+00:00  100       it            84             True         1                    11                     0                   8            45
3  2457018450561830086   2.0     2020-02-02 13:32:29+00:00  100       it            7              True         0                    11                     1                   108          59
4  5724841529455566712   2.0     2020-02-05 08:15:37+00:00  100       it            66             True         0                    15                     7                   431          103

User 6870940546848049750 has feature data logged on 2020-01-01, 2020-07-01, 2021-01-01, and 2021-07-01.

This user's rating was given on 2020-08-03, so the join retrieves the feature data logged on 2020-07-01: the most recent feature data for this user as of the rating date.

train_df[train_df['user_id']=='6870940546848049750']
    user_id               rating  event_timestamp            event_id  country_code  products_sold  has_ios_app  has_profile_picture  days_since_last_login  num_products_liked  num_follows  num_followers
11  6870940546848049750   4.5     2020-08-03 13:02:14+00:00  100       fr            152            True         0                    4                      63                  13           143

users_df[users_df['user_id']=='6870940546848049750']
      user_id               country_code  products_sold  has_ios_app  has_profile_picture  days_since_last_login  num_products_liked  num_followers  num_follows  event_timestamp
3215  6870940546848049750   fr            163            True         False                11                     60                  137            13           2020-01-01 00:00:00+00:00
0     6870940546848049750   fr            152            True         False                4                      63                  143            13           2020-07-01 00:00:00+00:00
1     6870940546848049750   fr            177            True         True                 1                      65                  154            13           2021-01-01 00:00:00+00:00
2     6870940546848049750   us            203            True         True                 2                      70                  165            14           2021-07-01 00:00:00+00:00
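
The selection for this user can be checked by hand with a few lines of standard-library Python. The helper name latest_snapshot_as_of is made up for this illustration; the dates are the ones shown in the tables above.

```python
from bisect import bisect_right
from datetime import datetime, timezone

# Snapshot dates logged for user 6870940546848049750 (from users_df above).
snapshots = [
    datetime(2020, 1, 1, tzinfo=timezone.utc),
    datetime(2020, 7, 1, tzinfo=timezone.utc),
    datetime(2021, 1, 1, tzinfo=timezone.utc),
    datetime(2021, 7, 1, tzinfo=timezone.utc),
]
rating_time = datetime(2020, 8, 3, 13, 2, 14, tzinfo=timezone.utc)


def latest_snapshot_as_of(sorted_times, as_of):
    """Return the newest timestamp that is not later than as_of, or None."""
    idx = bisect_right(sorted_times, as_of)
    return sorted_times[idx - 1] if idx > 0 else None


print(latest_snapshot_as_of(snapshots, rating_time))  # 2020-07-01 00:00:00+00:00
```
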

Train a model to predict app rating

We specify which features to use and take the 'rating' column as the target variable. We scale the feature data based on values in the training set and then train a linear regression model. We also record the scaling parameters so that the same scaling can be applied at inference time, when predictions on new, incoming data are needed.

from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
import joblib

feature_cols = [
    'products_sold',
    'has_profile_picture',
    'days_since_last_login',
    'num_products_liked',
    'num_followers',
    'num_follows'
]
X_train = train_df.loc[:, feature_cols]
y_train = train_df.loc[:, 'rating']

# Scale features
scaler = preprocessing.StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)

lr = LinearRegression().fit(X_train_scaled, y_train)
model_filename = 'lr_model.pkl'
joblib.dump(lr, model_filename)

# We will save the parameters of the scaler to use 
# when preprocessing data after the model is deployed
mean_array = scaler.mean_
var_array = scaler.var_

# alternatively, you could save the scaler to a pickle file,
# store it in S3, and retrieve it for use in Model._load_model()
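
Since the deployed model re-applies the scaling by hand from mean_ and var_, it can help to see that computation on its own. The sketch below uses only NumPy and toy numbers; scale_features is a hypothetical helper, not part of Guidepad or scikit-learn. It also guards against zero-variance columns, which StandardScaler handles by dividing by 1.0 and which a plain division by sqrt(var) would not.

```python
import numpy as np


def scale_features(data, means, variances):
    """Standardize columns using stored training statistics.

    Computes (x - mean) / sqrt(var). Columns with zero variance are divided
    by 1.0 instead, which is how scikit-learn's StandardScaler treats them.
    """
    data = np.asarray(data, dtype=float)
    scale = np.sqrt(np.asarray(variances, dtype=float))
    scale[scale == 0.0] = 1.0  # avoid division by zero for constant features
    return (data - np.asarray(means, dtype=float)) / scale


# Toy training matrix: the second column is constant.
X = np.array([[1.0, 7.0], [3.0, 7.0], [5.0, 7.0]])
means = X.mean(axis=0)
variances = X.var(axis=0)  # population variance (ddof=0), like scaler.var_

scaled = scale_features(X, means, variances)
print(scaled)
```

If any of your features can be constant in the training set, a guard like this keeps inference from producing NaN or infinite inputs.
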

Create a custom Model class for your model

  • Define a custom _load_model method that uses the Artifact module to load the model from S3
  • Define a custom run_model method that retrieves feature data from the online store, scales the feature data, and returns the prediction(s)
  • Define a helper method _set_online_store that adds the online store to Guidepad's set of active session datastores
  • Define input and output types for your model

The code for this can be placed in /guidepad_ml/service/catalog/app_rating/app_rating.py

%%writefile ../guidepad_ml/service/catalog/app_rating/app_rating.py
import guidepad
from guidepad.datastore.database import MongoDatastore
from guidepad.types import BaseType, attributes
from guidepad_ml.service.model import Model
from guidepad_ml.types import EntityView
from werkzeug.utils import secure_filename
import pandas as pd
import numpy as np
import datetime
import joblib

class AppRatingModelInput(BaseType):
    user_id = attributes.List(contains=attributes.String(default=''))


class AppRatingModelOutput(BaseType):
    predictions = attributes.List(
        contains=attributes.Nested(
            attributes={
                'user_id': attributes.String(default=''),
                'rating': attributes.Float(default=0.0)
            }
        ))


class AppRatingModel(Model):
    service_type = attributes.String(default='ml.model.app_rating')
    entity_view = attributes.Reference(referenced_type=EntityView)
    feature_scaling = attributes.Nested(
        attributes={
            'feature_means': attributes.List(contains=attributes.Float(default=0.0)),
            'feature_vars': attributes.List(contains=attributes.Float(default=0.0))
        }
    )

    def _load_model(self):
        with self.model_artifact.first.open(mode='rb') as m:
            self.model = joblib.load(m)

        self.feature_cols = [
            'products_sold',
            'has_profile_picture',
            'days_since_last_login',
            'num_products_liked',
            'num_followers',
            'num_follows'
        ]
        self._set_online_store()

    def run_model(self, model_input):
        # Look up the latest online features for the requested users.
        entity_df = pd.DataFrame({
            'user_id': model_input.user_id
        })
        df = self.entity_view.first.get_online_features(
            entity_df=entity_df
        )
        # Read the ids back from the result so that each prediction is paired
        # with the row it was computed from.
        entity_ids = list(df[self.entity_view.first.entity_id_column])
        data = df[self.feature_cols].to_numpy()
        # Standardize with the training-set statistics: (x - mean) / sqrt(var).
        scaled_data = np.divide(
            np.subtract(data, self.feature_scaling.feature_means),
            np.sqrt(self.feature_scaling.feature_vars)
        )
        pred = self.model.predict(scaled_data)
        output = [{
            'user_id': entity_id,
            'rating': float(p)
        } for entity_id, p in zip(entity_ids, pred)]

        return output
    
    def _set_online_store(self):
        online_store = MongoDatastore(
            name = self.entity_view.first.online_store_name,
            uri = self.entity_view.first.online_store_uri
        )
        online_store.database = 'onlineStore'
        guidepad.datastore.session_datastores.add_datastore(online_store)

Create an instance of this class. In this demo, we only have one version of this model so only one instance is needed.

  • The entity_view associated with the model provides functionality for easily retrieving online feature values.
  • The feature_scaling attribute is used to scale features during inference.
  • The package_requirements attribute is a list of Python packages that are installed with pip when building an image for the model.
  • Because all Models are Services, your Model instance can also be managed via the Guidepad user interface.

from guidepad.service.state.plan.plan import StatePlan
from guidepad_ml.service.catalog.app_rating import (
    AppRatingModel,
    AppRatingModelInput,
    AppRatingModelOutput
)   

model = AppRatingModel()
model._id = 'c55dca6c5cea4d6dad565157fee02c3f'
model.name = 'app_rating_model'
model.model_input = AppRatingModelInput
model.model_output = AppRatingModelOutput
model.entity_view = user_entity_view
model.feature_scaling = {
    'feature_means': mean_array,
    'feature_vars': var_array
}
model.version = '1.1'
model.package_requirements = [
    'scikit-learn==1.2.0',
    'pandas==1.5.2'
]

model.save()
0
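
Rather than typing the pinned versions for package_requirements by hand, they could be read from the training environment. The following is a small sketch using the standard library's importlib.metadata; pin_requirements is a hypothetical helper, and the stubbed version table simply repeats the versions used in this demo.

```python
from importlib import metadata


def pin_requirements(package_names, version_of=metadata.version):
    """Return 'name==version' strings for the given distribution names.

    `version_of` is injectable so the helper can be tested without relying on
    what happens to be installed.
    """
    return [f"{name}=={version_of(name)}" for name in package_names]


# Example with a stubbed lookup; omit `version_of` to query the real environment.
fake_versions = {"scikit-learn": "1.2.0", "pandas": "1.5.2"}
pinned = pin_requirements(["scikit-learn", "pandas"], version_of=fake_versions.__getitem__)
print(pinned)  # ['scikit-learn==1.2.0', 'pandas==1.5.2']
```

Pinning to the versions used during training reduces the chance that the image built for the model unpickles the artifact with an incompatible library version.
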

The Model class provides the methods create_artifact and upload_artifact.

  • Use these to create an Artifact record associated with your model and to upload the model artifact to cloud object storage.

model.create_artifact(filename='lr_model.pkl', backend='s3')
model.upload_artifact()
Model artifact already exists

Move feature data to the online store

First, add your online store to Guidepad's session datastores (with model._set_online_store()). Then call user_entity_view.materialize() to copy the active features to the online store.

This addresses one of the pain points of deploying a model in production: making sure it has access to up-to-date feature data that is transformed in the same way as it was during training.

model._set_online_store()

user_entity_view.materialize(
    end_datetime=pd.Timestamp.utcnow(),
    incremental=False
)
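
Conceptually, materialization keeps one current row per entity. The pandas sketch below illustrates that idea on toy data: for each user it keeps the newest row recorded at or before end_datetime. It is an assumption about the general technique, not Guidepad's implementation, and it ignores the ttl; latest_rows and the ids u1 and u2 are hypothetical.

```python
import pandas as pd


def latest_rows(features: pd.DataFrame, end_datetime: pd.Timestamp,
                id_col: str = "user_id",
                ts_col: str = "event_timestamp") -> pd.DataFrame:
    """Keep, for each entity, the newest row recorded at or before end_datetime."""
    eligible = features[features[ts_col] <= end_datetime]
    newest = eligible.sort_values(ts_col).drop_duplicates(subset=id_col, keep="last")
    return newest.reset_index(drop=True)


# Toy offline data: u1 has two snapshots, u2 has one.
features = pd.DataFrame({
    "user_id": ["u1", "u1", "u2"],
    "event_timestamp": pd.to_datetime(
        ["2020-01-01", "2021-07-01", "2020-01-01"], utc=True
    ),
    "products_sold": [163, 203, 152],
})

online = latest_rows(features, pd.Timestamp("2023-01-06", tz="UTC"))
print(online)
```
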

Verify that we can retrieve features from the online store.

Once deployed, the app's inference code will make use of the .get_online_features() method to retrieve online features for the entities of interest

user_entity_view.get_online_features(
    entity_df = pd.DataFrame({
        'user_id': ['6870940546848049750']
    })
)
   event_id  user_id               country_code  products_sold  has_ios_app  has_profile_picture  days_since_last_login  num_products_liked  num_follows  num_followers  event_timestamp
0  100       6870940546848049750   us            203            True         True                 2                      70                  14           165            2021-07-01T00:00:00+00:00

You can retrieve feature data for multiple entities at once

user_entity_view.get_online_features(
    entity_df = pd.DataFrame({
        'user_id': ['6870940546848049750', '-4640272621319568052']
    })
)
   event_id  user_id               country_code  products_sold  has_ios_app  has_profile_picture  days_since_last_login  num_products_liked  num_follows  num_followers  event_timestamp
0  100       -4640272621319568052  us            152            True         False                12                     14                  10           131            2020-01-01T00:00:00+00:00
1  100       6870940546848049750   us            203            True         True                 2                      70                  14           165            2021-07-01T00:00:00+00:00
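
Note that in the output above the rows come back in a different order from the request. run_model copes with this by reading the ids from the returned frame. If a caller needs the rows in request order, or wants to know which ids had no online features, a helper along these lines could be used. It is a sketch on a toy frame; align_to_request is hypothetical, and whether get_online_features omits unknown ids is an assumption.

```python
import pandas as pd


def align_to_request(requested_ids, online_df, id_col="user_id"):
    """Reorder online features to match the request and report missing ids."""
    indexed = online_df.set_index(id_col)
    found = [uid for uid in requested_ids if uid in indexed.index]
    missing = [uid for uid in requested_ids if uid not in indexed.index]
    return indexed.loc[found].reset_index(), missing


# Toy frame mirroring the two rows returned above (only one feature kept).
online_df = pd.DataFrame({
    "user_id": ["-4640272621319568052", "6870940546848049750"],
    "products_sold": [152, 203],
})

aligned, missing = align_to_request(
    ["6870940546848049750", "-4640272621319568052", "not-in-store"],
    online_df,
)
print(list(aligned["user_id"]))
print(missing)
```
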

Run the model locally

source /home/tommy/Projects/guidepad/venv/bin/activate

export PYTHONPATH="/home/tommy/Projects/guidepad"

export KUBECONFIG="/home/tommy/.kube/admin_kubeconfig"

guidepad service run --service-id c55dca6c5cea4d6dad565157fee02c3f

Retrieve app rating predictions for user(s)

You can retrieve app ratings for one or more users at once

import requests

res = requests.post(
    'http://localhost:5000',
    json={
        'user_id': ['-4640272621319568052']
    }
)

print(res.json())

res = requests.post(
    'http://localhost:5000',
    json={
        'user_id': ['-4640272621319568052', '-1369438276320193587']
    }
)
print(res.json())

Build a Docker image for the model and push the image to a remote Docker registry

When you ran model = AppRatingModel(), your model was given a set of default state-plans, including one for building and pushing an image. When you apply this build state-plan, the image is built and pushed to a container registry, and an Artifact record corresponding to the image is saved. This Artifact and its associated image are then used when deploying the Model.

You can use the Guidepad CLI to apply the build state-plan of your Model.

%cd /home/tommy/Projects/guidepad
# `!export` only affects its own subshell, so set the variable with %env instead
%env PYTHONPATH=/home/tommy/Projects/guidepad
!python3.10 -m guidepad.cli.cli service apply-state-plan --service-id '{model._id}' --state-plan-id default_model_k8s_build
    /home/tommy/Projects/guidepad
    [2023-01-06 13:10:03,074] {control_plane.py:112} DEBUG - Examining phase 7f04160abf264013b9eefd29c2ca0519 of state plan default_model_k8s_build for execution.
    [2023-01-06 13:10:03,148] {control_plane.py:43} DEBUG - Executing phase 7f04160abf264013b9eefd29c2ca0519.
    [2023-01-06 13:10:03,582] {control_plane.py:57} DEBUG - Examining stage 15ad834f10c645efa703e097efdfb00d of phase 7f04160abf264013b9eefd29c2ca0519 for execution.
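
Outside a notebook, the same CLI call can be issued from a Python script. The sketch below only assembles the command shown above (module path, subcommand, and flags are copied from it); state_plan_command and apply_state_plan are hypothetical helper names, and actually running the command requires a configured Guidepad installation.

```python
import subprocess
import sys


def state_plan_command(service_id, state_plan_id, python=sys.executable):
    """Build the argv list for applying a state plan through the Guidepad CLI."""
    return [
        python, "-m", "guidepad.cli.cli",
        "service", "apply-state-plan",
        "--service-id", service_id,
        "--state-plan-id", state_plan_id,
    ]


def apply_state_plan(service_id, state_plan_id, cwd=None):
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    return subprocess.run(
        state_plan_command(service_id, state_plan_id), check=True, cwd=cwd
    )


cmd = state_plan_command("c55dca6c5cea4d6dad565157fee02c3f", "default_model_k8s_build")
print(" ".join(cmd[1:]))
```

Passing the arguments as a list avoids shell quoting issues with the service id.
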

To deploy your Model, you can use the default deploy StatePlan.

If you want to deploy to a specific Environment, for example a Kubernetes cluster, you can specify that in your StatePlan or provide the Environment as an option to StatePlan execution:

from guidepad.environment.k8s import KubernetesEnvironment

model.state_plans[1].phases[0].environment = KubernetesEnvironment.list_single({
    'name': 'guidepad-dev'
})

model.save()

Use the Guidepad CLI to deploy the model

!python3.10 -m guidepad.cli.cli service apply-state-plan --service-id '{model._id}' --state-plan-id default_k8s_deployed
    [2023-01-06 14:39:35,916] {control_plane.py:112} DEBUG - Examining phase 4a64873bfada479bbae31f3749b52b35 of state plan default_k8s_deployed for execution.
    [2023-01-06 14:39:35,992] {control_plane.py:43} DEBUG - Executing phase 4a64873bfada479bbae31f3749b52b35.
    [2023-01-06 14:39:36,152] {control_plane.py:57} DEBUG - Examining stage c87eb0a4cb9246489fc8f75514990f50 of phase 4a64873bfada479bbae31f3749b52b35 for execution.
    [2023-01-06 14:39:45,565] {control_plane.py:236} DEBUG - Running ansible-playbook with command: ansible-playbook -i 127.0.0.1, --connection=local -e "@/home/tommy/.guidepad/state_plan/exec/default_k8s_deployed/20230106T193935.916429/4a64873bfada479bbae31f3749b52b35/c87eb0a4cb9246489fc8f75514990f50/executor_stage_configuration.json" -e host_key_checking=False -e ansible_python_interpreter=/usr/bin/python3.10 -e exec_config_path=/home/tommy/.guidepad/state_plan/exec/default_k8s_deployed/20230106T193935.916429/4a64873bfada479bbae31f3749b52b35/c87eb0a4cb9246489fc8f75514990f50/executor_stage_configuration.json /home/tommy/.guidepad/state_plan/exec/default_k8s_deployed/20230106T193935.916429/4a64873bfada479bbae31f3749b52b35/c87eb0a4cb9246489fc8f75514990f50/stage_playbook.yml
    [2023-01-06 14:39:45,565] {control_plane.py:239} DEBUG - Performing state plan execution with command: ansible-playbook -i 127.0.0.1, --connection=local -e "@/home/tommy/.guidepad/state_plan/exec/default_k8s_deployed/20230106T193935.916429/4a64873bfada479bbae31f3749b52b35/c87eb0a4cb9246489fc8f75514990f50/executor_stage_configuration.json" -e host_key_checking=False -e ansible_python_interpreter=/usr/bin/python3.10 -e exec_config_path=/home/tommy/.guidepad/state_plan/exec/default_k8s_deployed/20230106T193935.916429/4a64873bfada479bbae31f3749b52b35/c87eb0a4cb9246489fc8f75514990f50/executor_stage_configuration.json /home/tommy/.guidepad/state_plan/exec/default_k8s_deployed/20230106T193935.916429/4a64873bfada479bbae31f3749b52b35/c87eb0a4cb9246489fc8f75514990f50/stage_playbook.yml
    [2023-01-06 14:39:47,052] {procutils.py:21} INFO -
    [2023-01-06 14:39:47,052] {procutils.py:21} INFO - PLAY [Execute State Plan against a K8s cluster.] *******************************
    [2023-01-06 14:39:47,072] {procutils.py:21} INFO -
    [2023-01-06 14:39:47,072] {procutils.py:21} INFO - TASK [Gathering Facts] *********************************************************
    [2023-01-06 14:39:48,879] {procutils.py:21} INFO - ok: [127.0.0.1]
    [2023-01-06 14:39:48,950] {procutils.py:21} INFO -
    [2023-01-06 14:39:48,950] {procutils.py:21} INFO - TASK [create persistent volume claims (optional, based on host requirements)] ***
    [2023-01-06 14:39:49,010] {procutils.py:21} INFO -
    [2023-01-06 14:39:49,010] {procutils.py:21} INFO - TASK [common/k8s/persistent_volume : create persistent volume claims] **********
    [2023-01-06 14:39:49,066] {procutils.py:21} INFO -
    [2023-01-06 14:39:49,067] {procutils.py:21} INFO - TASK [common/k8s/persistent_volume : construct volume definitions for use in the calling playbook] ***
    [2023-01-06 14:39:49,169] {procutils.py:21} INFO -
    [2023-01-06 14:39:49,169] {procutils.py:21} INFO - TASK [create volume to cache plugins] ******************************************
    [2023-01-06 14:40:07,324] {procutils.py:21} INFO - [WARNING]: class KubernetesRawModule is deprecated and will be removed in
    [2023-01-06 14:40:07,324] {procutils.py:21} INFO - 2.0.0. Please use K8sAnsibleMixin instead.
    [2023-01-06 14:40:07,325] {procutils.py:21} INFO - ok: [127.0.0.1]
    [2023-01-06 14:40:07,368] {procutils.py:21} INFO -
    [2023-01-06 14:40:07,369] {procutils.py:21} INFO - TASK [create a ClusterIP service definition (optional, based on host requirements)] ***
    [2023-01-06 14:40:07,451] {procutils.py:21} INFO -
    [2023-01-06 14:40:07,451] {procutils.py:21} INFO - TASK [common/k8s/service : construct a set of ports to add to the new service] ***
    [2023-01-06 14:40:07,511] {procutils.py:21} INFO -
    [2023-01-06 14:40:07,511] {procutils.py:21} INFO - TASK [common/k8s/service : create external service] ****************************
    [2023-01-06 14:40:07,544] {procutils.py:21} INFO - skipping: [127.0.0.1]
    [2023-01-06 14:40:07,607] {procutils.py:21} INFO -
    [2023-01-06 14:40:07,607] {procutils.py:21} INFO - TASK [create the deployment for the service] ***********************************
    [2023-01-06 14:40:24,335] {procutils.py:21} INFO - changed: [127.0.0.1]
    [2023-01-06 14:40:24,380] {procutils.py:21} INFO -
    [2023-01-06 14:40:24,381] {procutils.py:21} INFO - PLAY RECAP *********************************************************************
    [2023-01-06 14:40:24,381] {procutils.py:21} INFO - 127.0.0.1                  : ok=3    changed=1    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0
    [2023-01-06 14:40:24,381] {procutils.py:21} INFO -

When the Model starts up, it launches a web server and creates a guidepad Operation and ServiceExposure to allow your Model to accept inference requests.

import requests

res = requests.post(
    'http://gpd-exposure-770f630c53de4381822e0dd8bf167d70.guidepad-dev:5000',
    json={
        'user_id': ['-4640272621319568052']
    }
)

print(res.json())
    [{'rating': 3.4295135851814758, 'user_id': '-4640272621319568052'}]
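
To connect the prediction back to the original goal, prompting only users who are likely to leave a positive rating, the response can be filtered by a cutoff. The sketch below assumes a threshold of 4.0, which is an illustrative choice and not a value from this demo; users_to_prompt and the second sample entry are hypothetical.

```python
def users_to_prompt(predictions, min_rating=4.0):
    """Return ids of users whose predicted rating is at least min_rating."""
    return [p["user_id"] for p in predictions if p["rating"] >= min_rating]


sample = [
    # Entry copied from the response above.
    {"rating": 3.4295135851814758, "user_id": "-4640272621319568052"},
    # Made-up entry, included only to show a user above the cutoff.
    {"rating": 4.6, "user_id": "hypothetical-user"},
]
print(users_to_prompt(sample))  # ['hypothetical-user']
```
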
  

About the Author

Tommy O'Keefe - Machine Learning Engineer - Guidepad

If you'd like to learn more about the Guidepad platform, please feel free to send us a note at hello@guidepad.io and we can schedule a call/demo.

Recent Publications

Blog

Guidepad's ML Plugin

The guidepad-ML plugin is an extension of the guidepad platform that helps users with their end-to-end ML lifecycle.

Tommy O'Keefe

Jul 28, 2023 · 10 min read
