Saving Machine Learning Models
After you have found, trained, and fine-tuned your model, it's time to save it to a file so that you can load it later and make predictions.
In this post we will use pickle to serialize and save our models.
Pickle is a module in Python's standard library that converts an object, such as a trained model, into a stream of bytes you can write to a file. Later you can read that file and deserialize it to get the model back and use it to make predictions.
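To see what serializing and deserializing mean in practice, here is a minimal sketch that round-trips an ordinary Python object through pickle. The `toy_model` dictionary is a made-up stand-in for a trained model, not something from the housing example.

```python
import pickle

# A hypothetical stand-in for a trained model: any picklable Python object works.
toy_model = {"weights": [0.5, -1.25, 3.0], "bias": 2.0}

# Serialize: turn the object into a bytes object.
payload = pickle.dumps(toy_model)

# Deserialize: rebuild an equivalent object from those bytes.
restored = pickle.loads(payload)

print(type(payload).__name__)   # bytes
print(restored == toy_model)    # True
print(restored is toy_model)    # False: it is a copy, not the same object
```

`dumps` and `loads` work with bytes in memory, while `dump` and `load` do the same thing with an open file.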
Let's quickly build a linear regression model for housing prices so that we can try saving and loading it.
    import pandas as pd
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Load dataset
    data = pd.read_csv('../data/housing.csv')

    # Split data into features and labels
    features = data.drop(['median_house_value'], axis=1)

    # Get rid of incomplete and non-numerical features so we don't have to deal
    # with data preparation in this post
    features = features.drop(['ocean_proximity', 'total_bedrooms'], axis=1)

    # Only the column we want to predict
    labels = data['median_house_value']

    # Create an object - a specific model we can actually train
    model = LinearRegression()

    # Train the model
    model.fit(features, labels)
Now that we have our model, saving it is very simple:
    from pickle import dump

    # The with block closes the file even if dump() raises an error
    with open('housing_model', 'wb') as model_file:
        dump(model, model_file)
When you need the model later, loading it is just as easy:
    from pickle import load

    with open('housing_model', 'rb') as model_file:
        saved_model = load(model_file)

    predictions = saved_model.predict(features)

    # Show the first house, its predicted price, and its actual price
    print('A house with these parameters:\n', features.loc[0])
    print('Will cost this much:\n', predictions[0])
    print('(actual cost):\n', labels[0])
    A house with these parameters:
    longitude             -122.2300
    latitude                37.8800
    housing_median_age      41.0000
    total_rooms            880.0000
    population             322.0000
    households             126.0000
    median_income            8.3252
    Name: 0, dtype: float64
    Will cost this much:
     403422.26733257994
    (actual cost):
     452600.0
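If you want to confirm that nothing was lost on the way to disk and back, you can compare the loaded model's predictions with the original's. Here is a minimal sketch of that check. It uses synthetic data rather than the housing dataset, and the file name `toy_model.pkl` and the variable names are made up for illustration.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for this sketch, not the housing dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 4.0

model = LinearRegression().fit(X, y)

# Save and reload inside a temporary directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        loaded_model = pickle.load(f)

# The loaded model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(X), loaded_model.predict(X))
print(same_predictions)
```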
That's it! Don't forget to record the versions of Python and scikit-learn, along with the pickle protocol, that you used to save the model, so that you can load it later in a compatible environment. You may also want to output and save the parameters your model has learned, in case you want to make predictions using your own implementation.
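One way to keep both the version information and the learned parameters is to write them to a small JSON document next to the pickle file. The sketch below is only an illustration: it trains on synthetic data with hypothetical feature names (`x1`, `x2`), and `predict_manually` is a made-up helper showing how a linear regression prediction can be reproduced from the saved numbers alone.

```python
import json
import pickle
import platform

import numpy as np
import sklearn
from sklearn.linear_model import LinearRegression

# Synthetic data with hypothetical feature names (not the housing dataset).
feature_names = ["x1", "x2"]
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(50, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 7.0

model = LinearRegression().fit(X, y)

# Collect the learned parameters and the environment they were produced in.
model_info = {
    "python_version": platform.python_version(),
    "sklearn_version": sklearn.__version__,
    "pickle_protocol": pickle.DEFAULT_PROTOCOL,  # used when dump() gets no protocol
    "intercept": float(model.intercept_),
    "coefficients": dict(zip(feature_names, model.coef_.tolist())),
}
model_info_json = json.dumps(model_info, indent=2)
print(model_info_json)


def predict_manually(info, row):
    """Linear regression by hand: intercept + sum(coefficient * feature value)."""
    return info["intercept"] + sum(
        info["coefficients"][name] * value for name, value in row.items()
    )


# Rebuild a prediction from the JSON alone and compare it with the model's.
restored_info = json.loads(model_info_json)
sample = {"x1": 1.0, "x2": 2.0}
manual = predict_manually(restored_info, sample)
from_model = model.predict(np.array([[1.0, 2.0]]))[0]
print(np.isclose(manual, from_model))
```

Because JSON is plain text, the saved parameters stay readable even if the pickle file can no longer be loaded.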