Apple has created a new file format for machine learning models. These files can be used to generate predictions regardless of how the model was created, which is why “Apple Introduces Core ML” draws an analogy between these files and PDFs. Predictions can be generated with only this file and none of the libraries used to create it.
Generating predictions is a pain point for data scientists today and often involves working with the underlying math directly. At best, it involves training the model in Python and then calling the underlying C library in the production app.
This file format will only become widely used if conversion from popular machine learning libraries is easy and predictions are simple to generate. Apple made both claims during its WWDC 2017 keynote, and I want to investigate them.
Specifically, Apple claimed easy integration between their .mlmodel file format and various Python libraries. It’s easy to integrate these files into an app (literally via drag-and-drop) or another Python program.
Apple’s coremltools Python package makes generation of this file easy:
- Train a model via scikit-learn, Keras, Caffe or XGBoost (see docs for conversion support for different library versions)
- Generate a Core ML model from it with the matching converter (e.g., converters.sklearn.convert)
- (optional) Add metadata (e.g., feature names, author, short description)
- Save the model with its save method (e.g., coreml_model.save('sklearn.mlmodel'))
coremltools prints helpful error messages in my (brief) experience. When I used converters.sklearn.convert, it gave a helpful error message indicating which types class labels may have, and that float (which I was using) is not one of them.
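The label problem above can be avoided by casting labels before training. The following is a minimal sketch, assuming integer class labels are acceptable to the converter; to_int_labels is a hypothetical helper and not part of coremltools or scikit-learn.

```python
def to_int_labels(labels):
    """Cast float class labels such as 0.0 and 1.0 to ints.

    Hypothetical helper: raises ValueError if a label is not a whole
    number, since truncating 0.5 to 0 would silently merge classes.
    """
    converted = []
    for label in labels:
        if float(label) != int(label):
            raise ValueError("label %r is not a whole number" % (label,))
        converted.append(int(label))
    return converted


y_float = [0.0, 1.0, 1.0, 0.0]  # labels stored as floats
y_int = to_int_labels(y_float)  # integer labels to train the model with
print(y_int)
```

The cast labels would then be passed to the model's fit call in place of the float ones.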
Here’s the complete script for the .mlmodel file generation:
    import coremltools
    from sklearn.svm import LinearSVC

    def train_model():
        model = LinearSVC()
        # ...
        return model

    model = train_model()
    coreml_model = coremltools.converters.sklearn.convert(model)
    coreml_model.author = 'Scott Sievert'  # other attributes can be added
    coreml_model.save('sklearn.mlmodel')
Yup, creation of these .mlmodel files is as easy as Apple claims. Even better, it appears this file format integrates with named features.
The generation of this file is easy. Now, where can these files be used?
.mlmodel files can be included on any device that supports CoreML. The format will not be tied to iOS/macOS apps, though these files will certainly be used there. It will allow general and easy use in Python for both saving and prediction. Given Apple’s expansion of Swift to other operating systems, I don’t believe the format will be tied to a particular operating system.
Prediction is as easy as saving:
    import coremltools

    coremlmodel = coremltools.models.MLModel('sklearn.mlmodel')
    coremlmodel.predict(example)  # `example` format should mirror training examples
However, I can’t test prediction because it requires macOS 10.13 (currently in beta).
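The input to predict can still be assembled without running Core ML. This sketch assumes the model was converted with named input features, in which case predict takes a dictionary mapping feature names to values. The feature names and the make_example helper are hypothetical.

```python
def make_example(feature_names, row):
    """Pair each feature name with its value for a single example."""
    if len(feature_names) != len(row):
        raise ValueError(
            "expected %d values, got %d" % (len(feature_names), len(row))
        )
    return dict(zip(feature_names, row))


feature_names = ["bedrooms", "bathrooms", "size"]  # hypothetical names
example = make_example(feature_names, [3.0, 2.0, 1240.0])
print(example)
# On a machine that supports prediction, this dictionary would be
# passed as coremlmodel.predict(example).
```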
These difficulties were resolved quickly. Here’s what I ran into while writing this post:
- coremltools depends on Python 2.7
- Converters support only specific library versions (e.g., Keras 2 is not supported but 1.2 is).
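One way to catch version problems early is to check a library's version before calling a converter. The sketch below is only an illustration: the SUPPORTED_PREFIXES table and the is_supported helper are hypothetical, and the table encodes nothing beyond the Keras observation above. The coremltools docs remain the authority on what is supported.

```python
# Hypothetical table; the Keras entry reflects only what this post observed.
SUPPORTED_PREFIXES = {"keras": ("1.2",)}


def is_supported(library, version):
    """Return True if `version` matches a supported major.minor prefix."""
    prefixes = SUPPORTED_PREFIXES.get(library, ())
    return any(
        version == prefix or version.startswith(prefix + ".")
        for prefix in prefixes
    )


print(is_supported("keras", "1.2.2"))
print(is_supported("keras", "2.0.4"))
```

In practice the version string would come from the installed package, for example keras.__version__.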
The largest potential difficulty I see is the limited scope of coremltools. There could be issues with the versions of different libraries, and not all classifiers in sklearn are supported (see the list of supported sklearn models).