FastAPI is a modern, high-performance web framework for building APIs with Python 3.6+ based on standard Python type hints. It’s highly efficient and easy to use, and it comes with automatic interactive API documentation. In this blog post, we will walk through the process of deploying a Python web service that hosts a Transformers model inference using FastAPI. We will also discuss how to run Uvicorn workers as a systemd service on Debian environments.
What is FastAPI?
FastAPI is a high-performance web framework for building APIs with Python 3.6+, based on standard Python type hints. Key features of FastAPI include:
- Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette and Pydantic).
- Fast to code: Increase the speed to develop features by about 200% to 300%.
- Fewer bugs: Reduce about 40% of human (developer) induced errors.
- Intuitive: Great editor support. Completion everywhere. Less time debugging.
- Easy: Designed to be easy to use and learn. Less time reading docs.
- Short: Minimize code duplication. Multiple features from each parameter declaration.
- Robust: Get production-ready code. With automatic interactive documentation.
- Standards-based: Based on (and fully compatible with) the open standards for APIs: OpenAPI and JSON Schema.
- Unopinionated about persistence: FastAPI ships no ORM of its own, so it pairs with whichever data layer you prefer (SQLAlchemy, the Django ORM, and more).
Deploying a Python Web Service with FastAPI
Let’s start by creating a new FastAPI project. First, install FastAPI and Uvicorn, an ASGI server, with pip:
pip install fastapi uvicorn
Next, create a new Python file (e.g., main.py) and import FastAPI:
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return {"Hello": "World"}
You can now start the Uvicorn server with:
uvicorn main:app --reload
This will start a local development server at http://127.0.0.1:8000. The --reload flag restarts the server whenever the code changes, and FastAPI’s automatic interactive docs are available at http://127.0.0.1:8000/docs.
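As a quick smoke test, with the server running you can hit the root endpoint with curl:

curl http://127.0.0.1:8000/

which should return {"Hello":"World"}.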
Hosting a Transformers Model Inference
To host a Transformers model inference, we first need to install the Transformers library along with a backend framework such as PyTorch, which the pipeline API needs to actually run the model:
pip install transformers torch
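Before wiring the model into an endpoint, a quick sanity check in a Python shell confirms the pipeline loads and runs. Note that the first call downloads the model weights (a few hundred MB) to the local cache:

from transformers import pipeline

# The first run downloads the weights; later runs use the local cache
classifier = pipeline('sentiment-analysis',
                      model='distilbert-base-uncased-finetuned-sst-2-english')
print(classifier('I love this library!'))
# Expected shape: [{'label': 'POSITIVE', 'score': 0.99...}]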
Next, we can load a pre-trained model and create an endpoint for model inference. For example, let’s use the distilbert-base-uncased-finetuned-sst-2-english model for sentiment analysis:
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()

# Load the model once at startup rather than on every request
nlp_model = pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')

@app.post("/predict")
def predict_sentiment(text: str):
    # A bare str parameter on a POST route is read from the query string
    result = nlp_model(text)[0]
    return {"label": result['label'], "score": round(result['score'], 4)}
This will create a POST endpoint at /predict that takes a text string as a query parameter and returns the predicted sentiment label and score.
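With the server running, you can call the endpoint with curl. The example sentence is just an illustration; note that the text travels in the query string, so it must be URL-encoded:

curl -X POST "http://127.0.0.1:8000/predict?text=I%20love%20FastAPI"

This should return something like {"label": "POSITIVE", "score": 0.9998}.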
Running Uvicorn Workers as a systemd Service
To run Uvicorn workers as a systemd service on Debian environments, we first need to create a new systemd service file (e.g., /etc/systemd/system/uvicorn.service):
[Unit]
Description=Uvicorn server instance
After=network.target
[Service]
WorkingDirectory=/path/to/your/project
User=yourusername
Group=www-data
ExecStart=/usr/local/bin/uvicorn main:app --host 0.0.0.0 --port 8000
Restart=always
[Install]
WantedBy=multi-user.target
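The unit above runs a single Uvicorn process. To run multiple workers, as the section title suggests, add Uvicorn’s --workers flag to the ExecStart line (note that --workers cannot be combined with --reload, which should be left out in production anyway):

ExecStart=/usr/local/bin/uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4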
Next, reload systemd so it picks up the new unit file, then start the service:
sudo systemctl daemon-reload
sudo systemctl start uvicorn
And enable it to start on boot with:
sudo systemctl enable uvicorn
You can check the status of the service with:
sudo systemctl status uvicorn
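Since the service writes to the systemd journal, you can follow its logs with:

sudo journalctl -u uvicorn -f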
And that’s it! You now have a Python web service serving Transformers model inference with FastAPI, with Uvicorn workers running as a systemd service on a Debian environment.