Seven Minute Server

Sep 9, 2019 - 5 minute read - python machine learning

Grabbing data from the Simple OpenNMT-py REST Server

Alright, so you’ve trained a model or two and are ready to translate, but when you start using OpenNMT-py’s translation script, you run into some unforeseen issues — for example, you’ll find it’s not a huge fan of whitespace, and it’s not really meant to translate an entire document.

And for my use case, I want to actually print bilingual content to a single file in the format:

language 1 string

language 2 string

language 1 string

language 2 string

What to do? The Simple OpenNMT-py REST server, created by pltrdy and described in a forum post, to the rescue!

You can use the REST server to dispatch translation tasks sentence-by-sentence, and then handle that output however you like. We’ll first set up and start the server, then use our own Python script to parse and print its output.
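
To make the target layout concrete, here is a minimal sketch of the bilingual format described above, assuming Python 3 and assuming the translations have already been collected as (source, target) pairs. The function name `format_bilingual` is hypothetical and is not part of OpenNMT-py.

```python
def format_bilingual(pairs):
    """Join (source, target) pairs into alternating lines separated by blank lines."""
    lines = []
    for source, target in pairs:
        lines.append(source)  # language 1 string
        lines.append(target)  # language 2 string
    if not lines:
        return ""
    return "\n\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_bilingual([("Bonjour", "Hello"), ("Merci", "Thank you")]), end="")
```

Keeping the formatting separate from the translation call means the same pairs could be written in a different layout later without touching the request code.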

Setting up the Simple OpenNMT-py REST Server

The server itself is pretty easy to get up and running. It uses Flask, and accepts and returns JSON. Let’s get it set up (this assumes that you’re running a Linux-based OS and have already cloned and installed OpenNMT-py):

  1. pip install flask

  2. Make a directory for your server and copy your model files into it:

    mkdir OpenNMT-REST && cd OpenNMT-REST && mkdir available_models && cp /path/to/your/model-file available_models/
  3. Create a configuration file named conf.json and place it into the available_models directory:

        {
            "models_root": "./available_models",
            "models": [
                {
                    "id": 100,
                    "model": "",
                    "timeout": 600,
                    "opt": {
                        "batch_size": 1,
                        "beam_size": 10
                    }
                },
                {
                    "id": 101,
                    "model": "",
                    "timeout": 600,
                    "opt": {
                        "batch_size": 1,
                        "beam_size": 10
                    }
                }
            ]
        }

    You can add additional options. If you’re running on a GPU-enabled instance (I’m using CPU in this example, but your translations should fly on GPU), add "gpu": 0,1,2,3 where 0-3 are the GPUs you want to use. You can also set it up to fall back to CPU if the query times out. I also use "replace_unk": true so that I can see the untranslated bits more clearly.

  4. Next, export a bunch of variables for the system to use (you could hardcode them, but the original writeup uses variables and my Python script currently uses them, too):

    export HOST=
    export CONFIG="./available_models/conf.json" 
    export URL_ROOT="/translator"
    export PORT="5000"
    export IP=""
  5. Start the server (run it with & to background it so that you can run queries in the same window; if you use a different terminal, remember to export those variables again, or add them to your ~/.bash_profile and source it):

    python /path/to/your/OpenNMT/ --ip $IP --port $PORT --url_root $URL_ROOT --config $CONFIG &
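
Nested JSON is easy to get wrong by hand, so one option is to generate conf.json from Python instead. This is only a sketch, assuming Python 3: the helper names `model_entry` and `build_config` and the model file names are hypothetical placeholders, while the keys mirror the ones used in step 3.

```python
import json


def model_entry(model_id, model_file, timeout=600, **opt):
    """Build one entry of the "models" list; extra keyword arguments become "opt"."""
    return {"id": model_id, "model": model_file, "timeout": timeout, "opt": opt}


def build_config(models_root, entries):
    """Wrap the model entries in the top-level structure used in step 3."""
    return {"models_root": models_root, "models": list(entries)}


config = build_config(
    "./available_models",
    [
        # Hypothetical file names; substitute your own model files.
        model_entry(100, "model_a.pt", batch_size=1, beam_size=10),
        model_entry(101, "model_b.pt", batch_size=1, beam_size=10),
    ],
)

# json.dumps guarantees balanced braces and valid quoting.
conf_text = json.dumps(config, indent=4)

if __name__ == "__main__":
    print(conf_text)
```

Redirecting the printed output into available_models/conf.json would give the server a file that is at least syntactically valid.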

Now, you’re ready to start sending it queries.

Querying the OpenNMT REST Server

If you check the forum post, the example sends a POST request to the server using curl and gets a JSON response back. In my case, I only want to grab the source sentence that I sent to the translator and the target, so I’ve written a basic script that prepares and sends a request and processes the JSON output.

To run it, you simply do:

    python "This is the text I want to translate" 100

where 100 corresponds to the model id that you want to use, pulled from conf.json. So, for example, you can set your fr-en as 1, your en-fr as 2, and then dispatch English to 2 and French to 1.
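
If you juggle several models, a small lookup table can save you from remembering ids. This is a minimal sketch, assuming Python 3 and using the fr-en = 1 and en-fr = 2 numbering from the example above; `MODEL_IDS` and `model_id_for` are hypothetical names, and the ids must match whatever is in your conf.json.

```python
# Map (source language, target language) to the model id from conf.json.
MODEL_IDS = {
    ("fr", "en"): 1,
    ("en", "fr"): 2,
}


def model_id_for(source_lang, target_lang):
    """Return the configured model id for a language pair, or raise ValueError."""
    try:
        return MODEL_IDS[(source_lang, target_lang)]
    except KeyError:
        raise ValueError(
            "no model configured for %s-%s" % (source_lang, target_lang)
        )


if __name__ == "__main__":
    print(model_id_for("en", "fr"))
```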

The script then prints input and output line by line.

The current script looks like:

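#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-

import json
import requests
import sys
import os

arglength = len(sys.argv)

# Use model id 100 unless a second argument supplies one.
if arglength <= 2:
    ruleid = str(100)
else:
    ruleid = sys.argv[2]

# The text to translate is the first argument.
req = sys.argv[1]

# Connection details come from the variables exported earlier.
ip = os.environ['HOST']
port = os.environ['PORT']
url_root = os.environ['URL_ROOT']

url='http://' + ip + ':' + port + url_root + '/translate'
data='[{"src": "' + req +'", "id": ' + ruleid +'}]'

# Send the request and keep the raw JSON response text.
output = requests.post(url, data=data).text

# The response is a nested list; the first item holds the source and target.
myjson = json.loads(output)
input = myjson[0][0]['src']
output = myjson[0][0]['tgt']

print input
print output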
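
One thing to watch in the script above: the request body is built by string concatenation, so a sentence containing a double quote would produce invalid JSON. Here is a minimal sketch of a safer variant, assuming Python 3 and the standard library's urllib instead of requests. The function names are hypothetical; the payload and response shapes are the ones the script above already relies on.

```python
import json
import urllib.request


def build_payload(text, model_id):
    """Serialize one request; json.dumps escapes quotes and non-ASCII text."""
    return json.dumps([{"src": text, "id": int(model_id)}])


def parse_response(response_text):
    """Return (source, target) from a response shaped like the one read above."""
    first = json.loads(response_text)[0][0]
    return first["src"], first["tgt"]


def translate(url, text, model_id):
    """POST the payload to the server's translate URL and return (source, target)."""
    request = urllib.request.Request(
        url,
        data=build_payload(text, model_id).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_response(response.read().decode("utf-8"))
```

With the server running, `translate(url, "This is the text I want to translate", 100)` would be called with the same URL the script assembles from HOST, PORT, and URL_ROOT.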

You can modify the script to add newlines or whatever you like, or wrap it. For example, I wanted to run a super-quick translation from Russian to English and only write out the English. So I modified the bottom of the script thusly:

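myjson = json.loads(output)
#input = myjson[0][0]['src']
output = myjson[0][0]['tgt'].encode('utf-8')

# Append the translation to english.txt (the 0 disables buffering).
output_file = open('english.txt','a',0)
output_file.write(output + '\n')
output_file.close()

#print input
print output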

This skips the source sentence and writes only the translation. Because I use replace_unk, untranslated words are copied through and I end up with non-ASCII characters in my output, so I had to add encode('utf-8') to the end. I then wrote to stdout (because I’m nosy) and to a file named english.txt (because I want to actually read the output later).
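
The manual encode('utf-8') is a Python 2 habit. As a hedged alternative, assuming Python 3, you can open the file with an explicit encoding and let the file object do the encoding for you. The helper name `append_line` is hypothetical.

```python
import io


def append_line(path, text):
    """Append one line of text to path, encoded as UTF-8."""
    # The with block closes the file, which also flushes it to disk.
    with io.open(path, "a", encoding="utf-8") as output_file:
        output_file.write(text + "\n")


if __name__ == "__main__":
    append_line("english.txt", "This line may contain untranslated слова.")
```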

I then cheated a little and wrapped this in a shell script to iterate line by line through the file I wanted to translate:

while read line; do python "$line" 100; done < doc_i_want_to_translate.txt 
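
The shell loop starts a fresh Python process for every line. A minimal sketch of doing the iteration inside Python instead, assuming Python 3: `translate_lines` is a hypothetical helper, and `translate` is a stand-in for whatever function actually sends one sentence to the server and returns the translation.

```python
def translate_lines(lines, translate):
    """Yield (source, target) for each non-blank line, one translate call per line."""
    for raw in lines:
        source = raw.strip()
        if not source:
            continue  # skip blank lines rather than sending them to the server
        yield source, translate(source)


if __name__ == "__main__":
    # str.upper stands in for a real translation call so this runs offline.
    sample = ["First sentence.\n", "\n", "Second sentence.\n"]
    for source, target in translate_lines(sample, str.upper):
        print(source)
        print(target)
```

Passing an open file object as `lines` would process doc_i_want_to_translate.txt in a single run.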

And that’s it for simple processing. You can also wrap the server in a GUI and make it available online. I currently don’t need to do this, but if you’re interested, Yasmin Moslem has a great example on GitHub.

This tool is part of a suite of other publishing tools in the EOAT Project.