Server-Side Language Detection with the Google AJAX Language API

Yesterday Google announced a new AJAX API for translation and language detection. It’s a JavaScript API that translates blocks of text within a web page and detects their language. What I need, though, is server-side language detection: PHP code that asks the Google AJAX Language API for the language of a text.

Step 1 : Find Out the Internal Workings

You can test the language detection in the announcement post on The Official Google Blog. So I did, and I took my friend Firebug with me. This revealed the URL from which the JavaScript code requests the magic :

http://www.google.com/uds/GlangDetect?callback=google.language.callbacks.id102&context=22&q=this%20is%20a%20test%20of%20a%20language&key=internal&v=1.0

A quick surf to that URL shows the response :

google.language.callbacks.id102('22',{"language":"en","isReliable":true,"confidence":0.33399844}, 200, null, 200)

This is without any doubt JSONP. The response invokes a JavaScript function “google.language.callbacks.id102”, passing 5 arguments to it.

Step 2 : Analyze

Request

Let’s analyze all parameters in the request URL :

  • callback : The name of the function that is called when the response is ready. It can be left off.
  • context : No idea. It can be left off.
  • q : The text you want to detect the language of. Required (duh!).
  • key : An API key. It seems it can be left off.
  • v : The API version. Required.
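
The parameter list above can be turned into a small URL builder. This is only a sketch: the function name build_detect_url is made up for illustration, and it assumes that v and q really are the only parameters the endpoint needs, as observed above.

```php
<?php

// Hypothetical helper (not part of any Google library): builds the
// detection URL from the two parameters that appear to be required.
function build_detect_url($text, $version = '1.0') {
    $base = 'http://www.google.com/uds/GlangDetect';

    // PHP_QUERY_RFC3986 encodes spaces as %20, like the URL seen in Firebug.
    $query = http_build_query(
        array('v' => $version, 'q' => $text),
        '',
        '&',
        PHP_QUERY_RFC3986
    );

    return $base . '?' . $query;
}

echo build_detect_url('this is a test of a language'), "\n";
```

Letting http_build_query() do the encoding avoids forgetting to escape the text by hand.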

Response

Like I said, the response is a call to the JavaScript function “google.language.callbacks.id102”. It takes 5 arguments :

  • 22 : Same value as the context parameter in the request URL.
  • A Javascript object with 3 properties :
    • language : The two-letter ISO language code of the language detected by Google.
    • isReliable : Is the guess by Google reliable?
    • confidence : How confident is Google about the guess? I think it takes values from 0 to 1, 0 being least reliable and 1 being most reliable.
  • 200 : No idea.
  • null : No idea.
  • 200 : Again, no idea.
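
If you are stuck with the JSONP form of the response, the object in the second argument can still be pulled out on the server. The sketch below is one possible way to do that, not something Google documents: the helper name extract_jsonp_object is hypothetical, and it assumes the response contains exactly one {...} object, as in the sample above.

```php
<?php

// Hypothetical helper: pulls the JSON object out of a JSONP response such as
//   callbackName('22',{...}, 200, null, 200)
// Assumes the response contains exactly one top-level {...} object.
function extract_jsonp_object($jsonp) {
    // Greedy match from the first "{" to the last "}".
    if (!preg_match('/\{.*\}/s', $jsonp, $matches)) {
        return null;
    }

    // true => decode into an associative array; yields null on invalid JSON.
    return json_decode($matches[0], true);
}

$sample = 'google.language.callbacks.id102(\'22\',{"language":"en","isReliable":true,"confidence":0.33399844}, 200, null, 200)';
$data = extract_jsonp_object($sample);

echo $data['language'], "\n";
```

The regular expression is deliberately crude. It would break if a later argument also contained braces, so treat it as a stopgap only.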

Step 3 : Bend and Break

Now let’s see what we need to do to make this work in PHP.

Dump Unknown Request Parameters

Don’t use magic you don’t understand, so I dump the request parameters “context” and “key”. The result doesn’t seem to change when leaving these off.

No JSONP

Since I’m going to use PHP to request the language detection, I won’t be needing the callback parameter. So I dump that one too. The result is nice :

{"responseData": {"language":"en","isReliable":true,"confidence":0.33399844}, "responseDetails": null, "responseStatus": 200}

The 5 arguments passed to the callback function are replaced by one JavaScript object. At the same time, the names of the fields become visible, including a useful one : responseStatus. I guess it indicates whether the request was successful, just like the HTTP status code 200 OK.
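
To make the shape of this plain JSON response concrete, here is a minimal sketch that decodes it with PHP’s built-in json_decode() and checks responseStatus. The function name parse_detect_response is hypothetical, and treating any status other than 200 as a failure is an assumption based on the guess above.

```php
<?php

// Hypothetical helper: returns the responseData array when responseStatus
// is 200, and false otherwise (including when the body is not valid JSON).
function parse_detect_response($json) {
    // true => decode JSON objects into associative arrays.
    $response = json_decode($json, true);

    if (!is_array($response) || !isset($response['responseStatus'])) {
        return false;
    }

    if ($response['responseStatus'] != 200 || !isset($response['responseData'])) {
        return false;
    }

    return $response['responseData'];
}

$body = '{"responseData": {"language":"en","isReliable":true,"confidence":0.33399844}, "responseDetails": null, "responseStatus": 200}';
$result = parse_detect_response($body);

echo $result['language'], "\n";
```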

Step 4 : Mix it with PHP

The final step is to mix what I have learned into a simple PHP class :

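<?php

// Zend_Json ships with Zend Framework 1, which must be on the include path.
require_once 'Zend/Json.php';

class Google_Language_Detection {
    const URL = 'http://www.google.com/uds/GlangDetect?v=1.0&q=';

    // Returns the responseData array (language, isReliable, confidence),
    // or false when the request or the detection fails.
    public static function detect($text) {
        $url = self::URL . urlencode($text);

        // file_get_contents() returns false when the HTTP request fails.
        $json = file_get_contents($url);
        if ($json === false) {
            return false;
        }

        $response = Zend_Json::decode($json);

        if (isset($response['responseStatus']) && $response['responseStatus'] == 200) {
            return $response['responseData'];
        }

        return false;
    }
}

?>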

Notice that I use Zend Framework to decode the JSON from the response into a native PHP associative array.
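
As a usage sketch, the value returned by detect() could be turned into a readable message like this. The helper describe_detection is hypothetical and only assumes the return shape described above: either false or an array with language, isReliable and confidence.

```php
<?php

// Hypothetical helper: formats the value returned by
// Google_Language_Detection::detect() (false, or the responseData array).
function describe_detection($result) {
    if ($result === false) {
        return 'Language detection failed.';
    }

    // %.2F prints the confidence with two decimals, independent of locale.
    return sprintf(
        'Detected "%s" (%s, confidence %.2F)',
        $result['language'],
        $result['isReliable'] ? 'reliable' : 'not reliable',
        $result['confidence']
    );
}

// With the class above this would be called as:
//   describe_detection(Google_Language_Detection::detect('this is a test of a language'));
// Here the sample response data from this post is used instead.
$sampleResult = array('language' => 'en', 'isReliable' => true, 'confidence' => 0.33399844);

echo describe_detection($sampleResult), "\n";
```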

About the Author

Lode Blomme works as a Software Engineer at RouteYou, a Belgian startup company active in digital outdoor navigation.