Using face recognition and speech recognition with PicoGW

May 3, 2019

PicoGW has a plugin for speech recognition and face recognition.
Internally, this plugin uses Google Cloud Platform (GCP). This article walks through the setup up to the point of running face recognition with curl.

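Obtaining a service account key (credential) from GCP

First, in GCP, create a service account key for which face recognition (Cloud Vision API) and speech recognition (Cloud Speech-to-Text API) are enabled. The procedure is described in many places, so it is not repeated here. From a quick search, the following site seemed easy to follow.

Google's official guide page is here.

The service account key you obtain (called the "credential" in this article) looks roughly like the following. (Some values are masked with *****.)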

{
  "type": "service_account",
  "project_id": "*****",
  "private_key_id": "*****",
  "private_key": "-----BEGIN PRIVATE KEY-----\n**********\n-----END PRIVATE KEY-----\n",
  "client_email": "*****@appspot.gserviceaccount.com",
  "client_id": "",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/*****%40appspot.gserviceaccount.com"
}
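Before pasting the credential into PicoGW, it can help to confirm that the downloaded file is intact. The following is a minimal sketch, not part of PicoGW: `check_credential` and `EXPECTED_FIELDS` are hypothetical names, and treating the listed fields as required is an assumption based only on the sample above.

```python
import json

# Field names taken from the sample key shown above. Treating them as
# required is an assumption of this sketch, not a documented GCP rule.
EXPECTED_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "token_uri",
)


def check_credential(text):
    """Return a list of problems found in a service account key (empty if none)."""
    try:
        cred = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(cred, dict):
        return ["top-level value is not a JSON object"]
    # Report every expected field that is absent or empty.
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if not cred.get(name)]
    # The sample key uses "service_account"; flag anything else.
    if cred.get("type") and cred["type"] != "service_account":
        problems.append('"type" is not "service_account"')
    return problems
```

Calling `check_credential(open("key.json").read())` (with `key.json` standing in for your own file name) returns an empty list when nothing obvious is wrong.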

Installing and configuring the PicoGW GCP plugin

This assumes PicoGW is already installed. Install the GCP plugin and start PicoGW.

npm i -g picogw-plugin-gcp
picogw

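Open the web front end in a browser and confirm that the GCP plugin is listed.

Open the settings screen of the GCP plugin (right-click on the text "gcp", then left-click Settings).

Delete everything in the displayed text area, paste in the credential you obtained, and press the Apply button at the lower right.

Setup is now complete.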

API Reference

To call this API, send an image or audio file with the HTTP POST method.

POST /v1/gcp/image-label-detection

Returns labels describing what appears in the image.

POST /v1/gcp/image-face-detection

Finds face regions in the image.

POST /v1/gcp/transcript?lang=[Language code (default is ja)]

Transcribes an audio file. ffmpeg is used for sample-rate conversion, so it must be installed somewhere on the PATH; please set it up beforehand. Reference: the case of CentOS
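As a quick way to confirm the ffmpeg requirement, the sketch below checks whether an executable named `ffmpeg` is visible on the PATH. `ffmpeg_available` is a hypothetical helper, and it only tells you something useful when run on the machine (and under the user account) that runs PicoGW.

```python
import shutil


def ffmpeg_available():
    """Return True if an executable named ffmpeg is found on the PATH."""
    return shutil.which("ffmpeg") is not None
```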

Adding the lang option changes the recognition language. The list of codes that can be specified is here.
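The three endpoints above can also be called from a program. The following is a minimal sketch using only the Python standard library. It assumes the host and port used in the curl tests in this article (`http://localhost:8080`) and the form field name `file` from those same commands; `build_url`, `encode_multipart`, and `post_file` are hypothetical helper names, not part of PicoGW.

```python
import json
import os
import urllib.parse
import urllib.request
import uuid

# Host and port as used in the curl tests in this article (an assumption
# for any other installation).
BASE_URL = "http://localhost:8080"


def build_url(endpoint, lang=None, base_url=BASE_URL):
    """Build the request URL; lang is only meaningful for 'transcript'."""
    url = f"{base_url}/v1/gcp/{endpoint}"
    if lang is not None:
        url += "?" + urllib.parse.urlencode({"lang": lang})
    return url


def encode_multipart(filename, data, field="file", boundary=None):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_file(endpoint, path, lang=None):
    """POST a local file to a PicoGW GCP endpoint and return the parsed JSON reply."""
    with open(path, "rb") as f:
        body, content_type = encode_multipart(os.path.basename(path), f.read())
    req = urllib.request.Request(
        build_url(endpoint, lang),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With PicoGW running, `post_file("image-label-detection", "./child1.jpg")` would send the same request as the first curl test in the next section.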

Testing with curl

Image label detection

Command line:

curl -X POST -F file=@./child1.jpg http://localhost:8080/v1/gcp/image-label-detection

Result:

{
    "success": true,
    "result": [
        {
            "faceAnnotations": [],
            "landmarkAnnotations": [],
            "logoAnnotations": [],
            "labelAnnotations": [
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/0ytgt",
                    "locale": "",
                    "description": "Child",
                    "score": 0.9935216903686523,
                    "confidence": 0,
                    "topicality": 0.9935216903686523,
                    "boundingPoly": null
                },
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/0dzct",
                    "locale": "",
                    "description": "Face",
                    "score": 0.9538267254829407,
                    "confidence": 0,
                    "topicality": 0.9538267254829407,
                    "boundingPoly": null
                },
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/01k74n",
                    "locale": "",
                    "description": "Facial expression",
                    "score": 0.944551408290863,
                    "confidence": 0,
                    "topicality": 0.944551408290863,
                    "boundingPoly": null
                },
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/06z04",
                    "locale": "",
                    "description": "Skin",
                    "score": 0.9381709694862366,
                    "confidence": 0,
                    "topicality": 0.9381709694862366,
                    "boundingPoly": null
                },
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/0jnvp",
                    "locale": "",
                    "description": "Baby",
                    "score": 0.9308986067771912,
                    "confidence": 0,
                    "topicality": 0.9308986067771912,
                    "boundingPoly": null
                },
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/01bgsw",
                    "locale": "",
                    "description": "Toddler",
                    "score": 0.9096627235412598,
                    "confidence": 0,
                    "topicality": 0.9096627235412598,
                    "boundingPoly": null
                },
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/037p5b",
                    "locale": "",
                    "description": "Cheek",
                    "score": 0.8978834748268127,
                    "confidence": 0,
                    "topicality": 0.8978834748268127,
                    "boundingPoly": null
                },
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/01f43",
                    "locale": "",
                    "description": "Beauty",
                    "score": 0.8783525824546814,
                    "confidence": 0,
                    "topicality": 0.8783525824546814,
                    "boundingPoly": null
                },
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/0f9swq",
                    "locale": "",
                    "description": "Chin",
                    "score": 0.8504008650779724,
                    "confidence": 0,
                    "topicality": 0.8504008650779724,
                    "boundingPoly": null
                },
                {
                    "locations": [],
                    "properties": [],
                    "mid": "/m/0k0pj",
                    "locale": "",
                    "description": "Nose",
                    "score": 0.8419631719589233,
                    "confidence": 0,
                    "topicality": 0.8419631719589233,
                    "boundingPoly": null
                }
            ],
            "textAnnotations": [],
            "localizedObjectAnnotations": [],
            "safeSearchAnnotation": null,
            "imagePropertiesAnnotation": null,
            "error": null,
            "cropHintsAnnotation": null,
            "fullTextAnnotation": null,
            "webDetection": null,
            "productSearchResults": null,
            "context": null
        }
    ]
}
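Most of the label response is empty or null, so a caller usually wants only the descriptions and scores. The sketch below assumes a response shaped like the one above, already parsed into a Python dict; `top_labels` and the 0.9 default threshold are illustrative choices, not part of PicoGW.

```python
def top_labels(response, min_score=0.9):
    """Return (description, score) pairs with score >= min_score, highest first."""
    if not response.get("success"):
        return []
    labels = []
    for item in response.get("result", []):
        for ann in item.get("labelAnnotations", []):
            if ann["score"] >= min_score:
                labels.append((ann["description"], ann["score"]))
    # Sort explicitly rather than relying on the order returned by the server.
    return sorted(labels, key=lambda pair: pair[1], reverse=True)
```

Applied to the response above, the default threshold keeps six labels, from "Child" down to "Toddler".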

Face detection

Command line:

curl -X POST -F file=@./child1.jpg http://localhost:8080/v1/gcp/image-face-detection

Result:

{
    "success": true,
    "result": [
        {
            "faceAnnotations": [
                {
                    "landmarks": [
                        {
                            "type": "LEFT_EYE",
                            "position": {
                                "x": 183.83306884765625,
                                "y": 158.3272705078125,
                                "z": -0.0009126272052526474
                            }
                        },
                        {
                            "type": "RIGHT_EYE",
                            "position": {
                                "x": 237.47848510742188,
                                "y": 152.12179565429688,
                                "z": -19.267776489257812
                            }
                        },
                        {
                            "type": "LEFT_OF_LEFT_EYEBROW",
                            "position": {
                                "x": 168.75555419921875,
                                "y": 149.98782348632812,
                                "z": 14.353793144226074
                            }
                        },
                        {
                            "type": "RIGHT_OF_LEFT_EYEBROW",
                            "position": {
                                "x": 195.18470764160156,
                                "y": 142.16566467285156,
                                "z": -12.412994384765625
                            }
                        },
                        {
                            "type": "LEFT_OF_RIGHT_EYEBROW",
                            "position": {
                                "x": 221.54637145996094,
                                "y": 138.88931274414062,
                                "z": -21.856792449951172
                            }
                        },
                        {
                            "type": "RIGHT_OF_RIGHT_EYEBROW",
                            "position": {
                                "x": 263.2908020019531,
                                "y": 141.64749145507812,
                                "z": -17.94286346435547
                            }
                        },
                        {
                            "type": "MIDPOINT_BETWEEN_EYES",
                            "position": {
                                "x": 208.23385620117188,
                                "y": 152.1124725341797,
                                "z": -21.061262130737305
                            }
                        },
                        {
                            "type": "NOSE_TIP",
                            "position": {
                                "x": 203.107421875,
                                "y": 180.77548217773438,
                                "z": -45.14686965942383
                            }
                        },
                        {
                            "type": "UPPER_LIP",
                            "position": {
                                "x": 208.4833221435547,
                                "y": 203.589599609375,
                                "z": -38.105159759521484
                            }
                        },
                        {
                            "type": "LOWER_LIP",
                            "position": {
                                "x": 210.4447784423828,
                                "y": 224.2166748046875,
                                "z": -39.48147201538086
                            }
                        },
                        {
                            "type": "MOUTH_LEFT",
                            "position": {
                                "x": 193.34466552734375,
                                "y": 215.6253662109375,
                                "z": -18.22780418395996
                            }
                        },
                        {
                            "type": "MOUTH_RIGHT",
                            "position": {
                                "x": 239.2014923095703,
                                "y": 211.96807861328125,
                                "z": -32.95136642456055
                            }
                        },
                        {
                            "type": "MOUTH_CENTER",
                            "position": {
                                "x": 210.82553100585938,
                                "y": 214.27163696289062,
                                "z": -36.6275634765625
                            }
                        },
                        {
                            "type": "NOSE_BOTTOM_RIGHT",
                            "position": {
                                "x": 224.8909912109375,
                                "y": 187.87037658691406,
                                "z": -31.15186309814453
                            }
                        },
                        {
                            "type": "NOSE_BOTTOM_LEFT",
                            "position": {
                                "x": 194.1004638671875,
                                "y": 191.8350830078125,
                                "z": -19.869840621948242
                            }
                        },
                        {
                            "type": "NOSE_BOTTOM_CENTER",
                            "position": {
                                "x": 207.41456604003906,
                                "y": 192.60531616210938,
                                "z": -35.35366439819336
                            }
                        },
                        {
                            "type": "LEFT_EYE_TOP_BOUNDARY",
                            "position": {
                                "x": 183.9964141845703,
                                "y": 154.99205017089844,
                                "z": -3.289634943008423
                            }
                        },
                        {
                            "type": "LEFT_EYE_RIGHT_CORNER",
                            "position": {
                                "x": 194.33596801757812,
                                "y": 159.67575073242188,
                                "z": -4.268338680267334
                            }
                        },
                        {
                            "type": "LEFT_EYE_BOTTOM_BOUNDARY",
                            "position": {
                                "x": 183.50257873535156,
                                "y": 163.54299926757812,
                                "z": -2.01901912689209
                            }
                        },
                        {
                            "type": "LEFT_EYE_LEFT_CORNER",
                            "position": {
                                "x": 176.3940887451172,
                                "y": 162.51254272460938,
                                "z": 7.949547290802002
                            }
                        },
                        {
                            "type": "LEFT_EYE_PUPIL",
                            "position": {
                                "x": 184.0810546875,
                                "y": 159.62258911132812,
                                "z": -1.7658048868179321
                            }
                        },
                        {
                            "type": "RIGHT_EYE_TOP_BOUNDARY",
                            "position": {
                                "x": 238.2580108642578,
                                "y": 148.54916381835938,
                                "z": -22.66112518310547
                            }
                        },
                        {
                            "type": "RIGHT_EYE_RIGHT_CORNER",
                            "position": {
                                "x": 252.5885009765625,
                                "y": 153.1426239013672,
                                "z": -19.092214584350586
                            }
                        },
                        {
                            "type": "RIGHT_EYE_BOTTOM_BOUNDARY",
                            "position": {
                                "x": 238.85826110839844,
                                "y": 156.89532470703125,
                                "z": -21.54038429260254
                            }
                        },
                        {
                            "type": "RIGHT_EYE_LEFT_CORNER",
                            "position": {
                                "x": 228.52256774902344,
                                "y": 155.40780639648438,
                                "z": -16.135316848754883
                            }
                        },
                        {
                            "type": "RIGHT_EYE_PUPIL",
                            "position": {
                                "x": 239.66233825683594,
                                "y": 152.96682739257812,
                                "z": -21.758270263671875
                            }
                        },
                        {
                            "type": "LEFT_EYEBROW_UPPER_MIDPOINT",
                            "position": {
                                "x": 180.4531707763672,
                                "y": 138.4478759765625,
                                "z": -0.8508222699165344
                            }
                        },
                        {
                            "type": "RIGHT_EYEBROW_UPPER_MIDPOINT",
                            "position": {
                                "x": 239.37245178222656,
                                "y": 131.46533203125,
                                "z": -21.863447189331055
                            }
                        },
                        {
                            "type": "LEFT_EAR_TRAGION",
                            "position": {
                                "x": 176.57542419433594,
                                "y": 208.13528442382812,
                                "z": 70.0703353881836
                            }
                        },
                        {
                            "type": "RIGHT_EAR_TRAGION",
                            "position": {
                                "x": 306.7492980957031,
                                "y": 195.07302856445312,
                                "z": 27.771272659301758
                            }
                        },
                        {
                            "type": "FOREHEAD_GLABELLA",
                            "position": {
                                "x": 207.62928771972656,
                                "y": 139.41769409179688,
                                "z": -18.9888858795166
                            }
                        },
                        {
                            "type": "CHIN_GNATHION",
                            "position": {
                                "x": 213.51324462890625,
                                "y": 251.314208984375,
                                "z": -38.98262405395508
                            }
                        },
                        {
                            "type": "CHIN_LEFT_GONION",
                            "position": {
                                "x": 175.279296875,
                                "y": 237.49163818359375,
                                "z": 37.2729377746582
                            }
                        },
                        {
                            "type": "CHIN_RIGHT_GONION",
                            "position": {
                                "x": 287.4442138671875,
                                "y": 224.81277465820312,
                                "z": -3.0539612770080566
                            }
                        }
                    ],
                    "boundingPoly": {
                        "vertices": [
                            {
                                "x": 136,
                                "y": 65
                            },
                            {
                                "x": 332,
                                "y": 65
                            },
                            {
                                "x": 332,
                                "y": 292
                            },
                            {
                                "x": 136,
                                "y": 292
                            }
                        ],
                        "normalizedVertices": []
                    },
                    "fdBoundingPoly": {
                        "vertices": [
                            {
                                "x": 148,
                                "y": 103
                            },
                            {
                                "x": 308,
                                "y": 103
                            },
                            {
                                "x": 308,
                                "y": 262
                            },
                            {
                                "x": 148,
                                "y": 262
                            }
                        ],
                        "normalizedVertices": []
                    },
                    "rollAngle": -1.3059591054916382,
                    "panAngle": -20.222259521484375,
                    "tiltAngle": 14.978519439697266,
                    "detectionConfidence": 0.9348809123039246,
                    "landmarkingConfidence": 0.7418720722198486,
                    "joyLikelihood": "LIKELY",
                    "sorrowLikelihood": "VERY_UNLIKELY",
                    "angerLikelihood": "VERY_UNLIKELY",
                    "surpriseLikelihood": "VERY_UNLIKELY",
                    "underExposedLikelihood": "VERY_UNLIKELY",
                    "blurredLikelihood": "VERY_UNLIKELY",
                    "headwearLikelihood": "VERY_UNLIKELY"
                }
            ],
            "landmarkAnnotations": [],
            "logoAnnotations": [],
            "labelAnnotations": [],
            "textAnnotations": [],
            "localizedObjectAnnotations": [],
            "safeSearchAnnotation": null,
            "imagePropertiesAnnotation": null,
            "error": null,
            "cropHintsAnnotation": null,
            "fullTextAnnotation": null,
            "webDetection": null,
            "productSearchResults": null,
            "context": null
        }
    ]
}
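The face response is long, but the face rectangle and the likelihood fields are often all that is needed. The sketch below assumes a response shaped like the one above, parsed into a Python dict; `summarize_faces` is a hypothetical helper, and falling back to 0 for a missing coordinate is a defensive assumption.

```python
def summarize_faces(response):
    """Return one dict per detected face with its bounding box and joy likelihood."""
    faces = []
    for item in response.get("result", []):
        for face in item.get("faceAnnotations", []):
            vertices = face["boundingPoly"]["vertices"]
            # Assumption: a missing coordinate is treated as 0.
            xs = [v.get("x", 0) for v in vertices]
            ys = [v.get("y", 0) for v in vertices]
            faces.append({
                "left": min(xs),
                "top": min(ys),
                "width": max(xs) - min(xs),
                "height": max(ys) - min(ys),
                "joy": face.get("joyLikelihood"),
                "confidence": face.get("detectionConfidence"),
            })
    return faces
```

For the response above this yields a single face with left 136, top 65, width 196, height 227, and joy "LIKELY".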

Speech transcription

Command line:

curl -X POST -F file=@./Sample.wav http://localhost:8080/v1/gcp/transcript

Result:

{"success":true,"result":[{"text":"こんにちは"}]}
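To use the transcript in a program, the text fields can be pulled out of the reply. This sketch assumes the response shape shown above (a `result` list of objects with a `text` field); `transcript_text` is a hypothetical helper, and joining multiple entries with an empty separator is an assumption that suits Japanese.

```python
import json


def transcript_text(response, separator=""):
    """Concatenate the 'text' of every entry in a transcript response."""
    if not response.get("success"):
        return ""
    return separator.join(seg["text"] for seg in response.get("result", []))


# The reply shown above, parsed from its JSON text.
reply = json.loads('{"success":true,"result":[{"text":"こんにちは"}]}')
```

Here `transcript_text(reply)` returns "こんにちは". For languages written with spaces, passing `separator=" "` is probably the better choice.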
