Using Elasticsearch in Laravel

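Scout, Laravel's official full-text search package, does not ship with Elasticsearch support. Thanks to its clean interface design there are plenty of third-party drivers; the one I chose is babenkoivan/scout-elasticsearch-driver.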

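After installing it as described in its documentation, run the following commands to generate the basic files:

// Create the index configurator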
php artisan make:index-configurator IndexConfigurators\UserIndexConfigurator

// Create the search rule
php artisan make:search-rule SearchRules\UserSearchRule

Following the documentation, add the ScoutElastic\Searchable trait to the existing App\User model and fill in the $indexConfigurator and $searchRules properties. Then run:

// Create the index
php artisan elastic:create-index "App\IndexConfigurators\UserIndexConfigurator"
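
For reference, the model setup described above might look roughly like the sketch below. The Authenticatable base class and the App\SearchRules namespace are my assumptions (the latter by analogy with the configurator namespace used in the command above); adjust them to your project.

```php
<?php

namespace App;

use App\IndexConfigurators\UserIndexConfigurator;
use App\SearchRules\UserSearchRule; // assumed namespace
use Illuminate\Foundation\Auth\User as Authenticatable; // assumed base class
use ScoutElastic\Searchable;

class User extends Authenticatable
{
    use Searchable;

    // Index configurator generated by make:index-configurator.
    protected $indexConfigurator = UserIndexConfigurator::class;

    // Search rules applied by default when calling User::search().
    protected $searchRules = [
        UserSearchRule::class,
    ];
}
```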

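The mapping can be set through a $mapping property on the model or defined on the index configurator. Mappings tend to be long, so I recommend defining them in the $defaultMapping property of the IndexConfigurator.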

protected $defaultMapping = [
    'properties' =>
        array (
            'avatar' =>
                array (
                    'type' => 'text',
                    'fields' =>
                        array (
                            'keyword' =>
                                array (
                                    'type' => 'keyword',
                                    'ignore_above' => 256,
                                ),
                        ),
                ),
            'comment_counts' =>
                array (
                    'type' => 'long',
                ),
            'created_at' =>
                array (
                    'type' => 'text',
                    'fields' =>
                        array (
                            'keyword' =>
                                array (
                                    'type' => 'keyword',
                                    'ignore_above' => 256,
                                ),
                        ),
                ),
            'email' =>
                array (
                    'type' => 'text',
                    'fields' =>
                        array (
                            'keyword' =>
                                array (
                                    'type' => 'keyword',
                                    'ignore_above' => 256,
                                ),
                        ),
                ),
            'fans_counts' =>
                array (
                    'type' => 'long',
                ),
            'following_counts' =>
                array (
                    'type' => 'long',
                ),
            'id' =>
                array (
                    'type' => 'long',
                ),
            'key' =>
                array (
                    'type' => 'text',
                    'fields' =>
                        array (
                            'keyword' =>
                                array (
                                    'type' => 'keyword',
                                    'ignore_above' => 256,
                                ),
                        ),
                ),
            'name' =>
                array (
                    'type' => 'text',
                    'fields' =>
                        array (
                            'raw' =>
                                array (
                                    'type' => 'keyword',
                                ),
                        ),
                ),
            'register_source' =>
                array (
                    'type' => 'text',
                    'fields' =>
                        array (
                            'keyword' =>
                                array (
                                    'type' => 'keyword',
                                    'ignore_above' => 256,
                                ),
                        ),
                ),
            'score_counts' =>
                array (
                    'type' => 'long',
                ),
            'topic_counts' =>
                array (
                    'type' => 'long',
                ),
            'updated_at' =>
                array (
                    'type' => 'text',
                    'fields' =>
                        array (
                            'keyword' =>
                                array (
                                    'type' => 'keyword',
                                    'ignore_above' => 256,
                                ),
                        ),
                ),
        ),
];

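The above is the mapping for a users table. It was generated automatically by Elasticsearch when data was inserted directly, and parts of it are not quite right: the timestamp fields, for example, were also typed as text. I recommend writing the mapping yourself first and then running:

// Update the mapping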
php artisan elastic:update-mapping "App\User"
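
As an illustration of a hand-written mapping, the timestamp fields could be declared as dates instead of text. This is only a sketch: the format string is borrowed from the mapping shown later in this post and assumes the timestamps are serialized like "2019-01-01 12:00:00".

```php
// Inside UserIndexConfigurator; only the timestamp fields are shown.
protected $defaultMapping = [
    'properties' => [
        'created_at' => [
            'type' => 'date',
            // Assumed serialization format of the Eloquent timestamps.
            'format' => 'yyyy-MM-dd HH:mm:ss',
        ],
        'updated_at' => [
            'type' => 'date',
            'format' => 'yyyy-MM-dd HH:mm:ss',
        ],
    ],
];
```

Note that Elasticsearch does not allow changing the type of a field that already exists, so if the index was already populated with the auto-generated mapping, it has to be recreated before the corrected types take effect.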

Next, define the search rule in UserSearchRule:

public function buildHighlightPayload()
{
    return [
        "pre_tags" => ['<font color="red">'],
        "post_tags" => ['</font>'],
        'fields' => [
            "name" => ['type' => 'plain'],
            "email" => ['type' => 'plain'],
        ]
    ];
}

public function buildQueryPayload()
{
    $query = "*{$this->builder->query}*";
    return [
        "should" => [
            ["wildcard" => ["name" => $query]],
            ["wildcard" => ["email" => $query]],
        ],
    ];
}

Here the user search uses wildcard matching against the name and email fields. The two conditions are combined with OR, hence should.
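
With the rule in place, a search from application code might look like the sketch below. The search term is made up, and the rule() method and the highlight accessors are how I recall them from the package README, so treat them as assumptions and check them against the version you have installed.

```php
<?php

use App\SearchRules\UserSearchRule; // assumed namespace
use App\User;

// Uses the rules listed in the model's $searchRules property.
$users = User::search('john')->get();

// Alternatively, pick the rule explicitly for this one query.
$users = User::search('john')->rule(UserSearchRule::class)->get();

foreach ($users as $user) {
    // The highlight is only present when a fragment matched.
    if ($user->highlight) {
        // Fragments for "name" joined into a single string.
        echo $user->highlight->nameAsString, PHP_EOL;
    }
}
```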

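For long text that needs to be tokenized, the mapping and the query payload change slightly. The example below runs a tokenized search over several long text fields.

// Set the Chinese analyzers via analyzer and search_analyzer (ik_max_word and ik_smart come from the IK analysis plugin)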
protected $defaultMapping = [
    'properties' => [
        'top_description' => [
            'type' => 'text',
            'analyzer' => 'ik_max_word',
            'search_analyzer' => 'ik_smart',
        ],
        'middle_description' => [
            'type' => 'text',
            'analyzer' => 'ik_max_word',
            'search_analyzer' => 'ik_smart',
        ],
        'bottom_description' => [
            'type' => 'text',
            'analyzer' => 'ik_max_word',
            'search_analyzer' => 'ik_smart',
        ],
    ],
];

// A match query, which analyzes (tokenizes) the search text
public function buildQueryPayload()
{
    $query = $this->builder->query;
    return [
        "should" => [
            ["match" => ["top_description" => $query]],
            ["match" => ["middle_description" =>  $query]],
            ["match" => ["bottom_description" =>  $query]],
        ],
    ];
}

Suppose the table's content column stores JSON like this:

{
    "0":{
        "type":"text",
        "data":{
            "text":"值 得体验"
        }
    },
    "1":{
        "type":"image",
        "data":{
            "key":"oqjxx06tzfgzpwuayjrh",
            "format":"jpg",
            "width":1600,
            "height":2400
        }
    }
}

The mapping and the query payload are then defined like this:

protected $defaultMapping = [
    'properties' => [
        'id' => [
            'type' => 'long',
        ],
        'cid' => [
            'type' => 'long',
        ],
        'type' => [
            'type' => 'long',
        ],
        'content' => [
            'type' => 'nested',
            'properties' => [
                'type' => [
                    'type' => 'text',
                ],
                'data' => [
                    'properties' => [
                        'text' => [
                            'type' => 'text',
                            'analyzer' => 'ik_max_word',
                            'search_analyzer' => 'ik_smart',
                        ],
                        'key' => [
                            'type' => 'text',
                        ],
                        'format' => [
                            'type' => 'text',
                        ],
                        'width' => [
                            'type' => 'long',
                        ],
                        'height' => [
                            'type' => 'long',
                        ],
                        'url' => [
                            'type' => 'text',
                        ]
                    ],
                ]
            ]
        ],
        'created_at' => [
            'type' => 'date',
            'format' => 'yyyy-MM-dd HH:mm:ss',
        ],
        'updated_at' => [
            'type' => 'date',
            'format' => 'yyyy-MM-dd HH:mm:ss',
        ],
        'deleted_at' => [
            'type' => 'date',
            'format' => 'yyyy-MM-dd HH:mm:ss',
        ],
    ]
];

public function buildQueryPayload()
{
    $query = $this->builder->query;
    return [
        "must" => [
            "nested" => [
                "path" => "content",
                "query" => [
                    "match" => ["content.data.text" => $query]
                ],
            ],
        ],
    ];
}
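
The mapping above describes each block directly under content (content.type, content.data), which only matches the data if content is indexed as a list of blocks rather than as an object keyed by "0", "1", and so on. One way to make sure of that is to normalize the value in the model's toSearchableArray() method. The helper below is a hypothetical sketch of mine, not part of Scout or the driver, and it assumes content is either a raw JSON string or an already-decoded array.

```php
<?php

/**
 * Turn the stored "content" value into a plain list of blocks.
 * Hypothetical helper, not part of Scout or the driver.
 */
function normalizeContentBlocks($content): array
{
    // Accept either a raw JSON string or an already-decoded array.
    if (is_string($content)) {
        $content = json_decode($content, true);
    }

    // Re-index so that gaps in the keys ("0", "2", ...) cannot turn
    // the value back into a JSON object when it is encoded.
    return is_array($content) ? array_values($content) : [];
}

// Inside the model, assuming `content` holds the JSON shown above:
//
// public function toSearchableArray()
// {
//     $array = $this->toArray();
//     $array['content'] = normalizeContentBlocks($this->content);
//     return $array;
// }
```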

For everything else, just follow the documentation.

References:
https://www.elastic.co/guide/en/elasticsearch/reference/current/properties.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
