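Laravel's official full-text search package, Scout, does not ship with Elasticsearch support. Thanks to its well-designed interface, however, there are plenty of third-party drivers. I chose babenkoivan/scout-elasticsearch-driver.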
After installing it as described in its documentation, run the following commands to create the basic files:
    # Create the index configurator
    php artisan make:index-configurator "IndexConfigurators\UserIndexConfigurator"

    # Create the search rule
    php artisan make:search-rule "SearchRules\UserSearchRule"
Following the documentation, add the ScoutElastic\Searchable trait to the existing App\User model and define the two properties $indexConfigurator and $searchRules. Then run:
    # Create the index
    php artisan elastic:create-index "App\IndexConfigurators\UserIndexConfigurator"
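The model changes described above might look roughly like the following sketch. The App\IndexConfigurators and App\SearchRules namespaces are inferred from the artisan commands, and extending Laravel's default Authenticatable base class is an assumption; the rest of the class body is illustrative rather than the author's actual model.

```php
<?php

namespace App;

use App\IndexConfigurators\UserIndexConfigurator;
use App\SearchRules\UserSearchRule;
use Illuminate\Foundation\Auth\User as Authenticatable;
use ScoutElastic\Searchable;

class User extends Authenticatable
{
    // Pulls in the Scout integration provided by the driver.
    use Searchable;

    // Index configurator generated by make:index-configurator.
    protected $indexConfigurator = UserIndexConfigurator::class;

    // Search rules generated by make:search-rule.
    protected $searchRules = [
        UserSearchRule::class,
    ];
}
```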
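The mapping can be set either by adding a $mapping property to the model or by defining it in the index configurator. Mappings tend to be long, so I recommend putting it in the $defaultMapping property of the IndexConfigurator.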
    protected $defaultMapping = [
        'properties' => [
            'avatar' => [
                'type' => 'text',
                'fields' => [
                    'keyword' => ['type' => 'keyword', 'ignore_above' => 256],
                ],
            ],
            'comment_counts' => ['type' => 'long'],
            'created_at' => [
                'type' => 'text',
                'fields' => [
                    'keyword' => ['type' => 'keyword', 'ignore_above' => 256],
                ],
            ],
            'email' => [
                'type' => 'text',
                'fields' => [
                    'keyword' => ['type' => 'keyword', 'ignore_above' => 256],
                ],
            ],
            'fans_counts' => ['type' => 'long'],
            'following_counts' => ['type' => 'long'],
            'id' => ['type' => 'long'],
            'key' => [
                'type' => 'text',
                'fields' => [
                    'keyword' => ['type' => 'keyword', 'ignore_above' => 256],
                ],
            ],
            'name' => [
                'type' => 'text',
                'fields' => [
                    'raw' => ['type' => 'keyword'],
                ],
            ],
            'register_source' => [
                'type' => 'text',
                'fields' => [
                    'keyword' => ['type' => 'keyword', 'ignore_above' => 256],
                ],
            ],
            'score_counts' => ['type' => 'long'],
            'topic_counts' => ['type' => 'long'],
            'updated_at' => [
                'type' => 'text',
                'fields' => [
                    'keyword' => ['type' => 'keyword', 'ignore_above' => 256],
                ],
            ],
        ],
    ];
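The above is the mapping for a users table. Elasticsearch generated it automatically when data was inserted directly, and parts of it are not quite right: the timestamp fields, for example, were also typed as text. I recommend writing the mapping yourself before indexing any data, and then running: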
    # Update the mapping
    php artisan elastic:update-mapping "App\User"
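As an illustration of such a hand-written correction, the timestamp fields could be declared as date rather than text. This is only a sketch: the yyyy-MM-dd HH:mm:ss format is the one used in the nested example later in this post, and it is an assumption that your timestamps are serialized in that form. The plain $defaultMapping variable stands in for the configurator property so the snippet can run on its own.

```php
<?php

// Hand-written replacement for the auto-generated timestamp mappings.
// The date format is an assumption; adjust it to your serialized values.
$dateField = [
    'type' => 'date',
    'format' => 'yyyy-MM-dd HH:mm:ss',
];

// Stand-in for the configurator's $defaultMapping property.
$defaultMapping = [
    'properties' => [
        'created_at' => $dateField,
        'updated_at' => $dateField,
    ],
];

// Print the mapping as JSON for inspection.
echo json_encode($defaultMapping, JSON_PRETTY_PRINT), PHP_EOL;
```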
Next, define the search rule. In UserSearchRule:
    public function buildHighlightPayload()
    {
        return [
            'pre_tags' => ['<font color="red">'],
            'post_tags' => ['</font>'],
            'fields' => [
                'name' => ['type' => 'plain'],
                'email' => ['type' => 'plain'],
            ],
        ];
    }

    public function buildQueryPayload()
    {
        // Wrap the user's query in wildcards for substring-style matching.
        $query = "*{$this->builder->query}*";

        return [
            'should' => [
                ['wildcard' => ['name' => $query]],
                ['wildcard' => ['email' => $query]],
            ],
        ];
    }
Here, user search uses wildcard matching directly against the name and email fields. The two conditions are ORed together, which is why they go under should.
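To see what this rule produces, the payload construction can be pulled out into a standalone function. This is a minimal sketch: buildWildcardPayload is a hypothetical helper that mirrors buildQueryPayload() without the Laravel builder, and 'tom' is just an example search term.

```php
<?php

// Hypothetical standalone helper mirroring buildQueryPayload(),
// so the generated clause can be inspected without Laravel.
function buildWildcardPayload(string $term): array
{
    // Surround the term with wildcards, as the search rule does.
    $pattern = "*{$term}*";

    return [
        'should' => [
            ['wildcard' => ['name' => $pattern]],
            ['wildcard' => ['email' => $pattern]],
        ],
    ];
}

$payload = buildWildcardPayload('tom');

// Prints {"should":[{"wildcard":{"name":"*tom*"}},{"wildcard":{"email":"*tom*"}}]}
echo json_encode($payload), PHP_EOL;
```

A document matches if either wildcard clause matches, since both sit in the same should list.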
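For long text that needs to be tokenized, the mapping and the query payload change slightly. The following example runs a tokenized search across several long text fields.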
    // Specify the Chinese analyzers via analyzer and search_analyzer
    protected $defaultMapping = [
        'properties' => [
            'top_description' => [
                'type' => 'text',
                'analyzer' => 'ik_max_word',
                'search_analyzer' => 'ik_smart',
            ],
            'middle_description' => [
                'type' => 'text',
                'analyzer' => 'ik_max_word',
                'search_analyzer' => 'ik_smart',
            ],
            'bottom_description' => [
                'type' => 'text',
                'analyzer' => 'ik_max_word',
                'search_analyzer' => 'ik_smart',
            ],
        ],
    ];

    // Run a match query, which tokenizes the query text
    public function buildQueryPayload()
    {
        $query = $this->builder->query;

        return [
            'should' => [
                ['match' => ['top_description' => $query]],
                ['match' => ['middle_description' => $query]],
                ['match' => ['bottom_description' => $query]],
            ],
        ];
    }
Suppose the content column of a table stores JSON data like the following:
    {
        "0": {
            "type": "text",
            "data": {
                "text": "值 得体验"
            }
        },
        "1": {
            "type": "image",
            "data": {
                "key": "oqjxx06tzfgzpwuayjrh",
                "format": "jpg",
                "width": 1600,
                "height": 2400
            }
        }
    }
The mapping and query payload are then defined like this:
    protected $defaultMapping = [
        'properties' => [
            'id' => ['type' => 'long'],
            'cid' => ['type' => 'long'],
            'type' => ['type' => 'long'],
            'content' => [
                'type' => 'nested',
                'properties' => [
                    'type' => ['type' => 'text'],
                    'data' => [
                        'properties' => [
                            'text' => [
                                'type' => 'text',
                                'analyzer' => 'ik_max_word',
                                'search_analyzer' => 'ik_smart',
                            ],
                            'key' => ['type' => 'text'],
                            'format' => ['type' => 'text'],
                            'width' => ['type' => 'long'],
                            'height' => ['type' => 'long'],
                            'url' => ['type' => 'text'],
                        ],
                    ],
                ],
            ],
            'created_at' => [
                'type' => 'date',
                'format' => 'yyyy-MM-dd HH:mm:ss',
            ],
            'updated_at' => [
                'type' => 'date',
                'format' => 'yyyy-MM-dd HH:mm:ss',
            ],
            'deleted_at' => [
                'type' => 'date',
                'format' => 'yyyy-MM-dd HH:mm:ss',
            ],
        ],
    ];

    public function buildQueryPayload()
    {
        $query = $this->builder->query;

        return [
            'must' => [
                'nested' => [
                    'path' => 'content',
                    'query' => [
                        'match' => ['content.data.text' => $query],
                    ],
                ],
            ],
        ];
    }
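The post does not show how the content column is turned into documents for indexing. One possible approach, assuming the column holds the JSON string shown above, is to decode it and drop the "0", "1", ... keys so that each block becomes one entry of the nested field, for example from the model's toSearchableArray() method. The helper name contentToNestedDocs is hypothetical.

```php
<?php

// Hypothetical helper: convert the stored JSON object into a plain list
// of blocks. Returns an empty list if the JSON cannot be decoded.
function contentToNestedDocs(string $json): array
{
    $decoded = json_decode($json, true);

    if (!is_array($decoded)) {
        return [];
    }

    // Discard the "0", "1", ... keys so the result encodes as a JSON array.
    return array_values($decoded);
}

$json = '{"0":{"type":"text","data":{"text":"值 得体验"}},'
      . '"1":{"type":"image","data":{"key":"oqjxx06tzfgzpwuayjrh",'
      . '"format":"jpg","width":1600,"height":2400}}}';

$docs = contentToNestedDocs($json);

echo count($docs), PHP_EOL;               // prints 2
echo $docs[1]['data']['format'], PHP_EOL; // prints jpg
```

With the blocks indexed as a list, the nested query on content.data.text is evaluated against each block separately.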
For everything else, just follow the documentation.
References:
https://www.elastic.co/guide/en/elasticsearch/reference/current/properties.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html