Full-Text Search with Laravel Scout and Elasticsearch

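As the title says, I recently added search to my blog using the Laravel Scout package together with Elasticsearch. Because my public server is underpowered, I installed Elasticsearch on a small server at home and use frp for NAT traversal to expose the search service. Both machines run Ubuntu.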

Installing Elasticsearch

JDK

Elasticsearch needs a Java runtime. I installed oracle-java8 directly:

sudo add-apt-repository ppa:webupd8team/java
sudo apt update
sudo apt install oracle-java8-installer

The way Oracle Java is installed has since changed, so either install it yourself or install OpenJDK instead: sudo apt install openjdk-11-jdk

Elasticsearch

Install Elasticsearch directly from the apt package:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
sudo apt update && sudo apt install elasticsearch
sudo systemctl restart elasticsearch

For more details, see: Install Elasticsearch with Debian Package
If you see the warning Falling back to java on path. This behavior is deprecated. Specify JAVA_HOME, set the JAVA_HOME variable for Elasticsearch:

sudo vi /etc/default/elasticsearch
# JAVA_HOME path for OpenJDK
JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
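
If you are unsure which path to put there, it can be derived from the java binary on the PATH. The following is a minimal sketch, assuming GNU readlink (standard on Ubuntu); java_home_from_bin and detect_java_home are helper names made up for illustration:

```shell
#!/usr/bin/env bash

# Hypothetical helper: strip the trailing /bin/java from a resolved binary path.
java_home_from_bin() {
    local bin_path="$1"
    printf '%s\n' "${bin_path%/bin/java}"
}

# Resolve the symlink chain behind `java` (on Ubuntu it usually goes through
# /etc/alternatives) and print the directory to use as JAVA_HOME.
# Prints nothing if java is not installed.
detect_java_home() {
    local java_bin
    java_bin="$(command -v java)" || return 0
    java_home_from_bin "$(readlink -f "$java_bin")"
}

detect_java_home
```

The printed directory is what goes after JAVA_HOME= in /etc/default/elasticsearch.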

IK analyzer

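For Chinese search I use the analysis-ik tokenizer plugin. At the time of writing it was not compatible with the latest Elasticsearch release (6.2.4), so I downgraded Elasticsearch and reinstalled it: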

sudo apt purge elasticsearch
sudo apt install elasticsearch=6.2.3
cd /usr/share/elasticsearch
sudo ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.3/elasticsearch-analysis-ik-6.2.3.zip
sudo systemctl restart elasticsearch
sudo systemctl enable elasticsearch
sudo apt-mark hold elasticsearch # keep apt from upgrading elasticsearch, which would break the version match with the plugin

At this point Elasticsearch is installed and running. The command curl http://localhost:9200 returns JSON similar to the following:

{
  "name" : "YTdGKc4",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "NZyCrA5aQWOT8Thc1GYXwg",
  "version" : {
    "number" : "6.2.3",
    "build_hash" : "c59ff00",
    "build_date" : "2018-03-13T10:06:29.741383Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
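
Because the ik plugin has to match the Elasticsearch version exactly, it can be handy to pull the version number out of that response and to confirm that the analyzer is actually loaded. The following is a rough sketch, assuming an unauthenticated node on localhost:9200; the function names are made up for illustration, while _analyze is Elasticsearch's standard analysis endpoint:

```shell
#!/usr/bin/env bash
ES_URL="${ES_URL:-http://localhost:9200}"   # assumption: local, unauthenticated node

# Read the root-endpoint JSON on stdin and print version.number.
es_version() {
    sed -n 's/.*"number" *: *"\([^"]*\)".*/\1/p'
}

# Build the request body for the _analyze API. The text is inserted verbatim,
# so it must not contain double quotes or backslashes.
analyze_body() {
    printf '{"analyzer":"%s","text":"%s"}' "$1" "$2"
}

# Ask Elasticsearch to tokenize a string with the given analyzer.
ik_analyze() {
    curl -s -H 'Content-Type: application/json' \
        -X POST "$ES_URL/_analyze?pretty" \
        -d "$(analyze_body "$1" "$2")"
}
```

For example, curl -s http://localhost:9200 | es_version prints the running version, and ik_analyze ik_max_word "全文搜索" should return a list of tokens. An error response instead of tokens suggests the plugin was not loaded.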

Search configuration in Laravel

Installing and configuring dependencies

Install Laravel Scout and the Elasticsearch driver package. I used Scout ^4.0:

composer require laravel/scout
composer require tamayo/laravel-scout-elastic
php artisan vendor:publish --provider="Laravel\Scout\ScoutServiceProvider"

Laravel's package auto-discovery already registers the ServiceProvider, so the next step is to edit the published config file, config/scout.php:

<?php
...
'driver' => env('SCOUT_DRIVER', 'elasticsearch'),
...
    'elasticsearch' => [
        'index' => env('ELASTICSEARCH_INDEX', 'lowlog'),
        'hosts' => [
            env('ELASTICSEARCH_HOST', 'http://localhost'),
        ],
    ],
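
The env() calls in this config read their values from the project's .env file. Here is a sketch of the matching entries, assuming the same values as the defaults shown in the config. With the frp tunnel set up later in this post, the host can stay http://localhost on the public server, because the tunnel exposes port 9200 there:

```shell
# .env entries (values mirror the defaults in the scout config)
SCOUT_DRIVER=elasticsearch
ELASTICSEARCH_INDEX=lowlog
# No port here: the index-creation command in this post appends :9200 itself
ELASTICSEARCH_HOST=http://localhost
```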

Updating the model and creating the index

Update the model that should be searchable (App\Post) so that it supports Scout:

<?php

namespace App;

use Illuminate\Database\Eloquent\Model;
use Laravel\Scout\Searchable;

class Post extends Model
{
    use Searchable;

    // Only index published posts; drafts are skipped.
    public function shouldBeSearchable()
    {
        return $this->is_draft === 0;
    }

    // Only these fields are sent to Elasticsearch.
    public function toSearchableArray()
    {
        return array_only($this->toArray(), ['id', 'title', 'html']);
    }
}

Create an artisan command that creates the index:

php artisan make:command EsInit

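Set the command's $signature to es:init (the name it is run under later), then put the following in its handle() method: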

$client = new \GuzzleHttp\Client();
$url = config('scout.elasticsearch.hosts')[0] . ':9200/' . config('scout.elasticsearch.index');
$client->put($url, [
    \GuzzleHttp\RequestOptions::JSON => [
        'settings' => [
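            'number_of_shards' => 1, // number of primary shards in the index
            'number_of_replicas' => 0 // number of replica shards per primary shard
        ],
        'mappings' => [
            'posts' => [  // type name (roughly a MySQL table)
                '_all' => [   // whether to enable the catch-all _all field
                    'enabled' => false
                ],
                '_source' => [ // store the original document
                    'enabled' => true
                ],
                'properties' => [   // field definitions (roughly MySQL column types)
                    'id' => [
                        'type' => 'integer', // field type, e.g. integer, float, double, boolean, date, text, keyword
                        //'index'=> 'not_analyzed', // whether to index as an exact value: analyzed / not_analyzed
                    ],
                    'title' => [
                        'type' => 'text', // full-text field; use keyword for exact values, but note that a keyword field is matched as a whole and cannot be used for fuzzy search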
                        "analyzer" => "ik_max_word",
                        "search_analyzer" => "ik_max_word",
                    ],
                    'html' => [
                        'type' => 'text',
                        "analyzer" => "ik_max_word",
                        "search_analyzer" => "ik_max_word",
                    ]
                ]
            ]
        ]
    ]
]);
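
Once the command has run, the result can be checked from the shell. The following sketch rebuilds the same URL as the handle() code and asks Elasticsearch for the index's mapping. ES_HOST and ES_INDEX are placeholder variables mirroring the config values, and the function names are made up for illustration:

```shell
#!/usr/bin/env bash
ES_HOST="${ES_HOST:-http://localhost}"   # same value as ELASTICSEARCH_HOST
ES_INDEX="${ES_INDEX:-lowlog}"           # same value as ELASTICSEARCH_INDEX

# Mirror the PHP concatenation: host . ':9200/' . index
index_url() {
    printf '%s:9200/%s' "$1" "$2"
}

# Print the mapping; title and html should show ik_max_word as their analyzer.
show_mapping() {
    curl -s "$(index_url "$ES_HOST" "$ES_INDEX")/_mapping?pretty"
}

# Drop the index so that the command can create it again from scratch.
delete_index() {
    curl -s -X DELETE "$(index_url "$ES_HOST" "$ES_INDEX")"
}
```

Creating an index that already exists fails, so after changing the mapping in handle() the old index has to be removed first (delete_index here), at the cost of re-importing all documents.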

NAT traversal with frp

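frp is a handy tool, and I find it simpler to use for NAT traversal than ngrok. Here it forwards Elasticsearch requests on port 9200 to the machine on my home network that runs Elasticsearch. I won't go into how to use frp; here are the configurations for the server side (frps, running on the public server) and the client side (frpc, running on the internal machine):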
/etc/frps/frps.ini

[common]
bind_addr = 0.0.0.0
bind_port = 7000
bind_udp_port = 7001
kcp_bind_port = 7000

log_file = /var/log/frps.log
log_level = info
log_max_days = 3

privilege_token = xxxxxxxx
privilege_allow_ports = 2000-3000,3001,3003,4000-50000

max_pool_count = 5

/etc/frpc/frpc.ini

[common]
server_addr = xx.xx.xx.xx
server_port = 7000
token = xxxxxxxx

log_file = /var/log/frpc.log
log_level = info
log_max_days = 3

[elastic]
type = tcp
local_ip = 127.0.0.1
local_port = 9200
remote_port = 9200

Here is the systemd unit that starts frps/frpc at boot (frps is used as the example; the frps binary was moved to /usr/bin beforehand):
/etc/systemd/system/frps.service:

[Unit]
Description=frps daemon
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/frps -c /etc/frps/frps.ini
[Install]
WantedBy=multi-user.target

Then start it with systemctl:

sudo systemctl start frps
sudo systemctl enable frps

Initializing search

Once both ends are running, test access to Elasticsearch from the public server with curl http://localhost:9200. If that succeeds, go to the Laravel project and initialize search:

php artisan es:init
php artisan scout:import "App\Post"
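
To confirm that the import actually reached Elasticsearch, the index can be queried directly, bypassing Laravel. Here is a sketch assuming the lowlog index and the title field from the mapping above; the helper names are made up for illustration, while _count and _search are standard Elasticsearch endpoints:

```shell
#!/usr/bin/env bash
ES_URL="${ES_URL:-http://localhost:9200}"
ES_INDEX="${ES_INDEX:-lowlog}"

# Build a match query against the title field. The keyword is inserted
# verbatim, so it must not contain double quotes or backslashes.
match_title_body() {
    printf '{"query":{"match":{"title":"%s"}}}' "$1"
}

# Number of documents currently in the index.
count_docs() {
    curl -s "$ES_URL/$ES_INDEX/_count?pretty"
}

# Run the match query and print the raw hits.
search_title() {
    curl -s -H 'Content-Type: application/json' \
        -X POST "$ES_URL/$ES_INDEX/_search?pretty" \
        -d "$(match_title_body "$1")"
}
```

Running count_docs should report a non-zero count after the import, and search_title laravel returns the raw hits for a keyword, which helps separate indexing problems from problems in the Laravel code.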

Search route, controller, and view

Add a route in routes/web.php: Route::match(['get', 'post'], '/s', 'PostController@search')->name('post.search');
PostController@search:

public function search(Request $request)
{
    $q = $request->get('q');
    $paginatedPosts = [];
    if ($q) {
        $paginatedPosts = Post::search($q)->paginate();
    }
    return view('post.search', compact('paginatedPosts', 'q'));
}

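Note that the \Laravel\Scout\Builder returned by search() has far fewer methods than Laravel's query builder. Simple equality conditions such as where('id', 1) still work, and so does paginate(), but things like relationship queries are not available.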

References