OpenSearch Fundamentals

OpenSearch basics

  1. Cluster

    • Contains one or more nodes
    • Managed by an elected cluster manager node (formerly called the master node)
  2. Node

    • A single server that is part of a cluster
    • Types: cluster-manager-eligible (master-eligible), data, ingest, etc.
  3. Index

    • Collection of documents with similar characteristics
    • Divided into shards, which are distributed across the nodes
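  4. Shard

    The data in an OpenSearch index can grow very large. To keep it manageable, the index is split into multiple shards. Each OpenSearch shard is an Apache Lucene index, and each Lucene index holds a subset of the documents in the OpenSearch index. Splitting an index this way keeps resource usage under control. A single Apache Lucene index is limited to 2,147,483,519 documents.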

    • Single Lucene instance
    • Holds part of an index's data
    • Types: Primary and replica
  5. Document

    • Basic unit of information
    • Expressed in JSON format
  6. Field

    • Smallest individual unit of data in a document
    • Has a defined datatype
  7. Mapping

    • Defines how a document and its fields are stored and indexed
  8. Segment

    • An immutable part of a shard that holds its own inverted index
    • Created when newly indexed documents are written out on refresh
    • Merged into larger segments over time
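
The shard limit and the mapping concept above can be tied together in a small sketch. It treats the Lucene ceiling as Java's Integer.MAX_VALUE minus 128, derives the minimum number of primary shards for a given document count, and prints a request body for creating an index. The helper names min_primary_shards and index_body, the index name articles and the field names are all hypothetical, and real shards are normally sized far below this hard ceiling.

```shell
# Per-shard document ceiling quoted above: Integer.MAX_VALUE - 128.
LUCENE_MAX_DOCS=$(( (1 << 31) - 1 - 128 ))

# min_primary_shards DOC_COUNT
# Prints the smallest number of primary shards whose combined ceiling
# can hold DOC_COUNT documents (ceiling division).
min_primary_shards() {
  echo $(( ($1 + $LUCENE_MAX_DOCS - 1) / $LUCENE_MAX_DOCS ))
}

# index_body PRIMARY_SHARDS REPLICAS
# Prints a JSON body for "PUT /<index>": shard settings plus an explicit
# mapping. The three fields are illustrative assumptions, not from the post.
index_body() {
  cat <<EOF
{
  "settings": {
    "index": { "number_of_shards": $1, "number_of_replicas": $2 }
  },
  "mappings": {
    "properties": {
      "title":      { "type": "text" },
      "author":     { "type": "keyword" },
      "created_at": { "type": "date" }
    }
  }
}
EOF
}

# Body for an index expected to hold five billion documents, one replica each.
index_body "$(min_primary_shards 5000000000)" 1

# Hypothetical usage against a node published on localhost:9200:
# index_body 3 1 | curl -s -X PUT "http://localhost:9200/articles" \
#   -H 'Content-Type: application/json' --data-binary @-
```

The number of primary shards is fixed when the index is created, which is why it is worth estimating before the first document is written.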

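If OpenSearch runs as a cluster, the plugin must be installed on every node.

Installing analysis (tokenizer) plugins in OpenSearch

# Enter the Docker container
docker exec -it opensearch-node1 bash
  1. Install an analysis plugin provided by OpenSearch (analysis-smartcn)

    # Inside the container, run the following command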
    bin/opensearch-plugin install analysis-smartcn
    
  2. Install a plugin from a zip archive

    # file:///opensearch-analysis-ik.zip is the location of the archive inside the container
    bin/opensearch-plugin install file:///opensearch-analysis-ik.zip
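
Because every node needs the plugin, the install step can be scripted instead of repeated by hand. The sketch below only prints the docker commands so they can be reviewed first. It assumes the container names opensearch-node1 and opensearch-node2 used elsewhere in this post, that the plugin CLI sits at /usr/share/opensearch/bin/opensearch-plugin inside the image, and that each node is restarted afterwards so the plugin gets loaded. plugin_install_commands is a hypothetical helper name.

```shell
# plugin_install_commands PLUGIN NODE...
# Prints (does not run) one install command and one restart command per node.
# PLUGIN is either a plugin name or a file:// URL inside the container.
plugin_install_commands() {
  plugin=$1
  shift
  for node in "$@"; do
    # --batch skips the interactive confirmation, since there is no TTY here
    echo "docker exec $node /usr/share/opensearch/bin/opensearch-plugin install --batch $plugin"
    echo "docker restart $node"
  done
}

plugin_install_commands analysis-smartcn opensearch-node1 opensearch-node2

# To execute the printed commands instead of just reviewing them:
# plugin_install_commands analysis-smartcn opensearch-node1 opensearch-node2 | sh
```

Note that a plugin installed this way lives inside the container's filesystem, so it has to be installed again if the container is recreated from the image.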
    

Errors when starting OpenSearch after installing the IK analyzer:

  1. NoClassDefFoundError: org/apache/commons/logging/LogFactory

    # A jar is missing. Download commons-logging-1.2.jar and copy it into the plugin directory (opensearch-node1 is the container name)
    docker cp /data/openSearch/commons-logging-1.2.jar opensearch-node1:/usr/share/opensearch/plugins/opensearch-analysis-ik
    
  2. NullPointerException at org.wltea.analyzer.dic.Dictionary.singleton._StopWords when indexing data that goes through the analyzer

    # Enter the Docker container
    docker exec -it opensearch-node1 bash
    # Check whether /usr/share/opensearch/plugins/opensearch-analysis-ik/config exists; if it does not, create it
    cd /usr/share/opensearch/plugins/opensearch-analysis-ik
    mkdir -p config
    chmod 755 config
    # Copy every file from the opensearch-analysis-ik config directory into config
    cp -r /usr/share/opensearch/config/opensearch-analysis-ik/* /usr/share/opensearch/plugins/opensearch-analysis-ik/config
    

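Other operations:

# Locate a file on the host with find
find / -name "sonar-pmd-plugin-2.6.jar"
# Edit IKAnalyzer.cfg.xml to configure custom word segmentation, then copy it into the plugin's config directory
docker cp /data/openSearch/IKAnalyzer.cfg.xml opensearch-node1:/usr/share/opensearch/plugins/opensearch-analysis-ik/config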
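
Once a plugin and its configuration are in place, one way to check the tokenizer is the _analyze API. The sketch below builds the request body. It assumes the analyzer names ik_max_word and ik_smart registered by the IK plugin (smartcn for analysis-smartcn) and a node reachable on localhost:9200. analyze_body is a hypothetical helper name.

```shell
# analyze_body ANALYZER TEXT
# Prints a JSON body for POST /_analyze. TEXT must not contain double quotes
# or backslashes, because this sketch does no JSON escaping.
analyze_body() {
  printf '{ "analyzer": "%s", "text": "%s" }\n' "$1" "$2"
}

analyze_body ik_max_word "中华人民共和国"

# Hypothetical usage: send the body to a running node and inspect the tokens.
# analyze_body ik_max_word "中华人民共和国" | curl -s -X POST \
#   "http://localhost:9200/_analyze" -H 'Content-Type: application/json' --data-binary @-
```

If a custom dictionary entry was added through IKAnalyzer.cfg.xml, it should show up as a single token in the response after the node has been restarted.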

docker-compose file for an OpenSearch cluster:

version: '3'
services:
  opensearch-node1:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster # Name the cluster
      - node.name=opensearch-node1 # Name the node that will run in this container
      - discovery.seed_hosts=opensearch-node1,opensearch-node2 # Nodes to look for when discovering the cluster
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2 # Nodes eligible to serve as cluster manager
      - bootstrap.memory_lock=true # Disable JVM heap memory swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM
      - "DISABLE_INSTALL_DEMO_CONFIG=true" # Prevents execution of bundled demo script which installs demo certificates and security configurations to OpenSearch
      - "DISABLE_SECURITY_PLUGIN=true" # Disables Security plugin
    ulimits:
      memlock:
        soft: -1 # Set memlock to unlimited (no soft or hard limit)
        hard: -1
      nofile:
        soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
        hard: 65536
    volumes:
      - opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container
    ports:
      - 9200:9200 # REST API
      - 9300:9300 # Transport port (node-to-node communication)
      - 9600:9600 # Performance Analyzer
    networks:
      - opensearch-net # All of the containers will join the same Docker bridge network
  opensearch-node2:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster # Name the cluster
      - node.name=opensearch-node2 # Name the node that will run in this container
      - discovery.seed_hosts=opensearch-node1,opensearch-node2 # Nodes to look for when discovering the cluster
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2 # Nodes eligible to serve as cluster manager
      - bootstrap.memory_lock=true # Disable JVM heap memory swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM
      - "DISABLE_INSTALL_DEMO_CONFIG=true" # Prevents execution of bundled demo script which installs demo certificates and security configurations to OpenSearch
      - "DISABLE_SECURITY_PLUGIN=true" # Disables Security plugin
    ulimits:
      memlock:
        soft: -1 # Set memlock to unlimited (no soft or hard limit)
        hard: -1
      nofile:
        soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
        hard: 65536
    volumes:
      - opensearch-data2:/usr/share/opensearch/data # Creates volume called opensearch-data2 and mounts it to the container
    networks:
      - opensearch-net # All of the containers will join the same Docker bridge network
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:latest
    container_name: opensearch-dashboards
    ports:
      - 5601:5601 # Map host port 5601 to container port 5601
    expose:
      - "5601" # Expose port 5601 for web access to OpenSearch Dashboards
    environment:
      - 'OPENSEARCH_HOSTS=["http://opensearch-node1:9200","http://opensearch-node2:9200"]'
      - "DISABLE_SECURITY_DASHBOARDS_PLUGIN=true" # disables security dashboards plugin in OpenSearch Dashboards
    networks:
      - opensearch-net

volumes:
  opensearch-data1:
  opensearch-data2:

networks:
  opensearch-net:
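
After the stack defined above is started, a quick check is to ask the cluster for its health and confirm that both nodes joined. The sketch below extracts the status field from a _cluster/health response. It assumes the file is saved as docker-compose.yml in the current directory and that port 9200 is published on localhost without TLS or authentication, as in the compose file. health_status is a hypothetical helper name.

```shell
# health_status
# Reads a _cluster/health JSON response on stdin and prints the value of its
# "status" field (green, yellow or red).
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Hypothetical usage once the containers are running:
# docker compose up -d
# curl -s "http://localhost:9200/_cluster/health" | health_status
# curl -s "http://localhost:9200/_cat/nodes?v"
```

A yellow status usually means some replica shards are unassigned, which is expected if only one of the two nodes is up.
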
posted @ MapleDream