druid.io Local Cluster Setup / Extending to a Multi-Machine Cluster

druid.io is a fairly heavyweight database/query system, divided into five node types (broker, coordinator, historical, middleManager, overlord).

This post won't introduce the database itself; if you have questions, see the white paper:

http://pan.baidu.com/s/1eSFlIJS

Single-machine cluster setup

 

First, the general cluster setup, based on version 0.9.1.1.

Download: http://pan.baidu.com/s/1hrJBjlq

 

Edit common.runtime.properties under conf/druid/_common, using the following configuration as a reference:

 

#
# Licensed to Metamarkets Group Inc. (Metamarkets) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. Metamarkets licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

#
# Extensions
#

# This is not the full list of Druid extensions, but common ones that people often use. You may need to change this list
# based on your particular setup.
#druid.extensions.loadList=["druid-kafka-eight", "druid-s3-extensions", "druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "mysql-metadata-storage"]
druid.extensions.loadList=["mysql-metadata-storage"]

# If you have a different version of Hadoop, place your Hadoop client jar files in your hadoop-dependencies directory
# and uncomment the line below to point to your directory.
#druid.extensions.hadoopDependenciesDir=/my/dir/hadoop-dependencies

#
# Logging
#

# Log all runtime properties on startup. Disable to avoid logging properties on startup:
druid.startup.logging.logProperties=true

#
# Zookeeper
#

druid.zk.service.host=10.202.4.22:2181
druid.zk.paths.base=/druid

#
# Metadata storage
#

# For Derby server on your Druid Coordinator (only viable in a cluster with a single Coordinator, no fail-over):
#druid.metadata.storage.type=derby
#druid.metadata.storage.connector.connectURI=jdbc:derby://metadata.store.ip:1527/var/druid/metadata.db;create=true
#druid.metadata.storage.connector.host=metadata.store.ip
#druid.metadata.storage.connector.port=1527

# For MySQL:
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://10.202.4.22:3306/druid?characterEncoding=UTF-8
druid.metadata.storage.connector.user=szh
druid.metadata.storage.connector.password=123456

# For PostgreSQL (make sure to additionally include the Postgres extension):
#druid.metadata.storage.type=postgresql
#druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
#druid.metadata.storage.connector.user=...
#druid.metadata.storage.connector.password=...

#
# Deep storage
#

# For local disk (only viable in a cluster if this is a network mount):
druid.storage.type=local
druid.storage.storageDirectory=var/druid/segments

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files are in the cp):
#druid.storage.type=hdfs
#druid.storage.storageDirectory=/druid/segments

# For S3:
#druid.storage.type=s3
#druid.storage.bucket=your-bucket
#druid.storage.baseKey=druid/segments
#druid.s3.accessKey=...
#druid.s3.secretKey=...

#
# Indexing service logs
#

# For local disk (only viable in a cluster if this is a network mount):
druid.indexer.logs.type=file
druid.indexer.logs.directory=var/druid/indexing-logs

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files are in the cp):
#druid.indexer.logs.type=hdfs
#druid.indexer.logs.directory=/druid/indexing-logs

# For S3:
#druid.indexer.logs.type=s3
#druid.indexer.logs.s3Bucket=your-bucket
#druid.indexer.logs.s3Prefix=druid/indexing-logs

#
# Service discovery
#

druid.selectors.indexing.serviceName=druid/overlord
druid.selectors.coordinator.serviceName=druid/coordinator

#
# Monitoring
#

druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
druid.emitter=logging
druid.emitter.logging.logLevel=info



 

0.9.1.1 does not ship with the MySQL extension by default; download it yourself, unpack it, and place it under extensions/ (see the example after the reference link below).

Reference:

http://druid.io/docs/0.9.1.1/operations/including-extensions.html
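
For example (the archive name below is an assumption; use whatever you actually downloaded, and substitute your own Druid install root for DRUID_HOME):

# unpack the MySQL metadata-storage extension into the extensions directory
tar -xzf mysql-metadata-storage-0.9.1.1.tar.gz -C "$DRUID_HOME"/extensions/
# the extension jars should now be visible here
ls "$DRUID_HOME"/extensions/mysql-metadata-storage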

 

 

The extension packages can be downloaded from:

http://druid.io/downloads.html

Next, edit the startup configuration of each of the five node types:

/conf/druid/${serviceName}/runtime.properties

 

Broker node:

/conf/druid/broker/runtime.properties

 

druid.host=10.202.4.22:9102
druid.service=druid/broker
druid.port=9102

# HTTP server threads
druid.broker.http.numConnections=5
druid.server.http.numThreads=25

# Processing threads and buffers
druid.processing.buffer.sizeBytes=32768
druid.processing.numThreads=2

# Query cache
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.cache.type=local
druid.cache.sizeInBytes=2000000000



 

 

Coordinator node:

/conf/druid/coordinator/runtime.properties

druid.host=10.202.4.22:8082
druid.service=druid/coordinator
druid.port=8082



Historical node:

/conf/druid/historical/runtime.properties

 

druid.service=druid/historical
druid.host=10.202.4.22:9002
druid.port=9002

# HTTP server threads
druid.server.http.numThreads=25

# Processing threads and buffers
druid.processing.buffer.sizeBytes=6870912
druid.processing.numThreads=7

druid.historical.cache.useCache=false
druid.historical.cache.populateCache=false

# Segment storage
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize"\:13000000}]
druid.server.maxSize=13000000



 

 

MiddleManager node:

/conf/druid/middleManager/runtime.properties

 

druid.host=10.202.4.22:8091
druid.service=druid/middleManager
druid.port=8091

# Number of tasks per middleManager
druid.worker.capacity=3

# Task launch parameters
druid.indexer.runner.javaOpts=-server -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
druid.indexer.task.baseTaskDir=var/druid/task

# HTTP server threads
druid.server.http.numThreads=25

# Processing threads and buffers
druid.processing.buffer.sizeBytes=65536
druid.processing.numThreads=2

# Hadoop indexing
druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.3.0"]



 

 

 

Overlord node:

/conf/druid/overlord/runtime.properties

 

druid.service=druid/overlord
druid.host=10.202.4.22:9100
druid.port=9100

#druid.indexer.queue.startDelay=PT30S

druid.indexer.runner.type=remote
druid.indexer.storage.type=metadata

 

 

 

To start the cluster, you can use the script I wrote:

 

#!/bin/bash
# Manual equivalent for a single node:
#java `cat conf/druid/broker/jvm.config | xargs` -cp conf/druid/_common:conf/druid/broker:lib/* io.druid.cli.Main server broker

function help(){
    echo "Argument list"
    echo " arg1 arg2"
    echo " serviceName [-f]"
    echo "arg1: serviceName: name of the service to start"
    echo "serviceName options:"
    echo "1: broker"
    echo "2: coordinator"
    echo "3: historical"
    echo "4: middleManager"
    echo "5: overlord"
    echo "arg2: [-f]: whether to start in the foreground"
    echo "-f: foreground; omit it for background (default)"
}

function startService(){
    echo $service
    # the classpath is quoted so the JVM, not the shell, expands lib/*
    if [[ $2 == "-f" ]]; then
        echo "starting in the foreground"
        java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath "conf/druid/_common:conf/druid/$service:lib/*" io.druid.cli.Main server $service
    else
        echo "starting in the background"
        nohup java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath "conf/druid/_common:conf/druid/$service:lib/*" io.druid.cli.Main server $service &
    fi
}

function tips(){
    red=`tput setaf 1`
    reset=`tput sgr0`
    echo "${red}Incorrect arguments${reset}"
    echo "please use --help or -h for help"
}

if [[ $1 == "--help" || $1 == "-h" ]]; then
    help
    exit
fi

service=$1

case $service in
    "broker") ;;
    "coordinator") ;;
    "historical") ;;
    "middleManager") ;;
    "overlord") ;;
    *)
        tips
        exit
esac

if [[ $2 == "-f" || $2 == "" ]]; then
    startService $1 $2
else
    tips
    exit
fi


Place the script above in the Druid root directory.
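
For example, if you save it as druid_service.sh (the file name is arbitrary) and make it executable, usage looks like:

./druid_service.sh broker           # start the broker in the background
./druid_service.sh historical -f    # start the historical node in the foreground
./druid_service.sh --help           # print the argument list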

 

 

 

The startup output is shown in the screenshot in the original post.

 

 

 

Multi-machine cluster setup

 

The above sets up the cluster on a single machine; extending it to multiple machines is simple.

 

You only need to change druid.host=10.202.4.22:9102 in the runtime.properties of conf/druid/(broker | coordinator | historical | middleManager | overlord) to the IP address of the machine each node runs on.
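
For example, on a second machine (the IP 10.202.4.23 below is hypothetical), conf/druid/broker/runtime.properties would begin:

# druid.host changed to this machine's own IP (10.202.4.23 is hypothetical)
druid.host=10.202.4.23:9102
druid.service=druid/broker
druid.port=9102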

 

 

 

 

=========================================================

 

Finally, here is the local cluster I set up successfully (it includes the cluster startup script I wrote):

http://pan.baidu.com/s/1bJjFzg

It is based on version 0.9.1.1 and uses MySQL for metadata storage (the extension package is already included).

After downloading and unpacking my druid.tgz local cluster:

1. Edit each of the configuration files:

conf/druid/broker/runtime.properties 

conf/druid/coordinator/runtime.properties 

conf/druid/historical/runtime.properties 

conf/druid/middleManager/runtime.properties 

conf/druid/overlord/runtime.properties 

changing druid.host=10.202.4.22:9102 to your own IP address.

2. In conf/druid/_common/common.runtime.properties, replace the MySQL host and port, username, password, ZooKeeper address, and so on with your own.

 

Tip: MySQL does not create the druid database for you; create it yourself first (install MySQL, then create the druid database and the druid user).
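
A minimal sketch of that setup, assuming root access to the MySQL host and reusing the szh/123456 credentials from common.runtime.properties above:

mysql -u root -p -h 10.202.4.22 <<'SQL'
-- create the metadata database and the user Druid will connect as
CREATE DATABASE druid DEFAULT CHARACTER SET utf8;
CREATE USER 'szh'@'%' IDENTIFIED BY '123456';
GRANT ALL PRIVILEGES ON druid.* TO 'szh'@'%';
FLUSH PRIVILEGES;
SQL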

3. Start the five node services with cluster_start_service.sh in the root directory.

If you then open the MySQL druid database, you will find that Druid has automatically created several tables.
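
You can check with something like:

mysql -u szh -p -h 10.202.4.22 -e 'SHOW TABLES' druid
# expect tables such as druid_segments, druid_rules, druid_config, druid_tasks, ...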

4. Run the tests:

(1) data ingestion
(2) data queries

Switch to the test directory, which contains two subdirectories.

(1) Data ingestion:

Switch to the test_load directory.

The scripts submit_csv_task.sh and submit_json_task.sh there submit a CSV test task and a JSON test task, respectively.

You need to edit env.sh and set overlord_ip to your own overlord's address and port.

Now try running the submit_json_task.sh script, then check the ingestion task's status in the web console, as shown in the original post's screenshot. The address to open is the overlord node's address and port.
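
Under the hood, submitting a task is just a POST of the task-spec JSON to the overlord; a sketch of what submit_json_task.sh presumably does (the spec file name here is hypothetical):

overlord_ip=10.202.4.22:9100   # set in env.sh: your overlord address:port
curl -X POST -H 'Content-Type: application/json' \
     -d @json_task_spec.json \
     "http://${overlord_ip}/druid/indexer/v1/task"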

 

Wait for the task to complete successfully, then switch to the query test directory and run the query script:
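
The query script presumably POSTs a query JSON to the broker's query endpoint; a sketch (query.json is a placeholder name):

broker_ip=10.202.4.22:9102   # the broker's address:port
curl -X POST -H 'Content-Type: application/json' \
     -d @query.json \
     "http://${broker_ip}/druid/v2/?pretty"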
