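Cassandra Distributed Cluster
1 Set up a Cassandra cluster, verify that it works correctly, and capture screenshots of the process.
2 Why is it said of a Bloom filter that "the cost of determining whether an element is in a set is independent of the total number of elements"? Does the false-positive rate depend on the number of elements? Why?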
First, make sure that all nodes in the cluster use the same cluster name and the same keyspace definitions, so that the new node can accept data.
Edit the config file on the second node to indicate that the first node will act as the seed.
Then set auto_bootstrap to true.
1.
IP | Seed node?
192.168.1.106 | yes
192.168.1.111 | no
[root@datanode01 conf]# mkdir /var/log/cassandra
[root@datanode01 conf]# chown student /var/log/cassandra/
[root@datanode01 bin]# mkdir /var/lib/cassandra
[root@datanode01 bin]# chown student /var/lib/cassandra
192.168.1.106 (cassandra.yaml):
- seeds: "192.168.1.106"
listen_address: 192.168.1.106
rpc_address: 192.168.1.106
192.168.1.111 (cassandra.yaml):
- seeds: "192.168.1.106"
listen_address: 192.168.1.111
rpc_address: 192.168.1.111
Check the cluster status:
[student@datanode01 bin]$ ./nodetool status
xss = -ea -javaagent:./../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms503M -Xmx503M -Xmn100M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.1.111 55.32 KB 256 100.0% fddbf3a2-a221-4e88-bd2b-19e3db13894b rack1
UN 192.168.1.106 40.82 KB 256 100.0% ff335767-f93c-48d4-92d9-ae11aa3b0f40 rack1
[student@datanode01 bin]$
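The status check above can also be scripted. The following is a minimal sketch, not part of the original exercise: it parses text in the format that nodetool status printed above and confirms that both nodes report UN (Up/Normal). The function name parse_nodetool_status and the SAMPLE constant are hypothetical names introduced for illustration; the sample text is copied from the output above.

```python
def parse_nodetool_status(output: str) -> dict:
    """Map each node address to its two-letter status/state code (e.g. 'UN')."""
    nodes = {}
    for line in output.splitlines():
        fields = line.split()
        # Node rows start with a two-letter code: U/D for status, N/L/J/M for state.
        if (
            len(fields) >= 2
            and len(fields[0]) == 2
            and fields[0][0] in "UD"
            and fields[0][1] in "NLJM"
        ):
            nodes[fields[1]] = fields[0]
    return nodes


# Sample copied from the nodetool status output shown above.
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.1.111 55.32 KB 256 100.0% fddbf3a2-a221-4e88-bd2b-19e3db13894b rack1
UN 192.168.1.106 40.82 KB 256 100.0% ff335767-f93c-48d4-92d9-ae11aa3b0f40 rack1
"""

status = parse_nodetool_status(SAMPLE)
expected_nodes = {"192.168.1.106", "192.168.1.111"}
all_up = set(status) == expected_nodes and all(code == "UN" for code in status.values())
print(status)
print("cluster healthy:", all_up)
```

On a live node, the same function could be fed the captured output of ./nodetool status instead of the sample string.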
2.
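The time cost of determining whether an element is in a set consists of the time to evaluate each hash function plus the time to check each resulting position in the bit vector. The hash functions are fixed and each bit check takes a fixed amount of time, so the time cost of a lookup is fixed as well. It does not change with the number of elements; in other words, it is independent of it.
The space cost of a lookup consists mainly of the space for the hash results and the space for the bit vector. The number of hash functions and their algorithms are fixed, so the space for the hash results is fixed, and the length of the bit vector is fixed too; neither grows with the number of elements. The space cost is therefore also independent of the total number of elements.
The false-positive rate does depend on the number of elements. The more elements are added, the more positions in the bit vector are set to 1, so it becomes more likely that every position computed for an element that was never added already holds 1 because of hash collisions. The false-positive rate therefore rises as the number of elements grows.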
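The fixed lookup cost can be seen in a small sketch. This is an illustrative Bloom filter, not Cassandra's implementation: the class name BloomFilter, the parameters m (number of bits) and k (number of hash functions), and the use of salted SHA-256 as the hash family are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter with m bits and k hash functions (illustrative only)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        # One byte per bit, chosen for clarity rather than compactness.
        self.bits = bytearray(m)

    def _positions(self, item: str):
        # Derive k positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # At most k hash evaluations and k bit reads, however many items were added.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=10_000, k=4)
for user_id in range(500):
    bf.add(f"user-{user_id}")

print(bf.might_contain("user-42"))  # an added element is always reported as present
print(len(bf.bits))                 # the vector length stays at m
```

A lookup touches at most k positions in a vector of length m, and neither k nor m changes as elements are added, which is why both the time and the space cost are independent of the element count.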
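To make the dependence on the element count concrete, the sketch below evaluates the commonly used approximation p ≈ (1 - e^(-kn/m))^k for a fixed vector length m and hash count k while n grows. The formula is not from the original text; it assumes independent, uniformly distributed hash functions, and the values of m, k, and n are arbitrary illustration values.

```python
import math


def false_positive_rate(n: int, m: int, k: int) -> float:
    """Approximate false-positive probability after n insertions into m bits with k hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k


# Fixed filter size; only the number of inserted elements changes.
m, k = 10_000, 4
rates = [(n, false_positive_rate(n, m, k)) for n in (100, 1_000, 5_000, 20_000)]
for n, p in rates:
    print(f"n={n:>6}  p={p:.6f}")
```

With m and k held constant, the estimated rate increases monotonically with n, which matches the argument above: more elements set more bits to 1.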
A first try of CQL:
[student@datanode01 bin]$ ./cqlsh
Connected to Test Cluster at localhost:9160.
[cqlsh 4.1.0 | Cassandra 2.0.3 | CQL spec 3.1.1 | Thrift protocol 19.38.0]
Use HELP for help.
cqlsh> create keysapce yao with replication = {'class':'SimpleStrategy','replication_factor':1};
Bad Request: line 1:7 no viable alternative at input 'keysapce'
cqlsh> create keyspace yao with replication = {'class':'SimpleStrategy','replication_factor':1};
cqlsh> use yao
   ... ;
cqlsh:yao> create table users(userid int primary key,fname text,lname text);
cqlsh:yao> drop table users;
cqlsh:yao> create table users(user_id int primary key,fname text,lname text);
cqlsh:yao> INSERT INTO users (user_id, fname, lname)
       ... VALUES (1745, 'john', 'smith');
INSERT INTO users (user_id, fname, lname)
VALUES (1744, 'john', 'doe');
INSERT INTO users (user_id, fname, lname)
VALUES (1746, 'john', 'smith');cqlsh:yao> INSERT INTO users (user_id, fname, lname)
       ... VALUES (1744, 'john', 'doe');
cqlsh:yao> INSERT INTO users (user_id, fname, lname)
       ... VALUES (1746, 'john', 'smith');
cqlsh:yao> select * from users;

 user_id | fname | lname
---------+-------+-------
    1745 |  john | smith
    1744 |  john |   doe
    1746 |  john | smith

(3 rows)

cqlsh:yao> create index on users(lname);
cqlsh:yao> select * from users where lname='smith';

 user_id | fname | lname
---------+-------+-------
    1745 |  john | smith
    1746 |  john | smith

(2 rows)