Archives

Mesos on Raspberry Pi

This Friday I saw an email that mentioned compiling Mesos for the Raspberry Pi.
It caught my interest and reminded me of the happy time I had playing with a Raspberry Pi when I was still a student two years ago.
The Raspberry Pi is much cheaper now than it was back then (Raspberry Pi Zero ID: 2885 - $5.00).

To compile Mesos for the Raspberry Pi, there are three possible approaches:

  1. Compile Mesos on the Raspberry Pi directly.
  2. Use a cross-compiling toolchain to build Mesos for ARM.
  3. Compile Mesos in a Raspberry Pi virtual machine.

For the first approach, the Mesos stout dependency contains a lot of header files, so compiling Mesos requires a huge amount of memory. This makes it impossible to compile Mesos on current Raspberry Pi hardware because of the lack of memory.

The second approach should work in theory. However, Mesos checks its dependent libraries during the configure stage by compiling and running small test programs, which means those checks have to be removed from the configure script manually. After removing them, I fell into another trap: dependency loops. The minimal set of libraries Mesos depends on is zlib, apr, apr-util, and subversion, and they all need to be cross-compiled and prepared for Mesos first. I was eventually blocked by cross-compiling subversion: I needed to resolve its dependencies, its dependencies' dependencies, their dependencies, and so on. That leaves compiling Mesos in a Raspberry Pi virtual machine as the only remaining option. (A rough sketch of the abandoned cross-compiling attempt is shown below.)
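
The following is only a sketch of what the cross-compiling attempt looks like, assuming the common Debian arm-linux-gnueabihf toolchain; the exact flags I used varied and are not reproduced here.

# Point configure at the ARM cross compilers (the toolchain name is an assumption).
export CC=arm-linux-gnueabihf-gcc
export CXX=arm-linux-gnueabihf-g++
../configure --host=arm-linux-gnueabihf
# The configure checks fail because they try to run ARM test binaries on x86,
# which is why those checks had to be stripped out of the configure script by hand.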

qemu-arm can emulate the Raspberry Pi architecture on an x86 machine. I found a related Vagrantfile on GitHub which makes it easier to set up a Raspberry Pi development environment. However, it has been unmaintained for over 3 years, and the Debian version it uses (Wheezy) is not new enough to compile Mesos either. So I created a Docker image, haosdent/raspberry, based on the newest Raspberry Pi operating system (Jessie) according to its Puppet files.

With this Docker image, we can start compiling Mesos for the Raspberry Pi.

1. Launch the Raspberry Pi Development Environment.

docker run -i -t --net=host --volume=`pwd`/mesos:/root/mesos haosdent/raspberry /bin/bash

In the command above, I mount my local Mesos code into the Docker container. Keep in mind that we have to use the master branch of Mesos, because the bundled ZooKeeper package was updated in the master branch recently; compiling ZooKeeper on the Raspberry Pi fails with older versions of Mesos.
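
For example, the source tree mounted above can be fetched from the Apache Mesos GitHub mirror (the target directory name simply matches the volume path used in the docker run command):

git clone https://github.com/apache/mesos.git mesos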

Then we use sb2 -eR to enter the emulated Raspberry Pi environment. The shell prompt looks like

[SB2 emulate raspberry] root@raspberry ~ #

if you have entered the emulated Raspberry Pi environment successfully.

2. Patch pivot_root in Mesos Code

Currently, Mesos still fails to compile on the Raspberry Pi because __NR_pivot_root is undefined. I had taken a look at @lyda’s patch for this before, but it looked incorrect to me, so I modified it to:

diff --git a/src/linux/fs.cpp b/src/linux/fs.cpp
index 2087b4a..f29ce8a 100644
--- a/src/linux/fs.cpp
+++ b/src/linux/fs.cpp
@@ -444,6 +444,16 @@ Try<Nothing> pivot_root(
// number for 'pivot_root' on the powerpc architecture, see
// https://w3challs.com/syscalls/?arch=powerpc_64
int ret = ::syscall(203, newRoot.c_str(), putOld.c_str());
+#elif __thumb__
+ // A workaround for arm thumb mode. The magic number '218' is the syscall
+ // number for 'pivot_root' on the arm thumb mode, see
+ // https://w3challs.com/syscalls/?arch=arm_thumb
+ int ret = ::syscall(218, newRoot.c_str(), putOld.c_str());
+#elif __arm__
+ // A workaround for arm. The magic number '9437402' is the syscall
+ // number for 'pivot_root' on the arm architecture, see
+ // https://w3challs.com/syscalls/?arch=arm_strong
+ int ret = ::syscall(9437402, newRoot.c_str(), putOld.c_str());
#else
#error "pivot_root is not available"
#endif

3. Follow the Mesos Getting Started Guide

After finishing the preparation above, we can use the following commands to compile Mesos.

apt-get install -y tar wget git
apt-get install -y autoconf libtool
apt-get -y install build-essential python-dev python-boto libcurl4-nss-dev libsasl2-dev libsasl2-modules maven libapr1-dev libsvn-dev
cd ~/mesos
./bootstrap
mkdir build
cd build
../configure --disable-python --disable-java
make

Note that we disable Java and Python when compiling. I encountered some tricky problems when compiling with Java on the Raspberry Pi and have not yet investigated them.

This stage takes quite a long time; it took more than 8 hours on my slow machine. You can use make -j <number of cores> if your machine has more CPU cores.
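
For example, on a build host with 4 cores available to the container (the core count here is only an illustration):

make -j 4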

4. Launch Your First Mesos Task on the Raspberry Pi

If you finished the stages above, the build should have succeeded, and we can copy the whole package over to a real Raspberry Pi machine. However, some Mesos features, such as the replicated log and cgroups, do not work correctly on the Raspberry Pi yet, so we need to use the following commands to start the Mesos master and agent.
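
For example, the build output can be copied over with scp; the hostname, user, and destination path below are assumptions, not taken from my setup:

scp -r mesos/build pi@raspberrypi.local:~/mesos-build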

4.1 Start Mesos Master

./bin/mesos-master.sh --work_dir=/tmp/mesos --ip=127.0.0.1 --hostname=127.0.0.1 --registry=in_memory

4.2 Start Mesos Agent

./bin/mesos-slave.sh --work_dir=/tmp/mesos --ip=127.0.0.1 --hostname=127.0.0.1 --master=127.0.0.1:5050 --launcher=posix

4.3 Submit a Mesos Task

After starting these Mesos components successfully, we can use mesos-execute to launch our Mesos task.

export LIBPROCESS_IP=127.0.0.1
export LIBPROCESS_HOSTNAME=127.0.0.1
./src/mesos-execute --master=127.0.0.1:5050 --name=test --command="ls /"

Now you should see something like

I0424 07:16:12.291718  2722 scheduler.cpp:177] Version: 0.29.0
Subscribed with ID 'b383e094-b7f2-4841-a737-c4899ef5c81b-0000'
Submitted task 'test' to agent 'b383e094-b7f2-4841-a737-c4899ef5c81b-S0'
Received status update TASK_RUNNING for task 'test'
source: SOURCE_EXECUTOR
Received status update TASK_FINISHED for task 'test'
message: 'Command exited with status 0'
source: SOURCE_EXECUTOR

and the stdout logs.

[SB2 emulate raspberry] root@precise64 build # cat /tmp/mesos/slaves/b383e094-b7f2-4841-a737-c4899ef5c81b-S0/frameworks/b383e094-b7f2-4841-a737-c4899ef5c81b-0000/executors/test/runs/latest/stdout
Registered executor on 127.0.0.1
Starting task test
sh -c 'ls /'
Forked command at 2794
bin
boot
dev
etc
home
lib
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
Command exited with status 0 (pid: 2794)
Shutting down
Sending SIGTERM to process tree at pid 2794
Sent SIGTERM to the following process trees:
[

]

Yes, they indicate that our Mesos task succeeded.

In addition, if you encounter the error below when launching Mesos in the Raspberry Pi virtual machine, make sure you run the Docker container with the option --cpuset-cpus=0, because qemu has a bug in multi-core environments.

/qemu-1.3.0/tcg/tcg.c:1440: tcg fatal error
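
Concretely, that just means adding the flag to the docker run command from step 1:

docker run -i -t --net=host --cpuset-cpus=0 --volume=`pwd`/mesos:/root/mesos haosdent/raspberry /bin/bash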

Above, we showed how to compile Mesos for the Raspberry Pi and how to launch Mesos on it. There is still a lot of work to do before Mesos works perfectly on the Raspberry Pi. I hope this helps if you are looking to run Mesos on ARM.

Notes on Building libhbase

The libhbase downloaded from MapR's GitHub fails to build on both OS X and Linux. I fixed a number of places based on the errors so that it builds and runs correctly on both OS X and Linux. The modified code is here:

git@github.com:haosdent/libhbase.git

Before building, make sure ant is installed and JDK 1.6 is used as JAVA_HOME, and add 203.208.46.222 googletest.googlecode.com to your hosts file (thanks to the great GFW), for example as shown below.
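
A one-liner for appending the entry mentioned above (run with root privileges):

echo "203.208.46.222 googletest.googlecode.com" >> /etc/hosts
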
After cloning the code, simply run:

mvn install -DskipTests

When it finishes normally you will see the target/libhbase-1.0-SNAPSHOT/ directory, which means the build succeeded.

For sample libhbase code, refer to the example_async file.

On Linux, when compiling your code you need to add libhbase.so and libjvm.so to the dynamic library search path, and include the header files produced by the build. The headers and libhbase.so are in the target/libhbase-1.0-SNAPSHOT/include and target/libhbase-1.0-SNAPSHOT/native directories respectively, and libjvm.so is under ${JAVA_HOME}/jre/lib/amd64/server.
The compile command looks like this:

gcc example_async.c -I ${LIB_HBASE_HOME}/target/libhbase-1.0-SNAPSHOT/include -L ${LIB_HBASE_HOME}/target/libhbase-1.0-SNAPSHOT/lib/native -L ${JAVA_HOME}/jre/lib/amd64/server -l hbase -l jvm -std=c99

To run the compiled binary on Linux, you also need to add ${JAVA_HOME}/jre/lib/amd64/server and target/libhbase-1.0-SNAPSHOT/native to the LD_LIBRARY_PATH environment variable, for example as shown below.
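
A minimal sketch, assuming the binary is the default a.out produced by the gcc command above:

export LD_LIBRARY_PATH=${JAVA_HOME}/jre/lib/amd64/server:${LIB_HBASE_HOME}/target/libhbase-1.0-SNAPSHOT/native:${LD_LIBRARY_PATH}
./a.out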

On OS X, compiling is similar to Linux, but the libraries to add to the dynamic library search path are libhbase.dylib and libjvm.dylib. libhbase.dylib is in the target/libhbase-1.0-SNAPSHOT/native directory, while libjvm.dylib is under ${JAVA_HOME}/../Libraries. The compile command looks like this:

gcc example_async.c -I ${LIB_HBASE_HOME}/target/libhbase-1.0-SNAPSHOT/include -L ${LIB_HBASE_HOME}/target/libhbase-1.0-SNAPSHOT/lib/native -L ${JAVA_HOME}/../Libraries -l hbase -l jvm -std=c99

Note that the gcc used on OS X must be GNU GCC rather than Apple's LLVM Clang; GCC can be installed directly via Homebrew.

To run the compiled binary on OS X, you also need to add ${JAVA_HOME}/../Libraries and target/libhbase-1.0-SNAPSHOT/native to the LD_LIBRARY_PATH environment variable.

Finally, I ran a simple test of libhbase's write performance. Since the interfaces libhbase provides are all asynchronous, I used a single thread to insert 1,000,000 rows into HBase. With the HLog (WAL) enabled, the write throughput was around 50,000 QPS, which basically saturated the network card.

Generating I/O Spectrograms

sysdig is a powerful debugging and analysis tool that makes it easy to troubleshoot network, disk I/O, CPU, and other issues (examples).

Another rather cool feature of sysdig is that it can draw I/O spectrograms.
[Figure: I/O spectrogram]

The spectrogram is rendered in real time by analyzing the latency of system calls such as open, close, read, write, and socket. The X axis is not evenly divided, the rendering rate defaults to twice per second, and colors from dark green to dark red indicate how frequent the I/O calls are.

The command for an I/O spectrogram of the whole system is:

sysdig -c spectrogram 500

To target a specific process, add a sysdig filter condition:

sysdig -c spectrogram proc.pid=20

You can also sample only the latency of a particular type of I/O:

sysdig -c spectrogram proc.pid=20 and fd.type=file

A Simple Performance Test of Hadoop HDFS on Aliyun ECS

I deployed Hadoop on Aliyun ECS and ran a simple HDFS performance test. The results are recorded below; the performance gap is fairly large.

The Aliyun ECS configuration used was as follows:

Parameter  Value
Region  Qingdao
CPU  1 core
Memory  512MB
Instance type  ecs.t1.xsmall
System disk  20GB
Operating system  CentOS 6.3 64-bit

The Hadoop HDFS configuration was as follows:

Parameter  Value
HDFS version  community 2.4.0
ECS instances in the cluster  6
JVM heap size  -Xmx400m
NameNode  2 ECS instances
JournalNode  1 ECS instance
DataNode  3 ECS instances

I used TestDFSIO on a separate ECS instance in the same network segment to run simple multi-threaded write and read throughput tests. The results are as follows (averaged over 3 runs); a sample invocation is shown after the table.

Test type  Concurrency  Size written per thread  Throughput
Write  10  1GB  24.01MB/s
Read  10  1GB  40.55MB/s
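
A sketch of the kind of TestDFSIO invocation used; the flags mirror the invocation quoted in the perf post below, and the jar path and result-file locations are illustrative:

hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.4.0-tests.jar TestDFSIO -write -nrFiles 10 -size 1000MB -resFile /tmp/dfsio-write.result
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.4.0-tests.jar TestDFSIO -read -nrFiles 10 -size 1000MB -resFile /tmp/dfsio-read.result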

Although this is the lowest-end configuration, no extra programs were running on the machines, Hadoop was only running HDFS, and the test program ran on yet another ECS instance, so under normal circumstances it should have been able to saturate the network card.

Since there are 3 DataNodes (and with the default replication factor of 3 every block is written to each of them), this throughput can be roughly treated as the rate at which a single DataNode writes to its cloud disk; the bottleneck is most likely the cloud disk.

Drawing CPU Flame Graphs and I/O Heat Maps for Hadoop with perf

perf's built-in way of viewing its results is mainly the TUI, which runs in the terminal. Brendan Gregg has written several Perl scripts that convert perf output into more intuitive flame graphs and heat maps. This post uses Hadoop HDFS as an example to show how to view perf results in a more intuitive way.

1. Install aliperf and taobao-jdk

sudo yum install aliperf -btest -y
sudo yum install taobao-jdk -y

Because the program being analyzed is a Java program, only the combination of aliperf and taobao-jdk can resolve the JIT symbols; otherwise, when viewing the perf results you will not see the corresponding Java methods and classes.

2. Modify the JVM Startup Options

The Hadoop used in this post is the community trunk version. Modify the ${HADOOP_NAMENODE_OPTS} and ${HADOOP_DATANODE_OPTS} variables in ${HADOOP_HOME}/etc/hadoop-env.sh so that the JVM starts with the libjvmti_perf.so agent, as shown below. After the changes, restart the NameNode and DataNode on the machine where perf will run.

export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} -agentpath:/usr/libexec/perf-core/libs/libjvmti_perf.so -XX:+UseOprofile $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS -agentpath:/usr/libexec/perf-core/libs/libjvmti_perf.so -XX:+UseOprofile $HADOOP_DATANODE_OPTS"

3. Draw a CPU Flame Graph

First, download Brendan Gregg's FlameGraph scripts:

git clone git@github.com:brendangregg/FlameGraph.git
cd FlameGraph

Run an HDFS TestDFSIO stress test so that the DataNode has some activity:

hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.4.0-tests.jar TestDFSIO -read -nrFiles 100 -size 1000MB -resFile ./logs/write.result

Then start perf to profile the DataNode; here 11691 is the DataNode's process ID.

perf record -a -g -p 11691

Once TestDFSIO finishes, you will see the perf.data file in the current directory. At this point you can already view the results directly with perf report --stdio, but to get a flame graph, perf.data needs one more conversion step:

perf script | ./stackcollapse-perf.pl | ./flamegraph.pl >perf.svg

After running the command above, copy perf.svg to your local machine and open it in a browser to see something like the result below. The output is actually an SVG, so hovering the mouse over a frame shows the full Java method name in the lower-left corner; since the image below is only a screenshot, that effect is lost.
[Figure: TestDFSIO flame graph]

4. Draw an I/O Heat Map

The steps are very similar. Download Brendan Gregg's HeatMap scripts:

git clone git@github.com:brendangregg/HeatMap.git
cd HeatMap

Run an HDFS TestDFSIO stress test so that the DataNode has some activity:

hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.4.0-tests.jar TestDFSIO -read -nrFiles 100 -size 1000MB -resFile ./logs/write.result

Then start perf to trace the DataNode's disk I/O events. Here block_rq_issue is the event emitted when an I/O request is issued and block_rq_complete is the event emitted when the I/O request completes; 11691 is the DataNode's process ID. This measures the time from when the DataNode issues an I/O request until the operating system finishes handling it.

perf record -e block:block_rq_issue -e block:block_rq_complete -a

Once TestDFSIO finishes, you will see the perf.data file in the current directory. Run the following commands to generate the I/O heat map:

perf script | awk '{ gsub(/:/, "") } $5 ~ /issue/ { ts[$6, $10] = $4 } $5 ~ /complete/ { if (l = ts[$6, $9]) { printf "%.f %.f\n", $4 * 1000000, ($4 - l) * 1000000; ts[$6, $10] = 0 } }' > out.lat_us
./trace2heatmap.pl --unitstime=us --unitslat=us --stepsec=40 --maxlat=100000 out.lat_us > out.svg

After running the commands above, copy out.svg to your local machine and open it in a browser to see something like the result below. The X axis is time, with the unit set by the --stepsec parameter above; the Y axis is the latency of each I/O operation, with the unit set by --unitstime and the upper bound set by --maxlat. Because the TestDFSIO run was fairly short, perf did not collect that many I/O operations; with a longer run the heat map would be much more impressive. The output is an SVG, so hovering the mouse over a point shows its exact meaning in the lower-left corner; since the image below is only a screenshot, that effect is lost.
[Figure: TestDFSIO heat map]

5. Conclusion

perf is a very powerful tool, and Brendan Gregg's scripts let us view its results much more intuitively. However, turning these results into actual performance improvements still requires enough familiarity with the program's code to make targeted changes.

Fixing Missing Native Library Errors with Hadoop LZO

There are two common errors related to Hadoop LZO:

  1. Could not load native gpl library
  2. native-lzo library not available

Each is explained separately below.

Could not load native gpl library

Many HBase users hit the following error when using BulkLoad to import data from Hadoop into HBase: hadoop-lzo reports that it cannot find gplcompression.

ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1738)
at java.lang.Runtime.loadLibrary0(Runtime.java:823)
at java.lang.System.loadLibrary(System.java:1028)
at com.Hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>

This error occurs because LZO compression was enabled when generating the HFiles. Enabling LZO compression effectively reduces the HFile size (about 20% compression ratio on average) and shortens the distcp transfer time. But because the java.library.path on Yunti 1 (云梯1) does not contain the gplcompression native library, generating HFiles with LZO enabled produces the error above. The fix is simple: download the hadoop-lzo-0.4.20-mr1 jar (built for MapReduce 1) or the hadoop-lzo-0.4.20 jar (built for MapReduce 2), add it via the -libjars parameter, and rerun the job, for example as sketched below.
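
A sketch of rerunning a BulkLoad job with the jar attached via -libjars, assuming the driver uses ToolRunner so that -libjars is parsed; the job jar, driver class, and paths are placeholders:

# Attach the hadoop-lzo jar so gplcompression can be loaded from it.
hadoop jar my-bulkload-job.jar com.example.BulkLoadDriver \
  -libjars /path/to/hadoop-lzo-0.4.20-mr1.jar \
  /hdfs/input/path /hdfs/hfile/output/path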

This works because the hadoop-lzo author anticipated the situation and packaged gplcompression directly into the jar. Inspecting hadoop-lzo-0.4.20-mr1.jar shows that the gplcompression native libraries are all placed under native/Linux-amd64-64/lib inside the jar:

$: jar -tf hadoop-lzo-0.4.20-mr1.jar |grep native
native/
native/Linux-amd64-64/
native/Linux-amd64-64/lib/
native/Linux-amd64-64/lib/libgplcompression.a
native/Linux-amd64-64/lib/libgplcompression.la
native/Linux-amd64-64/lib/libgplcompression.so.0.0.0
native/Linux-amd64-64/lib/libgplcompression.so.0
native/Linux-amd64-64/lib/libgplcompression.so

The hadoop-lzo implementation first extracts the gplcompression native library from the jar to a temporary location and then loads it. For the details, see GPLNativeCodeLoader#unpackBinaries in the author's code hosted on GitHub:

// locate the binaries inside the jar
String fileName = System.mapLibraryName(LIBRARY_NAME);
String directory = getDirectoryLocation();
// use the current defining classloader to load the resource
InputStream is = GPLNativeCodeLoader.class.getResourceAsStream(directory + "/" + fileName);

native-lzo library not available

Another common Hadoop LZO error is:

java.lang.RuntimeException: native-lzo library not available

This error means the machine running your HDFS-writing program does not have lzo-devel installed, so the program cannot find liblzo2.so.2 on the LD_LIBRARY_PATH. Install it on that machine with the following command:

yum install lzo lzo-devel

Alternatively, copy /usr/lib64/liblzo2.so.2 from a machine that already has LZO installed, and load it manually in your code:

System.load("/path/to/liblzo2.so.2");  // the path where liblzo2.so.2 is stored

jcgroup

# Cgroup on JVM


jcgroup is a cgroup wrapper for the JVM. You can use this library to limit the CPU shares, disk I/O speed, network bandwidth, etc. of a thread.

Subsystems

☑ blkio

☑ common

☑ cpu

☑ cpuacct

☑ cpuset

☑ devices

☑ freezer

☑ memory

☑ net_cls

☑ net_prio

Example

This code snippet creates two threads and sets different CPU shares for them: one gets 512 while the other gets 2048.

[Figure: jcgroup_example_cpu]

public class ExampleTest {

    private static final Logger LOG = LoggerFactory.getLogger(ExampleTest.class);
    private static Admin admin;
    private static Group root;
    private static Group one;
    private static Group two;

    @BeforeClass
    public static void setUpClass() {
        try {
            admin = new Admin(Constants.SUBSYS_CPUSET | Constants.SUBSYS_CPU);
            root = admin.getRootGroup();
            one = admin.createGroup("one", Constants.SUBSYS_CPUSET | Constants.SUBSYS_CPU);
            two = admin.createGroup("two", Constants.SUBSYS_CPUSET | Constants.SUBSYS_CPU);
        } catch (IOException e) {
            LOG.error("Create cgroup Failed.", e);
            assertTrue(false);
        }
    }

    @AfterClass
    public static void tearDownClass() {
        try {
            admin.umount();
        } catch (IOException e) {
            LOG.error("Umount cgroup failed.", e);
            assertTrue(false);
        }
    }

    @Test
    public void testCpu() {
        try {
            one.getCpuset().setCpus(new int[]{0});
            two.getCpuset().setCpus(new int[]{0});
            one.getCpuset().setMems(new int[]{0});
            two.getCpuset().setMems(new int[]{0});
            one.getCpu().setShares(512);
            two.getCpu().setShares(2048);
            final Group oneTmp = one;
            final Group twoTmp = two;
            new Thread() {
                @Override
                public void run() {
                    int id = Threads.getThreadId();
                    LOG.info("Thread id:" + id);
                    try {
                        oneTmp.getCpu().addTask(id);
                        while (true);
                    } catch (IOException e) {
                        LOG.error("Test cpu failed.", e);
                        assertTrue(false);
                    }
                }
            }.start();
            new Thread() {
                @Override
                public void run() {
                    int id = Threads.getThreadId();
                    LOG.info("Thread id:" + id);
                    try {
                        twoTmp.getCpu().addTask(id);
                        while (true);
                    } catch (IOException e) {
                        LOG.error("Test cpu failed.", e);
                        assertTrue(false);
                    }
                }
            }.start();
            Thread.sleep(60000l);
        } catch (Exception e) {
            LOG.error("Test cpu failed.", e);
            assertTrue(false);
        }
    }
}

Requirements

  • Linux version (>= 2.6.18)

A Quick Glance at RocksDB

RocksDB

Facebook recently open-sourced a new KV database. Before it was published there were rumors comparing it to HBase, but now that it is public and you actually look at it, it is a single-node KV store: Facebook has reinvented a wheel that looks faster than LevelDB. Judging from the video alone, it makes fuller use of multi-core CPUs and SSDs than LevelDB. For example, operations such as compaction are done with multiple threads, and there are small tricks like deferring value increments. As for things like adding Bloom filters to scans, I am not sure whether the RocksDB developers borrowed them from HBase. Although it does look like a sizeable performance improvement over LevelDB, it strongly feels like a reinvented wheel to me, and I cannot see any huge significance in building it. I guess the folks at Facebook have KPI pressure too, haha.

Getting Started with Wasp

Wasp is a distributed relational database open-sourced by Alibaba, similar to Megastore and F1. This post briefly describes how to quickly deploy Wasp and how to connect to and operate it via JDBC.

Prerequisites

1. A distributed HBase cluster, with HBase already started.

2. Maven and JDK 6 installed on the machine used to build the code.

Building the Code

1. Clone the latest code from GitHub with the following command:

git clone https://github.com/alibaba/wasp.git

2. Enter the wasp directory and make sure the current JDK is JDK 6:

 ~/workspace/java$: cd wasp/
~/workspace/java/wasp$: java -version
java version "1.6.0_51"
Java(TM) SE Runtime Environment (build 1.6.0_51-b11-457-11M4509)
Java HotSpot(TM) 64-Bit Server VM (build 20.51-b01-457, mixed mode)

3. Run the following command to build the code:

mvn -DskipTests assembly:assembly

4. After the build completes, wasp-0.10-bin.tar.gz under the target directory is the archive we will use shortly.

Configuration

1. Upload wasp-0.10-bin.tar.gz to the designated directory on the server:

~/workspace/java/wasp$: scp target/wasp-0.10-bin.tar.gz haosong.hhs@10.232.98.96:develop/soft/

2. Log in to the server, extract wasp-0.10-bin.tar.gz, and enter the extracted directory:

~/develop/soft$: tar -zxvf wasp-0.10-bin.tar.gz

3. Edit wasp-site.xml in the conf directory and add the following configuration:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <!-- Parent znode for Wasp in ZooKeeper -->
    <name>zookeeper.wasp.znode.parent</name>
    <value>/wasp</value>
  </property>
  <property>
    <!-- ZooKeeper quorum used by Wasp; must be the same quorum used by the underlying HBase storage engine -->
    <name>wasp.zookeeper.quorum</name>
    <value>10.232.98.94,10.232.98.72,10.232.98.40</value>
  </property>
  <property>
    <!-- ZooKeeper client port used by Wasp -->
    <name>wasp.zookeeper.property.clientPort</name>
    <value>40060</value>
  </property>
  <property>
    <!-- Parent znode for HBase in ZooKeeper -->
    <name>zookeeper.znode.parent</name>
    <value>/hbase-cdh4</value>
  </property>
  <property>
    <!-- ZooKeeper quorum used by HBase -->
    <name>hbase.zookeeper.quorum</name>
    <value>10.232.98.94,10.232.98.72,10.232.98.40</value>
  </property>
  <property>
    <!-- ZooKeeper client port used by HBase -->
    <name>hbase.zookeeper.property.clientPort</name>
    <value>40060</value>
  </property>
  <property>
    <!-- Run the system in distributed mode -->
    <name>wasp.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <!-- Service port of the master node -->
    <name>wasp.master.port</name>
    <value>45050</value>
  </property>
  <property>
    <!-- Web UI port of the master -->
    <name>wasp.master.info.port</name>
    <value>45051</value>
  </property>
  <property>
    <!-- Service port of the FServer (data-serving) nodes -->
    <name>wasp.fserver.port</name>
    <value>45052</value>
  </property>
  <property>
    <!-- Web UI port of the FServer nodes -->
    <name>wasp.fserver.info.port</name>
    <value>45053</value>
  </property>
</configuration>

4. Edit wasp-env.sh in the conf directory to stop Wasp from starting its own ZooKeeper cluster:

export WASP_MANAGES_ZK=false

5. Edit the fservers file in the conf directory and add the FServer addresses; make sure passwordless SSH login to the FServers is already set up.

10.232.98.60
10.232.98.61
10.232.98.62

Deployment and Startup

1. After completing the configuration, sync the wasp-0.10 directory to all FServer machines, keeping the same directory layout on every machine; a sketch is shown below.
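
A sketch of distributing the directory with rsync; the host list reuses the fservers addresses above, and the source/destination path follows the ~/develop/soft directory used earlier:

for host in 10.232.98.60 10.232.98.61 10.232.98.62; do
  rsync -az ~/develop/soft/wasp-0.10 ${host}:~/develop/soft/
done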

2. Start Wasp with the following command:

~/develop/soft$: ./wasp-0.10/bin/start-wasp.sh

3. Start the Wasp shell and use the status command to check whether the startup succeeded. If you see output like the following, Wasp has started successfully:

~/develop/soft$: ./wasp-0.10/bin/wasp shell
wasp(main):061:0> status
3 servers, 0 dead, 0.3333 average load

Operating the Database via JDBC

After all the preparation above, we can finally use Wasp to store our data. Wasp provides the very familiar JDBC interface; the following shows how to connect to and operate Wasp via JDBC.

1. Configure Wasp's ZooKeeper settings in your code:

Properties props = new Properties();
/*
 * Configure the ZooKeeper properties for Wasp
 */
props.setProperty("wasp.zookeeper.quorum",
    "10.232.98.94,10.232.98.72,10.232.98.40");
props.setProperty("wasp.zookeeper.property.clientPort", "40060");

2. Load Wasp's JDBC driver:

/*
 * Load the Wasp JDBC driver and initialize the related objects
 */
com.alibaba.wasp.jdbc.Driver.load();
Connection conn = DriverManager.getConnection("jdbc:wasp:", props);
Statement stat = conn.createStatement();

OK, now we can operate Wasp by executing SQL statements directly, for example:

1. Create a table

/*
 * Create the user table with user_id as the primary key
 */
String sql = "CREATE TABLE user {REQUIRED INT64 user_id;"
    + " REQUIRED STRING name; }"
    + " PRIMARY KEY(user_id),"
    + " ENTITY GROUP ROOT,"
    + " ENTITY GROUP KEY(user_id);";
stat.execute(sql);

2. Insert a record

/*
 * Insert a record with id 1 and name 'test'
 */
sql = "INSERT INTO user(user_id,name) values(1,'test');";
stat.execute(sql);

3. Query records

/*
 * Query the record with user_id 1
 * The console output is: 1,test
 */
sql = "SELECT * FROM user WHERE user_id=1;";
ResultSet rs = stat.executeQuery(sql);
for (; rs.next(); ) {
    System.out.println(rs.getString("user_id") + "," + rs.getString("name"));
}

The complete code is:

package me.haosdent.test;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class WaspExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        /*
         * Configure the ZooKeeper properties for Wasp
         */
        props.setProperty("wasp.zookeeper.quorum", "10.232.98.94,10.232.98.72,10.232.98.40");
        props.setProperty("wasp.zookeeper.property.clientPort", "40060");

        /*
         * Load the Wasp JDBC driver
         */
        com.alibaba.wasp.jdbc.Driver.load();
        Connection conn = null;
        Statement stat = null;
        try {
            conn = DriverManager.getConnection("jdbc:wasp:", props);
            stat = conn.createStatement();
            /*
             * Create the user table with user_id as the primary key
             */
            String sql = "CREATE TABLE user {REQUIRED INT64 user_id;"
                + " REQUIRED STRING name; }"
                + " PRIMARY KEY(user_id),"
                + " ENTITY GROUP ROOT,"
                + " ENTITY GROUP KEY(user_id);";
            stat.execute(sql);
            Thread.sleep(2000);

            /*
             * Insert a record with id 1 and name 'test'
             */
            sql = "INSERT INTO user(user_id,name) values(1,'test');";
            stat.execute(sql);

            /*
             * Query the record with user_id 1
             * Console output: 1,test
             */
            sql = "SELECT * FROM user WHERE user_id=1;";
            ResultSet rs = stat.executeQuery(sql);
            for (; rs.next(); ) {
                System.out.println(rs.getString("user_id") + "," + rs.getString("name"));
            }
        } catch (SQLException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            try {
                stat.close();
                conn.close();
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }
}

After writing the code and exporting it as a jar, run the program with the following command; at runtime you need to add the lib directory from the extracted wasp-0.10-bin.tar.gz to the CLASSPATH:

java -cp /tmp/wasp-0.10/lib/*:/tmp/WaspExample.jar me.haosdent.test.WaspExample

The console output below shows that the record was successfully inserted into Wasp and queried back:

1,test

After logging in to the server and entering the Wasp shell, you can also see the user table created earlier:

## List all tables
wasp(main):062:0> show_tables
TABLE
user
1 row(s) in 0.0130 seconds

## Describe the user table
wasp(main):065:0> describe_table 'user'
+---------------------------+----------+----------+-----+-----+
| Field | Type | REQUIRED | Key | EGK |
+---------------------------+----------+----------+-----+-----+
| user_id | INT64 | REQUIRED | PRI | EGK |
| name | STRING | REQUIRED | | |
+---------------------------+----------+----------+-----+-----+
1 row(s) in 0.0050 seconds

That's the end of this getting-started example. You can learn more about Wasp's other interesting features from the other articles in this wiki. If you have any questions or requests while following along, feel free to open an issue here.