DaoCloud Private Container Cloud in Practice

Environment

  • CentOS 7.2
  • docker-engine 1.12.0
  • Linux kernel 3.10.0

Prerequisites

  • Install the Docker runtime
curl -sSL https://get.daocloud.io/docker | sh
systemctl start docker.service
  • Disable the firewall, disable SELinux, and enable Docker at boot
chkconfig docker on
setenforce 0 && sed -i '/^SELINUX=/c\SELINUX=disabled' /etc/selinux/config
systemctl stop firewalld
systemctl disable firewalld.service

Install the DCE controller node

bash -c "$(docker run --rm daocloud.io/daocloud/dce install --help)"
Install the DCE Controller.

Usage: do-install [options]

Description:
  The command will install the DCE controller on this machine.

Options:
  -q, --quiet             Quiet. Do not ask for anything.
  --force-pull            Always Pull Image, default is pull when missing.
  --swarm-port PORT       Specify the swarm manager port(default: 2376).
  --controller-port PORT  Specify the dce controller port(default: 80).
  --replica               Install as a replica for HA.
  --replica-controller IP Specify the primary controller IP installed.
  --no-overlay            Do not config Overlay network.
  --secure-registry       Config insecure-registry.
  --experimental          Enable experimental Swarm Experimental Features.
  --host-address IP       Specify the node IP address.
  --dry-run               Start as a test process without install DCE.
  • Run the installation
bash -c "$(docker run --rm daocloud.io/daocloud/dce install)"

Deploy a private registry

docker run -d -p 5000:5000 --name registry registry:2
Unable to find image 'registry:2' locally
2: Pulling from library/registry
e110a4a17941: Pull complete
2ee5ed28ffa7: Pull complete
d1562c23a8aa: Pull complete
06ba8e23299f: Pull complete
802d2a9c64e8: Pull complete
Digest: sha256:1b68f0d54837c356e353efb04472bc0c9a60ae1c8178c9ce076b01d2930bcc5d
Status: Downloaded newer image for registry:2
ba607b4267e6a32e123c5cb0bde2c77fd0a4970b9f9b714f5149a71858502a42
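
Once the container is running, one way to confirm that the registry is serving requests is to query its HTTP API. The sketch below assumes the registry is reachable over plain HTTP at localhost:5000, as in the docker run command above, and that it exposes the `/v2/_catalog` endpoint of the Registry HTTP API v2. The helper names `catalog_url`, `parse_catalog` and `list_repositories` are illustrative and not part of any Docker tooling.

```python
import json
from urllib.request import urlopen


def catalog_url(host="localhost", port=5000):
    """Build the URL of the v2 catalog endpoint (host and port are assumptions)."""
    return "http://%s:%d/v2/_catalog" % (host, port)


def parse_catalog(body):
    """Extract the repository names from a /v2/_catalog JSON response body."""
    return json.loads(body).get("repositories", [])


def list_repositories(host="localhost", port=5000, timeout=5):
    """Query a running registry; raises URLError if it cannot be reached."""
    with urlopen(catalog_url(host, port), timeout=timeout) as resp:
        return parse_catalog(resp.read().decode("utf-8"))


# Usage, with the registry container from above running:
#   list_repositories()
```

A freshly started registry should report an empty repository list until an image is pushed to it.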

Setting Up a Private Docker Registry

Environment

  • CentOS 6.6
  • Docker 1.7

Set up the private registry

  • docker-registry is the officially provided tool for building a private image repository.

  • Run it as a container

    docker run -d -p 5000:5000 -v /opt/docker-registry:/tmp/registry registry
    

The repository is created under /tmp/registry inside the container, and the -v option maps it to a specific path on the host:

/opt/docker-registry

  • Install directly on CentOS (without a container)
yum install -y python-devel libevent-devel python-pip gcc xz-devel
pip install docker-registry

Setting Up the Docker Runtime on CentOS

Environment

  • CentOS 6.6
  • Docker 1.7

Setup steps

Add the yum repository

tee /etc/yum.repos.d/docker.repo <<-'EOF'
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/$releasever/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
EOF

Install Docker 1.7.1

yum install -y docker-engine

Start the Docker service

chkconfig docker on
service docker start

Pull the Ubuntu image

docker pull ubuntu:12.04

List the images

docker images
REPOSITORY                      TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
ubuntu                          12.04               93932704ad15        10 days ago         139.3 MB
centos                          latest              7322fbe74aa5        13 months ago       172.2 MB
dl.dockerpool.com:5000/mysql    latest              3c6d7e5c8c1b        21 months ago       235.6 MB
dl.dockerpool.com:5000/java     7u65-jdk            bd8bd16075a0        21 months ago       562.7 MB
dl.dockerpool.com:5000/centos   latest              87e5b6b3ccc1        22 months ago       224 MB

Build a private image

yum -y install febootstrap
febootstrap -i bash -i wget -i yum -i iputils -i iproute -i man -i vim -i openssh-server -i openssh-clients -i tar -i gzip  centos6 centos6.8-image http://mirrors.aliyun.com/centos/6.8/os/x86_64/

centos6: the OS version.

centos6.8-image: the image files are saved to the centos6.8-image directory under the current path.

http://mirrors.aliyun.com/centos/6.8/os/x86_64/: the mirror URL of the CentOS 6.8 package tree.

Import the image files

[root@hj-t centos6.8-image]# ls
bin  boot  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  sbin  selinux  srv  sys  tmp  usr  var
[root@hj-t centos6.8-image]# cd ..
[root@hj-t ~]# cd centos6.8-image/ && tar -c .|docker import - centos6.8-base
fe95b256d2793f16fe41bd7a85b86ebef0fb5bb8a1b58d72cda2ac18e5b83735

View the new image

[root@hj-t centos6.8-image]# docker images
REPOSITORY                      TAG                 IMAGE ID            CREATED              VIRTUAL SIZE
centos6.8-base                  latest              fe95b256d279        About a minute ago   399.2 MB
trusty                          latest              8b40146fae0a        4 days ago           0 B
<none>                          <none>              38b969ffff81        5 days ago           172.2 MB
ubuntu                          12.04               93932704ad15        2 weeks ago          139.3 MB
centos                          latest              7322fbe74aa5        13 months ago        172.2 MB
dl.dockerpool.com:5000/mysql    latest              3c6d7e5c8c1b        21 months ago        235.6 MB
dl.dockerpool.com:5000/java     7u65-jdk            bd8bd16075a0        21 months ago        562.7 MB
dl.dockerpool.com:5000/centos   latest              87e5b6b3ccc1        22 months ago        224 MB

Run the new image and modify it

[root@hj-t centos6.8-image]# docker run -t -i centos6.8-base /bin/bash
bash-4.1#
bash-4.1#
bash-4.1#
bash-4.1# cat /etc/redhat-release
CentOS release 6.8 (Final)
bash-4.1# tee /etc/yum.repos.d/xxb.repo <<-'EOF'
> [xxb]
> name=XXB Customized packages
> proxy=_none_
> enabled=1
> baseurl=http://yum.xxb.cn
> gpgcheck=0
> EOF
[xxb]
name=XXB Customized packages
proxy=_none_
enabled=1
baseurl=http://yum.xxb.cn
gpgcheck=0
bash-4.1#
bash-4.1# cat /etc/yum.repos.d/xxb.repo
[xxb]
name=XXB Customized packages
proxy=_none_
enabled=1
baseurl=http://yum.xxb.cn
gpgcheck=0
bash-4.1# rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
Retrieving http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
warning: /var/tmp/rpm-tmp.Q2TNYz: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
Preparing...                ########################################### [100%]
   1:epel-release           ########################################### [100%]
bash-4.1# rpm -ivh http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-7.noarch.rpm
Retrieving http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-7.noarch.rpm
warning: /var/tmp/rpm-tmp.PDozFT: Header V4 RSA/SHA1 Signature, key ID 4bd6ec30: NOKEY
Preparing...                ########################################### [100%]
   1:puppetlabs-release     ########################################### [100%]
bash-4.1# ls -l /etc/yum.repos.d/
total 40
-rw-r--r-- 1 root root 1991 May 18 15:47 CentOS-Base.repo
-rw-r--r-- 1 root root  647 May 18 15:47 CentOS-Debuginfo.repo
-rw-r--r-- 1 root root  630 May 18 15:47 CentOS-Media.repo
-rw-r--r-- 1 root root 6259 May 18 15:47 CentOS-Vault.repo
-rw-r--r-- 1 root root  289 May 18 15:47 CentOS-fasttrack.repo
-rw-r--r-- 1 root root 1056 Nov  4  2012 epel-testing.repo
-rw-r--r-- 1 root root  957 Nov  4  2012 epel.repo
-rw-r--r-- 1 root root 1250 Apr 12  2013 puppetlabs.repo
-rw-r--r-- 1 root root   95 Aug  8 03:23 xxb.repo
bash-4.1# exit
exit
You have mail in /var/spool/mail/root

Commit the modified container

[root@hj-t centos6.8-image]# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS              PORTS               NAMES
ef6ed34185e4        centos6.8-base      "/bin/bash"         About a minute ago   Up About a minute                       furious_brown
[root@hj-t centos6.8-image]# docker commit -m="add epel/xxb/puppet yum repos" -a="HuangJie" ef6ed34185e4 xxb/centos6.8-image
f9e5479d080972711bbb049a21f268eaa48ff8797851824b726bb62d6d90e599
You have mail in /var/spool/mail/root
[root@hj-t centos6.8-image]# docker images
REPOSITORY                      TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
xxb/centos6.8-image             latest              f9e5479d0809        20 seconds ago      399.2 MB
centos6.8-base                  latest              fe95b256d279        22 minutes ago      399.2 MB
trusty                          latest              8b40146fae0a        5 days ago          0 B
<none>                          <none>              38b969ffff81        5 days ago          172.2 MB
ubuntu                          12.04               93932704ad15        2 weeks ago         139.3 MB
centos                          latest              7322fbe74aa5        13 months ago       172.2 MB
dl.dockerpool.com:5000/mysql    latest              3c6d7e5c8c1b        21 months ago       235.6 MB
dl.dockerpool.com:5000/java     7u65-jdk            bd8bd16075a0        21 months ago       562.7 MB
dl.dockerpool.com:5000/centos   latest              87e5b6b3ccc1        22 months ago       224 MB

Write the Dockerfile

# centos tomcat7-java7
FROM xxb/centos6.8-image
MAINTAINER Docker HuangJie <huangjie@sufe.edu.cn>
RUN echo -ne "[xxb]\nname=XXB Customized packages\nproxy=_none_\nenabled=1\nbaseurl=http://yum.xxb.cn\ngpgcheck=0\n" > /etc/yum.repos.d/xxb.repo
RUN rpm -ivh http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-7.noarch.rpm
RUN rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
RUN yum install java-1.7.0-sun -y
RUN yum install tomcat7-7.0.52 -y

Build a new image with docker build

[root@hj-t docker]# docker build -t xxb/centos6.8-tomcat7 .
Sending build context to Docker daemon 14.85 kB
Sending build context to Docker daemon
Step 0 : FROM xxb/centos6.8-image
 ---> 04a72f961cee
Step 1 : MAINTAINER Docker HuangJie <huangjie@sufe.edu.cn>
 ---> Running in 87b006753029
 ---> ed75666269b8
Removing intermediate container 87b006753029
Step 2 : RUN echo -ne "[xxb]\nname=XXB Customized packages\nproxy=_none_\nenabled=1\nbaseurl=http://yum.xxb.cn\ngpgcheck=0\n" > /etc/yum.repos.d/xxb.repo
 ---> Running in b04ea1c52941
 ---> a1d12004c223
Removing intermediate container b04ea1c52941
Step 3 : RUN rpm -ivh http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-7.noarch.rpm
 ---> Running in ef5725f352a9
warning: /var/tmp/rpm-tmp.umEdxA: Header V4 RSA/SHA1 Signature, key ID 4bd6ec30: NOKEY
Retrieving http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-7.noarch.rpm
Preparing...                ##################################################
puppetlabs-release          ##################################################
 ---> afdc3f858610
Removing intermediate container ef5725f352a9
Step 4 : RUN rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
 ---> Running in b20c53211e2d
warning: /var/tmp/rpm-tmp.DOY63e: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
Retrieving http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
Preparing...                ##################################################
epel-release                ##################################################
 ---> 8ec75e6205cd
Removing intermediate container b20c53211e2d
Step 5 : RUN yum install java-1.7.0-sun -y
 ---> Running in bf089eca1658
Loaded plugins: fastestmirror
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package java-1.7.0-sun.x86_64 1:1.7.0.25-xxb.el6 will be installed
--> Processing Dependency: unixODBC for package: 1:java-1.7.0-sun-1.7.0.25-xxb.el6.x86_64
--> Processing Dependency: jpackage-utils for package: 1:java-1.7.0-sun-1.7.0.25-xxb.el6.x86_64
--> Running transaction check
---> Package jpackage-utils.noarch 0:1.7.5-3.16.el6 will be installed
---> Package unixODBC.x86_64 0:2.2.14-14.el6 will be installed
--> Processing Dependency: libltdl.so.7()(64bit) for package: unixODBC-2.2.14-14.el6.x86_64
--> Running transaction check
---> Package libtool-ltdl.x86_64 0:2.2.6-15.5.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package              Arch         Version                     Repository  Size
================================================================================
Installing:
 java-1.7.0-sun       x86_64       1:1.7.0.25-xxb.el6          xxb        100 M
Installing for dependencies:
 jpackage-utils       noarch       1.7.5-3.16.el6              base        60 k
 libtool-ltdl         x86_64       2.2.6-15.5.el6              base        44 k
 unixODBC             x86_64       2.2.14-14.el6               base       378 k

Transaction Summary
================================================================================
Install       4 Package(s)

Total download size: 100 M
Installed size: 163 M
Downloading Packages:
--------------------------------------------------------------------------------
Total                                            18 MB/s | 100 MB     00:05
warning: rpmts_HdrFromFdno: Header V3 RSA/SHA1 Signature, key ID c105b9de: NOKEY
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6
Importing GPG key 0xC105B9DE:
 Userid : CentOS-6 Key (CentOS 6 Official Signing Key) <centos-6-key@centos.org>
 Package: centos-release-6-8.el6.centos.12.3.x86_64 (@febootstrap/$releasever)
 From   : /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Warning: RPMDB altered outside of yum.
  Installing : libtool-ltdl-2.2.6-15.5.el6.x86_64                           1/4
  Installing : unixODBC-2.2.14-14.el6.x86_64                                2/4
  Installing : jpackage-utils-1.7.5-3.16.el6.noarch                         3/4
  Installing : 1:java-1.7.0-sun-1.7.0.25-xxb.el6.x86_64                     4/4
  Verifying  : 1:java-1.7.0-sun-1.7.0.25-xxb.el6.x86_64                     1/4
  Verifying  : unixODBC-2.2.14-14.el6.x86_64                                2/4
  Verifying  : jpackage-utils-1.7.5-3.16.el6.noarch                         3/4
  Verifying  : libtool-ltdl-2.2.6-15.5.el6.x86_64                           4/4

Installed:
  java-1.7.0-sun.x86_64 1:1.7.0.25-xxb.el6

Dependency Installed:
  jpackage-utils.noarch 0:1.7.5-3.16.el6  libtool-ltdl.x86_64 0:2.2.6-15.5.el6
  unixODBC.x86_64 0:2.2.14-14.el6

Complete!
 ---> 5cee7430f398
Removing intermediate container bf089eca1658
Step 6 : RUN yum install tomcat7-7.0.52 -y
 ---> Running in a847208ea2c5
Loaded plugins: fastestmirror
Setting up Install Process
Determining fastest mirrors
 * base: mirrors.zju.edu.cn
 * epel: mirrors.ustc.edu.cn
 * extras: mirrors.zju.edu.cn
 * updates: mirrors.zju.edu.cn
Resolving Dependencies
--> Running transaction check
---> Package tomcat7.x86_64 1:7.0.52-xxb.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package          Arch            Version                    Repository    Size
================================================================================
Installing:
 tomcat7          x86_64          1:7.0.52-xxb.el6           xxb          6.4 M

Transaction Summary
================================================================================
Install       1 Package(s)

Total download size: 6.4 M
Installed size: 7.2 M
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : 1:tomcat7-7.0.52-xxb.el6.x86_64                              1/1
  Verifying  : 1:tomcat7-7.0.52-xxb.el6.x86_64                              1/1

Installed:
  tomcat7.x86_64 1:7.0.52-xxb.el6

Complete!
 ---> 0852c52a7856
Removing intermediate container a847208ea2c5
Successfully built 0852c52a7856
You have mail in /var/spool/mail/root
[root@hj-t docker]# docker images
REPOSITORY                      TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
xxb/centos6.8-tomcat7           latest              0852c52a7856        33 minutes ago      687.7 MB
xxb/centos6.8-image             latest              04a72f961cee        59 minutes ago      399.2 MB
<none>                          <none>              c55929c79d98        About an hour ago   399.2 MB
centos6.8-base                  latest              fe95b256d279        About an hour ago   399.2 MB
trusty                          latest              8b40146fae0a        5 days ago          0 B
<none>                          <none>              38b969ffff81        5 days ago          172.2 MB
ubuntu                          12.04               93932704ad15        2 weeks ago         139.3 MB
centos                          latest              7322fbe74aa5        13 months ago       172.2 MB
dl.dockerpool.com:5000/mysql    latest              3c6d7e5c8c1b        21 months ago       235.6 MB
dl.dockerpool.com:5000/java     7u65-jdk            bd8bd16075a0        21 months ago       562.7 MB
dl.dockerpool.com:5000/centos   latest              87e5b6b3ccc1        22 months ago       224 MB

DockerUI

docker run -d -p 9000:9000 --privileged -v /var/run/docker.sock:/var/run/docker.sock uifd/ui-for-docker
[root@hj-t ~]# docker ps
CONTAINER ID        IMAGE                COMMAND             CREATED             STATUS              PORTS                    NAMES
ab8d0c4da4c1        uifd/ui-for-docker   "/dockerui"         27 seconds ago      Up 26 seconds       0.0.0.0:9000->9000/tcp   furious_kowalevski

Manage images and containers with DockerUI

DockerUI dashboard

DockerUI images

DockerUI containers

DockerUI info

DockerUI container details

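Identifying Linked Accounts

Approach

  • Simulate one year of stock index futures order and cancellation data
  • Determine whether any two given accounts are linked
  • Determine whether every possible pair of accounts is linked

Simulated data

  • 200,000 users, 30,000 of them active
  • 40 million orders and cancellations
  • One year of data

Linked-account computation

  • Input: any two account IDs
  • Output: whether the two accounts are linked
  • Data: 40 million rows covering one year
  • Time taken: 800 to 900 ms

Computing all pairs

  • Number of pairs: 450 million
  • Total time: 360 million seconds, roughly 4,000 days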
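
These figures follow from simple arithmetic. The short check below assumes that pairs are drawn from the 30,000 active accounts and that each pair takes 0.8 s, the lower end of the measured 800 to 900 ms; the variable names are illustrative.

```python
from math import comb

active_accounts = 30_000          # active users in the simulated data set
seconds_per_pair = 0.8            # lower end of the measured 800-900 ms

pairs = comb(active_accounts, 2)  # unordered pairs of distinct accounts
total_seconds = pairs * seconds_per_pair
total_days = total_seconds / 86_400

print(pairs)                 # 449985000, about 450 million
print(round(total_seconds))  # 359988000, about 360 million seconds
print(round(total_days))     # 4167, roughly 4,000 days
```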

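Parallel computation

  • Split the pairs evenly into 4,000 batches
  • Submit all batches to Greenplum at once so they run in parallel
  • Estimated to take 1 to 2 days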
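
The splitting step can be sketched as follows. This is only an illustration of dividing account pairs into nearly equal batches, not the code used for the actual run; `pair_batches` is a hypothetical helper. With 30,000 accounts and 4,000 batches, each batch would hold about 112,500 pairs, which at 0.8 to 0.9 s per pair is roughly 25 to 28 hours of work per batch if every batch really runs concurrently.

```python
from itertools import combinations, islice


def pair_batches(account_ids, n_batches):
    """Yield lists of account pairs, split into n_batches nearly equal batches."""
    ids = sorted(account_ids)
    total = len(ids) * (len(ids) - 1) // 2      # number of unordered pairs
    base, extra = divmod(total, n_batches)      # first `extra` batches get one more
    pairs = combinations(ids, 2)                # lazy iterator over all pairs
    for b in range(n_batches):
        size = base + (1 if b < extra else 0)
        yield list(islice(pairs, size))


# Small demonstration: 5 accounts give 10 pairs, split into 3 batches.
for batch in pair_batches([101, 102, 103, 104, 105], 3):
    print(len(batch), batch)
```

Each batch could then be turned into one query or one job submitted to the database.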

Computing the link between any two account IDs

-- Total number of orders for each of the two accounts
WITH client_t1_orders AS (
    SELECT clientid, count(*) AS total_orders
    FROM if
    WHERE clientid = 137849
    GROUP BY clientid
), client_t2_orders AS (
    SELECT clientid, count(*) AS total_orders
    FROM if
    WHERE clientid = 174243
    GROUP BY clientid
), order_compare AS (
    -- Pair up the two accounts' orders on the same day, contract and direction;
    -- a pair counts as a matching order when the two were placed less than 60 seconds apart
    SELECT t1.tradingday,
           t1.clientid AS clientid_t1,
           t2.clientid AS clientid_t2,
           EXTRACT(EPOCH FROM (t1.inserttime - t2.inserttime)) AS diff_seconds,
           CASE WHEN EXTRACT(EPOCH FROM (t1.inserttime - t2.inserttime)) < 60
                 AND EXTRACT(EPOCH FROM (t1.inserttime - t2.inserttime)) > -60
                THEN 1 ELSE 0
           END AS same_orders
    FROM if t1
    INNER JOIN if t2
            ON t1.tradingday = t2.tradingday
           AND t1.instrumentid = t2.instrumentid
           AND t1.direction = t2.direction
           AND t1.clientid = 137849
           AND t2.clientid = 174243
)
-- rate = matching order pairs / the smaller of the two accounts' order counts
SELECT t1.total_orders, round(t2.same_orders * 1.0 / t1.total_orders, 4) AS rate
FROM (
    SELECT min(total_orders) AS total_orders
    FROM (
        SELECT total_orders FROM client_t1_orders
        UNION ALL
        SELECT total_orders FROM client_t2_orders
    ) tt
) t1
CROSS JOIN (
    SELECT clientid_t1, clientid_t2, sum(same_orders) AS same_orders
    FROM order_compare
    GROUP BY clientid_t1, clientid_t2
) t2;
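
For readers who find the query dense, the same logic can be illustrated on a handful of in-memory rows. This is a sketch, not the implementation used against Greenplum, and it would be far too slow for 40 million rows. It assumes each order is a tuple of (tradingday, instrumentid, direction, inserttime) with the time as 'HH:MM:SS'; `link_rate` and its 60-second default window are named here only for illustration.

```python
from datetime import datetime


def link_rate(orders_a, orders_b, window_seconds=60):
    """Return (smaller order count, matching pairs / smaller order count).

    Two orders match when they share trading day, contract and direction
    and were placed less than window_seconds apart.
    """
    def seconds(t):
        parsed = datetime.strptime(t, "%H:%M:%S")
        return parsed.hour * 3600 + parsed.minute * 60 + parsed.second

    same = 0
    for day_a, inst_a, dir_a, time_a in orders_a:
        for day_b, inst_b, dir_b, time_b in orders_b:
            if (day_a, inst_a, dir_a) == (day_b, inst_b, dir_b):
                if abs(seconds(time_a) - seconds(time_b)) < window_seconds:
                    same += 1
    smaller = min(len(orders_a), len(orders_b))
    if smaller == 0:
        return 0, 0.0  # an account with no orders cannot be linked
    return smaller, round(same / smaller, 4)


account_a = [("20150105", "IF1501", "0", "09:30:00"),
             ("20150105", "IF1501", "0", "10:00:00")]
account_b = [("20150105", "IF1501", "0", "09:30:30"),
             ("20150105", "IF1501", "1", "10:00:10"),
             ("20150106", "IF1501", "0", "09:30:00")]
print(link_rate(account_a, account_b))  # (2, 0.5)
```

Because every order of one account is compared with every matching order of the other, the rate counts order pairs and can exceed 1 when both accounts place many orders in the same minute.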

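Simulating Stock Index Futures Order and Cancellation Data

Background

  • The project needs this data
  • Order and cancellation data for stock index futures is strictly confidential, and the China Financial Futures Exchange (CFFEX) will not provide it
  • So the only option is to simulate it

Logic and approach

  • 200,000 clients, 30,000 of them trading each day
  • 200,000 order and cancellation records per day
  • About 50 million rows per year
  • Fields to simulate:
    • Trading day, tradingday : date
    • Client ID, clientid : int
    • Contract ID, instrumentid : text
    • Trade direction, direction : text
    • Order date, insertdate : date
    • Order time, inserttime : time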
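
One way to pin down the row layout before generating data is to describe it as a typed record. The sketch below is only an illustration of the six fields listed above; the class name `OrderRecord` and the method `to_csv_row` are hypothetical and do not appear in the generator script.

```python
from datetime import date, time
from typing import NamedTuple


class OrderRecord(NamedTuple):
    """One simulated order or cancellation, in the column order of the table."""
    tradingday: date
    clientid: int
    instrumentid: str
    direction: str      # '0' or '1'
    insertdate: date
    inserttime: time

    def to_csv_row(self):
        """Format the record as a list of strings suitable for a CSV writer."""
        return [
            self.tradingday.strftime("%Y%m%d"),
            str(self.clientid),
            self.instrumentid,
            self.direction,
            self.insertdate.strftime("%Y%m%d"),
            self.inserttime.strftime("%H:%M:%S"),
        ]


record = OrderRecord(date(2015, 1, 5), 137849, "IF1501", "0",
                     date(2015, 1, 5), time(9, 30, 0))
print(record.to_csv_row())
```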

Core implementation

import random
from datetime import date, timedelta

import pandas as pd


# Trading days: every calendar day of 2015 as 'YYYYMMDD'; characters [2:6] give 'YYMM'
tradingday = [(date(2015, 1, 1) + timedelta(days=n)).strftime('%Y%m%d') for n in range(365)]
# Trade directions
direction = ['0', '1']
# Hours from which order times are drawn
hours = [9, 10, 11, 13, 14]

# Randomly pick 30,000 distinct client IDs
number = 30000
start = 100000
end = 200000
clientid = random.sample(range(start, end), number)

# List of generated order and cancellation rows
datalist = []

# Pick the contract of the current month or the next month, e.g. 'IF1501' or 'IF1502'
def getinstrumentid(tradingday):
    yymm = int(tradingday[2:6])
    # Roll December over to January of the next year instead of producing month 13
    nextmonth = yymm + 1 if yymm % 100 < 12 else yymm + 89
    return random.choice(['IF%04d' % yymm, 'IF%04d' % nextmonth])

# Pick a trading day
def gettradingday():
    return random.choice(tradingday)

# Pick a client ID
def getclientid():
    return random.choice(clientid)

# Pick one of the trade directions
def getdirection():
    return random.choice(direction)

# Build a random order time as zero-padded 'HH:MM:SS'
def gettime():
    hour = random.choice(hours)
    minute = random.randint(0, 59)
    second = random.randint(0, 59)
    return '%02d:%02d:%02d' % (hour, minute, second)


if __name__ == '__main__':

    for i in range(0, 20):
        for d in range(0, 2000000):
            day = gettradingday()
            one = (day, getclientid(), getinstrumentid(day), getdirection(), day, gettime())
            datalist.append(one)
        # datalist is not cleared between passes, so file i holds (i + 1) * 2,000,000 rows
        df = pd.DataFrame(datalist)
        df.to_csv('/opt/IF' + str(i) + '.csv', index=False, header=False)
        print('%s is saved......' % i)
    print('IF data is all successful..........')
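
The script above keeps every generated row in memory until it is written. As a hedged alternative, rows can be streamed to disk one at a time with the standard csv module, which keeps memory use flat regardless of file size. In this sketch `make_row` is a simplified stand-in for the script's own generator functions (it only produces January 2015 rows for contract IF1501), and `write_rows` is a hypothetical helper.

```python
import csv
import random


def make_row(rng):
    """Build one illustrative row; a stand-in for the script's own generators."""
    day = "201501%02d" % rng.randint(1, 31)
    when = "%02d:%02d:%02d" % (rng.choice([9, 10, 11, 13, 14]),
                               rng.randint(0, 59), rng.randint(0, 59))
    return (day, rng.randint(100000, 199999), "IF1501", rng.choice("01"), day, when)


def write_rows(path, n_rows, seed=None):
    """Stream n_rows rows to a CSV file one at a time and return the count written."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for _ in range(n_rows):
            writer.writerow(make_row(rng))
    return n_rows


# Usage: write_rows("/opt/IF0.csv", 2000000)
```

Passing a seed makes a run reproducible, which helps when comparing load results across attempts.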

Create the table in Greenplum

create table if(
tradingday date,
clientid int,
instrumentid char(6),
direction char(1),
insertdate date,
inserttime time
)
WITH (appendonly=true,orientation=column,compresstype=QUICKLZ,COMPRESSLEVEL=1)  
distributed by (clientid)

Load the data into the database with gpload

VERSION: 1.0.0.1
DATABASE: fitl
USER: gpadmin
HOST: mdw
PORT: 5432
GPLOAD:
   INPUT:
    - SOURCE:
         LOCAL_HOSTNAME:
           - mdw
         PORT: 8092
         FILE:
           - /fitl/IF/*.csv
    - COLUMNS:
          -  tradingday : date
          -  clientid : int
          -  instrumentid : text
          -  direction : text
          -  insertdate : date
          -  inserttime : time
    - FORMAT: csv
    - DELIMITER: ','
    - QUOTE: '"'
    - HEADER: false
    - ERROR_LIMIT: 500
    - ERROR_TABLE: public.if_err
   OUTPUT:
    - TABLE: public.if
    - MODE: INSERT

Generating 462 million order and cancellation rows

2016-05-23 18:10:38|INFO|gpload session started 2016-05-23 18:10:38
2016-05-23 18:10:39|INFO|started gpfdist -p 8092 -P 8093 -f "/fitl/IF/IF20.csv" -t 30
2016-05-23 18:10:52|INFO|running time: 14.26 seconds
2016-05-23 18:10:53|INFO|rows Inserted          = 42000000
2016-05-23 18:10:53|INFO|rows Updated           = 0
2016-05-23 18:10:53|INFO|data formatting errors = 0
2016-05-23 18:10:53|INFO|gpload succeeded
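
When many files are loaded, it can help to pull the row count and running time out of each log automatically. The sketch below assumes the log lines look exactly like the ones shown above; `summarize_gpload_log` is a hypothetical helper, not part of gpload.

```python
import re


def summarize_gpload_log(log_text):
    """Return (rows_inserted, seconds, rows_per_second), or None if a field is missing."""
    rows = re.search(r"rows Inserted\s*=\s*(\d+)", log_text)
    secs = re.search(r"running time:\s*([\d.]+)\s*seconds", log_text)
    if not rows or not secs:
        return None
    n, t = int(rows.group(1)), float(secs.group(1))
    return n, t, n / t


sample = """\
2016-05-23 18:10:52|INFO|running time: 14.26 seconds
2016-05-23 18:10:53|INFO|rows Inserted          = 42000000
"""
rows, seconds, rate = summarize_gpload_log(sample)
print(rows, seconds, round(rate))
```

For the run logged above this works out to a little under 3 million rows per second.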

IF data