smallfish weblog

web.py Database Operations Guide

On 2010/03/19, in Python, by admin

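web.py is a small, flexible framework; the latest stable release is 0.33. This post skips the web development side and covers only its database operations.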

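Many Python developers like to write their own database wrapper class at first, and I was no exception. After reading the web.py source, though, I found its database layer compact and practical. If you are feeling lazy, give it a try.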

Enough talk, let's install it. There are two ways:

1. With easy_install. If you don't have that tool, see: https://chenxiaoyu.org/blog/archives/23

easy_install web.py

2. Download the source and install it. URL: http://webpy.org/static/web.py-0.33.tar.gz . After unpacking, run:

python setup.py install

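That is all for installing web.py itself. To use its db features you also need the driver module for your database, such as MySQLdb or psycopg2. If you want to try the connection pool (database pool) feature, install DBUtils as well. All of these modules can be installed with easy_install.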

Now let's use it!

1. Import the module and define the database connection db.

import web
db = web.database(dbn='postgres', db='mydata', user='dbuser', pw='')

2. SELECT queries

# query a whole table
entries = db.select('mytable')

# where condition
myvar = dict(name="Bob")
results = db.select('mytable', myvar, where="name = $name")
results = db.select('mytable', where="id>100")

# select specific columns
results = db.select('mytable', what="id,name")

# order by
results = db.select('mytable', order="post_date DESC")

# group
results = db.select('mytable', group="color")

# limit
results = db.select('mytable', limit=10)

# offset
results = db.select('mytable', offset=10)

3. Update

db.update('mytable', where="id = 10", value1 = "foo")

4. Delete

db.delete('mytable', where="id=10")

5. Complex queries

# count
results = db.query("SELECT COUNT(*) AS total_users FROM users")
print results[0].total_users

# join
results = db.query("SELECT * FROM entries JOIN users ON entries.author_id = users.id")

# to guard against SQL injection, pass values through vars
results = db.query("SELECT * FROM users WHERE id=$id", vars={'id':10})
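
To see why the $id placeholder with vars matters, here is a minimal sketch that uses Python's standard sqlite3 module instead of web.py, so it runs without any extra install (written for Python 3). The table, the data and the helper names find_unsafe and find_safe are made up for illustration. It contrasts pasting a value into the SQL text with binding it as a parameter; the ? placeholder here plays roughly the role that $id plus vars plays in the db.query call.

```python
import sqlite3

def make_db():
    """Build a throwaway in-memory table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                     [(1, "alice"), (2, "bob"), (3, "carol")])
    conn.commit()
    return conn

def find_unsafe(conn, user_id):
    # The value is pasted into the SQL text, so it can change the query itself.
    sql = "SELECT name FROM users WHERE id = %s" % user_id
    return [row[0] for row in conn.execute(sql)]

def find_safe(conn, user_id):
    # The value travels separately from the SQL text and is treated as data only.
    sql = "SELECT name FROM users WHERE id = ?"
    return [row[0] for row in conn.execute(sql, (user_id,))]

conn = make_db()
malicious = "1 OR 1=1"
leaked = find_unsafe(conn, malicious)   # the injected OR clause matches every row
blocked = find_safe(conn, malicious)    # the whole string is compared as one value
```

With the bound parameter the malicious string is just an odd-looking id that matches nothing, while the concatenated version returns every user.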

6. Multiple databases (web.py 0.3 or later)

db1 = web.database(dbn='mysql', db='dbname1', user='foo')
db2 = web.database(dbn='mysql', db='dbname2', user='foo')

print db1.select('foo', where='id=1')
print db2.select('bar', where='id=5')

7. Transactions

t = db.transaction()
try:
    db.insert('person', name='foo')
    db.insert('person', name='bar')
except:
    t.rollback()
    raise
else:
    t.commit()

# with Python 2.5+ you can use the with statement
from __future__ import with_statement
with db.transaction():
    db.insert('person', name='foo')
    db.insert('person', name='bar')
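
The commit-or-rollback behaviour of the with form can be tried out without web.py. The sketch below uses the standard sqlite3 module (Python 3), whose connection object also works as a context manager; the person table and the helper name insert_people are assumptions for the demo, and this is an analogy, not web.py's implementation.

```python
import sqlite3

def insert_people(conn, names):
    """Insert all names in one transaction; roll back everything on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for name in names:
                conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
    except sqlite3.IntegrityError:
        return False
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT NOT NULL UNIQUE)")

ok = insert_people(conn, ["foo", "bar"])
failed = insert_people(conn, ["baz", "foo"])   # "foo" violates UNIQUE
names = [row[0] for row in conn.execute("SELECT name FROM person ORDER BY name")]
```

The second call fails on its second row, and because both inserts belong to one transaction, "baz" is rolled back together with the duplicate.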

Python (Stackless) + MongoDB: Analyzing a 2 GB Apache Log

On 2010/03/04, in Apache, MongoDB, Python, by admin

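Why Stackless? http://www.stackless.com

Stackless can be thought of simply as an enhanced version of Python, and its most eye-catching feature is the "microthread" (tasklet). Microthreads are lightweight threads: switching between them costs less than switching regular threads, sharing data between them is easier, and the code ends up simpler and more readable than the multithreaded equivalent. The project is used heavily by EVE Online and is strong on concurrency and performance. Installation is the same as for regular Python, and you can consider replacing the system Python with it. :)

Why MongoDB? http://www.mongodb.org

The official site lists many popular users of MongoDB, such as SourceForge and GitHub. What are its advantages over an RDBMS? The most obvious ones are speed and performance: it can be used like a key-value store, yet it also supports database-style queries (distinct, group, random access, indexes and so on). The other selling point is simplicity: the application, the documentation and the third-party APIs can all be picked up after a quick skim. One drawback is that the stored data files are large, roughly 2 to 4 times the size of the raw data. The Apache log tested here is 2 GB and the generated data files came to 6 GB. Ouch... I hope future versions slim down, although this is clearly the price of trading space for speed.

Besides the two pieces of software above, you also need the pymongo module: http://api.mongodb.org/python/

It can be installed from source or with easy_install, so I won't repeat the details here.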

1. Work out which fields to extract from the Apache log and store, such as IP, time, GET/POST, and the response status code.

import re

fmt_str  = r'(?P<ip>[.\d]+) - - \[(?P<time>.*?)\] "(?P<method>.*?) (?P<uri>.*?) HTTP/1\.\d" (?P<status>\d+) (?P<length>.*?) "(?P<referer>.*?)" "(?P<agent>.*?)"'
fmt_name = re.findall(r'\?P<(.*?)>', fmt_str)
fmt_re   = re.compile(fmt_str)

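This defines a regular expression that extracts the fields of each log line. fmt_name collects the group names found between the angle brackets.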
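
As a quick check of what the pattern captures, here is a small self-contained sketch (Python 3). The sample log line is invented, and parse_line is a hypothetical helper; it uses groupdict() from the standard re module, which returns the same name-to-value mapping that zipping the group names with m.groups() would give.

```python
import re

# Same idea as the pattern above, split over several lines for readability.
LOG_RE = re.compile(
    r'(?P<ip>[.\d]+) - - \[(?P<time>.*?)\] '
    r'"(?P<method>.*?) (?P<uri>.*?) HTTP/1\.\d" '
    r'(?P<status>\d+) (?P<length>.*?) '
    r'"(?P<referer>.*?)" "(?P<agent>.*?)"'
)

def parse_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    m = LOG_RE.search(line)
    return m.groupdict() if m else None

# Invented sample line in combined log format.
sample = ('127.0.0.1 - - [04/Mar/2010:10:00:00 +0800] '
          '"GET /index.html HTTP/1.1" 200 1234 '
          '"http://example.com/" "Mozilla/5.0"')
record = parse_line(sample)
```

Lines that do not match (for example malformed requests) simply yield None, which mirrors the `if m:` guard used when saving a line.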

2. Define the MongoDB variables, including the name of the collection to store into. Connection uses the default host and port.

from pymongo import Connection

conn     = Connection()   # default host and port
apache   = conn.apache    # database
logs     = apache.logs    # collection

3. Save a log line

def make_line(line):
    m = fmt_re.search(line)
    if m:
        logs.insert(dict(zip(fmt_name, m.groups())))

4. Read the Apache log file

def make_log(log_path):
    with open(log_path) as fp:
        for line in fp:
            make_line(line.strip())

5. Run it.

if __name__ == '__main__':
    make_log('d:/apachelog.txt')

That is roughly the whole script. The Stackless part is not shown here; the following snippet gives the idea:

import stackless
def print_x(x):
    print x
stackless.tasklet(print_x)('one')
stackless.tasklet(print_x)('two')
stackless.run()

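Calling tasklet only puts the operation into a queue; run is what actually executes it. Here it mainly replaces the earlier threading-based approach of analyzing several logs in parallel.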
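
If you don't have Stackless installed, the queue-then-run idea can be imitated with plain generators. To be clear, this is only a rough standard-library analogy (Python 3), not Stackless itself: worker and run are made-up names, and the explicit yield stands in for a cooperative switch such as stackless.schedule().

```python
from collections import deque

def worker(name, steps, out):
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        out.append("%s:%d" % (name, i))
        yield  # hand control back to the scheduler

def run(tasks):
    """Round-robin over the queued tasks until all of them finish."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
            queue.append(task)   # not finished yet, queue it again
        except StopIteration:
            pass                 # finished, drop it

log = []
# Creating the generators only queues the work; nothing runs until run().
tasks = [worker("one", 2, log), worker("two", 2, log)]
queued_only = list(log)
run(tasks)
```

After run() the log shows the two workers taking turns, which is the kind of interleaving that lets several log files be processed side by side in one thread.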

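Addendum:

The Apache log is 2 GB, about 6.71 million lines. The resulting database is 6 GB.

Hardware: Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz desktop

OS: RHEL 5.2, ext3 file system

Other: Stackless 2.6.4, MongoDB 1.2

Up to about 3 million records everything was fine. CPU, memory and insert speed all looked good, at roughly 8,000 to 9,000 records per second, in line with earlier tests on my laptop. After that, memory use climbed and inserts slowed down. At around 5 million records CPU reached 40% and memory use was 2.1 GB. Speed seemed to recover once the second 2 GB data file was being created. Overall I was not too happy with the result.

Later I reran the test on my laptop with 10 million records, and it was clearly much faster than the 6.71 million run above. My first guess is that two things may affect performance:

1. The file system. The laptop runs Ubuntu 9.10 with ext4. From what I found, ext3 and ext4 differ in large-file read/write performance.

2. The regex matching. Every line is matched and extracted individually, so there should be room to optimize for large files.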

PostgreSQL UUID Functions

On 2010/02/26, in PostgreSQL, by admin

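Test environment: PostgreSQL 8.4

Out of the box PostgreSQL has no UUID generation function, unlike MySQL with its uuid() function. One is available in contrib, though: just load uuid-ossp.sql. (PS: mind the permissions; PostgreSQL must be able to read that file.)

Loading it is simple. The test below is on Windows; other platforms work the same way: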

D:\>psql -U postgres -h localhost -f D:\PostgreSQL\8.4\share\contrib\uuid-ossp.sql
Password for user postgres:
SET
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION

Enter psql and run:

postgres=# select uuid_generate_v1();
           uuid_generate_v1
--------------------------------------
 86811bd4-22a5-11df-b00e-ebd863f5f8a7
(1 row)

postgres=# select uuid_generate_v4();
           uuid_generate_v4
--------------------------------------
 5edbfcbb-1df8-48fa-853f-7917e4e346db
(1 row)

The main ones are uuid_generate_v1 and uuid_generate_v4, and there are also uuid_generate_v3 and uuid_generate_v5. For other usage see the uuid-ossp section of the official PostgreSQL documentation.
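
The v1/v3/v4/v5 suffixes are the standard UUID versions, so the differences can also be explored on the client side. Here is a small sketch with Python's standard uuid module (Python 3); make_ids and the name "example.org" are made up for the demo, and this generates UUIDs in Python rather than calling the PostgreSQL functions.

```python
import uuid

def make_ids(name="example.org"):
    """Return one UUID of each version mentioned above."""
    return {
        "v1": uuid.uuid1(),                          # based on time and node id
        "v3": uuid.uuid3(uuid.NAMESPACE_DNS, name),  # MD5 hash of namespace + name
        "v4": uuid.uuid4(),                          # random
        "v5": uuid.uuid5(uuid.NAMESPACE_DNS, name),  # SHA-1 hash of namespace + name
    }

ids = make_ids()
again = make_ids()
```

Versions 3 and 5 are deterministic for the same namespace and name, while versions 1 and 4 give a new value on every call.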

PostgreSQL RPM Installation Notes

On 2010/02/06, in PostgreSQL, by admin

Test environment: RHEL 5.3
PostgreSQL version: 8.4.2

1. First check whether PostgreSQL packages are already installed (my machine had pg-libs 8.1, which I ignored)

shell> rpm -qa | grep postgres

2. Download the latest 8.4.2 RPM packages. This FTP mirror is pretty fast. :)

shell> wget http://ftp.easynet.be/postgresql/binary/v8.4.2/linux/rpms/redhat/rhel-5-x86_64/postgresql-server-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> wget http://ftp.easynet.be/postgresql/binary/v8.4.2/linux/rpms/redhat/rhel-5-x86_64/postgresql-contrib-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> wget http://ftp.easynet.be/postgresql/binary/v8.4.2/linux/rpms/redhat/rhel-5-x86_64/postgresql-libs-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> wget http://ftp.easynet.be/postgresql/binary/v8.4.2/linux/rpms/redhat/rhel-5-x86_64/postgresql-devel-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> wget http://ftp.easynet.be/postgresql/binary/v8.4.2/linux/rpms/redhat/rhel-5-x86_64/postgresql-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> wget http://ftp.easynet.be/postgresql/binary/v8.4.2/linux/rpms/redhat/rhel-5-x86_64/postgresql-plpython-8.4.2-1PGDG.rhel5.x86_64.rpm

3. Install PostgreSQL (mind the order). The pg-libs package has to be updated first.
The later packages are optional; they mainly provide extensions.

shell> rpm -ivh postgresql-libs-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> rpm -ivh postgresql-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> rpm -ivh postgresql-server-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> rpm -ivh postgresql-contrib-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> rpm -ivh postgresql-devel-8.4.2-1PGDG.rhel5.x86_64.rpm
shell> rpm -ivh postgresql-plpython-8.4.2-1PGDG.rhel5.x86_64.rpm

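4. After the RPMs are installed, the PostgreSQL database cluster has to be initialized. The service prompts for this on first start.
For a source install this corresponds to initdb -D with a data directory. The RPM default directory is /var/lib/pgsql/data.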

shell> service postgresql initdb

5. Start PostgreSQL with service

shell> service postgresql start

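At this point the installation is basically complete. The rest covers changing the database user's password and the login settings.

6. Switch to the postgres user and change the database password. (Note that the system user's password and the database user's password are two separate things, even though both users are named postgres.)
The new password takes effect without a restart; we restart once, after the authentication config has been changed below.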

shell> su - postgres
shell> psql
postgres=# ALTER USER postgres WITH PASSWORD '123456';
postgres=# \q

7. Edit the authentication file /var/lib/pgsql/data/pg_hba.conf so that logins require a password, using the md5 method.

shell> vi /var/lib/pgsql/data/pg_hba.conf
Change ident to md5 (on the local and host lines)
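
The edit in step 7 can also be scripted. Below is a minimal sketch in Python 3 that works on the file's text in memory; use_md5 is a hypothetical helper and the sample content is invented to resemble a default pg_hba.conf, so check it against your real file (and keep a backup) before writing anything back.

```python
def use_md5(hba_text, old="ident", new="md5"):
    """Switch the auth method on local/host lines, leaving other lines untouched."""
    result = []
    for line in hba_text.splitlines():
        fields = line.split()
        is_rule = bool(fields) and fields[0] in ("local", "host")
        if is_rule and fields[-1] == old:
            # Replace only the trailing method field and keep the column spacing.
            head = line.rstrip()[: -len(old)]
            line = head + new
        result.append(line)
    return "\n".join(result)

# Invented sample resembling a default pg_hba.conf.
sample = """# TYPE  DATABASE  USER  CIDR-ADDRESS   METHOD
local   all       all                  ident
host    all       all   127.0.0.1/32   ident
host    all       all   ::1/128        ident"""
updated = use_md5(sample)
```

Only lines that start with local or host and end with ident are changed; comment lines pass through as they are.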

8. Restart PostgreSQL with service

shell> service postgresql restart

9. Log in again to test; it should now prompt for a password

shell> psql -U postgres

MySQL & PostgreSQL: Comparing Common Commands

On 2010/02/05, in MySQL, PostgreSQL, by admin

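A comparison of some common operations in the two databases, each using its bundled client program.

MySQL command line:

mysql -u username -h host -P port dbname -p

PostgreSQL command line:

psql -U username -h host -p port dbname

Operations compared: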

mysql                               psql

SHOW DATABASES;                     \l
USE db-name;                        \c db-name
SHOW TABLES;                        \d
SELECT user, host FROM mysql.user;  \du
SHOW COLUMNS FROM table-name;       \d table-name
SHOW PROCESSLIST;                   SELECT * FROM pg_stat_activity;
SELECT now()\G                      \x toggles expanded output, similar to \G
SOURCE /path.sql                    \i /path.sql
LOAD DATA INFILE ...                \copy ...
\h                                  \?

PostgreSQL Table Partitioning

On 2009/12/22, in PostgreSQL, by admin

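Test version: pg 8.3 (Ubuntu)

In pg, table partitioning is implemented through table inheritance. Typically you create an empty master table and have each partition inherit from it.

The steps to create partitions are:

1. Create the master table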

CREATE TABLE users ( uid int not null primary key, name varchar(20));

2. Create the partition tables (they must inherit from the master table above)

CREATE TABLE users_0 ( check (uid >= 0 and uid< 100) ) INHERITS (users);
CREATE TABLE users_1 ( check (uid >= 100)) INHERITS (users);

3. Create indexes on the partition tables; this step is actually optional

CREATE INDEX users_0_uidindex on users_0(uid);
CREATE INDEX users_1_uidindex on users_1(uid);

4. Create the RULEs

CREATE RULE users_insert_0 AS
    ON INSERT TO users WHERE
        (uid >= 0 and uid < 100)
    DO INSTEAD
        INSERT INTO users_0 VALUES (NEW.uid,NEW.name);

CREATE RULE users_insert_1 AS
    ON INSERT TO users WHERE
        (uid >= 100)
    DO INSTEAD
        INSERT INTO users_1 VALUES (NEW.uid,NEW.name);
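
To make the routing performed by the two rules concrete, here is a tiny Python 3 model of it. The bounds are copied from the CHECK constraints and rule conditions above; the PARTITIONS list and the route function are illustrative names only, and nothing here talks to PostgreSQL.

```python
# Partition bounds as [low, high); None means no upper bound.
PARTITIONS = [
    ("users_0", 0, 100),
    ("users_1", 100, None),
]

def route(uid):
    """Return the table a row with this uid would end up in."""
    for table, low, high in PARTITIONS:
        if uid >= low and (high is None or uid < high):
            return table
    # No rule condition matched: the insert goes to the master table itself.
    return "users"
```

For example, route(20) gives "users_0" and route(100) gives "users_1". A negative uid matches neither rule, so such a row would land in the master table.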

Now we can test inserting data:

postgres=# INSERT INTO users VALUES (100,'smallfish');
INSERT 0 0
postgres=# INSERT INTO users VALUES (20,'aaaaa');
INSERT 0 0
postgres=# select * from users;
 uid |   name
-----+-----------
  20 | aaaaa
 100 | smallfish
(2 rows)

postgres=# select * from users_0;
 uid | name
-----+-------
  20 | aaaaa
(1 row)

postgres=# select * from users_1;
 uid |   name
-----+-----------
 100 | smallfish
(1 row)

That could be the end of partitioning, but one more thing needs adjusting. First look at a count query.

postgres=# EXPLAIN SELECT count(*) FROM users where uid<100;
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Aggregate  (cost=62.75..62.76 rows=1 width=0)
   ->  Append  (cost=6.52..60.55 rows=879 width=0)
         ->  Bitmap Heap Scan on users  (cost=6.52..20.18 rows=293 width=0)
               Recheck Cond: (uid < 100)
               ->  Bitmap Index Scan on users_pkey  (cost=0.00..6.45 rows=293 width=0)
                     Index Cond: (uid < 100)
         ->  Bitmap Heap Scan on users_0 users  (cost=6.52..20.18 rows=293 width=0)
               Recheck Cond: (uid < 100)
               ->  Bitmap Index Scan on users_0_uidindex  (cost=0.00..6.45 rows=293 width=0)
                     Index Cond: (uid < 100)
         ->  Bitmap Heap Scan on users_1 users  (cost=6.52..20.18 rows=293 width=0)
               Recheck Cond: (uid < 100)
               ->  Bitmap Index Scan on users_1_uidindex  (cost=0.00..6.45 rows=293 width=0)
                     Index Cond: (uid < 100)
(14 rows)

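With uid less than 100, the query should in theory touch only users_0, yet EXPLAIN shows that it actually scans every partition. Turning on constraint_exclusion lets the planner use the CHECK constraints to skip partitions that cannot contain matching rows: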

postgres=# SET constraint_exclusion = on;
SET

postgres=# EXPLAIN SELECT count(*) FROM users where uid<100;
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Aggregate  (cost=41.83..41.84 rows=1 width=0)
   ->  Append  (cost=6.52..40.37 rows=586 width=0)
         ->  Bitmap Heap Scan on users  (cost=6.52..20.18 rows=293 width=0)
               Recheck Cond: (uid < 100)
               ->  Bitmap Index Scan on users_pkey  (cost=0.00..6.45 rows=293 width=0)
                     Index Cond: (uid < 100)
         ->  Bitmap Heap Scan on users_0 users  (cost=6.52..20.18 rows=293 width=0)
               Recheck Cond: (uid < 100)
               ->  Bitmap Index Scan on users_0_uidindex  (cost=0.00..6.45 rows=293 width=0)
                     Index Cond: (uid < 100)
(10 rows)

users_1 no longer appears in the plan, and with that the whole process is done!
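
The reasoning the planner applies with constraint_exclusion on can be sketched in a few lines of Python 3. This is only a toy model under the assumptions of this post (integer uid, a query of the form WHERE uid < N, the two CHECK constraints defined earlier); tables_to_scan is a made-up name and the real planner handles far more general conditions.

```python
PARTITIONS = {
    "users_0": (0, 100),      # CHECK (uid >= 0 AND uid < 100)
    "users_1": (100, None),   # CHECK (uid >= 100)
}

def tables_to_scan(upper_bound):
    """Tables that a query `WHERE uid < upper_bound` still has to read."""
    scan = ["users"]  # the master table has no CHECK constraint, so it always stays
    for table, (low, _high) in sorted(PARTITIONS.items()):
        # A partition that only holds uid >= low cannot contain rows with
        # uid < upper_bound when low >= upper_bound, so it can be skipped.
        if low < upper_bound:
            scan.append(table)
    return scan
```

For the query in the EXPLAIN output, tables_to_scan(100) keeps users and users_0 and drops users_1, which matches the second plan.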

PostgreSQL Tablespaces

On 2009/12/22, in PostgreSQL, by admin

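pgsql lets the administrator define tablespace storage locations in the file system, which can then be referenced when creating database objects. The benefit is obvious: database objects can be placed on different partitions, for example on better storage. After initdb there are two tablespaces by default, pg_global and pg_default.

To see which tablespaces pgsql currently has, try: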

postgres=> SELECT spcname FROM pg_tablespace;
  spcname
------------
 pg_default
 pg_global
(2 rows)

Or:

postgres=> \db
    Name    |  Owner   | Location
------------+----------+----------
 pg_default | postgres |
 pg_global  | postgres |

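The main thing to watch when creating a tablespace is permissions. It must be created in a new, empty directory owned by the database administrator's system user, postgres by default.

1. Create the directory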

$ mkdir /home/smallfish/pgdata
$ sudo chown -R postgres:postgres /home/smallfish/pgdata
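
Before running CREATE TABLESPACE it can help to sanity-check the directory. The sketch below (Python 3, standard library only) tests two of the requirements mentioned above, that the directory exists and is empty, plus the assumption that the path should be absolute. check_tablespace_dir is a hypothetical helper, the demo runs on a temporary directory, and ownership is deliberately not checked here.

```python
import os
import tempfile

def check_tablespace_dir(path):
    """Return a list of problems that would likely make CREATE TABLESPACE fail."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path must be absolute")
    if not os.path.isdir(path):
        problems.append("directory does not exist")
        return problems
    if os.listdir(path):
        problems.append("directory is not empty")
    return problems

# Demonstration on a throwaway directory (not a real tablespace location).
demo = tempfile.mkdtemp()
empty_result = check_tablespace_dir(demo)
with open(os.path.join(demo, "stray.txt"), "w") as fp:
    fp.write("x")
dirty_result = check_tablespace_dir(demo)
```

An empty list means the basic checks passed; the chown step is still needed so that the postgres user owns the directory.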

2. Enter psql

$ psql -U postgres -h 192.168.0.122

If the permissions are not set up correctly, the following statement will fail

postgres=> CREATE TABLESPACE space1 LOCATION '/home/smallfish/pgdata';

Create a test table

postgres=> CREATE TABLE foo(i int) TABLESPACE space1;

You can see that new files have appeared under the tablespace directory

postgres=> \! ls /home/smallfish/pgdata

Drop the tablespace. Note that all objects in the tablespace must be dropped first

postgres=> DROP TABLESPACE space1;

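OK, that covers creating a tablespace. Specifying TABLESPACE on every CREATE TABLE is a bit tedious, so let's set a default (this assumes space1 still exists, i.e. it was not dropped in the previous step).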

postgres=> SET default_tablespace = space1;
postgres=> CREATE TABLE foo(i int);

Getting Started with Apache Module Development in C

On 2009/12/16, in Apache, by admin

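Preface:

Most tutorials on the web about writing Apache extension modules revolve around Perl. The book Writing Apache Modules with Perl and C is something of a classic, but it targets old versions and its main language is Perl, with C covered only briefly. That said, extending Apache in Perl really is much quicker and more convenient than in C, and simpler too. I have used it at work, mainly for two small anti-hotlinking modules; see my other two posts, "apache+mod_perl anti-hotlinking" and "apache+mod_perl URL rewrite". Enough digression. This post only implements a simple hello module in C, which queries the user table of the mysql database that ships with MySQL.

System environment:

ArchLinux, Apache 2.2, MySQL 5.0

Development steps:

1. Create the hello module with apxs, which ships with Apache: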

[root#localhost] apxs -g -n hello

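This creates a hello module directory under the current directory, containing Makefile, mod_hello.c and modules.mk. For the location of apxs, look in your local apache/bin directory.

2. Take a look at mod_hello.c. apxs has already generated a pile of code for you, and all we need to do is modify parts of it. First, a brief description of what is in there.

The include section pulls in the necessary header files.
hello_handler is the body of the hello module; all output and request handling happens here.
hello_register_hooks and hello_module are required for the module's exports. Ignore them for now and leave them as generated.

3. Modify the hello_handler function. It takes a request_rec *r, which has many fields and related functions; see the documentation for details. ap_rputs is the output function; think of it as writing a string to r.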

static int hello_handler(request_rec *r)
{
    /* Skip the request unless the handler set in the Apache config is "hello" */
    if (strcmp(r->handler, "hello")) {
        return DECLINED;
    }
    r->content_type = "text/html"; /* set the Content-Type */
    if (!r->header_only)
        ap_rputs("The sample page from mod_hello.c\n", r); /* write some text */
    return OK; /* request handled; the response goes out with status 200 */
}

Add #include "mysql.h"; the query code needs this header.
See the end of this post for the full code.

4. Compile the module

[root#localhost] apxs -c -a -i -I/usr/include/mysql/ -lmysqlclient mod_hello.c

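You will see a bunch of compile commands. The -I and -l options are needed to build against MySQL. When the build finishes, LoadModule hello_module modules/mod_hello.so is added to httpd.conf automatically.

5. Edit httpd.conf

<Location /hello>
SetHandler hello
</Location>

6. Restart apache and visit http://localhost/hello to see whether it works.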

=====================

Full code:

#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h"
#include "ap_config.h"
/* Headers; this post uses the ap_rprintf function */
#include "apr.h"
#include "apr_lib.h"
#include "apr_strings.h"
#include "apr_want.h"
#include "mysql.h"

/* MySQL connection settings */
const char *host = "localhost";
const char *user = "root";
const char *pass = "smallfish";
const char *db    = "mysql";

/* The sample content handler */
static int hello_handler(request_rec *r)
{
    if (strcmp(r->handler, "hello")) {
        return DECLINED;
    }
    r->content_type = "text/html";
    /* MySQL variables */
    MYSQL mysql;
    MYSQL_RES *rs;
    MYSQL_ROW row;
    mysql_init(&mysql); /* initialize */
    if (!mysql_real_connect(&mysql, host, user, pass, db, 0, NULL, 0)) { /* connect to the database */
        ap_rprintf(r, "<li>Error:%d %s</li>\n", mysql_errno(&mysql), mysql_error(&mysql));
        mysql_close(&mysql); /* release the handle on failure too */
        return OK;
    }
    const char *sql = "select host,user from user order by rand()";
    if (mysql_query(&mysql, sql) != 0) { /* run the query */
        ap_rprintf(r, "<li>Error : %d %s</li>\n", mysql_errno(&mysql), mysql_error(&mysql));
        mysql_close(&mysql); /* close the connection before returning */
        return OK;
    }
    rs = mysql_store_result(&mysql); /* fetch the result set */
    while (rs && (row = mysql_fetch_row(rs))) { /* walk the rows; rs is NULL on error */
        ap_rprintf(r, "<li>%s - %s</li>\n", row[0], row[1]);
    }
    if (rs)
        mysql_free_result(rs); /* free the result set */
    mysql_close(&mysql); /* close the connection */
    return OK;
}

static void hello_register_hooks(apr_pool_t *p)
{
    ap_hook_handler(hello_handler, NULL, NULL, APR_HOOK_MIDDLE);
}

/* Dispatch list for API hooks */
module AP_MODULE_DECLARE_DATA hello_module = {
    STANDARD20_MODULE_STUFF,
    NULL,                  /* create per-dir    config structures */
    NULL,                  /* merge  per-dir    config structures */
    NULL,                  /* create per-server config structures */
    NULL,                  /* merge  per-server config structures */
    NULL,                  /* table of config file commands       */
    hello_register_hooks   /* register hooks                      */
};

Fixing the PYTHON_EGG_CACHE Error under mod_python

On 2009/12/16, in Apache, Python, by admin

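Environment: Linux, Apache, Python (mod_python)

After moving to a new machine where mod_python had not been configured yet, import MySQLdb in some applications produced the following error: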

ExtractionError: Can't extract file(s) to egg cache
The following error occurred while trying to extract file(s) to the Python egg
cache:
  [Errno 13] Permission denied: '/root/.python-eggs'
The Python egg cache directory is currently set to:
  /root/.python-eggs
Perhaps your account does not have write access to this directory?  You can
change the cache directory by setting the PYTHON_EGG_CACHE environment
variable to point to an accessible directory.

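There are two fixes:

1. Set the PYTHON_EGG_CACHE environment variable. SetEnv is an Apache directive, so the line below goes into the Apache configuration rather than a shell:

SetEnv PYTHON_EGG_CACHE /tmp/aaa/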

The directory must be writable by the apache user, or, to keep it simple, just make it 777
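
As an alternative sketch, the variable can also be set from Python itself, as long as that happens before the first egg-packaged module is imported. The helper name use_private_egg_cache and the directory prefix are made up; the code (Python 3 here) assumes the egg loader reads PYTHON_EGG_CACHE from os.environ at extraction time, which is what the error message above suggests.

```python
import os
import tempfile

def use_private_egg_cache(base_dir=None):
    """Point PYTHON_EGG_CACHE at a fresh writable directory and return its path."""
    # mkdtemp creates the directory owned by the current process user.
    cache_dir = tempfile.mkdtemp(prefix="python-eggs-", dir=base_dir)
    os.environ["PYTHON_EGG_CACHE"] = cache_dir
    return cache_dir

# Call this before the first import of an egg-packaged module such as MySQLdb.
cache_dir = use_private_egg_cache()
```

Because the directory is created by the running process, it is writable by whatever user Apache runs as, which sidesteps the 777 workaround.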

2. Convert the egg file into a directory

$ cd /python-path/site-packages/
$ mv MySQL_python-1.2.3c1-py2.5-linux-x86_64.egg foo.zip
$ mkdir MySQL_python-1.2.3c1-py2.5-linux-x86_64.egg
$ cd MySQL_python-1.2.3c1-py2.5-linux-x86_64.egg
$ unzip ../foo.zip
$ rm ../foo.zip
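
The same steps can be scripted with Python's standard zipfile module (Python 3 shown). This is a sketch: unpack_egg is a hypothetical helper, and the demo builds a fake egg in a temporary directory instead of touching site-packages.

```python
import os
import tempfile
import zipfile

def unpack_egg(egg_path):
    """Replace a zipped .egg file with a directory of the same name."""
    tmp_zip = egg_path + ".zip"
    os.rename(egg_path, tmp_zip)          # like: mv foo.egg foo.zip
    os.mkdir(egg_path)                    # like: mkdir foo.egg
    with zipfile.ZipFile(tmp_zip) as zf:
        zf.extractall(egg_path)           # like: unzip into the new directory
    os.remove(tmp_zip)                    # like: rm foo.zip
    return egg_path

# Demonstration with a fake egg built on the fly (file names are invented).
work = tempfile.mkdtemp()
fake_egg = os.path.join(work, "demo-1.0-py2.5.egg")
with zipfile.ZipFile(fake_egg, "w") as zf:
    zf.writestr("demo/__init__.py", "VALUE = 1\n")
unpacked = unpack_egg(fake_egg)
```

Once the egg is a plain directory, nothing has to be extracted to a cache at import time, so the permission error goes away.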

Calling Linux SCP from Java

On 2009/12/16, in Java, by admin

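First, a quick review of the scp command on Linux:

[shell $] scp -r /local-dir-or-file [email protected]:/remote-dir

This command copies a local directory or file to a directory on the remote host 192.168.0.110. To copy from remote to local, swap the address and the directory. -r copies directories recursively. For more, see scp --help

Recently I needed to call scp from Java, triggered by a JSP page. To avoid a password prompt when running the system command, the two machines were set up to trust each other (see "ssh, scp without entering a password"). The code was roughly: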

Runtime.getRuntime().exec("scp /aa.txt [email protected]:/bb");

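No exception was thrown inside the try block, but the file was not copied. In the end, the return value obtained through Process.waitFor() turned out to be 1.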

So the system command had definitely failed. Yet the command printed with System.out.println ran without errors when pasted into a Linux shell. Why?
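
The pitfall is that starting a process and the process succeeding are two different things: a failing command reports its problem through the exit code and stderr, not through an exception in the caller. Here is the same idea sketched in Python 3 (used for all added examples on this page, not the Java from this post). run_command is a made-up helper and the failing child process merely stands in for scp.

```python
import subprocess
import sys

def run_command(argv):
    """Run a command and return (exit_code, stderr_text) without raising."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.returncode, proc.stderr.decode("utf-8", "replace")

# Stand-in for a failing scp: a child process that complains and exits with 1.
fake_scp = [sys.executable, "-c",
            "import sys; sys.stderr.write('Permission denied'); sys.exit(1)"]
code, err = run_command(fake_scp)
```

Reading the error stream alongside the exit code usually shows the real reason for the failure much sooner than the exit code alone.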

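It turned out to be a user permission problem. Apache runs as nobody by default, which had no permission to run scp, and the trust relationship had been configured for the local root user.

So I added a new user with adduser, set up the trust relationship for it, passed an RSA key file to scp with -i, and copied the key file to /tmp with mode 0755. After refreshing the page, the backend output showed a password prompt.

This was getting tricky, so I searched around and found a Java library for scp: Ganymed SSH-2 for Java. It looks rather old, but let's test it.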

// needs: java.io.IOException, ch.ethz.ssh2.Connection, ch.ethz.ssh2.SCPClient
Connection conn = new Connection("192.168.0.110");
conn.connect();
boolean isAuthenticated = conn.authenticateWithPassword("root", "***********");
if (!isAuthenticated)
    throw new IOException("Authentication failed.");
SCPClient client = new SCPClient(conn);
client.put("/aa.txt", "/bb"); // local file, remote target directory
conn.close();

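OK! It worked on the first run. So I gave up on calling the system command and just use this library directly.

The first argument of client.put can also be an array of file names. I have not found a method for copying a whole directory yet, so for now I collect the directory's file list myself.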