ls: reading directory .: Input/output error


Yesterday I changed into a directory (/data2) and ran into an error:

15:42:39 root@bookfm [~] # cd /data2/
15:42:58 root@bookfm [/data2] # ls
ls: reading directory .: Input/output error

First, unmount the partition:

15:43:01 root@bookfm [/data2] # umount /data2

That was quite strange. After thinking about it for a while, I remembered that this partition sits on LVM, so:

16:17:48 root@bookfm [~] # vgscan
Reading all physical volumes.  This may take a while...
/dev/data_backup/backup_data: read failed after 0 of 4096 at 0: Input/output error
/dev/data_backup/backup_data: read failed after 0 of 4096 at 4096: Input/output error
/dev/data_backup/backup_data: read failed after 0 of 4096 at 0: Input/output error
Found volume group "vg_data" using metadata type lvm2
Found volume group "data_backup" using metadata type lvm2
Found volume group "VolGroup_ID_9201" using metadata type lvm2

It looked like something was wrong with the data_backup volume group, so I wanted to see whether it could be reactivated, deactivating it first:

16:24:53 root@bookfm [~] # vgchange -an data_backup
/dev/data_backup/backup_data: read failed after 0 of 4096 at 0: Input/output error
0 logical volume(s) in volume group "data_backup" now active
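
The read failures point at the storage underneath the volume group rather than at LVM itself. A few commands that help narrow this down (a sketch only; /dev/sdb is a placeholder for whichever physical volume backs data_backup, and smartctl needs the smartmontools package):

# Kernel messages usually name the failing block device
dmesg | tail -n 50
# Which physical volume(s) back the data_backup volume group?
pvs
vgdisplay -v data_backup
# Disk health of the suspect drive
smartctl -a /dev/sdb
# Re-attempt activation once the underlying device is healthy again
vgchange -ay data_backup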


Upgrading MySQL 5.1 to Percona Server 5.5

1. Download

Download it from Percona's official site: http://www.percona.com

The latest version: percona-server-5.5.36-34.2


[root@server3 ~]# wget http://www.percona.com/downloads/Percona-Server-5.5/LATEST/source/tarball/percona-server-5.5.36-34.2.tar.gz

2. Compile and install

Note: cmake has to be installed beforehand.
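
If cmake and the usual build tool chain are missing, on a CentOS/RHEL-style system they can be installed with yum, roughly like this (package names assume the standard repositories):

yum install -y cmake gcc gcc-c++ make ncurses-devel bison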


[root@server3 ~]# tar xf percona-server-5.5.36-34.2.tar.gz
[root@server3 ~]# cd percona-server-5.5.36-34.2
[root@server3 percona-server-5.5.36-34.2]# cmake . -DCMAKE_INSTALL_PREFIX=/usr/local/percona-server-5.5.36-34.2 \
> -DMYSQL_DATADIR=/usr/local/mysql/data \
> -DMYSQL_UNIX_ADDR=/tmp/mysql.sock \
> -DDEFAULT_CHARSET=utf8 \
> -DDEFAULT_COLLATION=utf8_general_ci \
> -DWITH_EXTRA_CHARSETS:STRING=all \
> -DWITH_INNOBASE_STORAGE_ENGINE=1 \
> -DWITH_READLINE=1 \
> -DENABLED_LOCAL_INFILE=1  \
> -DWITH_DEBUG=0 \
> -DMYSQL_TCP_PORT=3306 \
> -DWITH_SSL=system \
> -DWITH_ZLIB=system \
> -DWITH_LIBWRAP=0

-DMYSQL_DATADIR=/usr/local/mysql/data points at the data directory of the existing MySQL 5.1 installation. The option can also be left out, since the data directory can be set in my.cnf after installation.
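
For reference, a minimal my.cnf fragment mirroring the cmake options above might look like this:

[mysqld]
datadir = /usr/local/mysql/data
socket  = /tmp/mysql.sock
port    = 3306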


[root@server3 percona-server-5.5.36-34.2]# make && make install
[  0%] Built target INFO_BIN
Scanning dependencies of target INFO_SRC
[  0%] Built target INFO_SRC
Scanning dependencies of target abi_check
[  0%] Built target abi_check
Scanning dependencies of target readline
[  0%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/readline.c.o
[  1%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/funmap.c.o
[  1%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/keymaps.c.o
[  1%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/vi_mode.c.o
[  1%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/parens.c.o
[  1%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/rltty.c.o
[  1%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/complete.c.o
[  2%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/bind.c.o
[  2%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/isearch.c.o
[  2%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/display.c.o
[  2%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/signals.c.o
[  2%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/util.c.o
[  2%] Building C object cmd-line-utils/readline/CMakeFiles/readline.dir/kill.c.o
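
Once the new binaries are installed and Percona Server 5.5 has been started against the old 5.1 data directory, the system tables still have the 5.1 layout, so running mysql_upgrade is the usual final step of this kind of in-place upgrade (a sketch; credentials and the exact path depend on the setup):

/usr/local/percona-server-5.5.36-34.2/bin/mysql_upgrade -uroot -p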


A small Python crawler

A small crawler written in Python that scrapes book information from Amazon. It consists of two scripts: one downloads the pages, the other parses the downloaded pages. Downloading uses multiple processes: several instances of the download script run in parallel and coordinate through the database.

Table creation statement

CREATE TABLE `AMAZON_BOOK` (
  `BOOK_ID` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'Book ID',
  `ASIN` varchar(100) NOT NULL COMMENT 'Amazon book identifier',
  `BOOK_NAME` varchar(200) DEFAULT NULL COMMENT 'Book title',
  `BOOK_SERIES` varchar(100) DEFAULT NULL COMMENT 'Series name',
  `ORIGINAL_BOOK_NAME` varchar(200) DEFAULT NULL COMMENT 'Original (foreign-language) title',
  `AUTHOR` varchar(200) DEFAULT NULL COMMENT 'Author',
  `EDITOR` varchar(200) DEFAULT NULL COMMENT 'Managing editor',
  `EDITOR_CONTACT` varchar(400) DEFAULT NULL COMMENT 'Managing editor contact info',
  `PUBLISHER_NAME` varchar(200) DEFAULT NULL COMMENT 'Publisher name',
  `PUBLISH_DATE` date DEFAULT NULL COMMENT 'Publication date',
  `PUBLISH_VERSION` varchar(100) DEFAULT NULL COMMENT 'Edition',
  `PRINTED_COUNT` int(11) DEFAULT '0' COMMENT 'Impression (printing) number',
  `PRINTED_DATE` date DEFAULT NULL COMMENT 'Printing date',
  `ISBN` varchar(100) DEFAULT NULL COMMENT 'ISBN',
  `BARCODE` varchar(100) DEFAULT NULL COMMENT 'Barcode',
  `WORD_COUNT` varchar(50) DEFAULT NULL COMMENT 'Word count',
  `FACT_PAGE_COUNT` int(11) DEFAULT NULL COMMENT 'Actual page count',
  `PAGE_COUNT` int(11) DEFAULT NULL COMMENT 'E-book page count',
  `CHAPTER_COUNT` int(11) DEFAULT NULL COMMENT 'Number of chapters',
  `PRINTED_QUANTITY` int(11) DEFAULT NULL COMMENT 'Print run',
  `FOLIO` varchar(50) DEFAULT NULL COMMENT 'Format (folio size)',
  `PAPER_MATERIAL` varchar(50) DEFAULT NULL COMMENT 'Paper stock',
  `PACK` varchar(50) DEFAULT NULL COMMENT 'Binding/packaging',
  `INTRODUCTION` mediumtext COMMENT 'Synopsis',
  `AUTHOR_INTRODUCTION` mediumtext COMMENT 'About the author',
  `EDITOR_COMMENT` mediumtext COMMENT 'Editorial review',
  `CELEBRITY_COMMENT` mediumtext COMMENT 'Celebrity reviews',
  `TABLE_OF_CONTENTS` mediumtext COMMENT 'Table of contents',
  `TAGS` varchar(200) DEFAULT NULL COMMENT 'Tags',
  `BOOK_CATEGORY_CODE` varchar(35) DEFAULT NULL COMMENT 'Book category code',
  `PAPER_PRICE` decimal(13,4) DEFAULT NULL COMMENT 'Paper book list price',
  `LANGUAGE` varchar(200) DEFAULT NULL COMMENT 'Language',
  `PACKAGE_SIZE` varchar(200) DEFAULT NULL COMMENT 'Product dimensions',
  `PACKAGE_WEIGHT` varchar(200) DEFAULT NULL COMMENT 'Product weight',
  `TRANSLATOR` varchar(200) DEFAULT NULL COMMENT 'Translator',
  `EDITOR_DEPARTMENT` varchar(200) DEFAULT NULL COMMENT 'Editorial department',
  `CREATE_DATETIME` datetime DEFAULT NULL,
  `CREATE_BY` varchar(100) DEFAULT NULL,
  `UPDATE_DATETIME` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `UPDATE_BY` varchar(100) DEFAULT NULL,
  `DELETE_FLAG` tinyint(4) DEFAULT '0' COMMENT '0 = not deleted, 1 = deleted',
  `PAPER_SALE_PRICE` decimal(13,4) DEFAULT NULL COMMENT 'Paper book sale price',
  `DOWNLOAD_SUCCESS` tinyint(1) DEFAULT '0' COMMENT '0 = not downloaded yet, 1 = downloaded successfully',
  `IS_DOWNLOADING` tinyint(1) DEFAULT '0' COMMENT '0 = not being downloaded, 1 = download in progress',
  `PARSE_OK` tinyint(1) DEFAULT NULL COMMENT '0 = not parsed yet, 1 = parsed successfully',
  PRIMARY KEY (`BOOK_ID`),
  UNIQUE KEY `ASIN` (`ASIN`)
) ENGINE=InnoDB AUTO_INCREMENT=110297 DEFAULT CHARSET=utf8 COMMENT='Books';

Populate the ASIN column in the database yourself (the ASINs can be collected by parsing Amazon search result pages).
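
For illustration, a single row can be seeded like this (the ASIN value below is made up):

INSERT INTO AMAZON_BOOK (ASIN) VALUES ('B00EXAMPLE0');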

Page download script

#!/usr/bin/env python
# coding=utf-8

import os
import time
import socket
import urllib
import MySQLdb


# Progress callback for urllib.urlretrieve
def reportHook(blocks_read, block_size, total_size):
    if not blocks_read:
        print 'Connection opened'
        return
    if total_size < 0:
        print 'Read %d blocks (%d bytes)' % (blocks_read, blocks_read * block_size)
    else:
        amount_read = blocks_read * block_size
        print 'Read %d blocks, or %d/%d' % (blocks_read, amount_read, total_size)
    return


# Download a URL to a local file
def downloadPage(url, filename):
    urllib.urlretrieve(url, filename, reporthook=reportHook)


# Create a directory named after the book's ASIN
def createDir(dir_name):
    if not os.path.exists(dir_name):
        os.mkdir(dir_name)


# Check that the downloaded file exists and is larger than ~100 KB (smaller files are treated as failed downloads)
def existsFile(file_name):
    if os.path.exists(file_name) and os.path.getsize(file_name) > 102400:
        return True
    return False


# Run a SELECT statement and return all rows
def fetchAllResult(cursor, sql):
    cursor.execute(sql)
    return cursor.fetchall()


# Run an UPDATE statement
def updateSQL(cursor, sql):
    cursor.execute(sql)


# Download the pages for one book (ASIN)
def startDownLoad(conn, start_asni, error_file):

    # Mark the row as downloading (cur is the module-level cursor created below), then fetch the product page and the description page
    print
    print "-------start download: < %s > --------" % start_asni
    updateSQL(cur, set_downloading_sql % start_asni)
    conn.commit()
    product_file_name = start_asni + '.html'
    product_description_file_name = start_asni + '_description.html'
    try:
        downloadPage(product_url % start_asni, product_file_name)
        downloadPage(product_description % start_asni, product_description_file_name)
        if existsFile(product_file_name) and existsFile(product_description_file_name):
            updateSQL(cur, set_success_sql % start_asni)
            conn.commit()
            print "..... Book [%s] download OK..... " % start_asni
        else:
            print "..... Book [%s] download failure, will restart download ..... " % start_asni
            updateSQL(cur, set_failure_sql % start_asni)
            conn.commit()
    except:
        print "******* [ %s ] download exception *******" % start_asni
        print
        updateSQL(cur, set_failure_sql % start_asni)
        conn.commit()
        error_file.write(start_asni + '\n')
        time.sleep(2)
    conn.commit()


# Connect to MySQL
def connMysql(host, user, passwd, db, port):
    conn = None
    try:
        conn = MySQLdb.connect(host=host, user=user, passwd=passwd, db=db, port=port, charset='utf8')
    except MySQLdb.Error, e:
        print "Mysql Error [ %d ]: %s" % (e.args[0], e.args[1])
    return conn



# Database connection settings (host, user, password, schema, port)
host = 'localhost'
user = 'root'
passwd = '123456'
db = 'download_book'
port = 3306
book_root = "D:\\AMAZON_BOOK"

socket.setdefaulttimeout(20)

# Product page URL template
product_url = 'http://www.amazon.cn/111fsfsdfd/dp/%s'

# Product description page URL template
product_description = 'http://www.amazon.cn/111fsfsdfd/dp/product-description/%s'


# SQL: fetch one book that has not been downloaded and is not currently being downloaded
get_notstart_sql = """
    SELECT
        BOOK_ID,ASIN,DOWNLOAD_SUCCESS,IS_DOWNLOADING
    FROM
        AMAZON_BOOK
    WHERE
        DOWNLOAD_SUCCESS = 0 AND IS_DOWNLOADING = 0
    LIMIT 1
    """

# SQL: mark a book as being downloaded
set_downloading_sql = """
    UPDATE
        AMAZON_BOOK
    SET
        IS_DOWNLOADING = 1
    WHERE
        ASIN = '%s'
    """

# SQL: mark a book as downloaded successfully
set_success_sql = """
    UPDATE
        AMAZON_BOOK
    SET
        DOWNLOAD_SUCCESS = 1
    WHERE
        ASIN = '%s'
    """

# SQL: reset the flags after a failed download
set_failure_sql = """
    UPDATE
        AMAZON_BOOK
    SET
        DOWNLOAD_SUCCESS = 0,IS_DOWNLOADING = 0
    WHERE
        ASIN = '%s'
    """


conn = connMysql(host, user, passwd, db, port)
if conn:
    cur = conn.cursor()
    errorlog = open('d:\\error.log','a')

    # Fetch book identifiers (ASINs) one at a time and download their pages
    while True:
        os.chdir(book_root)
        start_asni = fetchAllResult(cur, get_notstart_sql)
        conn.commit()

        # Check whether the query returned anything
        if start_asni:
            asni = start_asni[0][1]
            createDir(asni)
            os.chdir(asni)
            startDownLoad(conn, asni, errorlog)
        else:
            print
            print "-------- There are no more book pages to download! ---------"
            break
    errorlog.close()
    cur.close()
    conn.close()
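
The DOWNLOAD_SUCCESS/IS_DOWNLOADING flags are what let several copies of this script run side by side, so parallel downloading simply means starting the script in a few separate terminals (the file name below is only a placeholder for whatever the script is saved as):

python download_amazon_pages.py

Note that the SELECT and the following IS_DOWNLOADING update are not protected by row locking, so two instances can occasionally grab the same ASIN; picking the row with SELECT ... FOR UPDATE inside one transaction would close that small window.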


Configuring log4j for Tomcat

1. Why use log4j?

Tomcat's built-in logging writes everything to catalina.out by default, and some exceptions are logged without a timestamp, which makes it hard to pin down when a problem occurred. That is why log4j is used instead: it can redirect everything written to System.out/System.err to dedicated files rather than to catalina.out, it can roll the log files by date, and it can of course still write to catalina.out at the same time.

Detailed introductions:

http://logging.apache.org/log4j/2.x/manual/filters.html

http://www.cnblogs.com/struggletofly/p/log4j.html
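
As a minimal illustration of the date-based rolling described above (assuming log4j 1.x; the appender name, file path and pattern are only examples), a log4j.properties could look roughly like this:

log4j.rootLogger=INFO, FILE
log4j.appender.FILE=org.apache.log4j.DailyRollingFileAppender
log4j.appender.FILE.File=${catalina.base}/logs/app.log
log4j.appender.FILE.DatePattern='.'yyyy-MM-dd
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d [%t] %-5p %c - %m%n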

2. Ways to configure log4j

Configuration reference:
http://tomcat.apache.org/tomcat-6.0-doc/logging.html

1. Configure it inside the application: the developer sets this up when writing the program, so it can target a single application; this approach is not covered here.

http://logging.apache.org/log4j/2.x/manual/configuration.html

http://blog.csdn.net/azheng270/article/details/2173430/


What exactly happens during a complete HTTP transaction?

Disclaimer: the statements in this article are just a summary of my own understanding and are not necessarily completely accurate, but they should help with understanding the process.

For background on the HTTP protocol, see:

A ramble on the HTTP protocol  http://kb.cnblogs.com/page/140611/
An overview of the HTTP protocol  http://www.cnblogs.com/vamei/archive/2013/05/11/3069788.html
Understanding HTTP headers in every respect  http://kb.cnblogs.com/page/55442/

When we type www.linux178.com into the browser's address bar and press Enter, what actually happens between that instant and the page appearing on screen?

Domain name resolution --> TCP three-way handshake --> once the TCP connection is up, the browser sends the HTTP request --> the server answers the HTTP request and the browser receives the HTML --> the browser parses the HTML and requests the resources it references (JS, CSS, images, etc.) --> the browser renders the page and presents it to the user
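
As a rough sketch of the first few steps (name resolution, TCP connection, raw HTTP request/response), here is what they look like when done by hand in Python; a real browser adds caching, keep-alive, parallel resource fetching and rendering on top of this:

import socket

host = 'www.linux178.com'

# 1. Domain name resolution: turn the host name into an IP address
ip = socket.gethostbyname(host)
print 'Resolved %s to %s' % (host, ip)

# 2. The TCP three-way handshake happens inside connect()
sock = socket.create_connection((ip, 80), 10)

# 3. Send a minimal HTTP/1.1 GET request
request = ('GET / HTTP/1.1\r\n'
           'Host: %s\r\n'
           'Connection: close\r\n'
           '\r\n') % host
sock.sendall(request)

# 4. Read the response: status line, headers, then the HTML body
response = ''
while True:
    chunk = sock.recv(4096)
    if not chunk:
        break
    response += chunk
sock.close()

print response.split('\r\n')[0]   # e.g. "HTTP/1.1 200 OK"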
