Notes on Using the SAE Real-Time Log API with Python

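SAE's recently opened real-time log API lets developers fetch application logs from the SAE servers over HTTP GET, which makes it possible to debug and analyze an application online.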

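API parameters:

The request URL format of the SAE log API is: GET /log/(string: service)/(string: date)/(string: ident).log?(string: fop)

The parameters are:

date is the date of the log, in yyyy-MM-dd format.

service is one of the services SAE provides: http, taskqueue (task queue), cron (scheduled tasks), mail (email), rdc (relational database cluster), storage, push (push notifications), and fetchurl (URL fetching). Developers familiar with SAE will recognize all of them.

ident is the log type under the given service: access, error, alert, debug, warning, or notice.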

The table below shows which ident values are available for each service (taken from the SAE log API documentation):

service      ident
http         access, error, alert, debug, warning, notice
taskqueue    error
cron         error
mail         access, error
rdc          error, warning
storage      access
push         access
fetchurl     access
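
The URL format and the table above can be combined into a small validation step before a request is sent. The following is a minimal sketch, not part of the SAE API: the names SERVICE_IDENTS and build_log_path are hypothetical, and the path follows the documented format literally.

```python
from datetime import datetime

# Service-to-ident mapping copied from the table above.
SERVICE_IDENTS = {
    'http': ['access', 'error', 'alert', 'debug', 'warning', 'notice'],
    'taskqueue': ['error'],
    'cron': ['error'],
    'mail': ['access', 'error'],
    'rdc': ['error', 'warning'],
    'storage': ['access'],
    'push': ['access'],
    'fetchurl': ['access'],
}


def build_log_path(service, date, ident):
    """Return the request path for one log file (hypothetical helper)."""
    if service not in SERVICE_IDENTS:
        raise ValueError('unknown service: %r' % (service,))
    if ident not in SERVICE_IDENTS[service]:
        raise ValueError('ident %r is not available for service %r'
                         % (ident, service))
    # strptime raises ValueError for anything that is not a real date;
    # strftime then normalizes the result to zero-padded yyyy-MM-dd.
    date = datetime.strptime(date, '%Y-%m-%d').strftime('%Y-%m-%d')
    return '/log/%s/%s/%s.log' % (service, date, ident)
```

For example, build_log_path('http', '2015-07-22', 'access') returns '/log/http/2015-07-22/access.log', while build_log_path('cron', '2015-07-22', 'access') raises ValueError because cron only offers error logs.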

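fop (apparently short for "flow operation") is the stream-operation parameter. It supports operations modeled on common Linux shell commands such as head, tail, and grep. It also supports pipelines, with operations separated by a vertical bar (|).

fop operations (taken from the SAE log API documentation):

head/OFFSET/LIMIT: returns lines from the beginning of the log. OFFSET is the starting line number and LIMIT is the maximum number of lines to return.
tail/OFFSET/LIMIT: returns lines from the end of the log. OFFSET is the starting line number (the last line is line 1) and LIMIT is the maximum number of lines to return.
grep/PATTERN: keyword matching. The argument is the keyword; a subset of regular expressions is supported, following Lua pattern syntax, e.g. yq2[^6]+$ .
fields/SEPARATOR/COL1/COL2/...: selects some of the columns. SEPARATOR is the separator between columns; COL1 and so on are the numbers of the columns to keep, counting from 1.
uniq/SEPARATOR/COL1/COL2/...: removes adjacent duplicate lines. The columns used for the comparison can be specified; with no arguments the whole line is compared. The arguments are the same as for fields.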
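
Because fop is just a string of operations joined by vertical bars, it can be assembled from small pieces. The sketch below shows one way to do so. The helper names (head, tail, grep, build_fop) are made up for illustration and are not part of any SAE library. Nothing is escaped, since the escaping rules for "/" or "|" inside a pattern are not covered here.

```python
def head(offset, limit):
    """Build a head/OFFSET/LIMIT operation string."""
    return 'head/%d/%d' % (offset, limit)


def tail(offset, limit):
    """Build a tail/OFFSET/LIMIT operation string."""
    return 'tail/%d/%d' % (offset, limit)


def grep(pattern):
    """Build a grep/PATTERN operation string (pattern is used verbatim)."""
    return 'grep/' + pattern


def build_fop(*operations):
    """Join operation strings into one fop value, skipping empty ones."""
    cleaned = [op.strip() for op in operations if op and op.strip()]
    return '|'.join(cleaned)
```

For example, build_fop(grep('error'), tail(1, 20)) yields 'grep/error|tail/1/20', which can then be passed as the fop parameter.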

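Using the log API:

This section shows how to fetch logs online from Python through the SAE real-time log API, and provides a simple sae log utility module.

1). SAE Python ApibusHandler (used to authenticate the application)
Download link: https://raw.githubusercontent.com/sinacloud/sae-python-dev-guide/master/examples/apibus/apibus_handler.py

Save it as apibus_handler.py

2). Python requests module (used for HTTP GET)
Download link: https://github.com/kennethreitz/requests/archive/master.zip

Extract the requests directory from the archive into the working directory

3). sae log utility module:

Remember to set the application's ACCESSKEY and SECRETKEY before use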

# -*- coding: utf-8 -*-
# sae log utility based on requests and apibus_handler

import requests
from apibus_handler import SaeApibusAuth

# ACCESSKEY and SECRETKEY must be set to the application's keys
ACCESSKEY = ''
SECRETKEY = ''

# Readable names for the status codes this utility recognizes
status_code_dict = {
    200: 'OK',
    206: 'Partial Content',
    400: 'Bad Request',
    404: 'Not Found',
    500: 'Internal Server Error',
}

# Valid ident values for each service (from the SAE log API documentation)
service_ident_dict = {
    'http': ['access', 'error', 'alert', 'debug', 'warning', 'notice'],
    'taskqueue': ['error'],
    'cron': ['error'],
    'mail': ['access', 'error'],
    'rdc': ['error', 'warning'],
    'storage': ['access'],
    'push': ['access'],
    'fetchurl': ['access'],
}


def fetch_log(service, date, ident, fop='', version=1):
    """Fetch one log file and return a dict with status_code, content and status."""
    if service not in service_ident_dict:
        raise ValueError('Invalid Service Parameter')
    if ident not in service_ident_dict[service]:
        raise ValueError('Invalid Ident Parameter')
    # The log file name is built as <version>-<ident>.log
    url = 'http://g.sae.sina.com.cn/log/%s/%s/%s-%s.log' % (service, date, version, ident)
    if fop:
        url += '?' + fop
    r = requests.get(url, auth=SaeApibusAuth(ACCESSKEY, SECRETKEY))
    status = status_code_dict.get(r.status_code, 'Unknown')
    # The body is returned only when the request succeeded with 200
    content = r.content if r.status_code == 200 else ''
    return {'status_code': r.status_code, 'content': content, 'status': status}

Save it as sae_log_util.py
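
As an alternative to writing the keys into the source file, they could be read from the environment. This is only a sketch of that idea: the variable names SAE_ACCESSKEY and SAE_SECRETKEY and the helper load_credentials are assumptions made for this example, not names defined by SAE.

```python
import os


def load_credentials(environ=None):
    """Read the access and secret keys from environment variables.

    SAE_ACCESSKEY and SAE_SECRETKEY are hypothetical names chosen for
    this example; use whatever names suit your deployment.
    """
    environ = os.environ if environ is None else environ
    access_key = environ.get('SAE_ACCESSKEY', '')
    secret_key = environ.get('SAE_SECRETKEY', '')
    if not access_key or not secret_key:
        raise RuntimeError('SAE_ACCESSKEY and SAE_SECRETKEY must both be set')
    return access_key, secret_key
```

Because fetch_log reads ACCESSKEY and SECRETKEY from the module at call time, the loaded values can be assigned after import: sae_log_util.ACCESSKEY, sae_log_util.SECRETKEY = load_credentials().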

4). Test code (sae_log_demo.py):

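# -*- coding: utf-8 -*-
import sae_log_util

service = 'http'
date = '2015-07-22'
ident = 'access'
# Keep only the lines that contain "Yahoo! Slurp;"
fop = 'grep/Yahoo! Slurp;'
# print() with a single argument works on both Python 2 and Python 3
print(sae_log_util.fetch_log(service, date, ident, fop))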

The test code above fetches the HTTP access log for 2015-07-22 and uses grep to keep only the lines that contain the string Yahoo! Slurp;.
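
The dictionary returned by fetch_log holds the raw response body, so a caller usually wants to check the status and split the content into lines. A minimal sketch of that step follows. The helper name log_lines is hypothetical, and decoding the body as UTF-8 is an assumption about the log encoding.

```python
def log_lines(result):
    """Return the log lines from a fetch_log() result, or raise on failure."""
    if result['status_code'] != 200:
        raise RuntimeError('log request failed: %s %s'
                           % (result['status_code'], result['status']))
    content = result['content']
    # requests exposes the body as bytes; decode it so the lines are text.
    # UTF-8 is assumed here; undecodable bytes are replaced, not fatal.
    if isinstance(content, bytes):
        content = content.decode('utf-8', 'replace')
    return content.splitlines()
```

With it, the demo could iterate over log_lines(sae_log_util.fetch_log(service, date, ident, fop)) instead of printing the whole dictionary.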

Article link: http://bookshadow.com/weblog/2015/07/22/sae-log-api-python-note/
Please respect the author's work and credit the source when reposting. 书影博客 reserves all rights to this article.
