这是一个简单的 Bigquery 数据集,其中包含 Google Cloud Service Health (CSH) 事件。
您可以使用它来查询事件并过滤您感兴趣的事件。
每分钟都会触发一次。如果现有中断没有更新或未检测到新中断,则不会插入新行。
您还可以将其与资产清单 API 结合使用,将给定位置/区域中的事件与可能受影响的资产关联起来。
不管怎样,现有的CSH
仪表板提供了各种格式的数据,例如RSS
和JSON
,同时您可以使用jq
发出简单的查询和过滤器
curl -s https://status.cloud.google.com/incidents.json | jq -r ' .[] | select(.service_name == "Google Compute Engine") '
然而,它只是不易于使用。
因此,您或任何人都可以在 bigquery 表中使用而不是原始 json...
这是数据集:
要使用此功能,请首先将以下项目添加到 UI gcp-status-log
中。完成后,您发出的任何查询都将使用此数据集,但会根据您自己的使用情况对您的项目进行计费。 (即,我只是提供数据......您运行的查询就是您将支付的费用)
注意:谷歌不支持此存储库、数据集和代码。买者自负
bq query --nouse_legacy_sql '
SELECT
DISTINCT(id), service_name,severity,external_desc, begin,`end` , modified
FROM
gcp-status-log.status_dataset.status
WHERE
service_name = "Google Compute Engine"
ORDER BY
modified
'
+----------------------+-----------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+---------------------+
| id | service_name | severity | external_desc | begin | end | modified |
+----------------------+-----------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+---------------------+
| pxD6QciVMd9gLQGcREbV | Google Compute Engine | medium | Requests to and from Google Compute Instances in us-west2 may see increase traffic loss when using the instance's public IP | 2021-05-05 02:11:10 | 2021-05-05 04:54:57 | 2021-05-05 04:54:57 |
| LGFBxyLwbh92E47fAzJ5 | Google Compute Engine | medium | Mutliregional Price for E2 Free Tier core is set incorrectly | 2021-08-01 07:00:00 | 2021-08-04 23:18:00 | 2021-08-05 17:35:12 |
| gwKjX9Lukav15SaFPbBF | Google Compute Engine | medium | us-central1, europe-west1, us-west1, asia-east1: Issue with Local SSDs on Google Compute Engine. | 2021-09-01 02:35:00 | 2021-09-03 03:55:00 | 2021-09-07 21:39:46 |
| rjF86FbooET3FDpMV9w1 | Google Compute Engine | medium | Increased VM failure rates in a subset of Google Cloud zones | 2021-09-17 15:00:00 | 2021-09-17 18:25:00 | 2021-09-20 23:33:53 |
| ZoUf49v2qbJ9xRK63kaM | Google Compute Engine | medium | Some Users might have received credit cards deemed invalid email erroneously. | 2021-11-13 07:14:48 | 2021-11-13 08:29:30 | 2021-11-13 08:29:30 |
| SjJ3FN51MAEJy7cZmoss | Google Compute Engine | medium | Global: pubsub.googleapis.com autoscaling not worked as expected | 2021-12-07 09:56:00 | 2021-12-14 00:59:00 | 2021-12-14 19:59:08 |
+----------------------+-----------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+---------------------+
...这就是我 BQ 技能的极限。如果您有任何建议的查询,请在 github issues 中给我留言
此处使用的架构与 Incidents.schema.json 中显示的仪表板的 JSON 输出提供的格式几乎相同。
唯一的区别是 BQ 中的每一行都是一个单独的事件,而不是根据上面提供的架构将所有事件封装到单个 JSON 中。
此外,该架构还有两个新列:
insert_timestamp
:这是插入行/事件时的TIMESTAMP
snapshot_hash
:这是下载的incident.json
文件的 Base64 编码哈希值。您可以在此存储库中查看示例架构。
bq show --format=prettyjson --schema gcp-status-log:status_dataset.status
对于其他人,您可以使用以下组合自行设置整个事情
Cloud Scheduler -> Cloud Run -> BigQUery
下面的设置是:
Cloud Scheduler
每分钟都会安全地调用Cloud Run
服务Cloud Run
从 GCS 存储桶下载文件,该存储桶保存上次插入的JSON CSH 文件的哈希值Cloud Run
下载并解析 JSON CSH 数据Cloud Run
将 CSH 事件插入BigQuery
当然,如果没有更新,此方案取决于 JSON CSH 文件保留相同的哈希值(例如,它不包含其自身更新的新鲜度时间戳)
export PROJECT_ID= ` gcloud config get-value core/project `
export PROJECT_NUMBER= ` gcloud projects describe $PROJECT_ID --format= ' value(projectNumber) ' `
gcloud services enable containerregistry.googleapis.com
run.googleapis.com
bigquery.googleapis.com
cloudscheduler.googleapis.com
storage.googleapis.com
# # create the datasets. We are using DAY partitioning
bq mk -d --data_location=US status_dataset
bq mk --table status_dataset.status schema.json
# # create service accounts for cloud run and scheduler
gcloud iam service-accounts create schedulerunner --project= $PROJECT_ID
gcloud iam service-accounts create cloudrunsvc --project= $PROJECT_ID
bq add-iam-policy-binding
--member=serviceAccount:cloudrunsvc@ $PROJECT_ID .iam.gserviceaccount.com
--role=roles/bigquery.admin status_dataset.status
gcloud projects add-iam-policy-binding $PROJECT_ID
--member= " serviceAccount:cloudrunsvc@ $PROJECT_ID .iam.gserviceaccount.com "
--role= " roles/bigquery.jobUser "
# create a gcs bucket to store hash of the incidents json file
# the first value of the hash will force a reload of the incidents.json file
gsutil mb -l us-central1 gs:// $PROJECT_ID -status-hash
echo -n " foo " > /tmp/hash.txt
gsutil cp /tmp/hash.txt gs:// $PROJECT_ID -status-hash/
gsutil iam ch serviceAccount:cloudrunsvc@ $PROJECT_ID .iam.gserviceaccount.com:roles/storage.admin gs:// $PROJECT_ID -status-hash/
# # you may also need to allow your users access to the dataset https://cloud.google.com/bigquery/docs/dataset-access-controls
# # build and deploy the cloud run image
docker build -t gcr.io/ $PROJECT_ID /gstatus .
docker push gcr.io/ $PROJECT_ID /gstatus
gcloud run deploy gcp-status --image gcr.io/ $PROJECT_ID /gstatus
--service-account cloudrunsvc@ $PROJECT_ID .iam.gserviceaccount.com
--set-env-vars " BQ_PROJECTID= $PROJECT_ID " --no-allow-unauthenticated
export RUN_URL= ` gcloud run services describe gcp-status --region=us-central1 --format= " value(status.address.url) " `
# # allow cloud scheduler to call cloud run
gcloud run services add-iam-policy-binding gcp-status --region=us-central1
--member=serviceAccount:schedulerunner@ $PROJECT_ID .iam.gserviceaccount.com --role=roles/run.invoker
# # deploy cloud scheduler
gcloud scheduler jobs create http status-scheduler- $region --http-method=GET --schedule " */5 * * * * "
--attempt-deadline=420s --time-zone= " Pacific/Tahiti " --location=us-central1
--oidc-service-account-email=schedulerunner@ $PROJECT_ID .iam.gserviceaccount.com
--oidc-token-audience= $RUN_URL --uri= $RUN_URL
[等待5分钟]
您还可以将 bq 事件与资产库存数据结合起来,以帮助缩小事件是否影响您的服务的范围。
例如,如果您知道us-central1-a
中存在影响 GCE 实例的事件,您可以发出限制潜在资产列表的搜索查询:
$ gcloud organizations list
DISPLAY_NAME ID DIRECTORY_CUSTOMER_ID
esodemoapp2.com 673202286123 C023zwabc
$ gcloud asset search-all-resources --scope= ' organizations/673202286123 '
--query= " location:us-central1-a "
--asset-types= " compute.googleapis.com/Instance " --format= " value(name) "
//compute.googleapis.com/projects/in-perimeter-gcs/zones/us-central1-a/instances/in-perimeter
//compute.googleapis.com/projects/ingress-vpcsc/zones/us-central1-a/instances/ingress
//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/instance-1
//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/nginx-vm-1
//compute.googleapis.com/projects/clamav-241815/zones/us-central1-a/instances/instance-1
//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/windows-1
您还可以使用以下方式查询世界各地的 IAM 角色和权限:
源事件是 JSON,因此您还可以使用 BQ 对 JSON 数据类型的本机支持将每个事件加载到 BQ 中。
这可能是一个 TODO 和一个示例工作流程可能是这样的:
export PROJECT_ID= ` gcloud config get-value core/project `
export PROJECT_NUMBER= ` gcloud projects describe $PROJECT_ID --format= ' value(projectNumber) ' `
bq mk --table status_dataset.json_dataset events:JSON
curl -o incidents.json -s https://status.cloud.google.com/incidents.json
cat incidents.json | jq -c ' .[] | . ' | sed ' s/"/""/g ' | awk ' { print """$0"""} ' - > items.json
bq load --source_format=CSV status_dataset.json_dataset items.json
bq show status_dataset.json_dataset
$ bq show status_dataset.json_dataset
Last modified Schema Total Rows Total Bytes Expiration Time Partitioning Clustered Fields Labels
----------------- ----------------- ------------ ------------- ------------ ------------------- ------------------ --------
08 Apr 09:39:48 | - events: json 125 822184
然后查询时可以直接引用各个filed:
$ bq query --nouse_legacy_sql '
SELECT events["id"] as id, events["number"] as number, events["begin"] as begin
FROM `status_dataset.json_dataset`
LIMIT 10
'
+------------------------+------------------------+-----------------------------+
| id | number | begin |
+------------------------+------------------------+-----------------------------+
| " ukkfXQc8CEeFZbSTYQi7 " | " 14166479295409213890 " | " 2022-03-31T19:15:00+00:00 " |
| " RmPhfQT9RDGwWLCXS2sC " | " 3617221773064871579 " | " 2022-03-31T18:07:00+00:00 " |
| " B1hD4KAtcxiyAWkcANfV " | " 17742360388109155603 " | " 2022-03-31T15:30:00+00:00 " |
| " 4rRjbE16mteQwUeXPZwi " | " 8134027662519725646 " | " 2022-03-29T21:00:00+00:00 " |
| " 2j8xsJMSyDhmgfJriGeR " | " 5259740469836333814 " | " 2022-03-28T22:30:00+00:00 " |
| " MtMwhU6SXrpBeg5peXqY " | " 17330021626924647123 " | " 2022-03-25T07:00:00+00:00 " |
| " R9vAbtGnhzo6n48SnqTj " | " 2948654908633925955 " | " 2022-03-22T22:30:00+00:00 " |
| " aA3kbJm5nwvVTKnYbrWM " | " 551739384385711524 " | " 2022-03-18T22:20:00+00:00 " |
| " LuGcJVjNTeC5Sb9pSJ9o " | " 5384612291846020564 " | " 2022-03-08T18:07:00+00:00 " |
| " Hko5cWSXxGSsxfiSpg4n " | " 6491961050454270833 " | " 2022-02-22T05:45:00+00:00 " |
+------------------------+------------------------+-----------------------------+
对 Cloud Run 的相应修改将涉及创建 CSV 格式的加载(自4/8/22
起,支持 CSV 旧版加载器)
var rlines [] string
for _ , event := range events {
event . InsertTimestamp = now
event . SnapshotHash = sha256Value
strEvent , err := json . Marshal ( event )
if err != nil {
fmt . Printf ( "Error Marshal Event %v" , err )
http . Error ( w , err . Error (), http . StatusInternalServerError )
return
}
// for JSON Datatype
// https://cloud.google.com/bigquery/docs/reference/standard-sql/json-data
line := strings . Replace ( string ( strEvent ), " " " , " " " " , - 1 )
line = fmt . Sprintf ( " " %s " " , line )
rlines = append ( rlines , line )
}
dataString := strings . Join ( rlines , " n " )
rolesSource := bigquery . NewReaderSource ( strings . NewReader ( dataString ))
rolesSource . SourceFormat = bigquery . CSV
无论如何,JSON 数据类型只是一个 TODO,我不确定目前是否有必要
为什么我再次选择大溪地时间作为调度程序?
为什么不呢,你自己看看: