gcp_cloud_status_dataset下载 - gcp_cloud_status

gcp_cloud_status_dataset

其他源码

1.0.0

下载

GCP Cloud 状态仪表板公共数据集

这是一个简单的 Bigquery 数据集，其中包含 Google Cloud Service Health (CSH) 事件。

您可以使用它来查询事件并过滤您感兴趣的事件。

每分钟都会触发一次。如果现有中断没有更新或未检测到新中断，则不会插入新行。

您还可以将其与资产清单 API 结合使用，将给定位置/区域中的事件与可能受影响的资产关联起来。

不管怎样，现有的CSH仪表板提供了各种格式的数据，例如RSS和JSON ，同时您可以使用jq发出简单的查询和过滤器

 curl -s https://status.cloud.google.com/incidents.json | jq -r ' .[]  | select(.service_name == "Google Compute Engine") '

然而，它只是不易于使用。

因此，您或任何人都可以在 bigquery 表中使用而不是原始 json...

这是数据集：

https://console.cloud.google.com/bigquery?project=gcp-status-log&p=gcp-status-log&d=status_dataset

要使用此功能，请首先将以下项目添加到 UI gcp-status-log中。完成后，您发出的任何查询都将使用此数据集，但会根据您自己的使用情况对您的项目进行计费。（即，我只是提供数据......您运行的查询就是您将支付的费用）

注意：谷歌不支持此存储库、数据集和代码。买者自负

用法

查询 GCE 中断

 bq query --nouse_legacy_sql  '
SELECT
  DISTINCT(id), service_name,severity,external_desc, begin,`end` , modified
FROM
     gcp-status-log.status_dataset.status
WHERE
  service_name = "Google Compute Engine"
ORDER BY
  modified
'
+----------------------+-----------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+---------------------+
|          id          |     service_name      | severity |                                                        external_desc                                                        |        begin        |         end         |      modified       |
+----------------------+-----------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+---------------------+
| pxD6QciVMd9gLQGcREbV | Google Compute Engine | medium   | Requests to and from Google Compute Instances in us-west2 may see increase traffic loss when using the instance's public IP | 2021-05-05 02:11:10 | 2021-05-05 04:54:57 | 2021-05-05 04:54:57 |
| LGFBxyLwbh92E47fAzJ5 | Google Compute Engine | medium   | Mutliregional Price for E2 Free Tier core is set incorrectly                                                                | 2021-08-01 07:00:00 | 2021-08-04 23:18:00 | 2021-08-05 17:35:12 |
| gwKjX9Lukav15SaFPbBF | Google Compute Engine | medium   | us-central1, europe-west1, us-west1, asia-east1: Issue with Local SSDs on Google Compute Engine.                            | 2021-09-01 02:35:00 | 2021-09-03 03:55:00 | 2021-09-07 21:39:46 |
| rjF86FbooET3FDpMV9w1 | Google Compute Engine | medium   | Increased VM failure rates in a subset of Google Cloud zones                                                                | 2021-09-17 15:00:00 | 2021-09-17 18:25:00 | 2021-09-20 23:33:53 |
| ZoUf49v2qbJ9xRK63kaM | Google Compute Engine | medium   | Some Users might have received credit cards deemed invalid email erroneously.                                               | 2021-11-13 07:14:48 | 2021-11-13 08:29:30 | 2021-11-13 08:29:30 |
| SjJ3FN51MAEJy7cZmoss | Google Compute Engine | medium   | Global: pubsub.googleapis.com autoscaling not worked as expected                                                            | 2021-12-07 09:56:00 | 2021-12-14 00:59:00 | 2021-12-14 19:59:08 |
+----------------------+-----------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+---------------------+

...这就是我 BQ 技能的极限。如果您有任何建议的查询，请在 github issues 中给我留言

图片/status.png

模式

此处使用的架构与 Incidents.schema.json 中显示的仪表板的 JSON 输出提供的格式几乎相同。

唯一的区别是 BQ 中的每一行都是一个单独的事件，而不是根据上面提供的架构将所有事件封装到单个 JSON 中。

此外，该架构还有两个新列：

insert_timestamp ：这是插入行/事件时的TIMESTAMP
snapshot_hash ：这是下载的incident.json文件的 Base64 编码哈希值。

您可以在此存储库中查看示例架构。

 bq show --format=prettyjson --schema gcp-status-log:status_dataset.status

设置

对于其他人，您可以使用以下组合自行设置整个事情

Cloud Scheduler -> Cloud Run -> BigQUery

下面的设置是：

Cloud Scheduler每分钟都会安全地调用Cloud Run服务
Cloud Run从 GCS 存储桶下载文件，该存储桶保存上次插入的JSON CSH 文件的哈希值
Cloud Run下载并解析 JSON CSH 数据
如果 GCS 中的文件和下载的文件的哈希值不同，则
- Cloud Run将 CSH 事件插入BigQuery
- 将文件上传到 GCS，其中包含刚刚插入的 CSH 文件的哈希值
否则，继续

当然，如果没有更新，此方案取决于 JSON CSH 文件保留相同的哈希值（例如，它不包含其自身更新的新鲜度时间戳）

 export PROJECT_ID= ` gcloud config get-value core/project `
export PROJECT_NUMBER= ` gcloud projects describe $PROJECT_ID --format= ' value(projectNumber) ' `
gcloud services enable containerregistry.googleapis.com 
   run.googleapis.com 
   bigquery.googleapis.com 
   cloudscheduler.googleapis.com 
   storage.googleapis.com

# # create the datasets.  We are using DAY partitioning
bq mk -d --data_location=US status_dataset 
bq mk  --table status_dataset.status   schema.json

# # create service accounts for cloud run and scheduler
gcloud iam service-accounts create schedulerunner --project= $PROJECT_ID
gcloud iam service-accounts create cloudrunsvc --project= $PROJECT_ID

bq add-iam-policy-binding 
  --member=serviceAccount:cloudrunsvc@ $PROJECT_ID .iam.gserviceaccount.com 
  --role=roles/bigquery.admin status_dataset.status

gcloud projects add-iam-policy-binding $PROJECT_ID 
  --member= " serviceAccount:cloudrunsvc@ $PROJECT_ID .iam.gserviceaccount.com " 
  --role= " roles/bigquery.jobUser "

# create a gcs bucket to store hash of the incidents json file
# the first value of the hash will force a reload of the incidents.json file
gsutil mb -l us-central1 gs:// $PROJECT_ID -status-hash
echo -n " foo " > /tmp/hash.txt 
gsutil cp  /tmp/hash.txt  gs:// $PROJECT_ID -status-hash/

gsutil iam ch  serviceAccount:cloudrunsvc@ $PROJECT_ID .iam.gserviceaccount.com:roles/storage.admin gs:// $PROJECT_ID -status-hash/

# # you may also need to allow your users access to the dataset https://cloud.google.com/bigquery/docs/dataset-access-controls

# # build and deploy the cloud run image
docker build -t gcr.io/ $PROJECT_ID /gstatus .
docker push gcr.io/ $PROJECT_ID /gstatus

gcloud run deploy gcp-status --image  gcr.io/ $PROJECT_ID /gstatus  
  --service-account cloudrunsvc@ $PROJECT_ID .iam.gserviceaccount.com 
  --set-env-vars " BQ_PROJECTID= $PROJECT_ID "  --no-allow-unauthenticated

export RUN_URL= ` gcloud run services describe gcp-status --region=us-central1 --format= " value(status.address.url) " `

# # allow cloud scheduler to call cloud run
gcloud run services add-iam-policy-binding gcp-status --region=us-central1 
  --member=serviceAccount:schedulerunner@ $PROJECT_ID .iam.gserviceaccount.com --role=roles/run.invoker

# # deploy cloud scheduler
gcloud scheduler jobs create http status-scheduler- $region --http-method=GET --schedule " */5 * * * * " 
    --attempt-deadline=420s --time-zone= " Pacific/Tahiti " --location=us-central1 
    --oidc-service-account-email=schedulerunner@ $PROJECT_ID .iam.gserviceaccount.com  
    --oidc-token-audience= $RUN_URL --uri= $RUN_URL

[等待5分钟]

对受影响的服务使用资产清单

您还可以将 bq 事件与资产库存数据结合起来，以帮助缩小事件是否影响您的服务的范围。

例如，如果您知道us-central1-a中存在影响 GCE 实例的事件，您可以发出限制潜在资产列表的搜索查询：

$ gcloud organizations list
DISPLAY_NAME               ID  DIRECTORY_CUSTOMER_ID
esodemoapp2.com  673202286123              C023zwabc

$ gcloud asset search-all-resources --scope= ' organizations/673202286123 ' 
  --query= " location:us-central1-a " 
  --asset-types= " compute.googleapis.com/Instance " --format= " value(name) "

//compute.googleapis.com/projects/in-perimeter-gcs/zones/us-central1-a/instances/in-perimeter
//compute.googleapis.com/projects/ingress-vpcsc/zones/us-central1-a/instances/ingress
//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/instance-1
//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/nginx-vm-1
//compute.googleapis.com/projects/clamav-241815/zones/us-central1-a/instances/instance-1
//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/windows-1

其他 BQ 数据集

您还可以使用以下方式查询世界各地的 IAM 角色和权限：

Google Cloud IAM 角色-权限公共数据集

替代方案：BQ JSON 数据类型

源事件是 JSON，因此您还可以使用 BQ 对 JSON 数据类型的本机支持将每个事件加载到 BQ 中。

这可能是一个 TODO 和一个示例工作流程可能是这样的：

 export PROJECT_ID= ` gcloud config get-value core/project `
export PROJECT_NUMBER= ` gcloud projects describe $PROJECT_ID --format= ' value(projectNumber) ' `

bq mk --table status_dataset.json_dataset events:JSON
curl -o incidents.json -s https://status.cloud.google.com/incidents.json

cat incidents.json  | jq -c ' .[] | . ' | sed ' s/"/""/g ' | awk ' { print """$0"""} '  - > items.json


bq load --source_format=CSV status_dataset.json_dataset items.json
bq show status_dataset.json_dataset

$ bq show status_dataset.json_dataset

   Last modified        Schema        Total Rows   Total Bytes   Expiration   Time Partitioning   Clustered Fields   Labels  
 ----------------- ----------------- ------------ ------------- ------------ ------------------- ------------------ -------- 
  08 Apr 09:39:48   | - events: json   125          822184

然后查询时可以直接引用各个filed：

$ bq query --nouse_legacy_sql  '
SELECT events["id"] as id, events["number"] as number,  events["begin"] as begin
  FROM `status_dataset.json_dataset` 
  LIMIT 10
'
+------------------------+------------------------+-----------------------------+
|           id           |         number         |            begin            |
+------------------------+------------------------+-----------------------------+
| " ukkfXQc8CEeFZbSTYQi7 " | " 14166479295409213890 " | " 2022-03-31T19:15:00+00:00 " |
| " RmPhfQT9RDGwWLCXS2sC " |  " 3617221773064871579 " | " 2022-03-31T18:07:00+00:00 " |
| " B1hD4KAtcxiyAWkcANfV " | " 17742360388109155603 " | " 2022-03-31T15:30:00+00:00 " |
| " 4rRjbE16mteQwUeXPZwi " |  " 8134027662519725646 " | " 2022-03-29T21:00:00+00:00 " |
| " 2j8xsJMSyDhmgfJriGeR " |  " 5259740469836333814 " | " 2022-03-28T22:30:00+00:00 " |
| " MtMwhU6SXrpBeg5peXqY " | " 17330021626924647123 " | " 2022-03-25T07:00:00+00:00 " |
| " R9vAbtGnhzo6n48SnqTj " |  " 2948654908633925955 " | " 2022-03-22T22:30:00+00:00 " |
| " aA3kbJm5nwvVTKnYbrWM " |   " 551739384385711524 " | " 2022-03-18T22:20:00+00:00 " |
| " LuGcJVjNTeC5Sb9pSJ9o " |  " 5384612291846020564 " | " 2022-03-08T18:07:00+00:00 " |
| " Hko5cWSXxGSsxfiSpg4n " |  " 6491961050454270833 " | " 2022-02-22T05:45:00+00:00 " |
+------------------------+------------------------+-----------------------------+

对 Cloud Run 的相应修改将涉及创建 CSV 格式的加载（自4/8/22起，支持 CSV 旧版加载器）

		var rlines [] string
		for _ , event := range events {
			event . InsertTimestamp = now
			event . SnapshotHash = sha256Value
			strEvent , err := json . Marshal ( event )
			if err != nil {
				fmt . Printf ( "Error Marshal Event %v" , err )
				http . Error ( w , err . Error (), http . StatusInternalServerError )
				return
			}
			// for JSON Datatype 
			// https://cloud.google.com/bigquery/docs/reference/standard-sql/json-data
			line := strings . Replace ( string ( strEvent ), " " " , " " " " , - 1 )
			line = fmt . Sprintf ( " " %s " " , line )

			rlines = append ( rlines , line )
		}

		dataString := strings . Join ( rlines , " n " )
		rolesSource := bigquery . NewReaderSource ( strings . NewReader ( dataString ))
		rolesSource . SourceFormat = bigquery . CSV