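GCP Cloud Status Dashboard public dataset

This is a simple BigQuery dataset containing Google Cloud Service Health (CSH) events.

You can use it to query events and filter for the incidents you care about.

It is triggered every minute. If there are no updates to existing outages and no new outages are detected, no new rows are inserted.

You can also use it together with the Asset Inventory API to correlate events in a specific location/region with assets that may be affected.

Anyway, the existing CSH dashboard provides its data in various formats such as RSS and JSON, and you can even issue simple queries and filters with jq: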

curl -s https://status.cloud.google.com/incidents.json | jq -r '.[] | select(.service_name == "Google Compute Engine")'
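For comparison, here is a minimal Go sketch of the same filter. It assumes the feed is a JSON array of objects carrying the `id`, `service_name`, `severity` and `external_desc` fields seen elsewhere in this README. The `Incident` type, the `filterByService` helper and the sample data are hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Incident keeps only the fields this sketch needs from one incidents.json entry.
type Incident struct {
	ID           string `json:"id"`
	ServiceName  string `json:"service_name"`
	Severity     string `json:"severity"`
	ExternalDesc string `json:"external_desc"`
}

// filterByService decodes a JSON array of incidents and returns the entries
// whose service_name equals service, like the jq select() filter.
func filterByService(raw []byte, service string) ([]Incident, error) {
	var all []Incident
	if err := json.Unmarshal(raw, &all); err != nil {
		return nil, fmt.Errorf("decoding incidents: %w", err)
	}
	var matches []Incident
	for _, inc := range all {
		if inc.ServiceName == service {
			matches = append(matches, inc)
		}
	}
	return matches, nil
}

// sampleIncidents is made-up data, not real dashboard output.
const sampleIncidents = `[
  {"id": "example-1", "service_name": "Google Compute Engine", "severity": "medium", "external_desc": "placeholder description"},
  {"id": "example-2", "service_name": "Google Cloud Storage", "severity": "low", "external_desc": "placeholder description"}
]`

func main() {
	matches, err := filterByService([]byte(sampleIncidents), "Google Compute Engine")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, inc := range matches {
		fmt.Printf("%s\t%s\t%s\n", inc.ID, inc.Severity, inc.ExternalDesc)
	}
}
```

To run it against the live feed, you would read the response body of an HTTP GET instead of the embedded sample.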

However, that is not a form that is easy to use.

So instead of raw JSON, how about a BigQuery table that you or anyone else can use?

The dataset is here:

https://console.cloud.google.com/bigquery?project=gcp-status-log&p=gcp-status-log&d=status_dataset

To use it, first add the following project to the UI: gcp-status-log. Once that is done, any query you run will use this dataset but bill your own project for your usage. (That is, I am only providing the data; you pay for the queries you run.)

Note: this repository, dataset, and code are not supported by Google. Caveat emptor.

Usage

Query for GCE outages

 bq query --nouse_legacy_sql  '
SELECT
  DISTINCT(id), service_name,severity,external_desc, begin,`end` , modified
FROM
     gcp-status-log.status_dataset.status
WHERE
  service_name = "Google Compute Engine"
ORDER BY
  modified
'
+----------------------+-----------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+---------------------+
|          id          |     service_name      | severity |                                                        external_desc                                                        |        begin        |         end         |      modified       |
+----------------------+-----------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+---------------------+
| pxD6QciVMd9gLQGcREbV | Google Compute Engine | medium   | Requests to and from Google Compute Instances in us-west2 may see increase traffic loss when using the instance's public IP | 2021-05-05 02:11:10 | 2021-05-05 04:54:57 | 2021-05-05 04:54:57 |
| LGFBxyLwbh92E47fAzJ5 | Google Compute Engine | medium   | Mutliregional Price for E2 Free Tier core is set incorrectly                                                                | 2021-08-01 07:00:00 | 2021-08-04 23:18:00 | 2021-08-05 17:35:12 |
| gwKjX9Lukav15SaFPbBF | Google Compute Engine | medium   | us-central1, europe-west1, us-west1, asia-east1: Issue with Local SSDs on Google Compute Engine.                            | 2021-09-01 02:35:00 | 2021-09-03 03:55:00 | 2021-09-07 21:39:46 |
| rjF86FbooET3FDpMV9w1 | Google Compute Engine | medium   | Increased VM failure rates in a subset of Google Cloud zones                                                                | 2021-09-17 15:00:00 | 2021-09-17 18:25:00 | 2021-09-20 23:33:53 |
| ZoUf49v2qbJ9xRK63kaM | Google Compute Engine | medium   | Some Users might have received credit cards deemed invalid email erroneously.                                               | 2021-11-13 07:14:48 | 2021-11-13 08:29:30 | 2021-11-13 08:29:30 |
| SjJ3FN51MAEJy7cZmoss | Google Compute Engine | medium   | Global: pubsub.googleapis.com autoscaling not worked as expected                                                            | 2021-12-07 09:56:00 | 2021-12-14 00:59:00 | 2021-12-14 19:59:08 |
+----------------------+-----------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+---------------------+

...and that is the limit of my BQ skills. If you have a suggested query you wrote, please leave a note in a GitHub issue.

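Schema

The schema used here has much the same format as the JSON output of the dashboard, as described in incidents.schema.json.

The only difference is that each row in BQ is an individual incident, as opposed to all incidents being encapsulated in a single JSON document as in the schema above.

This schema also has two new columns:

insert_timestamp: the TIMESTAMP at which the row/event was inserted.
snapshot_hash: the base64-encoded hash of the downloaded incidents.json file.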
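The row layout described above can be sketched in Go as follows: one JSON object per incident, each stamped with the two extra columns. This is only an illustration under assumptions. The `toRows` and `demoRows` helpers, the RFC 3339 timestamp format and the sample input are hypothetical, and the real service may build its rows differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// toRows splits a JSON array of incidents into one JSON object per row and
// adds the two extra columns to each row. Existing fields are kept verbatim.
func toRows(raw []byte, insertedAt time.Time, snapshotHash string) ([]string, error) {
	var incidents []map[string]json.RawMessage
	if err := json.Unmarshal(raw, &incidents); err != nil {
		return nil, fmt.Errorf("decoding incidents: %w", err)
	}
	// Assumption: the timestamp is serialized as an RFC 3339 string in UTC.
	ts, err := json.Marshal(insertedAt.UTC().Format(time.RFC3339))
	if err != nil {
		return nil, err
	}
	hash, err := json.Marshal(snapshotHash)
	if err != nil {
		return nil, err
	}
	rows := make([]string, 0, len(incidents))
	for _, inc := range incidents {
		if inc == nil {
			continue // skip JSON null entries
		}
		inc["insert_timestamp"] = ts
		inc["snapshot_hash"] = hash
		row, err := json.Marshal(inc) // map keys are emitted in sorted order
		if err != nil {
			return nil, err
		}
		rows = append(rows, string(row))
	}
	return rows, nil
}

// demoRows runs toRows on made-up input with a fixed timestamp and hash.
func demoRows() ([]string, error) {
	raw := []byte(`[{"id": "example-1"}, {"id": "example-2"}]`)
	at := time.Date(2022, time.April, 8, 0, 0, 0, 0, time.UTC)
	return toRows(raw, at, "example-hash")
}

func main() {
	rows, err := demoRows()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, row := range rows {
		fmt.Println(row)
	}
}
```

Using `json.RawMessage` keeps every original field untouched, so the sketch does not need to know the full incident schema.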

You can see an example schema in this repository:

 bq show --format=prettyjson --schema gcp-status-log:status_dataset.status

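Setup

For everyone else, you can set all of this up yourself using the following combination:

Cloud Scheduler -> Cloud Run -> BigQuery

The setup works as follows:

Cloud Scheduler securely invokes the Cloud Run service every minute.
Cloud Run downloads a file from a GCS bucket that holds the hash of the last inserted CSH JSON file.
Cloud Run downloads and parses the CSH JSON data.
If the hash of the file in GCS differs from the hash of the downloaded file:
- Cloud Run inserts the CSH events into BigQuery.
- Cloud Run uploads a file to GCS containing the hash of the CSH file it just inserted.
Otherwise, it does nothing and waits for the next run.

Of course, this scheme depends on the CSH JSON file keeping the same hash value when there are no updates (for example, the file must not include a freshness timestamp of its own).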
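The hash comparison in the steps above can be sketched in Go as follows. This is a minimal illustration, assuming the hash is a SHA-256 digest in standard base64 encoding. The `snapshotHash` and `needsInsert` helpers are hypothetical, and reading or writing the GCS object is left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// snapshotHash returns the base64-encoded SHA-256 digest of the downloaded file.
// Assumption: standard (padded) base64 encoding.
func snapshotHash(data []byte) string {
	sum := sha256.Sum256(data)
	return base64.StdEncoding.EncodeToString(sum[:])
}

// needsInsert compares the hash stored in GCS with the hash of the freshly
// downloaded file. It returns the new hash and whether rows should be inserted.
func needsInsert(storedHash string, downloaded []byte) (string, bool) {
	newHash := snapshotHash(downloaded)
	return newHash, newHash != storedHash
}

func main() {
	downloaded := []byte(`[{"id": "example-1"}]`) // made-up file content

	// First run: the bucket holds the placeholder value "foo", so a load is forced.
	hash, changed := needsInsert("foo", downloaded)
	fmt.Println("first run, insert:", changed)

	// Second run: the stored hash matches the unchanged file, so nothing is inserted.
	_, changed = needsInsert(hash, downloaded)
	fmt.Println("second run, insert:", changed)
}
```

Seeding the bucket with a value that can never match a real digest is what forces the first load.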

export PROJECT_ID=$(gcloud config get-value core/project)
export PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format='value(projectNumber)')
gcloud services enable containerregistry.googleapis.com \
   run.googleapis.com \
   bigquery.googleapis.com \
   cloudscheduler.googleapis.com \
   storage.googleapis.com

# create the datasets.  We are using DAY partitioning
bq mk -d --data_location=US status_dataset
bq mk --table status_dataset.status schema.json

# create service accounts for cloud run and scheduler
gcloud iam service-accounts create schedulerunner --project=$PROJECT_ID
gcloud iam service-accounts create cloudrunsvc --project=$PROJECT_ID

bq add-iam-policy-binding \
  --member=serviceAccount:cloudrunsvc@$PROJECT_ID.iam.gserviceaccount.com \
  --role=roles/bigquery.admin status_dataset.status

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:cloudrunsvc@$PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"

# create a gcs bucket to store hash of the incidents json file
# the first value of the hash will force a reload of the incidents.json file
gsutil mb -l us-central1 gs://$PROJECT_ID-status-hash
echo -n "foo" > /tmp/hash.txt
gsutil cp /tmp/hash.txt gs://$PROJECT_ID-status-hash/

gsutil iam ch serviceAccount:cloudrunsvc@$PROJECT_ID.iam.gserviceaccount.com:roles/storage.admin gs://$PROJECT_ID-status-hash/

# you may also need to allow your users access to the dataset https://cloud.google.com/bigquery/docs/dataset-access-controls

# build and deploy the cloud run image
docker build -t gcr.io/$PROJECT_ID/gstatus .
docker push gcr.io/$PROJECT_ID/gstatus

gcloud run deploy gcp-status --image gcr.io/$PROJECT_ID/gstatus \
  --service-account cloudrunsvc@$PROJECT_ID.iam.gserviceaccount.com \
  --set-env-vars "BQ_PROJECTID=$PROJECT_ID" --no-allow-unauthenticated

export RUN_URL=$(gcloud run services describe gcp-status --region=us-central1 --format="value(status.address.url)")

# allow cloud scheduler to call cloud run
gcloud run services add-iam-policy-binding gcp-status --region=us-central1 \
  --member=serviceAccount:schedulerunner@$PROJECT_ID.iam.gserviceaccount.com --role=roles/run.invoker

# deploy cloud scheduler
# note: $region is not set anywhere above; export it (or drop the suffix) before running
gcloud scheduler jobs create http status-scheduler-$region --http-method=GET --schedule "*/5 * * * *" \
    --attempt-deadline=420s --time-zone="Pacific/Tahiti" --location=us-central1 \
    --oidc-service-account-email=schedulerunner@$PROJECT_ID.iam.gserviceaccount.com \
    --oidc-token-audience=$RUN_URL --uri=$RUN_URL

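[wait 5 minutes]

Using Asset Inventory for impacted services

You can also combine the BQ events with Asset Inventory data to help narrow down whether an event affects your services.

For example, if you know there is an event affecting GCE instances in us-central1-a, you can run a search query that limits the list of potentially affected assets: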

$ gcloud organizations list
DISPLAY_NAME               ID  DIRECTORY_CUSTOMER_ID
esodemoapp2.com  673202286123              C023zwabc

$ gcloud asset search-all-resources --scope='organizations/673202286123' \
  --query="location:us-central1-a" \
  --asset-types="compute.googleapis.com/Instance" --format="value(name)"

//compute.googleapis.com/projects/in-perimeter-gcs/zones/us-central1-a/instances/in-perimeter
//compute.googleapis.com/projects/ingress-vpcsc/zones/us-central1-a/instances/ingress
//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/instance-1
//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/nginx-vm-1
//compute.googleapis.com/projects/clamav-241815/zones/us-central1-a/instances/instance-1
//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/windows-1
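To act on that list, you may want to break each asset name into its project, zone and instance. The sketch below does that in Go, assuming the names always follow the `//compute.googleapis.com/projects/<project>/zones/<zone>/instances/<name>` shape shown in the output. The `instanceRef` type and `parseInstance` helper are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// instanceRef is the project, zone and instance name taken from one asset name.
type instanceRef struct {
	Project string
	Zone    string
	Name    string
}

// parseInstance splits a Compute Engine instance asset name into its parts.
// It returns an error for names that do not match the expected shape.
func parseInstance(asset string) (instanceRef, error) {
	const prefix = "//compute.googleapis.com/"
	if !strings.HasPrefix(asset, prefix) {
		return instanceRef{}, fmt.Errorf("not a compute asset: %q", asset)
	}
	parts := strings.Split(strings.TrimPrefix(asset, prefix), "/")
	if len(parts) != 6 || parts[0] != "projects" || parts[2] != "zones" || parts[4] != "instances" {
		return instanceRef{}, fmt.Errorf("unexpected asset name: %q", asset)
	}
	return instanceRef{Project: parts[1], Zone: parts[3], Name: parts[5]}, nil
}

func main() {
	// Two of the asset names from the search output.
	assets := []string{
		"//compute.googleapis.com/projects/fabled-ray-104117/zones/us-central1-a/instances/instance-1",
		"//compute.googleapis.com/projects/clamav-241815/zones/us-central1-a/instances/instance-1",
	}
	for _, asset := range assets {
		ref, err := parseInstance(asset)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		fmt.Printf("project=%s zone=%s instance=%s\n", ref.Project, ref.Zone, ref.Name)
	}
}
```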

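Other BQ datasets

You can also query IAM roles and permissions globally using:

Google Cloud IAM Roles-Permissions Public Dataset

Alternative: BQ JSON data type

Since the source events are JSON, you could potentially also load each event into BQ using BQ's native support for the JSON data type.

This is a TODO, and a sample workflow might look like this: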

export PROJECT_ID=$(gcloud config get-value core/project)
export PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format='value(projectNumber)')

bq mk --table status_dataset.json_dataset events:JSON
curl -o incidents.json -s https://status.cloud.google.com/incidents.json

# one incident per line, with double quotes doubled and the line wrapped in quotes (a single CSV field)
cat incidents.json | jq -c '.[] | .' | sed 's/"/""/g' | awk '{ print "\"" $0 "\"" }' > items.json


bq load --source_format=CSV status_dataset.json_dataset items.json
bq show status_dataset.json_dataset

$ bq show status_dataset.json_dataset

   Last modified        Schema        Total Rows   Total Bytes   Expiration   Time Partitioning   Clustered Fields   Labels  
 ----------------- ----------------- ------------ ------------- ------------ ------------------- ------------------ -------- 
  08 Apr 09:39:48   | - events: json   125          822184

Then, to query, just reference each field directly:

$ bq query --nouse_legacy_sql  '
SELECT events["id"] as id, events["number"] as number,  events["begin"] as begin
  FROM `status_dataset.json_dataset` 
  LIMIT 10
'
+------------------------+------------------------+-----------------------------+
|           id           |         number         |            begin            |
+------------------------+------------------------+-----------------------------+
| "ukkfXQc8CEeFZbSTYQi7" | "14166479295409213890" | "2022-03-31T19:15:00+00:00" |
| "RmPhfQT9RDGwWLCXS2sC" |  "3617221773064871579" | "2022-03-31T18:07:00+00:00" |
| "B1hD4KAtcxiyAWkcANfV" | "17742360388109155603" | "2022-03-31T15:30:00+00:00" |
| "4rRjbE16mteQwUeXPZwi" |  "8134027662519725646" | "2022-03-29T21:00:00+00:00" |
| "2j8xsJMSyDhmgfJriGeR" |  "5259740469836333814" | "2022-03-28T22:30:00+00:00" |
| "MtMwhU6SXrpBeg5peXqY" | "17330021626924647123" | "2022-03-25T07:00:00+00:00" |
| "R9vAbtGnhzo6n48SnqTj" |  "2948654908633925955" | "2022-03-22T22:30:00+00:00" |
| "aA3kbJm5nwvVTKnYbrWM" |   "551739384385711524" | "2022-03-18T22:20:00+00:00" |
| "LuGcJVjNTeC5Sb9pSJ9o" |  "5384612291846020564" | "2022-03-08T18:07:00+00:00" |
| "Hko5cWSXxGSsxfiSpg4n" |  "6491961050454270833" | "2022-02-22T05:45:00+00:00" |
+------------------------+------------------------+-----------------------------+

The corresponding change to Cloud Run would involve generating a load in CSV format (as of 4/8/22, the legacy CSV loader is supported):

		var rlines []string
		for _, event := range events {
			event.InsertTimestamp = now
			event.SnapshotHash = sha256Value
			strEvent, err := json.Marshal(event)
			if err != nil {
				fmt.Printf("Error Marshal Event %v", err)
				http.Error(w, err.Error(), http.StatusInternalServerError)
				return
			}
			// for JSON Datatype
			// https://cloud.google.com/bigquery/docs/reference/standard-sql/json-data
			// double every quote, then wrap the line in quotes so it is one CSV field
			line := strings.Replace(string(strEvent), "\"", "\"\"", -1)
			line = fmt.Sprintf("\"%s\"", line)

			rlines = append(rlines, line)
		}

		// one CSV record per event
		dataString := strings.Join(rlines, "\n")
		rolesSource := bigquery.NewReaderSource(strings.NewReader(dataString))
		rolesSource.SourceFormat = bigquery.CSV
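The quoting step in that fragment is easy to get wrong, so here is a small self-contained Go sketch that checks it with the standard `encoding/csv` reader. The `csvQuote` and `roundTrip` helpers and the sample events are hypothetical. The sketch only verifies that each quoted line parses back as a single CSV field holding the original JSON, and it does not talk to BigQuery.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// csvQuote turns one JSON document into a single quoted CSV field by
// doubling every double quote and wrapping the result in quotes.
func csvQuote(jsonLine string) string {
	return `"` + strings.ReplaceAll(jsonLine, `"`, `""`) + `"`
}

// roundTrip quotes each event, joins them one per line, and parses the
// result back with a CSV reader, returning the first field of every record.
func roundTrip(events []string) ([]string, error) {
	lines := make([]string, 0, len(events))
	for _, e := range events {
		lines = append(lines, csvQuote(e))
	}
	records, err := csv.NewReader(strings.NewReader(strings.Join(lines, "\n"))).ReadAll()
	if err != nil {
		return nil, fmt.Errorf("parsing generated csv: %w", err)
	}
	out := make([]string, 0, len(records))
	for _, rec := range records {
		out = append(out, rec[0])
	}
	return out, nil
}

func main() {
	// Made-up events; the second one contains a comma inside a value.
	events := []string{
		`{"id":"example-1","number":"1"}`,
		`{"id":"example-2","external_desc":"zone a, zone b"}`,
	}
	parsed, err := roundTrip(events)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, p := range parsed {
		fmt.Println(p == events[i], csvQuote(events[i]))
	}
}
```

Because the whole document sits inside one quoted field, commas inside JSON values do not split the record.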