Cloud Run finally gets always-on CPU!

Note: this feature is still in preview (as of 2021/09/18).

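Until now, Cloud Run allocated CPU only from the moment a request arrived until the response was returned. That was often a problem in cases like Slack (quoted below), where a response has to be returned within 3 seconds. The usual approach for this kind of API is to return the response first and run the actual processing asynchronously afterwards, but that was not easy to do on Cloud Run.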

An instance will never stay idle for more than 15 minutes after processing a request

With always-on CPU, CPU stays allocated for up to 15 minutes after a request.

When you opt in to "CPU always allocated", you are billed for the entire lifetime of container instances

Billing also switches to a per-instance basis, but in exchange:

CPU is priced 25% lower and memory 20% lower

CPU is 25% cheaper and memory is 20% cheaper than in the normal mode. Great!
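Because billing covers the whole instance lifetime, the lower unit price does not automatically mean a lower bill. The following is a minimal sketch of that arithmetic. The base price and the durations are hypothetical numbers chosen for illustration, not real Cloud Run prices, and the sketch ignores memory, per-request fees, and the free tier.

```go
package main

import "fmt"

// discounted applies a percentage discount to a unit price.
func discounted(unitPrice, percentOff float64) float64 {
	return unitPrice * (100 - percentOff) / 100
}

// cpuCost is the CPU charge for the given number of billed seconds.
func cpuCost(pricePerSecond, billedSeconds float64) float64 {
	return pricePerSecond * billedSeconds
}

func main() {
	// Hypothetical numbers for illustration only; not real Cloud Run prices.
	const basePrice = 1.0       // price units per vCPU-second in normal mode
	const requestSeconds = 300  // time spent handling requests
	const lifetimeSeconds = 900 // whole lifetime of the container instance

	// Normal mode: assumed to bill CPU only while requests are processed.
	normal := cpuCost(basePrice, requestSeconds)
	// Always-on mode: 25% lower unit price, billed for the whole lifetime.
	alwaysOn := cpuCost(discounted(basePrice, 25), lifetimeSeconds)

	fmt.Printf("normal: %.0f, always-on: %.0f\n", normal, alwaysOn)
}
```

With these made-up numbers the always-on instance costs more (675 versus 300), so the discount pays off mainly when the instance is busy for most of its lifetime.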

Enabling interactivity with Slash Commands | Slack


This confirmation must be received by Slack within 3000 milliseconds of the original request being sent, otherwise a Timeout was reached will be displayed to the user. If you couldn't verify the request payload, your app should return an error instead and ignore the request.

Trying it out

The code I used for this test is available here.

I will use Pub/Sub to try a few things.

Preparation

Cloud Build

First, I create a Cloud Build YAML file to make the build easier.

steps:
- name: gcr.io/cloud-builders/docker
  args: ["build", "-t", "gcr.io/$PROJECT_ID/cloud-run-sample", "."]
images:
- gcr.io/$PROJECT_ID/cloud-run-sample

Deploy from Cloud Shell with the following command.

Pub/Sub

Create two topics in advance.

Cloud Run

Normal version

always-on CPU

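When creating the Cloud Run service, all you have to do is check "CPU always allocated", so it is very easy. However, with this mode selected, memory could not be set below 512 MiB, so I set it to 512 MiB this time.

The flow is: send a request to Cloud Run → return the response immediately → publish data to Pub/Sub from a goroutine.

First, I test by publishing one message per second.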

package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"os"
	"time"

	"cloud.google.com/go/pubsub"
)

func main() {

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		fmt.Fprintf(w, "Hello world")
	})
	http.HandleFunc("/start", handler)

	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
		log.Printf("defaulting to port %s", port)
	}

	log.Printf("listening on port %s", port)
	if err := http.ListenAndServe(":"+port, nil); err != nil {
		log.Fatal(err)
	}
}

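// handler responds immediately and publishes messages to Pub/Sub in the background.
func handler(w http.ResponseWriter, r *http.Request) {
	// Use a background context: r.Context() is canceled when the handler returns.
	ctx := context.Background()
	projectID := os.Getenv("PROJECT_ID")
	topicID := os.Getenv("TOPIC_ID")

	go func() {
		client, err := pubsub.NewClient(ctx, projectID)
		if err != nil {
			// Log and return instead of log.Fatalf, which would exit the whole server.
			log.Printf("Failed to create client: %v", err)
			return
		}
		defer client.Close()

		// Create the topic handle once and reuse it for every message.
		topic := client.Topic(topicID)
		defer topic.Stop()

		for i := 0; i < 1000; i++ {
			res := topic.Publish(ctx, &pubsub.Message{
				Data: []byte("hello world"),
			})
			// Debug output of the topic handle and the publish result.
			fmt.Printf("%v\n", topic)
			fmt.Printf("%v\n", res)
			// Get blocks until the server has accepted the message.
			msgID, err := res.Get(ctx)
			if err != nil {
				log.Printf("Failed to publish: %v", err)
				return
			}
			fmt.Println(msgID)
			time.Sleep(time.Second * 1)
		}
	}()

	w.WriteHeader(http.StatusOK)
	fmt.Fprintf(w, "Start!")
}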

Normal

always-on CPU

Normal: 800
always-on CPU: 1000

Those were the resulting message counts.

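Both instances stayed up for about 15 minutes. Even in normal mode, the instance does not lose its CPU entirely and stop as soon as the response is returned; it appears to keep a minimal allocation.

With the 1-second sleep in place, the difference is only about 200 messages, which is not large, so...

Removing the sleep and looping 100,000 times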

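// handler responds immediately and publishes 100,000 messages in the background.
func handler(w http.ResponseWriter, r *http.Request) {
	// Use a background context: r.Context() is canceled when the handler returns.
	ctx := context.Background()
	projectID := os.Getenv("PROJECT_ID")
	topicID := os.Getenv("TOPIC_ID")

	go func() {
		client, err := pubsub.NewClient(ctx, projectID)
		if err != nil {
			// Log and return instead of log.Fatalf, which would exit the whole server.
			log.Printf("Failed to create client: %v", err)
			return
		}
		defer client.Close()

		// The original listing used topic without defining it; create it once here.
		topic := client.Topic(topicID)
		defer topic.Stop()

		for i := 0; i < 100000; i++ {
			res := topic.Publish(ctx, &pubsub.Message{
				Data: []byte("hello world"),
			})
			// Debug output of the topic handle and the publish result.
			fmt.Printf("%v\n", topic)
			fmt.Printf("%v\n", res)
			// Get blocks until the server has accepted the message.
			msgID, err := res.Get(ctx)
			if err != nil {
				log.Printf("Failed to publish: %v", err)
				return
			}
			fmt.Println(msgID)
		}
	}()

	w.WriteHeader(http.StatusOK)
	fmt.Fprintf(w, "Start!")
}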

[Result] Number of Pub/Sub messages

Normal

always-on CPU

Normal: 2700
always-on CPU: 37000

Those were the resulting message counts: always-on CPU published roughly 14 times as many messages.

Running several goroutines in parallel might change the results even more.

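With this, quite a wide range of services can probably be run on Cloud Run.

If you want instances to stay up for longer than 15 minutes, setting minimum instances looks like the way to go.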

Combined with Cloud Run minimum instances, you can even keep a certain number of container instances up and running with full access to CPU resources.

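I still remember how striking Cloud Run was when it first came out, and it is exciting to watch it keep evolving.

Other notes

Even with always-on CPU, the service still scales to zero when no requests are coming in.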