Home Data Center Series: Cloudflare Monitoring and Alerting in Practice (Health Checks + Event Alerts for Lightweight Operations)

1 Introduction

Regarding health checks and instant notifications for the applications on my sites, I previously wrote up a solution: a self-hosted Uptime instance paired with Bark to monitor service status and push anomalies to my Apple devices (iPhone, iPad, macOS). For the specific setup steps, see the article "Docker series: building a real-time application health monitoring and alerting system based on Uptime and Bark". Apart from supporting only the Apple ecosystem, this solution is functionally quite complete.

However, after using it for a while, I gave it up. This was not because of any technical problem; in practice it simply was not cost-effective. The reasons are roughly as follows:

  1. The graphical interface wastes resources: Uptime's chart pages are almost useless to me, since I am not going to stare at them for no reason, yet rendering them in the background consumes a good share of the lightweight server's resources, and Tencent Cloud's lightweight servers are not particularly powerful to begin with.
  2. The WAF rule compromises security: for Uptime's probe requests to reach the origin, I had to add the Tencent Cloud server's IP to Cloudflare's WAF allowlist. To be honest, I do not like doing this and always feel that it poses a security risk.
  3. Frequent false alarms are upsetting: the probe packets start in China and reach a Cloudflare data center (mostly in the western United States) over a complex path that is easily affected by network fluctuations. As a result I often received Bark notifications for no apparent reason, sometimes several within a minute. Lowering the probe frequency later did not solve it, and the key point is that nothing was wrong with the service at all. The noise wore me down, and I am not a professional operations engineer who gets paid for this.
  4. The service is actually quite stable: more realistically, my service has been stable over the long run, with almost no major outages. Since the probability of a real problem is already very low, it makes little sense to spend limited lightweight server resources on a continuously running monitoring system.

It was for these reasons that I gave up on my self-built monitoring + notification combination. However, while browsing Cloudflare's features to see what else I could do with it, I found that Cloudflare already provides native health checks and a notification mechanism.

Although this is not a necessity for me, since Cloudflare is giving the feature away, I am certainly happy to accept it. So I wrote this article to record how Cloudflare's "health check" and "notification" features helped me rebuild, at low cost, a lightweight monitoring solution that better suits a webmaster's operation and maintenance scenarios.

2 Health Check (Pro and above)

2.1 Overview

Cloudflare's "Health Check" feature lets users regularly check the availability of a website or server, so that problems are detected in a timely manner. You specify a URL path (such as the homepage, an API endpoint, or a dedicated check page), and Cloudflare periodically sends HTTP requests to verify that the target returns the expected status code (such as 200 OK). Users can customize the check frequency, the timeout, and the expected HTTP status codes. If a check fails, Cloudflare sends a notification via email or webhook to help users respond quickly. The feature suits both single-origin and multi-origin configurations. Pro users can use up to 10 health checks at no extra charge.
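
To make these settings concrete, here is a minimal Python sketch of the logic: a probe counts as healthy only if it answered within the timeout with an expected status code, and the overall state flips only after several consecutive probes agree. This is an illustration of the idea, not Cloudflare's implementation. The names `ProbeResult`, `is_healthy`, and `HealthTracker`, as well as the threshold values, are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    status_code: Optional[int]  # None means no response was received at all
    elapsed_seconds: float      # how long the probe took


def is_healthy(result: ProbeResult, expected_codes=(200,), timeout_seconds=5.0) -> bool:
    """A probe passes only if it answered in time with an expected status code."""
    if result.status_code is None:
        return False
    if result.elapsed_seconds > timeout_seconds:
        return False
    return result.status_code in expected_codes


class HealthTracker:
    """Flips state only after N consecutive probes contradict the current state.

    The thresholds are hypothetical values chosen for this example.
    """

    def __init__(self, fails_to_unhealthy=2, passes_to_healthy=2):
        self.fails_to_unhealthy = fails_to_unhealthy
        self.passes_to_healthy = passes_to_healthy
        self.healthy = True
        self._streak = 0  # consecutive probes contradicting the current state

    def record(self, passed: bool) -> Optional[str]:
        """Return "unhealthy" or "healthy" when the state changes, else None."""
        if passed == self.healthy:
            self._streak = 0
            return None
        self._streak += 1
        needed = self.fails_to_unhealthy if self.healthy else self.passes_to_healthy
        if self._streak < needed:
            return None
        self.healthy = passed
        self._streak = 0
        return "healthy" if passed else "unhealthy"
```

Requiring consecutive failures before changing state is one common way to keep a single dropped probe from producing an alert, which is exactly the kind of noise described in the introduction.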

2.2 Setup steps

2.2.1 Adding health checks

Taking the health check for my blog as an example, the steps are shown in the screenshots below:

image.png

image.png

After successful addition:
image.png

Compared with traditional third-party availability monitoring tools (such as the Uptime mentioned above), Cloudflare's health check follows a path that is closer to the actual back-to-origin path. A probe sent by Uptime usually travels from its own probe node (in my case, a Tencent Cloud lightweight server in China) to a Cloudflare edge node, and only then does Cloudflare make a back-to-origin request to the origin server. The health check, by contrast, sends its HTTP probe directly from Cloudflare's edge network to my origin server. The probe path is shorter, and it is almost identical to the back-to-origin path Cloudflare uses when actually serving visitors, so it reflects more accurately whether Cloudflare can really reach the origin server.

What is really powerful, though, is the "event linkage judgment" that becomes possible when the health check is combined with Cloudflare's notification system. For example, in the notification system I can set a "tunnel health alert" for the tunnel and a "health check" alert for the blog. If both fire within the same time period, I can be fairly sure that the tunnel is disconnected and Cloudflare cannot reach the origin. Third-party monitoring tools alone cannot do this: Uptime only knows that the origin is inaccessible, but cannot tell whether the origin itself is down, the tunnel is down, or Cloudflare has a problem. Cloudflare's native alerts, by contrast, draw on internal signals from several dimensions and are pushed via email or webhook, so they can be cross-checked against each other. This is extremely valuable for users like me who deploy on an intranet and rely on the tunnel.
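
The linkage judgment can be written down as a small rule applied to the alerts as they arrive. The Python sketch below is only an illustration of that reasoning, not a Cloudflare feature: the function name `diagnose`, the returned labels, and the 5-minute window are assumptions for this example, and treating a health check failure without a nearby tunnel alert as an origin problem is my own simplification.

```python
from datetime import datetime, timedelta
from typing import Optional


def diagnose(health_check_failed_at: Optional[datetime],
             tunnel_down_at: Optional[datetime],
             window: timedelta = timedelta(minutes=5)) -> str:
    """Guess the likely culprit from which alerts fired and how close together."""
    if health_check_failed_at is None:
        return "no health check failure"
    if tunnel_down_at is None:
        # The health check failed while the tunnel reported no problem.
        return "origin suspected"
    if abs(health_check_failed_at - tunnel_down_at) <= window:
        # Both alerts fired in the same time period.
        return "tunnel suspected"
    # A tunnel alert far away in time is treated as unrelated.
    return "origin suspected"
```

For example, a health check failure at 12:00 and a tunnel alert at 12:02 fall inside the window, so the tunnel is the first thing to look at.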

2.2.2 Configuring Health Check Alerts

The alerts are email notifications by default and can be configured directly in the health check interface:

image.png

image.png

image.png

image.png

From then on, whenever the blog becomes inaccessible, the configured mailbox receives an email notification like this:
image.png

When the blog becomes accessible again, the configured mailbox receives an email notification like this:
image.png

3 Cloudflare Notifications

Cloudflare's notification feature is an important part of a webmaster's operation and maintenance work. Whether your website is a static blog or a dynamic service, once notifications are set up you will know as soon as something goes wrong. It can push messages through several channels, such as email and webhooks, and covers a wide range of event types, including DDoS attacks, back-to-origin anomalies, SSL/TLS certificates about to expire, health check failures, WAF triggers, tunnel health alerts, and Page Shield alerts.

However, one thing to note is that different subscription levels have different alert types available. The alert types that Free users can subscribe to are relatively basic, while Pro, Business, and Enterprise users unlock more advanced notifications, such as brand protection, new domain name monitoring, BGP hijack detection, standalone health checks, and more.

The "Health Check" alert configured earlier in this article is in fact the "Health Check" type within the notification feature, so creating a "Health Check" alert from the notifications page gives the same result.

image.png

In addition, even for Free subscription users, there are many useful alerts that can be enabled, such as:

  1. HTTP DDoS attack alert (enabled by default): Cloudflare detects and mitigates HTTP DDoS attacks against your domain, and sends notifications when they occur.
  2. SSL/TLS Certificate Expiration Alert: When an SSL/TLS certificate is about to expire, Cloudflare will remind you to avoid access issues caused by certificate expiration.
  3. Page Shield new code change detection alert: If the JavaScript files loaded by the page are changed, you will receive relevant alerts, helping you to promptly detect code changes that may affect the security of your site.
  4. Page Shield New Malicious Domain Alert: When your users load resources from known malicious domains, Cloudflare sends notifications to alert you about the safety of the resources.
  5. Page Shield New Malicious Script Alert: If the JavaScript loaded by the user is marked as malicious, the system will also issue an alert.
  6. Brand Protection Alert: If a new domain is detected that matches your domain query or logo, you will receive real-time notifications, helping to identify potential brand abuse in a timely manner.

For Pro users, in addition to the notification types available to Free users, there are more commonly used alert types, including the aforementioned "health check" as well as "passive origin server monitoring", "Tunnel health alert", "Web Analytics metrics update", "Cloudforce One port scan alert", and others. For example, all the notifications I created are as follows:

image.png

Note 1: One frustrating point is that the notification page does not indicate which alert types are available at which subscription level, which is inconvenient in practice. When setting things up, you may find that some options are grayed out or cannot be added, which basically means the current plan does not support them.

Note 2: Free plan users get a limited set of notification capabilities. Many advanced alert types are not included, especially those for traffic management, health checks, and port scanning. In contrast, the notifications offered by Pro and above plans are more comprehensive and can help users better handle potential traffic surges, DDoS attacks, maintenance impacts, and similar issues. So if you need finer-grained control, upgrading to the Pro plan may be a good choice, especially when the website has high traffic and security requirements.

4 Enhanced real-time notifications

4.1 Limitations of Email Notifications

Email notifications can be considered quasi-real-time, but most people do not take them seriously, especially when they regularly receive spam. I cannot go and check whether the site has a problem every time I hear the new-mail sound (unless I create a mailbox dedicated to notifications, or assign a special alert tone to emails sent by Cloudflare), so a practical, conventional real-time notification method is needed.

4.2 Budget compromise: 139 Mail SMS notifications

This method has a prerequisite: you need a mobile phone number (139 Mail is China Mobile's email service). It relies on the free "SMS notification of new emails" feature that 139 Mail provides:

image.png

This method is very simple. The notification emails come from [email protected], and you can explore how the mailbox handles these notification emails on your own. When there is a problem with the tunnel, the SMS looks like this:
image.png

When an HTTP DDoS attack occurs, the SMS looks like this:
image.png

Some alert subjects are truncated, but at least I can tell that the message is from Cloudflare and that the "tencentcloud" tunnel has a problem. Notifications are mostly real-time, though not always: the longest delay I saw before an SMS arrived was more than 10 minutes. In addition, after following the 139 Mail WeChat official account, reminders can also be delivered through WeChat. Overall it is quite practical, and the key point is that it requires no tinkering.

4.3 Webhook notification method

4.3.1 What is a webhook

A webhook is a way for applications to communicate in real time over HTTP. When an event occurs in one system, it automatically sends a piece of data to a URL you configured in advance (the receiver's address), usually as an HTTP POST request. The data can be in JSON, XML, or another format and contains information about the event.

Let's take a practical example: suppose you have a repository on GitHub and want a CI/CD tool to build and test automatically whenever someone submits a Pull Request to it. You can set up a webhook so that, when a Pull Request event occurs, GitHub automatically sends an HTTP POST request to the CI/CD system containing details of the event (such as the submitter and the modified content). After receiving the request, the CI/CD system starts the build and test process.

Compared with polling (constantly querying the server for new data), webhook is more efficient because it is event-driven. Once an event occurs, the system will actively push data to you instead of you frequently requesting data. In short, webhook is a mechanism that sends relevant information to a specified URL in real time by defining trigger conditions. It is often used for automated interactions between systems.
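
To show what the receiving end of a webhook looks like, here is a minimal sketch using only the Python standard library. It accepts an HTTP POST, reads the JSON body, and prints a one-line summary. The `text` field name, the port, and the names `summarize_event`, `WebhookHandler`, and `run` are assumptions for this example; every sender documents its own payload layout, so check that before relying on any field.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_event(raw_body: bytes) -> str:
    """Turn a JSON webhook payload into a one-line summary.

    The "text" field name is an assumption made for this example.
    """
    try:
        payload = json.loads(raw_body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "unparseable payload"
    if not isinstance(payload, dict):
        return "unexpected payload shape"
    return str(payload.get("text", "event received (no text field)"))


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly as many bytes as the sender announced.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        print(summarize_event(body))
        # Acknowledge receipt so the sender does not treat the push as failed.
        self.send_response(200)
        self.end_headers()


def run(port: int = 8000) -> None:
    """Listen on localhost only; call run() to start the receiver."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Nothing here polls: the receiver simply waits, and the sender pushes data when an event happens, which is the efficiency argument made above.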

I have gone into this much detail because Cloudflare's notification feature supports webhooks in addition to email notifications (Free subscribers can use them too):

image.png

4.3.2 Creating a webhook receiver for notifications based on Bark

For individual users on Apple ecosystem devices, Bark is a very convenient tool for building a personal webhook address (for the specific setup steps, see the article "Docker series: building a message push server based on bark server"). It is essentially a relay service that is not responsible for storage or identity verification; it simply inspects the URI and forwards the message. Each Bark terminal (iPhone, iPad, Mac) obtains a unique device key (token). As long as a request whose URI path follows the agreed format, with this device key as the path prefix, is sent to the Bark server, the Bark server extracts the notification content from the request and delivers it to the device that owns the key (these terminals normally keep a connection to the Bark server).
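
As a rough illustration of "device key as the path prefix", the Python sketch below builds a push URL, assuming the commonly documented Bark path layout `/<device key>/<title>/<body>`. The helper name `bark_url` and the server `https://bark.example.com` are hypothetical; check your own Bark server's documentation for the exact format it accepts.

```python
from urllib.parse import quote


def bark_url(server: str, device_key: str, title: str, body: str) -> str:
    """Build a Bark-style push URL: <server>/<device key>/<title>/<body>.

    Each path segment is percent-encoded so that spaces or slashes in the
    title or body cannot change the path structure.
    """
    parts = [quote(part, safe="") for part in (device_key, title, body)]
    return server.rstrip("/") + "/" + "/".join(parts)


# Hypothetical server and device key, for illustration only.
print(bark_url("https://bark.example.com", "123456789", "Cloudflare", "HTTP DDoS attack"))
```

Because the alert text is fixed in the URL itself, a URL like this can only ever describe one kind of event, which is why each alert type ends up needing its own webhook.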

Below, taking my Bark server (address: https://barkapi.tangwudi.com) and a dummy device with the device key "123456789" as an example, I demonstrate the steps to create a Bark-based webhook notification for the HTTP DDoS alert type:

image.png

Then set the address of the Bark server:
image.png

If the test is successful, the Bark client on the corresponding terminal device will receive a notification:
image.png


There are many ways to display notifications on the terminal, and you can choose among them in Bark:

image.png


Finally, select the webhook you created earlier as the notification method for the specific event (the HTTP DDoS attack alert in this example), and then save:

image.png

All the webhook notifications I ended up creating were as follows:
image.png

Note: Why did I create so many webhook notifications in the picture above? If you want to tell the alert types apart in Bark notifications, each terminal device needs at least one Bark webhook per alert type you care about. You cannot share one notification text across all alert types, because then you would only know that something happened, not what. To make things even easier to distinguish, I created 2 webhooks (up and down) for some alert types, such as tunnel status and blog monitoring. On top of that, each terminal needs its own set, and I have 2 terminals (iPhone and Mac mini, plus the iPad for HTTP DDoS attack notifications), so naturally the list gets long.
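
The multiplication described in the note can be made explicit with a few lines of Python. The device names, alert names, and the helper `webhook_names` are hypothetical and do not reproduce my actual list; the point is only that the count is devices × alert types × states.

```python
from itertools import product


def webhook_names(devices, alerts):
    """List one webhook name per (device, alert type, state) combination.

    `alerts` maps an alert type to the states that need separate wording;
    an empty-string state means the alert has a single message.
    """
    names = []
    for device, (alert, states) in product(devices, alerts.items()):
        for state in states:
            names.append(f"{device}-{alert}-{state}" if state else f"{device}-{alert}")
    return names


# Hypothetical setup: 2 devices, 2 alert types with up/down, 1 without.
names = webhook_names(
    ["iphone", "macmini"],
    {"tunnel": ["up", "down"], "blog": ["up", "down"], "ddos": [""]},
)
print(len(names))
```

With 2 devices and these three alert types, the sketch already yields 10 webhooks, which matches the general experience that the list grows quickly.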

5 Conclusion

The combination of Cloudflare's "health check" and "notification" features makes a webmaster's daily operation and maintenance much more convenient. Although third-party availability monitoring tools such as Uptime exist, they fall short of Cloudflare's native solution in monitoring accuracy, fault localization, and the flexibility of the alerting mechanism.

First, the probe nodes of third-party tools are usually located in public clouds or overseas data centers. They can detect whether a site is accessible, but the probe path differs from the path real visitors take, and factors such as network fluctuations and geographical differences mean the result may not reflect the global access experience. In contrast, Cloudflare's health check is distributed and initiated from multiple edge nodes. Especially when a Tunnel is used for the back-to-origin path, this multi-node probing closely mirrors the actual access path from different regions, so it reflects more accurately whether Cloudflare can reach the origin and serve visitors.

Second, although Uptime can report an "access failure", for deployments that use tunnels or are hidden on an intranet it cannot determine whether the problem lies with the tunnel, with Cloudflare's back-to-origin process, or with the origin itself. Cloudflare's solution can combine multiple event sources, such as health checks, tunnel connection status, and WAF behavior, through its event alerts to help locate the root cause. For example, when the health check fails and the tunnel is reported as disconnected, it can be preliminarily concluded that the tunnel is at fault rather than the origin itself being unavailable. Most third-party solutions struggle to do this.

Third, in terms of notification methods, although third-party tools support email or webhooks, their alerting logic is usually rigid and can only report a single state change. In contrast, Cloudflare provides an integrated, composable alerting system, with all configuration concentrated on one platform, simple operation, and clear logic. Whether you use email notifications or connect external push tools such as Bark or 139 Mail SMS, setup is easy and has almost no barrier to entry.

For users who deploy services on an intranet, do not expose them to the public network, and reach the origin only through Cloudflare Tunnel, this "close to the origin" native detection mechanism and flexible alerting system are particularly valuable. It saves the effort of building an independent monitoring system and lets webmasters achieve a lightweight, reliable operation and maintenance goal: learn about problems quickly, and hear nothing when all is well.

However, this solution is not flawless. The only small regret is that Free plan users cannot get the health check for free.

The content of this blog is original. Please indicate the source when reprinting. For more blog articles, see the Sitemap. The blog's RSS address is https://blog.tangwudi.com/feed, and you are welcome to subscribe. If needed, you can also join the Telegram Group to discuss problems together.