Contents
- Preface
- Background
- How CDN works
- Use Cloudflare CDN to speed up website access
- Global CDN based on Anycast IP
- Cloudflare's current situation in China
- Impact on individual webmasters using the Free plan
- In-depth analysis of the reasons why domestic users access websites slowly
- Conventional optimization ideas
- Advanced optimization ideas - worker (suitable for dynamic sites)
- Additional knowledge: Argo Smart Routing
- Summary of Cloudflare's website optimization concept
- Folk remedies: "self-selected IP", "preferred IP", "preferred domain name"
- Summary
Preface
I didn't originally want to write this article, because fundamentally, "self-selected IP", "preferred IP", and "preferred domain name" are not official concepts. To put it nicely, they are "folk remedies"; to put it bluntly, they are "heresy" (Old Kai put it truthfully). I have also corrected this point in a dedicated section of a previous article (see: Home Data Center Series Cloudflare Tutorial (VII) Introduction to CF Worker Functions and Practical Operation, Verification and Research on Related Technical Principles of Implementing "Beggar Version APO for WordPress" Function to Accelerate Website Access Based on Worker). However, to this day I still see many friends who do not understand how Cloudflare's CDN accelerates website access and who do not optimize their sites according to Cloudflare's official optimization philosophy. Instead, they pour their energy into studying these "unorthodox arts" (don't you see that even Super Saiyan 3 ends up unable to gather energy properly?). After thinking it over, I decided to write this article, if only so that everyone can put their effort where it actually counts.
Note 1: This article is purely a technical explanation and is not directed at anyone. Please forgive me if I accidentally offend anyone with my words. Of course, if you have different technical opinions, you are welcome to discuss them in the article comment section or message board.
Note 2: "CDN accelerates website access" is just one sentence, but actually implementing it involves many technical points, and Cloudflare's CDN even more so. To explain it clearly, I have to start from the beginning.
Background
Once upon a time
Before CDN technology became popular, the traditional way to run a website was to build a server in your own machine room. For a self-built machine room project back then, you first applied for a line from a broadband operator (initially the operator only provided a 2 Mbps E1 interface, so you had to buy a Cisco 2600 router with an E1 module to terminate the operator's E1 line, and the Cisco 2600's built-in 10 Mbps electrical port then served as the RJ45 hand-off for the line. Today this would be like buying a Cisco router just to act as a fiber-to-copper media converter for the operator's optical fiber; a luxurious setup in hindsight, but it was standard at the time). Then you bought a firewall: one electrical port served as the WAN port, connected to the Cisco 2600's 10 Mbps electrical port, and another connected to the intranet layer-3 switch as the LAN port. You picked one of the public IPv4 addresses provided by the operator as the NAT address for intranet-to-Internet traffic, and on the firewall mapped TCP port 80 of a public IP directly to port 80 of the intranet server's IP. At most you would also configure a TCP port 21 mapping, and the machine-room project was basically complete.
There were not many visitors at that time (home broadband was not yet widespread; ISDN, if memory serves), so publishing a website with one public IPv4 address and 2 Mbps of bandwidth was more than enough. There was no concept of "access experience" yet: as long as the page eventually opened, nobody cared whether it took 5 seconds or 8.
Later, with the rapid growth of domestic home broadband and the emergence of multiple broadband operators, a new problem arose: Telecom broadband users were extremely slow when accessing websites hosted on Netcom (now China Unicom) addresses, and the same was true for Netcom users accessing Telecom-hosted websites. The reason was that traffic between operators was settled as cross-network traffic at very expensive rates, so every operator throttled its outbound inter-network links.
Therefore, at that time, major ICPs (NetEase, Sina, etc.) could only build data centers within the networks of China Telecom and China Netcom respectively to ensure that users of different operators had a good access experience without generating outbound network traffic (users of small operators suffered, such as China Tietong).
But NetEase and Sina each expose only one domain name to the outside world. How can Telecom and Netcom users, visiting the same domain name, be accurately directed to the corresponding data center inside their own operator's network? This brings us to the first important technical point: global server load balancing.
How DNS works
Before talking about "Global Server Load Balancing", you need to review the basic knowledge - "How DNS Works".
When we surf the Internet normally, we just need to open the browser, enter the domain name we want to visit and press Enter, and then we can browse the web at will. However, this process that everyone is accustomed to is actually the result of the cooperation of a series of DNS servers, as shown in the following figure:

From the figure above, when a user enters a target domain name (take www.tangwudi.com as an example) and presses Enter, the user's operating system first sends a recursive query for the domain's IP address to the Local DNS server (the DNS address people usually configure directly on their computer or router, such as 223.5.5.5, 119.29.29.29, or 8.8.8.8). The Local DNS server then iteratively queries the upstream DNS servers for the target domain's IP address and returns the result to the user. Throughout this process, the upstream DNS servers (root, top-level, and authoritative name servers) only ever see the Local DNS server asking the questions, never the user's real IP (remember this point; it is very important).
Note 1: Strictly speaking, "query the Local DNS" is not really the first step. To be picky: the browser first checks its own DNS cache and uses it on a hit. On a miss, it calls the operating system, which checks the OS-level DNS cache, then the local hosts file, and only if all of those miss does it send a request to the Local DNS server.
Note 2: Regarding DNS recursive query and iterative query, friends who are interested can study it by themselves. This is not the focus of this article, I just mentioned it casually.
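The client-side lookup order from Note 1 can be sketched in a few lines of Python. This is a toy model: the cache tables, hosts entries, and addresses below are invented for illustration, and a real resolver involves far more machinery.

```python
# Toy model of the client-side resolution order: browser cache -> OS cache ->
# hosts file -> Local DNS (recursive query). All entries here are made up.

def resolve(domain, browser_cache, os_cache, hosts, local_dns):
    for source, table in (("browser cache", browser_cache),
                          ("OS cache", os_cache),
                          ("hosts file", hosts)):
        if domain in table:
            return table[domain], source
    # Miss everywhere: hand the recursive query to the Local DNS, which then
    # iterates through root / TLD / authoritative servers on our behalf.
    return local_dns(domain), "Local DNS"

hosts = {"www.tangwudi.com": "4.4.4.4"}
ip, source = resolve("www.tangwudi.com", {}, {}, hosts, lambda d: "1.2.3.4")
print(ip, source)  # 4.4.4.4 hosts file
```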
Global Server Load Balance
Continuing with the previous topic, why can global server load balancing (hereinafter referred to as global load balancing, also called smart DNS) enable telecom users to access the corresponding data center within the telecom network and Netcom users to access the corresponding data center within the Netcom network?
The principle is actually very simple. As described in the DNS section above, upstream DNS servers can see the requests sent by the Local DNS. So as long as the authoritative DNS server responsible for resolving NetEase's or Sina's domains additionally inspects the source IP of the Local DNS, it can answer accordingly: if the Local DNS has a Telecom address, return the IP of the nearby data center inside the Telecom network; likewise, if the Local DNS has a Netcom address, return the IP of the nearby data center inside the Netcom network.
However, the main job of the authoritative DNS server is DNS resolution, and its load is heavy enough already, so this extra judgment work is handed off to a dedicated device: the global load balancer. The Local DNS would normally get its answer from the authoritative server, but now the authoritative server effectively says, "I may be the boss, but I've delegated this matter to my little brother. Here is his address; go ask him," and the Local DNS performs one more round of queries (shown in the red box in the figure below):

Note 1: "Global load balancing" sounds grand, but in essence it is just an ordinary DNS server with a built-in national (or global) IP address library, which lets it determine where a query comes from based on the Local DNS's source IP. For example, if you build your own DNS server with bind9, you only need to create zone files containing different resolution results and then use view statements to associate different source address ranges with different zone files, achieving the simplest form of "intelligent" resolution.
Note 2: Professional global load balancing (such as F5's GTM) has more commercial functions. If it is deployed in multiple data centers and can cooperate with each other, and combined with the load balancing of multiple export links in the data center and the multi-active deployment of applications, these combinations can realize (same city/different location) disaster recovery data center, two-site three-center, and multi-active data center solutions.
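As a toy illustration of the view-style "intelligent" resolution described in Note 1, here is a minimal Python sketch that answers differently depending on the querying Local DNS's source network. All prefixes and returned addresses are invented for illustration, not real operator allocations.

```python
import ipaddress

# Toy GSLB: map the source network of the querying Local DNS to a zone-specific
# answer, the way bind9 `view` + per-view zone files would.
VIEWS = [
    (ipaddress.ip_network("61.128.0.0/10"), "1.1.1.10"),   # "Telecom" data center
    (ipaddress.ip_network("202.96.0.0/12"), "2.2.2.10"),   # "Netcom" data center
]
DEFAULT = "3.3.3.10"  # fallback answer for everyone else

def gslb_answer(local_dns_ip):
    ip = ipaddress.ip_address(local_dns_ip)
    for net, answer in VIEWS:
        if ip in net:
            return answer
    return DEFAULT

print(gslb_answer("61.128.128.68"))  # 1.1.1.10
print(gslb_answer("8.8.8.8"))        # 3.3.3.10
```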
How CDN works
“Multiple Shadow Clones”: Make the source site content ubiquitous
One of the biggest problems with the traditional way of building a website is that the origin server sits in a single location. Say its public IP is 4.4.4.4: users across the country (or even the world) can only follow public-network routes to that one address to fetch the site's content. If the network path is smooth, fine; if it is not (think of the old Telecom/Netcom restrictions, or domestic users visiting foreign sites), the journey is no easier than the pilgrimage in Journey to the West, with all eighty-one tribulations included.
So, with the origin server stuck in a single location, is it possible to make the website's content available everywhere and instantly reachable by every visitor? Of course: give the origin server the "Multiple Shadow Clone" treatment. That is CDN.

Although there is only one origin server, a CDN can in theory mirror the origin's content to cache servers in every region of the country (the cache servers are represented by the nginx icons), making the origin "ubiquitous" while also offloading its performance pressure onto the cache servers (in theory, many traditional origin-sizing metrics, such as concurrent connection count and new-connection rate, stop mattering so much):

In practice, however, the origin's content is not simply mirrored wholesale to the cache servers, because doing so would cause the following problems:
1. Storage efficiency and cost issues
• Massive data is difficult to mirror: The content on the source site may be very large, including large numbers of files, videos, pictures, etc. Completely mirroring all of it to every cache server would require enormous storage space, greatly increasing the hardware cost for CDN operators, especially when data centers distributed around the world all need to store this data.
• On-demand caching is more efficient: Through the on-demand caching strategy, the CDN server only stores the content that the user has actually requested, reducing unnecessary storage consumption, improving storage efficiency and reducing costs.
2. Content dynamics and timeliness
• Dynamic content changes: Some content is dynamically generated or frequently updated, such as personalized user pages or real-time news. This type of content cannot be cached for long and must be fetched from the origin in real time per user request, so it cannot be cached wholesale onto CDN nodes through simple mirroring.
• Content timeliness: Some content is time-limited, such as short-term event pages, discount information, or current news. Fully mirroring such content risks serving expired copies from cache, causing user-experience problems and data-synchronization difficulties.
3. Network bandwidth and transmission costs
• Full mirroring increases bandwidth requirements: Transferring all origin content to the cache servers, especially for large websites or streaming platforms, would generate huge bandwidth demands. This not only increases the load on the origin, but also significantly raises transmission costs, especially cross-region data-transfer fees.
• Avoid unnecessary traffic: On-demand back-to-origin reduces needless transfers, pulling content from the origin only when a user actually requests it. This optimizes network-resource usage and avoids wasted bandwidth.
4. Complexity of data update and synchronization
• The source site content is updated frequently: If the source site content is frequently updated, CDN needs to continuously monitor and synchronize all cached content, which increases complexity and management costs. The on-demand back-to-source model can reduce this complexity and only requires ensuring the latest version of the content when it is requested.
• Consistency management issues: Full mirroring to the cache server means that every time the source site is updated, all cache servers need to be updated synchronously, otherwise it will lead to inconsistent content. By returning to the source on demand, the CDN node can automatically check whether the content needs to be updated according to the cache strategy, effectively reducing the pressure of consistency management.
5. Differences in user needs in different regions
• Unbalanced content request distribution: Users in different regions demand different origin content; some content may be popular only in one region. Fully mirroring everything to global CDN nodes would waste storage on content rarely accessed locally. On-demand fetching lets each region cache the content relevant to it, optimizing storage use.
Therefore, the approach actually adopted is for the cache server to "go back to the origin" on demand (when a CDN node has not cached the content a user requests, it forwards the request to the origin server to fetch it). This is the "take only what you'll eat" policy advocated by buffet restaurants.
Note: a cache server's "back-to-origin speed" is a very important consideration. But the networks between domestic CDN vendors and domestic cloud-hosting vendors are usually well optimized (some are literally the same company), so back-to-origin is not that slow, and I won't dwell on it here.
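The on-demand "back-to-origin" behavior described above can be sketched as a tiny cache-on-miss model. This is a simplified illustration: `fetch_origin` is a hypothetical stand-in for a real HTTP fetch, and real CDNs honor Cache-Control headers rather than one fixed TTL.

```python
import time

# Minimal cache-on-miss sketch: the edge node stores only what users actually
# request, and re-fetches from the origin once the cached copy's TTL expires.
class EdgeCache:
    def __init__(self, fetch_origin, ttl=60.0):
        self.fetch_origin = fetch_origin
        self.ttl = ttl
        self.store = {}                    # path -> (content, expiry)

    def get(self, path, now=None):
        now = time.monotonic() if now is None else now
        hit = self.store.get(path)
        if hit and hit[1] > now:           # fresh cache hit: no origin traffic
            return hit[0], "HIT"
        content = self.fetch_origin(path)  # miss or stale: go back to origin
        self.store[path] = (content, now + self.ttl)
        return content, "MISS"

origin_calls = []
def fetch_origin(path):
    origin_calls.append(path)
    return f"content of {path}"

edge = EdgeCache(fetch_origin, ttl=60.0)
print(edge.get("/index.html", now=0.0))   # first request: MISS, origin fetched
print(edge.get("/index.html", now=1.0))   # second request: HIT, origin untouched
```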
Traditional DNS scheduling based on global load balancing
Although CDN can use distributed cache servers to make the source site content "ubiquitous", how to enable users to find the cache server closest to them is a question worth studying. The simplest implementation is global load balancing + cache servers distributed in various locations.
Take a domestic user visiting my blog at www.tangwudi.com as an example. Suppose I have deployed nginx servers in multiple regions across the country as cache servers. When a user requests www.tangwudi.com, the global load-balancing device that ultimately receives the DNS query returns the address of what it considers the nearest cache server, using the user's Local DNS address as its reference, as shown in the following figure:

Therefore, after using global load balancing for DNS scheduling, users in each area will feel that their access speed is greatly improved (local access is of course fast~).
Note 1: In theory, if you use the DNS server addresses provided by your operator by default (such as 61.128.128.68 for Chongqing Telecom users and 61.139.2.69 for Chengdu Telecom users), the global load balancer can accurately determine which operator's user initiated the DNS request, and the "nearest" IP it returns will be the most accurate. However, you may change your Local DNS for various reasons (for example, to use Alibaba's DoH you would switch to Alibaba's DNS address 223.5.5.5). In that case, the "nearest" address the global load balancer returns based on the Local DNS's IP may not be the truly nearest one. Still, everything stays within China, so even a suboptimal answer is barely noticeable.
Note 2: There is in fact a way to carry the user's real IP inside the DNS request, so that the global load balancer can see the user's actual source address and return the most effective nearby IP: EDNS (specifically, the EDNS Client Subnet extension). However, it requires support from the DNS client, the Local DNS, and the final authoritative DNS server (ideally every DNS hop in between as well), so it is difficult to deploy universally.
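A toy model of what ECS changes for the global load balancer: when the query carries the client's subnet, the decision is based on the real user rather than the Local DNS address. The prefix-to-region table below is invented purely for illustration.

```python
# Sketch of the difference EDNS Client Subnet (ECS) makes: with ECS the
# authoritative side sees (a prefix of) the real client address instead of
# only the Local DNS resolver's address. Region table is made up.
REGION_OF = {"61.": "chongqing", "202.": "beijing", "223.": "hangzhou"}

def region(ip):
    return next((r for p, r in REGION_OF.items() if ip.startswith(p)), "default")

def pick_node(local_dns_ip, ecs_client_subnet=None):
    # Prefer the real client subnet when the query carries an ECS option.
    basis = ecs_client_subnet or local_dns_ip
    return region(basis)

# A Chongqing user (61.x) who switched their Local DNS to 223.5.5.5:
print(pick_node("223.5.5.5"))                                  # hangzhou (suboptimal)
print(pick_node("223.5.5.5", ecs_client_subnet="61.128.1.0"))  # chongqing
```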
Limitations of CDN based on DNS scheduling
In the previous section, I used the simplest implementation method based on global load balancing to explain the working principle of CDN. Strictly speaking, although it is not wrong, it has limitations: this method is usually only suitable for specific, localized demand occasions, such as enterprise self-use or only providing services for a specific region (such as China). If you want to expand globally, this method will cause many problems:
1. Unable to achieve global traffic distribution
• Geographic location affects access speed: An important function of CDN is to route requests to the server node closest to the user based on the user's geographic location. This is fine for a single region, but if the service scope is expanded to the world and still relies solely on DNS or other methods to determine traffic routing, it may lead to inaccurate routing and slower response speeds.
2. More complex traffic management
• Limitations of DNS load balancing: CDNs usually rely on DNS load balancing for traffic distribution. Although it can route based on Local DNS addresses (or even real user addresses via EDNS), its accuracy and flexibility are limited (not to mention that everyone likes to change their Local DNS), and DNS records update slowly, so it cannot react to network changes in real time.
• Failover is more complicated: If a node has a problem, a CDN using DNS routing may take some time to update the DNS record and redirect traffic to a healthy node.
3. High latency and inconsistent network performance
• Increased delays in cross-border traffic: Users' requests may be routed to CDN nodes far away from them, especially when DNS resolution for certain areas is not accurate enough. This increases cross-border transmission time and hurts the user experience.
• Inconsistent performance: Different users may experience different access speeds due to differences in network paths.
4. Higher network complexity and operating costs
• Requires complex geographic DNS setup: To optimize traffic globally or regionally, CDN providers must invest more time and resources configuring and maintaining geographic DNS settings so that user requests are routed to appropriate nodes as much as possible. This raises operating costs and makes the system more complex.
• Regional congestion problem: Some regional nodes may bear more traffic, causing regional congestion problems to become more obvious.
5. Poor DDoS protection
• Difficult to disperse attack traffic: DDoS attacks can concentrate on a specific IP address or node, making that node easier to bring down and affecting the whole service.
• Global protection is harder to achieve: The CDN struggles to disperse and absorb attack traffic across many nodes, hurting overall security and stability.
6. Limited scalability
• Difficult to flexibly expand global nodes: CDN may need to configure the IP address of each node separately, which makes expansion more difficult and operation more complicated.
In order to deploy CDN globally, the above problems will be encountered, and the key to solving the problem is Anycast IP technology.
Anycast IP Technology
What is Anycast IP technology?
Anycast IP is a network routing technology. Its core idea is to assign the same IP address to multiple server nodes in different locations. In other words, multiple servers share one IP address. When a user sends a request to this IP address, the network will automatically route the request to the server node that is closest to the user or the best.
Take the KFC takeout phone number "400-880-3823" as an example. When you call this number to order food, no matter where you are, the phone will automatically connect to the KFC fast food restaurant closest to you. This fast food restaurant will provide services based on your address. The whole process is seamless for you. You don’t need to manually select which store, but you can enjoy the service of the nearest store.
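The "nearest store answers the shared number" behavior can be modeled in a few lines. Path costs and node names here are invented; real Anycast relies on BGP best-path selection in the network, not application code.

```python
# Toy model of Anycast: every edge node advertises the same IP, and the
# "network" simply delivers each packet to the node with the lowest path cost
# from the user (roughly what BGP best-path selection achieves).

def anycast_route(path_costs):
    # All listed nodes answer for the same shared IP; the cheapest path wins.
    return min(path_costs, key=path_costs.get)

costs = {"beijing": 5, "shanghai": 12, "guangzhou": 30}
print(anycast_route(costs))   # beijing

# Failover: if the Beijing node stops advertising the IP, traffic shifts to
# the next-best node automatically, with no DNS record change required.
del costs["beijing"]
print(anycast_route(costs))   # shanghai
```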
Going back to the earlier figure: if Anycast IP technology is deployed nationwide and the DNS resolution for www.tangwudi.com is bound to the anycast IP 10.10.10.10, the global load balancer can afford to be lazy and answer every query for www.tangwudi.com with 10.10.10.10 (in practice we are not quite that lazy, and different anycast IPs are returned for different customer regions). Every user then uses this same IP as the destination address, and each still reaches the cache server closest to them:

After CDN uses Anycast IP technology, the six problems mentioned in the previous section are all solved:
1. Global or regional traffic optimization
• Quick response: With Anycast IP, user requests are automatically routed to the nearest CDN node. This significantly improves access speed, especially for cross-border access or globally distributed users, and reduces latency.
• Intelligent traffic distribution: Anycast allows the same IP to be shared among multiple nodes around the world, so no matter where the user is, their requests will be intelligently assigned to the optimal server, thereby optimizing performance.
2. Simplify traffic management
• Real-time route optimization: Anycast IP automatically selects the best path, without the frequent updates or manual management that traditional DNS load balancing needs. This reduces traffic-management complexity and improves network flexibility.
• Fast failover: Anycast can quickly switch traffic to other healthy nodes when a node fails, ensuring service continuity without having to rely on slower DNS updates.
3. Lower latency and consistent network performance
• Geographical advantage: Anycast IP ensures that users are connected to the closest node based on their physical location, which means lower network latency and more consistent performance for users around the world.
• Eliminate cross-border access delays: For cross-border or cross-regional traffic, Anycast can route requests through the optimal path, reducing network latency issues in data transmission.
4. Simplified network management and cost control
• Automatic traffic scheduling: Anycast simplifies geographic DNS settings, automatically routes user traffic to the optimal node, and reduces the complexity and management costs of manual configuration.
• Effectively distribute traffic: By balancing traffic, Anycast avoids overloading any single node, improves overall network efficiency and performance, and reduces extra maintenance costs.
5. Enhanced DDoS protection
• Disperse attack traffic: When a DDoS attack occurs, Anycast technology can disperse the attack traffic to multiple nodes, reducing the pressure on a single node, thereby effectively alleviating the impact of the attack and improving network security.
• Global protection capabilities: Anycast allows attack traffic to be absorbed and dispersed globally, no longer limited to a certain area or node, which improves the overall network protection effect.
6. Flexible scalability
• Simplify global node expansion: With Anycast IP, CDN providers can easily add new nodes around the world without configuring a different IP address for each one. Expansion becomes more flexible and faster, helping providers cope with growing traffic demand.
• Balancing network load: When Anycast distributes traffic globally, it makes the traffic more balanced, reduces the risk of overloading a single node, and improves the stability and reliability of the service.
After adopting Anycast IP, a CDN gains faster access (no more waiting on slow DNS-based scheduling), more flexible traffic management, more consistent network performance, stronger DDoS protection, and more flexible scalability. These advantages let a CDN serve users worldwide (or across large regions) with high quality while keeping the network stable and secure. However, compared with traditional DNS scheduling, Anycast IP is far harder to deploy (it involves backbone-network engineering), so it is not practical for small CDN vendors or enterprises building their own CDN.
Note: an Anycast IP is indeed an IP address, but not the ordinary IP address (unicast IP) we see everywhere. It is more like a door with an uncertain exit, wearing an IP address as its sign (like Doraemon's Anywhere Door: it looks like an ordinary door, but it can lead anywhere once pushed open, so even if you open it and see place A one moment, it may lead to place B the next).
Use Cloudflare CDN to speed up website access
Global CDN based on Anycast IP
Cloudflare is a provider of comprehensive solutions worldwide (including CDN, DDoS protection, and other services). Most importantly, Cloudflare offers a Free plan that benefits many individual webmasters (see: Home Data Center Series CloudFlare Tutorial (I) CF related introduction and its benefits to personal webmasters), and that plan includes CDN with unlimited traffic. A key reason Cloudflare is so capable is its private backbone network and its Anycast-enabled data centers all over the world:

Cloudflare's current situation in China
Unfortunately, even a company as powerful as Cloudflare cannot build its own backbone network inside China. The current compromise is to partner with JD Cloud and extend into China over JD Cloud's network, which leads to two problems:
1. Domestic network costs remain high
Would JD.com let Cloudflare use its network for free? Cloudflare can save costs elsewhere because it built its own network; now that it has to partner with a domestic cloud provider, the high cost is easy to imagine. So for Cloudflare, the main purpose of using JD Cloud's network is to serve its own large enterprise customers (such as www.qualcomm.cn and www.visa.cn) that need websites inside China. For this purpose, a China-specific service called "China Network Access" was created. Users on the ordinary Free plan can only drool:

In the red box above, "China Network Access" means direct access to JD Cloud's 30 domestic data centers. This has two implications:
- 1. The IPs that the domain resolves to are JD Cloud's domestic public IPs (the global load balancer returns results based on the querier's region), so domestic users reach the websites of Cloudflare enterprise customers directly via the nearest JD Cloud domestic IP. Ping www.qualcomm.cn and www.visa.cn on an ordinary domestic connection and you will see.
- 2. If an enterprise customer's domestic origin is also hosted in China, then when a domestic user's nearest JD Cloud data center does not have the requested content in cache, that cache server can go back to the origin directly from within China.
2. The control over the network is far inferior to the backbone network built by itself
This is only natural: Cloudflare is riding on JD Cloud's network, so it cannot configure it at will the way it can its own. On top of that, the domestic network has many restrictions, so several technologies Cloudflare excels at do not work well in China (Anycast, for example):

This also means Cloudflare can only build services with "Chinese characteristics" within the constraints of JD Cloud's network. As a result, the resolution addresses of Cloudflare enterprise customers' websites in China are just ordinary domestic public addresses (not Anycast IPs).
Impact on individual webmasters using the Free plan
The direct consequence of the two problems above is that JD Cloud's domestic network serves only those Cloudflare enterprise customers who purchased "China Network Access"; it has nothing to do with subscribers of any other plan. In other words, apart from those enterprise customers, for Free plan subscribers (and other paid plans too), Cloudflare effectively has no network in China (it never had one of its own to begin with~).
For Free plan webmasters whose target audience does not include domestic visitors, this has essentially no impact. After all, Cloudflare data centers are everywhere else in the world: when visitors from other regions access a Free-plan site, the Anycast IP returned by DNS sends them straight to the nearest data center, and if a back-to-origin fetch is needed, it is made from that nearby data center. Many advanced features (such as Argo Smart Routing) are unavailable, but the foundations are there, and access is not too slow (because the physical distances are short).
For individual webmasters (myself included) whose target audience does include domestic visitors, however, the Free plan is a disaster, precisely because there is no network in China. Under Cloudflare's resource-management and optimization policies, access requests from domestic visitors are usually directed by Cloudflare's Anycast IPs to data centers in the western United States (most often San Jose). The direct result: slow access for domestic users.

Why are domestic Free users assigned to the San Jose data center? There are probably the following reasons:
• 1. Cost management: Cloudflare's Free plan is free, and companies usually allocate more high-quality resources (such as high-quality data centers in Asia) to paying users first. Data centers in the western United States are relatively cheap (the main reason) and are closer to mainland China (close geographical location, shorter submarine optical cables, and lower latency), so they have become a common allocation location for Free users.
• 2. Regional network restrictions: Mainland China has certain restrictions and regulatory requirements on foreign network services. Cloudflare’s traffic in mainland China needs to cooperate with local operators and data centers. This may also cause the traffic of non-paying users to be forwarded preferentially to US data centers, especially western data centers, because of their relatively close distance.
• 3. Network load balancing: Cloudflare's global network distributed traffic scheduling system will select the appropriate data center to handle traffic based on the network load. During peak traffic hours or when a specific area is overloaded, users of the Free plan may be assigned to a more distant server (such as the western United States) to reduce the burden on Asian data centers.
• 4. Partnership: Cloudflare currently partners with JD Cloud in China (previously Baidu Cloud). Although this cooperation provides better access speeds for users in mainland China, it prioritizes enterprise customers who have purchased China network access services, so Free users' traffic still detours through overseas nodes.
In-depth analysis of the reasons why domestic users access websites slowly
In the last section, I mentioned that the reason domestic users are slow when accessing websites built on the Cloudflare Free plan (by individual webmasters) is that they are assigned to a data center in the western United States. You may think this is self-evident: it is far away. That is indeed true, but it is still worth breaking the slowness down into its individual links, so that each one can be optimized in a targeted way later:
Let's look at the complete four-step process, including back-to-origin, when a domestic user's request misses the cache at the San Jose data center:

If we simply estimate the time based on the above scenario, we can see that if a return to the source is required, with the one-way time as "1", the total time is "4".
Note 1: The data center an access user is assigned to determines which data center "initiates" the back-to-origin request if one is needed. This is an iron rule that applies to all Cloudflare plans (including Enterprise); the only difference is that enterprise users get better options from the start (more, and closer, data centers). Keep this in mind.
Note 2: This is just the simplest simulation. In reality, the total time it takes to open a page is determined by how many round trips there are in steps 1 and 2 and how many round trips there are in steps 3 and 4.
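The back-of-the-envelope estimate above can be sketched in a few lines (a toy model with made-up unit costs, not real measurements):

```javascript
// Toy latency model: every leg between two points costs one "one-way" unit.
// Steps 1+2 are visitor <-> edge round trips; steps 3+4 are edge <-> origin.
function totalTime(oneWayUnit, edgeRoundTrips, originRoundTrips) {
  return oneWayUnit * 2 * (edgeRoundTrips + originRoundTrips);
}

console.log(totalTime(1, 1, 1)); // cache miss with back-to-origin: 4
console.log(totalTime(1, 1, 0)); // cache hit at the edge: 2
```

As the note says, a real page load multiplies these costs by however many round trips each step actually takes, which is exactly why reducing steps 3 and 4 pays off.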
Conventional optimization ideas
The so-called conventional optimization idea is to optimize the four-step access process from the previous section: use cache rules or page rules to minimize how often steps 3 and 4 occur, ideally to zero. In short, "store up grain and bide your time": more cache, less back-to-origin.
Static Site
The optimization goal for this type of site is to cache the entire site and never return to the origin (meaning that after the first back-to-origin request, no further ones are needed). You don't have to worry much: just configure Cloudflare's cache rules and cache all required content such as JS, CSS, HTML, PNG, and so on. Subsequent access will not be much slower (the measured speed is certainly not as good as a domestic server's 100 ms, but the actual experience is not much different), and there is no need to worry about DDoS attacks.
You can use the hexo test site I deployed on Cloudflare Pages to experience it (only the default homepage, so make do with it): https://hexo.tangwudi.com.
Dynamic Site
The optimization goal for this type of site is to cache as much as possible and return to the origin as little as possible. However, dynamic sites differ from static sites in that some content cannot be cached. In WordPress, for example, requests for specific URIs (paths starting with "/wp-admin", "/wp-login", "/wp-comment", etc.) and requests carrying specific cookie fields ("wp-", "logged_in", "wordpress", "comment_", "woocommerce_") cannot be cached. This part of the content must be excluded, and everything else cached. The cache rules need to be configured hierarchically, from top to bottom by priority (for detailed configuration, see the article: Home Data Center Series CloudFlare Tutorial (VI) CF Cache Rules Function Introduction and Detailed Configuration Tutorial).
Here is a comparison of results. When my test WordPress site is accessed directly without any cache rules (that is, going through steps 1, 2, 3, and 4), the time from sending a request to receiving the first data from the server (the Cloudflare data center), the TTFB, is 2.74 seconds:

After using the cache rule to cache all cacheable content, the TTFB time becomes 1.14 seconds:

That is a decent result; good enough to be satisfied with.
Note 1: TTFB (Time To First Byte) is the time from sending a page request to receiving the first byte of the response. It includes DNS resolution time, TCP connection time, HTTP request time, and the time for the first byte of the response to arrive. Put simply, it is the time from when we enter the domain name in the browser and press Enter until page content begins to appear. TTFB matters because users are not actually very sensitive to how long the full page takes to render; what they are sensitive to is how long after pressing Enter the page starts to load.
Note 2: The Tiered Caching feature is very important and must be enabled:

Tiered caching greatly improves back-to-origin behavior. Although a domestic user's back-to-origin request must still be initiated from the San Jose data center, with tiered caching enabled San Jose does not go straight to the origin; instead it asks an upper-tier data center closer to the origin (such as one in Asia). If the upper-tier data center does not have the content in its cache, it initiates the back-to-origin request itself, and after receiving the origin's reply it passes the content to San Jose over Cloudflare's internal network. This is faster than San Jose going back to the origin directly over the public Internet. Tiered caching is the only mechanism in Cloudflare's Free plan that can indirectly achieve a "nearest back-to-origin" effect when back-to-origin is needed.
Advanced optimization ideas - worker (suitable for dynamic sites)
The so-called advanced optimization idea is not limited to the traditional cache function of Cloudflare (based on cache rules and page rules), but adopts the worker function, combined with a JS script "Edge Cache HTML" previously released by Cloudflare on GitHub to control the cache function more flexibly and intelligently (for detailed configuration, see the article:Home Data Center Series Cloudflare Tutorial (VII) Introduction to CF Worker Functions and Practical Operation, Verification and Research on Related Technical Principles of Implementing "Beggar Version APO for WordPress" Function to Accelerate Website Access Based on Worker).
Compared with CloudFlare's traditional caching function, the worker-based optimization method is more suitable for dynamic websites:
- Fine-grained cache control
• Worker optimization lets you customize caching strategies by request type, URL, user status, and so on. For example, you can cache full pages for visitors who are not logged in while dynamically generating the personalized parts for logged-in users. Traditional cache rules cannot offer this flexibility.
• You can define a custom cache key to decide which requests share a cache entry and which are cached independently, thereby improving the cache hit rate.
- Partial caching of dynamic content
• Workers let you cache the static parts of dynamic pages. For example, the body of a blog post can be cached while the personalized parts for logged-in users are fetched from the origin server. This partial caching can significantly reduce server load and improve response speed.
- Higher cache hit rate
• Workers can improve cache hit rates, especially on multi-user sites with complex content. By carefully controlling the cache strategy, you can maximize cache utilization and reduce unnecessary back-to-origin requests.
- Avoid excessive back-to-origin traffic
• Workers can set different cache TTLs for different resources, or implement a stale-while-revalidate pattern: when a user makes a request, the old cached copy is returned immediately while new content is fetched asynchronously from the origin to update the cache, without affecting the user experience.
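As one concrete example of a custom cache key, a Worker can normalize the request URL before using it as the key, so that requests differing only in tracking parameters share one cache entry (the parameter names below are common examples I chose, not a Cloudflare-defined list):

```javascript
// Hypothetical cache-key normalizer: drop tracking params and sort the rest,
// so equivalent URLs map to the same cache entry and the hit rate goes up.
const TRACKING_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign', 'fbclid'];

function customCacheKey(urlString) {
  const url = new URL(urlString);
  for (const param of TRACKING_PARAMS) url.searchParams.delete(param);
  url.searchParams.sort(); // order-independent key
  return url.toString();
}

console.log(customCacheKey('https://example.com/post?utm_source=x&b=2&a=1'));
// → https://example.com/post?a=1&b=2
```

Inside a Worker, the returned string would be used as the key when reading and writing the edge cache, instead of the raw request URL.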
If your website has a lot of dynamic content and user interaction, the Worker optimization method will indeed bring significant performance improvements. For example, after using the worker optimization method on my test WordPress site, the TTFB was further reduced from 1.14 seconds with the traditional cache method to 332.52 milliseconds:

Note 1: Don’t forget to enable the layered cache function.
Note 2: For websites that mainly provide static content, traditional caching methods can achieve good results if configured properly, so there is no need to bother with it.
Additional knowledge: Argo Smart Routing
What is Argo Smart Routing?

This feature actually has little to do with individual webmasters like us who mainly freeload (it is a paid feature, and an expensive one), but since it is a legitimate network-level optimization, I think it is worth mentioning.
There is no public abbreviation for the "Argo" in "Argo Smart Routing", but it can be understood as a symbolic name that represents a solution for optimizing network paths. The name "Argo" may be borrowed from the "Argonauts" in Greek mythology, who were heroes who sailed on a ship called "Argo". This corresponds to the core concept of Argo Smart Routing - quickly and intelligently transmitting data in the global network through the optimal path.
So, what is “Argo Smart Routing”?
Argo Smart Routing is an advanced network optimization service provided by Cloudflare that optimizes Internet traffic by:
• Intelligent routing: It dynamically selects the fastest and most reliable path based on real-time network conditions through Cloudflare's private network, rather than using traditional BGP protocol-based routing selection.
• Reduce latency and packet loss: By avoiding congested or poorly performing network paths, Argo is able to significantly reduce latency and improve the reliability of content delivery.
• Reduce back-to-source time: Especially for back-to-origin requests, Argo can find a faster path, thereby reducing the time for request back-to-origin and improving the overall response speed of the website.
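Stripped to its essence, the routing decision amounts to picking the healthiest of the candidate paths; a toy version (with invented routes and latencies) might look like:

```javascript
// Toy path selection: among candidate routes, prefer the one with the lowest
// measured latency. Real Argo also weighs packet loss, congestion, etc.
function pickFastestPath(paths) {
  return paths.reduce((best, p) => (p.latencyMs < best.latencyMs ? p : best));
}

const candidates = [
  { route: 'China → Southeast Asia → Pacific cable → San Jose', latencyMs: 210 },
  { route: 'China → Hong Kong → Pacific cable → San Jose', latencyMs: 165 },
];
console.log(pickFastestPath(candidates).route); // the Hong Kong path
```

The real system re-evaluates continuously over Cloudflare's private network, so the chosen path can change as conditions change.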
The optimization principle of "Argo Smart Routing"
Let's take the access process in this picture as an example:

1. Intelligent routing selection (taking domestic users -> San Jose data center as an example)
Without Argo Smart Routing:
When domestic users visit, their requests are directed by Cloudflare's Anycast IP to the San Jose data center. Without Argo, requests follow the default routing, which may take a path like "China → Southeast Asia → Pacific cable → San Jose". In some cases this path may be congested or high-latency, degrading the access experience.
Using Argo Smart Routing:
Argo Smart Routing dynamically probes the network conditions of different paths and automatically selects the one with lower latency and less congestion. When Argo detects that the default path is delayed or congested, it can switch to a faster one, such as "China → Hong Kong → Pacific cable → San Jose". Domestic users are still assigned to the San Jose data center, but the transmission path has been optimized, reducing round-trip time.
2. Reduce latency and packet loss (taking San Jose data center -> domestic users as an example)
Without Argo Smart Routing:
When transmitting data from the San Jose data center back to China, traditional routing may cause congestion on certain paths or instability of network nodes, which in turn causes packet loss. For example, when a data packet passes through a congested link (such as a submarine cable line), users may experience data loss, slow page loading, or even need to retransmit the data packet multiple times, resulting in higher latency.
Using Argo Smart Routing:
"Argo Smart Routing" can monitor the health of different network nodes to avoid links with high packet loss rates or unstable networks. When Argo detects packet loss or high latency on a path from "San Jose → China", it will automatically select a more reliable path, such as "San Jose → Singapore → China" or "San Jose → Hong Kong → China". Through this intelligent adjustment, the data packets received by users are more stable, packet loss is reduced, and the access experience is ultimately improved.
3. Reduce the time to return to the source (taking the San Jose data center to the source server in China as an example)
Without Argo Smart Routing:
If the San Jose data center does not cache the content requested by domestic users, it will initiate a back-to-origin request to the source station located in China. Without Argo, this back-to-origin request will use the default Internet routing selection method, which may go through the following paths: "San Jose → Southeast Asia → China" or "San Jose → Pacific Cable → China". These paths may be long or have congestion problems, resulting in increased back-to-origin time.
Using Argo Smart Routing:
"Argo Smart Routing" can select the best back-to-source path from the San Jose data center to the domestic source station based on real-time network conditions. When Argo detects that a node in the back-to-source path is congested, it will bypass the congested node and choose a faster path, such as: "San Jose → Hong Kong → Beijing", or "San Jose → Singapore → Beijing". Through this intelligent routing selection, the back-to-source path is optimized and the time to obtain data from the source server is greatly reduced.
It can be seen that although steps 1, 2, 3, and 4 still exist, they are much more flexible and efficient from the perspective of network routing.
Note 1: "Argo Smart Routing" can be regarded as a panacea: it is effective for any type of application, not just HTTP ones. That is the beauty of "network-layer optimization" (as opposed to application-layer optimization).
Note 2: It also needs to be used together with the tiered cache option.
Summary of cloudflare website optimization concept
From the various website optimization methods provided by Cloudflare, plus the default nearest access strategy, we can summarize Cloudflare's three optimization concepts in optimizing website access:
- 1. Proximity principle
This proximity principle includes allocating access users to the nearest Cloudflare data center (so Cloudflare needs to build a large number of data centers around the world, except for a few countries), and also includes the data center returning to the source server nearby.
Note: On nearest back-to-origin: besides enabling the "tiered cache" option, back-to-origin through a tunnel can sometimes beat back-to-origin via a public IP address. The tunnel establishes a private path from the Cloudflare data center to the origin, which may bypass congestion and latency on public routes. The actual improvement depends on network conditions, the origin's physical location, and the distribution of Cloudflare data centers; if the public connection between the origin and the Cloudflare data center is already good, the difference may not be significant.
In theory, Free plan users using Cloudflare Tunnel "may" be assigned to any Cloudflare data center, including data centers in Asia. Cloudflare will automatically connect the tunnel to the data center it considers optimal based on network conditions, traffic load, and other factors. Therefore, if you are in Asia or most of your users are in Asia, Cloudflare may connect the tunnel to the Asian data center closest to you.
However, for Free plan users' tunnels, Cloudflare's global network resource allocation may be subject to certain restrictions. Compared with paid plans, Free users' tunnels are more likely to be allocated to distant data centers during peak traffic hours. For example, even if you are geographically closer to a data center in Asia, Cloudflare may still allocate you to a distant location, such as a data center in the United States, due to load or other reasons.
Therefore, under the Free plan it is "not completely guaranteed" that a Free user's tunnel will connect to an Asian data center, but it is possible (something that public-IP back-to-origin can never achieve~), depending on Cloudflare's dynamic network allocation.
Take the earlier example again: a domestic visitor is assigned to the San Jose data center. If San Jose's cache does not have the content, it requests it from the upper-tier data center (in Asia) per the tiered cache mechanism. If the upper-tier data center finds nothing in its own cache either, then with public-IP back-to-origin it initiates the request directly over the public Internet; with tunnel back-to-origin, the tunnel takes over this step, going back to the origin over a secure dedicated path that bypasses the public Internet, improving security and reducing network instability. In this sense, tunnel back-to-origin is roughly equivalent to "public-IP back-to-origin" plus a free, rough-and-ready "Argo Smart Routing".
However, the tunnel method is not a cure-all. At present, the most reliable way to host a website for China is still a domestic cloud host. For hosting on home broadband, it depends on how your local broadband operator treats Cloudflare's IP ranges: in some places those routes are throttled, which can make the tunnel itself unstable. Most broadband connections should be fine, though.
- 2. Route Optimization
After getting close, there is still a way to go. For the remaining road that must be taken, we will do our best to optimize it to make it easier to travel (Argo Smart Routing).
- 3. Use cache as much as possible
No matter how easy the road is to walk on, your feet will still hurt if you walk too much, so it is best to walk less. For example, in the past you had to get up early in the morning to go to the market in town to buy vegetables and meat, but now you only need to buy them at the farmers' market at the entrance of the village.
According to Cloudflare, points 1 and 2 are its business, and point 3 is what they want us to do ourselves.
Folk remedies: "self-selected IP", "preferred IP", "preferred domain name"
The origin of "self-selected IP", "preferred IP", and "preferred domain name": when users in mainland China access websites built by Free plan users via the resolved Anycast IP, they are assigned to the San Jose data center in the western United States, resulting in slow access. So these Free plan users resolve the domain names of some Cloudflare enterprise customers (such as www.visa.com) from a domestic IP to obtain their Anycast IPs, then use specific software to test those IPs (or simply test Cloudflare's announced IP ranges), find the IP with no packet loss and the lowest latency, and finally manually point their own website's DNS resolution at that IP.
Actually, the idea itself is reasonable: it tries to solve the problems of steps 1 and 2 in the four-step process above, and to manually implement the first of Cloudflare's optimization concepts, the proximity principle. But although this method may seem to "temporarily" improve access speed for domestic users in certain cases, it is neither reliable nor recommended in the long run (after I manually changed the IP, I had to visit my blog every so often to check whether the speed was still normal; in the end I got annoyed and switched back to the default resolution). There are several reasons:
1. The design of Cloudflare Anycast mechanism itself
• Cloudflare's Anycast IP allocation mechanism is global and relies on Dynamic Routing, which is automatically optimized and allocated by Cloudflare based on network conditions and geographic location. Although different Anycast IP segments correspond to different data centers, it does not simply determine the route based on the IP address, but dynamically determines it based on network topology and real-time factors such as latency and congestion.
• Even if you manually specify an IP address, Cloudflare may still dynamically adjust traffic paths based on global network conditions, so your manual modifications may not always maintain effective optimization results.
2. Violation of Cloudflare’s best practices and service design
• Manually changing the Anycast IP is not in line with Cloudflare's best practices. Cloudflare is designed to let users use its DNS service and intelligently assign the best path and IP through its network.
• Forcibly binding to a specific IP may cause the traffic to be identified as abnormal traffic by Cloudflare, affecting the stability of access or causing other unexpected problems, such as SSL certificate verification issues, performance fluctuations, etc.
3. Changes and uncontrollability of IP addresses
• Cloudflare regularly adjusts IP ranges across regions, and a specific Anycast IP range may be reallocated or rerouted to another region. If you manually pin an IP, it may end up routed to a different region after a while, making access speed unstable (to reuse the Doraemon "Anywhere Door" analogy from the earlier Anycast IP section: you push the door open and see Chengdu outside, then close it and assume Chengdu is always outside, when in fact the next moment it is Chongqing. A bit like Schrödinger's cat: you don't know the result until you actually open the door).
• In addition, Cloudflare provides different levels of services for different customers (such as Free and Enterprise plans). Manual resolution to enterprise customers' IP segments may lead to potential access restrictions or instability, because the allocation and resource scheduling of these IP segments are optimized for enterprise-level customers. The Free plan may not enjoy the same priority:

Besides, do you really think Cloudflare is unaware that Free plan users manually point their DNS resolution at these IPs, and has taken no precautions? It only needs SNI-based monitoring on those IP ranges: requests for enterprise-plan customers' domains are handled normally, and everything else falls back to the default policy. Cloudflare can do whatever it wants here.
4. CDN’s intelligent optimization function is bypassed
• Cloudflare’s free plan doesn’t have all the optimizations of the enterprise level, but it still takes advantage of Cloudflare’s global network and caching system. Manually binding an IP will bypass Cloudflare’s intelligent optimization and routing, and may cause you to lose some optimization effects, especially when network conditions change.
5. Potential Legal and Terms of Service Issues
• This practice may violate Cloudflare's terms of service, especially if you use unofficial means to obtain the IP addresses of specific corporate customers for manual resolution. Cloudflare may take action against such behavior and even terminate the service.
In fact, the reasons above are just the official-sounding ones. The real reason I don't recommend it is that it is simply too much trouble. If you have the energy to fiddle with IPs repeatedly, first ask yourself: is the cache strategy configured properly? Is everything cacheable actually cached? For a WordPress-type dynamic site, have you tried the Worker optimization method? After configuration, have back-to-origin requests been reduced as much as possible? The factors that hurt website experience most are an overly long TTFB and frequent back-to-origin requests. With those taken care of, a WordPress-type personal blog can reach a TTFB of a few hundred milliseconds using the Worker method, or a bit over a second using traditional caching. Is that not enough to support normal visits to a personal blog?
In a word: Even if we are just freeloaders, we must do so with quality and dignity.
Summary
Finally finished writing; so tired. I only wanted to express the meaning of the last paragraph, but found it would be unfounded on its own: to be convincing I first needed to explain Cloudflare's optimization concepts; then I realized that to explain those, I needed to explain how Cloudflare's Anycast IP works by comparing it with traditional CDNs; then explaining traditional CDNs dragged in the intelligent DNS scheduling of global load balancing; and finally, to explain that clearly, I needed to cover the basic workings of DNS first... and before I knew it I had written this much. I am truly speechless...
But it was still very rewarding. I had only a vague understanding of several concepts before (including such an important layered caching function, which I had mentioned before and recommended to be enabled, but I didn’t pay too much attention to), and there were even some fallacies (I previously believed that Argo Smart Routing would change the data center that initiates the back-to-origin request). However, in order to write this article, I figured it out completely (it is really important to concretize simple ideas and turn them into text). Therefore, this article can also serve as a higher-level summary and conclusion of all the cloudflare-related content I wrote before.
Note: If you have any questions about the concepts in this article, you can search the cloudflare learning map and you may find the answer.
ヾ(≧∇≦*)ゝ