Contents
- 1 Introduction
- 2 Background knowledge: Access difficulties in real network environment
- 3 Direct connection mode: a scientific solution for direct exposure of the source station
- 4 Non-direct connection mode: hiding the source station IP with the help of transit entrance
- 4.1 Why do we need “indirect connection”?
- 4.2 The key to achieving “non-direct connection”: transit entrance
- 4.3 Prerequisite knowledge: How to implement Cloudflare's reverse proxy function
- 4.4 The Cost and Limitations of “Indirect Connection”
- 4.5 Are “non-directly connected” science nodes necessarily slower than “directly connected” science nodes?
- 4.6 Cloudflare Tunnel and Compliance Issues for Scientific Use
- 5 Construction of scientific nodes based on sing-box
- 6 Conclusion
1 Introduction
Since purchasing Racknerd's VPS and completing the big project of moving my disaster-recovery node from a Tencent Cloud lightweight server to Racknerd's Chicago VPS (see the article "The second reconstruction of the home data center series blog architecture: service migration and active-active disaster recovery practice caused by VPS relocation"; a data center relocation really is a big project), I have been thinking about how to make full use of this "expensive" ($39.88/year) Chicago node (Racknerd_High Configuration 1_882). After all, its resources are still largely idle:

On second thought, beyond the hardware resources, the Chicago node also comes with up to 8,500 GB of monthly traffic. I currently use it mainly as the primary back-to-origin server behind Cloudflare for my core application (this blog), which consumes very little traffic, so month after month most of that allowance is simply wasted.
Since the goal is to make full use of that traffic, building my own scientific Internet node is the first choice. But because I had never owned an overseas VPS before and don't know much about this area, I need to sort out the relevant knowledge first.
2 Background knowledge: Access difficulties in real network environment
In today's network environment, connections are not always truly "connected": some websites are online, but their pages will not open; some software is configured correctly by default, but cannot work normally; some domain names cannot be resolved correctly through public DNS; the browser's built-in encrypted DNS (such as DoH or DoQ) cannot be enabled... Such problems occur from time to time, and the cause is often not a server failure but a transparent "wall" sitting in the communication path.
In some parts of the network, for purposes such as content compliance, information supervision, and data auditing, operators target traffic bound for overseas destinations. This relies on a mechanism called Deep Packet Inspection (DPI), which does not need to decrypt the content itself; instead it analyzes the handshake metadata of encryption protocols (such as TLS's SNI, JA3 fingerprints, and the structure of QUIC's initial packets) to characterize the traffic, then dynamically decides whether to let it pass based on the match results. All of these interventions come from that transparent "wall".
It is worth noting that this kind of intervention does not target a specific protocol, service, or application; it is based on identifying and blocking "communication behavior patterns". Such a strategy is inherently uncertain, and different egress links, updated policies at path nodes, and changes in how services are deployed often produce "intermittent connectivity" and "partial service anomalies". As a result, what we face is no longer a stable transmission model, but a real environment full of variables and dynamic games.
A simple example: you enable the browser's "Secure DNS" feature (such as Cloudflare's or Google's DoH), which is meant to improve the privacy and integrity of resolution. Yet the connection may be reset by an intermediate node during the TLS handshake, so the "domain name cannot be resolved" or even the "webpage cannot be opened". Likewise, some users hit occasional connection anomalies when visiting developer community sites (GitHub, Docker Hub, Stack Overflow, etc.). These phenomena are hard to troubleshoot and follow no obvious pattern.
For ordinary users, this may just be "a website that occasionally won't open"; they can switch content and sidestep the problem. But for technical users, cross-border workers, and developers of self-hosted services, the challenge behind it runs deeper: unpredictable connections and structural uncertainty in communication. In this context, anyone who still relies on the traditional "direct access" model will find communicating with overseas servers difficult.
How to solve this problem? The most common way is scientific Internet access: building a "communication tunnel" that bypasses the interference path, thereby avoiding interference from the transparent "wall" when accessing overseas content.
At present, approaches to scientific Internet access fall roughly into two categories:
The first category is to use paid services provided by third parties.
The characteristic of this type of service is that it is “out-of-the-box”: users only need to purchase an account, subscribe to the configuration, and connect to overseas nodes with one click, without having to worry about technical details such as the underlying protocol, back-to-source mechanism, or traffic diversion. It is almost the most hassle-free solution, with a low threshold and intuitive experience, and is very popular among novice users. But at the same time, this solution also has certain risks:
- It is not uncommon for some cheap service providers to "run away", and the security of user assets and data cannot be guaranteed;
- It would be fine if they merely "ran away"; if you instead run into an officially orchestrated "phishing" provider, hehe;
- Some high-end services are stable but expensive, and the long-term cost of use is high;
The second type is to build scientific Internet nodes by yourself.
The core of this approach is that you own a controllable overseas VPS, deploy the server-side program yourself (such as sing-box, xray, or trojan-go), and manage the connection protocols and policies yourself. The advantages of this path are obvious:
- You can freely choose the protocol type, encryption method, and transmission path, avoiding the traffic signatures of mainstream services;
- You can deploy multiple nodes on demand and integrate transit relays, tunnel/CDN access, and so on, gaining stronger blocking resistance and flexibility;
- Costs are more controllable: the monthly traffic included with a typical VPS is enough for multiple users.
But its disadvantages are equally obvious:
- You need to have basic network knowledge and understand VPS management, domain name resolution, certificate application and other related operations;
- You need to track the transparent "wall"'s policy updates yourself and dynamically adjust deployment methods, protocol configurations, and node architecture;
- You also need to handle occasional IP blocking, certificate revocation, domain-name pollution and similar issues, and have some troubleshooting ability.
Therefore, the choice between building your own scientific service node and buying a third-party scientific service often comes down to the user's balance among stability, security, cost, and control. The focus of this article is to help users who have some hands-on ability, and who hope to build a more reliable communication environment, understand and master the core principles and evolution of "self-built scientific nodes".
This article does not cover third-party scientific services; it only discusses how to use an existing overseas VPS to build scientific service nodes for study and work. In general, there are two broad options for self-built nodes, direct connection and non-direct connection, and each has its own advantages and disadvantages.
3 Direct connection mode: a scientific solution for direct exposure of the source station
3.1 Passing through the transparent "wall": the inherent problem of direct connection mode
In the context of scientific access, the so-called "direct connection mode" does not mean that users reach the target website directly. Rather, the user establishes a communication link to a controllable, trusted overseas VPS of their own and forwards subsequent access requests to that VPS, which accesses the targets on their behalf. This breaks down into two key stages:
- Connection Phase: Connect directly from local devices to overseas VPSs to establish a communication channel that passes through a transparent "wall";
- Forwarding Phase: Send access requests to target websites and services to overseas VPS, which will make requests on your behalf and return the response content.
The second stage is easy to achieve; the first, the key "connection stage", is hard: how do you successfully "pass through" the transparent "wall"?
3.2 Connection phase: Can I connect to the VPS?
In the domestic network environment, a client that wants to connect directly to an overseas VPS (usually a public IP at a cloud provider) must traverse an "unpredictable" outbound path. On this path, an invisible mechanism screens, judges, and processes cross-border traffic in real time: What service are you connecting to? Does the protocol look like HTTPS? Has this link been "abused" recently? Have any established rules been triggered?
This mechanism is invisible on the surface, but it runs on key nodes such as backbone egress routers and operator edge gateways, relying mainly on the following means:
- Deep Packet Inspection (DPI): Analyze the SNI field, JA3 fingerprint, ALPN protocol flag, etc. during the TLS handshake phase to identify the server type and communication intent;
- Behavioral pattern matching: Build a feature library of the behaviors of non-standard protocols (such as proxies disguised as HTTPS, QUIC, or DoH) and perform targeted blocking;
- Path pressure monitoring: If non-browser connection requests frequently appear at a certain exit, it may also be dynamically limited, the connection reset, or the packet dropped directly.
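To see why DPI can read the SNI without any decryption, here is a minimal Python sketch using only the standard library (the hostname `blocked.example.com` is a placeholder). It generates a real TLS ClientHello entirely in memory and shows the server name sitting inside it as plaintext:

```python
import ssl

# Build a client-side TLS object backed by in-memory BIOs (no network needed).
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="blocked.example.com")

# Driving the handshake writes the ClientHello into the outgoing BIO,
# then raises SSLWantReadError because no ServerHello has arrived yet.
try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    pass

client_hello = outgoing.read()
print(client_hello[:1])                          # b'\x16': a TLS handshake record
print(b"blocked.example.com" in client_hello)    # the SNI hostname is unencrypted
```

Anything sitting on the path sees exactly these bytes, which is all a DPI box needs to match a hostname against a blocklist before any encryption takes effect.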
This means: even if your VPS service listens on port 443 and wraps the communication in TLS, a successful connection is not guaranteed; and even if you can connect, stability is not guaranteed.
Protocols and ports: "morphological characteristics" that affect the direct-connection success rate
When facing policy identification by middleboxes, the communication protocol and the port combination used directly affect whether packets pass smoothly. The transparent "wall" never states "who is blocked", but it is more sensitive to certain behavioral characteristics and more likely to interfere with them. Below are the rough "release priorities" in the current network environment:
- TCP port 443 (HTTPS)
This is the mainstream port for modern encrypted communication: browser access, system updates, and mobile-app background traffic all flow through it. To avoid collateral damage to compliant communication, policy systems tend to be more lenient toward traffic on 443, giving it the highest "release priority". Many scientific protocols therefore deliberately "borrow" port 443 to improve their disguise.
- TCP port 80 (HTTP)
Although a plaintext protocol, it is still widely used by services with low security requirements, so its ubiquity gives it a certain "basis for release". But precisely because it is plaintext, it can be fully parsed: suitable for obfuscation, not for carrying sensitive traffic itself.
- UDP as a whole sits in a "highly sensitive zone"
Unlike TCP, UDP is connectionless and stateless, naturally suited to high-speed transmission and real-time communication, but also more prone to abuse. Policy systems usually treat UDP more aggressively by default. Common manifestations:
  - QUIC (HTTP/3): explicitly UDP-based, often used to improve performance when accessing services such as Google and YouTube. Its initial handshake is highly distinctive, so it often fails to connect or is downgraded back to TCP.
  - WireGuard / OpenVPN (UDP mode): once the mainstay of self-built scientific nodes, but their fixed handshakes, obvious features, and unique fingerprints are now well known to DPI systems, leading to frequent disconnects, high latency, and poor stability.
  - Emerging high-performance tunneling protocols such as Hysteria / TUIC: although carried over UDP, their design emphasizes resistance to identification and blocking, making them somewhat more interference-resistant; the actual deployment method and path conditions still matter.
- "Mixed protocols": TCP protocols with optional UDP forwarding
Some tools (such as Trojan-Go) are primarily TCP-based but can optionally enable UDP forwarding. Policy systems treat the two modes differently: relatively safe with UDP disabled, but potentially drawing more interference once enabled.
- DoH (DNS over HTTPS)
Effectively a variant of DNS queries encapsulated in HTTPS over TCP port 443. Because its behavior is "atypical HTTPS" (extremely short handshakes, tiny packet exchanges), it may still be judged abnormal communication and blocked on some paths.
Therefore, do not judge by protocol or tool name alone; pay attention to the mode of operation, the port used, whether UDP is involved, and whether there are distinctive handshake characteristics. What really affects the success rate of crossing is not "which protocol you use", but whether your protocol's behavior looks "normal enough", that is, whether the transparent "wall" classifies it as "traffic that should be released".
3.3 Conventional traffic camouflage solutions may not be reliable
To pass the first stage, many "protocol camouflage" solutions have appeared. Their core idea is to make your traffic look like an ordinary person opening a web page (HTTP + TLS). Here is a comparison of the camouflage approaches of different protocols:
Protocol | Camouflage ability | Ease of identification | Update activity | Practical security |
---|---|---|---|---|
Trojan | Relies on TLS + HTTP masquerading | High (clear fingerprint) | Very low | Medium-low |
VLESS | HTTP/WS masquerading | Medium-high | High (sing-box active) | Medium |
Shadowsocks | No disguise | Very high | Maintenance stopped | Low |
VMess | XTLS/WS | Medium-high | Maintenance stopped | Medium-low |
Hysteria | QUIC obfuscation | Medium | Medium-high | Medium |
These solutions did bring some breakthroughs when they first appeared, but the subsequent shortcomings are also very real:
- Traffic characteristics are static: protocol fingerprints are gradually worked out, especially for protocols such as Trojan, which has gone years without updates and has distinctive features;
- DNS pollution, TLS reset, active detection (such as detecting packet return behavior) and other interference methods are frequently used;
- Even if TLS is enabled, SNI (Server Name Indication) is still in plain text and can be exploited;
- Most camouflage strategies rely on the HTTP Host header or traffic padding, which are increasingly fragile against sophisticated DPI;
- The source station's IP and service port are exposed long-term and easy to detect and block.
3.4 Don’t get me wrong, connecting to a VPS ≠ successfully surfing the Internet
It must be emphasized that connecting to an overseas VPS is only the starting point of scientific access. The core of scientific Internet access is whether you can, through the VPS, successfully reach restricted overseas resources and bring the responses back safely.
This process is also subject to various interferences:
- Is the VPS's own path to external resources reliable? For example, is its access to Google, YouTube, and GitHub also restricted?
- Did the packets get reset midway for "abnormal traffic characteristics"? For example, you are merely opening an ordinary web page, but during loading the connection suddenly transfers a large amount of data and occupies bandwidth for a long time. This mismatch between behavior and expectation may be treated as "suspicious communication" by the recognition system, which then forcibly terminates the connection with an RST.
- Is the overall link stable and concealed enough? For example, some people have built nodes on Hysteria or WireGuard that perform very well in speed tests but frequently disconnect in actual use, or even have connections reset outright. This is often not a problem with the VPS itself; rather, the link is detected as abnormal while crossing certain key routing segments and is strategically interfered with: abnormal handshake characteristics, sudden changes in packet patterns, and "unnatural" traffic behavior can all become triggers.
This is the inherent limitation of direct connection mode: it relies too heavily on "direct communication" and exposes the target too easily.
Note 1: Building scientific services on "direct connection" carries an implicit requirement: low latency when accessing the VPS directly. The node's physical location therefore becomes particularly important: you must weigh not only bandwidth, price, and region, but also the quality of the link from your location. Otherwise, even with the node set up, clients will get a poor experience from high latency or severe packet loss. In addition, "direct connection" means the server's real IP is exposed directly on the public network. If that IP sits on a network with poor interference resistance, belongs to a relatively small operator, or exhibits an overly "obvious" behavior pattern, it is easily identified and blocked with precision, which is why a VPS on a "premium line" is usually recommended. This is one reason many people begin to consider "non-direct connection": it is not that direct connection is bad, but that "direct exposure" is no longer stable or durable enough in today's environment.
Note 2: Although "direct connection + TLS" provides a degree of encryption, in the current detection environment that is often not enough. We can therefore add a reverse-proxy middle layer such as Nginx, Caddy, or HAProxy on the VPS to "wrap" the node in a layer of normal Web traffic, such as hosting a blog, a CDN cache, or a static page response, so the server looks more like a normal HTTPS website. This kind of multi-layer camouflage cannot completely avoid identification, but it significantly improves fault tolerance and stability under passive interference.
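As a sketch of this "wrap a normal website around the node" idea (the domain, certificate paths, WebSocket path, and backend port below are all hypothetical), an Nginx server block can serve a real static site on 443 while quietly proxying one path to a local proxy backend:

```nginx
server {
    listen 443 ssl;
    server_name www.example.com;                # hypothetical domain

    ssl_certificate     /etc/ssl/example/fullchain.pem;
    ssl_certificate_key /etc/ssl/example/privkey.pem;

    # Decoy: a normal-looking static website for casual visitors and probes.
    root /var/www/blog;
    index index.html;

    # Camouflaged entrance: only this path is forwarded to the proxy backend.
    location /ws-entry {                        # hypothetical path
        proxy_pass http://127.0.0.1:8080;       # local proxy service listening on loopback
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

To any scanner, this server answers like an ordinary HTTPS blog; only a client that knows the exact path reaches the proxy behind it.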
Note 3: Of course, this does not mean the "direct connection" mode is unusable; a considerable proportion of users still build their own nodes this way, relying on its simple configuration and low latency, and in periods or regions where nodes are not yet restricted it remains the preferred solution for many. From a trend perspective, however, its difficulties will only grow: on the one hand, operators' and platforms' identification of encrypted traffic keeps evolving, so simple direct connections are more easily detected, throttled, or banned; on the other hand, more and more services need a better balance among censorship resistance, stability, and concealment, which means more complex architectures, such as transit relays, tunnel encapsulation, and remote traffic splitting, will gradually become mainstream.
4 Non-direct connection mode: hiding the source station IP with the help of transit entrance
4.1 Why do we need “indirect connection”?
In the early stages of "scientific Internet access", direct connection mode looked simple and direct: connect the client to an overseas VPS and forward requests through a controlled channel to reach the outside Internet. But this seemingly open path is actually extremely fragile: once the overseas VPS's IP address is exposed in the domestic network, it may become the focus of DPI (deep packet inspection) systems, behavior-recognition models, and keyword-monitoring systems. Once marked, the connection suffers the following problems:
- Traffic is throttled, interference is injected, or the connection is reset outright;
- The VPS IP is added to a block list and the connection is severed entirely;
- Cross-border communication "dies intermittently", with no clear error message, making the root cause hard to locate.
The core of the problem: the transparent "wall" does not target a specific service or site; it attempts to identify "abnormal connection behavior". When an overseas VPS is accessed by dozens of clients in the same way, with fixed traffic characteristics and abnormal access frequency, its "identity" is no longer that of an ordinary web server but of a suspicious communication relay.
In this context, continuing to adhere to the "direct connection" strategy means:
- Keep changing VPS;
- Be prepared for unstable connections at any time;
- Every user bears the high risk of connection failure and traffic being blocked.
The more fundamental problem is: if the user's requests always point directly at the real address of the target service, then the server can never be "hidden", and the identification system will always have room to act.
4.2 The key to achieving “non-direct connection”: transit entrance
The real significance of non-direct connection mode is to keep the "source station" IP off the public network, which requires a transit entrance. A true "non-direct connection" link looks like this: domestic client ──connect──> transit entrance ──forward──> real VPS / origin service. Under this design, the transparent "wall" only ever sees the domestic client accessing the transit entrance; it knows nothing about the origin server behind it.
How to find the “transit entrance”?
To be able to serve as a "transit entrance" in today's network environment, it is not enough to just find a random machine. It must meet the following key conditions:
- The IP will not be blocked lightly: ideally an IP from a large company with a "credible background", one that will not be blocked the moment a little "atypical traffic" appears;
- A certain degree of neutrality: for example, CDN and distributed network platforms, toward which the firewall is naturally more "gentle" and less quick to act;
- Support for custom traffic forwarding: for example, the ability to establish reverse tunnels, WebSocket forwarding, or direct TCP/QUIC forwarding.
Taken together, these conditions rule out most personal servers and cheap commercial VPSs: you don't want a springboard that "anyone can deploy", but a "camouflage point" that won't easily go wrong.
In practice, Cloudflare is the preferred transit entrance: it happens to have all of the above characteristics, which makes it a transit point with great potential:
- IP is stable and highly reputable:Its network nodes widely serve millions of websites around the world and are the "infrastructure" of the modern Web world;
- Natural CDN Platform:The Great Firewall is more restrained in its handling of CDN policies, often giving priority to ensuring “availability”;
- Support for flexible reverse-proxy logic: public-network traffic can be relayed back to your real service through its own edge network.
4.3 Prerequisite knowledge: How to implement Cloudflare's reverse proxy function
When using Cloudflare to build a "non-direct connection" transit entrance, what really does the work is not its traditional CDN function but its global edge network coverage and highly flexible reverse-proxy capability, so you first need to understand the reverse-proxy modes it provides.
In terms of overall mechanism, Cloudflare offers two typical reverse-proxy modes:
- Origin pull via public IP
In this mode, Cloudflare acts as an edge proxy: it forwards client requests to the public server IP configured in your DNS records and pulls content back from the origin. Deployment is simple, but it exposes a fatal problem: your real server IP must be visible on the public network, which is exactly what "non-direct connection" is meant to avoid.
- Cloudflare Tunnel (reverse proxy via intranet penetration)
This is the method we focus on: you start the Tunnel on your server, and it actively initiates an outbound connection to Cloudflare, creating a back channel through which Cloudflare can securely forward user requests to services on intranet or non-public IPs. The origin IP never needs to be exposed, and no public port needs to be opened.
From the perspective of concealment, security, and survivability, the Tunnel mode is obviously more in line with the design goals of "non-direct connection":
- The origin server IP is never exposed to DNS, scanners, or direct connection tests;
- The entire link is built from the inside out, avoiding the interference and blockade faced by traditional connection methods;
- Even if the service is deployed in home broadband, NAT intranet or even IPv6-only environment, it can be easily connected.
Therefore, in the current complex network environment, Cloudflare Tunnel has become one of the mainstream implementation methods of the "non-direct connection" mode, and is particularly suitable for use with self-built proxy services.
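As a concrete sketch of this pattern (the tunnel UUID, hostname, and port below are placeholders), a minimal cloudflared `config.yml` maps one public hostname to a service listening only on loopback; since the daemon dials out to Cloudflare, no inbound port is ever opened:

```yaml
# /etc/cloudflared/config.yml -- minimal Tunnel ingress sketch
tunnel: 6ff42ae2-765d-4adf-8112-31c55c1551ef      # placeholder tunnel UUID
credentials-file: /etc/cloudflared/6ff42ae2-765d-4adf-8112-31c55c1551ef.json

ingress:
  # Public hostname (proxied through Cloudflare) -> local service on loopback.
  - hostname: entry.example.com
    service: http://127.0.0.1:8080
  # Required catch-all rule: everything else gets a 404.
  - service: http_status:404
```

Running `cloudflared tunnel run` with this file establishes the outbound back channel; the origin's real IP never appears in DNS or in any listening socket reachable from the public Internet.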
4.4 The Cost and Limitations of “Indirect Connection”
In the previous section we mentioned that Cloudflare Tunnel has become one of the mainstream implementations of "non-direct connection" mode: it hides the real source station's IP and exposes only Cloudflare's edge nodes, which is what allows many self-built solutions to keep surviving in today's environment.
Unfortunately, if you try to connect an existing proxy service to Cloudflare Tunnel seamlessly, you will find that not every protocol works. Tunnel is not a universal reverse proxy; it supports only a small set of protocol stacks that can be "successfully disguised". This is because Cloudflare Tunnel is designed to serve Web applications: by default it expects you to provide a standard HTTP or HTTPS service, which also prevents abuse. This means only proxy protocols that "look like HTTP services" can be carried through Tunnel. Typical representatives are Trojan and VLESS: they already emphasize "masquerading as browser access" via TLS, WebSocket, and the like, which happens to fit Cloudflare's traffic policy.
Therefore, if you want to pass the proxy protocol through Tunnel, the recommended combination is:
Protocol | Compatible with Cloudflare Tunnel? | Recommended pairing |
---|---|---|
Trojan | Yes | TLS + WebSocket |
VLESS | Yes | TLS + WebSocket (or gRPC) |
VMess | No | Needs WebSocket, and stability is poor |
Shadowsocks | No | Not an HTTP request per se, unless additionally encapsulated |
WireGuard / OpenVPN | No | Completely unavailable; requires separate transit |
Hysteria / TUIC and other QUIC protocols | No | Non-Web transport is not supported; not recommended |
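To make the compatible rows concrete, here is a minimal sing-box inbound sketch for VLESS over WebSocket (the UUID, port, and path are placeholders, and TLS is left to the layer in front, e.g. Cloudflare). The server listens only on loopback, so only the tunnel or reverse proxy in front of it can reach it:

```json
{
  "inbounds": [
    {
      "type": "vless",
      "tag": "vless-ws-in",
      "listen": "127.0.0.1",
      "listen_port": 8080,
      "users": [
        { "uuid": "0e8e0c42-7bd8-4a1f-9c33-000000000000" }
      ],
      "transport": {
        "type": "ws",
        "path": "/ws-entry"
      }
    }
  ]
}
```

From Tunnel's point of view, this service is just an HTTP backend that accepts WebSocket upgrades on `/ws-entry`, which is exactly the "looks like a Web service" shape it requires.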
What is WebSocket and why do we need it?
WebSocket is a "persistent connection" mechanism that runs on top of the HTTP protocol. It was originally designed to establish a pipeline for real-time, bidirectional communication. Its magic lies in:
- The initial handshake looks like a standard HTTPS request (this is crucial for "disguise");
- After the connection is established, any type of data can be transmitted through this seemingly ordinary "web channel", including the proxy traffic we want to carry.
For Cloudflare Tunnel, the default expectation is that the proxied traffic consists of website requests, so if you want to "sneak in" other, non-standard traffic, you must first encapsulate the proxy protocol in WebSocket so it masquerades as compliant traffic. This is why protocols such as Trojan and VLESS are usually recommended together with WebSocket: they not only disguise themselves as website visits, but also pass Cloudflare Tunnel's "Web censorship".
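The claim that the initial handshake "looks like a standard HTTPS request" can be checked directly: a WebSocket upgrade is just an HTTP GET plus a couple of headers, and the server proves it understood by hashing the client's nonce with a fixed GUID (RFC 6455). A short Python sketch, using the sample nonce from the RFC (the path and host are placeholders):

```python
import base64
import hashlib

# The client's opening handshake is an ordinary-looking HTTP request:
key = "dGhlIHNhbXBsZSBub25jZQ=="   # sample Sec-WebSocket-Key from RFC 6455
request = (
    "GET /ws-entry HTTP/1.1\r\n"      # placeholder path
    "Host: entry.example.com\r\n"     # placeholder host
    "Upgrade: websocket\r\n"
    "Connection: Upgrade\r\n"
    f"Sec-WebSocket-Key: {key}\r\n"
    "Sec-WebSocket-Version: 13\r\n\r\n"
)

# The server's accept value: base64(SHA-1(key + fixed GUID)), per RFC 6455.
GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"
accept = base64.b64encode(hashlib.sha1((key + GUID).encode()).digest()).decode()
print(accept)  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=  (the value given in RFC 6455)
```

To any middlebox, the bytes above are indistinguishable from a browser opening a page; once the `101 Switching Protocols` response comes back, the same connection can carry arbitrary proxy frames.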
4.5 Are “non-directly connected” science nodes necessarily slower than “directly connected” science nodes?
This question lingers in the minds of many self-hosting users, often as a misconception: "the non-direct connection solution takes a detour, so performance must suffer." In the current Internet environment this judgment does not hold, and in many cases the opposite is true: "non-direct connection" is actually faster and more stable. The main reasons are as follows:
1. Direct connection does not mean a smooth connection; in many cases it means "blocked once identified"
The biggest problem with "direct connection" scientific nodes is not bandwidth, but whether the traffic reaches the target VPS completely and stably. In reality, many so-called "direct connections" are not really smooth at all:
- The data packet is dropped (Drop) or actively reset (RST);
- After the connection is established, communication is possible, but the speed is extremely slow and unstable;
- It suddenly becomes unavailable for a certain period of time, and then magically recovers.
This kind of phenomenon is not uncommon, and what it reflects is that the link is not trusted. So the fundamental reason many "direct connections" are slow is not that they take the direct route, but that on this route you are simply not welcome.
2. "Non-direct connection" avoids traffic identification via transit and improves overall link quality
In contrast, "non-direct connection" hides the real source station behind a transit entrance, so the first half of the link looks like access to a regular service (a Cloudflare node, a well-known cloud vendor's IP, a CDN distribution center, etc.), thereby gaining higher link fault tolerance and stability.
Moreover, the transit entrance and the source station generally communicate over standard international routes without middlebox interference; the quality of this leg is often far higher than that of a "direct connection through the wall".
3. Latency and throughput are not determined by "short paths" alone
Many people mistakenly believe that "the shorter the path, the lower the latency", but in the current network environment:
- Whether the path is speed-limited, interfered with, or suppressed
- The export quality of the data center where the source station is located
- Throughput bottleneck of transit server
- Connection recovery/retransmission efficiency of the protocol
These factors are often more critical than path length. For example, one "direct connection" VPS sits in a small Asian data center with a poor egress where TCP handshakes frequently fail, while a "non-direct connection" VPS detours through Cloudflare Tunnel but egresses over high-quality, large-scale lines such as GCP's or Azure's, and ends up much more stable and fast.
4. "Non-direct connection" also makes it easy to build multi-point transfer, load balancing, failover and other mechanisms
This is difficult to achieve with the "direct connection" model: once the node is identified and blocked, the entire solution becomes ineffective. The "non-direct connection" model, with its flexible entry configuration and protocol camouflage capabilities, is well suited to building architectures with more engineering flexibility, such as multi-link parallelism and dynamic switching.
In conclusion: "non-direct connection" is not a "second-best option", but a realistic choice that is more concealed, interference-resistant and scalable in the modern network environment. Whether it is slower than "direct connection" does not depend on "whether it goes in a straight line", but on the overall security, concealment and scheduling capabilities of the link.
4.6 Cloudflare Tunnel and Compliance Issues for Scientific Use
Although Cloudflare Tunnel is almost an ideal tool for the "non-direct connection" mode - it is stable, widely distributed globally, does not need to expose a public IP, and can seamlessly cooperate with VLESS/Trojan and other protocols to complete a full TLS + WebSocket disguised link - a solemn reminder is in order: Cloudflare explicitly prohibits the use of its services for proxies, VPNs, scientific Internet access and the like.
Cloudflare's Terms of Service (https://www.cloudflare.com/terms/) mention many times that it is forbidden to use its infrastructure to "bypass access control", gain "unauthorized access", "violate local laws or service provider policies", and so on. In its community, in support-ticket replies and in actual account suspension cases, it has also repeatedly made clear that:
- Proxies, VPNs, or traffic forwarding services are not permitted through Cloudflare Tunnel;
- If a large amount of atypical traffic or proxy behavior is found, flow control may be automatically triggered, subdomains may be blocked, or even accounts may be deactivated;
- Special emphasis is placed on the high sensitivity to traffic "not related to the public network" (such as intranet proxies and wall-circumvention purposes).
This means that although it is technically feasible, it is not compliant in practice: being able to use it does not mean it should be used. In the current environment, Cloudflare Tunnel is indeed one of the few channels that can be stably used for "non-direct connection" scientific links, but from the perspective of the service agreement and actual risks, it is more suitable as a tunnel service for "reasonable purposes" such as development, demonstration and remote management.
If you insist on using it for scientific purposes:
- Limit it to personal, low-frequency use;
- Set up access control and hide the path as much as possible, and avoid high-traffic behavior;
- Most importantly, do not disclose or disseminate the usage method or service address to others, to avoid affecting the entire Tunnel ecosystem.
After all, Cloudflare is not here to help you with "science", its goal is to provide secure, compliant acceleration and protection services for web applications.
5. Construction of scientific nodes based on sing-box
5.1 Introduction to sing-box
In fact, apart from the theoretical content above, I did not originally intend to explain the use of specific tools in detail. On the one hand, the topic is sensitive; on the other hand, there is no absolute "optimal solution" when it comes to the choice of tools and protocols - what matters more is understanding the principle and structure behind them. However, to make the structure of this article more complete and to help beginners better implement the content above, I will briefly walk through a configuration demonstration.
The current mainstream choices include Xray, Hysteria, NaïveProxy, tuic and other tools, each with its own characteristics and usage scenarios. Considering stability, flexibility and update activity, I finally chose sing-box as the example.
Unlike the traditional idea of "one protocol to the end", sing-box is more like a highly modular proxy framework. Its positioning is very clear: it integrates multiple proxy protocols, provides strong configurability, and gives advanced users the ability to build complex scientific scenarios.
In terms of protocol support, sing-box is one of the few proxy tools that currently supports a wide range of incoming and outgoing protocols. It can run as a client, as a remote server, and can even be used to build a hierarchical multi-level forwarding structure (such as "DNS diversion + TLS forwarding + multi-exit outbound"). Common proxy protocols such as VLESS, Trojan, Shadowsocks, Hysteria2, Tuic, SOCKS, WireGuard, etc. are fully supported, and outbound protocols even include VMess (compatible mode), HTTP, DNS diversion, etc., covering almost all mainstream needs.
In addition to the protocol itself, sing-box also has comprehensive support for the transport layer. It natively supports TLS, Reality, WebSocket, gRPC, QUIC, XTLS and other transport mechanisms. Users can flexibly combine them according to actual needs to achieve both concealed and stable traffic disguise.
In terms of configuration, sing-box's advantage is being "clear and powerful". Although it has no "one-click setup" like some GUI clients, the JSON configuration file has a clear structure, complete official documentation, supports comments, and follows highly self-consistent logic. Unlike Xray, it does not bury behavior in default logic you could never guess without being told, but encourages you to explicitly define every behavioral detail. Once you understand the core architecture, you can write a fully controllable and transparent configuration file. It also ships with command-line debugging tools for checking route matching, testing connections and analyzing links, which is critical for troubleshooting.
It is worth mentioning that sing-box is also significantly ahead of other tools in support for new protocols. For example, it was one of the earliest proxy projects to support Reality, a new mechanism that does not rely on certificates and simulates real TLS traffic, greatly improving concealment; likewise, Hysteria 2 and Tuic 2, the QUIC-based anti-interference protocols, were supported in sing-box very early on. In addition, it has a built-in obfuscation plug-in system that lets users customize obfuscation rules to further improve anti-interference capability.
In general, if you want full control over node behavior - flexibly defining each inbound and outbound, and finely controlling the camouflage method and protocol details - then sing-box is one of the core tools most worth studying and using in depth: it was not born to "simplify scientific Internet access", but to remain free in a complex environment.
5.2 Installing sing-box
5.2.1 APT installation
mkdir -p /etc/apt/keyrings
curl -fsSL https://sing-box.app/gpg.key -o /etc/apt/keyrings/sagernet.asc
chmod a+r /etc/apt/keyrings/sagernet.asc
echo '
Types: deb
URIs: https://deb.sagernet.org/
Suites: *
Components: *
Enabled: yes
Signed-By: /etc/apt/keyrings/sagernet.asc
' | tee /etc/apt/sources.list.d/sagernet.sources
apt-get update
apt-get install sing-box
Check the version and features:
sing-box version

Note: the default apt method installs only the latest official release.
5.2.2 Download the official build version directly
In addition to the apt method, you can also use the official version directly. Let's take the v1.11.10 version as an example:
wget https://github.com/SagerNet/sing-box/releases/download/v1.11.10/sing-box-1.11.10-linux-amd64.tar.gz
tar -xzf sing-box-1.11.10-linux-amd64.tar.gz
cp sing-box-1.11.10-linux-amd64/sing-box /usr/local/bin
chmod +x /usr/local/bin/sing-box
5.2.3 Docker installation
5.2.3.1 docker run method:
docker run -d \
  -v /etc/sing-box:/etc/sing-box/ \
  --name=sing-box \
  --restart=always \
  ghcr.io/sagernet/sing-box \
  -D /var/lib/sing-box \
  -C /etc/sing-box/ run
5.2.3.2 Docker Compose Configuration File
version: "3.8"
services:
  sing-box:
    image: ghcr.io/sagernet/sing-box
    container_name: sing-box
    restart: always
    volumes:
      - /etc/sing-box:/etc/sing-box/
    command: -D /var/lib/sing-box -C /etc/sing-box/ run
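Assuming the file above is saved as docker-compose.yml in the current directory, starting and checking the container looks like this:

```shell
# Start the container in the background, then check its log to confirm
# the configuration under /etc/sing-box was accepted without errors.
docker compose up -d
docker logs sing-box
```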
5.3 Create configuration directory and configuration file
5.3.1 A few words of caution
sing-box is a bit tricky: configuration files can differ considerably between versions, because its development pace is fast, it is updated frequently, and many functions are constantly refactored and renamed. For example, some fields from older versions have been merged or split in newer ones, and the behavior of some modules changes with version iteration. As a result, sample configurations in many online tutorials - and even in the official documentation - were written at different times and are incompatible with each other, which easily traps beginners.
This is especially true when you borrow someone else's configuration: if the version gap is large (and in fact even a small version gap can be just as troublesome), it is very likely that "the configuration looks correct, but it does not run at all". Therefore, when using sing-box, you must ensure consistency among the tutorial example, your own program version and the official documentation, otherwise debugging can become very painful.
Friendly reminder: if you are after stability, it is recommended to lock in the version once you have verified a working configuration; if you want the latest features, such as Reality, Tuic2 and other advanced capabilities, be prepared to continuously update the configuration according to the release notes.
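Two habits make the version churn above much less painful. A sketch assuming the apt installation from section 5.2.1 (for the manual or Docker installs, only the pinning step differs):

```shell
# Validate a config against the schema of the exact binary you run;
# this catches fields that were renamed or removed between versions.
sing-box check -c /etc/sing-box/config.json

# Once a configuration is verified working, hold the package so a routine
# upgrade cannot silently break it (undo later with `apt-mark unhold`).
apt-mark hold sing-box
```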
5.3.2 Server configuration demonstration (v1.11.10)
Create a new directory and create a config.json configuration file:
mkdir -p /etc/sing-box
vim /etc/sing-box/config.json
Use the "vless+tls+websocket" method and fill in the following configuration (server-side configuration):
{
  "log": {
    "level": "info" // log level: trace/debug/info/warn/error; info outputs general operation information
  },
  "inbounds": [
    {
      "type": "vless", // inbound protocol type is VLESS
      "listen": "0.0.0.0", // listen on all interfaces
      "listen_port": 8443, // the client connects to this port
      "users": [
        {
          "uuid": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" // user authentication; must match the client config
        }
      ],
      "tls": {
        "enabled": true, // enable TLS encryption
        "server_name": "xxx.example.com", // SNI (Server Name Indication); should match the certificate's domain
        "certificate_path": "/etc/certs/xxx.example.com/fullchain.pem", // TLS certificate path
        "key_path": "/etc/certs/xxx.example.com/key.pem" // TLS private key path
      },
      "transport": {
        "type": "ws", // WebSocket as the transport layer
        "path": "/websocket" // WebSocket access path; the client must match it
      }
    }
  ],
  "outbounds": [
    {
      "type": "direct" // direct outbound (no further proxy): this is an end node, not a forwarding proxy
    }
  ]
}
The uuid in the above configuration can be generated using an online tool, or you can use mine: uuid online generation tool.
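If you would rather not paste a future credential into an online tool, any Linux host can generate one locally (`uuidgen` from the uuid-runtime package works too):

```shell
# Print a random UUID, suitable for the "uuid" fields on both the
# server and the client configuration.
cat /proc/sys/kernel/random/uuid
```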
Run sing-box server:
sing-box run -c /etc/sing-box/config.json
Note 1: In this example, because it is deployed on an overseas VPS node, "outbounds" can simply use direct; if you plan to let this server forward the received traffic onward, then the outbound should not be direct, but should point to another proxy node (such as a shadowsocks, vless or trojan outbound configuration).
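To make Note 1 concrete, here is a hedged sketch of what a non-direct outbound could look like; the next-hop domain, port and password are placeholders, and shadowsocks is just one of the protocols sing-box accepts here:

```json
"outbounds": [
  {
    "type": "shadowsocks",          // forward traffic to another proxy node
    "tag": "next-hop",
    "server": "next.example.com",   // placeholder: the downstream node
    "server_port": 8388,            // placeholder port
    "method": "2022-blake3-aes-128-gcm",
    "password": "xxxxxxxx"          // placeholder password
  }
]
```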
Note 2: In addition to vless, you can also use trojan. It depends on your preference. With the support of Cloudflare Tunnel, there is actually no difference between using vless or trojan (TLS is completed by Cloudflare). Vless is just a little lighter.
5.3.3 Client configuration demonstration (v1.11.10)
Create a new directory and create a config.json configuration file:
mkdir -p /etc/sing-box
vim /etc/sing-box/config.json
Use the "vless+tls+websocket" method and fill in the following configuration (client-side configuration):
{
  "log": {
    "level": "info" // output general information
  },
  "inbounds": [
    {
      "type": "http", // HTTP proxy inbound
      "tag": "http-in", // tag name, referenced by routing and other features
      "listen": "0.0.0.0", // listen on all interfaces
      "listen_port": 8080 // HTTP proxy port
    },
    {
      "type": "socks", // SOCKS5 proxy inbound
      "tag": "socks-in", // tag name
      "listen": "0.0.0.0", // listen on all interfaces
      "listen_port": 1080 // SOCKS5 proxy port
    }
  ],
  "outbounds": [
    {
      "type": "vless", // VLESS outbound, used to connect to the server
      "tag": "vless-out", // outbound tag, usable in route matching
      "server": "xxx.example.com", // VLESS server domain name or IP address
      "server_port": 443, // server port, used together with TLS
      "uuid": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", // authentication UUID; must match the server config
      "tls": {
        "enabled": true, // enable TLS encryption
        "server_name": "xxx.example.com" // SNI during the TLS handshake; should match the server certificate
      },
      "transport": {
        "type": "ws", // WebSocket transport
        "path": "/websocket" // must match the server config
      }
    }
  ],
  "route": {
    "auto_detect_interface": true, // auto-detect the default network interface (i.e. the egress NIC)
    "final": "vless-out" // traffic not matched by other rules goes through "vless-out"
  }
}
This configuration is suitable for forwarding local HTTP/SOCKS requests to the remote VLESS server, and is suitable as a scientific proxy entrance for browsers or local applications in the home intranet.
Run sing-box client:
sing-box run -c /etc/sing-box/config.json
Note: The UUID on the client side must be consistent with the UUID on the server side.
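With both ends running, a quick end-to-end check is to ask an IP-echo service through each local inbound; both requests should report the VPS's egress IP rather than your home IP (ifconfig.me is just one example of such a service):

```shell
# Through the HTTP inbound on port 8080
curl -x http://127.0.0.1:8080 https://ifconfig.me

# Through the SOCKS5 inbound on port 1080; socks5h:// resolves DNS on the
# proxy side, so the test does not leak DNS queries locally
curl -x socks5h://127.0.0.1:1080 https://ifconfig.me
```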
5.4 Configure automatic startup
Create a service file:
vim /etc/systemd/system/sing-box.service
Paste and save the following:
[Unit]
Description=Sing-box Service
After=network.target

[Service]
ExecStart=/usr/bin/sing-box run -c /etc/sing-box/config.json
Restart=on-failure
RestartSec=5s
LimitNOFILE=1048576

[Install]
WantedBy=multi-user.target

Note: adjust the ExecStart path to wherever your binary actually lives - the apt package installs to /usr/bin/sing-box, while the manual method above copies it to /usr/local/bin/sing-box.
Enable and start the service:
systemctl daemon-reexec
systemctl daemon-reload
systemctl enable --now sing-box
Check the service status:
systemctl status sing-box
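If the status shows the unit failing, the journal usually names the problem; sing-box prints the JSON path of any config field it rejects, which pairs well with the version caveats from section 5.3.1:

```shell
# Show the latest log entries for the sing-box unit
journalctl -u sing-box -e --no-pager
```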
6 Conclusion
In fact, no matter whether you use the "direct connection" mode or the "non-direct connection" mode, there is almost no need to make any differentiated adjustment at the sing-box server/client configuration level. What determines the access mode is often only where the xxx.example.com domain name in the configuration file resolves to.
Take a domain hosted on Cloudflare as an example:
- If you disable the Cloudflare proxy function (that is, turn off the "little orange cloud" icon) and let the domain name resolve directly, via an A record, to the public IP of the sing-box node, and the node is listening on the corresponding port, then when the client accesses the domain name it is actually connecting directly to the node - the typical "direct connection" mode.
- Conversely, if you enable the Cloudflare proxy function (turn on the "little orange cloud"), the domain name will not expose the real IP, but will go back to the origin through Cloudflare's edge network. This takes two forms:
- Traditional proxy method: Cloudflare returns to your node over the public network (public IP + listening port);
- Tunnel mode: you run the Cloudflare Tunnel client on the node, and Cloudflare returns to the origin through the tunnel's dedicated connection.
Both of these are typical "non-direct connection" modes, because there is always a "non-real-IP" transit layer in the access link. Therefore it can be said that the difference between "direct connection" and "non-direct connection" does not lie in the sing-box configuration itself, but in how you control the entry path of the access traffic through DNS and Cloudflare.
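A quick way to confirm which mode a domain is currently in is to look at its DNS answer (using the placeholder domain from the configs above):

```shell
# Resolve the domain:
# - answer is your VPS's real public IP -> orange cloud off, "direct"
# - answer is Cloudflare anycast space  -> orange cloud on, "non-direct"
#   (commonly 104.16.0.0/13 or 172.64.0.0/13)
dig +short xxx.example.com
```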
Note: If you use the "non-direct connection" mode based on Cloudflare Tunnel, there are several configuration points to watch out for; see my other article: dnscrypt-proxy (v2.1.8) Multi-scenario configuration guide: from upstream deployment to downstream integration.
Speaking of self-built scientific nodes, my experience is to run the CDN route (I use websocket+tls), the TCP route (Reality) and the UDP route (Hysteria2) side by side; as long as the ports do not conflict, they can coexist. Users can simply pick whichever works best for them.
GitHub projects used:
https://github.com/crazypeace/v2ray_wss
https://github.com/crazypeace/xray-vless-reality
https://github.com/crazypeace/hy2
Fair enough - I just don't like opening ports, so I didn't want to set up the direct-connection TCP and UDP routes, and instead focused on the CDN route you mentioned (strictly speaking it has nothing to do with CDN; it only makes use of the reverse proxy function).