Contents
- 1. The Reasons for the Third Optimization of the Blog Active-Active Architecture
- 2. Core adjustments in the third optimization
- 3. A MariaDB database version issue exposed by an accident
- 4. Homepage reinforcement: From structural stability to access stability
- 5. Postscript: What constitutes the complete form of a multi-active architecture?
1. The Reasons for the Third Optimization of the Blog Active-Active Architecture
A few days ago, I was doing routine maintenance on the Mac mini node of my blog's active-active architecture: upgrading Portainer. A seemingly insignificant problem ended up calling the suitability of the entire node into question. On the Mac mini, I ran:
docker pull portainer/portainer-ce:latest
However, an unexpected error occurred:
error getting credentials - err: exit status 1, out: keychain cannot be accessed because the current session does not allow user interaction. The keychain may be locked; unlock it by running "security -v unlock-keychain ~/Library/Keychains/login.keychain-db" and try again
At first glance this looks like a Portainer or Docker issue, but a little research shows that it has almost nothing to do with Portainer itself.
The real root cause is that Docker on macOS stores login credentials in the Keychain, and in a non-interactive session (SSH, background tasks, or a machine that has booted with no one logged in) the Keychain cannot display a GUI prompt to unlock itself. This is especially common on a Mac mini that runs unattended for long periods.
In short, the reason is that Docker wants to retrieve credentials from the Keychain, but the current session does not allow any user interaction, so it fails directly.
The fix is not complicated: remove Docker's credential dependency on the Keychain. Specifically, edit:
~/.docker/config.json
Delete the following:
"credsStore": "desktop"
or:
"credsStore": "osxkeychain"
After that, running `docker pull` again succeeded and image pulls were back to normal.
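The manual edit can also be scripted. The following is a minimal sketch rather than a definitive procedure: `strip_creds_store` is a hypothetical helper name, and it assumes the key sits on its own line, as in Docker's pretty-printed config file. It keeps a backup because deleting the last key of a JSON object line by line would leave a trailing comma that has to be fixed by hand.

```shell
# Sketch only: strip_creds_store is an illustrative name, not a Docker command.
# Assumes "credsStore" is on its own line in the JSON file passed as $1.
strip_creds_store() {
    cfg="$1"
    # Keep a backup so the original can be restored if the result is not valid JSON
    cp "$cfg" "$cfg.bak" || return 1
    # Drop every line that defines the "credsStore" key
    sed '/"credsStore"[[:space:]]*:/d' "$cfg.bak" > "$cfg"
}

# Possible use on the Mac mini:
#   strip_creds_store ~/.docker/config.json
```

Writing through a backup copy instead of editing in place also sidesteps the difference between the `sed -i` flags on macOS and Linux.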
But the real problem wasn't there. I then discovered that several containers that had been running in Docker Desktop started misbehaving: services failed to start, and even restarting the Mac mini didn't fix it. After struggling for a while, I finally had to reinstall Docker Desktop and redeploy all critical containers.
Fortunately, the process wasn't too time-consuming; everything was fully restored in about twenty minutes. But this experience made me realize something very clearly for the first time: the combination of Docker Desktop and a Mac mini is not suitable as a "node" in an active-active architecture for a blog.
The reason is not complicated, nor was it this particular operational mistake. It is structural:
- In an active-active architecture, the nodes should essentially be "server-like" entities.
- They need to run unattended.
- After a restart, they should recover their running state deterministically.
- They cannot rely on a desktop UI.
- They should not rely on mechanisms like Keychain that are tightly bound to user sessions.
The reality of Docker Desktop + macOS is quite the opposite:
- Strongly dependent on GUI
- Strongly dependent on user login state
- Its internal state is complex and unpredictable.
- If the environment encounters problems, the recovery path is highly uncertain.
This isn't because Docker Desktop is bad; rather, it's because it wasn't originally designed as a "server-level runtime environment."
The issues caused by this Portainer upgrade were merely a trigger. What truly made me decide to act was that they reaffirmed a conclusion: the Mac mini is no longer suitable for carrying service responsibilities in the blog's active-active architecture.
Based on this judgment, I started the third optimization of my blog's active-active nodes.
2. Core adjustments in the third optimization
From a hardware and infrastructure perspective, this third optimization doesn't really constitute a major overhaul: no new cloud service providers were added, no new middleware components were introduced, and the overall resource scale remained largely unchanged. However, at the architectural level, the changes are quite clear: the status and responsibilities of each node in the active-active system have been redefined.
One of the most significant changes is reflected in the intermini node. While intermini has appeared in previous articles, it primarily played a supporting role, such as serving as the export node for Simply Static, and wasn't truly integrated into the core of the blog's active-active architecture. When macmini exports and synchronizes database files, it merely incidentally synchronizes data to intermini via syncthing; it's more like a "tool node" than a service node.
In this adjustment, intermini was officially incorporated into the external service system. From a dispensable supporting role, it has been promoted to a read-only node on par with the Chicago VPS, and it now truly participates in the external serving of blog content and in active-active redundancy.
In the previous active-active architecture, macmini handled both write operations and read services. The design itself wasn't flawed, but its shortcomings became increasingly apparent over time. macmini is essentially a desktop device that relies on Docker Desktop, user login state, and system components. During system upgrades, desktop service malfunctions, or unattended restarts, its overall behavior can become unpredictable. This is a potential risk for a blog that needs long-term stable operation.
Therefore, the core idea behind this optimization is very straightforward: macmini is relegated to a "control" role and no longer serves visitors directly.
In the new structure, the WordPress instance on macmini is explicitly positioned as a write-only node. It is only responsible for active write operations, such as publishing articles, modifying pages, or adjusting site configuration. All responsibility for externally displayed content is handled by the intermini node and the Chicago VPS node. In other words, all operations that change system state remain centralized on macmini, while external access is handled entirely by the read nodes.

As for transferring content from the write node to the two read-only nodes, I still use the same simple but reliable method as before: after making changes, I export the WordPress database as a complete wordpress.sql file and synchronize it to both the intermini and Chicago VPS nodes using rsync. While this method isn't as sophisticated as real-time database replication, its state is clear, its behavior is controllable, and troubleshooting and recovery costs are low, which makes it a good fit for a small personal blog.
Both the intermini and Chicago nodes run Debian 12, and their Docker containers do not rely on a desktop UI or user sessions. These two nodes are positioned as fully equivalent read-only active-active nodes, responsible for serving the blog to external users. If either node malfunctions, traffic can automatically switch to the other without human intervention.
I deployed the same monitoring script on both read-only nodes. It continuously watches the same directory path, /docker/wordpress/db/. Once a new wordpress.sql file is detected and fully synchronized, the database rebuild and import process is triggered automatically, and the import result is sent to my iPhone and Mac via Bark. The entire process requires no direct communication between nodes; as long as the file is synchronized successfully and passes MD5 verification, the content is updated automatically.
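The "detect, verify, then import" decision can be sketched roughly as follows. This is not the actual monitoring script: the function names, the sidecar file wordpress.sql.md5 (in md5sum format, shipped by the write node), and the state file holding the last imported checksum are all assumptions made for illustration.

```shell
# Sketch only. Assumptions (not from the article): the write node also ships
# wordpress.sql.md5 (md5sum format), and the caller picks a state file that
# stores the checksum of the last imported dump.

# Succeeds only if the dump and its checksum file both exist and agree.
dump_is_intact() {
    dir="$1"
    [ -f "$dir/wordpress.sql" ] || return 1
    [ -f "$dir/wordpress.sql.md5" ] || return 1
    (cd "$dir" && md5sum -c wordpress.sql.md5 >/dev/null 2>&1)
}

# Succeeds only if the dump is intact and has not been imported yet.
needs_import() {
    dir="$1"
    state="$2"
    dump_is_intact "$dir" || return 1
    new=$(md5sum < "$dir/wordpress.sql" | cut -d' ' -f1)
    old=$(cat "$state" 2>/dev/null || true)
    [ "$new" != "$old" ]
}

# Records the checksum of the dump after a successful import.
mark_imported() {
    md5sum < "$1/wordpress.sql" | cut -d' ' -f1 > "$2"
}

# Example polling loop (import_dump is a placeholder for the real import step):
#   while sleep 30; do
#       if needs_import /docker/wordpress/db /var/tmp/wordpress.sql.last; then
#           import_dump && mark_imported /docker/wordpress/db /var/tmp/wordpress.sql.last
#       fi
#   done
```

Checking the checksum before touching the database means a half-transferred file is simply ignored until the next poll, which fits the "react only to the file" idea described above.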
A key advantage of this design is that there is almost no tight coupling between nodes. macmini does not need to know whether the read nodes are online, nor do the read nodes need to know each other's status. Each node only reacts to whether the file has appeared, which keeps overall complexity within a controllable range.
Of course, the blog still has passive write operations, such as user comments. To avoid a single point of failure, I adjusted the comment-handling logic in the Cloudflare Worker so that comments are written to macmini, intermini, and the Chicago VPS node simultaneously. This multi-point write is not meant to provide strict strong consistency, but to ensure that comment data is not lost when switching nodes. For a personal blog, this trade-off is acceptable.
Overall, the third optimization did not increase the system complexity. Instead, by redefining responsibilities, it allowed each type of node to do only what it does best and is most stable: macmini is responsible for writing and control, while intermini and the Chicago node are responsible for reading and external services. The data flow method is made as intuitive and easy to verify as possible.
An added benefit is that read nodes can naturally scale horizontally—as long as a new node can receive and import wordpress.sql, it can join the external service, while the system still retains a clear and controllable "center" and will not lose order due to expansion.
In terms of day-to-day experience, this adjustment also brings a significant change: the writing process and external presentation are completely decoupled. Previously, when macmini handled both writing and reading, visitors might see intermediate states while I was editing articles or adjusting the structure in the backend. Now macmini doesn't serve reads externally, so I can repeatedly modify, scrap, and redo content locally without affecting what is currently published. Only after I confirm the changes are complete and export the database to the read nodes does the blog switch to the new content all at once. From a user experience perspective, this feels more like a clear and controllable release action than changes taking effect as they are made, which makes writing and adjusting more relaxed.
For the detailed design and division of node responsibilities in the active-active architecture, please refer to my previous articles: "The Second Restructuring of the Blog Architecture: Service Migration and Active-Active Disaster Recovery Practices Triggered by VPS Relocation" and "A solution for implementing a WordPress multi-active architecture (simplified version) in a personal blog".
3. A MariaDB database version issue exposed by an accident
In the overall process of this third optimization, most steps went smoothly. The only place where I really got stuck was this: after rsyncing the dump from macmini to intermini, the database import failed, leaving the WordPress database on intermini completely wiped.
When I first encountered this problem, I was actually a bit confused. Because whether it's directory monitoring, automatic triggering, or database import scripts, Intermini uses the exact same logic as the Chicago node, and the Chicago node has always run very stably without ever encountering similar issues.
This means that the problem is most likely not in the script logic itself, but in some "lower, but less noticeable" place.
To quickly locate the problem, I bypassed the monitoring script and manually executed a database import command similar to the following on Intermini to investigate:
docker exec -i mariadb mysql -uroot -ppassword wordpress < /docker/wordpress/db/wordpress.sql
The result immediately showed an error:
ERROR at line 1: Unknown command '\-'.
The error message itself isn't very user-friendly, but it at least makes one thing clear: the problem lies in the first line of the SQL file. So I used the `head` command to view the beginning of wordpress.sql:
head -n 20 /docker/wordpress/db/wordpress.sql
The results are as follows:

As you can see, the first line of the file is a specially formatted comment:
/*M!999999\- enable the sandbox mode */
In other words, MariaDB on Intermini treated this comment as an illegal command instead of skipping it normally, causing the entire import process to be interrupted at the first line.
But this raises another crucial question: why does the same wordpress.sql file import perfectly fine on the Chicago node?
Continuing along this trail, I compared the MariaDB versions on the three nodes, using commands similar to the following:
docker exec -it mariadb mysql --version   # "mariadb" is the container name
- macmini: 10.11.15
- Chicago VPS: also 10.11.15
- intermini: 10.11.6
At this point, the problem is basically clear. Although 10.11.15 and 10.11.6 both belong to the 10.11 series, MariaDB's handling of special comments does differ between certain minor versions, in particular for /*M! ... */, the MariaDB-specific extended comment syntax.
More specifically, the earlier version of MariaDB (10.11.6) on Intermini cannot correctly recognize this type of comment format, so it directly throws an error when importing SQL; while the Chicago node uses version 10.11.15, which does not have this problem.
To verify this judgment, I didn't modify the SQL file itself, but instead chose the most direct method: upgrading MariaDB on intermini to 10.11.15. After the upgrade, I ran the import command again with the exact same wordpress.sql file, and it succeeded on the first try without any further errors.
At this point, the root cause of the problem has been completely identified.
Looking back now, this is a very typical but easily overlooked pitfall: when creating the Docker container, I used an image tag like mariadb:10.11 and did not explicitly pin the minor version. In a multi-node architecture, even a slight difference in the minor version on a single node can trigger this kind of "not obvious but fatal" compatibility issue in extreme cases.
Fortunately, the cost of this mistake wasn't too high, but it served as a reminder: in a multi-node, replicable architecture, version consistency is an integral part of system stability rather than a dispensable detail.
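One way to catch this kind of drift early is to compare the reported versions across nodes before trusting an import. The helpers below are a small sketch under assumptions: the function names are made up for illustration, and the parser assumes the "<x.y.z>-MariaDB" form that `mysql --version` prints for the 10.11 series.

```shell
# Sketch only: the function names are illustrative, not from the article.

# Reads `mysql --version` output on stdin and prints the MariaDB version.
# Assumes the "<x.y.z>-MariaDB" form in the client's version line.
mariadb_version() {
    sed -n 's/.*[^0-9.]\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)-MariaDB.*/\1/p'
}

# Succeeds only if every argument is the same version string.
versions_match() {
    first="$1"
    for v in "$@"; do
        [ "$v" = "$first" ] || return 1
    done
}

# Possible use on one node ("mariadb" is the container name):
#   docker exec mariadb mysql --version | mariadb_version
# Pinning the image to a full version avoids the drift in the first place:
#   docker pull mariadb:10.11.15
```

Collecting the version string from each node and passing them all to `versions_match` gives a simple yes/no answer that could run before each distribution of wordpress.sql.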
If you don't want to upgrade MariaDB on intermini, you can also work around the import problem by preprocessing wordpress.sql before importing it, removing the lines that start with /*M! (which includes the special comment on the first line). You can do this with the following command:
sed -i '/^\/\*M!/d' /docker/wordpress/db/wordpress.sql
However, it is recommended to unify the MariaDB version across all nodes to avoid encountering similar inexplicable problems as this one.
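A narrower variant of the same workaround is sketched below. It is only an illustration: `strip_sandbox_header` is a made-up name, and the sketch assumes that the only line that needs to go is the /*M!999999 ... */ comment on line 1. It leaves the original file untouched and writes the cleaned dump to stdout.

```shell
# Sketch only: strip_sandbox_header is an illustrative name.
# Prints the dump to stdout, dropping line 1 only when it is the
# /*M!999999 ... */ sandbox comment; every other line passes through.
strip_sandbox_header() {
    sed '1{/^\/\*M!999999/d;}' "$1"
}

# Possible use, piping straight into the import command used earlier:
#   strip_sandbox_header /docker/wordpress/db/wordpress.sql \
#       | docker exec -i mariadb mysql -uroot -ppassword wordpress
```

Because the file on disk is not modified, its checksum stays identical to the copy on the other nodes.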
4. Homepage reinforcement: From structural stability to access stability
After the third optimization, the blog's structure has become very clear: centralized writes, well-defined data distribution, and single-responsibility read nodes. From the system's internal perspective, it has achieved the stable form expected of a multi-active architecture.
However, in actual operation, I did something extra that was not "core" but very practical - I reinforced the homepage.
The reason for singling out the homepage is that it holds a very special position in the visitor journey: it's almost always the first page visitors see, and at the same time it's the page most susceptible to structural adjustments. Even if the backend is very stable, any uncertainty on the homepage is significantly amplified in the overall user experience.
In this optimization, I adopted a more conservative strategy for the homepage than for regular pages: by using static page generation and forced caching, the homepage will most of the time return results directly from the edge node, instead of going back to any WordPress instance. In this way, the availability of the homepage is changed from "depending on the backend state" to "depending on the existence of the edge cache".
This point is particularly worth elaborating on when using Cloudflare APO. Many people subconsciously assume that enabling APO guarantees a stable, cacheable homepage. In practice, however, the APO homepage cache status is still tied to the working state of WordPress. If an exception occurs during the origin request, such as a database connection error, a PHP fatal error, or even a plugin-level error response, Cloudflare may not obtain a cacheable, valid response for that request, and the homepage cache becomes invalid.
In other words, APO is not a "black box cache" completely detached from the health of the origin server. When WordPress is in an abnormal or unstable state, the homepage is often the first page to show problems.
For this reason, in this third optimization, I chose to move the stability goal of the homepage forward: even if a certain node in the backend experiences an anomaly in a short period of time, the homepage will still respond directly with the existing static results, instead of being forced to return to the origin to verify the current system status.
It's important to note that this step is not a "necessary component" of a multi-active architecture, nor does it participate in the data flow logic between nodes. It acts more like an extra buffer: when the internal structure is adjusted, nodes switch, or even experience temporary instability, the homepage can still maintain a relatively stable and predictable external performance.
The specific implementation of homepage staticization and caching strategies requires setting relevant rules with your caching provider. Taking Cloudflare as an example, you can simply use a single caching rule:

For details on setting up Cloudflare cache rules, see my previous article, "Home Data Center Series CloudFlare Tutorial (VI) CF Cache Rules Function Introduction and Detailed Configuration Tutorial", which covers them in detail, so I won't repeat that here. What I want to emphasize in this chapter is the rule's position in the overall structure: it doesn't change the division of responsibilities within the system, but it effectively reduces uncertainty at the system's boundary layer.
In retrospect, this sequence of "stabilizing the structure first, then reinforcing the boundaries" proved worthwhile. The internal structure could continue to evolve, while the experience presented to visitors remained firmly fixed within a safer range.
5. Postscript: What constitutes the complete form of a multi-active architecture?
Looking back at this third optimization, it didn't focus on increasing the number of nodes, nor did it pursue the superficial excitement of "everything online and writable at once." Instead, it made a clear cut along the lines of control and data flow.
If we only focus on "multiple WordPress nodes providing services simultaneously," then at best that is multiple copies. If writing, database, and control logic are still entangled on the same node, it can hardly be called true multi-active.
The key changes in this structural adjustment are: writes are centrally controlled, data is actively distributed, and external serving is kept redundant. macmini no longer handles day-to-day read traffic, but it remains the de facto control center: writes occur there, database exports and distribution are completed there, and structural operations are initiated from there. The other nodes do only one thing: provide stable and predictable access to content.
From this perspective, the "complete form" of a multi-active architecture is not that each node can do everything, but rather that: the control plane is clear; the data flow is unidirectional and controllable; and service nodes can be added or removed at any time without affecting overall consistency.
This structure does not pursue the strongest theoretical consistency, nor does it deliberately display complexity, but it is stable in engineering terms. Precisely because of this, the third optimization did not introduce many "new technologies," but rather focused on refining the existing structure through role reassignment. Determining which capabilities should be concentrated and which must be decentralized may be more important than the specific implementation methods.
As for whether this structure is the "end point," the answer is clearly no. It is merely a temporary choice that prioritizes stability, maintainability, and controllability under current constraints. The next time it is modified, it will most likely not be because of "how else can it be shown off," but because some new real problem forces the structure to take another step forward.
When that happens, I'll write it up again.
I felt I should make a note of this because you mentioned the PostgreSQL issue, and I also encountered a problem with MariaDB where I didn't specify the minor version precisely, so I thought I should learn from this experience in the future.
Running a database in Docker can be a real pain. As in our discussion at the beginning of the month, I got burned by PostgreSQL: images from the same repository and with the same version number had different underlying operating systems. Running a database in Docker requires pinning the image down to the minor version and the base OS version.