How to migrate services from a classic network to VPC?

Background
We plan to migrate our services gradually to the Alibaba Cloud VPC network environment. The goals are to strengthen isolation between environments, to keep services from being disrupted by the shared network, and to avoid security issues that could arise as the business grows. Before the real migration, we trialed the process in our testing environment. Below is a detailed record of that migration and the problems we encountered.

Requirements
When we migrate the official environment, we want the migration to be imperceptible: end users should not notice any service interruption. Although this trial ran in a testing environment, we still tried to make the migration as smooth as possible and to minimize service downtime, so that it would serve as a useful reference for migrating the official environment.

Migration
Network structures before and after migration
Our services are exposed externally from docker containers. Every container is stateless, and data is stored in RDS. In the testing environment, all containers run on a single 16 GB dual-core ECS instance, with an nginx container in front of the business containers. To allow ECS instances to be replaced flexibly, the nginx container sits behind a public SLB. The structure is shown in the figure:
After the migration to VPC, the foremost public SLB (slb-p1) won't be replaced, because Alibaba Cloud's public SLB can be connected to both classic network ECS instances and VPC ECS instances (but not the two types of ECS instances at the same time). The ECS and RDS after the SLB will be replaced by the ECS and RDS in VPC. The final state after the migration is shown below:
ECS ecs-v1 serves as the NAT gateway. Because the Alibaba Cloud VPC NAT gateway service is a paid service, we chose a self-built NAT gateway.
The advantage of a self-built NAT gateway is its lower cost. The drawback is the lack of multi-node redundancy: when the gateway fails, no backup node is available, and the entire network cannot reach the internet until the gateway is recovered.

Procedures
The major bottleneck for a smooth migration is the RDS. The good news is that Alibaba Cloud provides the data transmission service DTS, which supports RDS data synchronization between the classic network and VPC. With DTS, we can keep the downtime caused by the database switch within seconds. For the containers, we first run them on ecs-v2, where they continue to read and write data in the classic-network RDS until the RDS migration is complete. After the RDS is migrated, these containers read and write data in the VPC RDS.
The specific process is as follows:
1. Run the containers on ecs-v2 to read and write data in RDS in the classic network.
2. Enable DTS and synchronize data in rds-c1 to rds-v1.
3. Remove ecs-c1 from the SLB and stop the service. Ensure no data is written to rds-c1.
4. Once RDS synchronization is complete, switch the database configurations of the containers on ecs-v2 to read and write data in rds-v1.
5. Add ecs-v2 to the SLB and resume the service. The migration is complete.

Implementation
Create a VPC
The Alibaba Cloud NAT service is expensive and offers only one billing mode, namely by bandwidth, so we chose a self-built NAT service. We therefore need two instances: a NAT gateway node (ecs-v1) and a business node (ecs-v2).
Self-built NAT gateway
Environment:
• Create a VPC instance
• Create a VSwitch
• Create an EIP instance
• Create an ECS instance
Create two ECS instances: ecs-v1 and ecs-v2.
ecs-v1 acts as the NAT gateway; its intranet address is 192.168.1.1.
ecs-v2 hosts the business containers; its intranet address is 192.168.1.2.
Create the EIP address.
Operation steps:
• Bind an EIP to the ECS ecs-v1
• Enable the IP forwarding function
By default, Linux discards packets that are not addressed to it. Because ecs-v1 acts as the gateway, you need to enable IP forwarding.
    echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
    sysctl -p /etc/sysctl.conf
• Configure iptables
On ecs-v1, replace the source address of packets sent from 192.168.1.0/24 with 192.168.1.1.
    iptables -t nat -I POSTROUTING -s 192.168.1.0/24 -j SNAT --to-source 192.168.1.1
• Add a route
In the Console, add a route that sends packets destined for the 0.0.0.0/0 segment to ecs-v1 (specified by the ecs-v1 instance ID).
This route tells the router to hand all data packets to ecs-v1 for processing. A packet destined for an intranet ECS instance is still sent to that instance according to the local route table.
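Appending to /etc/sysctl.conf with `>>`, as in the IP forwarding step, adds a duplicate line every time the setup is re-run. A minimal sketch of an idempotent variant follows. The helper name `ensure_sysctl` is our own and not part of any tool, and it assumes a file made of simple `key = value` lines.

```shell
# Set "key = value" in a sysctl-style file exactly once.
# Hypothetical helper; assumes the value contains no "/" or "&" characters.
ensure_sysctl() (
    file=$1; key=$2; value=$3
    # Escape dots so the key is matched literally in the patterns below.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$file" 2>/dev/null; then
        # Key already present: rewrite its value instead of appending again.
        tmp=$(mktemp) || exit 1
        sed -E "s/^[[:space:]]*${key_re}[[:space:]]*=.*/${key} = ${value}/" "$file" > "$tmp" &&
            cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
)

# Usage on the gateway (as root):
#   ensure_sysctl /etc/sysctl.conf net.ipv4.ip_forward 1 && sysctl -p /etc/sysctl.conf
```

Running the helper twice leaves a single `net.ipv4.ip_forward = 1` line in the file.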

Establish VPN
There are two reasons for establishing the VPN:
1. You need to access the RDS in the classic network from the VPC during the migration to achieve smooth migration.
2. The Docker Registry is in the classic network, and the VPC needs to pull images from it.
We used a GRE tunnel to connect the two networks. Because the two networks are isolated at the data link layer, the tunnel has to go through the internet.
One end of the tunnel is ecs-v1 and the other end is ecs-c2 in the classic network.
The internet address of ecs-v1 is represented as 1.2.3.4 and the intranet address is represented as 192.168.1.1.
The internet address of ecs-c2 is represented as 5.6.7.8 and the intranet address is represented as 10.1.1.1.
Execute the following commands on ecs-v1:
    # Load the GRE module
    modprobe ip_gre
    # Create a tunnel and name it tun1
    ip tunnel add tun1 mode gre remote 5.6.7.8
    # Bring up the tun1 interface
    ip link set tun1 up
    # Configure the VIP for the newly created interface
    ip addr add 192.168.2.1 peer 192.168.2.2 dev tun1
    # Add a route that sends traffic for the classic network through this tunnel.
    # Private addresses in the classic network are comparatively scattered, so we use 10.0.0.0/8.
    route add -net 10.0.0.0/8 dev tun1
Similarly, execute the following commands on ecs-c2:
    modprobe ip_gre
    ip tunnel add tun1 mode gre remote 1.2.3.4
    ip link set tun1 up
    ip addr add 192.168.2.2 peer 192.168.2.1 dev tun1
    route add -net 192.168.1.0/24 dev tun1
The VPN is now established. Test access to rds-c1 (whose IP address is represented as 10.1.1.2) from ecs-v2:
    nc -nv 10.1.1.2 3306

Run containers on ecs-v2
Run all containers of the testing environment on ecs-v2. At this point, the RDS is still in the classic network. Almost no additional work is needed for the containers on ecs-v2 to access rds-c1, because the VPN lets ecs-v2 reach rds-c1 transparently. Note that RDS is addressed by domain name, so you need to check whether the domain name of the classic-network RDS can be resolved in the VPC. The good news is that all RDS domain names can be resolved in any Alibaba Cloud network.
The only change required is in the nginx container running in the VPC: its backend node changes from ecs-c1 to ecs-v2, so you need to update its upstream.

Test VPC containers
This step ensures that all containers running in the VPC are available and that they can correctly read and write data in rds-c1.

Enable DTS synchronization
Create the VPC RDS (rds-v1) and configure DTS to enable RDS synchronization. DTS first synchronizes the table structure and the full data, and then synchronizes incremental data continuously.

Stop the service
Wait until the RDS data on both sides is consistent. Then remove the backend node (ecs-c1) from slb-p1 to ensure that no new data is written to rds-c1.
The criterion for judging whether the data is consistent is the synchronization “latency”. According to the official Alibaba Cloud DTS documentation, the latency is the difference between the timestamp of the latest data synchronized to the target RDS instance and the current timestamp of the source RDS instance. In other words, it is the time the target RDS needs to become fully consistent with the source RDS after the source stops accepting writes. Our testing environment held little data, so the DTS latency was 700 ms during our actual migration.
We remove the backend node instead of stopping or replacing the public SLB, because replacing the public SLB requires a DNS change, and the time a DNS change takes to become effective is hard to control. This step has two goals:
1. Stop the service and ensure that no data is written.
2. Remove the backend node from the SLB so that the new node ecs-v2 can be added later.

Redirect data source connection
Within a few seconds after the service is stopped, DTS finishes synchronizing the RDS data, and the service can then be resumed. Before resuming, make sure the service no longer reads or writes data in rds-c1, because the data has now been migrated to rds-v1.
Our testing environment has more than 50 containers. Switching the RDS by modifying the configuration of one container after another would make deployment too time-consuming, so we gave up this scheme to shorten the service downtime.
The scheme we finally adopted uses iptables. At the ECS layer, we redirected all connections to rds-c1 to rds-v1 through DNAT. This not only sped up the switch, but also allows a quick rollback in case of issues.
Match the connections to rds-c1 with iptables and change the destination IP to the rds-v1 address (192.168.1.19) using DNAT:
    iptables -t nat -I PREROUTING --dest 10.1.1.2 -p tcp --dport 3306 -j DNAT --to-dest 192.168.1.19
There is a risk here. The rule uses IP addresses, but Alibaba Cloud RDS is addressed by domain name, and the underlying IP address may change, depending on the internal discovery in Alibaba Cloud. However, the chance of an RDS IP change is very low, and for the official environment migration an additional script can keep polling to quickly identify any IP change during the migration.
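A polling script of the kind mentioned above could look roughly like the following sketch. It assumes `getent` is available for name resolution (as on most glibc-based Linux systems). The function names and the domain in the usage comment are hypothetical placeholders, not real Alibaba Cloud names.

```shell
# Resolve a hostname to its first address using the system resolver.
resolve_ip() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Succeed only when the resolved address is non-empty and equals the pinned one.
ip_unchanged() {
    [ -n "$2" ] && [ "$1" = "$2" ]
}

# Poll until the address drifts from the one hard-coded in the iptables rule,
# then report the change on stderr and return non-zero.
watch_rds_ip() (
    domain=$1; pinned_ip=$2; interval=${3:-5}
    while :; do
        current=$(resolve_ip "$domain")
        if ! ip_unchanged "$pinned_ip" "$current"; then
            echo "$(date): $domain resolves to '${current}', expected $pinned_ip" >&2
            exit 1
        fi
        sleep "$interval"
    done
)

# Usage (hypothetical domain name):
#   watch_rds_ip rds-v1.example.internal 192.168.1.19 5
```

An empty lookup result is treated as a change, so a resolution failure also stops the loop and gets noticed.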

Stop data source in the classic network
Once the databases of all VPC containers point to rds-v1, you can stop rds-c1, for two reasons:
1. Ensure the containers in the VPC won’t read or write data in rds-c1.
2. The above iptables rule only applies to new TCP connections. Stopping rds-c1 forces established connections to be dropped and re-established.
Note that the RDS instance itself cannot be shut down yet. Instead, clear the rds-c1 whitelist in the Console to ensure that no node can read or write data in rds-c1.

Resume the service
Add ecs-v2 to the backend of slb-p1 and the service will be resumed.

Modify RDS configurations of containers
At this moment, the RDS address in every container is still rds-c1, and requests are forwarded to rds-v1 by the iptables rules. In this step, we change the RDS address in every container to rds-v1 and redeploy the containers so that they connect to rds-v1 directly.

Remove redirection rules
After the containers connect to rds-v1 directly, the iptables rules are no longer needed. We can remove them:
    # List the nat rules with line numbers, then delete the PREROUTING rule by its number
    iptables -t nat -L --line-numbers
    iptables -t nat -D PREROUTING 1

Shut down related resources in the classic network
The migration is complete and you can remove the obsolete resources.
• Release the DTS.
• Release the RDS.
• Release the ECS.

Estimate the service downtime
The service downtime we estimated before the migration:
• Stop service: remove the backend node from slb-p1 (service downtime starts)
• Redirect data source connections (estimated five seconds)
• Stop rds-c1: clear the RDS whitelist (estimated 30 seconds)
• Restore service: add ecs-v2 to the backend of slb-p1 and wait for the state to return to healthy (120 seconds)
The estimated service downtime is 155 seconds in total.
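To compare estimates like these with measured values, each cutover step could be wrapped in a small timing helper. The sketch below is only an illustration. `run_step` is a name of our own, and the commands in the usage comment are hypothetical stand-ins, since several steps in our trial were done by hand in the Console.

```shell
# Run one cutover step, print how long it took, and stop on failure.
# Relies on `date +%s`, which GNU and BSD date both support.
run_step() (
    label=$1; shift
    start=$(date +%s)
    if ! "$@"; then
        echo "FAILED: $label" >&2
        exit 1
    fi
    echo "$label: $(( $(date +%s) - start ))s"
)

# Usage (the step commands are hypothetical placeholders):
#   run_step "redirect data source" sh ./apply-dnat.sh &&
#   run_step "stop rds-c1"          sh ./clear-whitelist.sh
```

Chaining the calls with `&&` stops the sequence at the first step that fails.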

Reality
We used a bash script to monitor the availability of the site's data service in a loop throughout the migration:
    while true
    do
        r=$(curl -sS 'http://example.com/data' | grep 'ok')
        d=$(date)
        echo "${d}: ${r}"
        sleep 0.5
    done
The actual service downtime of the migration met the expectation, at about 2.5 minutes.
The downtime came mainly from two steps:
1. Manual operations. Because this was our first migration, we had no scripts to automate the process. We performed some operations by clicking in the Console, which took around half a minute.
2. After ecs-v2 was added to slb-p1, it took around two minutes for the SLB state to return to healthy. Because the SLB does not support backend instances in both networks at the same time, we have no way to avoid the time spent switching the SLB backend instances. Modifying DNS, that is, replacing the SLB, would avoid this problem, but DNS changes take longer to become effective, and the two SLBs (one per network) would have to provide services side by side for a long time. This is not a big problem for the testing environment, but it is costly and risky for the official environment with its high traffic.
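The output of the monitoring loop can be turned into a rough downtime figure by counting failed probes. The sketch below assumes the log format printed by that loop, `<date>: <result>`, with an empty result when the probe fails. The function name and `monitor.log` are hypothetical. Because each `curl` call also takes time, the result is a lower bound and not an exact measurement.

```shell
# Estimate downtime from the monitoring log read on stdin.
# $1 is the probe interval in seconds (0.5 in the loop above).
estimate_downtime() (
    interval=${1:-0.5}
    awk -v step="$interval" '
        # A failed probe leaves nothing after the final ": ".
        /: *$/ { failed++ }
        END { printf "%.1f\n", failed * step }
    '
)

# Usage (monitor.log is a hypothetical capture of the loop output):
#   estimate_downtime 0.5 < monitor.log
```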

Problems encountered
The docker port of the other end cannot be accessed in VPN
When we first set up the GRE VPN, we connected ecs-c1 and ecs-v1 with the GRE tunnel.
After the setup, we tested the connection from ecs-v2 to ecs-c1, which succeeded:
    nc -zv 10.1.1.1 22
    Connection to 10.1.1.1 22 port [tcp/ssh] succeeded!
But the connection from ecs-v2 to a port exposed by docker on ecs-c1 failed:
    nc -zv 10.1.1.1 53
    TIMEOUT
This problem troubled us for a long time. We suspected an incompatibility between GRE and NAT: NAT modifies the IP address of the packet, so the reply packet never makes it back through the GRE tunnel. We therefore changed the scheme and connected ecs-v1 to another classic-network ECS instance, ecs-c2. With this setup, ecs-v2 can access the docker port on ecs-c1: the packet first arrives at ecs-c2 and then goes on to ecs-c1, and the access from ecs-c2 to ecs-c1 is transparent to the GRE tunnel.
The drawback of this solution is that ecs-c1 cannot reach the VPC in the reverse direction. In the actual migration process, however, ecs-c1 never needs to access the VPC.
We later discussed this issue with Alibaba Cloud developers. They proposed a solution: adding a route inside the docker container:
    docker exec -ti --privileged=true <container_id> /bin/bash
    ip route add 192.168.2.1/32 via 192.168.0.1 dev eth0
This solution is cumbersome because it requires modifying every container. In addition, the external network environment should be transparent to a docker container, which is meant to be independent, so there is no good reason to modify the route of every container. We did not adopt this scheme.

DTS synchronization error
After the DTS synchronization task was created, an error was reported in the first step, “structure synchronization”:
    class java.lang.Exception: Failed to request creating the missing database, failed to create DB[{"message":"The specified DB instance name does not exist.","apiCode":"InvalidDBInstanceName.NotFound","requestId":"9446bbec-8b1f-4a72-9eab-aeceeb910b2e","data":null,"code":"404","success":false}]
We submitted a ticket to Alibaba Cloud. Customer service explained that the instance contained special databases, namely ones with underscores in their names, and that such databases must be created manually in the target instance with the corresponding read-write permissions granted. After repeated attempts, we solved the problem by manually creating all the databases, whether or not their names contained underscores.

Failure to redirect data from the container using iptables
We first used iptables to redirect all the traffic to the VPC through the OUTPUT chain:
    iptables -t nat -I OUTPUT --dest 10.1.1.2 -p tcp --dport 3306 -j DNAT --to-dest 192.168.1.19
In actual tests, this rule only took effect for connections initiated from the host itself. Connections from the containers still went to the classic network.
We then studied how packets flow through each table and chain of iptables, as shown in the figure below. Packets from a container do not originate on the host, so they pass through the PREROUTING and FORWARD chains. Only packets generated by the host itself pass through the OUTPUT chain.

So you have to add the rule to the PREROUTING chain to apply it to connections from the containers:
    iptables -t nat -I PREROUTING --dest 10.1.1.2 -p tcp --dport 3306 -j DNAT --to-dest 192.168.1.19
Conversely, packets generated on the host itself do not pass through the PREROUTING chain. So you must set the rule in both PREROUTING and OUTPUT to make it effective for the host and the containers at the same time.
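Since the same rule has to be kept in sync across two chains, one option is to generate both commands from a single place. The helper below is a hypothetical sketch, not part of iptables. It only prints the commands so they can be reviewed before being piped to a root shell. It uses `-d` and `--to-destination`, the full spellings of the options abbreviated as `--dest` and `--to-dest` above.

```shell
# Print the DNAT command for both the PREROUTING and OUTPUT chains.
# $1 is the iptables action: -I to insert the rules, -D to delete them again.
dnat_rules() (
    action=$1; old_ip=$2; new_ip=$3; port=${4:-3306}
    for chain in PREROUTING OUTPUT; do
        echo "iptables -t nat $action $chain -d $old_ip -p tcp --dport $port -j DNAT --to-destination $new_ip"
    done
)

# Show the two commands that redirect rds-c1 traffic to rds-v1.
dnat_rules -I 10.1.1.2 192.168.1.19
```

Calling the helper with `-D` and the same addresses prints the matching delete commands, which covers both rollback and the later cleanup without relying on rule line numbers.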

After iptables redirection, the rule is not effective for established connections
The above iptables rule only applies to newly established connections, not to connections that already exist. The improvement is:
Stop rds-c1 (by clearing its whitelist in the classic network) after the iptables rule is added and before the service is resumed.
The advantages of doing so include:
• Force the clients to re-establish their connections to the data source, so that the iptables redirection rule takes effect.
• Ensure that the data source being accessed is the one in the VPC.

Error in accessing VPC RDS after iptables redirection
Before the actual migration, we used a script to verify that connections redirected by iptables worked, to make sure the migration would go smoothly. The script read data from the classic-network RDS continuously, but after the iptables redirection to the VPC RDS it reported an error and could not read any data from the VPC RDS.
The reason was that the password of the classic-network RDS was relatively weak. When the VPC RDS was created, Alibaba Cloud required a stronger password, so the two passwords differed. The iptables redirection only changes the IP address; the client still authenticated with the password of the classic-network account, hence the read error.
Make sure the two passwords are consistent when using this method.
Before the migration, we changed the password of the classic-network RDS to the same strong password as in the VPC. The RDS account is used by more than 50 containers. To change the password smoothly without interrupting the service, you can first create a temporary account, switch all containers to the temporary account, change the password of the original account, and then switch all containers back to the original account with the new password. To reduce the workload, we instead stopped the service for a period of time and changed the password directly.

The backend of the private SLB cannot use the TCP mode
We tried to add ecs-v2 to the backend of a private SLB. With the HTTP listening mode, it worked normally. With the TCP listening mode, however, the SLB was inaccessible and requests timed out, even though its health state was “normal”.
Alibaba Cloud customer service said this was a network bug: if the client and the SLB backend ECS are in the same network segment and the SLB is in TCP mode, access problems may occur. The suggested temporary workaround was to remove the direct route from the host, and to add the command to /etc/rc.local so that it is reapplied after a restart.
    route del -net 192.168.1.0 netmask 255.255.255.0
Our tests showed that this command did not solve the problem. Instead, it caused ecs-v2 to drop off the network frequently, making the service unavailable and SSH logins fail. In the end, we had to restart the instance in the Console.
We submitted the ticket for this issue on November 22, 2016. Tests showed that the issue had been resolved by the time I wrote this post (January 3, 2017).

Failure to access RDS in containers
When the migration was complete, we changed the RDS configuration in the containers to the VPC RDS. A strange problem then appeared: the containers could not access the RDS, or any instance in the 192.168.1.x segment except the instance itself.
At first we suspected the GRE and NAT setup, but the problem persisted after the GRE VPN was disconnected.
We finally traced the cause to the route table on the host. Two of the routes were:
    192.168.0.0    *    255.255.240.0    U    0    0    0    docker0
    192.168.1.0    *    255.255.255.0    U    0    0    0    eth0
The subnet mask of the first route was 255.255.240.0. An RDS address such as 192.168.1.19, masked with 255.255.240.0, yields 192.168.0.0, so it also fell within the docker0 route. As a result, packets that should have gone out through eth0 went to docker0, and the containers failed to reach the RDS.
To solve the problem, change the subnet mask of the docker0 route to 255.255.255.0:
    192.168.0.0    *    255.255.255.0    U    0    0    0    docker0
    192.168.1.0    *    255.255.255.0    U    0    0    0    eth0
The root cause is that docker creates docker0 in the 172 segment by default, but a new ECS instance comes with several default routes, one of which covers the 172 segment. Docker therefore created docker0 in the 192.168 segment instead, which conflicted with the VPC segment.
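The mask arithmetic behind this conflict can be checked directly. The sketch below uses plain shell arithmetic. The helper names (`ip_to_int`, `int_to_ip`, `network_of`) are our own, and the code assumes well-formed dotted-quad IPv4 input.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1    # split the address into its four octets
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert a 32-bit integer back to dotted-quad notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Network address of an IP under a given netmask (bitwise AND of the two).
network_of() {
    int_to_ip $(( $(ip_to_int "$1") & $(ip_to_int "$2") ))
}

network_of 192.168.1.19 255.255.240.0   # prints 192.168.0.0, the docker0 network
network_of 192.168.1.19 255.255.255.0   # prints 192.168.1.0, so only the eth0 route applies
```

With the wider mask, the RDS address lands in the same network as docker0. With 255.255.255.0, the two networks no longer overlap.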

Conclusion
We encountered many problems during the test environment migration and managed to solve or work around them one by one. The service downtime also met our expectation. Compared with the test environment, the official environment carries much more traffic. For the test environment, we simply ran a script in a container in the VPC to check whether it could read and write data in RDS. In the official environment, however, we would like to route real traffic to the VPC for a grayscale test, which could catch unknown issues before all traffic is switched over. But the SLB does not support ECS instances from both networks as backends at the same time.
The migration was not as imperceptible as we had hoped. But we now have a deep understanding of the issues that must be addressed to achieve an imperceptible migration. The main obstacles are:
• RDS switching
DTS provides a good data synchronization experience, and the delay can be as low as several hundred milliseconds. However, to ensure that the data is 100% synchronized to the target RDS, the business containers must stop serving before the RDS switch so that no new data is written to the source RDS, which matters especially for high-traffic services. Service downtime for RDS switching appears unavoidable, but downtime on the order of seconds is acceptable.
• Classic network/VPC switching
o The public SLB of Alibaba Cloud does not support ECS instances from both the classic network and the VPC as backends at the same time. As a result, you must remove the classic network ECS before adding the VPC ECS to the SLB. It takes several minutes for the SLB to add an instance as a backend and complete the health check. So if you adopt the SLB backend switching approach, downtime on the order of minutes is unavoidable.
o If you adopt the SLB replacement and DNS modification scheme, ECS instances in both networks have to provide services simultaneously for a long time. The old and new SLBs will direct traffic to the ECS instances in the classic network and the VPC respectively, and the VPC ECS will read and write data in the classic-network RDS through a VPN over the internet. The risk is that we can control neither when the DNS change takes effect nor how much traffic goes to the VPC. In addition, because the VPC ECS accesses the classic-network RDS over the internet, it incurs considerable bandwidth costs.
The above-mentioned problems are difficult to work around. We hope Alibaba Cloud can solve them at the infrastructure layer. It would help a lot if the public SLB could connect to ECS instances in the classic network and the VPC at the same time:
1. It eliminates the minute-level service downtime for backend switching.
2. When the two networks coexist, you can assign traffic to the VPC by weight to run a grayscale test, keeping risks and costs within a controllable range.