Solved with Aviatrix: Securely bring Google-managed Apigee-X to another Cloud or Datacenter

 

Product Focus

Google Apigee is one of the most popular tools to expose APIs to the wider world.

From the Apigee documentation: Apigee is a platform for developing and managing APIs. By fronting services with a proxy layer, Apigee provides an abstraction or facade for your backend service APIs and provides security, rate limiting, quotas, analytics, and more.

This blog is focused on Apigee-X, which is hosted in Google Cloud.

Aviatrix is the leading vendor of Secure Multicloud Networking.

From the website: The Aviatrix cloud network platform delivers a single, common platform for multi-cloud networking, regardless of public cloud providers used. Aviatrix delivers the simplicity and automation enterprises expect in the cloud with the operational visibility and control they require.

Aviatrix can easily connect Google Apigee-X to AWS, Azure, and on-prem backends with full encryption and visibility.

Background

Our customer has standardized on AWS and Google Cloud. The customer had already decided on Aviatrix to provide Transit, Visibility, and Security for AWS. Using Aviatrix to provide a consistent Google Cloud architecture was an easy decision.

Our customer has a large number of systems that exist not only in Google Cloud but also in AWS and in legacy datacenters. Typical API usage is from 3rd party vendor connections, mobile apps, and web pages. Access can be from anywhere.

These APIs are mission critical and revenue generating. Outages are unacceptable. To protect the APIs, Apigee-X was chosen to expose these systems. Apigee-X is deployed in multiple regions to protect availability.

Apigee-X Design

Google Apigee-X is a hybrid service; resource ownership is split between Google and the customer. Private Services Access is used to facilitate the resource relationship. The Service Producer VPC is allocated a CIDR by the customer when deploying Apigee-X. This VPC is peered to a Customer-owned VPC.
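
As a rough illustration of that relationship, the Private Services Access pieces look something like the gcloud sketch below. The range and network names (apigee-psa-range, apigee-ingress-global) are assumptions for this write-up; 10.10.0.0/21 is the Apigee-X CIDR used later in this post.

    # Hedged sketch: reserve the customer-chosen CIDR for the Service Producer VPC,
    # then connect the Private Services Access peering to the customer VPC.
    # Range/network names are assumptions; adjust to your environment.
    gcloud compute addresses create apigee-psa-range \
        --global \
        --purpose=VPC_PEERING \
        --addresses=10.10.0.0 \
        --prefix-length=21 \
        --network=apigee-ingress-global

    gcloud services vpc-peerings connect \
        --service=servicenetworking.googleapis.com \
        --ranges=apigee-psa-range \
        --network=apigee-ingress-global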


The below image represents the Apigee-X Network flows. The original image is from the Architecture overview.
(Click to Enlarge)

In customer discussions, it was helpful to label the flow types.
  • Flow 1: Ingress flows to the Apigee-X Frontend in the peered Service VPC. These are generally from the Internet but could be Internal.
    • Google's reference design uses their Global External HTTPS Load Balancer.
    • The Load Balancer's backends are regional Managed Instance Groups running Linux/IPTables.
    • The Backends must be in the same Global VPC.
    • The MIGs NAT traffic to the real Apigee-X endpoint in the peered Service VPC (see the sketch after this list).
  • Flow 2: Flows south to backend API resources via Private Connectivity. Google's diagram places the backend in the same VPC as the Ingress, but that is (obviously?) oversimplified.
  • Flow 3: Flows south to backend API resources over Public Internet. Our customer has a strict policy against this kind of traffic, so it will not be used.
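
For Flow 1, the MIG instances are doing plain Linux NAT. Below is a minimal sketch of that idea; it is not Google's published startup script, and the endpoint IP is a placeholder assumption.

    # Minimal NAT sketch for a MIG instance (Flow 1), assuming the Apigee-X
    # endpoint in the peered Service VPC is reachable at APIGEE_ENDPOINT.
    APIGEE_ENDPOINT=10.10.0.2        # placeholder; use the real Apigee-X endpoint IP
    sysctl -w net.ipv4.ip_forward=1
    iptables -t nat -A PREROUTING -p tcp --dport 443 \
        -j DNAT --to-destination "${APIGEE_ENDPOINT}:443"
    iptables -t nat -A POSTROUTING -j MASQUERADE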

Apigee-X Design Challenges

The above reference is fine if the network is simple. However, most companies will want multi-region support, firewall insertion, and visibility. Their endpoints will be spread across multiple networks in Cloud or on-prem. Backend Public IP connectivity needs to be reduced or eliminated.

What if we could keep that simplicity while merging the design into a larger network?

Challenge: Global VPC regional routing

Regional routing affinity in the VPC routing table can be difficult, especially when coupled with VPC peering.
  • Dynamic (BGP) Routes are required to route regionally with effective failover.
  • There is no way to specify regionality in a Static Route. You can't specify a route as preferred for US Central and another as preferred for EU West.
  • Routes and VMs can be Network Tagged to simulate regional routing, but failover to other regions is a manual process. Tagged Routes do not cross the VPC peering and cannot be used by the Apigee-X Runtime.
  • Google's Firewall Insertion reference runs into the same issue, as it specifies Static Routes to Network Load Balancers to direct traffic.
There are native ways to work around this issue, such as using VPN or Cloud Interconnect, but those introduce their own caveats. A short sketch of the tagged-route limitation follows.
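
As a hedged illustration, the sketch below creates a tagged static route toward a hypothetical next-hop instance; every name and range in it is made up for this example.

    # A tagged static route only applies to instances carrying the tag, offers no
    # automatic failover to another region's next hop, and is not exported across
    # VPC peering, so the Apigee-X runtime in the Service VPC never sees it.
    gcloud compute routes create to-backends-uswest \
        --network=apigee-ingress-global \
        --destination-range=10.20.0.0/16 \
        --next-hop-instance=fw-uswest-1 \
        --next-hop-instance-zone=us-west1-a \
        --tags=uswest-workloads \
        --priority=100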

Aviatrix Solution

The customer can use their existing Aviatrix deployment in AWS and simply add on the Google reference design. The solution enables regional routing, as well as the other Aviatrix advantages such as visibility, security, and easy Firewall Insertion.

(Click to Enlarge)

Design Components

  • Global Load Balancer: sends client traffic from Clients to the MIGs.
  • Global VPCs:
    • apigee-service-vpc: This is Google's VPC in their diagram. This VPC is peered to the apigee-ingress-global VPC/Customer Global VPC.
    • apigee-ingress-global: This is the Customer Global VPC equivalent in Google's architecture diagram above. It contains:
      • MIG for IPTables (NAT on Flow #1)
      • NCC/Cloud Router and Aviatrix Transit Gateway NIC as Router Appliance Spoke
  • For each Google Cloud Region:
    • transit: Regional Aviatrix Transit VPC with an HA Pair of Gateways. Each Gateway has a NIC in the apigee-ingress-global VPC, as mentioned above.
    • transit-fw-exch: VPC for Aviatrix Transit to forward to the firewall.
    • cp-external: Ingress/Egress Public VPC for firewall
  • AWS us-east-1:
    • Transit Firenet example with Spokes
    • Connected to Google Transit using High Performance Encryption

Aviatrix Transit and NCC/Cloud Router (Dynamic BGP Routing)

This solution solves the Global VPC routing challenge. NCC is used to install Aviatrix routes into the apigee-ingress-global VPC routing table with regional affinity. These dynamic (BGP) routes are carried into the apigee-service-vpc using VPC peering and export custom routes.

Specific features:

Flows

  • Flow 1: This is the standard GCP architecture, aside from Akamai.
    • The client reaches Akamai as the front end over the public Internet.
    • Akamai proxies to the Global Load Balancer, also over the public Internet.
    • The XLB directs traffic to the closest MIG using the Google Network.
    • The MIG instance forwards traffic across the peering to the Apigee-X regional Load Balancer.
  • Flow 2: An Apigee-X regional instance reaches out to a backend API.
    • The Dynamic (BGP) routes received across the VPC peering direct the SYN to the Aviatrix Transit Gateway NIC in the appropriate region.
    • As required, the Transit Gateway hairpins the SYN to the regional Firewall for flow inspection.
    • If allowed, the Transit Gateway sends the SYN to the next hop. This could be:
      • A Spoke Gateway attached to this Transit Gateway (not shown).
      • Another Transit Gateway in GCP.
      • Another Transit Gateway in AWS/Azure.
      • An on-prem datacenter via IPSec or Cloud Interconnect (not shown).
    • The SYN is eventually forwarded to the destination VPC Spoke or to the on-prem datacenter and on to the API backend.
    • The backend will respond with a SYN/ACK and eventually the flow will be established.


Implementation

Note: Deployment of Apigee-X and the MIGs is beyond the scope of this document.

Prerequisites

  • Apigee-X deployment CIDR: 10.10.0.0/21
  • Existing Aviatrix Controller/deployment in any Cloud
  • GCP credentials for the Project, onboarded to the Aviatrix Controller.
  • IP ranges for the subnets. I prefer to do aggregate blocks per Cloud and per region. Ex:
    • us-west1: 10.3.0.0/16
    • us-central1: 10.1.0.0/16
    • us-east1: 10.2.0.0/16
  • 4 unique private ASNs: one per Aviatrix Transit Gateway and one shared by all of the Google Cloud Routers.
    • us-west1: 64851
    • us-central1: 64853
    • us-east1: 64855
    • All Cloud Routers: 64852
  • Export Custom Routes is enabled on the apigee-ingress-global <-> apigee-service-vpc peering (a gcloud sketch follows this list).
    This can be changed after Apigee-X deployment, if required.
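
A possible gcloud form of that prerequisite is below; the peering name is an assumption (use whatever peering name exists between the two VPCs in your project).

    # Export the dynamic (BGP) routes from apigee-ingress-global across the
    # peering so the Apigee-X Service VPC can learn them.
    gcloud compute networks peerings update apigee-service-peering \
        --network=apigee-ingress-global \
        --export-custom-routes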

Subnets Deployed

    (Click to Enlarge)

Aviatrix Transit Gateways

Log into the Aviatrix Controller and click Multicloud Transit >> Setup.
(Click to Enlarge)


Fill out the form for Step 1, similar to the below.

(Click to Enlarge)
  • High Performance Encryption/Insane Mode is optional in this example.
  • Be sure to pick a Gateway size that will handle 3+ NICs.
  • The BGP over LAN subnet must be the regional subnet deployed in the apigee-ingress VPC.
  • Important: Repeat this task 3 times for the 3 regions. Double check subnet locations.

Aviatrix Transit Gateways, HA

Once the Transits are built, we can then deploy the HA Gateway.

Fill out the form for Step 2, similar to the below.


(Click to Enlarge)

  • For the Backup Gateway zone, pick a different zone.
  • The BGP over LAN subnet must be the regional subnet deployed in the apigee-ingress VPC.
  • Important: Repeat this task 3 times for the 3 regions. Double check subnet locations.

Gateway Configuration

Go to Multi-Cloud Transit >> Advanced.
(Click to Enlarge)
Set the Local AS Number and click change.

Scroll down until you see Multi-Tier Transit and click Enable.

Repeat this step for all 3 gateways, keeping in mind the unique Local AS Number.

Transit Peerings

These peerings enable failover routing. Under normal conditions we maintain regional affinity, but alternate routing patterns take over if there is a zonal or regional failure.

Go to the Multi-Cloud Transit Peering workflow. The link is two entries below Setup, which we used for the Transit Gateway deployments.


Create 3 sets of Transit Peerings.
  • transit-uswest to transit-uscentral
  • transit-uscentral to transit-useast
  • transit-useast to transit-uswest 




(Click to Enlarge)

When done, we should see something like this.

Aviatrix/NCC Integration

Up to this point, we haven't done any serious integration with the Apigee-X deployment's networking. We have NICs deployed in the apigee-ingress-global VPC that now need to be configured. 

Aviatrix BGP over LAN

Go to the Multi-Cloud Transit >> Setup, then choose External Connection.
(Click to Enlarge)


Choose External Device, BGP, and LAN to populate the full step.

(Click to Enlarge)

Fill out the step.
(Click to Enlarge)

  • The above information will be used with NCC/Cloud Router configuration.
    • The Purple box is the Aviatrix Transit Gateway's unique ASN.
    • The Orange boxes represent the NCC/Cloud Routers' ASN.
    • The Red boxes represent the NCC/Cloud Router's two peer IPs. 
      • These IPs are arbitrary but must be unique in the Transit Gateway BGP subnet.
      • For the uswest gateway, the subnet CIDR is 10.3.1.0/28.
  • Repeat this task 3 times for the 3 regions, updating the ASNs/IPs as appropriate.

Google NCC/Cloud Router: Build Spokes

Under the Hybrid Connectivity section in the Google web console, go to Network Connectivity Center. Create a new Hub.

(Click to Enlarge)

Click Create to move to Add Spokes.
(Click to Enlarge)

The red boxes show the configuration needed.
  • Make sure to pick the correct region and VPC. Otherwise the association is created with the wrong NICs.
  • Make sure Site-to-Site data transfer is on.
Click Add Spoke to add the additional 2 regions. Then click Create. Once deployed, we can see the Spokes displayed in the GUI.

(Click to Enlarge)

The alert is because we haven't configured BGP sessions for the Spokes.
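
For reference, the Hub and a Spoke above could also be created from the CLI along these lines. Treat this as a hedged sketch: the exact spoke subcommand has changed across gcloud releases, and the project, instance, and IP values shown are assumptions for this lab.

    # Create the NCC Hub, then a router-appliance Spoke per region pointing at the
    # Aviatrix Transit Gateway NICs in the apigee-ingress-global VPC.
    gcloud network-connectivity hubs create apigee-ncc-hub \
        --description="Hub for Aviatrix router-appliance spokes"

    gcloud network-connectivity spokes linked-router-appliances create spoke-uswest \
        --hub=apigee-ncc-hub \
        --region=us-west1 \
        --site-to-site-data-transfer \
        --router-appliance=instance=projects/PROJECT_ID/zones/us-west1-a/instances/avx-uswest-transit,ip=10.3.1.4 \
        --router-appliance=instance=projects/PROJECT_ID/zones/us-west1-b/instances/avx-uswest-hagw,ip=10.3.1.5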

Google NCC/Cloud Router: Configure BGP

Click one of the Spoke names in the view above then expand the sections for each Aviatrix Gateway.
(Click to Enlarge)

Four BGP sessions will be configured. These sessions align with the BGP ActiveMesh created in the Aviatrix GUI. Click the first Configure BGP Session link.

(Click to Enlarge)

In the top half, the Google ASN is always the same; in this demo, 64852.
Choose Create custom routes and scroll down to configure.
(Click to Enlarge)


We want the Cloud Router to also advertise the Service VPC CIDR for Apigee-X. It doesn't matter if that VPC exists yet.
  • Keep advertising existing visible routes.
  • Add a custom route for the Service VPC CIDR. In this example, it is 10.10.0.0/21.
Click Create and Continue.
(Click to Enlarge)
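
The equivalent Cloud Router creation from the CLI might look like the sketch below; the router name is an assumption, while the ASN and the Service VPC CIDR match the values used in this post.

    # Create the regional Cloud Router with custom advertisements: all subnets
    # plus the Apigee-X Service VPC CIDR.
    gcloud compute routers create cr-apigee-uswest \
        --region=us-west1 \
        --network=apigee-ingress-global \
        --asn=64852 \
        --advertisement-mode=custom \
        --set-advertisement-groups=all_subnets \
        --set-advertisement-ranges=10.10.0.0/21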

After a minute, the Cloud Router is created. Click Edit BGP Session on line 1.

(Click to Enlarge)


Note that the Peer ASN is the same as the Aviatrix Transit in that region.
The Cloud Router BGP IP is the same as specified for the Remote LAN IP.

Click Save and Continue, then click on Edit BGP Session for line 2.
(Click to Enlarge)

Everything is the same, except the Cloud Router BGP IP matches the Remote LAN IP (Backup) field.

Click Save and Continue.
(Click to Enlarge)
Click Create.
(Click to Enlarge)

Now choose Configure BGP Session for the HA Gateway.

(Click to Enlarge)

Select the existing Cloud Router from the Primary gateway, then click Select & Continue. Edit the BGP session for line 1.
(Click to Enlarge)

Since the Cloud Router is already configured, all we have to provide is a name and ASN. This is the same Aviatrix Transit ASN that we configured before.

Click Save and Continue, then finish the other session as done previously.

(Click to Enlarge)

This Spoke is now configured. Configure the BGP Peerings for the other spokes in the same way, keeping in mind the ASNs and IPs will change.
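
For one region (us-west1), the same BGP wiring could be expressed roughly as below. Treat this as a sketch: the interface, peer, and instance names are assumptions, the Cloud Router IP and Aviatrix NIC IP are examples from the 10.3.1.0/28 BGP subnet, and only the ASNs (64851/64852) come directly from this post.

    # Add a Cloud Router interface in the BGP-over-LAN subnet, then peer it with
    # the primary Aviatrix Transit Gateway (repeat for the HA gateway and the
    # second Cloud Router interface).
    gcloud compute routers add-interface cr-apigee-uswest \
        --region=us-west1 \
        --interface-name=if-avx-uswest-0 \
        --ip-address=10.3.1.2 \
        --subnetwork=apigee-ingress-uswest-bgp

    gcloud compute routers add-bgp-peer cr-apigee-uswest \
        --region=us-west1 \
        --peer-name=peer-avx-uswest-primary \
        --interface=if-avx-uswest-0 \
        --peer-ip-address=10.3.1.4 \
        --peer-asn=64851 \
        --instance=avx-uswest-transit \
        --instance-zone=us-west1-a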

Once we confirm BGP is up and routes are exchanged, we can call the implementation phase complete.


Testing

Verification - does the Service VPC CIDR make it where it should?

(Click to Enlarge)

All green. That means BGP is up.

In the Aviatrix Controller, go to the Multi-Cloud Network >> List view.

(Click to Enlarge)


Select one of the Primary Transits.
(Click to Enlarge)

Then click Details/Diag. Scroll down and look for Best Routes via BGP.
(Click to Enlarge)

All subnet routes are accounted for, including the custom route that represents the Service VPC. Please check the other Transit Gateways for the Service VPC route.

(Click to Enlarge)

The Service VPC CIDR makes it all the way to my on-prem Palo Alto, via another Aviatrix Transit and a VPN tunnel.

(Click to Enlarge)


The AS path, for good measure; notice the 64852 at the end?


To prove out the routing, I deployed the above VM in a mock Service VPC.

(Click to Enlarge)


I ran traceroute from my laptop in my office. The packet is forwarded into the avx-uscentral-hagw, then to the Google VPC Routing table. The packet is routed globally and across the peering in the Google Network. Peering does not show up in a traceroute.

Verification - how about the peered VPC?

Does the peered VPC see all of the routes from the Transit? What about regional affinity?

(Click to Enlarge)

The above prefixes are advertised to GCP via VPN.

The only way I have found (so far) to prove this out from the Google API is to use Network Intelligence/Connectivity Center tests. These tests appear to run a traceroute of sorts and are useful for seeing the computed next hops with dynamic routes. See the Into the weeds section at the bottom for some detail on the process.
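
A hedged CLI form of such a test is below. The test, project, and instance names are assumptions; the destination is the on-prem lab address used later in this post.

    # Create a Connectivity Test from a test VM toward the on-prem lab address,
    # then inspect the computed forwarding paths.
    gcloud network-management connectivity-tests create apigee-backend-test \
        --source-instance=projects/PROJECT_ID/zones/us-west1-a/instances/test-vm \
        --destination-ip-address=192.168.2.3 \
        --destination-port=443 \
        --protocol=TCP

    gcloud network-management connectivity-tests describe apigee-backend-test \
        --format="json(reachabilityDetails)"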

I ran these connectivity tests against the VM from the previous section.

(Click to Enlarge)

The above tests are in the normal state with all Gateways up. The test returns 4 paths, corresponding to the 4 BGP sessions established. Priority is 0 for same-region routes.

Since there are really only 2 real paths, the Aviatrix Transit Primary and HA, there are duplicates.

(Click to Enlarge)


Verification - what if a pair of Aviatrix Transits is down?

Since my test VMs are in us-west1, I'll shut down the same region Transit.

(Click to Enlarge)



I still see the 4 identical paths, but now traffic is routed via us-central1 with priority 236. My destination IP (192.168.2.3) is connected by a VPN in us-central1, so this routing makes sense.

(Click to Enlarge)


Into the weeds


Initially, I wanted to prove the on-prem prefixes are learned into the Service VPC with route affinity by showing route tables.

(Click to Enlarge)

These are the effective routes for a test VM (us-west1) in the Service VPC. We see 3 pairs of routes, each pair with the same priority but no real next hop.

(Click to Enlarge)

Strangely, the same test on a VM instance (us-west1) in the apigee-ingress-global VPC doesn't show different priorities.

The Cloud Router documentation states that dynamic routes have a priority assigned based on routing distance. The gcloud compute routes list command states that it won't return dynamic routes, referring us instead to the gcloud compute routers get-status command. However, that command only runs against a regional Cloud Router, not a VM Instance.
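
For completeness, a hedged example of that command against a regional Cloud Router (the router name carries over from the earlier sketches):

    # Show the routes a specific regional Cloud Router has computed, plus BGP peer
    # status; dynamic routes do not appear in 'gcloud compute routes list'.
    gcloud compute routers get-status cr-apigee-uswest \
        --region=us-west1 \
        --format="json(result.bestRoutesForRouter,result.bgpPeerStatus)"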

I moved on to the Network Intelligence/Connectivity Center tests to see if I could coax out some more information. I ran a test to a firewall mgmt interface (192.168.2.3) in my on-prem lab for both the apigee-ingress-global VPC and the Service VPC test VMs.

In both cases, the tests succeeded and found 4 identical paths, save for the Peering hop on the Service VPC. The REST output returned more detail with the following route names:
  • "displayName": "dynamic-route_17734467156087659059"
  • "displayName": "dynamic-route_6076300278419112525"
  • "displayName": "dynamic-route_11702578725044332292"
  • "displayName": "dynamic-route_3878110521043239369"
The next hop is one of the two us-west1 Aviatrix Transit Gateways.

Conclusion: Google's general route and Cloud Router commands aren't appropriate for path testing. The Connectivity Center tests are an effective way to determine the paths in the present state.
