Enabling Replication Between Clusters with Kerberos Authentication
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
To enable replication between clusters, additional setup steps are required to ensure that the source and destination clusters can communicate.
Ports
When using BDR with Kerberos authentication enabled, BDR requires all the ports listed on the following page: Port Requirements for Backup and Disaster Recovery.
Additionally, the port used for the Kerberos KDC Server and KRB5 services must be open to all hosts on the destination cluster. By default, this is port 88.
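A quick way to confirm that the KDC port is open from a destination host is to attempt a TCP connection to it. The following is a minimal sketch, not a Cloudera tool: the function name `kdc_port_reachable` is hypothetical, and it only tests TCP, whereas Kerberos clients may also use UDP on the same port.

```python
import socket


def kdc_port_reachable(host: str, port: int = 88, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 88 is the default KDC port named in this topic. A False result can
    mean a closed port, a firewall rule, or a host name that does not resolve.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run it on each destination host against each KDC, for example `kdc_port_reachable("src-kdc-1.src.myco.com")`, using the KDC host names from your own krb5.conf file.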
Considerations for Realm Names
- If the clusters do not use the same KDC (Kerberos Key Distribution Center), Cloudera recommends that you use a different realm name for each cluster. Additionally, if you are replicating across clusters in two different realms, see the steps for HDFS and Hive replication later in this topic to set up trust between those clusters.
- You can use the same realm name if the clusters use the same KDC or different KDCs that are part of a unified realm, for example where one KDC is the master and the other is a slave KDC.
  Note: If you have multiple clusters that are used to segregate production and non-production environments, using the same realm name could result in principals that have equal permissions in both environments. Make sure that permissions are set appropriately for each type of environment.
HDFS Replication
- On the hosts in the destination cluster, ensure that the krb5.conf file (typically located at /etc/krb5.conf) on each host has the following information:
- The kdc information for the source cluster's Kerberos realm. For example:
  [realms]
    SRC.MYCO.COM = {
      kdc = src-kdc-1.src.myco.com:88
      admin_server = src-kdc-1.src.myco.com:749
      default_domain = src.myco.com
    }
    DEST.MYCO.COM = {
      kdc = dest-kdc-1.dest.myco.com:88
      admin_server = dest-kdc-1.dest.myco.com:749
      default_domain = dest.myco.com
    }
- Domain/host-to-realm mapping for the source cluster NameNode hosts. You configure these mappings in the [domain_realm] section. For example, to map two realms named SRC.MYCO.COM and DEST.MYCO.COM to the domains of hosts named hostname.src.myco.com and hostname.dest.myco.com, make the following mappings in the krb5.conf file:
  [domain_realm]
    .src.myco.com = SRC.MYCO.COM
    src.myco.com = SRC.MYCO.COM
    .dest.myco.com = DEST.MYCO.COM
    dest.myco.com = DEST.MYCO.COM
- On the destination cluster, use Cloudera Manager to add the realm of the source cluster to the Trusted Kerberos Realms configuration property:
- Go to the HDFS service.
- Click the Configuration tab.
- In the search field, type "Trusted Kerberos" to find the Trusted Kerberos Realms property.
- Enter the source cluster realm.
- Enter a Reason for change, and then click Save Changes to commit the changes.
- In the search field, type "domain name".
- Enter the domain names for Kerberos.
- If domain_realm is configured in the Advanced Configuration Snippet (Safety Valve) for remaining krb5.conf, remove the entries for it.
- If your Cloudera Manager release is 5.0.1 or lower, restart the JobTracker to enable it to pick up the new Trusted Kerberos Realm settings. Failure to restart the JobTracker prior to the first replication attempt may cause the JobTracker to fail.
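Before running the first replication, it can help to confirm that a destination host's krb5.conf file actually contains the source realm and its domain mappings described in the first step. The following is a minimal sketch under stated assumptions: `parse_krb5` and `missing_for_source` are hypothetical helper names, the parser handles only the simple layout shown in this topic (no include directives or nested sub-sections beyond realm blocks), and the sample text mirrors the example realms above.

```python
import re

SAMPLE_KRB5 = """
[realms]
  SRC.MYCO.COM = {
    kdc = src-kdc-1.src.myco.com:88
    admin_server = src-kdc-1.src.myco.com:749
    default_domain = src.myco.com
  }
  DEST.MYCO.COM = {
    kdc = dest-kdc-1.dest.myco.com:88
    admin_server = dest-kdc-1.dest.myco.com:749
    default_domain = dest.myco.com
  }

[domain_realm]
  .src.myco.com = SRC.MYCO.COM
  src.myco.com = SRC.MYCO.COM
  .dest.myco.com = DEST.MYCO.COM
  dest.myco.com = DEST.MYCO.COM
"""


def parse_krb5(text: str) -> dict:
    """Collect realm names and domain-to-realm mappings from krb5.conf text."""
    realms, domain_realm = set(), {}
    section, depth = None, 0
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header and depth == 0:
            section = header.group(1)
            continue
        if section == "realms":
            # A realm block opens with "NAME = {" at brace depth 0.
            if depth == 0 and "=" in line and line.endswith("{"):
                realms.add(line.split("=", 1)[0].strip())
            depth += line.count("{") - line.count("}")
        elif section == "domain_realm" and "=" in line:
            domain, realm = (part.strip() for part in line.split("=", 1))
            domain_realm[domain] = realm
    return {"realms": realms, "domain_realm": domain_realm}


def missing_for_source(text: str, source_realm: str, source_domain: str) -> list:
    """Return a list of problems; an empty list means the entries are present."""
    conf = parse_krb5(text)
    problems = []
    if source_realm not in conf["realms"]:
        problems.append(f"[realms] has no entry for {source_realm}")
    for key in ("." + source_domain, source_domain):
        if conf["domain_realm"].get(key) != source_realm:
            problems.append(f"[domain_realm] does not map {key} to {source_realm}")
    return problems
```

On a real host you would pass the contents of the krb5.conf file instead of `SAMPLE_KRB5`. The same check applies in the other direction for Hive/Impala replication, with the destination realm and domain checked on the source hosts.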
Hive/Impala Replication
- Perform the procedure described in the previous section, including restarting the JobTracker.
Note: If the source and destination clusters both run Cloudera Manager 5.12 or higher, you can skip steps 2 and 3 in this section. These additional steps are no longer required for Hive/Impala replication. When you complete the configuration steps in HDFS Replication, you also configure Hive/Impala replication.
- On the hosts in the source cluster, ensure that the krb5.conf file on each host has the following information:
- The kdc information for the destination cluster's Kerberos realm.
- Domain/host-to-realm mapping for the destination cluster NameNode hosts.
- On the source cluster, use Cloudera Manager to add the realm of the destination cluster to the Trusted Kerberos Realms configuration property:
- Go to the HDFS service.
- Click the Configuration tab.
- In the search field, type "Trusted Kerberos" to find the Trusted Kerberos Realms property.
- Enter the destination cluster realm.
- Enter a Reason for change, and then click Save Changes to commit the changes.
It is not necessary to restart any services on the source cluster.
Kerberos Connectivity Test
As part of Test Connectivity, Cloudera Manager tests for properly configured Kerberos authentication on the source and destination clusters that run the replication. Test Connectivity runs automatically when you add a peer for replication, or you can manually initiate Test Connectivity from the Actions menu.
This feature is available when the source and destination clusters run Cloudera Manager 5.12 or later. You can disable the Kerberos connectivity test by setting feature_flag_test_kerberos_connectivity to false with the Cloudera Manager API: api/<version>/cm/config.
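As an illustration of the API call mentioned above, the following sketch builds (but does not send) a request that sets feature_flag_test_kerberos_connectivity to false. Several details are assumptions rather than statements from this topic: the use of PUT, the `items` list of name/value pairs as the request body, the helper name `build_disable_request`, and the example base URL and API version. Authentication is omitted; check the API documentation for your Cloudera Manager release before using it.

```python
import json
import urllib.request

FLAG_NAME = "feature_flag_test_kerberos_connectivity"


def build_disable_request(base_url: str, api_version: str) -> urllib.request.Request:
    """Build a request against api/<version>/cm/config that sets the flag to false.

    Assumed payload shape: {"items": [{"name": ..., "value": ...}]}.
    """
    body = {"items": [{"name": FLAG_NAME, "value": "false"}]}
    return urllib.request.Request(
        url=f"{base_url}/api/{api_version}/cm/config",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

For example, `build_disable_request("https://cm.example.com:7183", "v17")` targets a hypothetical Cloudera Manager host. Sending it would require adding credentials and passing the request to `urllib.request.urlopen`.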
If the test detects any issues with the Kerberos configuration, Cloudera Manager provides resolution steps based on whether Cloudera Manager manages the Kerberos configuration file.
The test checks the following:
- Whether both clusters have Kerberos enabled. If one cluster uses Kerberos but the other does not, replication is not supported.
- Whether both clusters are in the same Kerberos realm. Clusters in the same realm must share the same KDC or the KDCs must be in a unified realm.
- Whether clusters are in different Kerberos realms. If the clusters are in different realms, cross-realm trust must be configured on the destination cluster according to the following criteria:
- Destination HDFS services must have the correct trusted realm configuration. Cloudera Manager can check if the trusted realm configuration is correct.
- The krb5.conf file has the correct domain_realm mapping on all the hosts.
- The krb5.conf file has the correct realms information on all the hosts.
- Whether the local and peer KDC are running on an available port. This port must be open for all hosts in the cluster. The default port is 88.
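The order of these checks can be read as a small decision procedure. The sketch below is only an illustration of that reading and not how Cloudera Manager implements the test: the `ClusterInfo` type, its fields, and the function name are hypothetical, and the krb5.conf and port checks are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterInfo:
    """Hypothetical summary of one cluster's Kerberos setup."""
    kerberos_enabled: bool
    realm: str = ""
    trusted_realms: set = field(default_factory=set)


def replication_kerberos_issues(source: ClusterInfo, dest: ClusterInfo) -> list:
    """Return a list of issues found; an empty list means none were detected."""
    if source.kerberos_enabled != dest.kerberos_enabled:
        return ["replication is not supported: only one cluster has Kerberos enabled"]
    if not source.kerberos_enabled:
        return []  # neither cluster uses Kerberos, nothing to check here
    if source.realm == dest.realm:
        # Same realm: the clusters must share a KDC or use KDCs in a unified
        # realm. That cannot be verified from the fields modeled here.
        return []
    issues = []
    if source.realm not in dest.trusted_realms:
        issues.append(
            f"destination HDFS does not list {source.realm} as a trusted realm"
        )
    return issues
```

For instance, a destination in DEST.MYCO.COM whose trusted realms do not include SRC.MYCO.COM yields one issue, which corresponds to the Trusted Kerberos Realms step in HDFS Replication.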
Kerberos Recommendations
If Cloudera Manager manages the Kerberos configuration file, Cloudera Manager configures Kerberos correctly for you and then provides the set of commands that you must manually run to finish configuring the clusters. The following screen shots show the prompts that Cloudera Manager provides in cases of improper configuration:
[Screen shot: Configuration changes]
[Screen shot: Steps to complete configuration]
If Cloudera Manager does not manage the Kerberos configuration file, Cloudera Manager provides the manual steps required to correct the issue. For example, the following screen shot shows the steps required to properly configure Kerberos:
©2016 Cloudera, Inc. All rights reserved