
Setting Up MIT Kerberos ↔ Active Directory Cross-Realm Trust for Secure Hadoop Clusters

This post explains how to configure a secure cross-realm Kerberos trust between an MIT KDC and Active Directory for Hadoop environments. It covers Kerberos settings, realm definitions, encryption choices, KDC configuration, AD trust creation, and Hadoop’s auth_to_local mapping rules. A final section keeps legacy notes for older Windows Server versions, so the steps also apply to mixed enterprise environments.

Integrating Hadoop with enterprise identity systems often requires establishing a cross-realm Kerberos trust between a local MIT KDC and an Active Directory (AD) domain. This setup allows Hadoop services to authenticate users from AD while maintaining a separate Hadoop-managed realm.

We walk through a full MIT Kerberos ↔ AD trust configuration using a modern setup, while preserving legacy notes for older Windows environments still found in long-lived clusters.

Example Realms

Replace these with your actual realms and hosts:

ALO.LOCAL      → Local MIT Kerberos realm (Hadoop KDC)
HADOOP1.INTERNAL → Host running the MIT KDC
AD.REMOTE     → Active Directory realm (external domain)

The KDC should sit inside the Hadoop network. AD may be remote as long as the two sides can reach each other on the Kerberos ports: 88/UDP and 88/TCP for the KDC, and 749/TCP for kadmin.
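
Before configuring anything, it can help to confirm that the TCP ports are reachable in both directions. The following is a minimal Python sketch, not part of the original setup: `tcp_port_open` is a hypothetical helper name, and it only tests TCP, so 88/UDP needs a separate check.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

From a Hadoop node, calling `tcp_port_open("hadoop1.internal", 88)` and `tcp_port_open("ad.remote.internal", 88)` (the placeholder hosts used in this post) would show whether both KDCs answer on TCP.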

1. Install Required Packages

On the MIT KDC (RHEL, CentOS, AlmaLinux, Rocky, etc.)

yum install krb5-server krb5-libs krb5-workstation -y

On Hadoop nodes (clients)

yum install krb5-libs krb5-workstation -y

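Install the Java JCE unlimited strength policy files if your JDK requires them for AES-256. Oracle JDK 8 releases before 8u161 do; later releases enable unlimited-strength cryptography by default.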

2. Configure the MIT KDC

/etc/krb5.conf

[libdefaults]
 default_realm = ALO.LOCAL
 dns_lookup_realm = false
 dns_lookup_kdc = false
 forwardable = true
 proxiable = true
 default_tgs_enctypes = aes256-cts aes128-cts rc4-hmac
 default_tkt_enctypes = aes256-cts aes128-cts rc4-hmac

[realms]
 ALO.LOCAL = {
   kdc = hadoop1.internal:88
   admin_server = hadoop1.internal:749
 }
 AD.REMOTE = {
   kdc = ad.remote.internal:88
   admin_server = ad.remote.internal:749
 }

[domain_realm]
 alo.local = ALO.LOCAL
 .alo.local = ALO.LOCAL
 ad.internal = AD.REMOTE
 .ad.internal = AD.REMOTE

[logging]
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmin.log
 default = FILE:/var/log/krb5lib.log
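
The [domain_realm] section above decides which realm a host belongs to. The sketch below approximates that lookup in Python, based on the behaviour described in the krb5.conf documentation rather than on anything in the original post: an entry without a leading dot matches that exact host name, and an entry with a leading dot matches hosts inside that domain. `realm_for_host` and `DOMAIN_REALM` are hypothetical names used only for illustration.

```python
from typing import Optional

# Mirrors the [domain_realm] entries from the krb5.conf shown above.
DOMAIN_REALM = {
    "alo.local": "ALO.LOCAL",
    ".alo.local": "ALO.LOCAL",
    "ad.internal": "AD.REMOTE",
    ".ad.internal": "AD.REMOTE",
}


def realm_for_host(hostname: str, domain_realm: dict = DOMAIN_REALM) -> Optional[str]:
    """Approximate the [domain_realm] lookup: exact host first, then parent domains."""
    candidate = hostname.lower().rstrip(".")
    while candidate:
        if candidate in domain_realm:
            return domain_realm[candidate]
        # Step up to the parent domain, keeping its leading dot.
        bare = candidate[1:] if candidate.startswith(".") else candidate
        dot = bare.find(".")
        candidate = bare[dot:] if dot != -1 else ""
    return None  # no entry matched; the real library then uses other fallbacks
```

With these entries, `realm_for_host("node1.alo.local")` gives `ALO.LOCAL`, while a host such as `hadoop1.internal` is not covered by any entry, so it is worth checking that every cluster host name falls under one of the mapped domains.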

/var/kerberos/krb5kdc/kdc.conf

[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
 ALO.LOCAL = {
   acl_file = /var/kerberos/krb5kdc/kadm5.acl
   admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
   supported_enctypes = aes256-cts:normal aes128-cts:normal rc4-hmac:normal
 }

/var/kerberos/krb5kdc/kadm5.acl

*/admin@ALO.LOCAL *

Create the realm and start services

kdb5_util create -s -r ALO.LOCAL
# On systemd-based releases (RHEL/CentOS 7 and later), enable and start both daemons
systemctl enable --now krb5kdc kadmin

Create admin principal

kadmin.local -q "addprinc root/admin"

3. Create the Trust on Active Directory (Modern Workflow)

Run the following from an elevated PowerShell terminal on a domain controller:

# Register the MIT KDC
ksetup /addkdc ALO.LOCAL HADOOP1.INTERNAL

# Create the cross-realm trust
netdom trust ALO.LOCAL /domain:AD.REMOTE /add /realm /passwordt:passw0rd

# Set allowed encryption types
ksetup /SetEncTypeAttr ALO.LOCAL AES256-CTS-HMAC-SHA1-96 AES128-CTS-HMAC-SHA1-96 RC4-HMAC-MD5

After this, AD recognizes the MIT KDC as a trusted external Kerberos realm.
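
The trust only works if both sides share at least one encryption type. The sketch below compares the MIT-style list from kdc.conf with the list passed to ksetup. It is an illustration under assumptions: `shared_enctypes` is a hypothetical helper, and the `MIT_TO_AD` table covers only the three encryption types used in this post.

```python
# Assumed mapping from MIT krb5 enctype names to the names ksetup expects.
MIT_TO_AD = {
    "aes256-cts": "AES256-CTS-HMAC-SHA1-96",
    "aes128-cts": "AES128-CTS-HMAC-SHA1-96",
    "rc4-hmac": "RC4-HMAC-MD5",
}


def shared_enctypes(mit_enctypes: str, ad_enctypes: str) -> list:
    """Return the AD-style names of enctypes present in both lists, in MIT order."""
    ad = {name.upper() for name in ad_enctypes.split()}
    shared = []
    for entry in mit_enctypes.split():
        # Drop a salt suffix such as ":normal" from supported_enctypes entries.
        name = entry.split(":", 1)[0].lower()
        ad_name = MIT_TO_AD.get(name)
        if ad_name in ad and ad_name not in shared:
            shared.append(ad_name)
    return shared
```

Passing the `supported_enctypes` value from kdc.conf and the arguments given to `ksetup /SetEncTypeAttr` should return all three names for the configuration in this post; an empty list would mean the two realms cannot agree on a key type.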

4. Create the AD Trust Principal in MIT Kerberos

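kadmin.local: addprinc krbtgt/ALO.LOCAL@AD.REMOTE
password: passw0rd

The password must be identical to the one given to netdom with /passwordt in step 3. passw0rd is a placeholder, so choose a strong secret. The encryption types of this principal's keys must also overlap with those set on the AD side through ksetup /SetEncTypeAttr.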

5. Configure Hadoop’s auth_to_local Rules

core-site.xml

<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@\QAD.REMOTE\E$)s/@\QAD.REMOTE\E$//
    RULE:[2:$1@$0](.*@\QAD.REMOTE\E$)s/@\QAD.REMOTE\E$//
    DEFAULT
  </value>
</property>

These rules strip the @AD.REMOTE realm suffix from one- and two-component principals, so Hadoop and HDFS see a short username that can be matched against local Linux accounts or the configured group mapping.
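
To make the effect of the two rules concrete, here is a small Python emulation. It is a sketch of the intended behaviour, not Hadoop's implementation: `apply_rule` and `short_name` are hypothetical names, the Java `\Q...\E` quoting is replaced by `re.escape`, and the DEFAULT rule is not modelled.

```python
import re
from typing import Optional

AD_REALM = "AD.REMOTE"


def apply_rule(principal: str, num_components: int, realm: str = AD_REALM) -> Optional[str]:
    """Emulate RULE:[n:$1@$0](.*@REALM$)s/@REALM$// for one principal."""
    name, _, princ_realm = principal.partition("@")
    components = name.split("/")
    if len(components) != num_components:
        return None  # the rule only applies to principals with n components
    base = f"{components[0]}@{princ_realm}"  # the "$1@$0" format string
    if not re.fullmatch(".*@" + re.escape(realm), base):
        return None  # the filter regex did not match
    # The sed-style substitution removes the realm suffix.
    return re.sub("@" + re.escape(realm) + "$", "", base, count=1)


def short_name(principal: str) -> Optional[str]:
    """Try the one-component rule, then the two-component rule."""
    for n in (1, 2):
        result = apply_rule(principal, n)
        if result is not None:
            return result
    return None  # in Hadoop, evaluation would continue with DEFAULT
```

Here `short_name("alice@AD.REMOTE")` yields `alice` and `short_name("hdfs/node1.alo.local@AD.REMOTE")` yields `hdfs`. On a configured node, Hadoop's own `hadoop kerbname <principal>` command should report the mapping produced by the live configuration.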

6. Test the Trust

kinit username@AD.REMOTE
klist

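You should see a ticket-granting ticket (TGT) issued by AD.REMOTE. This step only confirms that AD is reachable and the credentials are valid. The trust itself is exercised when that user requests a ticket for a service in ALO.LOCAL, for example by running an HDFS command; klist should then also list krbtgt/ALO.LOCAL@AD.REMOTE.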

Legacy Compatibility Notes

Some environments still operate old Windows Server versions. Use the following only if required:

Windows Server 2003 (RC4 only; AES support arrived with Windows Server 2008)

ktpass /MITRealmName ALO.LOCAL /TrustEncryp RC4

Windows Server 2008 additional step

ksetup /SetEncTypeAttr ALO.LOCAL AES128-CTS-HMAC-SHA1-96 RC4-HMAC-MD5

These settings cover legacy encryption requirements and still appear in long-lived enterprise clusters where the AD domain upgrade has not been completed.

Conclusion

A cross-realm trust between MIT Kerberos and Active Directory remains a widely used authentication setup for Hadoop clusters. It integrates the cluster with enterprise identity systems while keeping the Hadoop realm isolated and under local control.

With proper encryption settings, realm mappings, and Hadoop auth_to_local rules, users from AD can authenticate seamlessly and work securely across the Hadoop environment.
