
Setting Up MIT Kerberos ↔ Active Directory Cross-Realm Trust for Secure Hadoop Clusters

This post explains how to configure a secure cross-realm Kerberos trust between an MIT KDC and Active Directory for Hadoop environments. It covers Kerberos settings, realm definitions, encryption choices, KDC configuration, AD trust creation, and Hadoop’s auth_to_local mapping rules. A final section keeps legacy notes for older Windows Server versions, so the article can be used in mixed enterprise environments.

Integrating Hadoop with enterprise identity systems often requires establishing a cross-realm Kerberos trust between a local MIT KDC and an Active Directory (AD) domain. This setup allows Hadoop services to authenticate users from AD while maintaining a separate Hadoop-managed realm.

We walk through a full MIT Kerberos ↔ AD trust configuration using a modern setup, while preserving legacy notes for older Windows environments still found in long-lived clusters.

Example Realms

Replace these with your actual realms and hosts:

ALO.LOCAL        → Local MIT Kerberos realm (Hadoop KDC)
HADOOP1.INTERNAL → Host running the MIT KDC
AD.REMOTE        → Active Directory realm (external domain)

The KDC should be located within the Hadoop network. AD may be remote, as long as the two realms can route to each other on the Kerberos ports (88/UDP and 88/TCP for the KDC, 749/TCP for kadmin).
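Before going further, it can save time to confirm that the TCP ports are actually open between the two networks. The sketch below is one way to do that from a Hadoop node; the function names are hypothetical, it relies on bash's /dev/tcp redirection and the coreutils timeout command, and it does not probe 88/UDP.

```shell
#!/usr/bin/env bash
# check_tcp_port HOST PORT: succeeds if a TCP connection opens within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra packages are needed.
check_tcp_port() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# report_kerberos_ports HOST: print one status line each for ports 88 and 749.
report_kerberos_ports() {
  local host="$1" port
  for port in 88 749; do
    if check_tcp_port "$host" "$port"; then
      echo "${host}:${port} reachable"
    else
      echo "${host}:${port} NOT reachable"
    fi
  done
}

# Example (hostname is the article's placeholder; substitute your own):
#   report_kerberos_ports hadoop1.internal
```

A "NOT reachable" line usually points at a firewall or routing issue rather than at Kerberos itself, which is worth ruling out before debugging ticket errors.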

1. Install Required Packages

On the MIT KDC (RHEL, CentOS, AlmaLinux, Rocky, etc.)

yum install krb5-server krb5-libs krb5-workstation -y

On Hadoop nodes (clients)

yum install krb5-libs krb5-workstation -y

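Install the Java JCE unlimited strength policy files if your JDK needs them for AES-256. This applies to Oracle JDK 8 releases before 8u161; later releases enable unlimited-strength cryptography by default.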

2. Configure the MIT KDC

/etc/krb5.conf

[libdefaults]
 default_realm = ALO.LOCAL
 dns_lookup_realm = false
 dns_lookup_kdc = false
 forwardable = true
 proxiable = true
 default_tgs_enctypes = aes256-cts aes128-cts rc4-hmac
 default_tkt_enctypes = aes256-cts aes128-cts rc4-hmac

[realms]
 ALO.LOCAL = {
   kdc = hadoop1.internal:88
   admin_server = hadoop1.internal:749
 }
 AD.REMOTE = {
   kdc = ad.remote.internal:88
   admin_server = ad.remote.internal:749
 }

[domain_realm]
 alo.local = ALO.LOCAL
 .alo.local = ALO.LOCAL
 ad.internal = AD.REMOTE
 .ad.internal = AD.REMOTE

[logging]
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmin.log
 default = FILE:/var/log/krb5lib.log

/var/kerberos/krb5kdc/kdc.conf

[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
 ALO.LOCAL = {
   acl_file = /var/kerberos/krb5kdc/kadm5.acl
   admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
   supported_enctypes = aes256-cts:normal aes128-cts:normal rc4-hmac:normal
 }
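Enctype mismatches between krb5.conf and kdc.conf are a common cause of failed ticket requests. The following is a rough sketch of a consistency check, assuming the file layout shown above; the helper names are hypothetical, and the comparison is purely textual, so aliases such as aes256-cts and aes256-cts-hmac-sha1-96 are not recognized as equal.

```shell
# list_enctypes KEY FILE: print the enctype names configured for KEY, one per
# line, with any ":salt" suffix removed (kdc.conf uses enc:salt pairs).
list_enctypes() {
  local key="$1" file="$2"
  grep -E "^[[:space:]]*${key}[[:space:]]*=" "$file" \
    | head -n 1 | cut -d= -f2 | tr ' ' '\n' | sed -e 's/:.*$//' -e '/^$/d'
}

# enctypes_missing_from_kdc KRB5_CONF KDC_CONF: print every enctype listed in
# default_tkt_enctypes that supported_enctypes does not contain.
enctypes_missing_from_kdc() {
  local supported
  supported="$(list_enctypes supported_enctypes "$2")"
  # grep treats the newline-separated list as one fixed-string pattern per line.
  list_enctypes default_tkt_enctypes "$1" | grep -vxF -e "$supported" || true
}

# Example:
#   enctypes_missing_from_kdc /etc/krb5.conf /var/kerberos/krb5kdc/kdc.conf
```

With the two files exactly as listed in this section, the check should print nothing.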

/var/kerberos/krb5kdc/kadm5.acl

*/admin@ALO.LOCAL *

Create the realm and start services

kdb5_util create -s -r ALO.LOCAL
systemctl restart kadmin krb5kdc
systemctl enable kadmin krb5kdc

Create admin principal

kadmin.local -q "addprinc root/admin"

3. Create the Trust on Active Directory (Modern Workflow)

Run the following on an AD domain controller from an elevated Windows PowerShell terminal. Replace passw0rd with a strong trust password:

# Register the MIT KDC
ksetup /addkdc ALO.LOCAL HADOOP1.INTERNAL

# Create the cross-realm trust
netdom trust ALO.LOCAL /domain:AD.REMOTE /add /realm /passwordt:passw0rd

# Set allowed encryption types
ksetup /SetEncTypeAttr ALO.LOCAL AES256-CTS-HMAC-SHA1-96 AES128-CTS-HMAC-SHA1-96 RC4-HMAC-MD5

After this, AD treats ALO.LOCAL as a trusted external Kerberos realm served by the MIT KDC.

4. Create the AD Trust Principal in MIT Kerberos

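Use the same password that was passed to netdom via /passwordt. The trust works only if both sides derive the same key:

kadmin.local: addprinc krbtgt/ALO.LOCAL@AD.REMOTE
password: passw0rd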
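The trust principal's keys also need to use encryption types that AD was told to allow for the realm. One way to make that explicit is to pass a keysalt list to addprinc with -e. The sketch below only builds the kadmin query; the helper name is hypothetical, and the enctype list is an assumption that mirrors the supported_enctypes line from kdc.conf.

```shell
# trust_principal_query LOCAL_REALM AD_REALM: print a kadmin query that creates
# the cross-realm principal krbtgt/LOCAL_REALM@AD_REALM with explicit enctypes.
trust_principal_query() {
  local local_realm="$1" ad_realm="$2"
  # Assumed list; keep it in line with what was set on the AD side.
  local enctypes="aes256-cts:normal,aes128-cts:normal,rc4-hmac:normal"
  printf 'addprinc -e %s krbtgt/%s@%s' "$enctypes" "$local_realm" "$ad_realm"
}

# On the KDC host, as root (kadmin.local prompts for the trust password):
#   kadmin.local -q "$(trust_principal_query ALO.LOCAL AD.REMOTE)"
# Then inspect the stored keys:
#   kadmin.local -q "getprinc krbtgt/ALO.LOCAL@AD.REMOTE"
```

The getprinc output lists one key per enctype, which makes it easy to compare against the types configured on the AD side.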

5. Configure Hadoop’s auth_to_local Rules

core-site.xml

<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@\QAD.REMOTE\E$)s/@\QAD.REMOTE\E$//
    RULE:[2:$1@$0](.*@\QAD.REMOTE\E$)s/@\QAD.REMOTE\E$//
    DEFAULT
  </value>
</property>

This strips the @AD.REMOTE realm suffix, so an AD principal such as alice@AD.REMOTE maps to the short name alice, which Hadoop and HDFS then resolve to a local Linux account and its groups.
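To make the mapping concrete, here is a simplified shell emulation of the two RULE lines plus DEFAULT. It is an illustration only, not Hadoop's implementation: the function name is hypothetical, the realms are hard-coded to this article's examples, and DEFAULT is approximated as "take the first component when the realm is the default realm".

```shell
# map_principal PRINCIPAL: rough emulation of the auth_to_local rules above.
map_principal() {
  local principal="$1" default_realm="ALO.LOCAL" trusted_realm="AD.REMOTE"
  local name="${principal%@*}"     # part before the realm
  local realm="${principal##*@}"   # part after the last @
  local first="${name%%/*}"        # $1 in rule syntax: the first component
  local slashes components
  slashes="$(printf '%s' "$name" | tr -cd '/')"
  components=$(( ${#slashes} + 1 ))

  if [ "$realm" = "$trusted_realm" ] && [ "$components" -le 2 ]; then
    echo "$first"                  # RULE:[1:...] or RULE:[2:...] matched
  elif [ "$realm" = "$default_realm" ]; then
    echo "$first"                  # DEFAULT
  else
    echo "no rule applies to ${principal}" >&2
    return 1
  fi
}

# Examples:
#   map_principal alice@AD.REMOTE               # prints: alice
#   map_principal hdfs/hadoop1.internal@ALO.LOCAL   # prints: hdfs
```

On a node with the Hadoop client installed, recent Hadoop releases can report the real mapping with `hadoop kerbname alice@AD.REMOTE`, which is the authoritative check.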

6. Test the Trust

kinit username@AD.REMOTE
klist

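You should see a ticket-granting ticket (TGT), krbtgt/AD.REMOTE@AD.REMOTE, issued by AD. This confirms that the client can reach the AD KDC; the trust itself is exercised when that user requests a ticket for a service in ALO.LOCAL.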
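A fuller check is to request a service ticket in the Hadoop realm and then look for the cross-realm TGT in the credential cache. The sketch below assumes a service principal named hdfs/hadoop1.internal@ALO.LOCAL exists, which is a hypothetical example; the helper name is also made up for illustration.

```shell
# has_cross_realm_tgt LOCAL_REALM AD_REALM: reads `klist` output on stdin and
# succeeds if it lists the cross-realm TGT krbtgt/LOCAL_REALM@AD_REALM.
has_cross_realm_tgt() {
  grep -qF "krbtgt/${1}@${2}"
}

# Typical use on a Hadoop client:
#   kinit username@AD.REMOTE
#   kvno hdfs/hadoop1.internal@ALO.LOCAL   # hypothetical service principal
#   klist | has_cross_realm_tgt ALO.LOCAL AD.REMOTE && echo "trust OK"
```

If kvno fails while kinit succeeds, the usual suspects are a password mismatch on the trust principal or an encryption type that only one side allows.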

Legacy Compatibility Notes

Some environments still operate old Windows Server versions. Use the following only if required:

Windows Server 2003

ktpass /MITRealmName ALO.LOCAL /TrustEncryp RC4

Windows Server 2008 additional step

ksetup /SetEncTypeAttr ALO.LOCAL AES128-CTS-HMAC-SHA1-96 RC4-HMAC-MD5

These commands cover legacy encryption requirements that still appear in long-lived enterprise clusters where the AD domain has not yet been upgraded.

Conclusion

A cross-realm trust between MIT Kerberos and Active Directory remains a powerful and widely used authentication setup for Hadoop clusters. The configuration provides secure integration with enterprise identity systems while allowing the Hadoop realm to remain isolated and controlled.

With proper encryption settings, realm mappings, and Hadoop auth_to_local rules, users from AD can authenticate seamlessly and work securely across the Hadoop environment.
