In some environments, it can be useful to make an HDFS filesystem available across networks as an exported share. This walkthrough describes a working scenario using Linux and Hadoop with tools that are typically included in older Hadoop distributions.
The setup uses hadoop-fuse-dfs and libhdfs to mount HDFS locally, and then exports that mount over NFS. Replace namenode.local and <PORT> with values appropriate for your cluster.
1. Install FUSE and libhdfs
yum install hadoop-0.20-fuse.x86_64 hadoop-0.20-libhdfs.x86_64
2. Create a mountpoint
mkdir /hdfs-mount
3. Test mounting HDFS via FUSE
hadoop-fuse-dfs dfs://namenode.local:<PORT> /hdfs-mount -d
If the mount succeeds, you should see output similar to:
INFO fuse_options.c:162 Adding FUSE arg /hdfs-mount
INFO fuse_options.c:110 Ignoring option -d
unique: 1, opcode: INIT (26), nodeid: 0, insize: 56
INIT: 7.10
flags=0x0000000b
max_readahead=0x00020000
INFO fuse_init.c:101 Mounting namenode.local:<PORT>
INIT: 7.8
flags=0x00000001
max_readahead=0x00020000
max_write=0x00020000
unique: 1, error: 0 (Success), outsize: 40
Once you see Success, you can stop the foreground process with Ctrl+C and, if the mount is still attached, detach it with fusermount -u /hdfs-mount. The command above is mainly for testing.
4. Configure the mount at boot time
To mount HDFS automatically at boot, add an entry to /etc/fstab:
echo "hadoop-fuse-dfs#dfs://namenode.local:<PORT> /hdfs-mount fuse usetrash,rw 0 0" >> /etc/fstab
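Before relying on the new entry, you can sanity-check that the line has the six whitespace-separated fields fstab expects. This quick check uses the exact entry from above; namenode.local and <PORT> remain placeholders for your cluster:

```shell
# The fstab entry from above; <PORT> is still a placeholder.
line='hadoop-fuse-dfs#dfs://namenode.local:<PORT> /hdfs-mount fuse usetrash,rw 0 0'

# A valid fstab entry has exactly six whitespace-separated fields:
# device, mountpoint, type, options, dump, pass.
nfields=$(echo "$line" | awk '{print NF}')
echo "fields: $nfields"
```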
Then test the configuration:
# mount -a
# mount
[...]
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
fuse on /hdfs-mount type fuse (rw,nosuid,nodev,allow_other,default_permissions)
If you see the FUSE mount entry for /hdfs-mount, the configuration is working.
5. Tuning JVM memory for FUSE
Each FUSE process uses a JVM. To tune memory settings, inspect and adjust:
/etc/default/hadoop-0.20-fuse
Here you can configure Java heap size and other runtime parameters to fit your workload and hardware limits.
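For illustration, a minimal tuning might simply cap the heap of the JVM embedded in each FUSE process. The variable name below (LIBHDFS_OPTS) is an assumption based on common packaging conventions; verify it against the file actually shipped with your distribution:

```shell
# Sketch of a line in /etc/default/hadoop-0.20-fuse; LIBHDFS_OPTS is
# an assumption -- check which variable names your package really uses.
export LIBHDFS_OPTS="-Xmx128m"   # cap the per-mount JVM heap at 128 MB
```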
6. Export HDFS via NFS (insecure)
The next step is to export part of the FUSE-mounted HDFS via NFS. Note that this is considered insecure and should be used only in trusted environments: user IDs and permissions are mapped at the OS level.
6.1 Select the user for NFS exports
Assume you want to export data using the hdfs user. Check its UID and GID:
# id hdfs
uid=104(hdfs) gid=105(hdfs) groups=105(hdfs),104(hadoop) context=root:staff_r:staff_t:SystemLow-SystemHigh
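These numeric IDs are exactly what the anonuid/anongid options in the next step need. As a hedged sketch, here is one way to extract them; the sample string stands in for real `id hdfs` output so the snippet runs anywhere:

```shell
# Parse uid/gid out of `id`-style output. On a real host, replace the
# sample with: sample=$(id hdfs)
sample='uid=104(hdfs) gid=105(hdfs) groups=105(hdfs),104(hadoop)'
uid=$(echo "$sample" | sed -n 's/^uid=\([0-9]*\).*/\1/p')
gid=$(echo "$sample" | sed -n 's/.*gid=\([0-9]*\).*/\1/p')
echo "anonuid=${uid},anongid=${gid}"   # options for /etc/exports
```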
6.2 Create the NFS exports configuration
Define an export for the HDFS user directory in /etc/exports:
# cat /etc/exports
/hdfs-mount/user (fsid=111,rw,wdelay,anonuid=104,anongid=105,sync,insecure,no_subtree_check,no_root_squash)
Explanation (simplified):
- rw: read-write
- fsid=111: unique filesystem ID (see man 5 exports)
- wdelay: write delay
- anonuid=104,anongid=105: map anonymous users to the hdfs user and group
- sync: synchronous writes
- insecure: allow connections from non-privileged ports
- no_subtree_check, no_root_squash: disable subtree checks and root squashing
Exporting only the /hdfs-mount/user directory helps protect system-related paths such as /mapred or other service directories from accidental modification.
Restart the NFS server to apply the configuration:
# service nfs restart
7. Using HDFS as a “local” filesystem via NFS
From a client machine, create a mountpoint (for example, mkdir -p /mnt/hdfs-user), then mount the exported NFS share and work with HDFS as if it were a local directory:
# mount -t nfs <NFS_SERVER_HOST>:/hdfs-mount/user /mnt/hdfs-user
After mounting, you can create or copy job definitions and files directly into HDFS through this NFS path. Keep in mind:
- All file operations are translated through FUSE and libhdfs into HDFS calls.
- Permissions and ownership are mapped to the local hdfs user as configured.
- Using root on clients is a bad idea; stick to regular users and rely on the mapping.
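To make the points above concrete, here is a sketch of typical client-side usage. A temporary directory stands in for the /mnt/hdfs-user mountpoint from step 7, so the commands can be tried without a live cluster:

```shell
# Stand-in for the NFS mountpoint /mnt/hdfs-user from step 7.
mnt=$(mktemp -d)

# Ordinary file tools work; through the real mount, each call is
# translated by NFS -> FUSE -> libhdfs into HDFS operations.
echo "input data" > "$mnt/job-input.txt"
cp "$mnt/job-input.txt" "$mnt/job-input-copy.txt"
ls "$mnt"
```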
8. Security and compatibility notes
This pattern relies on classic kernels and older Hadoop components:
- It only works reliably from Linux kernel 2.6.27 upwards (as originally tested).
- NFS exports based on FUSE-mounted HDFS are not recommended for multi-tenant or untrusted environments.
- For modern clusters, consider HDFS NFS Gateway, WebHDFS, or object-store abstractions instead.
Within its constraints, this approach can still be useful for legacy clusters and simple, controlled use cases where administrators need quick, filesystem-style access to HDFS.