Posts

Showing posts from May, 2012

How to Filter Multiple Columns in HBase Using SingleColumnValueFilter

Filtering on multiple columns in HBase means combining several SingleColumnValueFilter instances in a FilterList. Because the list applies boolean AND logic across its filters, a row is returned only when every required column contains a valid value. The pattern gives precise control over scan output, even in older HBase deployments.

HBase is a column-oriented database that stores data by column family and qualifier. During a scan, filters reduce the result set to the rows that match specific criteria. A frequent challenge is filtering on more than one column at once: for example, requiring that two or more specific columns contain valid values before a row qualifies. The practical solution is to combine multiple SingleColumnValueFilter objects in a FilterList, which applies boolean AND logic across all of its filters. List<Filter> list = new ArrayList<Filter>(2); // Filter on family "fam1", qualifier "VALUE1...
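Since the excerpt above is cut off, the AND semantics can be illustrated with a small plain-Java model that needs no HBase dependency. This is only a sketch of the logic, not the HBase API: the class name AndFilterSketch, the helper methods, the column names "fam:colA" and "fam:colB", and the rejected value "INVALID" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of "all filters must pass"; it does not use HBase classes.
public class AndFilterSketch {

    // Models one column check: the cell must exist and must differ from a rejected value.
    // Treating a missing column as a failure mirrors setFilterIfMissing(true) in HBase.
    public static Predicate<Map<String, String>> columnNotEqual(String column, String rejected) {
        return row -> row.containsKey(column) && !rejected.equals(row.get(column));
    }

    // Models AND semantics: a row qualifies only if every filter accepts it.
    public static boolean passesAll(Map<String, String> row,
                                    List<Predicate<Map<String, String>>> filters) {
        for (Predicate<Map<String, String>> filter : filters) {
            if (!filter.test(row)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical column names and rejected value, for illustration only.
        List<Predicate<Map<String, String>>> filters = new ArrayList<>();
        filters.add(columnNotEqual("fam:colA", "INVALID"));
        filters.add(columnNotEqual("fam:colB", "INVALID"));

        Map<String, String> complete = new HashMap<>();
        complete.put("fam:colA", "1");
        complete.put("fam:colB", "2");

        Map<String, String> missingB = new HashMap<>();
        missingB.put("fam:colA", "1");

        System.out.println("complete row passes: " + passesAll(complete, filters));
        System.out.println("row missing colB passes: " + passesAll(missingB, filters));
    }
}
```

One point the model makes explicit: in HBase itself, a SingleColumnValueFilter lets rows that lack the column through unless setFilterIfMissing(true) is called on it, so each filter in the list typically needs that setting if absent columns should disqualify a row.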

How to Stop New Hadoop MapReduce Jobs Using Queue ACLs

This article shows how to temporarily stop new Hadoop MapReduce jobs from being submitted by enabling ACLs and configuring mapred-queue-acls.xml. Jobs that are already running continue, which makes the pattern useful for maintenance windows or decommissioning work on a classic MapReduce cluster.

On a classic Hadoop MapReduce (MRv1) cluster, you sometimes need to stop accepting new jobs while letting running jobs finish, for example during maintenance, node decommissioning, or cluster reconfiguration. One simple approach is to enable ACLs on the MapReduce job queue and then set the submission ACL so that effectively nobody is allowed to submit new jobs.

1. Enable ACLs for MapReduce queues

First, configure queue ACLs in $HADOOP/conf/mapred-queue-acls.xml. A typical configuration might allow a set of users and groups to submit jobs and a smaller set of admins to manage them: <configuration> <property...
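As a rough illustration of the blocking configuration described above, the fragment below assumes Hadoop 1.x (MRv1) property names and a queue named "default"; both are assumptions, since the excerpt is cut off before it names them. It relies on the convention that a submission ACL consisting of a single space matches no user.

```xml
<!-- In mapred-site.xml (assumed location): turn on ACL checks for queues and jobs. -->
<property>
  <name>mapred.acls.enabled</name>
  <value>true</value>
</property>

<!-- In mapred-queue-acls.xml: a value of one space means no user may submit. -->
<!-- The queue name "default" is an assumption; substitute the real queue name. -->
<property>
  <name>mapred.queue.default.acl-submit-job</name>
  <value> </value>
</property>
```

The two properties belong in separate files, each inside that file's own configuration element. Whether the JobTracker picks up the change through a queue refresh or needs a restart depends on the Hadoop version, so check the documentation for the release in use.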