Monday, January 19, 2015

Step by Step Hive User Defined functions (UDF)

Example: a simple UDF (StringUtilsUDF.java)

Step 1: Write a simple Java function. This example concatenates first name and last name, which can also be done with Hive's built-in functions.
Step 2: ADD JAR /home/gse/stringHiveUDF-1.0.jar;
Step 3: CREATE TEMPORARY FUNCTION stringcat AS 'com.test.udfs.StringUtilsUDF';
Step 4: Use the function in a Hive select query:

select stringcat(billing_analyst_fname, billing_analyst_lname) from accounts where account_number = 133708;
OK
Naoki,Ando
Time taken: 0.135 seconds, Fetched: 1 row(s)



StringUtilsUDF.java

package com.test.udfs;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class StringUtilsUDF extends UDF {
    // Reused across calls so that a new Text is not allocated for every row.
    private final Text result = new Text();

    // Returns "first,last". If one argument is null, returns the other one;
    // if both are null, returns null.
    public Text evaluate(Text strFirst, Text strLast) {
        if (strFirst != null && strLast != null) {
            result.set(StringUtils.strip(strFirst.toString()) + "," + StringUtils.strip(strLast.toString()));
        } else if (strFirst != null) {
            result.set(StringUtils.strip(strFirst.toString()));
        } else if (strLast != null) {
            result.set(StringUtils.strip(strLast.toString()));
        } else {
            return null;
        }
        return result;
    }
}
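
Before packaging the JAR, the null handling can be checked locally without Hive or Hadoop on the classpath. The following is only a sketch: NameJoin and join are hypothetical names, plain String stands in for Text, and String.trim() stands in for StringUtils.strip(), which treats some whitespace characters differently.

```java
// Hypothetical stand-alone mirror of the evaluate() logic, for local checks only.
final class NameJoin {

    // Joins two names with a comma; falls back to the non-null one; null if both are null.
    static String join(String first, String last) {
        if (first != null && last != null) {
            return first.trim() + "," + last.trim();
        } else if (first != null) {
            return first.trim();
        } else if (last != null) {
            return last.trim();
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(join(" Naoki ", "Ando "));
        System.out.println(join(null, "Ando"));
        System.out.println(join(null, null));
    }
}
```

Keeping the logic in a small static method like this makes it easy to unit test; the UDF class then only needs to convert between Text and String.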

Saturday, January 17, 2015

invalid LOC header (bad signature)

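This error means that some binary file is broken, most likely one of the dependency JARs in the local Maven repository. Delete the local repository (or only the affected part of it) and build again so that the dependencies are downloaded afresh:
rm -rf ~/.m2/repository/

Example (deleting only the org directory):
rm -rf ~/.m2/repository/org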

Pagination in MongoDB - can be achieved via skip and limit

Skip the items of all earlier pages, then limit the result to one page (PAGE_NUMBER starts at 1):
.skip(NUMBER_OF_ITEMS * (PAGE_NUMBER - 1))
.limit(NUMBER_OF_ITEMS)
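
The skip arithmetic can be sketched in plain Java as follows. PageMath and skipFor are hypothetical names used only for illustration, and the sketch assumes page numbers start at 1; it computes the offset only and does not call any MongoDB API.

```java
// Hypothetical helper that computes the skip value for a 1-based page number.
final class PageMath {

    // Number of items to skip so that the given page is the next one returned.
    static long skipFor(int pageNumber, int itemsPerPage) {
        if (pageNumber < 1 || itemsPerPage < 1) {
            throw new IllegalArgumentException("pageNumber and itemsPerPage must be >= 1");
        }
        // Widen to long before multiplying to avoid int overflow on large pages.
        return (long) itemsPerPage * (pageNumber - 1);
    }

    public static void main(String[] args) {
        int itemsPerPage = 10;
        for (int page = 1; page <= 3; page++) {
            System.out.println("page " + page + ": skip(" + skipFor(page, itemsPerPage)
                    + ").limit(" + itemsPerPage + ")");
        }
    }
}
```

With 10 items per page, page 1 skips nothing and page 3 skips the first 20 items.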

MongoDB aggregation result exceeds maximum document size (16MB)

{
    "errmsg" : "exception: aggregation result exceeds maximum document size (16MB)",
    "code" : 16389,
    "ok" : 0
}

Solution: set allowDiskUse to true, or limit the criteria so that fewer elements are returned.


Example

// Constructor arguments: allowDiskUse = true, explain = false, cursor = null
AggregationOptions aggregationOptions = new AggregationOptions(true, false, null);
Aggregation aggregation = newAggregation(
        match(criteria),
        limit(10),
        sort(Sort.Direction.ASC, "OrderSubmissionDate")
).withOptions(aggregationOptions);

In mongoDB - newAggregation throws match compilation error


Aggregation aggregation = newAggregation(
        match(criteria)
);

Solution
Use a static import for newAggregation instead of a regular (non-static) import:
import static org.springframework.data.mongodb.core.aggregation.Aggregation.newAggregation;

Limitations in MongoDB - cannot add two conditions with the same key

Within the same query you cannot add two separate conditions on the same key, for example orderstatus in ("In Progress", "Closed") together with orderstatus not in ("Closed"), i.e. { "$nin" : [ "Closed" ] }.
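
One likely reason is that a query document behaves like a map with a single value per key, so a second entry for the same key cannot sit next to the first. The sketch below uses plain Java maps as a stand-in for a query document; SameKeyDemo and its methods are hypothetical names, and no driver or Spring API is involved. It also shows one possible workaround, which is to nest both operators under the single key.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo: plain maps standing in for a MongoDB query document.
final class SameKeyDemo {

    // Two separate entries for "orderstatus": the second put replaces the first.
    static Map<String, Object> naive() {
        Map<String, Object> query = new LinkedHashMap<>();
        query.put("orderstatus", Collections.singletonMap("$in", Arrays.asList("In Progress", "Closed")));
        query.put("orderstatus", Collections.singletonMap("$nin", Arrays.asList("Closed")));
        return query;
    }

    // One entry for "orderstatus" holding both operators.
    static Map<String, Object> merged() {
        Map<String, Object> operators = new LinkedHashMap<>();
        operators.put("$in", Arrays.asList("In Progress", "Closed"));
        operators.put("$nin", Arrays.asList("Closed"));
        Map<String, Object> query = new LinkedHashMap<>();
        query.put("orderstatus", operators);
        return query;
    }

    public static void main(String[] args) {
        System.out.println("naive:  " + naive());
        System.out.println("merged: " + merged());
    }
}
```

In the naive version only the $nin condition survives, which silently changes the meaning of the query; the merged version keeps both conditions under the one key.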