HDP Developer Apache Pig and Hive
Hortonworks. We do Hadoop.
Revision 4
Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Introducing Apache Spark
counts.saveAsTextFile("hdfs://wordcount-out")
Action triggers execution of the whole DAG
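The callout above can be illustrated without a cluster. The following is a minimal sketch in plain Python, not the PySpark API: `LazyDataset` and `double_plus_one` are hypothetical names invented for this illustration. It assumes only that transformations record work and that an action replays the recorded steps, which is the behaviour the slide describes.

```python
# Toy model of lazy evaluation. This is NOT Spark code; LazyDataset is a
# hypothetical stand-in for an RDD, used only to show the idea.
class LazyDataset:
    def __init__(self, data, steps=()):
        self._data = list(data)
        self._steps = tuple(steps)  # recorded chain of transformations

    def map(self, fn):
        # Transformation: only extends the recorded chain, does no work.
        return LazyDataset(self._data, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: also recorded, not executed.
        return LazyDataset(self._data, self._steps + (("filter", pred),))

    def collect(self):
        # Action: replays every recorded step over the data.
        items = iter(self._data)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


calls = []  # records each time the mapped function actually runs


def double_plus_one(x):
    calls.append(x)
    return x * 2 + 1


pending = LazyDataset([1, 2, 3, 4, 5]).map(double_plus_one)
calls_before_action = list(calls)  # still empty: nothing has executed yet
result = pending.collect()         # the action runs the whole chain
print(calls_before_action, result)  # prints: [] [3, 5, 7, 9, 11]
```

In this sketch the mapped function is not called until `collect()` runs, mirroring how an action such as `collect()` or `saveAsTextFile()` is what starts the computation in the slide's examples.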
rdd.first(): 5
rdd.saveAsTextFile("myfile")
rdd=sc.parallelize([1, 2, 3, 4, 5])
rdd.map(lambda x: x*2+1).collect()
[3, 5, 7, 9, 11]