Udfs 2
In Spark, we have various built-in functions in pyspark.sql.functions, but sometimes none of them
fulfills our requirements for row-level transformations.
In that case, we create our own UDFs (user defined functions).
* Here are the steps we need to follow to develop and use Spark User Defined Functions.
1. Develop the required logic as a regular Python function.
2. Register the function using spark.udf.register and assign it to a variable.
3. The variable can be used as part of Data Frame APIs such as select, filter, etc.
4. When we register, we register with a name. That name can be used as part of selectExpr
or as part of Spark SQL queries using spark.sql.
Input data
Suppose we want to create a UDF that converts strings to lowercase, so that it can be used
to convert PySpark DataFrame column values to lowercase.
1. Develop the required logic as a regular Python function.
def make_lower(string):
    return string.lower()
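One thing worth noting: Spark passes Python None into a UDF for SQL NULL values, so string.lower() would raise an AttributeError on null rows. A null-safe sketch of the same function:

```python
def make_lower(string):
    # Spark passes None for SQL NULLs; guard before calling .lower()
    return string.lower() if string is not None else None
```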
2. Register the function using spark.udf.register and assign it to a variable.
convert_lower = spark.udf.register("make_lower_func", make_lower)
3. The variable can be used as part of Data Frame APIs such as select, filter, etc.
convert_lower = spark.udf.register("make_lower_func", make_lower)
users_df.select(convert_lower("user_name")).display()
4. When we register, we register with a name. That name can be used as part of selectExpr,
as part of Spark SQL queries using spark.sql, or while running SQL queries on temp views.
convert_lower = spark.udf.register("make_lower_func", make_lower)
users_df.createOrReplaceTempView("users_data")
%sql
SELECT make_lower_func(user_name) FROM users_data