Azure
Monday, 29 July 2019
DataFrame: how to groupBy, alias the count, then filter on the count
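from pyspark.sql.functions import count, desc

# Count the rows in each group of "x" and name the result column "cnt"
df.groupBy("x").agg(count("*").alias("cnt"))

# Top 10 female first names by frequency; filter on gender before selecting firstName
top10FemaleFirstNamesDF = (peopleDF
  .filter("gender == 'F'")
  .select("firstName")
  .groupBy("firstName")
  .agg(count("*").alias("cnt"))
  .sort(desc("cnt"))
  .limit(10))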