How to aggregate same column multiple times in daru

Question

How to aggregate same column multiple times in daru

145 views Asked by janpeterka At 11 October 2019 at 13:49

I want to get grouped aggregated data, but running into a problem with the aggregating same column with multiple functions.

Basically, I want to know if there is a way to rename an aggregated column, so it doesn't rewrite.

Here is my code

    df = Daru::DataFrame.from_activerecord(active_record,
                                           *%i[jobs.id jobs.demand_created_at jobs.quality_rating jobs.service_rating jobs.value_rating SC.name D.czso_region_id])
    df.vectors = Daru::Index.new(%i[job_id demand_created_at quality_rating service_rating value_rating specific_category_name region_id])

    # computed columns
    df[:avg_rating] = ((df[:quality_rating] + df[:service_rating] + df[:value_rating]) / 3.0)
    df[:broad_region_id] = df[:region_id].recode { |i| i[0...-1]}

    df_grouped = df.group_by([:specific_category_name, :broad_region_id, :job_id])
    df_grouped.aggregate(avg_rating: :mean, job_id: :count).aggregate(avg_rating: :mean, job_id: :count)

I'm having problem here:

df_grouped.aggregate(avg_rating: :mean, job_id: :count).aggregate(avg_rating: :mean, job_id: :count)

Basically, I want to write something like this (for example):

df_grouped.aggregate(avg_rating: :mean, avg_rating: :std)

However, this only generates one column named avg_rating and error

(irb):124: warning: key :avg_rating is duplicated and overwritten on line 124

Is there a way to rename aggregated column? The only idea I have is to duplicate columns, but that seems like a very hacky solution.

Original Q&A

There are 1 answers

**janpeterka** · Accepted Answer · 2019-11-25T14:32:34+00:00

Well, I finally found the answer here.

Agreggation of grouped data can be done like this:

df.group_by(:a).aggregate(
 avg_d:    ->(df) { df[:d].mean },
 sum_c:    ->(df) { df[:c].sum },
 avg_of_c: ->(df) { df[:c].mean },
 size_b_with_lambda: ->(grouped){ grouped[:b].size}, 
 uniq_b_with_proc: proc {|grouped| grouped[:b].uniq.size }
)

which solves all my issues

TechQA.

How to aggregate same column multiple times in daru

There are 1 answers

Related Questions in AGGREGATE

Related Questions in DARU

Popular Questions

Popular Tags

Trending Questions