In my df, I need the two x-axis labels under "subset" = "Initial" to be coloured grey50 (like the two bars on the left) and the two x-axis labels under "subset" = "Processed" to be coloured midnightblue (like the two bars on the right).
Code:
library(ggplot2)

df_TP <- data.frame(
  categ = c("initial_t_tp", "processed_t_tp", "initial_v_tp", "processed_v_tp"),
  group = rep("TP", 4),
  absolute = c(86, 85, 21, 21),
  percentage = c(84.16, 84.16, 84.00, 84.00),
  col = c(0, 1, 0, 1),
  subset = c("initial", "processed", "initial", "processed")
)
labels_tp <- c(
  "initial_t_tp" = "Training",
  "initial_v_tp" = "Validation",
  "processed_v_tp" = "Validation",
  "processed_t_tp" = "Training"
)
a <- ifelse(df_TP$col == 0, "grey50", "midnightblue")
# Create the bar plot with separation
ggplot(
  df_TP,
  aes(x = categ, y = absolute, fill = subset)
) +
  geom_col(aes(fill = subset)) +
  labs(x = NULL, y = "# detected events") +
  scale_fill_manual(
    values = c("initial" = "grey50", "processed" = "midnightblue")
  ) +
  theme_minimal() +
  theme(
    axis.title.y = element_text(
      size = 30, color = "grey50",
      vjust = 2, hjust = 0.95),
    axis.text.y = element_text(size = 25, colour = "grey50"),
    axis.text.x = ggtext::element_markdown(
      size = 25, vjust = 3, colour = a
    ),
    strip.text = element_blank(),
    panel.grid.major.y = element_line(size = 0.5, color = "grey85"),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    legend.position = "none",
    plot.margin = margin(l = 20, 0, 0, 0),
    aspect.ratio = 1/0.9
  ) +
  coord_cartesian(
    ylim = c(0, 90), clip = 'off'
  ) +
  facet_grid(
    . ~ subset, scales = "free_x", switch = "x"
  ) +
  scale_y_continuous(
    limits = c(0, 90), breaks = seq(0, 90, by = 30)
  ) +
  scale_x_discrete(labels = labels_tp)
This outputs:
I'm unable to understand why only Validation is being coloured (both Validation from Initial and Validation from Processed). As seen in my dataframe, the only way it can separate between Training and Validation is by considering a threshold on the y-axis values (Validation is always lower than Training).
I have tried applying the solution here for both a categorical and a numerical condition (which is the sole reason I created "col" with 0 and 1). I tried the ifelse with both "col" and "subset", with no luck.
I have also tried the convoluted solution by Ben in this post:
cols <- c(
  "initial_t_tp" = "grey50",
  "initial_v_tp" = "grey50",
  "processed_v_tp" = "midnightblue",
  "processed_t_tp" = "midnightblue"
)
colour = cols[as.character(df_TP$categ[order(df_TP$categ)])]
But no luck.
I understand this post is prone to downvotes because this question has been asked before, and I have spent the last two hours trying to avoid an extra post. At this point, I'm making mistakes out of tiredness.
I am assuming my code has a few more elements than the other questions asked here, and something is making the code override the intended logic in favour of either only Training or only Validation. But what?
Using external vectors (`a` here) in non-NSE elements of ggplot2 expressions can be problematic, since the order of how `a` is applied is not necessarily (often is not at all) the same as the order of the columns. I suggest putting the colors into the frame itself.
I'm inferring that you want "Training" before "Validation", so we'll need to control the order using `factor` as well.
By "baking" (my word) the axis label into the data itself, we can (a) include its color, (b) control its order, and (c) remove the need to change the labels with `scale_x_discrete`.
(No longer using `a` or `labels_tp`.)
Here's the "baked in data":
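A minimal sketch of what that data step could look like, under a few assumptions of mine: the `_t_` infix in `categ` marks the Training rows, the colours are written as hex codes (`#7F7F7F` for grey50, `#191970` for midnightblue) to be safe inside the HTML `style` attribute, and `reorder()` is what builds the ordered factor. The data frame is repeated (minus the unused columns) only so the snippet runs on its own.

```r
library(dplyr)

# Same data as in the question, without the columns that are not needed here.
df_TP <- data.frame(
  categ = c("initial_t_tp", "processed_t_tp", "initial_v_tp", "processed_v_tp"),
  absolute = c(86, 85, 21, 21),
  subset = c("initial", "processed", "initial", "processed")
)

df_TP <- df_TP |>
  mutate(
    # Plain axis label (assumption: "_t_" means Training, anything else Validation).
    xaxs = if_else(grepl("_t_", categ), "Training", "Validation"),
    # Wrap the label in a coloured <span>; this line reuses the `xaxs` defined
    # just above. reorder() returns a factor whose levels put the Training
    # labels (score 0) before the Validation labels (score 1).
    xaxs = reorder(
      sprintf(
        "<span style='color:%s'>%s</span>",
        if_else(subset == "initial", "#7F7F7F", "#191970"),  # grey50, midnightblue
        xaxs
      ),
      xaxs == "Validation"
    )
  )
```

Each label is now a small HTML snippet that carries its own colour, so a label and its colour can no longer drift apart.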
(I'm using `dplyr` here, though this can be easily adapted to base R in three steps: (1) change `mutate` to `transform`, (2) change `if_else` to `ifelse`, and (3) break the second `xaxs` assignment into a new `|> transform(..)`, since `transform` does not "see" the previous definition of `xaxs`.)
With this, we can remove the dependence on `a` and `labels_tp`, and instead tell ggplot and `ggtext` to format the axis labels directly.
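A sketch of the resulting plot call, trimmed to the parts that matter for the label colours (the remaining `theme()` settings from the question can be added back unchanged). It assumes `df_TP` has a factor column `xaxs` holding the HTML-coloured labels. The data is rebuilt in base R here only so the snippet runs on its own, and the hex codes standing in for grey50 and midnightblue are my assumption.

```r
library(ggplot2)

# Stand-in for the "baked in" data: one HTML-coloured label per row.
df_TP <- data.frame(
  absolute = c(86, 85, 21, 21),
  subset = c("initial", "processed", "initial", "processed"),
  label = c("Training", "Training", "Validation", "Validation")
)
hex <- ifelse(df_TP$subset == "initial", "#7F7F7F", "#191970")  # grey50, midnightblue
df_TP$xaxs <- sprintf("<span style='color:%s'>%s</span>", hex, df_TP$label)
# Rows are already ordered Training first, so unique() gives the wanted level order.
df_TP$xaxs <- factor(df_TP$xaxs, levels = unique(df_TP$xaxs))

p <- ggplot(df_TP, aes(x = xaxs, y = absolute, fill = subset)) +
  geom_col() +
  facet_grid(. ~ subset, scales = "free_x", switch = "x") +
  scale_fill_manual(values = c(initial = "grey50", processed = "midnightblue")) +
  scale_y_continuous(breaks = seq(0, 90, by = 30)) +
  coord_cartesian(ylim = c(0, 90), clip = "off") +
  labs(x = NULL, y = "# detected events") +
  theme_minimal() +
  theme(
    # element_markdown() renders the <span> colour stored in each label,
    # so no external colour vector and no scale_x_discrete(labels = ...) is needed.
    axis.text.x = ggtext::element_markdown(size = 25, vjust = 3),
    strip.text = element_blank(),
    legend.position = "none"
  )
p
```

Because each facet draws its own axis, a colour vector passed to the theme is presumably matched to the labels by position within each panel, which would explain why both "Validation" labels picked up the second colour in the original attempt. Colours embedded in the label text sidestep that matching entirely.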