I'm working on a Bash script to calculate GPA for students, taking into account courses that might have been repeated. My goal is to ensure that if a student retakes a course, only the latest grade is considered in the GPA calculation. Additionally, I want to exclude subjects without a grade, as these represent courses that are currently in progress.
I'm using Bash version 5, so associative arrays are available to me, and I've attempted to utilize them to track the latest attempt for each course.
However, my script is not working as expected: it counts both attempts of a repeated course instead of only the latest one. Below is the relevant portion of my script:
#!/bin/bash
# Define a function to convert grades to points
grade_to_points() {
    case $1 in
        A) echo 4 ;;
        A-) echo 3.7 ;;
        B+) echo 3.3 ;;
        B) echo 3 ;;
        B-) echo 2.7 ;;
        C+) echo 2.3 ;;
        C) echo 2 ;;
        D) echo 1 ;;
        F) echo 0 ;;
        *) echo -1 ;; # for subjects with no grade
    esac
}
# Extract the list of subject files and student IDs
subject_files=()
student_ids=()
read_subjects=true
for arg in "$@"; do
    if [[ $arg == "student" ]]; then
        read_subjects=false
        continue
    fi
    if $read_subjects; then
        subject_files+=("$arg")
    else
        student_ids+=("$arg")
    fi
done
# Loop through each student ID to generate the transcript
for student_id in "${student_ids[@]}"; do
    # Get the student's name from student.dat
    student_name=$(grep "^$student_id" student.dat | cut -d ' ' -f 2-)
    echo "Transcript for $student_id $student_name"
    total_points=0
    subjects_count=0
    # Loop through each subject file to find grades for the student
    for subject_file in "${subject_files[@]}"; do
        if grep -q "^$student_id" "$subject_file"; then
            # Extract subject details and grade
            subject_code=$(head -n 1 "$subject_file" | cut -d ' ' -f 2)
            academic_year=$(head -n 1 "$subject_file" | cut -d ' ' -f 3)
            semester=$(head -n 1 "$subject_file" | cut -d ' ' -f 4)
            grade=$(grep "^$student_id" "$subject_file" | awk '{print ($2=="") ? "" : $2}')
            # Print subject details
            echo "$subject_code $academic_year Sem $semester $grade"
            # Calculate GPA if grade is present
            if [[ $grade != "" ]]; then
                points=$(grade_to_points "$grade")
                if [[ $points != -1 ]]; then
                    total_points=$(echo "$total_points + $points" | bc)
                    ((subjects_count++))
                fi
            fi
        fi
    done
    # Calculate and print GPA if there are graded subjects
    if [[ $subjects_count -gt 0 ]]; then
        gpa=$(echo "scale=2; $total_points / $subjects_count" | bc)
        echo "GPA for $subjects_count subjects $gpa"
    else
        echo "No graded subjects found."
    fi
    echo # New line for separation
done
Wrong Output I get for Input:
transcript COMP* student 1236 1234 1223
Transcript for 1236 peter
COMP1011 2021 Sem 2 A
COMP2411 2022 Sem 1 A
COMP2432 2022 Sem 2
GPA for 2 subjects 4.00
Transcript for 1234 john
COMP1011 2021 Sem 2 B
COMP2411 2022 Sem 1 B-
GPA for 2 subjects 2.85
Transcript for 1223 bob
COMP1011 2021 Sem 2 F
COMP1011 2022 Sem 1 B
COMP2411 2022 Sem 1 C+
COMP2432 2022 Sem 2
GPA for 3 subjects 1.76
Correct Output I should get for Input:
transcript COMP* student 1236 1234 1223
Transcript for 1236 peter
COMP1011 2021 Sem 2 A
COMP2411 2022 Sem 1 A
COMP2432 2022 Sem 2
GPA for 2 subjects 4.00
Transcript for 1234 john
COMP1011 2021 Sem 2 B
COMP2411 2022 Sem 1 B-
GPA for 2 subjects 2.85
Transcript for 1223 bob
COMP1011 2021 Sem 2 F
COMP1011 2022 Sem 1 B
COMP2411 2022 Sem 1 C+
COMP2432 2022 Sem 2
GPA for 2 subjects 2.65
Files and content I used:
student.dat
1223 bob
1224 kevin
1225 stuart
1226 otto
1234 john
1235 mary
1236 peter
1237 david
1238 alice
COMP101121S2.dat
Subject COMP1011 2021 2
1223 F
1234 B
1235 B+
1236 A
COMP101122S1.dat
Subject COMP1011 2022 1
1223 B
1224 B+
1225 B-
1238 C+
COMP241122S1.dat
Subject COMP2411 2022 1
1223 C+
1234 B-
1235 B
1236 A
COMP243222S2.dat
Subject COMP2432 2022 2
1223
1235
1236
1237
Here's what I've tried:
Using associative arrays to track the most recent attempt for each course. Unfortunately, this doesn't seem to be working as expected.
Ensuring my Bash version supports associative arrays, which it does since I'm on version 5.
My requirements are:
If a student fails and retakes the same subject, consider only the latest grade in the GPA calculation.
Ignore failed grades if a retake is present, but include them if there's no retake.
Do not count subjects with no grade, as they represent ongoing courses.
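To make these requirements concrete, here is the arithmetic I expect for student 1223, written as a small sketch. The helper name gpa_from_hundredths and the choice to hold grade points in hundredths (so that Bash integer arithmetic is enough and bc is not needed) are assumptions made for this illustration only; they are not part of my script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: average of grade points given in hundredths (B = 300),
# truncated to two decimals, the same way bc truncates with scale=2.
gpa_from_hundredths() {
    (( $# > 0 )) || return 1          # nothing graded, nothing to average
    local total=0 p
    for p in "$@"; do
        total=$(( total + p ))
    done
    local avg=$(( total / $# ))
    printf '%d.%02d\n' $(( avg / 100 )) $(( avg % 100 ))
}

# Student 1223: COMP1011 F (2021 Sem 2) is replaced by B (2022 Sem 1),
# COMP2411 is C+, and COMP2432 has no grade yet, so it is left out.
gpa_from_hundredths 300 230
```

This prints 2.65, which is the GPA I expect for 1223, whereas counting the F as well gives the 1.76 my script currently reports.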
Can anyone advise on how to modify my script to meet these requirements? Any help would be greatly appreciated!
Here is a significant starting point for how to do this using GNU awk for arrays of arrays, gensub(), PROCINFO["sorted_in"], and \S/\s:

It doesn't do what you want regarding ignoring failed grades if there's a retake or ignoring subjects with no grade (got to leave something for you to do!), but hopefully you'll find the above much easier to understand and modify to do whatever it is you want to do than your existing bash script. If you can't figure out how to do it yourself, you can always ask a new question using an awk script as your code sample instead of your bash script.
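For the two pieces left open above (counting only the latest attempt and skipping subjects with no grade), a minimal sketch that stays in Bash might look like the following. It is not the awk approach this answer refers to. The function names grade_to_hundredths and student_gpa are hypothetical, and the sketch assumes the file layout shown in the question: a header line "Subject CODE YEAR SEMESTER" followed by "ID GRADE" lines. It also assumes that an ungraded retake does not displace an earlier grade, and it keeps grade points in hundredths so that integer arithmetic replaces bc.

```shell
#!/usr/bin/env bash
# Sketch: GPA from the latest graded attempt per subject (needs Bash 4+).

# Grade -> points in hundredths (A = 400).
# Prints nothing and returns 1 for a missing or unknown grade.
grade_to_hundredths() {
    case $1 in
        A)  echo 400 ;;
        A-) echo 370 ;;
        B+) echo 330 ;;
        B)  echo 300 ;;
        B-) echo 270 ;;
        C+) echo 230 ;;
        C)  echo 200 ;;
        D)  echo 100 ;;
        F)  echo 0 ;;
        *)  return 1 ;;
    esac
}

# Usage: student_gpa STUDENT_ID SUBJECT_FILE...
student_gpa() {
    local student_id=$1; shift
    local -A latest_term=() latest_points=()
    local file subject year semester id grade points term code
    local total=0 count=0 avg

    for file in "$@"; do
        {
            # Header line, e.g. "Subject COMP1011 2021 2"
            read -r _ subject year semester
            while read -r id grade || [[ -n $id ]]; do
                [[ $id == "$student_id" ]] || continue
                # No grade means the course is in progress: skip it.
                points=$(grade_to_hundredths "$grade") || continue
                # Order attempts by year, then semester (assumed single digit).
                term=$(( year * 10 + semester ))
                if (( term > ${latest_term[$subject]:-0} )); then
                    latest_term[$subject]=$term
                    latest_points[$subject]=$points
                fi
            done
        } < "$file"
    done

    # Sum one entry per subject: the latest graded attempt.
    for code in "${!latest_points[@]}"; do
        total=$(( total + ${latest_points[$code]} ))
        count=$(( count + 1 ))
    done

    if (( count > 0 )); then
        avg=$(( total / count ))   # truncates, like bc with scale=2
        printf 'GPA for %d subjects %d.%02d\n' "$count" \
            $(( avg / 100 )) $(( avg % 100 ))
    else
        echo "No graded subjects found."
    fi
}

# Example call, using the file names from the question:
#   student_gpa 1223 COMP*.dat
```

Keying the associative arrays by subject code is what collapses repeated attempts into one entry, and the year and semester comparison makes the result independent of the order in which the files are passed. Printing the per-subject transcript lines is left out to keep the sketch short.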