Here's the original input:
FEATURE,TESTER,LICENSE_USED,PROGRAM,AREA
Low,T6712,23,Element01,FT1
High,T7911,54,Element03,FT2
Medium,E8123,48,Element02,FT3
High,F4309,54,Element02,PB1
Low,F4309,23,Element01,PB1
Low,T7911,23,Element04,FT1
High,E8123,54,Element05,FT2
Medium,F4309,48,Element01,PB1
Let's pretty that up so we can see what's going on:
$ tr , '11' < data.txt | column -tR 3,5
FEATURE TESTER LICENSE_USED PROGRAM AREA
Low T6712 23 Element01 FT1
High T7911 54 Element03 FT2
Medium E8123 48 Element02 FT3
High F4309 54 Element02 PB1
Low F4309 23 Element01 PB1
Low T7911 23 Element04 FT1
High E8123 54 Element05 FT2
Medium F4309 48 Element01 PB1
Here's the expected output:
$ tr , '11' < expected.txt | column -tR 2,3,4,5,6,7,8
TESTER E8123 E8123 F4309 F4309 T6712 T7911 T7911
PROGRAM Element02 Element05 Element01 Element02 Element01 Element03 Element04
AREA FT3 FT2 PB1 PB1 FT1 FT2 FT1
High 0 54 0 54 0 54 0
Low 0 0 23 0 23 0 23
Medium 48 0 48 0 0 0 0
It appears we want three column headers to be translated to row headers and we want to use Low, Medium, and High as row headers per tester/element/area (sorting by Tester makes this apparent):
$ tr , '11' < data.txt | column -tR 3 | sort -k2
Medium E8123 48 Element02 FT3
High E8123 54 Element05 FT2
Low F4309 23 Element01 PB1
Medium F4309 48 Element01 PB1
High F4309 54 Element02 PB1
Low T6712 23 Element01 FT1
Low T7911 23 Element04 FT1
High T7911 54 Element03 FT2
FEATURE TESTER LICENSE_USED PROGRAM AREA
We can easily see that Testers work on different Elements too, so we'll have to account for that:
BEGIN {
FS=","
}
NR > 1 {
data[$2,$4,$5] = data[$2,$4,$5] $1 ":" $3 FS
}
END {
#construct the table
for (tester_element_area in data) {
split(tester_element_area, parts, SUBSEP)
tester = parts[1]
element = parts[2]
area = parts[3]
n = split(data[tester_element_area], d)
template["High"] = 0
template["Medium"] = 0
template["Low"] = 0
for (i = 1; i <= n; i++) {
split(d[i], license, ":")
degree = license[1]
value = license[2]
template[ degree ] = value
}
table["TESTER"] = table["TESTER"] FS tester
table["PROGRAM"] = table["PROGRAM"] FS element
table["AREA"] = table["AREA"] FS area
table["High"] = table["High"] FS template["High"]
table["Medium"] = table["Medium"] FS template["Medium"]
table["Low"] = table["Low"] FS template["Low"]
}
#print the table
header[1] = "TESTER"
header[2] = "PROGRAM"
header[3] = "AREA"
header[4] = "High"
header[5] = "Low"
header[6] = "Medium"
for (i = 1; i <= 6; i++) {
header_name = header[i]
printf header_name
n = split(table[header_name], parts)
for (j = 1; j <= n; j++) {
if (j > 1) {
printf FS
}
printf parts[j]
}
print ""
}
}
Let's see what it returns:
$ awk -f prog.awk < data.txt | tr , '11' | column -tR2,3,4,5,6,7,8
TESTER E8123 T7911 F4309 E8123 T6712 T7911 F4309
PROGRAM Element05 Element04 Element02 Element02 Element01 Element03 Element01
AREA FT2 FT1 PB1 FT3 FT1 FT2 PB1
High 54 0 54 0 0 54 0
Low 0 23 0 0 23 0 23
Medium 0 0 0 48 0 0 48
Not too shabby, the columns aren't ordered exactly correct. They should be sorted. If you're willing to use GAWK, it requires a small code change:
END {
for (tester_element_area in data) {
cols[++i] = tester_element_area
}
m = asort(cols)
#construct the table
for (k = 1; k <= m; k++) {
tester_element_area = cols[k]
...
Output:
$ awk -f prog.awk < data.txt | tr , '11' | column -tR2,3,4,5,6,7,8
TESTER E8123 E8123 F4309 F4309 T6712 T7911 T7911
PROGRAM Element02 Element05 Element01 Element02 Element01 Element03 Element04
AREA FT3 FT2 PB1 PB1 FT1 FT2 FT1
High 0 54 0 54 0 54 0
Low 0 0 23 0 23 0 23
Medium 48 0 48 0 0 0 0
Update: Sort by Area
NR > 1 {
data[$5,$2,$4] = data[$5,$2,$4] $1 ":" $3 FS
}
END {
for (tester_element_area in data) {
cols[++i] = tester_element_area
}
m = asort(cols)
#construct the table
for (k = 1; k <= m; k++) {
tester_element_area = cols[k]
split(tester_element_area, parts, SUBSEP)
area = parts[1]
tester = parts[2]
element = parts[3]
Output:
TESTER T6712 T7911 E8123 T7911 E8123 F4309 F4309
PROGRAM Element01 Element04 Element05 Element03 Element02 Element01 Element02
AREA FT1 FT1 FT2 FT2 FT3 PB1 PB1
High 0 0 54 54 0 0 54
Low 23 23 0 0 0 23 0
Medium 0 0 0 0 48 48 0